THUNDERSTONE NEWS

June 2003 - Archive


NEW FEATURES

File Splitting.   Some customers have very large documents.  A single PDF or word processor file can hold an entire book.  When indexing with Webinator, the default behavior is to treat each file as a single record.  It may be more user-friendly, however, to split such files into pieces, so that search results point to the relevant sections of the original document.

The new Plugin Split option makes that easy. For example, you can specify that PDFs be broken into a separate record for each page, or for every n pages. Documents without page markers may be split after a fixed number of characters.
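
To illustrate the idea only, here is a minimal Python sketch of the two splitting strategies described above. It is not the anytotx plug-in or its configuration syntax; the function names, the use of a form feed as the page marker, and the parameters are all assumptions made for the example.

```python
from typing import List


def split_by_pages(text: str, pages_per_record: int = 1,
                   page_marker: str = "\f") -> List[str]:
    """Group every `pages_per_record` pages into one record.

    Assumption: pages are separated by a form-feed character, as in
    much extracted PDF text. A real plug-in may detect pages differently.
    """
    if pages_per_record < 1:
        raise ValueError("pages_per_record must be at least 1")
    pages = text.split(page_marker)
    return [
        page_marker.join(pages[i:i + pages_per_record])
        for i in range(0, len(pages), pages_per_record)
    ]


def split_by_chars(text: str, max_chars: int) -> List[str]:
    """Split text without page markers into records of at most max_chars."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

Each returned piece would then be indexed as its own record, so that a search hit can point to the relevant page range rather than to the whole file.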

The feature is included in the Webinator and Texis File-Format plug-in (anytotx) from version 4.3 onward. Texis maintenance customers and those with paid Webinator versions 4.0 or later may request a copy of the new plug-in from Tech Support. Other customers may obtain the new plug-in by upgrading Webinator or joining Texis Maintenance.

Lock and Run. A new utility, lockandrun, is now part of Texis. It lets you lock some or all tables in a database while an arbitrary shell command runs, for example a backup process.
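
The general pattern (take locks, run a command, always release the locks) can be sketched in Python as follows. This is an analogy only: it uses ordinary Unix advisory file locks, not Texis table locks, and it says nothing about the real lockandrun options. The function name and the idea of one lock file per table are hypothetical.

```python
import fcntl
import subprocess
from typing import List, Sequence


def lock_and_run(lock_paths: Sequence[str], command: List[str]) -> int:
    """Hold exclusive advisory locks on lock_paths while command runs.

    Assumption: each path stands in for one table. Returns the exit
    status of the command. Unix only, because it relies on fcntl.flock.
    """
    handles = []
    try:
        # Lock in a fixed order so two callers cannot deadlock each other.
        for path in sorted(lock_paths):
            handle = open(path, "a")
            handles.append(handle)
            fcntl.flock(handle, fcntl.LOCK_EX)
        return subprocess.run(command).returncode
    finally:
        # Release the locks even if the command fails or raises.
        for handle in reversed(handles):
            fcntl.flock(handle, fcntl.LOCK_UN)
            handle.close()
```

A backup script would pass its own command line as `command`, so that no writer can change the locked files while the copy is in progress.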


CUSTOMER SPOTLIGHT: QVC

Most Americans know QVC for its television shopping channel. QVC's web site, however, is also one of the most successful online retailers, bringing in a substantial portion of QVC's $4.4 billion in yearly sales.

Thunderstone's Texis software plays an important role at QVC in a variety of ways.

Unlike television sales, where each item is on sale for only a few minutes at a time, QVC's entire inventory is available all the time on its web site. The online database thus has hundreds of thousands of products, and requires a robust search engine to help users find the items that satisfy them.

THUNDERSTONE AT HOMELAND SECURITY EXPO

Technology customers in the U.S. government and defense sectors are invited to stop by and meet us at the Symposium on Information Sharing & Homeland Security, June 30 and July 1 in Philadelphia. Approximately 100 other technology vendors also will be exhibiting. To be admitted to the exhibit hall without charge, mention that you are a guest of Thunderstone.


FEATURE SPOTLIGHT: The Texis Profiler

The Profiler is one of the under-appreciated capabilities of Texis. It optimizes the handling of stored queries. Your users might greatly appreciate having a stored query feature.

Originally the Profiler was developed for monitoring newswires. However, it could be useful in many other applications, such as tracking message board content, email, or new pages found by the Webinator walker.

The Profiler can efficiently power a real-time notification service, if needed. Even if you just send notifications as email, using the Profiler ensures that your users always have the most up-to-date information whenever they choose to check their mail.

Technically, the essence of the Profiler is that it turns the standard search model on its head: Queries are stored in a table and indexed, and a new message or document is turned into a query against that table.

Suppose you have 10,000 stored queries. With the Profiler, only one SQL SELECT is needed to test each new item and find out which of the 10,000 users to notify about it. The alternative is a batch approach, in which you run all 10,000 queries every so often. Besides the extra work, a batch approach also makes the notifications less timely.

Setting up the Profiler involves these steps:
1. Queries are stored in a profiles table.
2. A Metamorph counter index is created on the query field of the profiles table.
3. The Vortex INIT command loads the words from the index into memory.
4. The GET command trims new items (news, messages etc.) down to only the words matching at least one query.
5. A query using the trimmed item and LIKEIN is run against the profiles table.
6. The result is the list of matching profiles or users to receive notifications.
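
The steps above can be pictured with a small, self-contained Python sketch of the same inverted model. It is not Vortex or Texis code and does not reproduce the exact semantics of LIKEIN or the Metamorph counter index. The class name, the tokenizer, and the rule that a profile matches when all of its words appear in the item are assumptions chosen to keep the example short.

```python
import re
from typing import Dict, List, Set


def tokenize(text: str) -> Set[str]:
    """Lower-case the text and return its distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


class ProfileMatcher:
    """Stored queries are indexed once; each new item is the 'query'."""

    def __init__(self, profiles: Dict[str, str]) -> None:
        # Step 1: stored queries, keyed by a hypothetical user name.
        self.terms: Dict[str, Set[str]] = {
            user: tokenize(query) for user, query in profiles.items()
        }
        # Steps 2-3: an in-memory index from each query word to its users.
        self.index: Dict[str, Set[str]] = {}
        for user, words in self.terms.items():
            for word in words:
                self.index.setdefault(word, set()).add(user)

    def trim(self, item: str) -> Set[str]:
        # Step 4: keep only the words that occur in at least one query.
        return tokenize(item) & self.index.keys()

    def match(self, item: str) -> List[str]:
        # Steps 5-6: one lookup per item, not one search per stored query.
        words = self.trim(item)
        candidates: Set[str] = set()
        for word in words:
            candidates |= self.index[word]
        # Assumed rule: every word of the stored query must be in the item.
        return sorted(u for u in candidates if self.terms[u] <= words)
```

With profiles such as `{"alice": "texis profiler", "bob": "webinator"}`, calling `match()` on a new message returns the users whose stored queries it satisfies, and the work per message depends on the words in that message rather than on running every stored query.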

More detail is available in chapter 16 of the tutorial and in the Profiler section of the Vortex manual.


Feedback, suggestions and questions are welcome to
Copyright © 2024 Thunderstone Software LLC. All rights reserved.