THUNDERSTONE NEWS

October 2007 - Archive

THUNDERSTONE PARAMETRIC SEARCH APPLIANCE SIMPLIFIES RETRIEVAL OF FIELDED DATA

Simultaneously Finds Text and Structured Data with Desired Attributes

On 21 September 2007 Thunderstone Software LLC released its newest product, the Thunderstone Parametric Search Appliance.

Parametric search is search based on parameters, attributes or data fields associated with full-text content. It is an “advanced search” that simultaneously satisfies a number of criteria (the parameters of the search). For example, online shoppers routinely want to search for in-stock items in a chosen category/subcategory with a specified style, color, size and price range (parameters) plus a full-text description of the product.

Engineered to exceed expectations in the enterprise, the Thunderstone Parametric Search Appliance combines full-text keyword searches with a user-selected “filter” on up to 50 data fields, quickly returning results that accurately match the desired data attributes. In addition, the results can be sorted and/or grouped by any of the attributes.
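To make the idea concrete, here is a minimal Python sketch of a keyword search combined with a filter on data fields and a sort on one attribute. It is an illustration only, not how the appliance or Texis is implemented: the Product fields, the parametric_search function and the sample catalog are all hypothetical, and simple substring matching stands in for real full-text search.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # Hypothetical record: one full-text field plus several data fields.
    name: str
    description: str
    category: str
    color: str
    price: float
    in_stock: bool


def parametric_search(items, keywords, filters, price_range=None, sort_by=None):
    """Return items whose description contains every keyword and whose
    data fields match every filter, optionally sorted by one field."""
    low, high = price_range if price_range else (float("-inf"), float("inf"))
    wanted = [k.lower() for k in keywords]
    hits = [
        item for item in items
        # Text criterion: naive substring match on the description.
        if all(k in item.description.lower() for k in wanted)
        # Parametric criteria: exact match on each selected field.
        and all(getattr(item, field) == value for field, value in filters.items())
        and low <= item.price <= high
    ]
    if sort_by:
        hits.sort(key=lambda item: getattr(item, sort_by))
    return hits


CATALOG = [
    Product("Trail Jacket", "waterproof hiking jacket", "outerwear", "red", 120.0, True),
    Product("City Jacket", "light waterproof jacket", "outerwear", "red", 80.0, True),
    Product("Rain Poncho", "waterproof poncho", "outerwear", "blue", 25.0, True),
    Product("Storm Jacket", "waterproof jacket", "outerwear", "red", 95.0, False),
]

results = parametric_search(
    CATALOG,
    keywords=["waterproof", "jacket"],
    filters={"category": "outerwear", "color": "red", "in_stock": True},
    price_range=(50, 150),
    sort_by="price",
)
print([p.name for p in results])
```

Every criterion must hold at once, which is what distinguishes a parametric search from a plain keyword search over the same descriptions.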

Online publishers and news archives can use the Thunderstone Parametric Search Appliance to provide an easy way for searchers to select multiple document attributes and to quickly find relevant material with a specified creation date, article format, subject matter, author name, etc.

Vertical search portals can use the Thunderstone Parametric Search Appliance to provide an easy way for searchers to select multiple topic attributes and to quickly find timely information that corresponds to specified persons, internal/external web sources, particular companies, geographical locations, etc.

Powered by Thunderstone's Texis software, the only fully integrated SQL RDBMS that intelligently queries and manages databases containing natural language text, standard data types, geographic information and a wide range of other payload data, the Thunderstone Parametric Search Appliance makes it easier to take advantage of Thunderstone's unique marriage of structured and unstructured data retrieval in a plug-and-play environment. And, like all enterprise-class search appliances in the Thunderstone product line, it can:

  • Crawl, index and make searchable large quantities of web content (including targeted third-party sites), file server-based content, database content and (through a library of available connectors) other enterprise-based content in a variety of proprietary formats.
  • Perform concept-based searching, natural language processing, numeric queries, wildcarding, fuzzy logic, standard/custom thesaurus queries and metasearch across collections.

NEW PRODUCTS AND SPECIAL OFFERS

Thunderstone Parametric Search Appliance

The NEW Thunderstone Parametric Search Appliance is being offered at an introductory price of 10% off until the end of November. Demonstrations and evaluations of the Parametric Search Appliance are available using your own data, so you can see the immediate impact of having an easy-to-use, complete search.

Thunderstone Search Appliance Connectors

We are extending our introductory offer of up to 25 percent off on the connectors offered by Persistent Systems until the end of 2007, to ensure that you have time to test these easy-to-use ways to load data into the Thunderstone Search Appliance. We currently have a test lab set up, and we invite anybody interested in seeing the connectors in action to contact us at +1 216 820 2200.


Quote of the Month

“All the laws of nature are conditional statements which permit a prediction of some future events on the basis of the knowledge of the present, except that some aspects of the present state of the world, in practice the overwhelming majority of the determinants of the present state of the world, are irrelevant from the point of view of the prediction.”

Eugene P. Wigner, in his 1960 essay:
The Unreasonable Effectiveness of Mathematics in the Natural Sciences


Happenings

Tom Schaefer of the General Services Administration gave a September 2007 presentation to the North America Unisys User Association (http://www.unite.org) in Valley Forge, PA, and shared technical details about GSA's successful implementation of the Thunderstone Search Appliance as the standardized search platform for his organization's government vehicle auction site.

Peter Thusat, Communication Director & CMO of Thunderstone Software LLC, met with former Cleveland Tech Czar Michael C. DeAloia, who recently joined SchoolOne LLC in Cleveland, Ohio as Senior Director of Business Development. Peter also interviewed Jesus Carrillo, Director of Information Technology at Trade Press Publishing Corporation, and Derek Matthews, Lead Knowledge Architect at Ariba, Inc., for two new Thunderstone case studies on Webinator.

Thunderstone was awarded a contract by the Administrative Office of the U.S. Courts to expand the Administrative Office's use of their Thunderstone Search Appliance to three million documents.


Customer Quote

“The setup and deployment of Webinator is extremely easy and straightforward. All the core functionality is there plus the ability to access the source code and be as creative and as customized as your capabilities will let you be. In other words, Thunderstone doesn't hold you back. Thunderstone lets you take the product to whatever level you're ready, willing and able to take it. For that reason we've stuck with it, we've used it, and it's been great in that regard. That's not something you're going to get from the Googles of the world.”

Jesus Carrillo
Director of Information Technology
Trade Press Publishing Corporation


TECH TIPS: USING RESULTS AUTHORIZATION

Results Authorization allows restriction of search results to authorized users only, on a per-URL basis. Only users with access to a given URL will ever see that URL in a result list, instead of all users seeing all matches (and potentially being denied access to results already shown).

Access to a URL, as well as the namespace of users, is determined by the URL's origin server, not the Thunderstone Search Appliance, so no reconfiguration of users or access is needed -- the pre-existing server access controls are just forwarded by the Thunderstone Search Appliance. And since access is determined on a per-result, not per-search, basis, a single profile can serve a multitude of users with any combination of whole/partial access to the underlying data.

Results Authorization works at search time by accessing each potential search result URL with the user's credentials. Only URLs authorized to that user are then shown in search results. The authentication method(s) used will depend on the existing system(s) already used by the indexed URLs. Because the authorization is done at search time, any changes to the permissions on a file or page are reflected immediately. However, the server must be available at search time, and the user will not be shown results for servers that are down.
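The search-time check described above can be pictured with a short Python sketch. This is an assumption-laden illustration, not the appliance's code: authorize_results, the can_access callback, and the ACL table standing in for the origin servers are all hypothetical names. In a real deployment the callback would request the URL from its origin server using the user's credentials.

```python
from typing import Callable, Iterable, List


def authorize_results(
    candidate_urls: Iterable[str],
    can_access: Callable[[str], bool],
) -> List[str]:
    """Keep only the URLs the current user is allowed to fetch.

    can_access is a hypothetical callback that asks the URL's origin
    server whether this user may read it. If the server cannot be
    reached (OSError), the URL is treated as not authorized.
    """
    visible = []
    for url in candidate_urls:
        try:
            allowed = can_access(url)
        except OSError:
            allowed = False  # server down: the result is not shown
        if allowed:
            visible.append(url)
    return visible


# Stand-in for the access controls held by the origin servers.
ACL = {
    "http://intranet/hr/pay.html": {"alice"},
    "http://intranet/news.html": {"alice", "bob"},
}


def make_checker(user):
    def check(url):
        if url.startswith("http://down/"):
            raise ConnectionError("server unavailable")
        return user in ACL.get(url, set())
    return check


matches = [
    "http://intranet/hr/pay.html",
    "http://intranet/news.html",
    "http://down/page.html",
]
bob_results = authorize_results(matches, make_checker("bob"))
alice_results = authorize_results(matches, make_checker("alice"))
print(bob_results)
print(alice_results)
```

The same list of matches yields a different result list for each user, which is how one profile can serve users with different levels of access.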

Various schemes are supported:

  • None: No access verification; return all search results to all users. This is the default.
  • Cookie-based: Custom HTML-form-based single-sign-on systems. Users first log in on a web server (not a Windows workstation login), which then sends an access cookie to the user's browser. This cookie is automatically returned to the server when accessing future pages, and grants the user access.
  • Basic: HTTP Basic authentication, for web servers.
  • NTLM: Windows NTLM authentication, for web servers.
  • SMB/Windows: SMB for Windows file servers.

For cookie-based systems, the Thunderstone Search Appliance will merely forward the cookies the user has already received from the site login page. For all others (Basic/NTLM/SMB), the Thunderstone Search Appliance must prompt for the user name and password directly, as they are needed to verify result URLs. In the latter case, credentials will then be stored in a cookie by the Thunderstone Search Appliance so that future searches do not need to re-prompt for a login. Note that NFS-mounted file servers are not currently supported by Results Authorization, due to limitations of NFS.
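The difference between forwarding cookies and prompting for a password shows up in the request headers sent to the origin server. The Python sketch below is illustrative only: headers_for_check and the method labels "cookie" and "basic" are hypothetical, and NTLM and SMB are left out because they involve multi-step exchanges rather than a single header. The Basic encoding itself (base64 of user:password) is the standard HTTP scheme.

```python
import base64


def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def headers_for_check(method, cookies=None, user=None, password=None):
    """Return the headers used to verify one result URL.

    method is a hypothetical label: "cookie" forwards cookies the user
    already holds; "basic" needs a user name and password, which is
    why those must be prompted for.
    """
    if method == "cookie":
        pairs = "; ".join(f"{name}={value}" for name, value in (cookies or {}).items())
        return {"Cookie": pairs}
    if method == "basic":
        if user is None or password is None:
            raise ValueError("Basic authentication needs a user name and password")
        return {"Authorization": basic_auth_header(user, password)}
    raise ValueError(f"unsupported method: {method}")


print(headers_for_check("cookie", cookies={"session": "abc123"}))
print(headers_for_check("basic", user="Aladdin", password="open sesame"))
```

Note that Basic credentials are only encoded, not encrypted, so they should travel over a protected connection.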

Support for Single Sign On (SSO) via Active Directory is currently being tested, and if you are interested in a preview of the authentication module please contact us.

Results Authorization Crawl Settings

The Thunderstone Search Appliance itself needs read access to the entire set of URLs in order to build a search index. Therefore, before walking a protected data set for Results Authorization, it may be necessary to fill out the Login Info setting under All Walk Settings with a full-access admin type account, so that the Thunderstone Search Appliance can crawl the data.

Alternatively, it may be necessary to fill out a Primer URL containing login info to submit to a site's login form, so that the Thunderstone Search Appliance can obtain the login cookies needed for access to the rest of the site. If the login form requires the data to be posted, or you want to hide the login credentials from the web server log, you can force a POST instead of a GET by using http-post in the URL, for example:

http-post://server/login.html?user=UserName&pass=Password
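As a rough picture of what such a URL implies, the Python sketch below splits an http-post URL into a request method, a target URL and a form body. The function name split_primer_url is hypothetical, and the mapping of http-post to plain http is an assumption made for illustration. How the appliance actually processes the URL is not described here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def split_primer_url(primer_url: str):
    """Return (method, target_url, form_body) for a primer URL.

    Assumption for illustration: "http-post://" means "send the query
    string as a POST body to the same host and path over http".
    """
    prefix = "http-post://"
    if primer_url.startswith(prefix):
        parts = urlsplit("http://" + primer_url[len(prefix):])
        fields = parse_qsl(parts.query)  # [("user", "UserName"), ...]
        target = f"{parts.scheme}://{parts.netloc}{parts.path}"
        # The fields move out of the URL and into the request body.
        return "POST", target, urlencode(fields)
    # Ordinary URLs are fetched as-is with GET.
    return "GET", primer_url, ""


method, target, body = split_primer_url(
    "http-post://server/login.html?user=UserName&pass=Password"
)
print(method, target, body)
```

Because the credentials travel in the body rather than the query string, they do not appear in the requested URL that web servers typically log.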

Results Authorization Search Settings

After a successful crawl, Results Authorization is configured with the Results Authorization Options group on the Search Settings page. The primary setting is Authorization Method, which is determined by the authentication system(s) in use by the indexed URLs. If cookie-based, this is set to Forward login cookies; for all other systems, it is set to Basic/NTLM/file - Prompt via form. Most of the remaining settings depend on which method was selected; see the Authorization Method setting for details.

There are also a few resource/tuning settings, such as Max Docs to Auth-Check, Successful Auth Result Limit, Total Auth Timeout, and Debug Results Authorization, which are not required, but merely fine-tune the results.
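The names of these settings suggest limits on how much checking is done per search. The Python sketch below shows one way such limits could bound an authorization loop. It is a guess at the general idea only: the parameter names max_docs_to_check, success_limit and total_timeout are hypothetical stand-ins, and the appliance's own documentation defines what the real settings do.

```python
import time


def authorize_with_limits(urls, can_access, max_docs_to_check=100,
                          success_limit=10, total_timeout=5.0,
                          clock=time.monotonic):
    """Check URLs in order until one of three budgets runs out.

    Returns (authorized_urls, number_of_urls_checked).
    """
    deadline = clock() + total_timeout
    visible = []
    checked = 0
    for url in urls:
        if checked >= max_docs_to_check:
            break  # cap on how many candidates are verified
        if len(visible) >= success_limit:
            break  # enough authorized results collected
        if clock() >= deadline:
            break  # overall time budget spent
        checked += 1
        if can_access(url):
            visible.append(url)
    return visible, checked


# Stand-in access check: only even-numbered documents are readable.
def even_only(url):
    number = url.rsplit("doc", 1)[1].split(".")[0]
    return int(number) % 2 == 0


urls = [f"http://server/doc{i}.html" for i in range(20)]
visible, checked = authorize_with_limits(
    urls, even_only, max_docs_to_check=8, success_limit=3
)
print(visible, checked)
```

Tighter limits return faster but may stop before every authorized match has been found, which is the trade-off such tuning settings usually control.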


Feedback, suggestions and questions are welcome to

Copyright © 2024 Thunderstone Software LLC. All rights reserved.