Entity Recognition

With entity recognition, the Parametric Search Appliance can identify and extract document information that is not tagged as metadata, and that would therefore normally be unavailable to Group By or other features beyond keyword search. For example, cities, states or other proper names may be embedded in the document text instead of being set explicitly in <meta> tags. An entity can be defined to match such data and populate a parametric field (here), thus enabling a Group By facet.
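As a rough illustration of the idea (not the appliance's actual implementation), the following Python sketch pulls city names out of untagged body text, uses them as the values of a field, and counts documents per value, which is approximately what a Group By facet presents. The document names, the texts, the `CITY_TERMS` list and the `populate_city_field` helper are all hypothetical.

```python
from collections import Counter

# Hypothetical documents: the city appears only in the body text,
# not in a <meta> tag, so it is not available as metadata.
documents = {
    "doc1.html": "Our Cleveland office opens in May.",
    "doc2.html": "The Austin team shipped the release.",
    "doc3.html": "Visit the Cleveland showroom this week.",
}

# Hypothetical list of city names to look for.
CITY_TERMS = ["Cleveland", "Austin", "Denver"]


def populate_city_field(text):
    """Return the city names found in the text (a stand-in for a populated field)."""
    return [city for city in CITY_TERMS if city in text]


# Build a "City" field for each document, then count documents per value.
city_field = {name: populate_city_field(text) for name, text in documents.items()}
facet_counts = Counter(city for cities in city_field.values() for city in cities)

for city, count in facet_counts.most_common():
    print(f"{city}: {count}")
```

Without the extraction step, a keyword search for "Cleveland" would still find the first and third documents, but there would be no field value on which to group them.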

An entity consists of dictionary terms and/or regular expression patterns to search for, plus optional settings. These values are uploaded to the Parametric Search Appliance in an XML or plain text file, which defines the entity. Once defined, an entity can be used with Data from Field (here) to populate parametric fields during walks.
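The combination of dictionary terms, regular expression patterns and optional settings can be pictured with a small Python model. This is only a conceptual sketch: the `Entity` class, its `ignore_case` setting and the sample product data are assumptions made for illustration, and they do not reflect the appliance's file format or matching rules.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical in-memory model of an entity definition."""
    name: str
    terms: list = field(default_factory=list)      # literal dictionary terms
    patterns: list = field(default_factory=list)   # regular expression patterns
    ignore_case: bool = True                        # assumed optional setting

    def find(self, text):
        """Return all matched values in the order they appear in the text."""
        flags = re.IGNORECASE if self.ignore_case else 0
        hits = []
        for term in self.terms:
            # Match the literal term on word boundaries only.
            term_regex = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(term_regex, text, flags):
                hits.append((m.start(), m.group(0)))
        for pattern in self.patterns:
            for m in re.finditer(pattern, text, flags):
                hits.append((m.start(), m.group(0)))
        return [value for _, value in sorted(hits)]


# Hypothetical entity: one product name plus a part-number pattern.
product = Entity(
    name="Product",
    terms=["Widget Pro"],
    patterns=[r"\b[A-Z]{2}-\d{4}\b"],
)

sample = "Order the Widget Pro (part AB-1234) or the older XY-0042."
matches = product.find(sample)
print(matches)
```

Terms suit a closed list of known values, while patterns cover values that follow a predictable shape, such as part numbers, that cannot be enumerated in advance.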

The first step in creating an entity is to create an XML file containing its terms, patterns and settings. (Alternatively, a plain text file may be created; see here.)



Copyright © Thunderstone Software     Last updated: Apr 18 2024