Setting Minimum Word Length

Minimum word length is the approximate number of significant characters that the program retains when it processes a word at the morpheme level. Increase it to match the entered search pattern more exactly; decrease it to match more loosely. The smaller the minimum word length, the slower the search, although the difference may be imperceptible.

Vortex scripts, tsql, and the Search Appliance use a default minimum word length of 255, which essentially turns off morpheme stripping, so documents are located by exact matching. To use morpheme processing in a content-oriented, English-smart application, set the APICP minwordlen setting to 5. Years of experience have established this as the best starting point, and we do not advise changing it arbitrarily.

The Morpheme Stripping Routine generally honors the minimum word length: about 90% of the time the rules are followed exactly, and a word is never stripped below the set length. In certain cases, to account for overlapping rules or idiosyncrasies, the routine strips further than the minimum word length, but never by more than 1 character.
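
The effect of the length floor can be pictured with a small Python sketch. This is a toy illustration only, not the actual Morpheme Stripping Routine: the function name strip_suffixes and the SUFFIXES list are hypothetical, the real routine uses its own rule set, and the sketch does not model the occasional 1-character overshoot described above.

```python
# Toy illustration only: NOT Thunderstone's Morpheme Stripping Routine.
# SUFFIXES is a made-up list chosen for the example.
SUFFIXES = ["ing", "ed", "es", "s", "ly"]


def strip_suffixes(word: str, minwordlen: int = 255) -> str:
    """Remove trailing suffixes while at least minwordlen characters remain."""
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            remainder = len(word) - len(suffix)
            # Strip only if the remaining stem respects the length floor.
            if word.endswith(suffix) and remainder >= minwordlen:
                word = word[:remainder]
                stripped = True
                break
    return word


if __name__ == "__main__":
    for length in (255, 5, 3):
        print(length, strip_suffixes("walked", length), strip_suffixes("searching", length))
```

With the floor at 255 nothing is stripped, which mirrors the exact-search default. At 5, "searching" reduces to "search" but "walked" is left intact, because stripping "ed" would leave only 4 characters. Lowering the floor to 3 lets "walked" reduce to "walk", which shows how a smaller minimum word length loosens the match.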


Copyright © Thunderstone Software     Last updated: Apr 15 2024