Limiting Sets to Roots Only

There may be times when you want to limit each set to the root words of your search query only. This is the default in tsql, Vortex and the Search Appliance. If using the MM3 API, or if equivalences have been enabled with keepeqvs (Vortex/tsql) or Synonyms (Search Appliance), then equivalences can be excluded for a particular word, while still retaining morpheme processing, by preceding the word with a tilde (~). Do this where you wish to look for intersections of concepts while holding abstraction to a minimum, and you do not want any automatic expansion of those words into word lists. If equivalences are not currently enabled via keepeqvs/Synonyms, then the ~ operator has the opposite effect: it enables equivalences for that term only. Thus, it always toggles the pre-set behavior for its word.
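The toggling rule described above can be written as a one-line truth table. The following Python sketch is only an illustration of that rule, not part of any Thunderstone API; the function name equivalences_used and its parameters are hypothetical.

```python
def equivalences_used(keepeqvs: bool, has_tilde: bool) -> bool:
    """Return True if a term would be expanded with equivalences.

    keepeqvs  -- hypothetical stand-in for the keepeqvs/Synonyms setting
    has_tilde -- whether the term is preceded by a tilde (~) in the query

    The tilde flips whatever the setting says, so the result is an
    exclusive-or of the two inputs.
    """
    return keepeqvs != has_tilde


# Print the full truth table for the four combinations.
for keepeqvs in (False, True):
    for has_tilde in (False, True):
        print(f"keepeqvs={keepeqvs!s:5}  tilde={has_tilde!s:5}  "
              f"equivalences={equivalences_used(keepeqvs, has_tilde)}")
```

With the setting off (the Vortex/tsql default), only tilde-marked terms are expanded; with it on, only tilde-marked terms are left unexpanded.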

Regardless of settings, you can also give explicit equivalences directly in the query, instead of the thesaurus-provided ones, using parentheses; see here.

Where you wish to use morpheme processing on root words only, restricting concept expansion completely, turn off Equivalence File access altogether by setting the global APICP flag "keepeqvs" off. Where "keepeqvs" is set to off, the tilde (~) is not required (indeed, it would re-enable concept expansion).

Restricting set expansion is useful for proper nouns which you do not want expanded into abstract concepts (e.g., President Bush), technical or legal terminology, or simply any precise discrete search. When equivalences are enabled, selectively cut off the set expansion by placing a tilde (~) before the word you are looking for.

Invoking REX syntax by preceding the word with a forward slash (/) can further delimit the pattern you are looking for, though in a different manner. Note, however, that using non-SPM/PPM pattern matchers such as REX may slow the query, as Metamorph indexes cannot be used for such terms.

To check this out for yourself, in an application where Metamorph hit markup has been set up, compare the results of the following queries (assuming Vortex or tsql defaults):

     Query 1:   President Bush
     Query 2:   ~President ~Bush
     Query 3:   "President Bush"
     Query 4:   /President /Bush
     Query 5:   /\RPresident /\RBush
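
The five queries differ only in the operators attached to each term. As a rough illustration, the Python sketch below splits a query string into terms and labels each one by its leading operator. It is not Metamorph's actual parser: the function name classify_terms and the labels it returns are hypothetical, and it handles only the forms used in the queries above.

```python
import re


def classify_terms(query):
    """Split a query into terms and label each by its leading operator.

    Illustrative only: recognizes quoted phrases, /REX terms,
    ~tilde terms, and plain words.
    """
    terms = []
    # A token is either a double-quoted phrase or a run of non-space characters.
    for token in re.findall(r'"[^"]*"|\S+', query):
        if token.startswith('"'):
            terms.append(("phrase", token.strip('"')))
        elif token.startswith("/"):
            terms.append(("rex", token[1:]))            # REX expression follows the slash
        elif token.startswith("~"):
            terms.append(("toggle-equivs", token[1:]))  # tilde flips the equivalence setting
        else:
            terms.append(("word", token))
    return terms


for q in ['President Bush', '~President ~Bush', '"President Bush"',
          '/President /Bush', r'/\RPresident /\RBush']:
    print(q, "->", classify_terms(q))
```

Note that the quoted phrase comes back as a single term, which mirrors the way Query 3 is treated as one set rather than two.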

Query 1
President Bush: In the first example you would get any hit containing an occurrence of the word "President" and the word "Bush", including other related word forms (suffixes etc.). So you would get a hit like "President Bush came to tea." as well as "Bush attended a conference of corporation presidents." No equivalences are added to the "President" set or the "Bush" set, because equivalences are off by default.

Query 2
~President ~Bush: In the second example you have elected to keep the full set size, so you would obtain references to "President" and "Bush" while also allowing for other abstractions. Since the word "chief" is associated with "president", and the word "jungle" is associated with "bush", you would retrieve a sentence such as, "We met the chief at his home deep in the Amazon jungle."

Query 3
"President Bush": The third example calls for "President" and "Bush" as a two word phrase by putting it in quotes, so that it will be treated as one set rather than as two. It has no equivalences, because the phrase "President Bush" has no equivalences known by the Equivalence File; you could add equivalences to that phrase if you wished by editing the User Equiv File. While you would retrieve the hit "President Bush came to tea.", you would exclude the hit "Bush attended a conference of corporation presidents." You would also get a hit like "We elected a new president Bush.", because words in a query are not case sensitive.

Query 4
/President /Bush: In the fourth example you have limited the root word set in a different way. Signalling REX with the forward slash (/) means that you will use REX to accomplish a string search on whatever comes after the slash. Therefore, you can find "Our president's name is Bush." and "We planted those bushes near the President's house." This search gets results similar to, but not the same as, those of Query 1. Look at exactly what is highlighted by the Metamorph hit markup to see the difference in what was located.

Query 5
/\RPresident /\RBush: In the fifth example there is better reason to use REX syntax, so that you can limit the set even further by specifying proper nouns only. The designation "\R" means to "respect case", and would retrieve the sentence "President Bush came to tea.", but would rule out the sentence "Bush attended a conference of corporation presidents." It would also rule out the hits "We elected a new president Bush." and "We planted those bushes near the President's house."

NOTE: In the previous example, the "respect case" designation (\R) must follow a forward slash (/) which indicates that REX syntax follows. Remember that words in a query are not case sensitive unless you so designate, using REX. (See Chapter on REX syntax.)
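
The difference between Query 4 and Query 5 can be approximated outside Metamorph with ordinary string searches. The Python sketch below uses the standard re module as a stand-in for REX, which is an assumption made purely for illustration: Python regular expressions are not REX, and the helper name contains_all is hypothetical. It treats each example sentence as one hit and requires both terms to appear in it, ignoring case for Query 4 and respecting case for Query 5.

```python
import re

SENTENCES = [
    "President Bush came to tea.",
    "Bush attended a conference of corporation presidents.",
    "We elected a new president Bush.",
    "We planted those bushes near the President's house.",
    "Our president's name is Bush.",
]


def contains_all(sentence, terms, respect_case):
    """True if every term occurs in the sentence as a plain substring."""
    flags = 0 if respect_case else re.IGNORECASE
    return all(re.search(re.escape(term), sentence, flags) for term in terms)


terms = ["President", "Bush"]

# Rough analogue of Query 4: string search, case ignored.
query4_hits = [s for s in SENTENCES if contains_all(s, terms, respect_case=False)]

# Rough analogue of Query 5: string search, case respected.
query5_hits = [s for s in SENTENCES if contains_all(s, terms, respect_case=True)]

print("Query 4 analogue:", query4_hits)
print("Query 5 analogue:", query5_hits)
```

Under these assumptions the case-insensitive search accepts all five sentences, while the case-respecting search keeps only "President Bush came to tea."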


Copyright © Thunderstone Software     Last updated: Apr 15 2024