These settings affect the way that text searches are performed. Setting
one is equivalent to changing the corresponding parameter in the profile,
or to calling the Metamorph API function that sets it (if there is an
equivalent). They are:
- minwordlen
- The smallest a word can get due to suffix and prefix
removal. Removal of trailing vowel or double consonant can make it a
letter shorter than this. Default 255.
- keepnoise
- Whether noise words should be kept in the query and index
instead of being stripped. Default off (noise words are stripped).
- suffixproc
- Whether suffixes should be stripped from the words to
find a match. Default on.
- prefixproc
- Whether prefixes should be stripped from the words to
find a match. Turning this on is not suggested when using a Metamorph
index. Default off.
- rebuild
- Make sure that the word found can be built from the root
and appropriate suffixes and prefixes. This increases the accuracy of
the search. Default on.
- useequiv
- Perform thesaurus lookup. If this is on then the word
and all of its equivalences will be searched for. If it is off then only
the query word is searched for. Default off. Also known as keepeqvs
in version 5.01.1171414736 20070213 and later.
- inc_sdexp
- Include the start delimiter as part of the hit. This is not
generally useful in Texis unless hit offset information is being
retrieved. Default off.
- inc_edexp
- Include the end delimiter as part of the hit. This is not
generally useful in Texis unless hit offset information is being
retrieved. Default on.
- sdexp
- Start delimiter to use: a regular expression to match
the start of a hit. The default is no delimiter.
- edexp
- End delimiter to use: a regular expression to match
the end of a hit. The default is no delimiter.
- intersects
- Default number of intersections in Metamorph
queries; overridden by the @ operator. Added in version
7.06.1530212000 20180628.
- hyphenphrase
- Controls whether a hyphen between words searches
for the phrase of the two words next to each other, or searches for
the hyphen literally. The default value of 1 will search for the two
words as a phrase. Setting it to 0 will search for a single term
including the hyphen. If you anticipate setting hyphenphrase to 0 then
you should modify the index word expression to include hyphens.
- wordc
- For language or wildcard query terms during linear
(non-index) searches, this defines which characters in the document
constitute a word. When a match is found for language/wildcard
terms, the hit is expanded to include all surrounding word
characters, as defined by this setting. The resulting expansion
must then match the query term for the hit to be valid. (This
prevents the query "pond" from inadvertently matching the
text "correspondence", for example.) The value is
specified as a REX character set. The default setting is
[\alpha\'], which corresponds to all letters and apostrophe.
For example, to exclude apostrophe and include digits use:
set wordc='[\alnum]'
Added in version 3.00.942260000. Note
that this setting is for linear searches: what constitutes a word
for Metamorph index searches is controlled by the index
expressions (addexp property).
Also note that non-language, non-wildcard query terms (e.g. 123
with default settings) are not word-expanded.
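The expansion check described for wordc can be modelled outside Texis. The following Python sketch is only an illustration of the rule as stated above, not Texis code: the function name valid_hits is hypothetical, the default set [\alpha\'] is approximated with ASCII letters plus apostrophe, and "must match the query term" is simplified to a case-insensitive equality test (no suffix or wildcard processing).

```python
import re

# Approximation of the default wordc set [\alpha\']: ASCII letters and
# apostrophe. REX's \alpha may cover more; this is an assumption.
WORD_CHARS = "A-Za-z'"


def valid_hits(term: str, text: str, word_chars: str = WORD_CHARS) -> list:
    """Return the expanded hits that still equal the query term."""
    word_char = re.compile(f"[{word_chars}]")
    hits = []
    for m in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        start, end = m.start(), m.end()
        # Expand left and right over all adjacent word characters.
        while start > 0 and word_char.match(text[start - 1]):
            start -= 1
        while end < len(text) and word_char.match(text[end]):
            end += 1
        expanded = text[start:end]
        # The expansion must still match the query term to be a valid hit.
        if expanded.lower() == term.lower():
            hits.append(expanded)
    return hits


print(valid_hits("pond", "correspondence about the pond"))
```

In this model the occurrence of "pond" inside "correspondence" expands to the whole word and is rejected, while the standalone "pond" survives. Passing "A-Za-z0-9" as word_chars mimics the [\alnum] example: "pond2" then expands to "pond2" and no longer matches.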
- langc
- Defines which characters make a query term a language
term. A language term will have prefix/suffix processing applied
(if enabled), as well as force the use of wordc to qualify the
hit (during linear searches). Normally langc should be set
the same as wordc with the addition of the phrase characters
space and hyphen. The default is [\alpha\' \-]. Added in
version 3.00.942260000.
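As a rough illustration of the langc test, the Python sketch below classifies query terms against an approximation of the default set. It is not Texis code: is_language_term is a hypothetical name, the set [\alpha\' \-] is approximated with ASCII letters, apostrophe, space and hyphen, and it assumes a term qualifies only when every one of its characters is in the set.

```python
import re

# Approximation of the default langc set [\alpha\' \-]: ASCII letters,
# apostrophe, space and hyphen (an assumption; REX \alpha may be broader).
DEFAULT_LANGC = re.compile(r"[A-Za-z' \-]+")


def is_language_term(term: str, langc: "re.Pattern" = DEFAULT_LANGC) -> bool:
    """Assumed rule: a term is a language term when every character is in langc."""
    return langc.fullmatch(term) is not None


for term in ("well-known", "don't", "123"):
    print(term, is_language_term(term))
```

Under these assumptions "well-known" and "don't" are language terms, while "123" is not, which is consistent with the note that 123 is not word-expanded with default settings.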
- withinmode
- A space- or comma-separated unit and optional type for the
"within-N" operator (e.g. w/5). The unit is one of:
  - char for within-N characters
  - word for within-N words
The optional type determines what distance the operator measures.
It is one of the following:
  - radius (the default if no type is specified when
    set) indicates all sets must be within a radius N of an
    "anchor" set, i.e. there is a set in the match such that all
    other sets are within N units right of its right edge or N
    units left of its left edge.
  - span indicates all sets must be within an N-unit span.
Added in version 4.04.1077930936 20040227. The optional type was
added in version 5.01.1258712000 20091120; previously the only
type was implicitly radius. In version 5 and earlier the
default setting was char (i.e. char radius); in
version 6 and later the default is word span.
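The difference between the radius and span types can be seen with a small model. The Python sketch below is one reading of the definitions above, not the Texis implementation: each matched set is represented as a hypothetical (start, end) pair in the chosen unit (end exclusive), and the function names within_radius and within_span are invented for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) in chars or words, end exclusive


def within_radius(sets: List[Span], n: int) -> bool:
    """True if some anchor set has every other set within n units of its edges."""
    for i, (a_start, a_end) in enumerate(sets):
        if all(
            s_start >= a_start - n and s_end <= a_end + n
            for j, (s_start, s_end) in enumerate(sets)
            if j != i
        ):
            return True
    return False


def within_span(sets: List[Span], n: int) -> bool:
    """True if all sets fit inside a single n-unit window."""
    return max(e for _, e in sets) - min(s for s, _ in sets) <= n


# Three single-word sets at word positions 0, 4 and 8, tested with N = 5.
sets = [(0, 1), (4, 5), (8, 9)]
print(within_radius(sets, 5), within_span(sets, 5))
```

With the middle set as anchor, both outer sets lie within 5 words of its edges, so the radius test passes; the three sets together cover 9 words, so the span test fails. In this model radius can therefore accept matches spread over roughly twice the distance that span accepts for the same N.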
- phrasewordproc
- Which words of a phrase to do suffix/wildcard processing on. The
possible values are mono to treat the phrase as a
monolithic word (i.e. only last word processed, but entire phrase
counts towards minwordlen); none for no
suffix/wildcard processing on phrases; or last to process just
the last word.
Note that a phrase is multi-word, i.e. a single word in double-quotes
is not considered a phrase, and thus phrasewordproc does not apply.
Added in version 4.03.1082000000 20040414. Mode none
supported in version 5.01.1127760000 20050926.
- mdparmodifyterms
- If nonzero, allows the Metamorph query parser to modify search terms
by compression of whitespace and quoting/unquoting. This is for
back-compatibility with earlier versions; enabling it will break the
information from bit 4 of mminfo() (query offset/lengths of
sets). Added in version 5.01.1220640000 20080905.
Copyright © Thunderstone Software Last updated: Apr 15 2024