During query processing, setmmapi() will call two user callback
functions to perform editing on the query terms. The processing
sequence is as follows:
(*eqedit2)() is the recommended method for implementing on-the-fly
equiv editing because it is easier to use. (*eqedit)() is available
for backwards compatibility.
int (*eqedit2)(APICP *,EQVLST ***):
This function is always called after a successful equiv lookup and
before the search begins. It is called with the current APICP
pointer and a pointer to the list of equivs generated by the query
(see the description of lists). The list pointer may be
reassigned as needed.
The return value from (*eqedit2)() determines whether to go ahead
with the search or not. A return value of 0 means OK, go ahead with
the search. Any other return value means ERROR, don't do the search.
An ERROR return from (*eqedit2)() will then cause setmmapi() or
openmmapi(), depending on where it was called from, to return an
error. A NULL list from (*eqedit2)() is also considered an error.
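As a rough illustration of the return-value convention, the sketch below shows a callback with the (*eqedit2)() signature that refuses queries with too many terms. The APICP and EQVLST declarations are cut-down stand-ins so that the fragment compiles by itself (real code would use the Texis header instead), and limit_terms_eqedit2, count_terms, and MAX_TERMS are hypothetical names. It assumes the list is an array of EQVLST pointers ending in an entry whose words member is NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the real Texis declarations (assumption: real code
 * includes the Texis API header instead of declaring these). */
typedef struct APICP APICP;            /* opaque in this sketch */
typedef struct EQVLST {
    char   logic;
    char **words;
    char **clas;
    int    sz;
    int    used;
    int    qoff;
    int    qlen;
    char  *originalPrefix;
    char **sourceExprs;
} EQVLST;

#define MAX_TERMS 8                    /* hypothetical limit */

/* Count query terms: entries before the terminator (words == NULL). */
static int count_terms(EQVLST **list)
{
    int n = 0;

    assert(list != NULL);
    while (list[n] != NULL && list[n]->words != NULL)
        n++;
    return n;
}

/* Hypothetical callback with the (*eqedit2)() signature. */
int limit_terms_eqedit2(APICP *cp, EQVLST ***listp)
{
    (void)cp;                          /* settings not needed here */
    if (listp == NULL || *listp == NULL)
        return -1;                     /* no list: ERROR */
    if (count_terms(*listp) > MAX_TERMS)
        return -1;                     /* nonzero: don't do search */
    return 0;                          /* OK, go ahead with search */
}
```

Because the callback receives a pointer to the list pointer, an editor that builds a new list would assign it through *listp before returning 0.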
There is one EQVLST for each term in the query. The array of EQVLSTs
is terminated by an EQVLST with the words member set to (char *)NULL
(all other members of the terminator are ignored).
The EQVLST structure contains the following members:
char   logic          : the logic for this set
char **words          : the list of terms including the root term
char **clas           : the list of classes for `words'
int    sz             : the allocated size of the `words' and `clas'
                        arrays
int    used           : the number used (populated) of the `words' and
                        `clas' arrays, including the terminating empty
                        string ("")
int    qoff           : the offset into user's query for this set
                        (-1 if unknown)
int    qlen           : the length in user's query for this set
                        (-1 if unknown)
char  *originalPrefix : set logic/tilde/open-paren/pattern-matcher
char **sourceExprs    : NULL-terminated list of source expressions
                        for set
The words and clas arrays are allocated lists like everything else in
the APICP, and are terminated by empty strings. The sz and used
fields are provided so that editors may manage the lists more
efficiently.
The words and clas lists are parallel. They are exactly the same
length, and for every item words[i], its classification is clas[i].
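The parallel layout can be walked with a single index. The following is a minimal sketch, not Texis code: the EQVLST declaration is a stand-in mirroring the members listed above, count_class is a hypothetical helper, and the loop relies on the empty-string terminator described earlier.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in mirroring the documented members (assumption: real code
 * gets EQVLST from the Texis API header). */
typedef struct EQVLST {
    char   logic;
    char **words;
    char **clas;
    int    sz;
    int    used;
    int    qoff;
    int    qlen;
    char  *originalPrefix;
    char **sourceExprs;
} EQVLST;

/* Hypothetical helper: count the words in one set whose class is cls.
 * words[] and clas[] are parallel, so index i addresses both. */
int count_class(const EQVLST *eq, const char *cls)
{
    int i, n = 0;

    assert(eq != NULL && cls != NULL);
    if (eq->words == NULL || eq->clas == NULL)
        return 0;                      /* terminator entry or no data */
    for (i = 0; eq->words[i][0] != '\0'; i++)   /* "" ends the list */
        if (strcmp(eq->clas[i], cls) == 0)
            n++;
    return n;
}
```

An editor that removes entries would have to delete the same index from both arrays and update used, so that the two lists stay the same length.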
The originalPrefix field (added in Texis version 6) contains the set
logic ("+", "-", "="), tilde ("~"), open-parenthesis, and/or
pattern-matcher characters ("/" for REX, "%" for XPM, "#" for NPM)
present in the original query for this set, if any. It can be used in
reconstructing the original query, e.g. if the terms are to be
modified but set logic etc. should be preserved as given.
The sourceExprs field (added in Texis version 6) contains a list of
the source expressions or terms for the set, i.e. as given in the
original query. For SPM queries, this will be a single word or
phrase. For PPM queries given as parenthetical lists, this will be a
list of the individual terms or phrases. For REX/NPM/XPM queries,
this will be the expression (sans "/", "#", or "%"). For single terms
that are expanded by equivalence lookup, this will be the original
single term, not the expanded list (as words will be), because
sourceExprs is from the source (original query), not
post-equivalence-processing. Note also that sourceExprs is NULL
(not empty-string) terminated.
The sourceExprs array can be used in reconstructing or modifying
queries.
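One possible way to rebuild the text of a single set from originalPrefix and sourceExprs is sketched below. This is an illustration under assumptions rather than the library's own method: the EQVLST declaration is a stand-in, rebuild_set and append are hypothetical helpers, and joining multiple source expressions with commas and closing an open parenthesis found in the prefix are guesses about how a parenthetical list should be written back.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in mirroring the documented members (assumption: real code
 * gets EQVLST from the Texis API header). */
typedef struct EQVLST {
    char   logic;
    char **words;
    char **clas;
    int    sz;
    int    used;
    int    qoff;
    int    qlen;
    char  *originalPrefix;
    char **sourceExprs;
} EQVLST;

/* Append src to the string in buf (capacity bufsz).
 * Returns 0 on success, -1 if it would not fit. */
static int append(char *buf, size_t bufsz, const char *src)
{
    size_t used = strlen(buf), need = strlen(src);

    if (used + need + 1 > bufsz)
        return -1;
    memcpy(buf + used, src, need + 1);
    return 0;
}

/* Hypothetical helper: write prefix + source expressions into buf.
 * Returns 0 on success, -1 if buf is too small. */
int rebuild_set(const EQVLST *eq, char *buf, size_t bufsz)
{
    const char *prefix;
    size_t i;

    assert(eq != NULL && buf != NULL && bufsz > 0);
    buf[0] = '\0';
    prefix = (eq->originalPrefix != NULL) ? eq->originalPrefix : "";
    if (append(buf, bufsz, prefix) != 0)
        return -1;
    if (eq->sourceExprs != NULL) {
        /* sourceExprs is NULL-terminated, not ""-terminated */
        for (i = 0; eq->sourceExprs[i] != NULL; i++) {
            if (i > 0 && append(buf, bufsz, ",") != 0)
                return -1;
            if (append(buf, bufsz, eq->sourceExprs[i]) != 0)
                return -1;
        }
    }
    /* Assumption: an open paren in the prefix needs a closing one. */
    if (strchr(prefix, '(') != NULL && append(buf, bufsz, ")") != 0)
        return -1;
    return 0;
}
```

Note the different loop condition from the words and clas arrays: here the loop stops at a NULL pointer rather than at an empty string.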
The default function is nulleqedit2 in api3.c, which does nothing and
returns 0 for OK.
int (*eqedit)(APICP *):
This function is always called after a successful equiv lookup and
before the search begins. It is called with the current APICP
pointer with the "set" list in the APICP structure set to the list
of equivs generated by the query (see the description of lists).
The return value from (*eqedit)() determines whether to go ahead
with the search or not. A return value of 0 means OK, go ahead with
the search. Any other return value means ERROR, don't do the search.
An ERROR return from (*eqedit)() will then cause setmmapi() or
openmmapi(), depending on where it was called from, to return an
error.
The format of the sets is:
{-|+|=}word[;class][,equiv][...]
Or:
{-|+|=}{/|%99|#}word
Where:
[] surround optional sections.
{} surround required items to be chosen from.
| separates mutually exclusive items between {}.
9 represents a required decimal digit (0-9).
word is the word, phrase, or pattern from the query.
equiv is an equivalent for word.
class is a string representing the classification for the
following words.
... means any amount of the previous item.
Classifications in the default thesaurus (case is significant):
P = Pronoun          c = Conjunction
i = Interjection     m = Modifier
n = Noun             p = Preposition
v = Verb             u = Unknown/Don't care
Words and phrases will be in the first format.
Patterns will be in the second format.
Examples:
=struggle;n,battle,combat,competition,conflict,compete;v,contest,strive
    battle, combat, competition, and conflict are nouns
    compete, contest, and strive are verbs
    struggle can be a noun or verb
=status quo;n,average,normality
    status quo, average, and normality are nouns
+Bush;P
    Bush is a pronoun
-/19\digit{2}
    a REX pattern to find "19" followed by 2 digits
=%80qadafi
    an XPM pattern to find qadafi within 80%
=#>500
    an NPM pattern to find numbers greater than 500
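To make the two formats concrete, here is a small sketch that reads the leading part of a "set" string: the logic character, the kind of set, and where the root word or pattern starts. It is illustrative only. SETINFO and parse_set are hypothetical names, and treating the first ';' or ',' as the end of the root word assumes those characters never occur inside a word.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result record for this sketch. */
typedef struct SETINFO {
    char        logic;    /* '-', '+' or '=' */
    char        kind;     /* 'w' word/phrase, '/' REX, '%' XPM, '#' NPM */
    const char *root;     /* start of root word or pattern text */
    size_t      rootlen;  /* length of root word or pattern text */
} SETINFO;

/* Returns 0 on success, -1 if the string does not fit either format.
 * The contents of *out are unspecified on failure. */
int parse_set(const char *set, SETINFO *out)
{
    const char *p;

    assert(out != NULL);
    if (set == NULL || set[0] == '\0' || strchr("-+=", set[0]) == NULL)
        return -1;                     /* logic character is required */
    out->logic = set[0];
    p = set + 1;
    switch (*p) {
    case '/':                          /* REX pattern */
    case '#':                          /* NPM pattern */
        out->kind = *p++;
        break;
    case '%':                          /* XPM: '%' plus two digits */
        out->kind = *p++;
        if (!isdigit((unsigned char)p[0]) || !isdigit((unsigned char)p[1]))
            return -1;
        p += 2;
        break;
    default:                           /* first format: word or phrase */
        out->kind = 'w';
        break;
    }
    out->root = p;
    /* A word ends at its class (';') or first equiv (','); a pattern
     * runs to the end of the string. */
    out->rootlen = (out->kind == 'w') ? strcspn(p, ";,") : strlen(p);
    return (out->rootlen > 0) ? 0 : -1;
}
```

For "=status quo;n,average,normality" this reports logic '=', kind 'w', and a root of length 10 starting at "status quo".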
Remember that each of the "set" strings is allocated. So if you
replace a set you must free the old one, to prevent a memory leak,
and use an allocated pointer for the replacement, because it will
get freed in closeapicp() unless it is (byte *)NULL.
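The free-then-replace rule might be wrapped in a helper like the one sketched here. Several things are assumptions: byte is taken to be unsigned char, the strings are taken to come from malloc() so that free() is the matching release call, and replace_set is a hypothetical name. How the slot is reached inside the APICP is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned char byte;   /* stand-in; assumed to match the API's byte */

/* Hypothetical helper: replace the set string stored in *slot with an
 * allocated copy of newset. Returns 0 on success, or -1 if allocation
 * fails, in which case the old string is left in place. */
int replace_set(byte **slot, const char *newset)
{
    byte  *copy;
    size_t n;

    assert(slot != NULL && newset != NULL);
    n = strlen(newset) + 1;            /* include the terminating NUL */
    copy = malloc(n);
    if (copy == NULL)
        return -1;
    memcpy(copy, newset, n);
    free(*slot);                       /* free(NULL) is a no-op */
    *slot = copy;                      /* now owned by the set list */
    return 0;
}
```

Allocating the copy before freeing the old string means a failed allocation leaves the set list unchanged and still valid.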
The "set" format must be totally correct for the search process to work.
The default is the function nulleqedit in api3.c, which does nothing
and returns 0 for OK.