Equivalence editing callbacks

During query processing, setmmapi() will call two user callback functions to perform editing on the query terms. The processing sequence is as follows:

  1. parse the query and look up terms in the equiv file.

  2. build eqvlist for eqedit2.

  3. * call (*eqedit2)().

  4. check for empty or NULL list.

  5. check for and remove duplication in set lists.

  6. set intersections if not already set (<0).

  7. build formatted sets for (*eqedit)() from eqvlist.

  8. free eqvlist.

  9. * call (*eqedit)().

  10. perform rest of internal setup.

  11. return to caller.

(*eqedit2)() is the recommended method for implementing on-the-fly equiv editing because it is easier to use. (*eqedit)() is available for backwards compatibility.

int (*eqedit2)(APICP *,EQVLST ***):

This function is always called after a successful equiv lookup and before the search begins. It is called with the current APICP pointer and a pointer to the list of equivs generated by the query (see the description of lists). The list pointer may be reassigned as needed.

The return value from (*eqedit2)() determines whether to go ahead with the search or not. A return value of 0 means OK, go ahead with the search. A return value of anything else means ERROR, don't do search. An ERROR return from (*eqedit2)() will then cause setmmapi() or openmmapi(), depending on where it was called from, to return an error. A NULL list from (*eqedit2)() is also considered an error.

There is one EQVLST for each term in the query. The array of EQVLSTs is terminated by an EQVLST with the words member set to (char *)NULL (all other members of the terminator are ignored). The EQVLST structure contains the following members:

char   logic: the logic for this set
char **words: the list of terms including the root term
char **clas : the list of classes for `words'
int    sz   : the allocated size of the `words' and `clas' arrays
int    used : the number used (populated) of the `words' and `clas'
              arrays, including the terminating empty string ("")
int    qoff : the offset into user's query for this set (-1 if unknown)
int    qlen : the length in user's query for this set (-1 if unknown)
char   *originalPrefix:  set logic/tilde/open-paren/pattern-matcher
char   **sourceExprs: NULL-terminated list of source expressions for set

The words and clas arrays are allocated lists like everything else in the APICP, and are terminated by empty strings. The sz and used fields are provided so that editors may manage the lists more efficiently.

The words and clas lists are parallel. They are exactly the same length and for every item, words[i], its classification is clas[i].
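As an illustration of the layout described above, the following sketch walks an EQVLST array and prints each term next to its class. The structure declaration here is a stand-in copied from the member list above so that the example compiles on its own; real code would use the declaration from the Metamorph API header instead. The function name dump_eqvlist is hypothetical.

```c
#include <stdio.h>
#include <stddef.h>

/* Stand-in declaration, assumed from the member list in this section.
 * Real code should use the library's own header instead. */
typedef struct EQVLST {
    char   logic;
    char **words;
    char **clas;
    int    sz;
    int    used;
    int    qoff;
    int    qlen;
    char  *originalPrefix;
    char **sourceExprs;
} EQVLST;

/* Print every term with its class and return the number of terms seen.
 * The outer loop stops at the terminator EQVLST (words == NULL); the
 * inner loop stops at the terminating empty string. */
static int dump_eqvlist(EQVLST **list, FILE *out)
{
    int total = 0;
    size_t s, i;

    for (s = 0; list[s]->words != NULL; s++) {
        for (i = 0; list[s]->words[i][0] != '\0'; i++) {
            /* words[i] and clas[i] are parallel entries */
            fprintf(out, "set %zu: %s [%s]\n",
                    s, list[s]->words[i], list[s]->clas[i]);
            total++;
        }
    }
    return total;
}
```

Inside an (*eqedit2)() callback this would be called as dump_eqvlist(*listp, stderr), where listp is the EQVLST *** argument.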

The originalPrefix field (added in Texis version 6) contains the set logic ("+", "-", "="), tilde ("~"), open-parenthesis, and/or pattern-matcher characters ("/" for REX, "%" for XPM, "#" for NPM) present in the original query for this set, if any. It can be used in reconstructing the original query, e.g. if the terms are to be modified but set logic etc. should be preserved as given.

The sourceExprs field (added in Texis version 6) contains a list of the source expressions or terms for the set, i.e. as given in the original query. For SPM queries, this will be a single word or phrase. For PPM queries given as parenthetical lists, this will be a list of the individual terms or phrases. For REX/NPM/XPM queries, this will be the expression (sans "/"/"#"/"%"). For single terms that are expanded by equivalence lookup, this will be the original single term, not the expanded list (as words will be) - because sourceExprs is from the source (original query), not post-equivalence-processing. Note also that sourceExprs is NULL (not empty-string) terminated. The sourceExprs array can be used in reconstructing or modifying queries.
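A rough sketch of reconstructing the query text for one set from these two fields follows. It takes the originalPrefix and sourceExprs values directly, so no library header is needed. The function name rebuild_set is hypothetical, and two details are assumptions rather than documented behavior: that list items were separated by commas in the original query, and that a closing parenthesis is wanted whenever the prefix contains an opening one. Quoting of phrases is not attempted.

```c
#include <stdio.h>
#include <string.h>

/* Write prefix + comma-joined source expressions into buf.
 * Returns the length written, or -1 if buf is too small.
 * Assumption: a '(' in the prefix means the set was a parenthetical
 * list, so a matching ')' is appended. */
static int rebuild_set(const char *prefix, char *const *exprs,
                       char *buf, size_t bufsz)
{
    size_t len, i;
    int n;
    int paren = (prefix != NULL && strchr(prefix, '(') != NULL);

    if (bufsz == 0)
        return -1;
    n = snprintf(buf, bufsz, "%s", prefix != NULL ? prefix : "");
    if (n < 0 || (size_t)n >= bufsz)
        return -1;
    len = (size_t)n;

    /* sourceExprs is NULL-terminated, not empty-string terminated */
    for (i = 0; exprs != NULL && exprs[i] != NULL; i++) {
        n = snprintf(buf + len, bufsz - len, "%s%s",
                     i > 0 ? "," : "", exprs[i]);
        if (n < 0 || (size_t)n >= bufsz - len)
            return -1;
        len += (size_t)n;
    }
    if (paren) {
        if (len + 1 >= bufsz)
            return -1;
        buf[len++] = ')';
        buf[len] = '\0';
    }
    return (int)len;
}
```

For a set eq, a call would look like rebuild_set(eq->originalPrefix, eq->sourceExprs, buf, sizeof buf).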

The default function is the function nulleqedit2 in api3.c which does nothing and returns 0 for OK.
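The following is a minimal sketch of a replacement (*eqedit2)() callback that removes every verb equivalent (class "v") from each set while keeping the root term. It is an illustration under stated assumptions, not the library's own code: the APICP and EQVLST declarations are stand-ins so the example compiles alone, the names drop_class and drop_verbs_eqedit2 are hypothetical, and it assumes that each words and clas string was allocated with malloc() so that free() is the right way to release it.

```c
#include <stdlib.h>
#include <string.h>

typedef struct APICP APICP;     /* opaque stand-in for the real APICP */

/* Stand-in declaration, assumed from the member list in this section. */
typedef struct EQVLST {
    char   logic;
    char **words;
    char **clas;
    int    sz;
    int    used;
    int    qoff;
    int    qlen;
    char  *originalPrefix;
    char **sourceExprs;
} EQVLST;

/* Remove all equivalents whose class equals cls, compacting both
 * parallel arrays in place. Slot 0 (the root term) is always kept.
 * Returns the number of entries removed. */
static int drop_class(EQVLST *eq, const char *cls)
{
    int src, dst = 1, removed = 0, i;

    if (eq->words[0][0] == '\0')        /* empty set: nothing to do */
        return 0;
    for (src = 1; eq->words[src][0] != '\0'; src++) {
        if (strcmp(eq->clas[src], cls) == 0) {
            free(eq->words[src]);       /* assumes malloc()ed strings */
            free(eq->clas[src]);
            removed++;
        } else {
            eq->words[dst] = eq->words[src];
            eq->clas[dst]  = eq->clas[src];
            dst++;
        }
    }
    /* src now indexes the terminating empty strings: move them down */
    eq->words[dst] = eq->words[src];
    eq->clas[dst]  = eq->clas[src];
    for (i = dst + 1; i <= src; i++)    /* clear stale slots */
        eq->words[i] = eq->clas[i] = NULL;
    eq->used = dst + 1;                 /* count includes the terminator */
    return removed;
}

/* Callback with the (*eqedit2)() signature: 0 = OK, nonzero = error. */
static int drop_verbs_eqedit2(APICP *cp, EQVLST ***listp)
{
    EQVLST **list;
    size_t s;

    (void)cp;                           /* unused in this sketch */
    if (listp == NULL || *listp == NULL)
        return -1;
    list = *listp;
    for (s = 0; list[s]->words != NULL; s++)
        drop_class(list[s], "v");
    return 0;
}
```

The arrays are compacted rather than reallocated, so sz is left unchanged and only used is updated.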

int (*eqedit)(APICP *):

This function is always called after a successful equiv lookup and before the search begins. It is called with the current APICP pointer with the "set" list in the APICP structure set to the list of equivs generated by the query (see the description of lists).

The return value from (*eqedit)() determines whether to go ahead with the search or not. A return value of 0 means OK, go ahead with the search. A return value of anything else means ERROR, don't do search. An ERROR return from (*eqedit)() will then cause setmmapi() or openmmapi(), depending on where it was called from, to return an error.

The format of the sets is:

{-|+|=}word[;class][,equiv][...]

Or:

{-|+|=}{/|%99|#}word

Where:

[]    surround optional sections.
{}    surround required items to be chosen from.
|     separates mutually exclusive items between {}.
9     represents a required decimal digit (0-9).
word  is the word, phrase, or pattern from the query.
equiv is an equivalent for word.
class is a string representing the classification for the
      following words.
...   means any amount of the previous item.

Classifications in the default thesaurus (case is significant):

P = Pronoun         c = Conjunction
i = Interjection    m = Modifier
n = Noun            p = Preposition
v = Verb            u = Unknown/Don't care

Words and phrases will be in the first format. Patterns will be in the second format.

=struggle;n,battle,combat,competition,conflict,compete;v,contest,strive
     battle, combat, competition, and conflict are nouns
     compete, contest, and strive are verbs
     struggle can be a noun or verb

=status quo;n,average,normality
     status quo, average, and normality are nouns

+Bush;P
     Bush is a pronoun

-/19\digit{2}
     a REX pattern to find "19" followed by 2 digits

=%80qadafi
     an XPM pattern to find qadafi within 80%

=#>500
     an NPM pattern to find numbers greater than 500
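
To make the set format concrete, here is a small read-only sketch that walks one "set" string and prints each word with the class currently in effect. It is an illustration only: the function name walk_set is hypothetical, and it assumes that words never contain a literal comma or semicolon, which this section does not state. A class given after a word is applied to that word and to the following words until the next class appears, which matches the examples above except that the root word may in practice belong to several classes.

```c
#include <stdio.h>
#include <string.h>

/* Print "word<TAB>class" for each item of a word/phrase set.
 * Returns the number of items, 0 for a pattern set (second format),
 * or -1 if the string does not start with a logic character. */
static int walk_set(const char *set, FILE *out)
{
    char cls[32] = "";                  /* class currently in effect */
    const char *p;
    int count = 0;

    if (set == NULL || set[0] == '\0' || strchr("-+=", set[0]) == NULL)
        return -1;
    p = set + 1;
    if (*p == '/' || *p == '%' || *p == '#') {
        fprintf(out, "pattern: %s\n", p);   /* REX, XPM, or NPM */
        return 0;
    }
    while (*p != '\0') {
        size_t toklen = strcspn(p, ",");    /* one word[;class] item */
        const char *semi = memchr(p, ';', toklen);
        size_t wordlen = semi != NULL ? (size_t)(semi - p) : toklen;

        if (semi != NULL) {                 /* a new class starts here */
            size_t clslen = toklen - wordlen - 1;
            if (clslen >= sizeof cls)
                return -1;
            memcpy(cls, semi + 1, clslen);
            cls[clslen] = '\0';
        }
        fprintf(out, "%.*s\t%s\n", (int)wordlen, p, cls);
        count++;
        p += toklen;
        if (*p == ',')
            p++;
    }
    return count;
}
```

A checker like this could be run over each set string before returning from an (*eqedit)() callback to catch obviously malformed replacements.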

Remember that each of the "set" strings is allocated. If you replace a set, you must free the old one to prevent a memory leak, and the replacement must be an allocated pointer, because closeapicp() will free it unless it is (byte *)NULL.
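
A minimal sketch of that replacement rule is shown below. The helper name replace_set is hypothetical, the byte typedef is a stand-in for the library's own, and the sketch assumes the library releases set strings with free(), so that malloc() is a compatible way to allocate the replacement.

```c
#include <stdlib.h>
#include <string.h>

typedef unsigned char byte;     /* stand-in for the library's byte type */

/* Replace the set string held in *slot with an allocated copy of
 * replacement. The old string is freed only after the copy succeeds,
 * so *slot is left untouched on failure.
 * Returns 0 on success, -1 if memory could not be allocated. */
static int replace_set(byte **slot, const char *replacement)
{
    size_t n = strlen(replacement) + 1;
    byte *copy = malloc(n);

    if (copy == NULL)
        return -1;
    memcpy(copy, replacement, n);
    free(*slot);                /* free(NULL) is harmless */
    *slot = copy;
    return 0;
}
```

Within an (*eqedit)() callback, slot would be the address of one entry of the "set" list in the APICP structure, and a nonzero result from the helper would normally be passed back as the callback's ERROR return.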

The "set" format must be totally correct for the search process to work.

The default is the function nulleqedit in api3.c which does nothing and returns 0 for OK.


Copyright © Thunderstone Software     Last updated: Apr 15 2024