Query Protection

The following apicp settings alter the set of query syntax and features that are allowed. Metamorph has a powerful search syntax, but when it is used improperly or inadvertently, poorly constructed queries can take a long time to resolve. In a high-load environment such as a Web search engine this can bog down a server, slowing all users for the sake of one bad search.

Therefore, Vortex is by default highly restrictive of the queries it will allow, denying some specialized features for the sake of quicker resolution of all queries. By altering these settings, script authors can "open up" Texis and Metamorph to allow more powerful searches, at the risk of higher load for special searches.

  • alequivs (boolean, off by default) If on, allows equivalences in queries. If off, only the actual terms in a query will be searched for; no equivalences. This is regardless of ~ usage or the setting of keepeqvs. Note that the equivalence file will still be used to check for phrases in the query, however. Turning this on allows greater search flexibility, as equivalent words to a term can be searched for, but decreases search speed. Note: In tsql version 5 and earlier the default was on.

  • alintersects (boolean, off by default) If on, allow use of the @ (intersections) operator in queries. Queries with few or no intersections (e.g. @0) may be slower, as they can generate a copious number of hits. Note: In tsql version 5 and earlier the default was on.

  • allinear (boolean, off by default) If on, an all-linear query (one without any indexable "anchor" words) is allowed. A query like "/money #million" where all the terms use unindexable pattern matchers (REX, NPM or XPM) is an example. Such a query requires that the entire table be linearly searched, which can be very slow for a table of significant size. Note: In tsql version 5 and earlier the default was on.

    If allinear is off, all queries must have at least one term that can be resolved with the Metamorph index, and a Metamorph index must exist on the field. Under such circumstances, other unindexable terms in the query can generally be resolved quickly, if the "anchor" term limits the linear search to a tiny fraction of the table. The error message "Query would require linear search" may be generated by linear queries if allinear is off.

    Note that an otherwise indexable query like "rocket" may become linear if there is no Metamorph index on its field, or if an index for another part of the SQL query is favored instead by Texis. For example, with the SQL query "select Title from Books where Date > 'May 1998' and Title like 'gardening'" Texis may use a Date index rather than a Title Metamorph index for speed. In such a case it may be necessary to enable linear processing for a complicated query to proceed, since part of the table is being linearly searched.
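
    The anchor rule described above can be sketched as a small Python model. This is only an illustration, not Texis or Metamorph code: the function name is hypothetical, the "%" prefix for XPM is an assumption (this section shows only "/" and "#" terms), and set-logic operators and index selection by the SQL optimizer are ignored.

```python
# Illustrative model only; this is not Texis or Metamorph source code.
# Assumption: a term is unindexable when it starts with a pattern-matcher
# prefix: / (REX), # (NPM) or % (XPM, assumed here).
UNINDEXABLE_PREFIXES = ("/", "#", "%")


def is_all_linear(query: str, has_metamorph_index: bool = True) -> bool:
    """Return True if no term in the query could serve as an index anchor."""
    if not has_metamorph_index:
        # Without a Metamorph index on the field, nothing can be anchored.
        return True
    terms = query.split()
    # The query is all-linear when every term is an unindexable pattern.
    return not any(not t.startswith(UNINDEXABLE_PREFIXES) for t in terms)
```

    In this model "/money #million" is all-linear, while "rocket /money" is not, because "rocket" can act as the anchor.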

  • alnot (boolean, on by default) If on, allows "NOT" logic (e.g. the - operator) in a query.

  • alpostproc (boolean, off by default) If on, post-processing of queries is allowed when needed after an index lookup, e.g. to resolve unindexable terms like REX expressions, or like queries with a non-inverted Metamorph index. If off, some queries are faster, but may not be as accurate if they aren't completely resolved. The error message "Query would require post-processing" may be generated by such queries if alpostproc is off. Note: In tsql version 5 and earlier the default was on.

  • alwild (boolean, on by default) If on, wildcards are allowed in queries. Wildcards can slow searches because potentially many words must be looked for.

  • alwithin (boolean, off by default) If on, "within" operators (w/) are allowed. These generally require a post-process to resolve, and hence can slow searches. If off, the error message "'delimiters' not allowed in query" will be generated if the within operator is used in a query. Note: In tsql version 5 and earlier the default was on.

  • builtindefaults Restore all settings to builtin Thunderstone factory defaults, ignoring any texis.ini [Apicp] changes. Added in Texis version 6.

  • defaults Restore all settings to the defaults set in the texis.ini [Apicp] section (or builtin defaults for settings not set there).

  • denymode (string or integer; warning by default) What action to take when a disallowed query is attempted:

    • silent or 0 Silently remove the offending set or operation.

    • warning or 1 Remove the term and warn about it with a putmsg-catchable message.

    • error or 2 Fail the query.

    A message such as "'delimiters' not allowed in query" may be generated when a disallowed query is attempted and denymode is not silent.
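
    The three denymode actions can be modeled roughly as follows. This is a hedged Python sketch of the behavior listed above, not a Texis API: apply_denymode, QueryDenied and the message format are hypothetical names chosen for illustration.

```python
# Illustrative model only; names here are hypothetical, not a Texis API.
DENYMODES = {"silent": 0, "warning": 1, "error": 2}


class QueryDenied(Exception):
    """Raised by this sketch when denymode is error."""


def apply_denymode(terms, disallowed, denymode="warning"):
    """Return (kept_terms, messages) for a list of query terms."""
    mode = DENYMODES.get(denymode, denymode)  # accept a name or an integer
    if mode not in (0, 1, 2):
        raise ValueError(f"unknown denymode: {denymode!r}")
    kept, messages = [], []
    for term in terms:
        if term not in disallowed:
            kept.append(term)
            continue
        message = f"'{term}' not allowed in query"
        if mode == 2:
            raise QueryDenied(message)  # error: fail the whole query
        if mode == 1:
            messages.append(message)    # warning: drop the term, report it
        # silent (0): drop the term without any message
    return kept, messages
```

    With the default (warning), a disallowed term is removed and one message is collected; with silent, the same term is removed and no message is produced.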

  • qmaxsets (integer, 100 by default) The maximum number of sets (terms) allowed in a query. Added in version 2.6.934800000 19990816. Note: also settable as qmaxterms for back-compatibility with earlier versions.

  • qmaxsetwords (integer, 500 by default, unlimited by default in tsql) The maximum number of search words allowed per set (term), after equivalence and wildcard expansion. Some wildcard searches can potentially match thousands of distinct words in an index, many of which may be garbage or typos but still have to be looked up, slowing a query. If this limit is exceeded, a message such as "Max words per set exceeded at word `xyz*' in query `xyz* abc'" is generated, and the entire set is considered a noise word and not looked up in the index. A value of 0 means unlimited. Added in version 2.6.934900000 19990817.

    In version 3.0.947600000 20000110 and later, the set may only be partially dropped (with the message "Partially dropping term `xyz*' in query `xyz* abc'") depending on the setting of dropwordmode (which must be set with a SQL set statement). If dropwordmode is 0 (the default), the root word, valid suffixes, and more-common words are still searched, up to the qmaxsetwords limit if possible; the remaining wildcard matches are dropped. If dropwordmode is 1, the entire set is dropped as if a noise word.

    Note that qmaxsetwords is the max number of search words, not the number of matching hits after the search. Thus a single but often-occurring word like "html" counts as one word in this context. Note: In tsql version 5 and earlier the default was unlimited.

  • qmaxwords (integer, 1100 by default) The maximum number of words allowed in the entire query, after equivalence and wildcard expansion. If this limit is exceeded, a message such as "Max words per query exceeded at word `xyz*' in query `xyz* abc'" is generated, and the query cannot be resolved. 0 means unlimited. Added in version 2.6.934900000 19990817. Like qmaxsetwords, this is distinct search words, not hits. dropwordmode also applies here. Note: In tsql version 5 and earlier the default was unlimited.
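
    The difference between the per-set limit (qmaxsetwords) and the whole-query limit (qmaxwords) can be illustrated with a small Python model. It is a sketch under stated assumptions, not Texis code: check_word_limits is a hypothetical name, the expansion lists are supplied by the caller, and both counts are taken before any terms are dropped, since the actual order of checks is not stated here.

```python
# Illustrative model only; not Texis code. A limit of 0 means unlimited.
def check_word_limits(expanded_sets, qmaxsetwords=500, qmaxwords=1100):
    """Check distinct search words per set and per query.

    expanded_sets maps each query term to the list of search words it
    expands to (after equivalence and wildcard expansion).
    Returns (sets_over_limit, query_over_limit).
    """
    sets_over_limit = [
        term for term, words in expanded_sets.items()
        if qmaxsetwords and len(set(words)) > qmaxsetwords
    ]
    # Count distinct search words, not hits: "html" is one word no matter
    # how often it occurs in the table.
    total = sum(len(set(words)) for words in expanded_sets.values())
    query_over_limit = bool(qmaxwords) and total > qmaxwords
    return sets_over_limit, query_over_limit
```

    For example, a wildcard term that expands to 600 words exceeds the default per-set limit of 500, but together with a one-word term it still stays under the default query limit of 1100.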

  • qminprelen (integer, 2 by default) The minimum allowed length of the prefix (non-* part) of a wildcard term. Short prefixes (e.g. "a*") may match many words and thus slow the search. Note: In tsql version 5 and earlier the default was 1.

  • qminwordlen (integer, 2 by default) The minimum allowed length of a word in a query. Note that this is different from minwordlen, the minimum word length for prefix/suffix processing to occur. Note: In tsql version 5 and earlier the default was 1.
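
    The two minimum-length checks (qminprelen and qminwordlen) can be sketched in Python as below. The function names are hypothetical, and the model assumes the prefix is everything before the first "*"; how Texis treats wildcards elsewhere in a term is not covered here.

```python
# Illustrative model only; not Texis code.
def wildcard_prefix_ok(term: str, qminprelen: int = 2) -> bool:
    """True if the part of the term before the first * is long enough."""
    if "*" not in term:
        return True  # not a wildcard term, so qminprelen does not apply
    prefix = term.split("*", 1)[0]
    return len(prefix) >= qminprelen


def word_length_ok(word: str, qminwordlen: int = 2) -> bool:
    """True if a plain (non-wildcard) query word meets the minimum length."""
    return len(word) >= qminwordlen
```

    Under the default of 2, "a*" is rejected and "ab*" is accepted; with the version 5 value of 1, "a*" would be accepted as well.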

  • querysettings (string or integer) Container for changing all or a group of settings to a certain mode. The argument may be one of the following:

    • defaults or 0 Set Vortex defaults (with texis.ini [Apicp] overrides); same as <apicp defaults>.

    • texis5defaults or 1

      Set Texis (i.e. tsql not Vortex) version 5 and earlier defaults (with texis.ini [Apicp] overrides). Some of these defaults are in common with Texis 6 and later:

      • alprefixproc, keepnoise, keepeqvs are off

      • alwild, alnot are on

      • minwordlen 255

      • sdexp/edexp are empty

      • eqprefix set to "builtin"

      • ueqprefix set to "eqvsusr"

      • denymode is "warning"

      • qmaxsets is 100

      The rest are different from Texis 6 and later:

      • alpostproc, allinear, alwithin, alintersects, alequivs, alexactphrase are on (instead of off in version 6)

      • qminwordlen, qminprelen are 1 (instead of 2 in version 6)

      • qmaxsetwords is unlimited (instead of 500 in version 6)

      • qmaxwords is unlimited (instead of 1100 in version 6)

    • vortexdefaults or 2 Set Vortex defaults (with texis.ini [Apicp] overrides); same as <apicp defaults>.

    • protectionoff or 3

      Turn off query protection settings, i.e. set all al... settings on (allowed), exactphrase on, qmin... limits to minimums, qmax... limits to maximum (unlimited), denymode to warning. Any texis.ini [Apicp] values for these settings are ignored.

    Added in Texis version 6.

  • texisdefaults Restore Texis (as opposed to Vortex) version 5 and earlier default values. Note: This setting is deprecated in Texis version 6 and later (as Texis defaults have changed to match Vortex defaults for consistency), and may be removed in a future release. Set querysettings texis5defaults instead. The texisdefaults setting is still respected, but will cause a warning noting that it is deprecated. If legacy scripts cannot be updated to use querysettings texis5defaults instead, this warning can be silenced with the texis.ini setting [Texis] Texis Defaults Warning = off.

    Setting texisdefaults turns off query protection, e.g. it will enable linear searches, post-processing, within operators, etc. Note: this will permit some queries to run that can potentially take an inordinate amount of time, even with a Metamorph index. Use with caution.


Copyright © Thunderstone Software     Last updated: Apr 15 2024