rex, split - regular expression search

 

SYNOPSIS

<rex $exprs $data[ /]>               <split $exprs $data[ /]>
  or                                   or
<rex [options] $exprs $data>         <split [options] $exprs $data>
  ...                                  ...
</rex>                               </split>


DESCRIPTION
The rex function searches for each REX expression in $exprs in each value of $data. The split function behaves the same way, except that it returns the non-matching data from $data (i.e. it acts as if the SPLIT option below were given). The return type is varbyte if $data is of type varbyte or byte; otherwise it is varchar.

If given any options (except SYNTAX), rex and split are looping; otherwise, they are non-looping.

When non-looping, rex and split return a list of the matching (or non-matching) hits from $data in $ret. In addition, the variable $ret.off contains the integer byte offsets into the current search buffer where the hits start.
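As an illustration of the non-looping form, here is a minimal sketch. The sample data and the expression "[0-9]+" are hypothetical, and reading $ret.off after a non-looping call assumes version 6.00 or later:

```vortex
<script language=vortex>
<a name=main>
  <!-- Hypothetical sample data; any string works here. -->
  <$data = "order 123 shipped, order 4567 pending">
  <!-- No options are given, so this call is non-looping:
       all hits are returned in $ret at once. -->
  <rex "[0-9]+" $data>
  <!-- Walk the hits and their byte offsets in parallel. -->
  <loop $ret $ret.off>
    <fmt "hit '%s' at offset %d\n" $ret $ret.off>
  </loop>
</a>
</script>
```

Replacing rex with split in the same call would return the text between the numbers instead of the numbers themselves.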

When looping, however, hits (and offsets) are returned one at a time, one per iteration, and $loop and $next are also set as in SQL ($loop always starts at 0). Any statements inside the block are executed once per returned hit. The loop can be exited with BREAK or RETURN.
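The looping form might be sketched as follows. The data, the expression, and the early-exit condition are all hypothetical; the ROW option (described under Options) is used here only because any option other than SYNTAX makes the call looping:

```vortex
<script language=vortex>
<a name=main>
  <!-- Hypothetical sample data. -->
  <$data = "order 123 shipped, order 4567 pending">
  <!-- ROW is an option, so the block below runs once per hit. -->
  <rex ROW "[0-9]+" $data>
    <fmt "iteration %d: '%s' at offset %d\n" $loop $ret $ret.off>
    <!-- Leave the loop early once a particular hit is seen. -->
    <if $ret eq "123"><break></if>
  </rex>
</a>
</script>
```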

The looping syntax was added in version 2.6.938200000 19990924. $ret.off was added in version 3.01.966500000 20000816, and has also been supported for the non-looping syntax since version 6.00.1355622000 20121215.

Options are:

  • ROW

    As in SQL, ROW indicates that values do not accumulate in $ret and that $ret should not be a loop variable; each new value erases the previous one. ROW should be used in a looping rex/split when a large number of return values are expected but they only need to be examined one at a time; this saves memory and time, since the hits do not all have to be stored in memory. ROW should also be used when functions are called within the block, because otherwise $ret is a loop variable, which hinders multi-value returns.

  • SKIP=$n

    Skip the first $n hits when returning values. This does not affect the value of $loop.

  • MAX=$n

    Return at most $n hits.

  • SPLIT

    Instead of returning the hit data, return non-matching data, i.e. the parts of $data outside the hits. The REX expressions in effect become delimiters for the data returned. This is similar to the command-line rex option -v (except there are no delimiters as with command-line rex). This is the default for the split command.

  • NONEMPTY

    Ignore empty (zero-length) return values. This is useful with SPLIT when empty values are not significant.

  • SYNTAX=re2|rex

    The $exprs syntax is RE2 or REX; the default is REX. The expression syntax may also be changed by prefixing the expression with "\<re2\>" or "\<rex\>". Added in version 7.06. RE2 expressions are not supported on all platforms; use <vxinfo features> or the SQL function hasFeature() to determine whether RE2 is supported on the current platform (as shown by texis -platform). Windows and most Linux 2.6 versions, except i686-unknown-linux2.6.17-64-32, are supported. Using an RE2 expression on an unsupported platform results in the error message "REX: RE2 not supported on this platform". Unlike the other options, this option does not imply looping in version 7 Vortex syntax.
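The options above can be combined, as in the following sketch. The data and expressions are hypothetical, literal values are assumed to be accepted for SKIP and MAX in place of $n, and the last call assumes a platform where RE2 is supported:

```vortex
<script language=vortex>
<a name=main>
  <!-- Hypothetical data with an empty field between two commas. -->
  <$data = "alpha,,beta,gamma,delta">

  <!-- split returns the text between delimiter hits;
       NONEMPTY drops the zero-length field. -->
  <split ROW NONEMPTY "[,]=" $data>
    <fmt "field: %s\n" $ret>
  </split>

  <!-- Skip the first hit, then return at most two. -->
  <rex ROW SKIP=1 MAX=2 "[a-z]+" $data>
    <fmt "hit: %s\n" $ret>
  </rex>

  <!-- SYNTAX alone does not imply looping, so the hits
       are collected in $ret and walked afterwards. -->
  <rex SYNTAX=re2 "alpha|delta" $data>
  <loop $ret>
    <fmt "re2 hit: %s\n" $ret>
  </loop>
</a>
</script>
```

ROW is used in the looping calls because each value is only examined once and does not need to accumulate in $ret.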


DIAGNOSTICS
rex returns a list of the matching hits from $data. split returns a list of the non-matching data. The corresponding byte offsets into the current search item are returned in $ret.off as well.



Copyright © Thunderstone Software     Last updated: Aug 4 2020