SYNOPSIS<xtree [XTREE=$treename] [options] $command [$values] [$treename][ /]>
[ ...
</xtree>]
DESCRIPTION
The <xtree>
function is a fast way to maintain a sorted list of
temporary
string values in memory - faster than using a SQL table. Values can
be added, removed, counted and deleted from a tree, named with
$treename (default tree name is empty). The $values
parameter is the list of values to operate on, for INSERT
,
DELETE
and SEARCH
; for SET
it is the option(s) to
set. Otherwise the $values parameter is not given, or is empty
if $treename is specified via the last-argument syntax.
Each value in an xtree has a count and a sequence number associated
with it. The count is the number of times the value was inserted.
The sequence number is the order in which the value was first
inserted, starting with 0 for the first value inserted. These values
are retrieved either with the COUNT
and SEQ
commands, or
in the $ret.count and $ret.seq variables.
If the tag is self-closing ("<xtree ... />"), it returns
values in $ret, $ret.count, and $ret.seq as a
simple function. If a close tag is given ("</xtree>"), it is
a looping statement (and the same variables are set and looped over,
plus $loop and $next). Note that this is version 8 and
later syntax; the syntaxversion
pragma
(here) can revert to version
7 and earlier syntax, where looping syntax (and $ret.count/$ret.seq setting) is based on presence of an option other than
XTREE
.
In version 8.00.1645136290 20220217 and later, the self-closing syntax
also sets $loop and $next.
The variables $ret.count and $ret.seq, when set, are the corresponding count and sequence numbers for $ret. If no count/sequence numbers are appropriate, these variables are cleared (in version 8 syntax).
When <xtree>
is a looping statement (or self-closing if version
8.00.1645136290 20220217 or later), $loop/$next are set
as in SQL
($loop starts at 0). If looping, any
statements inside the block are executed once per returned value, with
$ret being a loop variable accumulating values in version 7.
The loop can be exited with BREAK
or RETURN
. The loop
syntax was added in version 2.6.938200000 19990924.
Options are:
ROW
Note that in version 8 and later syntax - i.e. when the
syntaxversion
pragma (here) is
8 or more - return values never accumulate in looping statements.
Thus the ROW
flag is unneeded, and not accepted. It is
only valid in version 7 and earlier syntax.
In version 7 and earlier syntax, the ROW
flag indicates
that $ret/$ret.count/$ret.seq should not
accumulate into arrays, nor be entered into loop contexts; each
new value erases the previous. ROW
should be used in a
looping <xtree>
when a large number of return values are
expected but only need to be examined one at a time; this saves
memory and time since all the values do not have to be stored in
memory. ROW
should also be used when functions are called
within the block, because otherwise $ret is a loop
variable, hindering multi-value returns (see -warn-ret-loop, here).
ROW
should also be used when the COUNT
and
SEQ
commands are not to be used, as it saves the memory of
preserving those values for a later COUNT
or SEQ
.
SKIP=$n
Skip the first $n
values. This does not affect the value
of $loop.MAX=$n
Return at most $n
values.XTREE=$treename
Use tree named $treename instead of the default. This
option was added in version 5.01.1217380000 20080729. The old
syntax was to give the tree name as the last argument, with no
option name, which can be confusing. Only one syntax -
XTREE
option or last-unnamed-argument - may be used
in a statement.
The $command
argument determines what <xtree>
does,
whether it takes a $values argument or not, and what values, if
any, are returned. It is one of the following commands:
CLEAR
Clear the tree. All counts are set to 0, but the tree and its
values remain in memory. $values need not be set. No
values are returned.DUMP
Dump the tree. The return value $ret is a unique, sorted
list of the values in the tree that have non-zero counts, in
ascending order.
In the looping syntax,
or in version 8 and later syntax (syntaxversion
pragma, here, is 8 or more),
$ret.count and
$ret.seq are set to the corresponding count and sequence
numbers; in the non-looping syntax for version 7 and earlier,
these values must be retrieved with
separate COUNT
and SEQ
commands afterwards.
If the CLEAR
option (not command) is set with SET
,
each value is also cleared as it is returned. $values need
not be set.COUNT
Dump the value counts from the immediately previous non-looping
DUMP
, SEARCH
or INSERT
command on the
same tree. $ret will contain a list of the number of
times each value from the operation occurs in the tree. For the
looping syntax, nothing is returned - the counts and sequences
are available during the DUMP
/SEARCH
/INSERT
in
$ret.count and $ret.seq. $values need not be
set.SEQ
Returns the corresponding sequence numbers (starting at 0), from
the immediately previous non-looping DUMP
, SEARCH
or
INSERT
on the same tree. These indicate the order in
which the strings were originally added. For the looping syntax,
nothing is returned - the counts and sequences are available
during the DUMP
/SEARCH
/INSERT
in
$ret.count and $ret.seq. $values need not be
set.INSERT
Adds the $values to the tree, if not already present, and
increment their counts. If the NOADD
option is set
(below), new values are not added; only already-present values are
incremented. In versions after 3.01.963400000 20000712, the
inserted values - and counts and sequence numbers, if looping
or version 8 and later syntax
(the syntaxversion
pragma,
here, is 8 or more)
- are returned. This
saves an extra SEARCH
to determine the numbers right after
an INSERT
.DELETE
Deletes the $values from the tree, i.e. sets their counts to
0. No values are returned in $ret.SEARCH
Returns a list of the $values that are in the tree. Counts
and sequences are available as per DUMP
and INSERT
.SET
Set the option(s) given in $values. In version
6.00.1305253000 20110512 and later, 1 is returned on success, 0 on
failure; previous versions always returned empty string.
Looping version iterates 0 times in version 7 and earlier, or once per $values in version 8 and later syntax (i.e. the
syntaxversion
pragma,
here, is 8 or more). Options:
noadd
Do not add new values to the tree during an INSERT
.
Inserting a previously-cleared value will make it
"reappear", because it was present before but now has a
non-zero count. But wholly-new values will not be inserted.
clear
Also clear the tree during a DUMP
.stringcomparemode $mode
Set string compare mode - how values are compared - for the
tree to $mode
. The syntax is the same as the
apicp
setting stringcomparemode
(here). The default (and the
meaning of values default
or builtin
) is the
current apicp
stringcomparemode
. This option
must be set before any data is inserted into the tree,
as it would alter the sort order. Added in version 6.storefolded $onOff
Set the store-folded flag for the tree to $onOff
(i.e. "on" or "off"). Enabling store-folded
increases speed slightly, at the expense of not retaining
inserted values exactly.
If storefolded
is on, values are folded (according to
stringcomparemode
) once, at insert time, and stored
folded. This increases speed because only one folding is
needed. However, values returned from DUMP
etc. will
be the folded value, not the originally-inserted value.
E.g. if stringcomparemode
is set to ignorecase
,
the inserted value "McDuff
" will be DUMP
ed as
"mcduff
".
If storefolded
is off (the default), values are folded
whenever needed for comparisons, and stored exactly as
inserted. This decreases speed because potentially more than
one folding is needed. However, the originally-inserted value
is preserved, i.e. if "McDuff
" is inserted in an
ignorecase
tree, it will be DUMP
ed as
"McDuff
".
In either case, stringcomparemode
is respected for
searches, inserts etc.; just the value of $ret is
affected. This option must be set before any data is
inserted into the tree, as it would alter the sort order.
Added in version 6.
defaults
Restores defaults: noadd
, clear
and
storefolded
are off; stringcomparemode
is the
same as the <apicp>
value. Note that since the last
two settings can only be set on an empty tree, defaults can
only be fully restored on an empty tree as well. In version
6.00.1305312600 20110513 and earlier, stringcomparemode
and storefolded
mode were not restored at all when
setting defaults.
GET
Gets the option(s) listed in $values. Added in version 6.00.1305253000 20110512. Options and what they return are:
noadd
1 if not adding new values to the tree during an INSERT
;
0 if adding.
clear
1 if clearing the tree during a DUMP
, 0 if not.
stringcomparemode
The current stringcomparemode
for the tree.
storefolded
1 if storing values folded, 0 if not.
numitems
The number of items in the tree, i.e. the number of non-zero
count items (returned by DUMP
). It is much faster
(constant-time) to obtain this value via get numitems
than by DUMP
ing the tree and checking $loop.
numallocateditems
The number of allocated items in the tree, i.e. including zero-count (cleared) items.
memused
The number of bytes of memory used by the tree.
FLUSH
Remove the tree and free its memory. Nothing is returned.
DIAGNOSTICS
The <xtree>
function returns either nothing, a list of strings,
or a list of integer counts/sequences. If invoked as a looping command,
or in version 8 with the syntaxversion
pragma set to 8 or more,
$ret.count and $ret.seq may be set as well.
EXAMPLE<$txt = "senselessness" "this" "test" "finesse">
<rex "." $txt> <!-- split the words into letters -->
<xtree insert $ret />
<xtree dump>
$ret $ret.count
</xtree>
The output would be a count of how many times each letter occurs among all the words:
e 7
f 1
h 1
i 2
l 1
n 3
s 10
t 3
CAVEATS
The <xtree>
function was added Apr. 29 1997. The loop syntax
was added in version 2.6.938200000 19990924, as was support for
full binary data.
Note that $loop, $next, $ret.count and $ret.seq are only set in the looping version in version 7 syntax.
It is not possible to modify an xtree while inside an <xtree>
loop for the same tree, nor is it possible to start another
DUMP
inside the loop. An error message such as "Cannot
flush xtree: still in use" or "Cannot insert into xtree while
walking it" may be generated in such cases.
In long-running scripts, trees should be FLUSH
ed when done,
to save memory. Since they are kept only in memory, they will be lost
when the script exits. ROW
should be used whenever possible
to save memory and speed.
In version 6 and later, string comparisons use the current apicp stringcomparemode setting (here), changeable with the set stringcomparemode tree option.
The self-closing syntax (<xtree ... />) was added in version 7.
In version 8 and later, the syntaxversion
pragma
(here) affects this statement's
syntax and behavior (especially the ROW
flag).
SEE ALSOsort
, uniq