10.4 Fast Value Lookup - xtree
The <xtree> function in Vortex is an extremely fast way to maintain a temporary list of strings and search it quickly. Each value is stored along with the number of times it was inserted, which makes <xtree> useful for histogram operations.
For example, we could find out the most common words in a large chunk of English text like this:
<SCRIPT LANGUAGE=vortex>
<A NAME=main PUBLIC>
  <FORM METHOD=post ACTION=$url/search.html>
    Text:<BR>
    <TEXTAREA NAME=text ROWS=10 COLS=60>$text</TEXTAREA><BR>
    <INPUT TYPE=submit>
  </FORM>
</A>

<A NAME=search PUBLIC>
  <main>
  <lower $text>
  <rex ROW "\alnum+" $ret>
    <xtree INSERT $ret>
  </rex>
  <xtree SKIP=0 DUMP></xtree>
  <sort $ret.count DESC $ret>
  Top 10 words are: <P>
  <LOOP MAX=10 $ret $ret.count>
    <B>$ret</B> occurred $ret.count times <BR>
  </LOOP>
</A>
</SCRIPT>
Here we ask for a chunk of text, lower-case it for case insensitivity, and use <rex> in a loop to pull out every word, one at a time. We insert each word into <xtree>, which stores only the unique words and counts the duplicates.
Then we DUMP the entire set of unique words, along with their corresponding counts - <xtree> sets the special variables $ret.count and $ret.seq in addition to the usual $ret. The dump returns the words in sorted order; we want to know the most frequent, so we <sort> the list by descending count. Then the top 10 are listed.
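For readers more familiar with general-purpose languages, the same insert, dump, and sort steps can be sketched in Python. This is only a rough analogue, not Vortex code: the function name top_words is hypothetical, the pattern [a-z0-9]+ is an ASCII-only stand-in for the \alnum+ expression, and the alphabetical ordering of ties is an assumption of the sketch rather than documented <sort> behavior.

```python
import re
from collections import Counter


def top_words(text, limit=10):
    """Return up to `limit` (word, count) pairs, most frequent first.

    Hypothetical helper for illustration; it mirrors the steps of the
    Vortex example but is not part of Vortex.
    """
    # Lower-case the text, then pull out each run of letters and digits
    # (an ASCII approximation of the \alnum+ expression).
    words = re.findall(r"[a-z0-9]+", text.lower())

    # Like inserting into the tree: keep each unique word once,
    # with a count of how many times it was seen.
    counts = Counter(words)

    # Like the dump: unique words in sorted order with their counts.
    dumped = sorted(counts.items())

    # Re-sort by count, highest first. Python's sort is stable, so words
    # with equal counts stay in alphabetical order (an assumption here).
    by_count = sorted(dumped, key=lambda pair: pair[1], reverse=True)
    return by_count[:limit]


if __name__ == "__main__":
    sample = "The cat and the dog and THE bird"
    for word, count in top_words(sample):
        print(f"{word} occurred {count} times")
```

The sketch keeps the dump and the frequency sort as separate steps to match the Vortex script; in Python alone, Counter.most_common would give a similar result in one call.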
If we run this example with the Gettysburg Address, we see (next page):