10.4 Fast Value Lookup - xtree

The <xtree> function in Vortex is an extremely fast way to maintain a temporary list of strings and to search that list quickly. Each distinct value is stored together with the number of times it was inserted, which makes <xtree> useful for histogram operations.

For example, we could find the most common words in a large chunk of English text like this:


  <SCRIPT LANGUAGE=vortex>
  
  <A NAME=main PUBLIC>
    <FORM METHOD=post ACTION=$url/search.html>
      Text:<BR>
      <TEXTAREA NAME=text ROWS=10 COLS=60>$text</TEXTAREA><BR>
      <INPUT TYPE=submit>
    </FORM>
  </A>
  
  <A NAME=search PUBLIC>
    <main>
    <lower $text>
    <rex ROW "\alnum+" $ret>
      <xtree INSERT $ret>
    </rex>
    <xtree SKIP=0 DUMP></xtree>
    <sort $ret.count DESC $ret>
    Top 10 words are: <P>
    <LOOP MAX=10 $ret $ret.count>
      <B>$ret</B> occurred $ret.count times <BR>
    </LOOP>
  </A>
  
  </SCRIPT>

Here we ask for a chunk of text, lower-case it so that counting is case-insensitive, and use <rex> in a loop to pull out every word, one at a time. We insert each word into <xtree>, which stores only the unique words and counts the duplicates.
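To make the counting step concrete, here is a rough analogy in Python rather than Vortex. It is only a sketch of the data flow, not how <xtree> is implemented: the function name count_words is hypothetical, a plain dictionary stands in for the tree, and the pattern [a-z0-9]+ is an assumed ASCII-only approximation of the \alnum+ expression used in the script.

```python
import re

def count_words(text):
    """Return a dict mapping each distinct lower-cased word to its count."""
    counts = {}
    # Lower-case first so "The" and "the" are counted as the same word.
    # ASSUMPTION: [a-z0-9]+ approximates the script's \alnum+ for ASCII text.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Inserting a word that is already present only bumps its count.
        counts[word] = counts.get(word, 0) + 1
    return counts

sample = "The cat and the dog and THE bird"
counts = count_words(sample)
print(counts["the"], counts["and"], len(counts))
```

As in the script, a repeated word does not create a new entry; it only raises the count of the existing one.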

After all the words are inserted, we DUMP the entire set of unique words along with their corresponding counts: <xtree> sets the special variables $ret.count and $ret.seq in addition to the usual $ret. The dump returns the words in sorted order; we want to know the most frequent, so we <sort> the list by descending count. Then the top 10 are listed.
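The dump-then-sort step can be sketched the same way. Again this is Python, not Vortex, and only an analogy: top_words and its limit parameter are hypothetical names, and the way ties are ordered here reflects Python's stable sort, not a documented property of <sort>.

```python
def top_words(counts, limit=10):
    """Return up to `limit` (word, count) pairs, most frequent first."""
    # Dump stage: every distinct word with its count, in sorted word order.
    dumped = sorted(counts.items())
    # Sort stage: reorder by count, highest first. Python's sort is stable,
    # so in this sketch words with equal counts stay in word order.
    ranked = sorted(dumped, key=lambda pair: pair[1], reverse=True)
    return ranked[:limit]

counts = {"the": 3, "and": 2, "cat": 1, "dog": 1, "bird": 1}
for word, n in top_words(counts, limit=3):
    print(f"{word} occurred {n} times")
```

The limit plays the role of MAX=10 in the <LOOP> tag: the whole list is sorted, but only the first few entries are printed.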

If we run the Vortex script above with the Gettysburg Address, we see (next page):

Copyright © 2024 Thunderstone Software LLC. All rights reserved.