14.7.3 Parsing Forms with SANDCALL

Every time <fmtcp SANDCALL> calls a replace function, it passes several parameters:

  • hit
    The text string matching the REX search expression.

  • expn
    The index number of the expression which matched. This is useful if we map several expressions to one function and want to find out which expression matched.

  • tag
    The first HTML tag name in the matching text.

  • attrs
    A list of the attribute names of that tag.

  • vals
    A list of the corresponding values of the attributes.
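
As a rough illustration of these parameters, here is a minimal sketch in which two expressions share one callback. The function names notetag and listtags and the variables $seenexpn and $seentag are hypothetical, and we assume $theurl already holds the URL of a page to scan. A function declares only the parameters it needs; here we skip hit.

```vortex
<SCRIPT LANGUAGE=vortex>

<!-- Hypothetical callback shared by both expressions below.
  -- It only records what it was given; $expn tells us which
  -- expression matched:
  -->
<A NAME=notetag PRIVATE expn tag attrs vals>
  <$seenexpn = $seenexpn $expn>           <!-- append expression index -->
  <$seentag  = $seentag  $tag>            <!-- append tag name -->
</A>

<A NAME=listtags PUBLIC>                  <!-- assumes $theurl is set -->
  <fetch $theurl>
  <$doc = $ret>
  <$seenexpn = ><$seentag = >
  <$search = "<FORM=!>*" "<INPUT=!>*">
  <$call   = notetag     notetag>
  <fmtcp SANDCALL $search $call>
  <CAPTURE><sb>$doc</sb></CAPTURE>        <!-- run callbacks, discard text -->
  <LOOP $seenexpn $seentag>
    Expression $seenexpn matched tag $seentag <BR>
  </LOOP>
</A>

</SCRIPT>
```

Like the callbacks in the full example that follows, notetag saves its results in ordinary variables instead of printing them, since the output of <sb> is captured and thrown away.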

The last three parameters are useful for on-the-fly parsing of HTML. For example, we could use them to parse an HTML page for forms, and save the variables each form needs:


  <SCRIPT LANGUAGE=vortex>
  
  <EXPORT $id URL>
  
  <A NAME=main PUBLIC>
    <FORM METHOD=post ACTION=$url/search.html>
      Enter URL:
      <INPUT NAME=theurl SIZE=60 VALUE="$theurl">
      <INPUT TYPE=submit>
    </FORM>
  </A>
  
  <A NAME=search PUBLIC>
    <main>
    <urlcp reparentmode abs>              <!-- full URLs -->
    <fetch $theurl>
    <$doc = $ret>
    <!-- Tags we want to search for, and the corresponding
      -- function to call for each:
      -->
    <$search = "<FORM=!>*" "<INPUT=!>*" "<TEXTAREA=!>*>=!<*"
      "<SELECT=!>*" "<OPTION=!>*>=![\x0A<]*" "</SELECT=!>*"
    >
    <$call   = formtag     inputtag     textareatag
      selecttag     optiontag                endselecttag
    >
    <!-- Create the data tables.  We only need to create them once;
      -- they're dropped each time here only for this quick
      -- example:
      -->
    <SQL "drop table forms"></SQL>
    <SQL "drop table tags"></SQL>
    <SQL "create table forms(id counter, Method varchar(10),
       Url varchar(100))">                <!-- tables for forms -->
    </SQL>
    <SQL "create table tags(FormId counter, Name varchar(10),
        Type varchar(10), UserValue varchar(10), SendValue varchar(10))">
    </SQL>
    <SQL "create index xfid on forms(id)"></SQL>
    <SQL "create index xtid on tags(FormId)"></SQL>
    <fmtcp SANDCALL $search $call>
    <CAPTURE><sb>$doc</sb></CAPTURE>
    <showresults>
  </A>
  
  <A NAME=formtag PRIVATE attrs vals>     <!-- we got a <FORM> tag -->
    <$u = ><$meth = >
    <LOOP $attrs $vals>
      <upper $attrs>
      <SWITCH $ret>
        <CASE "METHOD"><$meth = $vals>
        <CASE "ACTION"><$u = $vals>
      </SWITCH>
    </LOOP>
    <SQL "insert into forms values(counter, $meth, $u)"></SQL>
    <$FormId = $id>                       <!-- new FORM id -->
  </A>
  
  <A NAME=taginsert PRIVATE>              <!-- insert a completed tag -->
    <SQL NOVARS "insert into tags
       values($FormId, $name, $type, $uval, $sval)">
    </SQL>
  </A>
  
  <A NAME=inputtag PRIVATE attrs vals>    <!-- we got an <INPUT> tag -->
    <$name = ><$type = "TEXT"><$uval = ><$sval = >
    <LOOP $attrs $vals>
      <upper $vals><$v = $ret>
      <upper $attrs>
      <SWITCH $ret>
        <CASE "TYPE"><$type = $v>
        <CASE "NAME"><$name = $vals>
        <CASE "VALUE"><$uval = $vals><$sval = $vals>
      </SWITCH>
    </LOOP>
    <taginsert>
  </A>
  
  <A NAME=textareatag PRIVATE hit attrs vals>     <!-- <TEXTAREA> tag -->
    <$name = ><$type = "TEXTAREA"><$uval = ><$sval = >
    <LOOP $attrs $vals>
      <upper $attrs>
      <SWITCH $ret>
        <CASE "NAME"><$name = $vals>
      </SWITCH>
    </LOOP>
    <rex ">>\>\P=!\<*" $hit>
    <$uval = $ret>
    <$sval = $ret>
    <taginsert>
  </A>
  
  <A NAME=selecttag PRIVATE hit attrs vals>   <!-- <SELECT> -->
    <$name = ><$type = "SELECT"><$uval = ><$sval = >
    <LOOP $attrs $vals>
      <upper $attrs>
      <SWITCH $ret>
        <CASE "NAME"><$name = $vals>
      </SWITCH>
    </LOOP>
    <!-- Wait for options... -->
  </A>
  
  <A NAME=optiontag PRIVATE hit attrs vals>       <!-- <OPTION> -->
    <$uv = ><$sv = >
    <LOOP $attrs $vals>
      <upper $attrs>
      <SWITCH $ret>
        <CASE "VALUE"><$sv = $vals>
      </SWITCH>
    </LOOP>
    <rex ">>\>\P=.*" $hit>
    <$uv = $ret>
    <IF $sv eq ""><$sv = $uv></IF>
    <!-- Append to our uval/sval list for this <SELECT>: -->
    <$uval = $uval $uv>
    <$sval = $sval $sv>
  </A>
  
  <A NAME=endselecttag PRIVATE>           <!-- </SELECT> -->
    <!-- OPTIONs are now done.  Sum them up and insert into tags: -->
    <sum ",%s" "" $sval>                  <!-- cat $sval together -->
    <substr $ret 2 -1>
    <$sval = $ret>
    <sum ",%s" "" $uval>                  <!-- cat $uval together -->
    <substr $ret 2 -1>
    <$uval = $ret>
    <taginsert>
  </A>
  
  <A NAME=showresults PUBLIC>
    <P>Results:<P>                        <!-- dump the tables -->
    <SQL "select * from forms">
      <HR>
      <B>Form id:</B> $id <BR>
      <B>Method:</B> $Method <BR>
      <B>Url:</B> $Url <BR>
      <P>
      <SQL "select * from tags where FormId = $id">
        <B>Name:</B> $Name <B>Type:</B> $Type
        <B>User Values:</B> $UserValue <BR>
      </SQL>
      <A HREF=$url/submitform.html>Fetch it</A>
    </SQL>
  </A>
  
  <A NAME=submitform PUBLIC>              <!-- Submit the form $id. -->
    <$dumvars = a b c d e f g h i j>
    <$dumvals = A B C D E F G H I J>
    <SQL MAX=1 "select Url, Method from forms where id = $id">
      Submitting $Url <P>
      <flush>
      <SQL "select Name, SendValue, Type from tags where FormId = $id"></SQL>
      <LOOP $Name $SendValue $dumvars $dumvals>
        <IF $Name neq "">
          <setvar $dumvars $Name>
          <rex ">>=[^,]*" $SendValue>     <!-- only 1st value -->
          <IF $ret eq "">                 <!-- no value? -->
            <SWITCH $Type>                <!-- set a value for var -->
              <CASE "TEXTAREA"><$ret = "User text data">
              <CASE "TEXT"><$ret = "User input data">
            </SWITCH>
          </IF>
          <setvar $dumvals $ret>
        </IF>
      </LOOP>
      <urlcp reparentmode abs>            <!-- full links -->
      <!-- Here's where those <setvar>s are used.  We submitted our
        -- dummy vars to <submit>, but set their values AND NAMES
        -- via <setvar> above.  That's why the variable names here
        -- are variables themselves; <submit> is one of the few
        -- functions we can pull off this stunt with:
        -->
      <submit METHOD=$Method URL=$Url $a=$A $b=$B $c=$C $d=$D $e=$E
         $f=$F $g=$G $h=$H $i=$I $j=$J>
      Result: <HR><send $ret>
    </SQL>
  </A>    
  
  </SCRIPT>

Our main function prompts for a URL to parse.

search fetches that URL, and uses <fmtcp SANDCALL> to parse it. We have several search expressions corresponding to the important form tags we're concerned with. Each will call a function xxxtag when it matches.

Each xxxtag callback function loops over the tag's attributes and collects the needed data, which ends up in two tables, forms and tags. The forms table contains the form URLs and their methods. The tags table contains the variables, their types, and their values (both the user-visible text and the data sent to the server).

Some tags are fairly simple; <FORM> and <INPUT> are self-contained. But others, like <SELECT>, span multiple tags: the <SELECT> gives the name, but we need the <OPTION> tags that follow it to give us all the values. We append each option to the $uval and $sval arrays until a </SELECT> tells us the variable is done. Then we concatenate the values, comma-separated for ease of retrieval. (A more robust application might encode them differently.)
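
To see the comma-separated encoding in isolation, here is a small sketch that reuses the same <sum>, <substr> and <rex> calls as endselecttag and submitform in the listing above. The function name joindemo, the variable $joined, and the three option values are made up for illustration:

```vortex
<A NAME=joindemo PUBLIC>
  <$sval = >                              <!-- as in selecttag -->
  <$sval = $sval "red">                   <!-- one append per OPTION, -->
  <$sval = $sval "green">                 <!-- as in optiontag -->
  <$sval = $sval "blue">
  <sum ",%s" "" $sval>                    <!-- cat $sval together -->
  <substr $ret 2 -1>                      <!-- skip the leading comma -->
  <$joined = $ret>
  <rex ">>=[^,]*" $joined>                <!-- only 1st value -->
  Stored: $joined <BR>
  First value: $ret <BR>
</A>
```

The weakness of this encoding is that a value which itself contains a comma cannot be told apart from two values when it is read back, which is one reason a more robust application might encode the list differently.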

Finally, the showresults function dumps our form data tables back out.

When we run this against Thunderstone's message-keeper demo, which contains a form, we get the following (next page):

Copyright © 2024 Thunderstone Software LLC. All rights reserved.