Using dowalk

Normally a walk is initiated from the administrative interface. There may, however, be times when it is desirable to start a walk by hand from a shell (or command) prompt or as part of some other automated task. When the administrative interface starts a walk, it shows you the command line to use. It is of the form:

texis profile=PROFILENAME dowalk/dispatch.txt

You may also set the parameter ttyverbose to 1 or higher to make dowalk print status messages to the screen when it is run by hand. The form would be:

texis profile=PROFILENAME ttyverbose=1 dowalk/dispatch.txt

Here PROFILENAME is the name of the profile you have configured using the administrative interface. You will need to supply the full path to texis if it is not in your PATH. You will also need to supply the path to the dowalk script if it is not in the current directory when you run the command, for example:

INSTALLDIR/bin/texis profile=PROFILENAME INSTALLDIR/texis/scripts/webinator/dowalk/dispatch.txt

or

INSTALLDIR\texis profile=PROFILENAME INSTALLDIR\Texis\Scripts\Webinator\dowalk/dispatch.txt

The walker will behave the same as it does from the administrative interface. Walk info will be logged to the same files. See section 6.4.
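When the walk is started from another automated task, the command line above can be assembled programmatically. The following Python sketch shows one possible way to do that. The helper names build_walk_command and start_walk are hypothetical and not part of Webinator, and the install_dir argument simply stands in for INSTALLDIR using the forward-slash path layout shown above.

```python
import subprocess


def build_walk_command(profile, install_dir=None, ttyverbose=None):
    """Return the argument list for starting a walk via dowalk/dispatch.txt."""
    if install_dir is None:
        # Assumes texis is in PATH and the dowalk script directory
        # is reachable from the current directory.
        texis = "texis"
        script = "dowalk/dispatch.txt"
    else:
        # install_dir stands in for INSTALLDIR in the example above.
        texis = f"{install_dir}/bin/texis"
        script = f"{install_dir}/texis/scripts/webinator/dowalk/dispatch.txt"

    args = [texis, f"profile={profile}"]
    if ttyverbose is not None:
        # ttyverbose=1 or higher asks dowalk to print status messages.
        args.append(f"ttyverbose={ttyverbose}")
    args.append(script)
    return args


def start_walk(profile, install_dir=None, ttyverbose=None):
    """Run the walk command and return the process exit status as reported."""
    command = build_walk_command(profile, install_dir, ttyverbose)
    return subprocess.run(command).returncode
```

For instance, build_walk_command("myprofile", ttyverbose=1) yields the same arguments as the ttyverbose form shown earlier, with the placeholder profile name myprofile in place of PROFILENAME.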

There are several other "entry points" that produce different behaviors when starting the walker. They all take the same form as dispatch above, except that dispatch is replaced by the name of the entry point. The entry points are:

  • dispatch Starts a walk on the profile, using the profile's Rewalk Type setting to determine if a New or Refresh walk should be performed.

  • hold Stops a walk that is in progress, creates/updates the search indices, and makes it the live search.

  • stop Stops and abandons a walk that is in progress.

  • indexmakelive Creates/updates the search indices on an abandoned walk and makes it the live search.

  • refreshnow Marks a URL for refresh as soon as possible. This requires an extra u=THEURL argument to tell it which URL to refresh. It does not refresh anything itself; it only flags the page for refresh on the next refresh walk, so the walk type must be set to refresh and a schedule must be set.

    texis profile=PROFILENAME u=THEURL dowalk/refreshnow.txt

  • ifmodified Checks the Watch URL. If the watched page has changed, a walk is started. If not, no action is taken. This is generally used on a frequent schedule to automatically rewalk a site when it changes.

  • singles Fetches and indexes any single pages specified in the profile that are not yet in the database. You would call this after adding to Single Page, Page File, or Page URL.

  • recat Recategorizes the database based on the current settings of Categories. This may take some time on large walks.

  • updateindex Updates the Metamorph index on the html table. This would be used after performing manual SQL operations against the html table.

  • remakeindex Drops and recreates all (standard) indices on the database. This has little use except when indices have been corrupted, for example by disk errors.

  • remakemmindex Drops and recreates the Metamorph index on the html table. This would be used after changing the Word Definition expressions.

  • tsverrors Dumps the error table as tab-separated values of Date, Url, Reason. Optional start and end date-times may be specified. Omitting start means start at the beginning; omitting end means continue to the end.

    texis profile=PROFILENAME start="2004-10-01" end="2004-11-01" dowalk/tsverrors.txt
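Because every entry point uses the same command form, and tsverrors produces tab-separated Date, Url, Reason rows, an automated task could collect the error report roughly as sketched below in Python. The function names are hypothetical and not part of Webinator. The sketch assumes that texis is in PATH, that the dowalk scripts are reachable from the current directory, that the script name follows the rule stated before the list (the entry point name plus .txt), and that the rows are written to standard output without a header row. None of these assumptions is confirmed by this section.

```python
import csv
import io
import subprocess


def entry_point_command(profile, entry_point, **params):
    """Build a texis command line for any dowalk entry point."""
    args = ["texis", f"profile={profile}"]
    # Extra parameters such as u=THEURL or start=/end= precede the script name.
    args += [f"{name}={value}" for name, value in params.items()]
    args.append(f"dowalk/{entry_point}.txt")
    return args


def parse_tsv_errors(text):
    """Parse tab-separated Date, Url, Reason rows into dictionaries."""
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    return [
        {"date": row[0], "url": row[1], "reason": row[2]}
        for row in reader
        if len(row) >= 3  # skip blank or malformed lines
    ]


def fetch_errors(profile, start=None, end=None):
    """Run tsverrors and return parsed rows (assumes output on stdout)."""
    params = {}
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end
    result = subprocess.run(
        entry_point_command(profile, "tsverrors", **params),
        capture_output=True,
        text=True,
    )
    return parse_tsv_errors(result.stdout)
```

The same entry_point_command helper covers the other entry points, for example entry_point_command("myprofile", "refreshnow", u="http://www.example.com/"), where myprofile and the URL are placeholders.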


Copyright © Thunderstone Software     Last updated: Apr 15 2024