Contents
Document Conventions
Features
Obtaining Webinator
Technical Support
Installation
Linux/Unix Download and Installation
Windows Download and Installation
Filesystem Layout
File Permissions and OS Specific Notes
Adding Storage
Customizing Webinator's Appearance
Operation
Running the Administrative Interface
First Time Run: Quick Start
Step 1: Create an Account
Step 2: Create a Profile
Step 3: Walk the Profile
Last Step: Search
Administrative Interface Overview
Basic Walk Settings
All Walk Settings
Search Settings
List/Edit URLs
Browse URLs by Folder
List Duplicates
Test Fetch
Test Search
Query Log
Replication Tools
SOAP Tools
Integration Tools
Best Bet Groups
Status
Search
Profiles
System
System Information
Document Usage Overview
Test Network and Servers
Task Monitor
Thesaurus
Client Certificates
OneBox Providers
System Wide Settings
AWS Tools
Apply a License
Backup Webinator Settings
Restore Webinator Settings
System Replication Queue
System Replication Target Status
Accounts & Groups
Access Control Lists
Support Command
Repair Tools
Check Version Upgrade Actions
Re-output XSL files
Re-schedule walks
Remake Task Tables
Docs
Basic Walk Settings
Database
Walk Summary
Notes
Base URL(s)
Robots
Robots Crawl-delay
Allow Extensions
Exclude Extensions
Exclusions
Walk Delay
Parallelism
Verbosity
Disable Starting Walks
Rewalk Type
New
Refresh All
Refresh
Singles Only
Rewalk Type Summary Table
Rewalk Schedule
Action Buttons
Advanced Walk Settings
Watch URL
End of Walk Email
Attach Logs
Categories
Categories Type
URL File
URL URL
Single Page
Page File
Page URL
Strip Queries
Keep Query Vars
Ignore Query Vars
Sort Query Vars
Lower Query Var Values
Ignore Case
Host Aliases
Host Aliases from robots.txt
Extra Domains
Extra Networks
Extra URLs REX
Exclusion REX
Exclusion Prefix
RSS Feeds
Exclude by Field
Additional Fields
Data from Field
Data From Field Example - Using Description for Title
Data From Field Example - Using PublishDate for Last Modified Date
Data From Field Example - Grabbing Price from Meta
Data From Field Example - Grabbing Price from Text
Data From Field Example - Subfetch to use PDF Contents for a Web Page
Required REX
Required Prefix
Max Page Size
Max Pages
Max Bytes
Max Depth
Max URL Size
Max Requests
Max Connection Lifetime
Page Timeout
Meta Tags
Standard Meta
All Meta
Storage Charset
Source Default Charset
XML UTF-8
Keep Links
Remove Common
Ignore Selectors
Ignore HTML Strings
Keep Selectors
Keep HTML Strings
Ignore Characters
Plugin Split
Language Analysis
CJK Mode
Unknown File Formats
PDF Title Action
Word Definition
Text Search Mode
Attribute Compare Mode
Index Fields
Compound Index Fields
Extra Indexes
Spell-check Dictionaries
Primer Type
Primer URLs
Submitting the Form Directly: Custom Primer URL
Filling Out the Form: Custom Primer Variables
Checking for Bad Logins: Bad Login MM Query
Multiple Primers: Base URL MM Query
Following additional links with the !FOLLOW_LINK token
Multi-step primers with the !PREVIOUS_RESPONSE_FORM token
Unprimer URLs
Submitting the Form Directly: Custom Unprimer URL
Filling Out the Form: Custom Unprimer Variables
Checking for Bad Logins: Bad Login MM Query
Multiple Unprimers: Base URL MM Query
Following additional links with the !FOLLOW_LINK token
Login Info
Proxy Auto-Config URL
Proxy
Proxy Login Info
Client Certificate
Cookie Source Path
Cookie Jar
Strict Cookie Paths
Off-Site Pages
Off-Site Components
Stay Under
Prevent Duplicates
Respect Canonical URLs
Duplicate Check Fields
Store Refs
Inline Iframes
Max Components
Execute JavaScript
Fetch JavaScript
JavaScript String Links
Debug JavaScript
JavaScript Memory
JavaScript Timeout
AJAX Crawlable URLs
Walk Trace Settings
Audit Log
Performance Logging
Batch Locks
URL Protocols
HTTP Version
SSL Client Protocols
SSL Client Ciphers
SSL Use SNI
SSL Allow Unsafe Renegotiation
IP Protocols
Network Share Access Method
Network Share Protocols
File URL Get Owner Headers
Authentication Schemes
Embedded Security
Body Storage Method
Multiple Fetches
Follow Cross-Site Links
Max Redirects
Empty Form Redirects
Execute Walked Dataload
Index Name
DNS Mode
Net Mode
User Agent
Robots.txt Agents
Mime Types
Custom Headers
Respect Expires Header
Cache Content
Default Refresh Time
Minimum Refresh Time
Maximum Refresh Time
Maximum Process Size
Always Refresh Listing Page
Replication Settings
Send Data
Send Settings
Batch Rows
Batch Size
Batch Idle
Log Replication
Search Settings
Notes
Query Logging
Log Result Clicks
Log As Metasearch Backend
Rotate Schedule
Email
Result Order
Results Style
Allow RSS
Format XSL Output
XSL File
Abstract Style
Abstract Length
Max Title Length
Max URL Display Length
Results per Page
Max User Results per Page
Page Links Shown
Results per Site
Allow site: syntax
Allow link: syntax
Results Width
Box Color
Show File Icons
Show Advanced Search
Query Autocomplete
Max Completions
Results Highlighting
Context Highlighting
PDF Query Highlighting
PDF Highlighting Format
Font
Display Charset
Top HTML and Bottom HTML
Enable Sherlock
Best Bet Match Mode
Top Best Bet Title
Right Best Bet Title
Top Best Bet Group
Right Best Bet Group
Top Best Bet Box Color
Right Best Bet Box Color
Top Best Bet Border Style
Right Best Bet Border Style
Right Best Bet Box Width
Authorization Method
Login Cookies
Login URL
Additional CAS Setup
Basic/NTLM/file Cookie Type
Login Verification URL
Authorization Target
Unauthorized Result Query
Username Fixup
Examples
Max Docs to Auth-Check
Successful Auth Result Limit
Total Auth Timeout
Allow Authorization URL
Authorization Caching
Authorization Debug Log
Show Authorization Info
Enable Spell Check
Suggest Time Limit
Number of Suggestions
Synonyms
Main Thesaurus
Secondary Thesaurus
Translate Boolean
Quotes for Literal
Allow the @ Operator
Allow Linear
Allow "NOT" Logic
Allow Post-Processing
Allow Wildcards
Allow Leading Wildcards
Single-Word Wildcards
Allow WITHIN Operators
Require All Words
Resolve Phrase Noise Words
Phrase Word Processing
Keep Noise Words
Noise List
Search Timeout
Show Error Messages
Debug SQL Level
Debug Metamorph Level
Search Trace Settings
Fast Result Counts
Proximity
Language Characters
Word Forms
Custom Suffix List
Custom Suffix Default Removal
Custom Suffix Min Length
Word Ordering
Word Proximity
Database Frequency
Document Frequency
Position in Text
Depth in Site
Date Bias
Ranked Rows
XML Export Variables
Phishing Protection
Prevent Find Similar Fetch
Decode Displayed URLs
Max Cache Entry Age
Max Cache Size
Min Search Time
Visible
System Wide Settings
Admin Theme
Admin Logo
Current Default Profile
Default Profile Sources
Ignore Host Prefix
Ignore Host Suffix
Default Profile
Cluster Members
API Logging
Task Monitor Logging
Audit Logging
Admin Banner
Login Expiration
Search Security Header Level
Disable Starting All Walks
Profile Dataspace Roots
Network Share Mounts Root
System Replication Settings
Allow Receiving
Log All Replication
Experimental Features
Results Authorization
Results Authorization Walk Settings
Results Authorization Search Settings
Meta Search - Search multiple profiles as one
Profile Creation
Meta Search Walk Settings
Search Settings
Access Control
User Groups
Object hierarchy
Access Control Lists
Determining Effective Rights
Required Rights for Admin Actions
Walk and Search Settings
Starting and stopping a walk
Best Bets
List/Edit URLs
List Duplicates
Walk Status
Query Log
Profiles
Accounts
User Groups
Access Control
Maintenance
Running the Walker by Hand
Using dowalk
Running the Search Interface
Procedures and Examples
Searching your Index
Similarity Searching
Using the Thesaurus Feature
Page Exclusion, Robots.txt, and Meta-robots
Indexing Other Sites
Indexing Individual Pages
Reindexing on a Schedule
Checking for Web Server Errors
Removing Pages from the Database
Troubleshooting missing content URLs
Erasing the Entire Database
Using Multiple Databases
Integrating Search with your Site
Link to the Webinator
Embed a search box
Request XML search results
Issuing a Query Programmatically
Search Parameters
XML Elements in Search Results
Invoking Query Autocomplete
Invoke the search SOAP API
Sample ASP Code
Using Default Profile Sources Hostname
Search Result RSS Feeds
OpenSearch Support
Using Best Bets
Quick Creation
Fully Customized
Using Access Control
Initial Lockdown
Example: User with Complete Control on One Profile
Example: User with Look and Feel Control on All Profiles
Replication
Replication Overview
Procedure - Replicating One Profile
Set up the Sender Profile
Create the Receiver Profile
Procedure - Separate Hot Backup Machine
Configure the Backup Machine
Configure the Main Machine
Synchronize Pre-existing Profiles
Making Backup Live on Main Failure
Using Circular Replication
Setup
Notes and Limitations
Dataload API
Submitting Content
Uploading a binary file
Combining the two: binary files with custom fields
Additional Fields
Refs and Errors
Setting Best Bet Groups
Setting Best Bets
Reply Format
Dataload SOAP API
Additional Fields
Overview
Populating
Sorting
Searching
SOAP API
SOAP Overview
SOAP API vs. XML Output
Getting the WSDL
Global vs. per-profile WSDLs
Configuring the SOAP Interface
Dataload SOAP API
C# example project
SOAP Links for Languages
SOAP API search Reference
about
search
moreLikeThis
matchInfo
showParents
getCompletions
SOAP API dataload reference
about
dataload
SOAP API admin Reference
login
about
listProfiles
getDocumentUsageOverview
getProfileStatus
addProfile
deleteProfile
getSettings
setSettings
getQueryLogRaw
pauseWalk
stopWalk
startWalk
getTask
getTasks
getProfileErrors
getProfileLog
setParametricFields
getBestBetGroups
saveBestBetGroup
deleteBestBetGroup
getBestBets
saveBestBets
deleteBestBets
getThesauruses
setThesaurus
deleteThesaurus
Auth Proxy
conf/texis.ini
Section
Migrating to a new installation
Reference
REX Syntax
Expressions
Repetition Operators
RE2 Syntax
\<nomatch\> Syntax
REX Caveats and Commentary
Some Useful REX Expressions
REX Replace Syntax
Supported File Formats
Database and File Usage
Walk Database Tables and Fields
Options Table Fields
Customizing the Search
Customizing the Walker
Texis ISAPI
Overview
How it Works
Settings for Texis ISAPI
Reading values from texis.ini
Reading values from the Registry
CGI Mapping by Vortex File Extension
Microsoft IIS
Apache
Preferred Method: Redirect Handler
Alternate Method: Direct Execution
Third-Party Software
Version Differences
Search Interface Help
Forming a Query
Query Rules of Thumb
Overview of Query Abilities
Controlling Proximity
Ranking Factors
Keywords Phrases and Wild-cards
Applying Search Logic
Natural Language Query
Using the Special Pattern Matchers
Invoking Thesaurus Expansion
Using Word Forms
Controlling Proximity
Interpreting Search Results
Viewing Match Info
Finding Similar Documents
Showing Document Parents
Copyright © Thunderstone Software
Last updated: Apr 15 2024
Webinator Manual
Copyright © 2024 Thunderstone Software LLC. All rights reserved.