Syntax: REX expression, Replace expression, field to search, where to store it
This provides an alternate means of setting both the HTML fields
(Description etc.) and any
Additional Fields. It allows getting page information from
non-default places by searching and optionally replacing the data.
New blank rows will be provided as rows are used. See below for
REX Search - Allows you to specify a REX expression to narrow
down what contents of the
From Field will be used. Leave it
empty to use the entire field. See here
for details on REX search syntax.
Note that a
REX Search must be specified for the following
From field types:
HTML, raw output
Replace can be used to specify a subset of the
value to be stored in the
To field (or a subset of the match, if
REX Search is used). See here for
details on REX replace syntax.
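The narrowing behaviour described above can be sketched roughly as follows. This is only an illustration: it uses Python's `re` module as a stand-in, which is not REX syntax, and the function name `narrow` and its parameters are hypothetical. Returning `None` when nothing matches is also an assumption, since the text above does not say what happens in that case.

```python
import html
import re

def narrow(value, search=None, strip_html=False):
    """Rough analogue of REX Search narrowing (Python regex, NOT REX syntax).

    search     -- pattern to narrow the field; empty/None uses the entire field
    strip_html -- True mimics the "HTML" From Field (tags removed, entities
                  resolved after matching); False mimics "HTML, raw output"
    """
    if search:
        match = re.search(search, value, flags=re.DOTALL)
        if match is None:
            return None  # assumption: no match yields no data
        value = match.group(0)
    if strip_html:
        value = re.sub(r"<[^>]+>", "", value)  # crude tag removal
        value = html.unescape(value)           # resolve HTML entities
    return value

page = '<html><body><div id="author">By <b>A. Smith</b> &amp; Co.</div></body></html>'
pattern = r'<div id="author">.*?</div>'
print(narrow(page, pattern, strip_html=True))  # cleaned text of the match
print(narrow(page, pattern))                   # match left as-is, tags in place
```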
From Field - specifies what the source field is for the data.
HTML - the raw HTML source of the page. After matching, HTML tags are removed and HTML entities are resolved.
HTML, raw output - the raw HTML source of the page. Content is left as-is, with tags in place.
Text - the text of the page, after HTML rendering has been applied.
Title - the HTML title of the page.
All Meta - the contents of all HTML <meta> headers - name, http-equiv, property, itemprop (but see From Meta Field footnote) - and HTTP headers specified in the document.
Meta Field -> - the contents of a specific <meta>/HTTP field, specified in the next input box, From Meta Field.
Keywords - the contents of the Keywords and/or Keyword meta field.
Description - the contents of the Description and/or Subject meta field.
Mime Type - the MIME type of the page. This may have been derived from the Content-Type header, a <meta http-equiv> tag, or the URL extension, depending on what is available.
URL - the URL of the page.
URL Decoded - the decoded version of the URL. Any %XX 'URL-safe' sequences in the URL are replaced with their real characters. E.g. Pre%20%2D%20Expense%20Report.doc is decoded into Pre - Expense Report.doc.
URL Protocol - the URL's protocol, e.g. http.
URL Host - the host (without port number) from the URL.
URL Host and Port - the host (and port number if given) from the URL.
URL Path - the file path from the URL.
URL Path Decoded - the file path from the URL, URL-decoded.
URL Anchor - the anchor from the URL (if any), i.e. the part after the # (pound sign). May not be available if already stripped.
URL Query - the query string from the URL (if any), i.e. the part after the ? (question mark).
URL Query Var -> - the value of the URL query-string variable named in From Meta Field, URL-decoded.
Referrer's Data - the value of a referring page's field. Store refs is required for this. The field selected will be the same field being populated.
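To make the URL-based From Field values more concrete, the following sketch derives approximate equivalents with Python's standard `urllib.parse` module. It is not Webinator's implementation; the function name `url_fields` and the sample URL are hypothetical, and edge cases (such as user names embedded in the URL) may be handled differently by the walker.

```python
from urllib.parse import parse_qs, unquote, urlsplit

def url_fields(url, query_var=None):
    """Approximate the URL-derived From Field values for one URL."""
    parts = urlsplit(url)
    fields = {
        "URL": url,
        "URL Decoded": unquote(url),            # %XX sequences replaced
        "URL Protocol": parts.scheme,
        "URL Host": parts.hostname or "",       # no port (urlsplit lowercases it)
        "URL Host and Port": parts.netloc,      # port included if given
        "URL Path": parts.path,
        "URL Path Decoded": unquote(parts.path),
        "URL Anchor": parts.fragment,           # part after '#', if any
        "URL Query": parts.query,               # part after '?', if any
    }
    if query_var is not None:
        # parse_qs URL-decodes the values; take the first one if repeated
        fields["URL Query Var"] = parse_qs(parts.query).get(query_var, [""])[0]
    return fields

sample = ("http://www.example.com:8080/docs/"
          "Pre%20%2D%20Expense%20Report.doc?id=42&name=J%20Doe#top")
for name, value in url_fields(sample, query_var="name").items():
    print(f"{name}: {value}")
```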
From Meta Field - If
Meta Field -> or
URL Query Var -> is given as the From Field, this field is
used to specify which meta field's or query var's contents to use as
data. Leave blank otherwise.
Entering text in this field will force the use of
Meta Field ->
if From Field is set to anything other than Meta Field -> or URL Query Var ->.
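The interaction between From Field and From Meta Field can be summarized in a few lines. The function name `effective_from_field` is hypothetical, and treating whitespace-only input as blank is an assumption not stated above.

```python
def effective_from_field(from_field, from_meta_field):
    """Return the From Field that would actually be used for a rule."""
    uses_name = ("Meta Field ->", "URL Query Var ->")
    # Assumption: whitespace-only text counts as blank.
    if from_meta_field.strip() and from_field not in uses_name:
        return "Meta Field ->"  # text in From Meta Field forces Meta Field ->
    return from_field

print(effective_from_field("Title", "author"))          # forced to Meta Field ->
print(effective_from_field("URL Query Var ->", "id"))   # unchanged
print(effective_from_field("Title", ""))                # unchanged
```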
To Field - specifies where information should be stored.
Body - Override the standard fields extracted from the content.
Authorization URL - Populates the URL used when checking this result for Results Authorization. Please see the Allow Authorization URL section (4.6.55) for more details.
Category - To populate the category via Data From Field, all the possible category names must be entered in the Category setting. Using one or more Data From Field rules to set Category will cause Webinator to ignore the Categories' URL Patterns and instead set category membership based on these Data From Field rules.
Note: due to the way categories are stored, if categories are added, reordered, or removed after content has been walked, then a New walk will need to be performed to update the content's categories. Renaming categories does not require a rewalk.
Additional Links - This target allows you to use Data From Field to create links that will be walked. These links are subject to the normal indexing rules (they will be rejected if they match exclusions, etc.).
Use of this Data From Field target has no effect on the existing links found on the current URL. The links generated by this target will be added to the standard set of links on the page.
Subfetch - This causes Webinator to take the value(s) it finds and fetch them as URL(s). Each URL can be absolute, or relative to the current URL.
Nothing is changed by the subfetch itself, but any further Data From Field rules will use the fetched document(s) as the source of their content. Please see the Subfetch example below for a situation where this could be used.
Additional Fields - If this profile has any Additional Fields, they will be available as a target.
If you just added the name of a new Additional Field, you will need to hit
Update for the new Additional Field to appear in the
Additional Fields are supported in the full Texis product, but not Webinator-only.
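For the Subfetch and Additional Links targets, values may be absolute URLs or relative to the current URL. A minimal sketch of that resolution step, using Python's standard `urljoin` rather than Webinator's own code, might look like this; the function name `resolve_urls` and the sample URLs are hypothetical.

```python
from urllib.parse import urljoin

def resolve_urls(current_url, values):
    """Resolve extracted values against the current URL, skipping blanks."""
    return [urljoin(current_url, v.strip()) for v in values if v.strip()]

current = "http://www.example.com/docs/page.html"
found = ["../files/report.pdf", "http://other.example.org/a.html", "details.html"]
for url in resolve_urls(current, found):
    print(url)
```

Absolute values pass through unchanged, while relative ones are resolved against the directory of the current URL.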
Append - If set to
Y, then the Data From Field content will be
appended to the field's existing data instead of overwriting it. Date-type
targets, such as
Modified, do not support