XML Elements in Search Results

 

Search results can be sent as XML from Webinator to the host server. This section describes the XML elements.

<ThunderstoneResults> Overall container for the search results

  • <XmlOutputVersion> Defines the version of this XML output

  • <Query> Main text search string

  • <TitleQuery> Query applied only to titles

  • <UrlQuery> Query applied to URL

  • <DepthQuery> Maximum Depth

  • <MimeTypeQuery> Query applied to Mime Type

  • <CategoryQuery> Numeric index of a category to require results to be in. 1 is the first category, etc. Deprecated; use <CategoryName> instead.

  • <CategoryName> Name of a category to require results to be in. Added in version 20.1.

  • <RequireAllCategories> Set to Y to only match results in all specified categories instead of any one (or more) of them.

  • <ResultsPerSiteQuery> Max results per site, as specified by user

  • <UserResultsPerPage> Max results per page, as/if specified by user

  • <TextQuery> Text part of main search query

  • <TextQueryHighlight> TextQuery with query highlighting (if enabled)

  • <PreviousRefine> Additional refine queries

  • <SiteQuery> Site query (from site:host in the query, or dedicated sq query string variable)

  • <LinkQuery> Link query (from link:URL in the query)

  • <InFieldQueriesAllowed> Set to Y if infield: queries (a Parametric search operator) are allowed

  • <ModifiedDateLessThan> Only return results with Modified date earlier than this

  • <ModifiedDateGreaterThan> Only return results with Modified date later than this

  • <UrlRoot> URL root of the search script, for making links

  • <Profile> Profile used

  • <dropXSL> Whether to apply or drop the XSL stylesheet

  • <AdvancedSearch> Set to 1 if the advanced form should be displayed

  • <Proximity> Proximity used for the search. Possible values:

    • line - Must occur on the same line

    • sentence - Must occur within the same sentence

    • paragraph - Must occur within the same paragraph

    • page - Must occur within the same HTML document (default)

  • <Suffixes> Suffix processing for the search. Possible values:

    • 0 - Exact Match only

    • 1 - Plurals and Possessives

    • 2 - All Word Forms

    • 3 - Custom

  • <Thesaurus> Set to 1 if the Thesaurus was used for synonyms

  • <Order> Ordering of the search. Possible values:

    • r - relevance

    • dd - newest first

    • da - oldest first

  • <RankOrder> Favors results with query terms in the same order as the query

  • <RankProximity> Favors results with query terms close together

  • <RankDatabaseFrequency> Favors results with query terms more rare across the entire profile

  • <RankDocumentFrequency> Favors results with query terms repeated more often

  • <RankPosition> Favors results with query terms earlier in the document

  • <RankDepth> Favors results fewer links away from the starting point

  • <RankDateBiasWeight> Date biasing: weight to favor newer results. Present if non-zero.

  • <RankDateBiasHalfLife> Decay rate of <RankDateBiasWeight>: age (in seconds) at which only half of it applies. Present if set by search user.

  • <RankDateBiasAnchor> Date of theoretical "newest" (best) possible (full weight) result for date biasing. Present if set by search user.

  • <RankDateBiasField> Field to use to compute age of documents for date biasing. Present if set by search user.

  • <mode> Set to admin if this is a Test Search

  • <opts> Internal use only

  • <authUser> User that was authenticated via the Proxy Module

  • <metasearchTarget> Indicates what backend metasearch targets are available, one element for each target. Currently selected targets will have a selected="selected" attribute

  • <AdminUrl> URL to the admin interface

  • <MakeLiveUrl> URL to make this Look and Feel live

  • <RssUrl> URL to RSS version of this search

  • <OpensearchUrl> URL to the OpenSearch version of this search

  • <OpensearchTitle> Suggested title for this OpenSearch

  • <QueryAutocomplete> Set to Y if Query Autocomplete is enabled

  • <LogoutUrl> URL for a 'Logout' link

  • <Category> Categories available for search

    • <CatVisible> Set to Y if the category should be selectable in the list of categories

    • <CatSel> Set to Y if this category is currently selected

    • <CatVal> Value to submit to search for this category

    • <CatName> Display name for this category

  • <TopBestBets> List of "Best Bets" links

    • <BBTitle> Title for this section of Best Bets

    • <BestBet> Individual Best Bet records

      • <BBResultNum> Ordered number for this Best Bet

      • <BBPriority> Priority for this Best Bet, as assigned in the admin interface

      • <BBLink> URL for this Best Bet

      • <BBLinkDisplay> URL that displays for this Best Bet. Long URLs are intelligently truncated for display

      • <BBResult> URL for this individual Best Bet, as assigned in the admin interface

      • <BBDescription> Description for this individual Best Bet, as assigned in the admin interface

      • <BBGroupname> Name of the Best Bet group this Best Bet belongs to

      • <BBGroupid> ID of the Best Bet group this Best Bet belongs to

      • <BBKeywords> Keywords that trigger this Best Bet record to display. This is all keywords for this individual record, not just the one that triggered this activation

  • <ProfileInfo> Encloses some profile summary info

    • <Profile> Profile to which this ProfileInfo refers

    • <Feature> Indicates whether a feature is enabled: the name attribute holds the feature name (e.g. proximity), and the feature is enabled if the isEnabled attribute is Y

    • <ResultDecl> Declarations of User Fields that will be in Result elements, each has a name and type attribute

    • <ExitIsEarly> Set to Y if search aborted

    • <ExitReason> Set to ok if search finished normally, otherwise a token indicating the reason (see Table 5.2 below)

    • <RedirectUrl> Only used when results Authorization Method is set to Forward login cookies or CAS. If present, specifies a (%REFERER%-modified) version of Login URL (the search setting, not XML element). Its value is an external (not Webinator) URL to redirect the user to, which will prompt the user to log in and obtain the authentication cookies or parameters needed for a Results Authorization search.

    • <LoginUrl> Only used when results Authorization Method is set to Basic/NTLM/file - prompt via form. If present, specifies a local (Webinator) <form action> URL which will prompt for (and accept) the rauser/rapass variables, which contain user credentials needed for a Results Authorization search.

  • <Summary> Encloses search results summary, only present if a search was actually performed

    • <Profile> Profile that this Summary element applies to

    • <Start> First result item to list

    • <End> Last result item to list

    • <TotalNum> Total number of result items found, before Results Authorization

    • <TotalIsEstimate> Set to Y if TotalNum is an estimate

    • <TotalIsShort> Set to Y if TotalNum is known to be short (e.g. early exit)

    • <UserResultsNum> Total number of result items found, after Results Authorization

    • <UserResultsIsEstimate> Set to Y if UserResultsNum is an estimate

    • <UserResultsIsShort> Set to Y if UserResultsNum is known to be short (e.g. early exit)

    • <ResultsAuthorization> Set to Y if Results Authorization was used

    • <Total> Readable text for total number of results, after Results Authorization

    • <GroupBySite> Set to Y if Results per Site was used with this query.

    • <CurOrder> Text that describes the order by which results are listed

    • <OrderLink> URL that provides an alternative sorting order results list

    • <OrderType> Text that describes OrderLink

    • <NewSkip> (Metasearch only) Skip value to use for any further request. Only needed with the SOAP API

    • <PreviousLink> URL to the previous page of results

    • <FirstPage> Set to 1 if this is the first page of results

    • <Pages> Contains data on pages of results

      • <PageLink> URL to a certain page of results

      • <PageNumber> Page number of a page of results

    • <NextLink> URL to the next page of results

    • <LastPage> Set to 1 if this is the last page of results

    • <Credit> Text to introduce the credit image

    • <CreditImage> URL of the credit image

  • <Result> Contains data about a given result

    • <Profile> Profile for this Result

    • <BackendProfile> Profiles used by metasearch backends

    • <Num> Number of this result item

    • <Skip> Internal use: raw skip(s) for result. Valid for Meta Search back-ends

    • <Id> Identifier for this result

    • <ResultTitle> Title of this result

    • <Url> URL of this result

    • <ClickUrl> URL for this result item, as should be clicked by the user. Use Url if not present. Only sent if Query Logging is enabled, in which case it contains a redirect for logging the click-through

    • <UrlPDFHi> URL to highlight this PDF in Acrobat Reader, only used with Legacy highlighting

    • <UrlDisplay> Displayed URL for this result

    • <UrlWalk> URL used during the walk, if different from <Url>. Only used when a custom Result URL Source is set.

    • <UrlCached> URL to retrieve the cached version of this result

    • <RawRank> Raw relevance rank value for this result (0-1000)

    • <ScaledRank> Raw rank scaled up for a more-like-this search (0-1000)

    • <PercentRank> ScaledRank as a percentage (0-100)

    • <DocSize> Size (bytes) of this result

    • <MimeType> MimeType for this result

    • <MimeTypeIcon> Icon file to use for this MimeType

    • <Depth> Number of links walked from Base URL(s) to this URL

    • <UrlSimilar> URL to search for pages similar to this result

    • <UrlInfo> URL for context of answers within a matching document

    • <UrlParents> URL of pages that link to this search result

    • <Modified> Date and time this result was last modified

    • <Visited> Date and time this result was walked

    • <Abstract> Brief text surrounding the matched word or phrase

    • <Charset> Character set of the formatted text of the page (typically Storage Charset unless conversion failure)

    • <SiteName> Name of the site for this result item

    • <UrlMoreResultsFromSite> URL for more results from this site

    • In addition, any Additional Fields that have been selected for Output will be sent as child elements of Result, one per field. Each element is named after the field, with a u: XML namespace prefix since they are custom fields. The value of the field will be the content of the element.

      For example, an Integer field Quantity and a GMLPoint field Location may be given as:

      <u:Quantity>57</u:Quantity>
      <u:Location>47.4500 -122.3000</u:Location>

  • <RightBestBets> List of right "Best Bets" links, see TopBestBets

  • <Spelling> Spelling suggestions

    • <SuggestWord> An individual spelling suggestion

      • <SpellPhrase> Label for the suggestions

      • <SpellLink> URL to search for the suggestion

      • <SpellWord> Suggestion content

      • <SpellCount> Number of results for this suggestion

  • <exportVar> Additional exported variables

  • <QueryMessage> Messages to show to the user

  • <Message> Additional diagnostic messages

    Attributes:

    • @type - Set to user for messages meant for end users, admin for Webinator administrator diagnostics

    • @code - Code for this message

    • @script - Script of this message

    • @line - Line number where this message occurred
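
As a rough illustration of consuming the elements listed above, the following Python sketch reads the Summary total and each Result from a results document. The sample XML is made up for the example, and the namespace URI bound to the u: prefix (urn:example:user-fields) is a placeholder assumption, not the URI Webinator actually sends; the function name parse_results is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real output carries many more elements.
# The namespace URI bound to the u: prefix is a placeholder assumption.
SAMPLE = """<ThunderstoneResults xmlns:u="urn:example:user-fields">
  <Query>widgets</Query>
  <Summary>
    <Start>1</Start>
    <End>2</End>
    <TotalNum>2</TotalNum>
  </Summary>
  <Result>
    <Num>1</Num>
    <ResultTitle>Widget catalog</ResultTitle>
    <Url>http://www.example.com/catalog.html</Url>
    <ClickUrl>http://search.example.com/click?n=1</ClickUrl>
    <PercentRank>87</PercentRank>
    <u:Quantity>57</u:Quantity>
  </Result>
  <Result>
    <Num>2</Num>
    <ResultTitle>Widget FAQ</ResultTitle>
    <Url>http://www.example.com/faq.html</Url>
    <PercentRank>52</PercentRank>
  </Result>
</ThunderstoneResults>"""


def local_name(tag):
    """Strip the '{namespace-uri}' part ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def parse_results(xml_text):
    """Return the Summary total and a list of simplified result records."""
    root = ET.fromstring(xml_text)
    results = []
    for res in root.findall("Result"):
        url = res.findtext("Url")
        results.append({
            "num": int(res.findtext("Num")),
            "title": res.findtext("ResultTitle"),
            "url": url,
            # ClickUrl is only sent when Query Logging is on; fall back to Url.
            "link": res.findtext("ClickUrl") or url,
            # Namespaced children are the Additional Fields (u: prefix).
            "fields": {
                local_name(child.tag): child.text
                for child in res
                if child.tag.startswith("{")
            },
        })
    # Summary is only present if a search was actually performed.
    total = int(root.findtext("Summary/TotalNum", default="0"))
    return {"total": total, "results": results}


parsed = parse_results(SAMPLE)
for item in parsed["results"]:
    print(item["num"], item["title"], item["link"], item["fields"])
```

Picking up Additional Fields by namespace rather than by name keeps the sketch independent of which fields a given profile has selected for Output.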

 

Token                          Description
ok                             Normal exit
ResAuth-ExternalLoginRequired  Need Login Cookies: redirect to <RedirectUrl>
ResAuth-CredentialsRequired    Need user/pass: send rauser/rapass to <LoginUrl>
ResAuth-LoginIncorrect         User/pass incorrect; re-send to <LoginUrl>
ResAuth-SuccessLimit           Successful Auth Result Limit reached
ResAuth-Timeout                Results Authorization timeout
ResAuth-MaxDocsCheck           Max Docs to Auth-Check exceeded
ResAuth-SmbError               SMB error
ResAuth-NoSmb                  SMB unavailable/could not be run
NoProfileSpecified             No profile specified
InvalidProfileName             Invalid profile name (e.g. illegal characters)
NoSuchProfile                  No such profile
Timeout                        Search Timeout exceeded

Table 5.2: XML <ExitReason> Tokens
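
The tokens in Table 5.2 suggest a simple dispatch on the client side. The Python sketch below is one possible way to turn an <ExitReason> value into a next step; the function name next_step, the action labels it returns, and the choice to treat a missing <ExitReason> as ok are all assumptions made for illustration, not behavior defined by Webinator.

```python
import xml.etree.ElementTree as ET

# Tokens that ask the client to collect credentials via <LoginUrl>.
LOGIN_FORM_TOKENS = {"ResAuth-CredentialsRequired", "ResAuth-LoginIncorrect"}


def next_step(xml_text):
    """Return an (action, detail) pair for a results document."""
    root = ET.fromstring(xml_text)
    # Search at any depth so the sketch does not depend on exact nesting.
    # Assumption: a missing <ExitReason> is treated like "ok".
    reason = root.findtext(".//ExitReason", default="ok").strip()

    if reason == "ok":
        return ("show_results", None)
    if reason == "ResAuth-ExternalLoginRequired":
        # External login page that sets the needed cookies or parameters.
        return ("redirect", root.findtext(".//RedirectUrl"))
    if reason in LOGIN_FORM_TOKENS:
        # Local form that accepts the rauser/rapass variables.
        return ("login_form", root.findtext(".//LoginUrl"))
    # Timeouts, profile errors, SMB problems and so on: report the token.
    return ("error", reason)
```

A caller might redirect the browser for the first case, render a credentials form for the second, and log or display the token for everything else.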

Match Info output is similar to search results, except it contains a ContextResult element instead of Result elements. ContextResult contains:

<ContextResult> Container for the "Match Info" for this result

  • <Url> URL of this result

  • <ClickUrl> URL for this result item, as should be clicked by the user. Use Url if not present. Only sent if Query Logging is enabled, in which case it contains a redirect for logging the click-through

  • <UrlDisplay> Displayed URL for this result

  • <Depth> Number of links walked from Base URL(s) to this URL, with a full text label

  • <Size> Size (bytes) of this result

  • <MimeType> MimeType for this result

  • <MimeTypeIcon> Icon file to use for this MimeType

  • <Modified> Date and time this result was last modified

  • <Visited> Date and time this result was walked

  • <RecordCategory> Categories that would match this result

  • <Title> Title of this result

  • <Description> Description of the result

  • <Keywords> Keywords of the result

  • <Meta> Extracted metadata of the result

  • <Body> Body text of the result

  • In addition, any Additional Fields that have been selected for Output will be sent as child elements of ContextResult, one per field. Each element is named after the field, with a u: XML namespace prefix since they are custom fields. The value of the field will be the content of the element.

    For example, an Integer field Quantity and a GMLPoint field Location may be given as:

    <u:Quantity>57</u:Quantity>
    <u:Location>47.4500 -122.3000</u:Location>
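
A Match Info document can be read in much the same way as ordinary results. The Python sketch below pulls a few ContextResult children and splits a GMLPoint-style value into two numbers. The sample document, the function name parse_match_info, and the namespace URI bound to u: (urn:example:user-fields) are assumptions for illustration; it also assumes <RecordCategory> may repeat, one element per category.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace for Additional Fields; the real URI is not given here.
NS = {"u": "urn:example:user-fields"}

SAMPLE = """<ThunderstoneResults xmlns:u="urn:example:user-fields">
  <ContextResult>
    <Url>http://www.example.com/catalog.html</Url>
    <Size>2048</Size>
    <Title>Widget catalog</Title>
    <RecordCategory>Products</RecordCategory>
    <Body>All widgets ship within two days.</Body>
    <u:Location>47.4500 -122.3000</u:Location>
  </ContextResult>
</ThunderstoneResults>"""


def parse_match_info(xml_text):
    """Return a dict of selected ContextResult values, or None if absent."""
    root = ET.fromstring(xml_text)
    ctx = root.find("ContextResult")
    if ctx is None:
        return None

    # Split the space-separated point into floats without assuming
    # which coordinate comes first.
    location = ctx.findtext("u:Location", namespaces=NS)
    coords = tuple(float(part) for part in location.split()) if location else None

    return {
        "url": ctx.findtext("Url"),
        "title": ctx.findtext("Title"),
        "size": int(ctx.findtext("Size", default="0")),
        "categories": [c.text for c in ctx.findall("RecordCategory")],
        "body": ctx.findtext("Body"),
        "coords": coords,
    }


info = parse_match_info(SAMPLE)
print(info["title"], info["size"], info["categories"], info["coords"])
```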


Copyright © Thunderstone Software     Last updated: Dec 5 2019