Syntax: zero or more regular expressions (REX), separated by spaces or line breaks
When a Base URL matches, restricts the walk to fetching only URLs that match at least one of the specified regular expressions. The match may occur anywhere in the URL (hostname, path, or query).
If a Base URL is matched by an Extra URLs REX, then only URLs that match the Extra URLs REX will be walked on that host. If a Base URL does not match any Extra URLs REX, it is walked as normal.
This setting is rarely used. It is most commonly combined with a hostname to fetch matching URLs on an additional host. Links to those pages must still be found for them to be indexed.
For example, with the following Extra URLs REX:
(which matches a URL that begins with products.example.com and contains supplierid=BigCo), and using the following Base URLs:
The Extra URLs REX matches the products.example.com URL, so only pages with supplierid=BigCo will be walked, while all of help.example.com will be walked (subject to the other inclusion/exclusion rules).
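
The decision described above can be sketched in Python as a rough model. This is an illustration only, not the walker's actual implementation: the function name should_walk, the RULES and BASE_URLS values, and the http:// scheme are all assumptions. Python's re syntax stands in for REX syntax, which is different. Splitting each rule into a Base URL pattern and a walked-URL pattern is likewise a simplification made for the sketch; the real setting takes a single REX.

```python
import re
from urllib.parse import urlsplit

# Hypothetical rules: (pattern for the Base URL, pattern a walked URL must match).
# Python `re` syntax is a stand-in here; it is NOT REX syntax.
RULES = [
    (r"^http://products\.example\.com/", r"supplierid=BigCo"),
]

# Assumed Base URLs, built from the two hostnames named in the example.
BASE_URLS = ["http://products.example.com/", "http://help.example.com/"]


def should_walk(url, base_urls=BASE_URLS, rules=RULES):
    """Return True if `url` passes the Extra URLs restriction in this model."""
    host = urlsplit(url).hostname
    bases_on_host = [b for b in base_urls if urlsplit(b).hostname == host]

    # A rule restricts a host only when it matches a Base URL on that host.
    active = [
        url_pat
        for base_pat, url_pat in rules
        if any(re.search(base_pat, b) for b in bases_on_host)
    ]

    if not active:
        # No rule applies: this setting does not restrict the URL.
        # Other inclusion/exclusion rules (not modeled) still decide.
        return True

    # Restricted host: the URL must match at least one active pattern,
    # anywhere in the URL (hostname, path, or query).
    return any(re.search(pat, url) for pat in active)
```

In this model, a products.example.com URL is walked only if it contains supplierid=BigCo, while every help.example.com URL passes, which mirrors the behavior described in the example.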
Available from version
Extra Domains, here.
See here for details on REX search syntax.