Syntax: select the Yes or No button
With this set to Yes, Webinator first fetches
/robots.txt from each site being indexed and respects its
directives for which URL prefixes to ignore. Turning this setting off is
not generally recommended. Supported directives in
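
As a rough illustration of prefix-based exclusion, the sketch below uses Python's standard urllib.robotparser module. It is not Webinator's implementation; the robots.txt content, the agent name example-walker, and the helper allowed() are all assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real walker would fetch this
# from http://<site>/robots.txt before walking the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp
"""


def allowed(url, robots_txt=ROBOTS_TXT, agent="example-walker"):
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for u in ("http://example.com/index.html",
              "http://example.com/private/report.html",
              "http://example.com/tmp/scratch.html"):
        print(u, allowed(u))
```

Each Disallow value acts as a path prefix, so any URL whose path begins with /private/ or /tmp is skipped under these example rules.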
When set to Yes, Webinator will process and respect the <meta name="robots"> tag within each retrieved HTML page. This tag contains per-page robot (walker) control information.
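
To show what reading this tag can look like, here is a minimal sketch using Python's standard html.parser module. The class RobotsMetaParser and the function robots_directives() are hypothetical names, and the noindex and nofollow values in the sample page are only the commonly used ones; which values Webinator itself honors is not stated here.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the comma-separated values of <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() != "robots":
            return
        for token in attr.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    page = ('<html><head><meta name="ROBOTS" content="NOINDEX, nofollow">'
            '</head><body>Hello</body></html>')
    print(sorted(robots_directives(page)))
```

A walker could then skip storing the page text when noindex is present and skip following its links when nofollow is present.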
Whether to still put an (empty) entry - a placeholder - in the
html search table for URLs that are excluded via
<meta name="robots"> tags. Leaving a placeholder improves
refresh walks, as the URL can then have its own individual refresh
time like any other stored URL. Without a placeholder, the URL would
be fetched every time a link to it is found, because no record of its
most recent fetch would be stored.
The downside to placeholders is that if the URL is also being searched
in queries (i.e. Url is part of Index Fields), then
the excluded URL might be found in results. Placeholders have empty
text fields (no body, meta, etc.) to avoid matches on text, but
the URL field must remain.
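
The trade-off can be sketched with a small in-memory table. This is only an illustrative model, not Webinator's storage logic: the Row type, the REFRESH_SECONDS interval, and the functions store() and should_fetch() are assumptions made for the example.

```python
from dataclasses import dataclass

REFRESH_SECONDS = 3600  # hypothetical per-URL refresh interval


@dataclass
class Row:
    url: str           # always kept, so the URL itself stays searchable
    body: str          # empty for placeholders, so text queries cannot match
    fetched_at: float  # time of the last fetch, in seconds


def should_fetch(table, url, now):
    """Fetch if the URL is unknown or its refresh interval has elapsed."""
    row = table.get(url)
    return row is None or now - row.fetched_at >= REFRESH_SECONDS


def store(table, url, body, excluded, keep_placeholder, now):
    """Record a fetched page; excluded pages keep at most an empty row."""
    if excluded and not keep_placeholder:
        return  # nothing is remembered about this URL
    table[url] = Row(url, "" if excluded else body, now)


if __name__ == "__main__":
    url = "http://example.com/hidden.html"

    with_placeholder = {}
    store(with_placeholder, url, "text", True, True, now=0.0)
    print(should_fetch(with_placeholder, url, now=10.0))  # not refetched yet

    without_placeholder = {}
    store(without_placeholder, url, "text", True, False, now=0.0)
    print(should_fetch(without_placeholder, url, now=10.0))  # fetched again
```

In this model the placeholder row is what lets the walker skip the refetch, and its remaining url field is what a URL-based query could still match.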