The following <urlcp> settings control what headers are sent, which
can affect what document the remote web server will return. Some
settings control what headers are "received" or returned.
accept
(2 arguments)
Set the HTTP Accept header list of acceptable/desired MIME types.
Each value of the first argument ($value1) is a MIME media range,
e.g. "text/html" or "image/*". The corresponding value of $value2,
if given, is a "quality" value, a percentage from 0 to 100. If
greater than 0, the q value of the corresponding media range is set
to that value. If $value2 has fewer values than $value1, the last
value of $value2, if any, is reused. See the HTTP specification for
details on how these values are used by Web servers. The default
Accept list (if not set) is "*/*", i.e. any type.
Changing the Accept list may affect the content type of the
document a Web server will send for a given URL, but it is no
guarantee that the requested type(s) will be returned. It is up to
the server to send the most appropriate form of a document based on
the Accept list.
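The pairing rule described above can be illustrated outside Vortex. The following Python sketch is not part of Texis: it only models how a list of media ranges and a possibly shorter list of quality percentages might combine into one Accept value. The function name build_accept is hypothetical, and the mapping of a percentage to a q value (percent / 100) is an assumption made for illustration.

```python
def build_accept(media_ranges, qualities=()):
    """Combine media ranges and 0-100 quality percentages into an Accept value.

    Hypothetical helper: it models the documented pairing rule only.
    """
    parts = []
    for i, media_range in enumerate(media_ranges):
        if i < len(qualities):
            percent = qualities[i]      # matching quality value was given
        elif qualities:
            percent = qualities[-1]     # fewer qualities: reuse the last one
        else:
            percent = 0                 # no qualities at all: no q parameter
        if percent > 0:
            # Assumption: a percentage maps to q as percent / 100.
            q = f"{percent / 100:.2f}".rstrip("0").rstrip(".")
            parts.append(f"{media_range};q={q}")
        else:
            parts.append(media_range)
    return ", ".join(parts)


print(build_accept(["text/html", "image/*", "text/plain"], [100, 50]))
```

In this model the third media range reuses the last quality value (50), mirroring the reuse rule for a short $value2 list.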
clearheaders
(no arguments)
Undo all headers set with the header setting. Any header values
that overrode builtin headers will be restored to their builtin
values.
fileresolveownership
(boolean)
Whether to resolve the owner and group SIDs (under Windows) and names of locally-fetched file:// URLs. If enabled, these will be returned in the response headers File-Owner-SID, File-Group-SID (under Windows), and File-Owner-Name, File-Group-Name (all platforms). (File-Uid, File-Gid are always returned under Unix, since no additional traffic is needed to determine them.)
Off by default, since resolving this information uses extra network traffic and time, possibly blocking if the domain controller or NIS server cannot be reached. Added in version 8.01.1669072604 20221121.
header
(list, 2 arguments)
Set the HTTP request headers given in the first argument to the
corresponding values in the second argument. This can be used to
set additional headers not otherwise settable. Note that cookies
are automatically handled in version 4.01.1022000000 20020521 and
later, and thus Cookie headers do not generally need to be set in
those versions.
In version 5.01.1245974000 20090625 and later, headers specified
with this setting will replace builtin headers of the same name
(e.g. Host), instead of causing a second copy of the header to be
sent. Note that setting/overriding builtin headers can cause
erratic behavior, as user-specified values may interfere with
library functionality. Builtin headers include Accept,
Authorization, Connection, Content-Length, Content-Type, Cookie,
Host, If-Modified-Since, Proxy-Authorization, Upgrade and
User-Agent. All of these are set automatically by the library
and/or have other <urlcp> settings that are the preferred method of
controlling them.
Setting a single empty value for a header will clear it (prevent
it from being sent, even if there is normally a builtin value for
the header). Setting no values (i.e. $null in version 8+) will undo
any previous <urlcp header> set for the header, i.e. the builtin
value (if any) will be sent.
It is not possible to send the same header multiple times: later values set will merely replace earlier ones and the header will be sent at most once. To send multiple values for a single header, set a single value with multiple tokens according to the HTTP syntax for the given header (typically comma-separated).
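The replace, clear and undo rules above can be summarized with a small model. The Python sketch below is not the Texis implementation or API: the class HeaderModel, its method names and the sample builtin values are all hypothetical, and header names are compared exactly here for simplicity (HTTP header names are case-insensitive).

```python
class HeaderModel:
    """Toy model of builtin headers plus user overrides (not the Texis API)."""

    def __init__(self, builtin):
        self.builtin = dict(builtin)   # values the library would send by default
        self.overrides = {}            # values set by the user

    def set_header(self, name, value):
        if value is None:
            # No value: undo any earlier override, so the builtin value returns.
            self.overrides.pop(name, None)
        else:
            # A later value replaces an earlier one; "" suppresses the header.
            self.overrides[name] = value

    def clear_headers(self):
        # Models clearheaders: drop every user override.
        self.overrides.clear()

    def headers_to_send(self):
        merged = {**self.builtin, **self.overrides}
        # Each header appears at most once; empty values are not sent.
        return {name: value for name, value in merged.items() if value != ""}


def join_values(values):
    """Pack several values into one header value, comma-separated."""
    return ", ".join(values)


model = HeaderModel({"Host": "www.example.com"})   # illustrative builtin value
model.set_header("Cache-Control", join_values(["no-cache", "no-store"]))
model.set_header("Host", "")                        # suppress the builtin Host
print(model.headers_to_send())
```

Calling model.set_header("Host", None) afterwards would restore the builtin Host value in this model, matching the "setting no values" case.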
ifmodsince
(string)
Sets the HTTP If-Modified-Since header to the given value. The
argument is a time, either in Texis-parseable format or HTTP date
format (www, dd mmm yyyy hh:mm:ss GMT). If the argument is empty,
the header is cancelled.
Setting the If-Modified-Since header creates a conditional request:
the document is only returned if it has been changed since the given
time; otherwise an empty document is returned. Setting this header
on a per-page basis, to the Last-Modified value from the previous
fetch, can reduce the traffic when re-walking a site: only new or
changed documents are returned. Note that it is up to the remote
server to handle the If-Modified-Since header, and the given time
is interpreted in its domain.
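As a rough illustration of the conditional-request idea, the Python sketch below formats an HTTP date, builds the request header from a stored Last-Modified value, and compares two HTTP dates the way a server might. It uses only the Python standard library; the function names are hypothetical, and the comparison is a simplified assumption about server behavior, not a description of Texis or of any particular server.

```python
from email.utils import formatdate, parsedate_to_datetime


def http_date(epoch_seconds):
    """Format a Unix timestamp in HTTP date format."""
    return formatdate(epoch_seconds, usegmt=True)


def conditional_headers(previous_last_modified):
    """Build request headers for a re-fetch from a stored Last-Modified value."""
    if not previous_last_modified:
        return {}   # nothing stored: make an ordinary, unconditional request
    return {"If-Modified-Since": previous_last_modified}


def changed_since(last_modified, if_modified_since):
    """Simplified server-side check: was the document changed after the given time?"""
    return parsedate_to_datetime(last_modified) > parsedate_to_datetime(if_modified_since)


stored = http_date(0)   # stand-in for the Last-Modified value of a previous fetch
print(conditional_headers(stored))
print(changed_since("Fri, 02 Jan 1970 00:00:00 GMT", stored))
```

When the check is false, a server following the HTTP specification typically answers with status 304 (Not Modified) and no body, which corresponds to the empty document mentioned above.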
useragent
(string)
Sets the User-Agent header sent with HTTP requests. The default is
Mozilla/5.0 (compatible; T-H-U-N-D-E-R-S-T-O-N-E).