Headers

The following urlcp settings control what headers are sent with requests, which can affect what document the remote web server will return:

  • accept (2 arguments) Set the HTTP Accept header, the list of acceptable/desired MIME types. Each value of the first argument ($value1) is a MIME media range, e.g. "text/html" or "image/*". The corresponding value of $value2, if given, is a "quality" value, a percentage from 0 to 100. If it is greater than 0, the q value of the corresponding media range is set to that value. If $value2 has fewer values than $value1, the last value of $value2, if any, is reused for the remaining media ranges. See the HTTP specification for details on how these values are used by Web servers. The default Accept list (if not set) is "*/*", i.e. any type.

    Changing the Accept list may affect the content type of the document a Web server will send for a given URL, but there is no guarantee that the requested type(s) will be returned. It is up to the server to send the most appropriate form of a document based on the Accept list.

  • clearheaders (no arguments) Clear any headers set with header.

  • header (list, 2 arguments)

    Set the HTTP request headers given in the first argument to the corresponding values in the second argument. This can be used to set additional headers not otherwise settable. Note that cookies are handled automatically in version 4.01.1022000000 20020521 and later, so Cookie headers do not generally need to be set in those versions.

    In version 5.01.1245974000 20090625 and later, headers specified with this setting will replace builtin headers of the same name (e.g. Host), instead of causing a second copy of the header to be sent. Note that setting/overriding builtin headers can cause erratic behavior, as user-specified values may interfere with library functionality. Builtin headers include Accept, Authorization, Connection, Content-Length, Content-Type, Cookie, Host, If-Modified-Since, Proxy-Authorization, Upgrade and User-Agent. All of these are set automatically by the library and/or have other <urlcp> settings that are the preferred method of controlling them.

    Setting an empty value for a header will clear it (prevent it from being sent). It is not possible to send the same header two or more times: later values will merely replace earlier ones and the header will be sent at most once. To send multiple values for a header, set a single value with multiple tokens according to the HTTP syntax for the given header (typically comma-separated).

  • ifmodsince (string) Sets the HTTP If-Modified-Since header to the given value. The argument is a time, either in Texis-parseable format or HTTP date format (www, dd mmm yyyy hh:mm:ss GMT). If the argument is empty, the header is cancelled.

    Setting the If-Modified-Since header creates a conditional request: the document is returned only if it has changed since the given time; otherwise an empty document is returned. Setting this header on a per-page basis, to the Last-Modified value from the previous fetch, can reduce traffic when re-walking a site, since only new or changed documents are returned. Note that it is up to the remote server to handle the If-Modified-Since header, and the given time is interpreted in its domain.

  • useragent (string) Sets the User-Agent header sent with HTTP requests. The default is a Netscape-compatible string.
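
The header rules described above can be illustrated outside of Vortex. The following is a minimal Python sketch, not urlcp's actual implementation: the function names build_accept, set_headers and http_date are hypothetical, and the mapping of a quality percentage to q = percent / 100 is an assumption based on the description of the accept setting. It shows how an Accept list with reused quality values might be assembled, how later header values replace earlier ones while empty values clear a header, and how a time looks in HTTP date format for If-Modified-Since.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def build_accept(media_ranges, qualities=()):
    """Build an Accept header value from media ranges and 0-100 percentages."""
    parts = []
    for i, media_range in enumerate(media_ranges):
        if qualities:
            # Reuse the last quality when fewer qualities than ranges are given.
            percent = qualities[min(i, len(qualities) - 1)]
        else:
            percent = 0
        if percent > 0:
            # Assumption: a percentage of 80 corresponds to q=0.8.
            q = f"{percent / 100:.2f}".rstrip("0").rstrip(".")
            parts.append(f"{media_range};q={q}")
        else:
            parts.append(media_range)
    # With nothing set, fall back to the documented default of any type.
    return ", ".join(parts) if parts else "*/*"


def set_headers(current, names, values):
    """Apply header settings: later values replace earlier ones, empty clears."""
    result = dict(current)
    for name, value in zip(names, values):
        # HTTP header names are case-insensitive, so match on lowercase.
        existing = next((k for k in result if k.lower() == name.lower()), None)
        if existing is not None:
            del result[existing]
        if value != "":
            result[name] = value
    return result


def http_date(when):
    """Format an aware datetime in HTTP date format (always GMT)."""
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)


if __name__ == "__main__":
    print(build_accept(["text/html", "image/*", "*/*"], [100, 80]))
    print(set_headers({"Host": "a.example"}, ["Host", "X-Test"], ["b.example", ""]))
    print(http_date(datetime(2002, 5, 21, tzinfo=timezone.utc)))
```

In this sketch, a header that is set twice is stored only once, which mirrors the rule that the same header cannot be sent more than once; to send several values, they would be joined into a single comma-separated value before being set.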

Copyright © Thunderstone Software     Last updated: Dec 10 2018