<urlutil $action [$arg ...]>
The urlutil function provides URL and other network-related
utility functions. The $action argument determines what it does:
abs $absurl $relurl or
absurl $absurl $relurl
Makes URLs absolute (fully specified). The $absurl values are one or more absolute page URLs. The $relurl values are corresponding links - relative or not - from those page(s). For each $relurl value, its absolute value is returned, as if it were a link on that page. If there are fewer $absurl values than $relurl values, the last $absurl value is re-used. The protocol and hostname (if any) in each returned value will be lowercase.
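As a rough illustration of the resolution and pairing rules above, the following Python sketch uses the standard library's urllib.parse.urljoin. It is an analogy only, not how Texis implements abs: the helper name make_absolute is hypothetical, and the lowercasing of protocol and hostname described above is not reproduced.

```python
from urllib.parse import urljoin


def make_absolute(abs_urls, rel_urls):
    """Resolve each link against its page URL.

    Hypothetical helper: when there are fewer page URLs than links,
    the last page URL is re-used (abs_urls is assumed non-empty).
    """
    results = []
    for i, rel in enumerate(rel_urls):
        # Clamp the index so extra links fall back to the last page URL.
        base = abs_urls[min(i, len(abs_urls) - 1)]
        results.append(urljoin(base, rel))
    return results


pages = ["http://example.com/dir/page.html"]
links = ["other.html", "/top.html", "http://example.org/x"]
for url in make_absolute(pages, links):
    print(url)
```

A link that is already absolute is returned unchanged, which matches the "relative or not" wording above.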
charsetcanon $charset
Returns canonical name for charset name $charset, according to the current Charset Config file. Can be used to map charset aliases to canonical names.
charsetconv $buf $from [$to]
Converts text buffer $buf from charset $from to charset $to. The default for $to if unspecified or empty is the current <urlcp charsettxt> setting. Some character sets may require the use of an external charset converter (use <urlcp charsetconverter> to change the default), which is automatically executed when needed. Added in version 5.00.1090598954 20040723.
charsetdetect $buf
Returns guess at charset for text buffer $buf, or "Unknown" if charset unknown. Only limited charset detection is supported, primarily UTF-8, UTF-16BE/UTF-16LE, and all-7-bit ISO-8859-1. Added in version 7.02.1398457000 20140425.
Takes the URL $u, which must be a file:// URL, and returns the local file path that would be used to read the file, as determined by the current <urlcp fileroot> etc. settings.
pacinit
Initializes proxy auto-config by fetching the PAC script (if
configured) and running it. Returns 1 if
successful, 0 if not. The error from the fetch, messages from the
fetch and script execution, and the body of the script (if
fetched) are available afterwards via
<urlinfo>. If neither a PAC
script nor a URL is configured, or the script was already
initialized, no action is taken, and 1 (success) is returned.
Calling <urlutil pacinit> when using a PAC script is not
necessary: the PAC script is automatically fetched and run when
needed, i.e. on first use, and
any messages at PAC initialization are reported. However, any PAC
failure during such automatic initialization merely translates
into a Proxy auto-config error for the request that triggered it. The
<urlutil pacinit> action provides a way to get more
detailed information about the PAC script, if desired.
split $u $part
Splits a URL into parts. The $u value is the URL to split. The $part value is a single part to return. The part can be any of
In Texis version 8 and later, authority and
hostIsIPv6 were added. The
authority part is the part of the URL after the trailing //
of the protocol and before the path, including all separators
therein. Thus if present, it contains the host (with any IPv6
brackets), optional user/pass info, and optional port. The
hostIsIPv6 value is 1 if the host looks
like a bracketed IPv6 address (the
host value will have
the brackets stripped then), or 0 if not; in version
8.00.1637010861 20211115 and later it is a
long value, in
earlier versions a string.
In version 8.00.1637010861 20211115,
support was added, and
allpartnames was added. Also in
this version, support for multiple parts in $part was
removed (now gives an error message). This allows a missing part
(zero return values) to be distinguished from a present but empty
part (one empty string return value). In previous versions,
multiple parts could be requested, and thus the return values were
in sync with
$part, which required missing part(s) to be
returned as empty string instead;
also always silently returned as empty.
The allpartnames part returns a list of the names of the zero or more part(s) that are
present in the URL.
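For comparison, the authority, host and port relationship described above can be seen with Python's urllib.parse.urlsplit. This is only an analogy to the split action, not the Texis implementation: the helper split_url and its dictionary keys are hypothetical, chosen to echo the part names mentioned in the text.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Hypothetical helper returning a few URL parts as a dict."""
    parts = urlsplit(url)
    # Drop any user:pass@ prefix to look at the host[:port] portion only.
    hostport = parts.netloc.rpartition("@")[2]
    return {
        "protocol": parts.scheme,
        # Everything between "//" and the path, separators included.
        "authority": parts.netloc,
        # urlsplit strips the IPv6 brackets from the host name.
        "host": parts.hostname,
        "hostIsIPv6": 1 if hostport.startswith("[") else 0,
        "port": parts.port,
        "path": parts.path,
    }


info = split_url("http://user:pw@[2001:db8::1]:8080/dir/page.html?q=1")
for name, value in info.items():
    print(name, "=", value)
```

Note that the authority value keeps the brackets around the IPv6 address while the host value does not, mirroring the distinction drawn above.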
sslcertificate $pem tostring
Parses an SSL certificate string buffer $pem (in PEM format). The tostring sub-action returns a human-readable string version of the certificate, with subject, issuer, expiration etc. printed. This can be used to view a server certificate returned from <urlinfo sslservercertificate>.
Several actions take inet style argument(s).
This is an IPv4 address string (or, in version 8 and later, an IPv6
address string), optionally followed by a netmask.
For IPv4, the format is dotted-decimal, i.e. N[.N[.N[.N]]] where N is a decimal, octal or hexadecimal integer from 0 to 255. If x < 4 values of N are given, the last N is taken as the last 5-x bytes instead of 1 byte, with missing bytes padded to the right. E.g. 192.258 is valid and equivalent to 192.1.2.0: the last N is 2 bytes in size, and covers 5 - 2 = 3 needed bytes, including 1 zero pad to the right. Conversely, 192.168.4.1027 is not valid: four values are given, so the last N must fit in 1 byte, and 1027 is too large.
An IPv4 address may optionally be followed by a netmask, either of the form /B or :IPv4, where B is a decimal, octal or hexadecimal netmask integer from 0 to 32, and IPv4 is a dotted-decimal IPv4 address of the same format described above. If an :IPv4 netmask is given, only the largest contiguous set of most-significant 1 bits are used (because netmasks are contiguous). If no netmask is given, it will be calculated from standard IPv4 class A/B/C/D/E rules, but will be large enough to include all given bytes of the IP. E.g. 22.214.171.124 is Class A which has a netmask of 8, but the netmask will be extended to 32 to include all 4 given bytes.
In version 8 and later, IPv6 addresses are supported as well. These are given in standard IPv6 hex format, i.e. H:H:H:H:H:H:H:H where H is a 16-bit hexadecimal number, with :: supported for a single span of zero bits, as per canonical IPv6 text representation.
An IPv6 address may optionally be followed by a netmask, of the form /B, where B is a decimal, octal or hexadecimal netmask integer from 0 to 128. If no netmask is given, it defaults to the host-only network (i.e. 128).
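The abbreviated IPv4 rule above (fewer than four values, with the last value spanning the remaining bytes) can be sketched in Python as follows. This is an illustration of the rule as stated, not Texis code: the function names are hypothetical, netmasks are not handled, and the C-style prefixes assumed for octal (leading 0) and hexadecimal (leading 0x) are an assumption, since the text does not spell out the syntax.

```python
def parse_number(text):
    """Parse decimal, octal (leading 0) or hex (leading 0x); assumed syntax."""
    text = text.lower()
    if text.startswith("0x"):
        return int(text[2:], 16)
    if len(text) > 1 and text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


def parse_ipv4(text):
    """Return the 4 address bytes for N[.N[.N[.N]]], or raise ValueError."""
    values = [parse_number(field) for field in text.split(".")]
    count = len(values)
    if not 1 <= count <= 4:
        raise ValueError("expected 1 to 4 values")
    if any(not 0 <= v <= 255 for v in values[:-1]):
        raise ValueError("leading values must be 0 to 255")
    last = values[-1]
    if last < 0:
        raise ValueError("negative value")
    # Bytes needed to hold the last value, and bytes it has to cover.
    width = max(1, (last.bit_length() + 7) // 8)
    span = 5 - count
    if width > span:
        raise ValueError("last value too large")
    # Big-endian bytes of the last value, zero-padded to the right.
    tail = last.to_bytes(width, "big") + bytes(span - width)
    return bytes(values[:-1]) + tail


print(".".join(str(b) for b in parse_ipv4("192.258")))
try:
    parse_ipv4("192.168.4.1027")
except ValueError as err:
    print("rejected:", err)
```

Here 258 is 0x0102, so it fills two bytes and the third needed byte is the zero pad, while 1027 cannot fit in the single byte left for a fourth value.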
In version 7.07.1554395000 20190404 and later, error messages are reported.
The inet actions were added in version 5.01.1112986377 20050408,
and include the following (see also the SQL equivalents):
inetabbrev $inet
Returns a possibly shorter-than-canonical representation of
$inet, where trailing zero byte(s) of an IPv4 address may
be omitted. All bytes of the network, and leading non-zero bytes
of the host, will be included. E.g. <urlutil inetabbrev
"192.100.0.0/24"> returns 192.100.0/24. The /B
netmask is included, except if (in version 7.07.1554840000
20190409 and later) the network is host-only (i.e. netmask is the
full size of the IP address). Empty string is returned on error.
Returns canonical representation of
$inet. For IPv4, this
is dotted-decimal with all 4 bytes.
For IPv6, this is 8 16-bit hexadecimal integers (no leading
zeroes), colon-separated, possibly with a :: for zero bits.
The /B netmask is included, except if (in version
7.07.1554840000 20190409 and later) the network is host-only
(i.e. netmask is the full size of the IP address). Empty string
is returned on error.
inetnetwork $inet
Returns string IP address with the network bits of $inet, and the host bits set to 0. Empty string is returned on error.
inethost $inet
Returns string IP address with the host bits of $inet, and the network bits set to 0. Empty string is returned on error.
inetbroadcast $inet
Returns string IP broadcast address for $inet, i.e. with the network bits, and host bits set to 1. Empty string is returned on error.
inetnetmask $inet
Returns string IP netmask for $inet, i.e. with the network bits set to 1, and host bits set to 0. Empty string is returned on error.
inetnetmasklen $inet
Returns integer netmask length of $inet. -1 is returned on error.
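The bit manipulations behind these five actions can be illustrated with Python's standard ipaddress module. This is a sketch for comparison only: describe_inet is a hypothetical helper, the exact string form Texis returns (for example whether a /B suffix is appended) is not reproduced, and the input must carry an explicit netmask because Python does not apply the class-based default described earlier.

```python
import ipaddress


def describe_inet(inet):
    """Hypothetical helper: network, host, broadcast and netmask of inet."""
    iface = ipaddress.ip_interface(inet)
    net = iface.network
    # Host bits: keep only the bits not covered by the netmask.
    host_bits = int(iface.ip) & int(net.hostmask)
    return {
        "network": str(net.network_address),
        "host": str(type(iface.ip)(host_bits)),
        "broadcast": str(net.broadcast_address),
        "netmask": str(net.netmask),
        "netmasklen": net.prefixlen,
    }


for name, value in describe_inet("192.168.4.77/24").items():
    print(name, "=", value)
```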
inetcontains $inetA $inetB
Returns 1 if $inetA contains $inetB, i.e. every address in $inetB occurs within the $inetA network. 0 is returned if not, or -1 on error. Note that an IPv4 address is not considered to be contained within the equivalent IPv4-mapped IPv6 address, nor vice-versa (e.g. ::ffff:184.108.40.206 is considered different from 184.108.40.206). To treat IPv4 addresses the same as their IPv4-mapped IPv6 equivalents, promote both arguments to IPv4-mapped IPv6 first (see the conversion to IPv4-mapped IPv6 described below).
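The containment test, including the rule that IPv4 and IPv6 values never contain each other, can be sketched with Python's ipaddress module (subnet_of requires Python 3.7 or later). The function inet_contains below is a hypothetical stand-in that mirrors the 1/0/-1 return convention described above. It is not the Texis implementation, and it expects explicit netmasks rather than class-based defaults.

```python
import ipaddress


def inet_contains(inet_a, inet_b):
    """Return 1 if network inet_a contains inet_b, 0 if not, -1 on error."""
    try:
        net_a = ipaddress.ip_network(inet_a, strict=False)
        net_b = ipaddress.ip_network(inet_b, strict=False)
    except ValueError:
        return -1
    # Different address families are never considered to contain each other.
    if net_a.version != net_b.version:
        return 0
    return 1 if net_b.subnet_of(net_a) else 0


print(inet_contains("10.0.0.0/8", "10.1.2.0/24"))
print(inet_contains("10.1.2.0/24", "10.0.0.0/8"))
print(inet_contains("::ffff:10.1.2.3/128", "10.1.2.3/32"))
print(inet_contains("not-an-address", "10.0.0.0/8"))
```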
inetclass $inet
Returns the class of $inet, or classless if a different netmask is used (or the address is IPv6). Empty string is returned on error.
inet2int $inet
Returns integer representation of IP network/host bits of
$inet (i.e. without netmask); useful for compact storage of
address as integer(s) instead of string.
Returns a varint with 1 value for IPv4 addresses, 4 for IPv6
addresses, or 0 values on error (i.e. return compares equal to
empty string on error). Note that in version 7 and earlier, a
single int was always returned, with -1 for error (or 255.255.255.255).
int2inet $i
Returns inet string for integer $i taken as an IP address. Since no netmask can be stored in the integer form of an IP address, the returned IP string will not have a netmask. Empty string is returned on error.
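The integer round trip performed by inet2int and int2inet can be pictured with the Python sketch below. It is an analogy under stated assumptions, not Texis behavior: the helper names are hypothetical, the values are unsigned, and the most-significant-word-first ordering of the four 32-bit values for IPv6 is an assumption the text does not confirm.

```python
import ipaddress


def inet_to_words(addr):
    """Return 1 integer for IPv4, or 4 32-bit integers for IPv6."""
    ip = ipaddress.ip_address(addr)
    number = int(ip)
    if ip.version == 4:
        return [number]
    # Assumed ordering: most significant 32-bit word first.
    return [(number >> shift) & 0xFFFFFFFF for shift in (96, 64, 32, 0)]


def words_to_inet(words):
    """Rebuild the address string; no netmask survives the round trip."""
    if len(words) == 1:
        return str(ipaddress.IPv4Address(words[0]))
    number = 0
    for word in words:
        number = (number << 32) | word
    return str(ipaddress.IPv6Address(number))


for text in ("192.168.4.1", "2000::a:b:c:d"):
    words = inet_to_words(text)
    print(text, "->", words, "->", words_to_inet(words))
```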
Converts $inet to IPv4 (including netmask), iff IPv4-mapped
IPv6. Returns the equivalent IPv4 address for
$inet iff it
is an IPv4-mapped IPv6 address; e.g. ::ffff:220.127.116.11 would
return 220.127.116.11. Otherwise, returns canonical version of
$inet iff it is some other IPv6 address; e.g. 2000::a:000b:c:d would return 2000::a:b:c:d. Otherwise
returns empty string (i.e. on error). May be useful when storing
both IPv4 and IPv6 addresses in a common compact int(4) field from
inet2int, in order to recover original IP family
format on display (after
int2inet reconversion). Added in version 8.
Converts $inet to IPv4-mapped IPv6 (including netmask), iff
IPv4. Returns the equivalent IPv4-mapped IPv6 address for
$inet iff it is IPv4; e.g. 22.214.171.124 would return ::ffff:22.214.171.124. Otherwise, returns canonical version of
$inet iff it is IPv6; e.g. 2000::a:000b:c:d would
return 2000::a:b:c:d. Otherwise returns empty string
(i.e. on error). May be useful when storing both IPv4 and IPv6
addresses in a common compact int(4) field from
inet2int, in order to convert potential IPv4 addresses to
IPv6 before inet2int conversion. Added in version 8.
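The two conversions between IPv4 and IPv4-mapped IPv6 can be sketched in Python with the ipaddress module. The helpers to_ipv4 and to_ipv6 are hypothetical names, netmasks are ignored, and returning a plain IPv4 input unchanged from to_ipv4 is an assumption, since the text above does not state what happens in that case.

```python
import ipaddress


def to_ipv6(addr):
    """IPv4 becomes ::ffff:a.b.c.d; IPv6 is canonicalized; '' on error."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return ""
    if ip.version == 4:
        return "::ffff:" + str(ip)
    if ip.ipv4_mapped is not None:
        # Build the dotted form explicitly for a stable result.
        return "::ffff:" + str(ip.ipv4_mapped)
    return str(ip)


def to_ipv4(addr):
    """IPv4-mapped IPv6 becomes IPv4; other IPv6 is canonicalized."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return ""
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return str(ip.ipv4_mapped)
    # Assumption: a plain IPv4 input is passed through unchanged.
    return str(ip)


print(to_ipv6("1.2.3.4"))
print(to_ipv4("::ffff:1.2.3.4"))
print(to_ipv4("2000::a:000b:c:d"))
```

Promoting both values with to_ipv6 before comparing them puts IPv4 and IPv4-mapped IPv6 addresses in the same family, which is the use suggested for containment tests.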
Returns IP address family for
$inet: IPv4 iff IPv4
address, IPv6 iff IPv6 address, otherwise empty string.
Added in version 8.
<urlutil abs "http://example.com/dir/page.html" "other.html">
The return value in $ret would be http://example.com/dir/other.html.
The urlutil function was added in version 3.0.957600000 20000505.