<urlutil $action [$arg ...]>
The urlutil function provides URL and other network-related
utility functions. The
$action argument determines what it does:
abs $absurl $relurl or
absurl $absurl $relurl
Makes URLs absolute (fully specified). The
$absurl values are one or more absolute page URLs. The
$relurl values are corresponding links - relative or not - from those page(s). For each
$relurl value, its absolute value is returned, as if it were a link on that page. If there are fewer
$absurl values than $relurl values, the last
$absurl value is re-used. The protocol and hostname (if any) in each returned value will be lowercase.
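The pairing rule above can be illustrated outside Texis. The following is a minimal Python sketch, not Vortex code: `make_absolute` is a hypothetical helper name, and `urllib.parse.urljoin` stands in for the resolution that the abs action performs. It does not reproduce the lowercasing of protocol and hostname.

```python
from urllib.parse import urljoin


def make_absolute(abs_urls, rel_urls):
    """Resolve each link against its page URL.

    When there are fewer page URLs than links, the last page URL
    is re-used for the remaining links.
    """
    if not abs_urls:
        raise ValueError("at least one absolute page URL is required")
    results = []
    for i, rel in enumerate(rel_urls):
        # Clamp the index so the last page URL is re-used when needed.
        base = abs_urls[min(i, len(abs_urls) - 1)]
        results.append(urljoin(base, rel))
    return results


resolved = make_absolute(
    ["http://example.com/dir/page.html"],
    ["other.html", "/top.html", "http://example.org/x"],
)
```

Here one page URL serves three links: a relative link, a host-relative link, and a link that is already absolute and so passes through unchanged.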
charsetcanon $charset
Returns the canonical name for charset name
$charset, according to the current Charset Config file. Can be used to map charset aliases to canonical names.
charsetconv $buf $from [$to]
Converts text buffer
$buf from charset $from to charset $to. The default for
$to if unspecified or empty is the current
<urlcp charsettxt> setting. Some character sets may require the use of an external charset converter (see
<urlcp charsetconverter> to change it), which is automatically executed when needed. Added in version 5.00.1090598954 20040723.
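As a rough illustration of what a charset conversion does to a buffer, here is a minimal Python sketch. It is not Texis code: `charset_convert` is a hypothetical name, and the `utf-8` default is an assumption standing in for the `<urlcp charsettxt>` setting.

```python
def charset_convert(buf: bytes, from_charset: str, to_charset: str = "utf-8") -> bytes:
    """Decode buf from one charset and re-encode it in another.

    The "utf-8" default is an assumption for this sketch only.
    """
    return buf.decode(from_charset).encode(to_charset)


latin1 = "caf\u00e9".encode("iso-8859-1")   # b"caf\xe9"
utf8 = charset_convert(latin1, "iso-8859-1")
```

The single byte `0xE9` in ISO-8859-1 becomes the two-byte UTF-8 sequence for the same character.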
charsetdetect $buf
Returns a guess at the charset for text buffer
$buf, or "Unknown" if the charset is unknown. Only limited charset detection is supported, primarily UTF-8, UTF-16BE/UTF-16LE, and all-7-bit ISO-8859-1. Added in version 7.02.1398457000 20140425.
Takes URL $u, which must be a
file:// URL, and returns the local file path that would be used to read the file, as determined by the current
<urlcp fileroot> etc. settings.
pacinit
Initializes proxy auto-config by fetching the PAC script (if
configured) and running it. Returns 1 if
successful, 0 if not. The error from the fetch, messages from the
fetch and script execution, and the body of the script (if
fetched) are available afterwards via
<urlinfo>. If neither a PAC
script nor a URL is configured, or the script was already
initialized, no action is taken, and 1 (success) is returned.
Calling <urlutil pacinit> when using a PAC script is not
necessary: the PAC script is automatically fetched and run when
needed, i.e. at the first fetch, and
any messages at PAC initialization are reported. However, any PAC
failure during such automatic initialization merely translates
into a Proxy auto-config error for the fetch. The
<urlutil pacinit> action provides a way to get more
detailed information about the PAC script, if desired.
split $u $parts
Splits a URL into one or more parts. The
$u value is the URL to split. The
$parts values are a list of the parts to return; they are returned in the same order, as
$ret values. The parts can be any of the supported part names, such as host or
anchor. Note that some parts, including
pass, are not yet supported.
In Texis version 8 and later, additional part values were added, including
hostIsIPv6. One added value
is the part of the URL after the trailing // of the protocol
and before the path (the authority, in URL-syntax terms), including all separators therein. Thus if
present, it contains the host (with any IPv6 brackets), optional
user info, and optional port (with colon). The hostIsIPv6
value is 1 if the host looks like a bracketed IPv6 address
(the host value will then have the brackets stripped),
or 0 if not.
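To make the relationship between the authority, the host and the IPv6 flag concrete, here is a minimal Python sketch using `urllib.parse.urlsplit`. It is not Texis code. `split_url` is a hypothetical helper, and apart from host, hostIsIPv6 and anchor, the dictionary keys are labels chosen for this sketch rather than confirmed part names.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into named parts (key names partly hypothetical)."""
    parts = urlsplit(url)
    authority = parts.netloc                 # user info, host, port, separators
    hostport = authority.rpartition("@")[2]  # drop any user info
    return {
        "protocol": parts.scheme,
        "authority": authority,
        "host": parts.hostname or "",        # IPv6 brackets are stripped here
        "hostIsIPv6": 1 if hostport.startswith("[") else 0,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
        "anchor": parts.fragment,
    }


info = split_url("http://user@[2001:db8::1]:8080/a/b?x=1#top")
```

For this URL the authority keeps the brackets, user info and port, while the host is the bare address and the IPv6 flag is 1.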
sslcertificate $pem tostring
Parses an SSL certificate string buffer
$pem (in PEM format). The
tostring sub-action returns a human-readable string version of the certificate, with subject, issuer, expiration etc. printed. This can be used to view a server certificate returned from <urlinfo sslservercertificate>.
Several actions take
inet-style argument(s).
An inet argument is an IPv4 (or, in version 8 and later, IPv6)
address string, optionally followed by a netmask.
For IPv4, the format is dotted-decimal, i.e. N[.N[.N[.N]]], where N is a decimal, octal or hexadecimal integer from 0 to 255. If x < 4 values of N are given, the last N is taken as the last 5-x bytes instead of 1 byte, with missing bytes padded to the right. E.g. 192.258 is valid and equivalent to 192.1.2.0: the last N is 2 bytes in size, and covers 5 - 2 = 3 needed bytes, including 1 zero pad to the right. Conversely, 192.168.4.1027 is not valid: the last N is too large.
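The short-form rule can be sketched in Python as follows. This is one reading of the rule, not Texis code: `parse_ipv4` and `parse_component` are hypothetical names, C-style prefixes (`0x` for hexadecimal, a leading `0` for octal) are an assumption, and the last value is assumed to be left-aligned in the remaining bytes with zero padding on the right.

```python
def parse_component(text):
    """Parse one N value; C-style 0x/0 prefixes are an assumption."""
    t = text.lower()
    if t.startswith("0x"):
        value = int(t[2:], 16)
    elif len(t) > 1 and t.startswith("0"):
        value = int(t[1:], 8)
    else:
        value = int(t, 10)
    if value < 0:
        raise ValueError("negative component")
    return value


def parse_ipv4(text):
    """Return the 4 address bytes for a possibly shortened IPv4 string."""
    values = [parse_component(p) for p in text.split(".")]
    if not 1 <= len(values) <= 4:
        raise ValueError("expected 1 to 4 components")
    if any(v > 255 for v in values[:-1]):
        raise ValueError("component out of range")
    room = 5 - len(values)                      # bytes the last N must cover
    last = values[-1]
    size = max(1, (last.bit_length() + 7) // 8)  # bytes the last N occupies
    if size > room:
        raise ValueError("last component is too large")
    # Left-align the last value's bytes, then zero-pad on the right.
    tail = last.to_bytes(size, "big") + bytes(room - size)
    return bytes(values[:-1]) + tail


short_form = ".".join(str(b) for b in parse_ipv4("192.258"))
```

Under this reading, 192.258 expands to 192.1.2.0, and 192.168.4.1027 raises `ValueError` because the last value needs 2 bytes where only 1 remains.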
An IPv4 address may optionally be followed by a netmask, either of the form /B or :IPv4, where B is a decimal, octal or hexadecimal netmask integer from 0 to 32, and IPv4 is a dotted-decimal IPv4 address of the same format described above. If an :IPv4 netmask is given, only the largest contiguous set of most-significant 1 bits is used (because netmasks are contiguous). If no netmask is given, it will be calculated from standard IPv4 class A/B/C/D/E rules, but will be large enough to include all given bytes of the IP. E.g. 1.2.3.4 is Class A which has a netmask of 8, but the netmask will be extended to 32 to include all 4 given bytes.
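The "largest contiguous set of most-significant 1 bits" rule for a dotted netmask amounts to counting leading 1 bits. A minimal Python sketch follows; it is not Texis code, and `mask_length_from_dotted` is a hypothetical helper that only accepts full 4-byte dotted masks.

```python
import ipaddress


def mask_length_from_dotted(mask_text):
    """Count the leading 1 bits of a dotted-decimal IPv4 netmask."""
    mask = int(ipaddress.IPv4Address(mask_text))
    length = 0
    for bit in range(31, -1, -1):
        if mask & (1 << bit):
            length += 1
        else:
            break  # stop at the first 0 bit; later 1 bits are ignored
    return length


normal = mask_length_from_dotted("255.255.255.0")
broken = mask_length_from_dotted("255.255.0.255")
```

A non-contiguous mask such as 255.255.0.255 is cut off at its first zero bit, so only the leading 16 bits count.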
In version 8 and later, IPv6 addresses are supported as well. These are given in standard IPv6 hex format, i.e. H:H:...:H where H is a 16-bit hexadecimal number, with :: supported for a single span of zero bits, as per canonical IPv6 text representation.
An IPv6 address may optionally be followed by a netmask, of the form /B, where B is a decimal, octal or hexadecimal netmask integer from 0 to 128. If no netmask is given, it defaults to the host-only network (i.e. 128).
In version 7.07.1554395000 20190404 and later, error messages are reported.
The inet actions were added in version 5.01.1112986377 20050408,
and include the following (see also the SQL equivalents):
inetabbrev $inet
Returns a possibly shorter-than-canonical representation of
$inet, where trailing zero byte(s) of an IPv4 address may
be omitted. All bytes of the network, and leading non-zero bytes
of the host, will be included. E.g. <urlutil inetabbrev
"192.100.0.0/24"> returns 192.100.0/24. The /B
netmask is included, except if (in version 7.07.1554840000
20190409 and later) the network is host-only (i.e. netmask is the
full size of the IP address). Empty string is returned on error.
Returns the canonical representation of
$inet. For IPv4, this
is dotted-decimal with all 4 bytes.
For IPv6, this is 8 16-bit hexadecimal integers (no leading
zeroes), colon-separated, possibly with a :: for zero bits.
The /B netmask is included, except if (in version
7.07.1554840000 20190409 and later) the network is host-only
(i.e. netmask is the full size of the IP address). Empty string
is returned on error.
inetnetwork $inet
Returns string IP address with the network bits of
$inet, and the host bits set to 0. Empty string is returned on error.
inethost $inet
Returns string IP address with the host bits of
$inet, and the network bits set to 0. Empty string is returned on error.
inetbroadcast $inet
Returns string IP broadcast address for
$inet, i.e. with the network bits, and host bits set to 1. Empty string is returned on error.
inetnetmask $inet
Returns string IP netmask for
$inet, i.e. with the network bits set to 1, and host bits set to 0. Empty string is returned on error.
inetnetmasklen $inet
Returns integer netmask length of
$inet. -1 is returned on error.
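The bit manipulation behind these five actions can be seen with Python's standard `ipaddress` module. This is a sketch of the arithmetic only, not Texis code: `inet_summary` is a hypothetical helper, and `ipaddress` requires an explicit /B netmask (it applies neither the class rules nor the short address forms described above).

```python
import ipaddress


def inet_summary(text):
    """Return network, host, broadcast, netmask and netmask length."""
    iface = ipaddress.ip_interface(text)
    net = iface.network
    # Host part: keep only the bits not covered by the netmask.
    host_bits = int(iface.ip) & int(net.hostmask)
    return {
        "network": str(net.network_address),
        "host": str(type(iface.ip)(host_bits)),
        "broadcast": str(net.broadcast_address),
        "netmask": str(net.netmask),
        "netmasklen": net.prefixlen,
    }


summary = inet_summary("192.168.4.77/24")
```

For 192.168.4.77/24 the network part is 192.168.4.0, the host part is 0.0.0.77, and the broadcast address sets all 8 host bits to 1.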
inetcontains $inetA $inetB
Returns 1 if $inetA contains
$inetB, i.e. every address in
$inetB occurs within the
$inetA network. 0 is returned if not, or -1 on error. Note that an IPv4 address is not considered to be contained within the equivalent IPv4-mapped IPv6 address, nor vice-versa (e.g. ::ffff:1.2.3.4 is considered different from 1.2.3.4). To treat IPv4 addresses the same as their IPv4-mapped IPv6 equivalents, promote both arguments to IPv6 first (see the IPv4-to-IPv6 conversion action below).
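The containment test, including the rule that IPv4 and IPv6 never contain each other, can be sketched with Python's `ipaddress` module. This is an analogue rather than Texis code; `inet_contains` here is a hypothetical Python function that mirrors the 1 / 0 / -1 return convention described above.

```python
import ipaddress


def inet_contains(inet_a, inet_b):
    """Return 1 if every address in inet_b lies within inet_a, else 0; -1 on error."""
    try:
        a = ipaddress.ip_network(inet_a, strict=False)
        b = ipaddress.ip_network(inet_b, strict=False)
    except ValueError:
        return -1
    if a.version != b.version:
        # IPv4 and IPv6 (even IPv4-mapped) are treated as different.
        return 0
    return 1 if b.subnet_of(a) else 0


inside = inet_contains("10.0.0.0/8", "10.1.2.0/24")
mixed = inet_contains("::ffff:1.2.3.4", "1.2.3.4")
```

A /24 inside a /8 is contained, while an IPv4-mapped IPv6 address and its plain IPv4 form are not considered to contain each other.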
inetclass $inet
Returns the class of $inet (per the standard IPv4 class rules), or
classless if a different netmask is used (or the address is IPv6). Empty string is returned on error.
inet2int $inet
Returns integer representation of the IP network/host bits of
$inet (i.e. without netmask); useful for compact storage of an
address as integer(s) instead of a string.
Returns a varint with 1 value for IPv4 addresses, 4 for IPv6
addresses, or 0 values on error (i.e. the return compares equal to
empty string on error). Note that in version 7 and earlier, a
single int was always returned, with -1 for error (or 255.255.255.255).
int2inet $i
Returns inet string for integer
$i taken as an IP address. Since no netmask can be stored in the integer form of an IP address, the returned IP string will not have a netmask. Empty string is returned on error.
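The round trip between address strings and integers can be sketched in Python as below. This is not Texis code: `inet_to_ints` and `ints_to_inet` are hypothetical names, and the use of unsigned 32-bit values in big-endian order for IPv6 is an assumption of this sketch.

```python
import ipaddress
import struct


def inet_to_ints(text):
    """Return 1 integer for IPv4, 4 for IPv6, or an empty list on error."""
    try:
        addr = ipaddress.ip_interface(text).ip  # any netmask is dropped
    except ValueError:
        return []
    if addr.version == 4:
        return [int(addr)]
    # Assumption: four unsigned 32-bit words, most significant first.
    return list(struct.unpack(">4I", addr.packed))


def ints_to_inet(values):
    """Rebuild the address string; no netmask can be recovered."""
    try:
        if len(values) == 1:
            return str(ipaddress.IPv4Address(values[0]))
        if len(values) == 4:
            return str(ipaddress.IPv6Address(struct.pack(">4I", *values)))
    except (ValueError, struct.error):
        pass
    return ""


stored = inet_to_ints("1.2.3.4/24")
restored = ints_to_inet(stored)
```

The netmask given on input is gone after the round trip, which matches the note that the integer form cannot store one.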
Converts $inet to IPv4 (including netmask), iff IPv4-mapped
IPv6. Returns the equivalent IPv4 address for
$inet iff it
is an IPv4-mapped IPv6 address; e.g. ::ffff:1.2.3.4 would
return 1.2.3.4. Otherwise, returns canonical version of
$inet iff it is some other IPv6 address; e.g. 2000::a:000b:c:d would return 2000::a:b:c:d. Otherwise
returns empty string (i.e. on error). May be useful when storing
both IPv4 and IPv6 addresses in a common compact int(4) field from
inet2int, in order to recover original IP family
format on display (after
int2inet reconversion).
Converts $inet to IPv4-mapped IPv6 (including netmask), iff
IPv4. Returns the equivalent IPv4-mapped IPv6 address for
$inet iff it is IPv4; e.g. 1.2.3.4 would return ::ffff:1.2.3.4. Otherwise, returns canonical version of
$inet iff it is IPv6; e.g. 2000::a:000b:c:d would
return 2000::a:b:c:d. Otherwise returns empty string
(i.e. on error). May be useful when storing both IPv4 and IPv6
addresses in a common compact int(4) field from
inet2int, in order to convert potential IPv4 addresses to
IPv6 before inet2int conversion. Added in version 8.
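Both conversion directions can be sketched with Python's `ipaddress` module. This is an analogue, not Texis code: `to_ipv4_if_mapped` and `to_ipv6` are hypothetical names, netmasks are not handled, and passing a plain IPv4 input through unchanged in `to_ipv4_if_mapped` is an assumption, since the text above does not spell that case out.

```python
import ipaddress


def to_ipv4_if_mapped(text):
    """Return the IPv4 form of an IPv4-mapped IPv6 address, else the canonical form."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return ""
    if addr.version == 6 and addr.ipv4_mapped is not None:
        return str(addr.ipv4_mapped)
    return str(addr)  # assumption: other inputs pass through canonicalized


def to_ipv6(text):
    """Return the IPv4-mapped IPv6 form of an IPv4 address, else the canonical form."""
    try:
        addr = ipaddress.ip_address(text)
    except ValueError:
        return ""
    if addr.version == 4:
        # ::ffff:0:0/96 prefix followed by the 4 IPv4 bytes.
        mapped = ipaddress.IPv6Address(b"\x00" * 10 + b"\xff\xff" + addr.packed)
        return str(mapped)
    return str(addr)


unmapped = to_ipv4_if_mapped("::ffff:1.2.3.4")
canonical = to_ipv4_if_mapped("2000::a:000b:c:d")
promoted = to_ipv6("1.2.3.4")
```

The printed form of a mapped address differs between Python versions (hexadecimal groups versus an embedded dotted quad), so compare parsed addresses rather than strings when checking `promoted`.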
Returns IP address family for
$inet: IPv4 iff IPv4
address, IPv6 iff IPv6 address, otherwise empty string.
Added in version 8.
<urlutil abs "http://example.com/dir/page.html" "other.html">
The return value in
$ret would be http://example.com/dir/other.html.
The urlutil function was added in version 3.0.957600000 20000505.