SYNOPSIS
<urlutil $action [$arg ...]>
DESCRIPTION
The urlutil function provides URL and other network-related utility
functions. The $action argument determines what it does:
abs $absurl $relurl
or absurl $absurl $relurl
Makes URLs absolute (fully specified). The $absurl values are one
or more absolute page URLs. The $relurl values are the
corresponding links, relative or not, from those pages. For each
$relurl value, its absolute form is returned, as if it were a link
on that page. If there are fewer $absurl values than $relurl
values, the last $absurl value is re-used.
The protocol and hostname (if any) in each returned value will be
lowercase.
charsetcanon $charset
Returns the canonical name for charset name $charset, according to
the current Charset Config file. Can be used to map charset aliases
to canonical names.
charsetconv $buf $from [$to]
Converts text buffer $buf from charset $from to charset $to. The
default for $to, if unspecified or empty, is the current
<urlcp charsettxt> setting. Some character sets may require the use
of an external charset converter (the default is iconv; see
<urlcp charsetconverter> to change it), which is automatically
executed when needed. Added in version 5.00.1090598954 20040723.
charsetdetect $buf
Returns a guess at the charset of text buffer $buf, or "Unknown" if
the charset is unknown. Only limited charset detection is
supported, primarily UTF-8, UTF-16BE/UTF-16LE, and all-7-bit
ISO-8859-1. Added in version 7.02.1398457000 20140425.
filepath $u
Takes $u, which must be a file:// URL, and returns the local file
path that would be used to read the file, as determined by the
current <urlcp fileroot> etc. settings.
pacinit
Initializes proxy auto-config by fetching the PAC script (if
configured) and running it. Returns 1 if successful, 0 if not. The
error from the fetch, messages from the fetch and script execution,
and the body of the script (if fetched) are available afterwards
via <urlinfo>. If neither a PAC script nor a PAC URL is configured,
or the script was already initialized, no action is taken, and 1
(success) is returned.
Calling <urlutil pacinit> when using a PAC script is not necessary:
the PAC script is automatically fetched and run when needed, i.e.
at the first <fetch> or <submit>, and any messages at PAC
initialization are reported. However, any PAC failure during such
automatic initialization merely translates into a Proxy auto-config
error for the <fetch>. The <urlutil pacinit> action provides a way
to get more detailed information about the PAC script, if desired
for diagnostic purposes.
split $u $parts
Splits a URL into one or more parts. The $u value is the URL to
split. The $parts values are a list of the parts to return; they
are returned in the same order, as $ret values. The parts can be
any of protocol, user, pass, host, port, path, type, query or
anchor. Note that user and pass are not yet supported.
sslcertificate $pem tostring
Parses an SSL certificate string buffer $pem (in PEM format). The
tostring sub-action returns a human-readable string version of the
certificate, with subject, issuer, expiration etc. printed. This
can be used to view a server certificate returned from
<urlinfo sslservercertificate>.
Several actions take inet style argument(s). This is an IPv4
address string, optionally followed by a netmask.
For IPv4, the format is dotted-decimal, i.e. N[.N[.N[.N]]] where N is a decimal, octal or hexadecimal integer from 0 to 255. If x < 4 values of N are given, the last N is taken as the last 5-x bytes instead of 1 byte, with missing bytes padded to the right. E.g. 192.258 is valid and equivalent to 192.1.2.0: the last N is 2 bytes in size, and covers 5 - 2 = 3 needed bytes, including 1 zero pad to the right. Conversely, 192.168.4.1027 is not valid: the last N is too large.
An IPv4 address may optionally be followed by a netmask, either of the form /B or :IPv4, where B is a decimal, octal or hexadecimal netmask integer from 0 to 32, and IPv4 is a dotted-decimal IPv4 address of the same format described above. If an :IPv4 netmask is given, only the largest contiguous set of most-significant 1 bits are used (because netmasks are contiguous). If no netmask is given, it will be calculated from standard IPv4 class A/B/C/D/E rules, but will be large enough to include all given bytes of the IP. E.g. 1.2.3.4 is Class A which has a netmask of 8, but the netmask will be extended to 32 to include all 4 given bytes.
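The address rules above can be illustrated with a short Python sketch. This is not the urlutil implementation, only a reading of the description: the helper names (parse_number, parse_ipv4, netmask_length) are hypothetical, octal is assumed to be written with a leading 0 and hexadecimal with a leading 0x, and the class-based default netmask is deliberately left out.

```python
import re

# Assumed field syntax: 0x... is hexadecimal, a leading 0 is octal, else decimal.
_NUMBER = re.compile(r"0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*")


def parse_number(text):
    """Parse one N field; return None if it is not a valid integer."""
    if not _NUMBER.fullmatch(text):
        return None
    if text[:2].lower() == "0x":
        return int(text, 16)
    if text.startswith("0"):
        return int(text, 8)
    return int(text, 10)


def parse_ipv4(text):
    """Return the 4 address bytes as a tuple, or None if text is invalid."""
    fields = text.split(".")
    if not 1 <= len(fields) <= 4:
        return None
    values = [parse_number(f) for f in fields]
    if any(v is None for v in values):
        return None
    *leading, last = values
    if any(v > 255 for v in leading):
        return None
    available = 5 - len(fields)                   # bytes the last N must cover
    size = max(1, (last.bit_length() + 7) // 8)   # bytes the last N occupies
    if size > available:
        return None                               # last N is too large
    # The last N is left-aligned; missing bytes are zero-padded to the right.
    tail = last.to_bytes(size, "big") + bytes(available - size)
    return tuple(leading) + tuple(tail)


def netmask_length(mask):
    """Count the contiguous most-significant 1 bits of a 32-bit netmask."""
    length = 0
    for bit in range(31, -1, -1):
        if not mask & (1 << bit):
            break
        length += 1
    return length
```

With these definitions, parse_ipv4("192.258") yields the bytes of 192.1.2.0 and parse_ipv4("192.168.4.1027") is rejected, matching the two examples in the text. A non-contiguous :IPv4 netmask such as 255.0.255.0 is reduced to its leading run of 1 bits by netmask_length.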
In version 7.07.1554395000 20190404 and later, error messages are reported.
The inet actions were added in version 5.01.1112986377 20050408,
and include the following (see also the SQL equivalents):
inetabbrev $inet
Returns a possibly shorter-than-canonical representation of $inet,
where trailing zero byte(s) of an IPv4 address may be omitted. All
bytes of the network, and leading non-zero bytes of the host, will
be included. E.g. <urlutil inetabbrev "192.100.0.0/24"> returns
192.100.0/24. The /B netmask is included, except if (in version
7.07.1554840000 20190409 and later) the network is host-only (i.e.
netmask is the full size of the IP address). Empty string is
returned on error.
inetcanon $inet
Returns the canonical representation of $inet. For IPv4, this is
dotted-decimal with all 4 bytes. The /B netmask is included, except
if (in version 7.07.1554840000 20190409 and later) the network is
host-only (i.e. netmask is the full size of the IP address). Empty
string is returned on error.
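The difference between the abbreviated and canonical forms can be sketched in Python as follows. The function names and the (bytes, netmask length) argument shape are hypothetical, and the rule for how many bytes inetabbrev keeps is an interpretation of the description above (all network bytes, plus host bytes up to the last non-zero one, with at least one byte kept); the real action may differ in edge cases.

```python
def inet_canon(octets, masklen):
    """Dotted-decimal with all 4 bytes; /B is omitted for host-only (/32)."""
    text = ".".join(str(b) for b in octets)
    return text if masklen == 32 else f"{text}/{masklen}"


def inet_abbrev(octets, masklen):
    """Like inet_canon, but trailing zero bytes may be dropped."""
    network_bytes = (masklen + 7) // 8      # bytes touched by the netmask
    # 1-based position of the last non-zero byte (0 if all bytes are zero).
    last_nonzero = max((i + 1 for i, b in enumerate(octets) if b), default=0)
    keep = max(1, network_bytes, last_nonzero)  # assumption: keep >= 1 byte
    text = ".".join(str(b) for b in octets[:keep])
    return text if masklen == 32 else f"{text}/{masklen}"
```

For the example in the text, inet_abbrev((192, 100, 0, 0), 24) gives 192.100.0/24, while inet_canon on the same input gives 192.100.0.0/24.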
inetnetwork $inet
Returns string IP address with the network bits of $inet, and the
host bits set to 0. Empty string is returned on error.
inethost $inet
Returns string IP address with the host bits of $inet, and the
network bits set to 0. Empty string is returned on error.
inetbroadcast $inet
Returns string IP broadcast address for $inet, i.e. with the
network bits preserved and the host bits set to 1. Empty string is
returned on error.
inetnetmask $inet
Returns string IP netmask for $inet, i.e. with the network bits set
to 1, and host bits set to 0. Empty string is returned on error.
inetnetmasklen $inet
Returns integer netmask length of $inet. -1 is returned on error.
inetcontains $inetA $inetB
Returns 1 if $inetA contains $inetB, i.e. every address in $inetB
occurs within the $inetA network. 0 is returned if not, or -1 on
error.
inetclass $inet
Returns class of $inet, e.g. A, B, C, D, E or classless if a
different netmask is used (or the address is IPv6). Empty string is
returned on error.
inet2int $inet
Returns integer representation of IP network/host bits of $inet
(i.e. without netmask); useful for compact storage of address as
integer(s) instead of string. -1 is returned on error (note that -1
may also be returned for an all-ones IP address, e.g.
255.255.255.255).
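The ambiguity of -1 follows if the integer is a signed 32-bit value, which the note above suggests but does not state outright. A minimal Python sketch under that assumption (the function names mirror the actions but are only illustrative, and only a single-value IPv4 integer is handled):

```python
import ipaddress
import struct


def inet2int(addr):
    """Reinterpret the 32 address bits as a signed 32-bit integer (assumed)."""
    unsigned = int(ipaddress.IPv4Address(addr))
    return struct.unpack(">i", struct.pack(">I", unsigned))[0]


def int2inet(value):
    """Inverse mapping; the result carries no netmask."""
    return str(ipaddress.IPv4Address(value & 0xFFFFFFFF))
```

Under this reading, inet2int("255.255.255.255") is -1, the same value used to signal an error, so a caller that may see the all-ones address cannot rely on -1 alone to detect failure.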
int2inet $i
Returns inet string for 1- or 4-value varint $i taken as an IP
address. Since no netmask can be stored in the integer form of an
IP address, the returned IP string will not have a netmask. Empty
string is returned on error.
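For readers more familiar with other tooling, the quantities computed by inetnetwork, inethost, inetbroadcast, inetnetmask, inetnetmasklen and inetcontains can be reproduced with Python's standard ipaddress module. This is only an analogy for the definitions above, not the urlutil implementation: the describe and contains helpers are hypothetical, and unlike urlutil, Python assumes /32 rather than a class-based netmask when none is given.

```python
import ipaddress


def describe(inet):
    """Return the network-related parts of an 'address/len' string."""
    iface = ipaddress.ip_interface(inet)
    net = iface.network
    host_bits = int(iface.ip) & int(net.hostmask)  # clear the network bits
    return {
        "network": str(net.network_address),       # host bits set to 0
        "host": str(ipaddress.IPv4Address(host_bits)),
        "broadcast": str(net.broadcast_address),   # host bits set to 1
        "netmask": str(net.netmask),
        "netmasklen": net.prefixlen,
    }


def contains(inet_a, inet_b):
    """True if every address of inet_b lies within the inet_a network."""
    a = ipaddress.ip_network(inet_a, strict=False)
    b = ipaddress.ip_network(inet_b, strict=False)
    return b.subnet_of(a)


info = describe("192.168.4.77/24")
```

Here info holds network 192.168.4.0, host 0.0.0.77, broadcast 192.168.4.255, netmask 255.255.255.0 and netmask length 24.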
EXAMPLE
<urlutil abs "http://example.com/dir/page.html" "other.html">
The return value in $ret would be
http://example.com/dir/other.html.
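The behavior of the abs action, including re-use of the last $absurl value and lowercasing of the protocol and hostname, can be approximated in Python with urllib.parse. This is a sketch of the documented semantics, not the Vortex implementation; the function names are hypothetical and at least one base URL is assumed.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def lowercase_scheme_and_host(url):
    """Lowercase the protocol and host[:port], leaving the rest untouched."""
    parts = urlsplit(url)
    # Keep any user:password@ prefix as-is; only the host part is lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


def abs_urls(abs_list, rel_list):
    """Resolve each link against its page URL, re-using the last page URL."""
    results = []
    for i, rel in enumerate(rel_list):
        base = abs_list[min(i, len(abs_list) - 1)]
        results.append(lowercase_scheme_and_host(urljoin(base, rel)))
    return results


single = abs_urls(["http://example.com/dir/page.html"], ["other.html"])
```

The single list contains http://example.com/dir/other.html, the same value as in the example above. Passing one page URL with several links resolves every link against that one page.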
CAVEATS
The urlutil function was added in version 3.0.957600000 20000505.