urlcanonicalize

Canonicalize a URL. Usage:

urlcanonicalize(url[, flags])

Returns a copy of url, canonicalized according to flags, a comma-separated, case-insensitive list of zero or more of:

  • lowerProtocol

    Lower-cases the protocol.

  • lowerHost

    Lower-cases the hostname.

  • removeTrailingDot

    Removes trailing dot(s) in hostname.

  • reverseHost

    Reverses the order of the dot-separated domain labels in the hostname. E.g. http://host.example.com/ becomes http://com.example.host/. This can be used to put the most-significant part of the hostname leftmost.

  • removeStandardPort

    Removes the port number if it is the standard port for the protocol, e.g. 80 for http.

  • decodeSafeBytes

    URL-decodes safe bytes, i.e. those whose decoding is unlikely to change the semantics of the URL. E.g. "%41" becomes "A", but "%2F" remains encoded, because it would decode to "/".

  • upperEncoded

    Upper-cases the hex characters of encoded bytes, e.g. "%2f" becomes "%2F".

  • lowerPath

    Lower-cases the (non-encoded) characters in the path. May be used for URLs known to point to case-insensitive filesystems, e.g. on Windows.

  • addTrailingSlash

    Adds a trailing slash to the path, if no path is present.
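
To make a few of the flag descriptions above concrete, here is a minimal Python sketch of what reverseHost, removeTrailingDot, decodeSafeBytes and upperEncoded might do. This is an illustration only, not the Texis implementation: the function names are hypothetical, and the set of "safe" bytes is assumed to be the RFC 3986 unreserved characters, which may differ from what Texis actually uses.

```python
import re

# Assumption: "safe" bytes are the RFC 3986 unreserved characters.
# The set Texis really uses is not stated in this page and may differ.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_ENCODED = re.compile(r"%([0-9A-Fa-f]{2})")


def reverse_host(host):
    """reverseHost: reverse the order of the dot-separated labels."""
    return ".".join(reversed(host.split(".")))


def remove_trailing_dot(host):
    """removeTrailingDot: strip one or more trailing dots."""
    return host.rstrip(".")


def decode_safe_bytes(text):
    """decodeSafeBytes: decode %XX only when it yields an assumed-safe char."""
    def repl(match):
        ch = chr(int(match.group(1), 16))
        # Leave the escape untouched if decoding could change semantics.
        return ch if ch in UNRESERVED else match.group(0)
    return _ENCODED.sub(repl, text)


def upper_encoded(text):
    """upperEncoded: upper-case the hex digits of each %XX escape."""
    return _ENCODED.sub(lambda m: m.group(0).upper(), text)
```

For example, under these assumptions reverse_host("host.example.com") gives "com.example.host", and decode_safe_bytes("%41%2F") gives "A%2F", matching the examples in the flag descriptions.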

The default flags are all of the above except reverseHost and lowerPath.

A flag may be prefixed with an operator: + to append the flag to the existing flags; - to remove the flag from the existing flags; or = (the default) to clear the existing flags first and then set the flag. An operator remains in effect for subsequent flags until the next operator (if any) is used.

Function added in Texis version 7.05.
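
The operator rules can be sketched in Python as follows. This is a hedged illustration, not Texis code: resolve_flags is a hypothetical helper, and it assumes that a run of flags under = clears the existing flags once (at the first flag of the run) and then sets each flag in the run, and that an empty flags string leaves the defaults in place. The text above does not spell out either point.

```python
ALL_FLAGS = [
    "lowerProtocol", "lowerHost", "removeTrailingDot", "reverseHost",
    "removeStandardPort", "decodeSafeBytes", "upperEncoded", "lowerPath",
    "addTrailingSlash",
]
# Defaults per the documentation: everything except reverseHost, lowerPath.
DEFAULT_FLAGS = {f for f in ALL_FLAGS if f not in ("reverseHost", "lowerPath")}
# Flags are case-insensitive, so look them up by lower-cased name.
_BY_LOWER = {f.lower(): f for f in ALL_FLAGS}


def resolve_flags(spec=""):
    """Return the set of active flags for a comma-separated flag string."""
    active = set(DEFAULT_FLAGS)
    op = "="          # "=" is the default operator
    cleared = False   # assumption: each "=" run clears only once
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        if token[0] in "+-=":
            # A new operator stays in effect until the next one appears.
            op = token[0]
            cleared = False
            token = token[1:].strip()
        name = _BY_LOWER.get(token.lower())
        if name is None:
            raise ValueError("unknown flag: %r" % token)
        if op == "+":
            active.add(name)
        elif op == "-":
            active.discard(name)
        else:
            if not cleared:
                active.clear()
                cleared = True
            active.add(name)
    return active
```

Under these assumptions, resolve_flags("+reverseHost") yields the defaults plus reverseHost, while resolve_flags("lowerHost,lowerProtocol") yields only those two flags, because the implicit = clears the defaults first.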


Copyright © Thunderstone Software     Last updated: Apr 15 2024