Extended Flags

 

The following flags are available for fmt codes, in addition to the standard printf() flags described above:

  • a The next argument is a strftime() format string; used for the %t/%T time codes.

  • k For numeric formats, print a comma (,) every 3 places to the left of the decimal point (i.e. at every multiple of a thousand).

  • K (upper case `K') Same as k, but print the next argument instead of a comma.

  • & (ampersand) Use the HTML entity &nbsp; instead of space when padding fields. This is useful when printing in an HTML environment, where runs of spaces are normally compressed when displayed and space padding would therefore be lost. Added in version 2.6.931500000 19990709.

  • ! (exclamation point)

    When used with %H, %U, %V, %B, %c, %W or %z, decode appropriately instead of encoding. Added in version 3.01.969000000 20000914. (Note that for %H, only ampersand-escaped entities are decoded; for parsing and removal of tags see fetch.)

  • _ (underscore) Use the character with decimal value 160 instead of 32 (space) when padding fields. This is the ISO Latin-1 character for the HTML entity &nbsp;. Added in version 2.6.931500000 19990709. The flag has other meanings with some codes: for the "%v" (UTF-16 encode) format code, a leading BOM (byte-order mark) is not output; for the "%!v" (UTF-16 decode) format code, a leading BOM in the input is preserved in the output instead of stripped; for the "%Q"/"%!Q" (quoted-printable encode/decode) format codes, the "Q" encoding is used instead of quoted-printable.

  • ^ (caret)

    Output only XML-safe characters; unsafe characters are replaced with a question mark. Valid for the %V, %=V, %!V, %v, %!v, %W, %!W and %s format codes (text is assumed to be ISO-8859-1 for %s). XML-safe characters are all characters except U+0000 through U+0008 inclusive, U+000B, U+000C, U+000E through U+001F inclusive, U+FFFE and U+FFFF. Added in version 5.01.1220392000 20080902.

  • = (equal sign) Input encoding is "equal to" (the same) as output encoding, i.e. just validate it and replace illegal encoding sequences with "?". Unescaping of HTML sequences in the source (h flag) is disabled. Valid for %V format code. Added in version 5.01.1220402000 20080902.

  • | (pipe) Interpret illegal encoding sequences in the source as individual ISO-8859-1 bytes, instead of replacing with the "?" character. When used with %=V for example, this allows UTF-8 to be validated and passed through as-is, yet isolated ISO-8859-1 characters (if any) will still be converted to UTF-8. Valid for %!V, %=V, %v, %W and %!W format codes. Added in version 5.01.1220406000 20080902.

  • h For %!V (UTF-8 decode) and %v (UTF-16 encode): if given once, HTML-escapes out-of-range characters (over 255 for %!V, over 0x10FFFF for %v) instead of replacing them with "?". For %V (UTF-8 encode) and %!v (UTF-16 decode): if given once, unescapes HTML sequences first; this allows characters that are out-of-range in the input encoding to be represented natively in the output encoding.

    For %V, %!V, %v, %!v, %W and %!W, if given twice (e.g. hh), also HTML-escapes low (7-bit) values (e.g. control chars, <, >) in the output. Added in version 3.01.969000000 20000914. In version 6.00.1335996839 20120502 and later, if given three times (e.g. hhh), just HTML-escapes 7-bit values; does not also decode HTML entities in the input. (The h flag is also used in another context, as a subflag for Metamorph mark-up.)

  • j (jay) For the %s, %H, %v, %V, %B and %Q format codes (and their !-decode variants), also do newline translation. Any of the newline byte sequences CR, LF, or CRLF in the input will be replaced with the machine-native newline sequence in the output, instead of being output as-is. This allows text newlines to be portably "cleaned up" for the current system, without having to detect what the system is. If c is given immediately after the j, CR is used as the output sequence, instead of the machine-native sequence. If l (el) is given immediately after the j, LF is used as the output sequence. If both c and l are given (in either order), CRLF is used. The c and l subflags allow a non-native system's newline convention to be used, e.g. by a web application that is adapting to browsers of varying operating systems. Note that for the %B format code, input CR/LF bytes are never translated (since it is a binary encoding); j and its subflags only affect the output of "soft" line-wrap newlines that do not correspond to any input character. Added in version 4.03.1056420269 20030623.

  • l (el) For %H, only encode low (7-bit) characters; leave characters above 127 as-is. This is useful when HTML-escaping UTF-8 text, to avoid disturbing multi-byte characters. When combined with ! (decode), escape sequences are decoded to low (7-bit) strings, e.g. "&copy;" is replaced with "(c)" instead of character 169. Added in version 3.01.969000000 20000914. (The l flag is also used with numeric format codes to indicate a long integer or double, and with the j flag as a subflag. It has yet another meaning when used with the %/ or %: format codes; see the discussion of those codes above.)

  • m For the %s, %H, %V and %v codes, mark up with a Metamorph query. See the next section for a discussion of this flag and its subflags b, B, U, R, h, n, p, P, c and e.

  • p Perform paragraph markup (for %s and %H codes). Paragraph breaks (text matching the REX expression "$=\space+") are replaced with "<p/>" tags in the output. For the %U code, do path escaping instead of query-string escaping: space is encoded as %20 instead of +.

  • P (upper case `P') Same as p, but use the next additional argument as the REX expression to match paragraph breaks. If given twice (PP), use another additional argument after the REX expression as the replacement string, instead of "<p/>". PP was added in version 6.

  • q

    For the %U code, do full encoding: encode "/" (forward slash) and "@" (at-sign) as well. Implies the p flag. Added Dec. 2 1998. For %!U (URL decode), only decode unreserved characters (per RFC 2396 section 2.3): alphanumerics, dash, underscore, period, exclamation point, tilde, asterisk, single-quote, and left and right parentheses; this was added in version 7.04.1444076000 20151005. For the %W code, only the "Q" encoding will be used (no base64).

Example:

  <fmt "You owe $$%10.2kf to us." 56387.34>
  You owe $ 56,387.34 to us.
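
The grouping performed by the k and K flags can be imitated outside of Vortex. The following is a minimal Python sketch, not the Texis implementation: the function name group_thousands and its parameters are hypothetical, and it assumes that the separator characters count toward the field width, as the example output above suggests.

```python
def group_thousands(value, width=0, precision=2, sep=","):
    """Rough emulation of %<width>.<precision>kf (sep=",") or of the K flag (custom sep)."""
    # Python's "," format option inserts a comma every 3 integer digits.
    text = format(value, f",.{precision}f")
    if sep != ",":
        # K-style: swap the commas in the integer part for the caller's separator.
        integer, dot, frac = text.partition(".")
        text = integer.replace(",", sep) + dot + frac
    # Right-justify in the field; assumes separators count toward the width.
    return text.rjust(width)


print("You owe $" + group_thousands(56387.34, width=10) + " to us.")
print(group_thousands(1234567.891, precision=1, sep=" "))
```

The first call reproduces the line printed by the fmt example; the second shows a space used as the separator, which is how the extra argument to K is assumed to behave here.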

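The newline translation described for the j flag and its c and l subflags can be sketched the same way. This is an illustrative Python emulation under stated assumptions, not Texis code: translate_newlines and its subflags parameter are hypothetical names, and the "soft" line-wrap special case for %B is not modeled.

```python
import os
import re

# CRLF is listed first so that it is treated as one newline, not as CR followed by LF.
_NEWLINE = re.compile(r"\r\n|\r|\n")


def translate_newlines(text, subflags=""):
    """Replace every CR, LF or CRLF in text with a single output sequence."""
    has_c = "c" in subflags
    has_l = "l" in subflags
    if has_c and has_l:
        out = "\r\n"        # both subflags, in either order: CRLF
    elif has_c:
        out = "\r"          # c only: CR
    elif has_l:
        out = "\n"          # l only: LF
    else:
        out = os.linesep    # no subflag: machine-native sequence
    return _NEWLINE.sub(lambda match: out, text)


print(repr(translate_newlines("one\r\ntwo\rthree\nfour", "cl")))
```

With no subflag the result depends on the system the sketch runs on, which mirrors the "machine-native" behavior described for j alone.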

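The set of characters rejected by the ^ (caret) flag can be written as a single character class. The Python sketch below only illustrates that set and the question-mark replacement; xml_safe is a hypothetical name, and the sketch works on already-decoded text rather than on the encodings handled by the %V, %v and %W codes.

```python
import re

# Unsafe per the list above: U+0000-U+0008, U+000B, U+000C,
# U+000E-U+001F, U+FFFE and U+FFFF. Tab, LF and CR remain allowed.
_XML_UNSAFE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def xml_safe(text):
    """Replace each XML-unsafe character in text with a question mark."""
    return _XML_UNSAFE.sub("?", text)


print(repr(xml_safe("ok\x00\tstill ok\x0b\ufffe")))
```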
Copyright © Thunderstone Software     Last updated: Dec 10 2018