The following flags are available for fmt codes, in addition to the standard printf() flags described above:
a  Next argument is a strftime() format string; used for the %T time code.
k  For numeric formats, print a comma (,) every 3 places to the left of the decimal point (i.e. at every multiple of a thousand).
K  (upper case `K') Same as k, but print the next argument instead of a comma.
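
The grouping performed by k and K can be imitated outside Vortex. The Python sketch below is only an illustration of the described behavior, not how fmt is implemented; the function name group_thousands and its parameters are hypothetical.

```python
def group_thousands(value, decimals=2, sep=",", width=0):
    """Format a number with a separator every 3 digits left of the decimal point.

    sep="," mimics the k flag; any other string mimics K with that string
    passed as the extra argument. All names here are hypothetical.
    """
    text = f"{value:,.{decimals}f}"   # Python groups with "," by default
    text = text.replace(",", sep)     # swap in the caller's separator
    return text.rjust(width)          # right-justify, like a printf field width


print(group_thousands(56387.34, width=10))    # k-style grouping
print(group_thousands(1234567, 0, sep="'"))   # K-style, custom separator
```

Doing the separator swap before the width is applied keeps the field right-justified even when the replacement separator is longer than one character.
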
&  (ampersand) Use the HTML entity &nbsp; instead of a space when padding fields. This is useful when printing in an HTML environment, where runs of spaces are normally collapsed when displayed and space padding would therefore be lost. Added in version 2.6.931500000 19990709.
!  (exclamation point) When used with %z, decode appropriately instead of encoding. Added in version 3.01.969000000 20000914. (Note that for %H, only ampersand-escaped entities are decoded; parsing and removal of tags is handled separately.)
_  (underscore) Use decimal ASCII value 160 instead of 32 (space) when padding fields. This is the ISO Latin-1 character for the HTML entity &nbsp;. Added in version 2.6.931500000 19990709. For the "%v" (UTF-16 encode) format code, a leading BOM (byte-order-mark) will not be output. For the "%!v" (UTF-16 decode) format code, a leading BOM in the input will be preserved instead of stripped in the output. For the "%Q" and "%!Q" (quoted-printable encode/decode) format codes, the "Q" encoding will be used instead of quoted-printable.
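
The difference between the two padding flags is easiest to see side by side. The following Python sketch only mimics the padding described for & and _; pad_left and the two constants are hypothetical names, and this is not Vortex code.

```python
NBSP_ENTITY = "&nbsp;"   # what the & flag is described as padding with
NBSP_CHAR = "\xa0"       # decimal 160, what the _ flag is described as padding with


def pad_left(text, width, pad=" "):
    """Right-justify text in a field of `width` units, repeating `pad`."""
    missing = max(0, width - len(text))
    return pad * missing + text


print(pad_left("42", 5))                    # plain spaces; a browser collapses them
print(pad_left("42", 5, NBSP_ENTITY))       # &-style padding, one entity per column
print(repr(pad_left("42", 5, NBSP_CHAR)))   # _-style padding, one character per column
```
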
Output only XML-safe characters; unsafe characters are replaced with a question mark. Valid for %s format codes (text is assumed to be ISO-8859-1 for %s). XML-safe characters are all characters except: U+0000 through U+0008 inclusive, U+000B, U+000C, U+000E through U+001F inclusive, U+FFFE and U+FFFF. Added in version 5.01.1220392000.
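
The XML-safe rule above amounts to a single character-class substitution. This Python sketch applies exactly the ranges listed in the text; the function name xml_safe is hypothetical, and it operates on already-decoded text rather than on the byte stream that fmt sees.

```python
import re

# The characters the text lists as NOT XML-safe.
_UNSAFE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def xml_safe(text):
    """Replace every XML-unsafe character with a question mark."""
    return _UNSAFE.sub("?", text)


print(repr(xml_safe("a\x00b\tc\x0bd")))   # tab is kept, NUL and VT are replaced
```

Note that tab (U+0009), line feed (U+000A) and carriage return (U+000D) fall outside the listed ranges and so pass through unchanged.
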
=  (equal sign) Input encoding is "equal to" (the same) as output encoding, i.e. just validate it and replace illegal encoding sequences with "?". Unescaping of HTML sequences in the source (h flag) is disabled. Valid for the %V format code. Added in version 5.01.1220402000 20080902.
|  (pipe) Interpret illegal encoding sequences in the source as individual ISO-8859-1 bytes, instead of replacing them with the "?" character. When used with %=V, for example, this allows UTF-8 to be validated and passed through as-is, yet isolated ISO-8859-1 characters (if any) will still be converted to UTF-8. Valid for %!W format codes. Added in version 5.01.1220406000 20080902.
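
To make the interplay of = and | concrete, the following Python sketch validates UTF-8 and either replaces illegal sequences with "?" or reinterprets them as ISO-8859-1. It is an emulation under assumptions: the handler names and validate_utf8 are hypothetical, and what counts as one "illegal sequence" follows Python's decoder, which may not match Vortex byte for byte.

```python
import codecs


def _question_mark(exc):
    # Replace the offending bytes with a single "?" and resume after them.
    return ("?", exc.end)


def _latin1_fallback(exc):
    # Reinterpret the offending bytes as ISO-8859-1 characters instead.
    bad = exc.object[exc.start:exc.end]
    return (bad.decode("latin-1"), exc.end)


codecs.register_error("question_mark", _question_mark)
codecs.register_error("latin1_fallback", _latin1_fallback)


def validate_utf8(data, latin1_fallback=False):
    """Pass valid UTF-8 through; handle illegal bytes per the chosen policy."""
    handler = "latin1_fallback" if latin1_fallback else "question_mark"
    return data.decode("utf-8", errors=handler).encode("utf-8")


mixed = b"caf\xe9 and caf\xc3\xa9"   # one stray ISO-8859-1 byte, one valid UTF-8 pair
print(validate_utf8(mixed))                         # like = alone
print(validate_utf8(mixed, latin1_fallback=True))   # like = combined with |
```
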
h  For %!V (UTF-8 decode) and %v (UTF-16 encode): if given once, HTML-escapes out-of-range characters (over 255 for %!V, over 0x10FFFF for %v) instead of replacing them with "?". For %V (UTF-8 encode) and %!v (UTF-16 decode): if given once, unescapes HTML sequences first; this allows characters that are out-of-range in the input encoding to be represented natively in the output encoding. For %!W, if given twice (e.g. hh), also HTML-escapes low (7-bit) values (e.g. control chars, >) in the output. Added in version 3.01.969000000 20000914. (The h flag is also used in another context as a sub-flag for Metamorph mark-up.) In version 6.00.1335996839 20120502 and later, if given three times (e.g. hhh), just HTML-escapes 7-bit values; does not also decode HTML entities in the input.
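
The single-h behavior for UTF-8 decoding (out-of-range characters become HTML escapes rather than "?") has a close analogue in Python's standard encoders. The sketch below uses that analogue; utf8_to_latin1 is a hypothetical name, and the exact escape form Vortex emits is not stated in the text, so the numeric references shown are Python's choice.

```python
def utf8_to_latin1(data, html_escape=False):
    """Decode UTF-8 bytes and re-encode them as ISO-8859-1.

    Characters above 255 cannot be represented; they become "?" by default,
    or a numeric HTML reference when html_escape is true.
    """
    text = data.decode("utf-8")
    errors = "xmlcharrefreplace" if html_escape else "replace"
    return text.encode("latin-1", errors=errors)


sample = "5 \u20ac for caf\u00e9".encode("utf-8")   # euro sign is above 255
print(utf8_to_latin1(sample))                     # euro sign is lost
print(utf8_to_latin1(sample, html_escape=True))   # euro sign survives as an escape
```
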
j  (jay) For the %Q format codes (and their !-decode variants), also do newline translation. Any of the newline byte sequences CR, LF, or CRLF in the input will be replaced with the machine-native newline sequence in the output, instead of being output as-is. This allows text newlines to be portably "cleaned up" for the current system, without having to detect what the system is. If c is given immediately after the j, CR is used as the output sequence, instead of the machine-native sequence. If l (el) is given immediately after the j, LF is used as the output sequence. If both c and l are given (in either order), CRLF is used. The c and l subflags allow a non-native system's newline convention to be used, e.g. by a web application that is adapting to browsers of varying operating systems. Note that for the %B format code, input CR/LF bytes are never translated (since it is a binary encoding); j and its subflags only affect the output of "soft" line-wrap newlines that do not correspond to any input character. Added in version 4.03.1056420269 20030623.
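
The newline translation rules for j and its c and l subflags can be summarized in a few lines. This Python sketch is an emulation of the rules as described, not Vortex code; translate_newlines and its subflags parameter are hypothetical.

```python
import os
import re

# CRLF must be tried before a lone CR so it is treated as one newline.
_NEWLINE = re.compile(r"\r\n|\r|\n")


def translate_newlines(text, subflags=""):
    """Rewrite CR, LF and CRLF to one convention chosen by `subflags`.

    ""           machine-native sequence (plain j)
    "c"          CR
    "l"          LF
    "cl" / "lc"  CRLF
    """
    if set(subflags) == {"c", "l"}:
        target = "\r\n"
    elif subflags == "c":
        target = "\r"
    elif subflags == "l":
        target = "\n"
    else:
        target = os.linesep
    return _NEWLINE.sub(target, text)


print(repr(translate_newlines("a\r\nb\rc\nd", "l")))
print(repr(translate_newlines("a\r\nb\rc\nd", "cl")))
```
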
l  (el) For %H, only encode low (7-bit) characters; leave characters above 127 as-is. This is useful when HTML-escaping UTF-8 text, to avoid disturbing multi-byte characters. When combined with ! (decode), escape sequences are decoded to low (7-bit) strings, e.g. "&copy;" is replaced with "(c)" instead of ASCII character 169. (The l flag is also used with numeric format codes to indicate a long integer or double, and with the j flag as a subflag.) Added in version 3.01.969000000 20000914. The l flag has yet another meaning when used with the %: format codes; see the discussion of those codes above.
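
The effect of escaping only low (7-bit) characters can be illustrated as follows. This Python sketch contrasts the two policies on decoded text; escape_low_only and escape_all are hypothetical names, and the numeric references produced by escape_all are an assumption, since the text does not say which entity form Vortex emits for high characters.

```python
import html


def escape_low_only(text):
    """Escape only the 7-bit HTML specials; characters above 127 are untouched."""
    return html.escape(text, quote=True)


def escape_all(text):
    """Escape the 7-bit specials, then turn every character above 127 into a reference."""
    low = html.escape(text, quote=True)
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in low)


print(escape_low_only("caf\u00e9 <b> & co"))   # accented letter left alone
print(escape_all("caf\u00e9 <b> & co"))        # accented letter escaped too
```
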
For %v codes, mark up with a Metamorph query. See the next section for a discussion of this flag and its subflags.
p  Perform paragraph markup. Paragraph breaks (text matching the REX expression "$=\space+") are replaced with "<p/>" tags in the output. For the %U code, do path escapement: space is encoded to %20 not +, and (in version 8 and later) &+;= are left as-is and + is not decoded (when also decoding).
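
The distinction between query-style and path-style URL escaping described for %U can be shown with Python's standard library. This is an analogy rather than Vortex's implementation: query_escape and path_escape are hypothetical names, and keeping "/" in the safe set is an assumption carried over from Python's default.

```python
from urllib.parse import quote, quote_plus


def query_escape(text):
    """Query-style escaping: space becomes "+", and "&", "=", ";" are encoded."""
    return quote_plus(text)


def path_escape(text):
    """Path-style escaping: space becomes %20, and "&+;=" are left as-is."""
    return quote(text, safe="/&+;=")   # "/" kept safe by assumption


print(query_escape("a b&c=d"))
print(path_escape("a b&c=d"))
```
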
P  (upper case `P') For %H, same as p, but use the next additional argument as the REX expression to match paragraph breaks. If given twice (PP), use another additional argument after the REX expression as the replacement string, instead of "<p/>". PP was added in version 6.
In version 7 and earlier, do full-encoding: encode "/" (forward slash) and "@" (at-sign) as well (implies the p flag as well). Added Dec. 2 1998. When decoding in version 7 and earlier, only decode unreserved (per RFC 2396 section 2.3) characters: alphanumeric, dash, underscore, period, exclamation point, tilde, asterisk, single-quote, left and right parentheses; this was added in version 7.04.1444076000. For the %W code, only the "Q" encoding will be used.
<fmt "You owe $$%10.2kf to us." 56387.34>
You owe $ 56,387.34 to us.