22 General utilities library [utilities]

22.14 Formatting [format]

22.14.6 Formatter [format.formatter]

22.14.6.4 Formatting escaped characters and strings [format.string.escaped]

A character or string can be formatted as escaped to make it more suitable for debugging or for logging.
The escaped string E representation of a string S is constructed by encoding a sequence of characters as follows.
The associated character encoding CE for charT (Table 13) is used to both interpret S and construct E.
  • U+0022 quotation mark (") is appended to E.
  • For each code unit sequence X in S that either encodes a single character, is a shift sequence, or is a sequence of ill-formed code units, processing is in order as follows:
    • If X encodes a single character C, then:
      • If C is one of the characters in Table 76, then the two characters shown as the corresponding escape sequence are appended to E.
      • Otherwise, if C is not U+0020 space and
        • CE is a Unicode encoding and C corresponds to either a UCS scalar value whose Unicode property General_­Category has a value in the groups Separator (Z) or Other (C) or to a UCS scalar value which has the Unicode property Grapheme_­Extend=Yes, as described by table 12 of UAX #44, or
        • CE is not a Unicode encoding and C is one of an implementation-defined set of separator or non-printable characters
        then the sequence \u{hex-digit-sequence} is appended to E, where hex-digit-sequence is the shortest hexadecimal representation of C using lower-case hexadecimal digits.
      • Otherwise, C is appended to E.
    • Otherwise, if X is a shift sequence, the effect on E and further decoding of S is unspecified.
      Recommended practice: A shift sequence should be represented in E such that the original code unit sequence of S can be reconstructed.
    • Otherwise (X is a sequence of ill-formed code units), each code unit U is appended to E in order as the sequence \x{hex-digit-sequence}, where hex-digit-sequence is the shortest hexadecimal representation of U using lower-case hexadecimal digits.
  • Finally, U+0022 quotation mark (") is appended to E.
Table 76: Mapping of characters to escape sequences [tab:format.escape.sequences]
Character
Escape sequence
U+0009 character tabulation
\t
U+000a line feed
\n
U+000d carriage return
\r
U+0022 quotation mark
\"
U+005c reverse solidus
\\
The escaped string representation of a character C is equivalent to the escaped string representation of a string of C, except that:
  • the result starts and ends with U+0027 apostrophe (') instead of U+0022 quotation mark ("), and
  • if C is U+0027 apostrophe, the two characters \' are appended to E, and
  • if C is U+0022 quotation mark, then C is appended unchanged.
[Example 1: string s0 = format("[{}]", "h\tllo"); // s0 has value: [h llo] string s1 = format("[{:?}]", "h\tllo"); // s1 has value: ["h\tllo"] string s3 = format("[{:?}, {:?}]", '\'', '"'); // s3 has value: ['\'', '"'] // The following examples assume use of the UTF-8 encoding string s4 = format("[{:?}]", string("\0 \n \t \x02 \x1b", 9)); // s4 has value: ["\u{0} \n \t \u{2} \u{1b}"] string s5 = format("[{:?}]", "\xc3\x28"); // invalid UTF-8, s5 has value: ["\x{c3}("] — end example]