5 Lexical conventions [lex]

5.13 Literals [lex.literal]

5.13.3 Character literals [lex.ccon]

encoding-prefix: one of
    u8  u  U  L
basic-c-char:
    any member of the translation character set except the U+0027 apostrophe, U+005c reverse solidus, or new-line character
simple-escape-sequence-char: one of
    ' " ? \ a b f n r t v
conditional-escape-sequence-char:
    any member of the basic character set that is not an octal-digit, a simple-escape-sequence-char, or the characters N, o, u, U, or x
A non-encodable character literal is a character-literal whose c-char-sequence consists of a single c-char that is not a numeric-escape-sequence and that specifies a character that either lacks representation in the literal's associated character encoding or that cannot be encoded as a single code unit.
A multicharacter literal is a character-literal whose c-char-sequence consists of more than one c-char.
The encoding-prefix of a non-encodable character literal or a multicharacter literal shall be absent.
Such character-literals are conditionally-supported.
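The following is a minimal sketch of the distinction drawn above, not normative text. It assumes an implementation that accepts multicharacter literals (they are conditionally-supported, and compilers that accept them commonly issue a warning). The names single_is_char and multi_is_int are hypothetical and exist only for illustration.

```cpp
#include <type_traits>

// Assumption: the implementation accepts multicharacter literals. They are
// conditionally-supported, and compilers that accept them often warn.

// One c-char, no encoding-prefix: an ordinary character literal of type char.
constexpr bool single_is_char = std::is_same<decltype('a'), char>::value;

// Two c-chars, no encoding-prefix: an ordinary multicharacter literal of type int.
constexpr bool multi_is_int = std::is_same<decltype('ab'), int>::value;

static_assert(single_is_char, "'a' is expected to have type char");
static_assert(multi_is_int, "'ab' is expected to have type int");

// The value of 'ab' is implementation-defined, so nothing is asserted about it.
// A prefixed form such as u'ab' would violate the rule that the encoding-prefix
// shall be absent, so it is deliberately not written here.
```

Only the types are checked, because the value of a multicharacter literal is implementation-defined.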
The kind of a character-literal, its type, and its associated character encoding are determined by its encoding-prefix and its c-char-sequence as defined by Table 10.
The special cases for non-encodable character literals and multicharacter literals take precedence over the base kind.
[Note 1:
The associated character encoding for ordinary character literals determines encodability, but does not determine the value of non-encodable ordinary character literals or ordinary multicharacter literals.
The examples in Table 10 for non-encodable ordinary character literals assume that the specified character lacks representation in the ordinary literal encoding or that encoding the character would require more than one code unit.
— end note]
Table 10: Character literals [tab:lex.ccon.literal]
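Encoding prefix  Kind                                      Type      Associated character encoding  Example
---------------  ----------------------------------------  --------  -----------------------------  ------------
none             ordinary character literal                char      ordinary literal encoding      'v'
none             non-encodable ordinary character literal  int       ordinary literal encoding      '\U0001F525'
none             ordinary multicharacter literal           int       ordinary literal encoding      'abcd'
L                wide character literal                    wchar_t   wide literal encoding          L'w'
u8               UTF-8 character literal                   char8_t   UTF-8                          u8'x'
u                UTF-16 character literal                  char16_t  UTF-16                         u'y'
U                UTF-32 character literal                  char32_t  UTF-32                         U'z'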
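The prefix-to-type mapping in Table 10 can be checked at compile time. The sketch below is illustrative only. has_type is a hypothetical helper introduced here, and the u8 case is guarded by the __cpp_char8_t feature-test macro because char8_t is only available where that feature is enabled.

```cpp
#include <type_traits>

// Hypothetical helper: true when the argument expression has exactly type T.
template <class T, class U>
constexpr bool has_type(U) { return std::is_same<T, U>::value; }

static_assert(has_type<char>('v'),      "no prefix: char");
static_assert(has_type<wchar_t>(L'w'),  "L prefix: wchar_t");
static_assert(has_type<char16_t>(u'y'), "u prefix: char16_t");
static_assert(has_type<char32_t>(U'z'), "U prefix: char32_t");

#ifdef __cpp_char8_t
// Only checked where char8_t exists as a distinct type.
static_assert(has_type<char8_t>(u8'x'), "u8 prefix: char8_t");
#endif
```

The non-encodable example from the table, '\U0001F525', is left out on purpose: such literals are conditionally-supported, so an implementation may reject them.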
In translation phase 4, the value of a character-literal is determined using the range of representable values of the character-literal's type in translation phase 7.
A non-encodable character literal or a multicharacter literal has an implementation-defined value.
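One way to observe the phase 4 rule is to evaluate the same literal in a preprocessing directive and in an ordinary expression. The sketch below assumes an implementation that follows the rule as stated. The names negative_in_phase4 and negative_in_phase7 are hypothetical. The literal '\xff' uses a numeric-escape-sequence and is chosen because whether it is negative depends on whether plain char is signed.

```cpp
// Evaluated by the preprocessor (translation phase 4).
#if '\xff' < 0
constexpr bool negative_in_phase4 = true;
#else
constexpr bool negative_in_phase4 = false;
#endif

// Evaluated as an ordinary constant expression (translation phase 7).
constexpr bool negative_in_phase7 = ('\xff' < 0);

// Under the rule above both evaluations see the same value, whichever
// signedness plain char has on the target.
static_assert(negative_in_phase4 == negative_in_phase7,
              "phase 4 and phase 7 are expected to agree");
```

The check does not assume a particular signedness for char. It only requires that the two evaluations agree.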
The value of any other kind of character-literal is determined as follows:
The character specified by a simple-escape-sequence is specified in Table 11.
[Note 3:
Using an escape sequence for a question mark is supported for compatibility with ISO C++ 2014 and ISO C.
— end note]
Table 11: Simple escape sequences [tab:lex.ccon.esc]
character                      simple-escape-sequence
-----------------------------  ----------------------
U+000a  line feed              \n
U+0009  character tabulation   \t
U+000b  line tabulation        \v
U+0008  backspace              \b
U+000d  carriage return        \r
U+000c  form feed              \f
U+0007  alert                  \a
U+005c  reverse solidus        \\
U+003f  question mark          \?
U+0027  apostrophe             \'
U+0022  quotation mark         \"
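The mapping in Table 11 can be spot-checked with UTF-32 character literals, where the single code unit equals the code point of the specified character. This is an illustrative sketch, and escapes_match_table is a hypothetical name.

```cpp
// Each comparison pairs a simple-escape-sequence with its code point from Table 11.
// The U prefix is used so that the literal's value is the UTF-32 code unit.
constexpr bool escapes_match_table =
    U'\n' == 0x000A &&  // line feed
    U'\t' == 0x0009 &&  // character tabulation
    U'\v' == 0x000B &&  // line tabulation
    U'\b' == 0x0008 &&  // backspace
    U'\r' == 0x000D &&  // carriage return
    U'\f' == 0x000C &&  // form feed
    U'\a' == 0x0007 &&  // alert
    U'\\' == 0x005C &&  // reverse solidus
    U'\?' == 0x003F &&  // question mark
    U'\'' == 0x0027 &&  // apostrophe
    U'\"' == 0x0022;    // quotation mark

static_assert(escapes_match_table, "simple escape sequences are expected to match Table 11");
```

Unprefixed literals would yield the same numbers only when the ordinary literal encoding is ASCII-compatible, which is why the sketch uses the U prefix.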