5 Lexical conventions [lex]

5.10 Identifiers [lex.name]

identifier-start:
nondigit
an element of the translation character set with the Unicode property XID_Start
identifier-continue:
digit
nondigit
an element of the translation character set with the Unicode property XID_Continue
nondigit: one of
a b c d e f g h i j k l m
n o p q r s t u v w x y z
A B C D E F G H I J K L M
N O P Q R S T U V W X Y Z _
digit: one of
0 1 2 3 4 5 6 7 8 9
[Note 1: 
The character properties XID_Start and XID_Continue are Derived Core Properties as described by UAX #44 of the Unicode Standard.13
— end note]
The program is ill-formed if an identifier does not conform to Normalization Form C as specified in the Unicode Standard.
[Note 2: 
Identifiers are case-sensitive.
— end note]
[Note 3: 
[uaxid] compares the requirements of UAX #31 of the Unicode Standard with the C++ rules for identifiers.
— end note]
[Note 4: 
In translation phase 4, identifier also includes those preprocessing-tokens ([lex.pptoken]) differentiated as keywords ([lex.key]) in the later translation phase 7 ([lex.token]).
— end note]
The identifiers in Table 4 have a special meaning when appearing in a certain context.
When referred to in the grammar, these identifiers are used explicitly rather than using the identifier grammar production.
Unless otherwise specified, any ambiguity as to whether a given identifier has a special meaning is resolved to interpret the token as a regular identifier.
Table 4: Identifiers with special meaning [tab:lex.name.special]
final
import
module
override
In addition, some identifiers appearing as a token or preprocessing-token are reserved for use by C++ implementations and shall not be used otherwise; no diagnostic is required.
  • Each identifier that contains a double underscore __ or begins with an underscore followed by an uppercase letter is reserved to the implementation for any use.
  • Each identifier that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
13)13)
On systems in which linkers cannot accept extended characters, an encoding of the universal-character-name can be used in forming valid external identifiers.
For example, some otherwise unused character or sequence of characters can be used to encode the \u in a universal-character-name.
Extended characters can produce a long external identifier, but C++ does not place a translation limit on significant characters for external identifiers.