15 Preprocessing directives [cpp]

15.4 Resource inclusion [cpp.embed]

15.4.1 General [cpp.embed.gen]

A bracket resource search for a sequence of characters searches a sequence of places for a resource identified uniquely by that sequence of characters.
How the places are determined or the resource identified is implementation-defined.
A quote resource search for a sequence of characters attempts to identify a resource that is named by the sequence of characters.
The named resource is searched for in an implementation-defined manner.
If the implementation does not support a quote resource search for that sequence of characters, or if the search fails, the result of the quote resource search is the result of a bracket resource search for the same sequence of characters.
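The fallback from a quote resource search to a bracket resource search can be sketched as follows. This is a minimal illustrative model, not how any particular implementation works: the `Environment` type, the in-memory "places", and every function name below are assumptions introduced for the sketch, since the actual places and lookup are implementation-defined.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of a translation environment: each "place" maps
// resource names to resource contents.
using Place = std::map<std::string, std::string>;

struct Environment {
    std::vector<Place> bracket_places;  // searched in order by a bracket search
    Place quote_place;                  // consulted first by a quote search
};

// Bracket resource search: try each place in sequence.
std::optional<std::string> bracket_search(const Environment& env,
                                          const std::string& name) {
    for (const Place& place : env.bracket_places) {
        const auto it = place.find(name);
        if (it != place.end()) {
            return it->second;
        }
    }
    return std::nullopt;  // the search failed
}

// Quote resource search: when the named resource is not found, the result
// is the result of a bracket search for the same sequence of characters.
std::optional<std::string> quote_search(const Environment& env,
                                        const std::string& name) {
    const auto it = env.quote_place.find(name);
    if (it != env.quote_place.end()) {
        return it->second;
    }
    return bracket_search(env, name);
}

// Small fixed environment used only for illustration.
Environment sample_environment() {
    Environment env;
    env.quote_place["local.bin"] = "from-quote-place";
    Place first;
    first["a.bin"] = "first";
    Place second;
    second["a.bin"] = "second";
    second["b.bin"] = "b";
    env.bracket_places.push_back(first);
    env.bracket_places.push_back(second);
    return env;
}
```

In this model, a quote search for `a.bin` falls through to the bracket places and finds the copy in the first place, whereas a bracket search never consults the quote place.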
A preprocessing directive of the form
# embed header-name pp-tokens_opt new-line
causes the replacement of that directive by preprocessing tokens derived from data in the resource identified by header-name, as specified below.
If the header-name is of the form
< h-char-sequence >
the resource is identified by a bracket resource search for the sequence of characters of the h-char-sequence.
If the header-name is of the form
" q-char-sequence "
the resource is identified by a quote resource search for the sequence of characters of the q-char-sequence.
If a bracket resource search fails, or if a quote or bracket resource search identifies a resource that cannot be processed by the implementation, the program is ill-formed.
[Note 1: 
If the resource cannot be processed, the program is ill-formed even when processing #embed with limit(0) ([cpp.embed.param.limit]) or evaluating __has_embed.
— end note]
Recommended practice: A mechanism similar to, but distinct from, the implementation-defined search paths used for #include ([cpp.include]) is encouraged.
Either form of the #embed directive processes the pp-tokens, if present, just as in normal text.
The pp-tokens shall then have the form embed-parameter-seq.
A resource is a source of data accessible from the translation environment.
A resource has an implementation-resource-width, which is the implementation-defined size in bits of the resource.
If the implementation-resource-width is not an integral multiple of CHAR_BIT, the program is ill-formed.
Let implementation-resource-count be implementation-resource-width divided by CHAR_BIT.
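Every resource also has a resource-count, which is the value computed from the limit embed-parameter ([cpp.embed.param.limit]), if present; otherwise, it is the implementation-resource-count.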
A resource is empty if the resource-count is zero.
[Example 1: 
// ill-formed if the implementation-resource-width is 6 bits
#embed "6_bits.bin"
— end example]
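The relationship between implementation-resource-width, CHAR_BIT, and implementation-resource-count can be sketched as a small compile-time computation. The function names below are hypothetical and exist only for this illustration; `std::nullopt` stands in for the case in which the program would be ill-formed.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <optional>

// Returns implementation-resource-count for a resource whose
// implementation-resource-width is width_bits, or std::nullopt when the
// width is not an integral multiple of CHAR_BIT (modeling an ill-formed
// program).
constexpr std::optional<std::size_t>
implementation_resource_count(std::size_t width_bits) {
    if (width_bits % CHAR_BIT != 0) {
        return std::nullopt;
    }
    return width_bits / CHAR_BIT;
}

// A resource is empty if its resource-count is zero.
constexpr bool resource_is_empty(std::size_t resource_count) {
    return resource_count == 0;
}

// A 6-bit resource, as in "6_bits.bin", has no valid count.
static_assert(!implementation_resource_count(6).has_value());
static_assert(implementation_resource_count(4 * CHAR_BIT).value() == 4);
static_assert(resource_is_empty(implementation_resource_count(0).value()));
```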
The #embed directive is replaced by a comma-separated list of integer literals of type int, unless otherwise modified by embed parameters ([cpp.embed.param]).
The integer literals in the comma-separated list correspond to resource-count consecutive calls to std::fgetc ([cstdio.syn]) from the resource, as a binary file.
If any call to std::fgetc returns EOF, the program is ill-formed.
Recommended practice: The value of each integer literal should closely represent the bit stream of the resource unmodified.
This can require an implementation to consider potential differences between translation and execution environments, as well as any other applicable sources of mismatch.
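The correspondence with std::fgetc can be sketched as a run-time model of the replacement list. This is only an illustration under stated assumptions: `expand_embed` and `make_sample_resource` are hypothetical names, the resource is modeled as a `std::FILE*` opened in binary mode, and an exception stands in for the ill-formed case.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <stdexcept>
#include <string>

// Builds the comma-separated list that an #embed directive without embed
// parameters would be replaced by, modeled as resource_count consecutive
// calls to std::fgetc. Throws where the real program would be ill-formed
// (a call to std::fgetc returns EOF).
std::string expand_embed(std::FILE* resource, std::size_t resource_count) {
    std::string list;
    for (std::size_t i = 0; i < resource_count; ++i) {
        const int value = std::fgetc(resource);
        if (value == EOF) {
            throw std::runtime_error("resource ended before resource-count elements");
        }
        if (i != 0) {
            list += ", ";
        }
        list += std::to_string(value);  // each element is an integer literal of type int
    }
    return list;
}

// Hypothetical helper: creates a temporary binary stream holding three bytes.
std::FILE* make_sample_resource() {
    std::FILE* f = std::tmpfile();
    if (f == nullptr) {
        throw std::runtime_error("could not create temporary file");
    }
    const unsigned char bytes[] = {0x00, 0x7F, 0xFF};
    if (std::fwrite(bytes, 1, sizeof(bytes), f) != sizeof(bytes)) {
        std::fclose(f);
        throw std::runtime_error("could not write sample bytes");
    }
    std::rewind(f);
    return f;
}
```

For the three sample bytes, `expand_embed(f, 3)` yields `0, 127, 255`; asking for a fourth element throws, mirroring the rule that an EOF result makes the program ill-formed.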
[Example 2: 
#include <cstring>
#include <cstddef>
#include <fstream>
#include <vector>
#include <cassert>

int main() {
  // If the file is the same as the resource in the translation environment,
  // no assert in this program should fail.
  constexpr unsigned char d[] = {
#embed <data.dat>
  };
  const std::vector<unsigned char> vec_d = {
#embed <data.dat>
  };

  constexpr std::size_t expected_size = sizeof(d);
  // same file in execution environment as was embedded
  std::ifstream f_source("data.dat", std::ios::binary | std::ios::in);
  unsigned char runtime_d[expected_size];
  char* ifstream_ptr = reinterpret_cast<char*>(runtime_d);
  f_source.read(ifstream_ptr, expected_size);
  assert(!f_source.fail());
  std::size_t ifstream_size = f_source.gcount();
  assert(ifstream_size == expected_size);
  int is_same = std::memcmp(&d[0], ifstream_ptr, ifstream_size);
  assert(is_same == 0);
  int is_same_vec = std::memcmp(vec_d.data(), ifstream_ptr, ifstream_size);
  assert(is_same_vec == 0);
}
— end example]
[Example 3: 
int i = {
#embed "i.dat"
};              // well-formed if i.dat produces a single value
int i2 =
#embed "i.dat"
;               // also well-formed if i.dat produces a single value
struct s {
  double a, b, c;
  struct { double e, f, g; } x;
  double h, i, j;
};
s x = {
// well-formed if the directive produces nine or fewer values
#embed "s.dat"
};
— end example]
A preprocessing directive of the form
# embed pp-tokens new-line
(that does not match the previous form) is permitted.
The preprocessing tokens after embed in the directive are processed just as in normal text (i.e., each identifier currently defined as a macro name is replaced by its replacement list of preprocessing tokens).
Then, an attempt is made to form a header-name preprocessing token ([lex.header]) from the whitespace and the characters of the spellings of the resulting sequence of preprocessing tokens immediately after embed; the treatment of whitespace is implementation-defined.
If the attempt succeeds, the directive with the so-formed header-name is processed as specified for the previous form.
Otherwise, the program is ill-formed.
[Note 2: 
Adjacent string-literals are not concatenated into a single string-literal (see the translation phases in [lex.phases]); thus, an expansion that results in two string-literals is an invalid directive.
— end note]
Any further processing as in normal text described for the previous form is not performed.
[Note 3: 
That is, processing as in normal text happens once and only once for the entire directive.
— end note]
[Example 4: 
If the directive matches the second form, the whole directive is replaced.
If the directive matches the first form, everything after the name is replaced.
#define EMPTY
#define X myfile
#define Y rsc
#define Z 42
#embed <myfile.rsc> prefix(Z)
#embed EMPTY <X.Y> prefix(Z)
is equivalent to:
#embed <myfile.rsc> prefix(42)
#embed <myfile.rsc> prefix(42)
— end example]

15.4.2 Embed parameters [cpp.embed.param]

15.4.2.1 limit parameter [cpp.embed.param.limit]

An embed-parameter of the form limit ( pp-balanced-token-seq ) specifies the maximum possible number of elements in the comma-delimited list.
It shall appear at most once in the embed-parameter-seq.
The preprocessing token defined shall not appear in the pp-balanced-token-seq.
The pp-balanced-token-seq is evaluated as a constant-expression using the rules as described in conditional inclusion ([cpp.cond]), but without being processed as in normal text an additional time.
[Example 1: 
#undef DATA_LIMIT
#if __has_embed(<data.dat> limit(DATA_LIMIT))
#endif
is equivalent to:
#if __has_embed(<data.dat> limit(0))
#endif
— end example]
[Example 2: 
#embed <data.dat> limit(__has_include("a.h"))

#if __has_embed(<data.dat> limit(__has_include("a.h")))
// ill-formed: __has_include ([cpp.cond]) cannot appear here
#endif
— end example]
The constant-expression shall be an integral constant expression whose value is greater than or equal to zero.
The resource-count ([cpp.embed.gen]) becomes implementation-resource-count, if the value of the constant-expression is greater than implementation-resource-count; otherwise, the value of the constant-expression.
[Example 3: 
constexpr unsigned char sound_signature[] = {
  // a hypothetical resource capable of expanding to four or more elements
#embed <sdk/jump.wav> limit(2+2)
};

static_assert(sizeof(sound_signature) == 4); // OK
— end example]
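The effect of the limit parameter on resource-count amounts to taking the smaller of two values, which can be sketched as below. The function name is hypothetical, and `limit_value` models the already-evaluated constant-expression, which is required to be greater than or equal to zero.

```cpp
#include <cassert>
#include <cstddef>

// resource-count after applying limit(limit_value): implementation-resource-count
// if the limit exceeds it; otherwise, the value of the limit.
constexpr std::size_t resource_count_with_limit(std::size_t implementation_resource_count,
                                                std::size_t limit_value) {
    return limit_value > implementation_resource_count ? implementation_resource_count
                                                       : limit_value;
}

// A resource of 1000 elements with limit(2+2) yields four elements.
static_assert(resource_count_with_limit(1000, 2 + 2) == 4);
// A limit larger than the resource leaves the count unchanged.
static_assert(resource_count_with_limit(3, 10) == 3);
// limit(0) makes any resource empty.
static_assert(resource_count_with_limit(3, 0) == 0);
```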

15.4.2.2 prefix parameter [cpp.embed.param.prefix]

An embed-parameter of the form
prefix ( pp-balanced-token-seq )
shall appear at most once in the embed-parameter-seq.
If the resource is empty, this embed-parameter is ignored.
Otherwise, the pp-balanced-token-seq is placed immediately before the comma-delimited list of integer literals.

15.4.2.3 suffix parameter [cpp.embed.param.suffix]

An embed-parameter of the form
suffix ( pp-balanced-token-seq )
shall appear at most once in the embed-parameter-seq.
If the resource is empty, this embed-parameter is ignored.
Otherwise, the pp-balanced-token-seq is placed immediately after the comma-delimited list of integer literals.
[Example 1: 
constexpr unsigned char whl[] = {
#embed "ches.glsl" \
  prefix(0xEF, 0xBB, 0xBF, ) /* a sequence of bytes */ \
  suffix(,)
  0
};
// always null-terminated, contains the sequence if not empty
constexpr bool is_empty = sizeof(whl) == 1 && whl[0] == '\0';
constexpr bool is_not_empty = sizeof(whl) >= 4
  && whl[sizeof(whl) - 1] == '\0'
  && whl[0] == '\xEF' && whl[1] == '\xBB' && whl[2] == '\xBF';
static_assert(is_empty || is_not_empty);
— end example]
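How prefix and suffix combine with the element list, and how both are ignored for an empty resource, can be sketched with a simple string model. Token sequences are represented as plain strings purely for illustration, and `expand_with_prefix_suffix` is a hypothetical name, not part of any implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Models the replacement of an #embed directive that carries prefix(...) and
// suffix(...) parameters. elements holds the resource-count values.
std::string expand_with_prefix_suffix(const std::vector<unsigned char>& elements,
                                      const std::string& prefix,
                                      const std::string& suffix) {
    if (elements.empty()) {
        return "";  // empty resource: both parameters are ignored
    }
    std::string out = prefix;  // placed immediately before the list
    for (std::size_t i = 0; i < elements.size(); ++i) {
        if (i != 0) {
            out += ", ";
        }
        out += std::to_string(static_cast<int>(elements[i]));
    }
    out += suffix;  // placed immediately after the list
    return out;
}
```

With elements `{1, 2}`, prefix `"0xEF, 0xBB, 0xBF, "`, and suffix `","`, the model produces `0xEF, 0xBB, 0xBF, 1, 2,`, which is why the trailing `0` in the example above always yields a well-formed initializer list.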

15.4.2.4 if_empty parameter [cpp.embed.param.if.empty]

An embed-parameter of the form
if_empty ( pp-balanced-token-seq )
shall appear at most once in the embed-parameter-seq.
If the resource is not empty, this embed-parameter is ignored.
Otherwise, the #embed directive is replaced by the pp-balanced-token-seq.
[Example 1: 
limit(0) affects when a resource is considered empty.
Therefore, the following program:
#embed </owo/uwurandom> \
  if_empty(42203) limit(0)
expands to
42203
— end example]
[Example 2: 
Because of the limit(0) embed-parameter, this resource is always considered empty, including in __has_embed clauses.
int infinity_zero () {
#if __has_embed(</owo/uwurandom> limit(0) prefix(some tokens)) == __STDC_EMBED_EMPTY__
  // if </owo/uwurandom> exists, this conditional inclusion branch is taken and the function returns 0.
  return 0;
#else
  // otherwise, the resource does not exist
#error "The resource does not exist"
#endif
}
— end example]
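The interaction between limit(0), emptiness, and the result of __has_embed can be sketched as a small classification function. This is a simplified model that assumes every embed parameter is supported and the resource can be processed; `HasEmbed` and `has_embed` are hypothetical names, and the enumerator values mirror the predefined macros __STDC_EMBED_NOT_FOUND__ (0), __STDC_EMBED_FOUND__ (1), and __STDC_EMBED_EMPTY__ (2).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Mirrors the values of __STDC_EMBED_NOT_FOUND__, __STDC_EMBED_FOUND__,
// and __STDC_EMBED_EMPTY__.
enum class HasEmbed { not_found = 0, found = 1, empty = 2 };

// implementation_resource_count is std::nullopt when the resource search
// fails; limit is std::nullopt when no limit parameter is present.
constexpr HasEmbed has_embed(std::optional<std::size_t> implementation_resource_count,
                             std::optional<std::size_t> limit) {
    if (!implementation_resource_count) {
        return HasEmbed::not_found;
    }
    std::size_t count = *implementation_resource_count;
    if (limit && *limit < count) {
        count = *limit;  // resource-count as adjusted by limit
    }
    return count == 0 ? HasEmbed::empty : HasEmbed::found;
}

// limit(0) makes an existing resource empty, whatever its size.
static_assert(has_embed(std::size_t{1000}, std::size_t{0}) == HasEmbed::empty);
// Without a limit, a non-empty resource is simply found.
static_assert(has_embed(std::size_t{1000}, std::nullopt) == HasEmbed::found);
// A failed search is reported as not found, even with limit(0).
static_assert(has_embed(std::nullopt, std::size_t{0}) == HasEmbed::not_found);
```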