Igor's C++ Grimoire aims to be a reasonably complete reference to C++11, C++14, and C++17
Many sources have been used to compile it, but the major ones, and some of particular note are as follows. The C++ Programming Language 4th Ed, by Bjarne Stroustrup. Effective C++ 3rd Ed, and Effective Modern C++ 1st Ed, November 2014, both by Scott Meyers. On-line resources include ISO C++ FAQ, cppreference.com, CppCon and many others
The original intention was to just present the facts; the syntax, some examples, and a thorough description of the language with the aim of being as complete and unambiguous as possible whilst keeping the fluff to a minimum. As time has marched on, considerable 'other stuff' has been added, particularly in the area of advice on avoiding common (and sometimes obscure) pitfalls, particular techniques, and general advice that should make life easier. Notable exceptions to the original ethos of only the facts are the Templates and Meta-Programming sections which almost entirely describe the application of technique rather than syntax, and the Design Considerations section which is mostly made up of (often other people's) rambling thoughts and ideas
The standard library is not described in any detail. Exceptions are the Concurrency support (which is almost entirely implemented by standard library components), and some basic components and/or particular types/functions that are referred to by other items. These are all described in the Standard Techniques section and the sections that follow it. Be aware though, that most of this latter section provides only cursory descriptions, and those descriptions that are more complete may not be absolutely comprehensive
This document is intended as a reference; not a tutorial. As such, early sections often refer to concepts that are not introduced until later; hopefully the extensive cross-referencing will help in this respect
Hints, tips, and any musings that wander away from the narrow criteria of simply facts are highlighted in a box like this one. None of this information is necessary in order to understand C++, but there are definitely some very good ideas and a number of "gotcha's" here that are worth knowing about. Useful hints are shown with the symbol shown at the start of this paragraph
Warnings and general observations of a potentially problematic nature are shown with this symbol
Significant or obscure causes of error are shown with this symbol
Always do this if you want your code to work correctly and/or avoid future problems
Never do this if you want your code to work correctly and/or avoid future problems
Specific points in the main text may also be highlighted with one of the above symbols as appropriate
Items that only relate to a specific C++ version are identified with a marker for that version (C++11, C++14, or C++17) as appropriate. Whilst no attempt is made to document versions earlier than C++11, there is the odd item that is specific to C++98/03; these are identified with their own marker. A few non-standard (but commonly supported) features are described, and these are also marked as such
Features that have been introduced or modified in a particular version and retained in all subsequent versions are identified with a 'since' marker for that version, though in the case of the C++11 marker (indicating a change since C++98/03), only major features are identified this way
Any points that are incomplete, seemingly inconsistent or based on ambiguous source material are flagged with a marker. Hovering over these markers should display further details
All but the most trivial of code examples have been compiled, run, and shown to work. However, they typically omit the required header file inclusions, and may require some using declarations where components from the std namespace have not been qualified (which is often the case)
References within this document may be textual links or may be shown as just tiny link markers (which should pop up a destination hint if hovered over). Many links refer to very specific items, paragraphs or examples appropriate to the context of the referrer
If you spot any errors or inconsistencies, or think something has been missed out or is incomplete or ambiguous, then please let me know, no matter how trivial the issue may seem. You can email me at igorknockknock.org.uk (note that this will not cut-and-paste correctly - that's deliberate. Also, take care spelling 'knockknock'!)
You are free to make copies of this document for your own use. You may not publicly republish it. Distribution within an organisation may only be in its original, unmodified form. You may not charge for it
A number of navigation tools are provided by the buttons on the right of the display;-
C | Quick link to list of contents |
I | Quick link to the index |
☯ | Toggles between normal left-click operation and 'panning' (enables the page to be pushed up and down) |
⚑ |
Right-clicking one of these buttons once shall allow placement of a bookmark in the text; after right-clicking the button, move to where you would like the bookmark and ('left' or 'right') click again. The bookmark will not be placed if the selection is too vague (eg, you can't bookmark the whole document). Once a bookmark is placed, a ⚑ is placed in the left margin and its button is changed to a ⚐. A previously placed bookmark may be repositioned by repeating the same procedure. A bookmark may be deleted by double-right-clicking the appropriate bookmark button. To jump to a placed bookmark, left-click the appropriate button |
☞ | Many of the cross-reference links refer to very specific sections or paragraphs. This symbol is placed in the left margin to identify the link destination |
C++ is written in plain ASCII text and any one program is implemented as one or more text files
Whitespace (' ', TAB, and NEWLINE) must be used to separate type/variable/function names and reserved keywords. It is generally not required between names/keywords and syntactic elements and operators such as ( or ->. It is significant within string literals. Some operators are composed of multiple characters such as +=. Adding whitespace to these (ie, + =) would result in a syntax error. Other than these points and the occasional special case, whitespace may be freely used (or not) as required
A trigraph is a series of three characters which represents a single other character. Trigraphs were removed from the language in C++17. The full collection of supported trigraphs is;-
Trigraph | Represents |
---|---|
??= | # |
??! | | |
??/ | \ |
??' | ^ |
??( | [ |
??) | ] |
??< | { |
??> | } |
??- | ~ |
Unless you absolutely have to, do not use trigraphs. However, be aware of them as they can be accidentally specified, thus leading to unexpected results
A digraph is a series of two characters which represents a single other character. The full collection of supported digraphs is;-
Digraph | Represents |
---|---|
<: | [ |
:> | ] |
<% | { |
%> | } |
%: | # |
%:%: | ## |
As with trigraphs, unless you absolutely have to, do not use digraphs, but be aware of them
Apart from the maintainability and management improvements that come from dividing a program into multiple files, the technique improves modularity, enhances logical structure, and allows separation of interfaces from implementation
The text files are generally divided into two groups; implementation files and interface (header) files. Implementation files will typically include one or more header files, and header files may (and often do) also include other header files. This distinction is rendered a little fuzzy in practice because it is common for header files to define implementation in the form of inline functions, template definitions, etc
A compiler shall typically deal with each implementation file in turn. It will first invoke the preprocessor to create a translation unit. It is to this that the C++ language rules are applied. The translation unit is compiled into object code. All the individual object code parts are then passed to a linker to form the final executable code
Where multiple header files are used that include each other, repeated inclusion of the same header can occur. Apart from wasting compilation time, this can cause problems with multiple definitions of the same thing(s). To protect against this, use something like this (see also pre-processor);-
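A minimal sketch of such an include guard follows. The header name widget.h, the guard symbol WIDGET_H_INCLUDED, and the Widget type are hypothetical, and the second inclusion is simulated by repeating the header text in a single file;-

```cpp
#include <cassert>

// --- Contents of the hypothetical header "widget.h" ---
#ifndef WIDGET_H_INCLUDED
#define WIDGET_H_INCLUDED

struct Widget
{
    int id;
};

#endif // WIDGET_H_INCLUDED

// --- A second inclusion of the same header (simulated by repeating its text) ---
#ifndef WIDGET_H_INCLUDED
#define WIDGET_H_INCLUDED

struct Widget   // Never seen by the compiler: the guard symbol is already defined
{
    int id;
};

#endif // WIDGET_H_INCLUDED

// Only one definition of Widget exists, so this compiles without error
void example_include_guard()
{
    Widget w{42};
    assert(w.id == 42);
}
```

Without the guard, the second copy would be a redefinition of Widget and the translation unit would fail to compile.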
Minimise compilation dependencies between files
A class declaration is usually also its definition, and that includes its interface plus significant implementation detail in the form of data member declarations and even private member functions
Therefore, when a client wanting to use the class includes the header file that provides the definition, it implicitly creates a dependency between itself and the types and values used in the class' implementation details. This creates a number of problems;-
The basic principle that leads to reduced compilation dependency is to replace definitions with declarations wherever possible. To achieve this;-
One option is to define a reference or pointer to an external type, rather than embed an object of that type. The former only requires a declaration, whereas the latter requires the full definition
Ultimately, this approach leads to a pimpl (pointer-to-implementation - see also special member functions) structure for the class with the class becoming just a handle. This way, the exposed details are considerably reduced, ideally to a reference to a single implementation type for which only a minimalist forward declaration need be provided. The implementation details do not have to be included in the header and therefore clients of the class are no longer dependent on them, and are not directly affected by changes to them. This also has the additional benefit of truly hiding the implementation details from the client. To complete the picture, the class member functions would typically forward to (possibly other member) functions that operate on the (hidden) implementation type
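The pimpl structure might be sketched as follows. The Counter class, its Impl type, and the split into counter.h and counter.cpp are all hypothetical, and std::unique_ptr is just one possible way of holding the implementation pointer;-

```cpp
#include <cassert>
#include <memory>

// --- counter.h (hypothetical header seen by clients) ---
class Counter
{
public:
    Counter();
    ~Counter();                  // Declared here, defined where Impl is complete
    void increment();
    int value() const;
private:
    class Impl;                  // Forward declaration only
    std::unique_ptr<Impl> impl;  // Handle to the hidden implementation
};

// --- counter.cpp (hypothetical implementation file) ---
class Counter::Impl
{
public:
    int count = 0;
};

Counter::Counter() : impl(std::make_unique<Impl>()) {}
Counter::~Counter() = default;

// Member functions simply forward to the hidden implementation
void Counter::increment() { ++impl->count; }
int Counter::value() const { return impl->count; }

void example_pimpl()
{
    Counter c;
    c.increment();
    c.increment();
    assert(c.value() == 2);
}
```

The destructor is defined in the implementation part because std::unique_ptr needs the complete Impl type at the point where it destroys the object.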
Another approach is to define an interface-only abstract base class. Such a type typically declares a virtual destructor, and a set of pure virtual interface functions that specify the interface. It generally does not define any data members or any constructors. Non-virtual functions may be defined in the base class to implement common functionality. See also Alternatives to virtual functions
Factory functions can be defined for each type of class derived from the interface-only base, and used like this;-
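As an illustrative sketch only, the following uses a hypothetical Shape interface with hypothetical factory functions make_square() and make_rectangle(); the derived class would normally live in a separate implementation file;-

```cpp
#include <cassert>
#include <memory>

// Interface-only abstract base class: no data members, no constructors
class Shape
{
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

// Factory functions: the only other declarations a client needs to see
std::unique_ptr<Shape> make_square(double side);
std::unique_ptr<Shape> make_rectangle(double width, double height);

// Client code depends only on the declarations above
double total_area()
{
    const auto a = make_square(2.0);
    const auto b = make_rectangle(2.0, 3.0);
    return a->area() + b->area();
}

// --- Normally hidden in an implementation file ---
class Rectangle : public Shape
{
public:
    Rectangle(double w, double h) : width(w), height(h) {}
    double area() const override { return width * height; }
private:
    double width;
    double height;
};

std::unique_ptr<Shape> make_square(double side)
{
    return std::make_unique<Rectangle>(side, side);
}

std::unique_ptr<Shape> make_rectangle(double width, double height)
{
    return std::make_unique<Rectangle>(width, height);
}

void example_factory()
{
    assert(total_area() == 10.0);
}
```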
An application operating on objects derived from such an interface-only base class would only require recompilation if the interface changed; changes to object definitions derived from the interface-only class would not ripple-through to the application code
Another example shows how to use an interface-only base class using multiple inheritance
Remember that a function may be declared with argument and return types that are only declared and not defined
This moves the requirement to include the definition of the passed type from the header file that declares the function to the client code that calls the function. This is particularly useful if a header file declares many functions, of which only a few will be called in any one context
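A small sketch of this idea, using a hypothetical Date type and hypothetical functions next_day() and day_of(). The file boundaries are indicated by comments only;-

```cpp
#include <cassert>

// --- dates_fwd.h (hypothetical header): declarations only ---
class Date;                     // Declared, not defined
Date next_day(Date d);          // OK: parameter and return types may be incomplete here
int day_of(const Date& d);

// --- date.h (hypothetical header): needed only by code that calls the functions ---
class Date
{
public:
    explicit Date(int d) : day(d) {}
    int day;
};

// --- Definitions and call sites require the complete type ---
Date next_day(Date d) { return Date(d.day + 1); }
int day_of(const Date& d) { return d.day; }

void example_declared_only()
{
    assert(day_of(next_day(Date(30))) == 31);
}
```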
There are some special considerations when dealing with templates. In particular, there are two rules with regard to template compilation;-
Probably the most common code organisation approach is to include the same template definition into all translation units and rely on the compiler to optimise-away all duplicate specialisations; the "include everywhere" technique;-
One problem with this approach is that it tends to (unintentionally) encourage undesirable dependencies to grow between the user-code and the template definition
This problem can be mitigated by taking the approach of "include template definitions later (after they are used)". This can be achieved by dividing the template into a declaration .h file, and a definition .cpp file, and then arranging the translation unit thus;-
This minimises the chances of the template definition having some unanticipated and detrimental effect on the user code, but makes the reverse risk greater
Although most will, an implementation is not required to be able to delete duplicate/redundant copies of a template instantiation. This can lead to "multiple definition" errors at link-time
An implementation is not required to analyse duplicate/redundant copies of a template instantiation prior to deleting duplicates. This highlights the importance of ensuring that all instantiations for a specialisation are identical, so that whichever ones are discarded, the result will be the same
Regardless of how clever the compiler is, in a large application, building the multiple instantiations only to throw them away later can increase build times considerably
An implementation may be "hosted" or "free-standing"; the former includes all the standard library headers by default. The latter does not, but must support at least the header files highlighted (eg, <cstddef>) as a minimum
Everything defined in these headers is in the std namespace, so the definitions within them must be explicitly qualified or appropriate using-declarations and/or a using-directive must be used to bring them into scope
For C++11 and C++14, the C Library is based on C99
For C++17, the C Library is based on C11 with the 2012 technical corrections applied, except it does not include stdatomic.h, stdnoreturn.h or threads.h as these features are provided by other library components
To access an external 'C' function from C++, use the following;-
A group declaration may also be made in order to create a linkage block;-
Assuming the above is within a header file, it is more common to see it written as follows; to avoid errors when processed by a C compiler;-
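The three forms might look like the sketch below. The function names c_add(), c_negate() and c_double() are hypothetical; in practice their definitions would be compiled as C, but here they are defined in C++ with C linkage so that the example is self-contained;-

```cpp
#include <cassert>

// Single declaration form
extern "C" int c_double(int a);

// --- c_maths.h (hypothetical header shared by C and C++ code) ---
// The __cplusplus test hides the linkage block from a C compiler
#ifdef __cplusplus
extern "C" {
#endif

int c_add(int a, int b);
int c_negate(int a);

#ifdef __cplusplus
}
#endif

// Stand-in definitions (these would normally come from a C object file)
extern "C" int c_double(int a) { return a * 2; }
extern "C" int c_add(int a, int b) { return a + b; }
extern "C" int c_negate(int a) { return -a; }

void example_extern_c()
{
    assert(c_add(2, 3) == 5);
    assert(c_negate(4) == -4);
    assert(c_double(21) == 42);
}
```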
Assembler code may be embedded into C++ source code with the asm statement;-
The entry point of any C++ program is the function main(). Its prototype is the same as for plain 'C'. In fact, two prototypes are supported. A program must specify only one of them;-
Where;-
A program shall terminate if any of the following occurs;-
In any implementation, there are probably other ways of terminating a program such as division by zero, illegal memory access, etc
The plain 'C' (and C++) standard library function std::atexit() may be used to register a function that should be executed on normal program termination. For example;-
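A minimal sketch, assuming a hypothetical handler say_goodbye() and a hypothetical helper register_exit_handler() that wraps the registration;-

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// The handler must take no arguments and return nothing
void say_goodbye()
{
    std::puts("Goodbye");
}

// std::atexit() returns 0 if the registration succeeded
bool register_exit_handler()
{
    return std::atexit(say_goodbye) == 0;
}

void example_atexit()
{
    const bool registered = register_exit_handler();
    assert(registered);
    (void)registered;   // Avoid an unused-variable warning when NDEBUG is defined
}
```

The handler is not called at the point of registration; it runs when the program terminates normally, for example on return from main() or a call to std::exit().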
There are a number of pre-processor directives. Parsing these can (and likely does) result in code modification or compiler parameter modification. Only after all pre-processor directives are parsed is the compiler presented with the resulting code
All pre-processor directives start with the character #. Here is a list of them;-
Directive | Description | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
#include filename |
Replace the #include line with the contents of the specified file. This is used to bring header files into code files, or into other header files. #include directives can be nested (a header file can be 'included' that itself includes other header files). There are two formats for specifying filename; by using <name> or by using "name". The former syntax uses the compiler's include path, and the latter is typically a path relative to the directory of the file containing the directive (the exact search rules are implementation-defined). However, this distinction can be somewhat blurred in some implementations. Here are some examples;-
#include <cstddef> // Include the standard header file cstddef
#include "./includes/common_defs.h" // Include local header file
// ./includes/common_defs.h
The standard header names do not have a .h extension, hence <cstddef> and not <cstddef.h> |
||||||||||
#define symbol value |
Define a symbol with a specified value. Equivalent to using the -D option on the command line of most compilers. For example;-
#define HELLO Hello-World
…would define the symbol HELLO with the value Hello-World. Whether the value makes sense or not will depend on the context it is subsequently used in |
||||||||||
#define symbol(arguments) value |
Define a macro; a pre-processor symbol that takes arguments. For example;-
#define FN_CALL(fn, a, b) \
(fn(a * b))
// Use the macro
FN_CALL(sleep, 5, 7); // This expands out to '(sleep(5 * 7));'
It is possible to make a macro take a variable number of arguments by using ... in the argument list and __VA_ARGS__ within the macro body. All the above examples define their values within (…). Not doing this is legal but often causes errors as the expanded result creates an unexpected expression. The above also shows the use of \ as a continuation character. Macros may extend over many lines by terminating all the lines (except the last) with \. Macros cannot recurse and macro names cannot be overloaded. If adding comments to macros, use the /*…*/ syntax as some older tools may not understand the //… form. Macros may use the # and ## operations (described below) |
||||||||||
#undef symbol | Undefine a symbol/macro previously defined with #define | ||||||||||
#line line-num #line line-num filename #line other |
Overrides the values returned by __FILE__/__LINE__, and reported by the compiler. line-num is a positive integer and filename follows normal preprocessor rules for a string constant. If the supplied parameters do not match the standard formats then the supplied format is macro-expanded; the result being expected to match one of the two standard formats |
||||||||||
#if expression |
Defines a section of code that is to be parsed if the specified expression equates to true. The section of code extends to the next #else, #elif or #endif. If the expression equates to false then the section of code is removed/ignored. The following forms are allowed for expression (the (…) are optional but improve readability);- The logical operators || and &&, and the relational operators ==, !=, <, >, <= and >= may be used in the expression. (…) may be used to force evaluation ordering
Note that expression must resolve to an integral value; specifying a string or floating point shall either yield a pre-processor error or shall parse but will not generate the intended result. Note specifically that operations such as sizeof() are not allowed; they are not understood by the pre-processor |
||||||||||
#elif expression | Defines an 'else if'. This is generally used to create 'if-else-if' ladders and defines an alternate branch to a preceding #if or #elif pre-processor statement | ||||||||||
#ifdef symbol | Defines a section of code that is to be parsed if the specified symbol has been previously defined by a #define pre-processor statement. The section of code extends to the next #else or #endif. If the symbol has not been defined then the section of code is removed/ignored | ||||||||||
#ifndef symbol | This is the same as #ifdef except the logic is reversed | ||||||||||
#else | Defines an alternate (false) branch to a preceding #if (or one of the variants) | ||||||||||
#endif | Terminates the optional section indicated by a preceding #if (or one of the variants) or #else | ||||||||||
# symbol |
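Convert symbol into a string by bracketing it with " characters. This operation may only be applied to an argument within the body of a macro. For example, where the macro argument is Hello-World;-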
#Hello-World
…would yield;-
"Hello-World"
|
||||||||||
symbol ## symbol |
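Concatenates two symbols into a single token. This operation may only be used within the body of a macro, and the result must itself form a valid token. For example;-
Hello ## World
…would yield;-
HelloWorld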
|
||||||||||
#pragma option | Sets a compiler-specific option. The format of option depends on the option and the compiler. Unrecognised/unsupported #pragma directives should be ignored by the pre-processor | ||||||||||
symbol | Specifying a symbol (previously defined by a #define statement) on its own shall cause it to be expanded/replaced in-situ to its value | ||||||||||
#error text | Generates a compiler error, typically outputting text | ||||||||||
#warning text |
Generates a compiler warning, typically outputting text. This is a non-standard directive but is supported by several implementations. Consider using the more widely supported #pragma message instead |
||||||||||
#pragma message ("text") |
Outputs text at compile-time. This is a non-standard directive but is widely supported. Some implementations will simply output the message during compilation. Some will output the message and treat it as a compilation warning. Some will treat the message as a warning only if text begins with the text warning. This has the advantage over using #warning in that if it is not supported, it won't cause an error (unrecognised #pragma directives should be ignored by the pre-processor) |
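The # and ## operations and __VA_ARGS__ can be sketched together as follows. The macro names STRINGIFY, CONCAT and CALL, and the function add3(), are hypothetical and exist only for illustration;-

```cpp
#include <cassert>
#include <cstring>

#define STRINGIFY(x)   #x                  /* Argument becomes a string literal */
#define CONCAT(a, b)   a ## b              /* Two tokens are pasted into one */
#define CALL(fn, ...)  (fn(__VA_ARGS__))   /* Variable number of arguments */

int add3(int a, int b, int c)
{
    return a + b + c;
}

int pasted_value()
{
    int CONCAT(my, Value) = 5;   // Declares a variable named 'myValue'
    return myValue;
}

void example_macros()
{
    assert(std::strcmp(STRINGIFY(Hello-World), "Hello-World") == 0);
    assert(pasted_value() == 5);
    assert(CALL(add3, 1, 2, 3) == 6);
}
```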
Using the pre-processor #define statement to define macros (ie, something that will expand to a piece of C++ code) is ugly and can be the cause of many errors. Don't do it; use constant expressions instead
Notwithstanding this, there are a couple of legitimate uses of macros; to support conditional compilation and to protect against recursive inclusion
To support conditional compilation. Limit #define statements to setting control values that will only be used by subsequent #if statements, and NOT to embed into C++ code. For example;-
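A sketch of this usage follows. The control symbol LOG_LEVEL and the function log_mode() are hypothetical; the symbol is only ever tested by #if and never expanded into the C++ code itself;-

```cpp
#include <cassert>
#include <cstring>

// Control value; it may instead be supplied on the command line (eg, -DLOG_LEVEL=1)
#ifndef LOG_LEVEL
#define LOG_LEVEL 2
#endif

// Exactly one of these definitions survives pre-processing
#if LOG_LEVEL >= 2
const char* log_mode() { return "verbose"; }
#elif LOG_LEVEL == 1
const char* log_mode() { return "terse"; }
#else
const char* log_mode() { return "silent"; }
#endif

void example_conditional_compilation()
{
    const char* mode = log_mode();
    assert(mode != nullptr);
    (void)mode;   // Avoid an unused-variable warning when NDEBUG is defined
}
```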
The compiler defines a number of macros which may be used within code. Some of these are very useful, especially for debugging
Macro | Description |
---|---|
__cplusplus | Indication of C++ compilation (rather than plain 'C'). Its value is 201103L / 201402L / 201703L for C++11 / C++14 / C++17 respectively. See also __STDC__ |
__DATE__ | Current date in the format Mmm dd yyyy |
__TIME__ | Current time in the format hh:mm:ss |
__FILE__ | Name of current source file. See also #line |
__LINE__ | Line number of source code within current file. See also #line |
__func__ | Name of the current function. This is a 'C'-style string and implementation defined. This is defined only within the body of a function and technically is not actually a macro (though this distinction doesn't matter in use) |
__STDC_HOSTED__ | Has a value of 1 if the implementation is hosted, 0 (zero) otherwise |
__STDCPP_DEFAULT_NEW_ALIGNMENT__ | Integer of type std::size_t that defines the minimum byte alignment guaranteed by the default implementation of operator new |
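As a rough illustration of how some of these might be used for debugging, the sketch below builds a location string. The function names where_am_i() and language_version() are hypothetical, and the exact text produced by __FILE__ and __func__ depends on the implementation;-

```cpp
#include <cassert>
#include <string>

// Builds a string of the form "<file>:<line> in <function>"
std::string where_am_i()
{
    return std::string(__FILE__) + ":" + std::to_string(__LINE__) + " in " + __func__;
}

// Reports the language version the code was compiled as
long language_version()
{
    return __cplusplus;
}

void example_predefined_macros()
{
    const std::string location = where_am_i();
    assert(!location.empty());
    assert(location.find(" in ") != std::string::npos);
}
```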
Some other macros are conditionally defined by the implementation;-
Macro | Description |
---|---|
__STDC__ | Indication of plain 'C' compilation (rather than C++). See also __cplusplus |
__STDC_VERSION__ | May or may not be defined. Implementation-specific |
__STDC_MB_MIGHT_NEQ_WC__ | Set to 1 if, in the encoding for wchar_t, a member of the basic character set might have a code value that differs from its value as an ordinary character literal |
__STDC_ISO_10646__ | An integer in the format yyyymmL. If defined then it indicates that all characters in the Unicode required set, when stored in a wchar_t type, have the same value as the short identifier of each character. The Unicode required set is specified by ISO/IEC 10646; the version being adhered to being specified by yyyymm |
__STDCPP_STRICT_POINTER_SAFETY__ | Set to 1 if the implementation has strict pointer safety. Otherwise it is undefined. The function std::get_pointer_safety() returns an enumeration indicating similar information. This is only of relevance if the implementation supports and uses garbage collection |
__STDCPP_THREADS__ | Set to 1 if a program can have more than one thread of execution. Otherwise it is undefined. See also Concurrency |
Comments are defined as follows. There are two forms; the traditional 'C' comments which start with /* and end with */. Such comments may extend for as many lines as required. There is also the C++ comment style. This allows single-line comments only, starts with // and ends with the end-of-line character
Here is a complete list of reserved keywords;-
In addition, the keyword export is reserved but currently not used. It was originally intended to facilitate template definition/declaration separation, but the idea failed
The following contextual keywords are defined;-
Attributes provide a means of tagging certain names, types, functions, etc as having particular features. The following standard attributes are defined (there are also several non-standard attributes in common use);-
The [[deprecated]] attribute may be used to mark a name as deprecated. The name can still be used, but such marking gives the compiler an opportunity to issue a warning of its use
There are two forms of this attribute;-
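The standard provides a plain form and a form that carries an explanatory message. A brief sketch, using the hypothetical functions old_area(), older_area() and area();-

```cpp
#include <cassert>

// Plain form
[[deprecated]]
int old_area(int w, int h);

// Form with a message that the compiler may include in its warning
[[deprecated("use area() instead")]]
int older_area(int w, int h);

// The replacement that callers are expected to migrate to
int area(int w, int h) { return w * h; }

// Defining a deprecated function is fine; only uses of it may be warned about
int old_area(int w, int h) { return area(w, h); }
int older_area(int w, int h) { return area(w, h); }

void example_deprecated()
{
    assert(area(3, 4) == 12);
}
```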
This attribute may be applied to the following;-
Application | Example |
---|---|
Class declaration |
class [[deprecated]] X
{
// ...
};
The placement of [[deprecated]] is important; the following will deprecate the variable a, and not the class Y
[[deprecated]] class Y
{
// ...
} a;
|
Name alias |
using car [[deprecated]] = int;
[[deprecated]] typedef int colour;
|
Class data member |
[[deprecated]]
int a;
To deprecate all variables within a multi-declaration;-
[[deprecated]]
int a, b, c;
To deprecate one variable within a multi-declaration;-
int a [[deprecated]], b, c;
|
Structured binding |
[[deprecated]] auto [x, y] = a;
|
Class member function, friend function, function argument |
To deprecate a function name;-
[[deprecated]]
void fn();
To deprecate a function argument. For this to be meaningful (ie, to allow the user to avoid the deprecated argument), an overload without the deprecated argument would likely also need defining;-
void fn(int a, [[deprecated]] int b);
// The 'new' (overloaded) function without the deprecated argument
void fn(int a);
|
Namespace name |
namespace [[deprecated]] useful_stuff
{
// ...
}
|
Enumeration and enumerator values |
Deprecate the whole enumeration;-
enum [[deprecated]] colours
{
RED,
GREEN,
BLUE
};
Deprecate a single enumerator;-
enum colours
{
RED,
GREEN [[deprecated]],
BLUE
};
|
Template specialisation |
// Primary template
template<typename T>
class X
{
// ...
};
// Specialisation
template<>
class [[deprecated]] X<int>
{
// ...
};
|
The [[maybe_unused]] attribute is used to indicate that a declaration may not be used. If a compiler would normally issue a warning for an unused entity, then this attribute will suppress it
It may be applied to the same entities as [[deprecated]] (except for namespace names and template specialisations) in the same way
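A short sketch, assuming a hypothetical function checked_divide() whose variable ok is only referenced by assert() and would therefore be unused in a build that defines NDEBUG;-

```cpp
#include <cassert>

int checked_divide(int numerator, int denominator, [[maybe_unused]] int caller_line)
{
    // 'ok' is only read by assert(), which expands to nothing when NDEBUG is defined
    [[maybe_unused]] const bool ok = (denominator != 0);
    assert(ok);
    return numerator / denominator;
}

void example_maybe_unused()
{
    assert(checked_divide(10, 2, __LINE__) == 5);
}
```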
The [[nodiscard]] attribute is used to indicate that the return value of a function holds some significance and should not be ignored by the caller. It may be applied to a function declaration, a class declaration or an enum declaration
If the caller of a function with this attribute ignores the function's return value, then the compiler should issue a warning
Similarly, if a function returns a class or enum type that has been declared with this attribute then the compiler should issue a warning. For example;-
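The following sketch shows the attribute on a function, a class and an enum. All names (Status, Token, checksum(), save(), acquire()) are hypothetical, and whether a warning is actually issued is up to the compiler;-

```cpp
#include <cassert>

enum class [[nodiscard]] Status { ok, failed };

struct [[nodiscard]] Token
{
    int id;
};

// Attribute applied directly to a function
[[nodiscard]] int checksum(int a, int b) { return a + b; }

// These functions return types that are themselves marked [[nodiscard]]
Status save(int value) { return value >= 0 ? Status::ok : Status::failed; }
Token acquire() { return Token{1}; }

void example_nodiscard()
{
    // checksum(1, 2);   // Result ignored: the compiler should warn
    // save(1);          // Result ignored: the compiler should warn
    assert(checksum(1, 2) == 3);
    assert(save(1) == Status::ok);
    assert(acquire().id == 1);
}
```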
A Name refers to any of the following;-
A template. Given;-
X is the template's name
A declaration (or a definition, if no previous declaration exists) introduces a name into a scope. Here is a list of possible scopes and details of what qualifies a name to be considered to be within each scope;-
Scope | Details |
---|---|
Local | A name declared within a function or lambda, or as an argument to the function/lambda. The scope of the name extends from its point of declaration to the end of the enclosing block |
Class | A ('member') name defined within a class, outside of any function, embedded class, enum, or namespace. The scope of the name extends to all parts of the class |
Namespace | A ('namespace member') name defined within a namespace, outside of any function, lambda, class, enum, or other namespace. The scope of the name extends from the point of declaration to the end of the namespace, but may be made accessible to other translation units |
Global | A ('global') name defined outside of any function, lambda, class, enum, or namespace. The scope of the name extends from the point of declaration to the end of the file it is declared in, but may be made accessible to other translation units |
Statement | A 'local' name defined within the {} block of a for, while, if, or switch statement, or naked; with no preceding statement. The scope of the name extends from the point of declaration to the end of the enclosing statement block |
Function | A label defined within a function. Its scope extends from the start of the function to the end |
Any name may be a type name or a non-type name. A type name is a struct, class, enum or union. A non-type name is a variable, function or an argument to a function
For the purposes of the following discussion, alias, typedef and template names do not feature (it's not possible to do so without causing an error)
The same type name or the same non-type name may not be defined twice within the same scope. So;-
It is possible to define a type name that is the same as a non-type name in the same scope. For example, this is legal;-
When referring to a name that is duplicated in this way, the default is to assume use of the non-type name. To use the type version of the name, it must be qualified by preceding its use with the appropriate struct, class, enum or union. This is called an Elaborated Type Specifier. Expanding on the previous example;-
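A self-contained sketch of the idea, using the hypothetical name counter for both a struct and a variable in the same scope;-

```cpp
#include <cassert>

struct counter      // Type name
{
    int n;
};

int counter = 10;   // Non-type name in the same scope; it hides the type name

int elaborated_sum()
{
    struct counter c{5};    // Elaborated type specifier selects the type
    return counter + c.n;   // Unqualified 'counter' is the variable
}

void example_elaborated_type_specifier()
{
    assert(elaborated_sum() == 15);
}
```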
The same name may be defined in two different scopes. It may be necessary to disambiguate which of the names is being referred to by qualifying the reference. For example;-
Qualification can be nested. Expanding on the above example;-
The qualification syntax (::) may be used to identify type or (as in the above example) non-type names.
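As a sketch of nested qualification, the following declares the same (hypothetical) name value in four scopes and refers to each one explicitly. It uses namespaces, which are described in the next section;-

```cpp
#include <cassert>

int value = 1;                  // Global scope

namespace outer
{
    int value = 2;              // Namespace scope

    namespace inner
    {
        int value = 3;          // Nested namespace scope

        int sum_all()
        {
            int value = 4;      // Local scope; hides all of the others
            return value + inner::value + outer::value + ::value;
        }
    }
}

void example_qualification()
{
    assert(outer::inner::sum_all() == 10);
    assert(::value == 1);
    assert(outer::value == 2);
    assert(outer::inner::value == 3);
}
```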
A Namespace defines a named scope. This allows collections of logically related (type/function/object) names to be grouped together; they become members of the namespace. This notion allows the same names to exist in multiple parts of the codebase. Because they are in different scopes, the names do not interfere with each other. A namespace is declared like this;-
This example demonstrates the scope of namespaces and implicit/explicit name resolution;-
Argument-dependent lookup may be used to qualify name lookup, rather than explicit qualification with ::. For example;-
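A minimal sketch, assuming a hypothetical namespace geometry containing a type Point and a function manhattan(). The unqualified call is resolved because the argument's type belongs to that namespace;-

```cpp
#include <cassert>

namespace geometry
{
    struct Point
    {
        int x;
        int y;
    };

    int manhattan(const Point& p)
    {
        const int ax = p.x < 0 ? -p.x : p.x;
        const int ay = p.y < 0 ? -p.y : p.y;
        return ax + ay;
    }
}

void example_adl()
{
    geometry::Point p{3, -4};
    assert(manhattan(p) == 7);             // Found via the namespace of the argument type
    assert(geometry::manhattan(p) == 7);   // Explicit qualification still works
}
```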
Namespaces are open and may be declared in parts. This allows the namespace members to be declared in separate source files, or maybe to allow division of members for improved readability (for example, separating the private from the public interface). For example;-
Namespace aliases may not be used to 'add' to a namespace in this way; the original namespace-name must be used
Namespace members may be declared within the namespace but defined outside it by qualification. For example;-
This is an important feature; it allows the namespace to be declared in a header file and some or all of its members to be defined elsewhere
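Both ideas can be sketched together. The namespace shapes and its functions are hypothetical; the two namespace parts could equally live in separate header files;-

```cpp
#include <cassert>

// First part of the namespace (declaration only)
namespace shapes
{
    int square_area(int side);
}

// The namespace is re-opened to add another member
namespace shapes
{
    int cube_volume(int side);
}

// Members defined outside the namespace by qualification
int shapes::square_area(int side)
{
    return side * side;
}

int shapes::cube_volume(int side)
{
    return side * square_area(side);   // Other members need no qualification here
}

void example_open_namespace()
{
    assert(shapes::square_area(3) == 9);
    assert(shapes::cube_volume(3) == 27);
}
```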
Namespaces may be nested (this is used in the standard library in the chrono and rel_ops namespaces);-
Nested namespaces may be specified directly. For example, declaring sub_fn() from the previous example could also be written;-
Any members within a namespace declared as inline will take on the scope of the including namespace. For example;-
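The sketch below illustrates both the C++17 nested namespace syntax and an inline namespace. The namespaces app, detail, lib, v1 and v2 are hypothetical, as is this particular declaration of sub_fn();-

```cpp
#include <cassert>

// Traditional nesting
namespace app
{
    namespace detail
    {
        int sub_fn();
    }
}

// Direct specification of the nested namespace (C++17)
namespace app::detail
{
    int sub_fn() { return 1; }
}

namespace lib
{
    namespace v1
    {
        int version() { return 1; }
    }

    // Members of an inline namespace can be named as if they belonged to 'lib'
    inline namespace v2
    {
        int version() { return 2; }
    }
}

void example_nested_and_inline()
{
    assert(app::detail::sub_fn() == 1);
    assert(lib::version() == 2);       // Resolves to lib::v2::version()
    assert(lib::v1::version() == 1);   // The older version is still reachable
    assert(lib::v2::version() == 2);
}
```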
A namespace can be created without a name such as;-
Don't use unnamed namespaces within header files; their implicit scope mechanism can cause problems
Namespaces can include other namespaces. This can lead to naming conflicts. using-declarations, using-directives and namespace aliases can resolve these issues and are described in the following sections
There is often a trade-off when using (or not using) using-declarations and using-directives; a trade-off between convenience, verbosity and clarity (of where a referenced object comes from). This must be dealt-with on a case-by-case basis
Generally, if references to many names from the same namespace are being made, then a using-directive may be appropriate. If there are multiple references to only a single (or a few) members of a particular namespace then a using-declaration is probably more appropriate. For infrequent references to individual names, explicit qualification is probably better
A number of using-declarations provides much finer-grained control than a single using-directive
The using qualifier should be restricted to small scopes to avoid confusion and accidental misuse. Overuse can also cause the very name clashes that namespaces are intended to avoid
If a namespace-scoped name is used often, a synonym (a using-declaration) may be defined via the using qualifier. This eliminates the need to constantly explicitly qualify the name with ::. Rather than this;-
…it is possible to do this;-
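A sketch of the two styles side by side. The function names are hypothetical; std::string is used only as a convenient namespace-scoped name;-

```cpp
#include <cassert>
#include <string>

// Explicit qualification on every use.
std::string greet_qualified(const std::string& name) {
    return std::string("Hello, ") + name;
}

// A using-declaration introduces the single name 'string' into this scope.
std::string greet_with_using(const std::string& name) {
    using std::string;
    return string("Hello, ") + name;
}
```

Placing the using-declaration inside the function keeps its effect confined to a small scope.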
A using-directive may be used to bring into scope all members of a namespace. For example;-
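A minimal sketch using std::chrono; the function name minutes_to_seconds() is hypothetical. The directive is placed inside the function so that the imported names do not leak into the surrounding scope;-

```cpp
#include <chrono>

constexpr long long minutes_to_seconds(int m) {
    // Using-directive: every member of std::chrono is usable unqualified here.
    using namespace std::chrono;
    return duration_cast<seconds>(minutes(m)).count();
}
```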
Both using-declarations and using-directives may be used within other namespaces. Apart from bringing-in common external namespaces, this can be useful if a hierarchy of namespaces is being defined, such as a 'user' part and an 'implementation' part
The technique can also be used to construct local collections of other namespaces. For example;-
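One way this might look, with entirely hypothetical namespaces text, gfx and ui. The composed namespace ui pulls in everything from text but only a single name from gfx;-

```cpp
namespace text { constexpr int columns() { return 80; } }
namespace gfx  { constexpr int rows()    { return 25; }
                 constexpr int depth()   { return 8;  } }

namespace ui {
    using namespace text;   // using-directive: all of text
    using gfx::rows;        // using-declaration: just this one name
}

// Both names can now be reached through ui.
static_assert(ui::columns() * ui::rows() == 2000, "composed from two namespaces");
```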
The inline qualifier may be applied to a using-directive. In the event of a name clash, this will give priority to the affected namespace. For example;-
While this may have some use in specific cases such as version control, it is not something that should be used generally
A namespace alias may be defined. This is usually used to provide a shorter, more convenient name for a namespace, or (for example) to provide a generic name for a versioned library;-
For example;-
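A brief sketch of an alias giving a generic name to a versioned namespace. Both vendor_library_v2 and the alias vl are hypothetical names;-

```cpp
namespace vendor_library_v2 {
    constexpr int api_level() { return 2; }
}

// Namespace alias: client code refers to 'vl' and need not change when the
// alias is later pointed at a newer version.
namespace vl = vendor_library_v2;

static_assert(vl::api_level() == 2, "alias refers to the same namespace");
```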
This idea can also be applied to individual namespace members (though through a different mechanism);-
The C++ language entertains the concept of Undefined Behaviour. Essentially, this arises from code structures, techniques and/or control paths that do not perform in a predictable way; often without any warning from the compiler
If a program exhibits Undefined Behaviour then, essentially, anything that follows in the execution cannot be relied upon to be correct, or even to make sense. The program may crash. It may exit. It may continue to run but behave incorrectly or generate an indeterminate result. It may continue to run and operate perfectly well. It may proceed to calculate the answer to life, the universe, and everything; anything is possible. For example, the following will invoke Undefined Behaviour but is unlikely to cause a crash or provoke the program to become spontaneously self-aware. It will, however, generate an indeterminate result;-
Avoid any and all operations that may lead to Undefined Behaviour. Even if the operation appears to work correctly, it should be avoided, even if the alternative (that does not exhibit Undefined Behaviour) is more expensive
Most cases of Undefined Behaviour will not elicit any warnings from the compiler
Some behaviour is left to the discretion of the implementation, though it must be handled 'correctly' by the implementation
Some behaviour is left to the discretion of the implementation, but it should be documented by the implementation sufficiently well to allow meaningful analysis of the behaviour
Type | Family | Guaranteed Minimum Data Size (bits) |
---|---|---|
char | Character | 8 - may be signed or unsigned |
signed char | Character | 8 - signed |
unsigned char | Character | 8 - unsigned |
wchar_t | Character | Implementation-specific but ≤ sizeof(long) |
char16_t | Character | 16 (for UTF-16) |
char32_t | Character | 32 (for UTF-32) |
short int | Integral | 16 - signed by default |
int | Integral | 16 (usually native arch. size) - signed by default |
unsigned | Integral | as int but unsigned |
long int | Integral | 32 - signed by default |
long long int | Integral | 64 - signed by default |
float | Floating Point | 32 |
double | Floating Point | 64 |
long double | Floating Point | 64 (see note below) |
bool | Boolean | Implementation-specific but ≤ sizeof(long) |
void | no data | Implementation-specific - an incomplete type |
Data Model | int | long | pointer | Common Platform(s) |
---|---|---|---|---|
LP32 or 2/4/4 | 16 bit | 32 bit | 32 bit | Win16 API |
ILP32 or 4/4/4 | 32 bit | 32 bit | 32 bit | Unix, BSD, Linux, Win32 API |
LLP64 or 4/4/8 | 32 bit | 32 bit | 64 bit | Win64 API |
LP64 or 4/8/8 | 32 bit | 64 bit | 64 bit | Unix, BSD, Linux |
ILP64 or 8/8/8 | 64 bit | 64 bit | 64 bit | Cray, Unicos |
In all of the above Data Models, long long is 64 bit
Generally, LP32 and ILP32 are considered 32 bit systems and LLP64, LP64 and ILP64 are considered 64 bit systems though this definition is somewhat controversial
ILP64 is not very common; defining int as 64 bit tends to cause problems with portability, and it artificially inflates data structure sizes. It is common for implementations of this model to define a specific 32 bit integer type, often called _int32, but this is outside of the C++ specification
int vs unsigned int
Whether to prefer (signed) int or unsigned int types can invoke passionate argument. In theory, there is no reason why one should provide better performance than the other
However, there is one crucial difference that can and does provide a performance advantage for one. That difference is that the behaviour of arithmetic underflow and overflow is undefined for signed integer types, whereas it is very well defined for unsigned integer types
An implementation can use this undefined behaviour for signed types to make certain optimisations; essentially, the compiler can ignore the possibility of underflow and overflow because such behaviour is undefined anyway. Consider the following;-
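A sketch of the kind of loop in question. The function name count_iterations() is hypothetical; max and a follow the names used in the surrounding text;-

```cpp
#include <cassert>

long long count_iterations(int max) {
    long long iterations = 0;
    // 'a' is a signed int, so overflow of ++a is undefined; the compiler is
    // therefore free to assume that 'a' eventually exceeds 'max'.
    for (int a = 0; a <= max; ++a) {
        ++iterations;
    }
    return iterations;
}
```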
In this example, the compiler can always assume that a will never overflow and so the loop will execute exactly max + 1 times. The compiler can use this assumption to make certain optimisations
If the type of max and a were both changed to unsigned int types then this assumption no longer holds true; if max were set to std::numeric_limits<unsigned int>::max() then the loop would run indefinitely because a would eventually overflow and wrap-round back to zero. This possibility introduces uncertainty and so prevents the compiler from implementing the same optimisations it could for the signed version
Essentially, the lack of well-defined underflow and overflow for signed integer types allows the compiler to be lazy in certain situations, and this laziness can lead to improvements in the generated code
Operations on signed integer types can also sometimes be faster because some processors devote more resources (literally, more transistors) to signed operations than unsigned operations. Any such differences are likely to be minute but nonetheless, they do exist
None of the above should be taken as an argument against using unsigned integer types, but if performance is absolutely imperative then the coder should be aware of the issues. In reality, improving an algorithm or restructuring how critical operations are performed are likely to yield orders of magnitude greater performance benefits than changing unsigned types to signed types
long double
Because of changes in modern machine architectures (in particular x86), the size of long double is tending to move (possibly counter-intuitively) towards 64 bits
The following modifiers may be applied to char, short, int, long and long long types
Modifier | Effect |
---|---|
const volatile signed type varname | Signed |
const volatile unsigned type varname | Unsigned |
Regular type is a loose term but generally refers to a type that;-
A user-defined type can be used as a constant expression if it is sufficiently simple. 'Sufficiently simple' means the constructor must have an empty body and all members must be able to be initialised with constant expressions. Such a type is called a Literal Type
Here is an example of a literal type being used by several constant expressions;-
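An illustrative sketch only: the type Point and its data are hypothetical, while the member name calc() is borrowed from the discussion that follows;-

```cpp
struct Point {
    int x;
    int y;

    // Empty constructor body; members initialised from constant expressions.
    constexpr Point(int x_, int y_) : x(x_), y(y_) {}

    // const is stated explicitly as well as constexpr.
    constexpr int calc() const { return x * x + y * y; }
};

// The literal type used in several constant expressions.
constexpr Point p(3, 4);
constexpr int squared_length = p.calc();
static_assert(squared_length == 25, "evaluated at compile time");
```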
A standard library type predicate is_literal_type<T>::value is defined in <type_traits> that returns whether a type is a literal type or not. This may be used in exactly the same way as is_pod<T>
is_literal_type<> is deprecated; not considered useful
The use of constexpr for the function calc() in the above example implies const and therefore the latter should not have to be specified. However, experience has shown that this is not always the case; the 'Clang/LLVM' compiler complains if const is omitted; not because it is wrong but because if the function were also used at run-time, it could behave differently
A trivial type is one that has standard copy semantics; ie, it must be trivially copyable and movable. It must also have a trivial destructor. A copy/move/destructor operation is trivial if;-
The standard library predicate is_trivial<T> may be used to test the above rules
A standard layout type is one that;-
For non-union types, have no base class sub-objects of the same type as the first non-static data member of the derived class (this is because of empty-base optimisation). This rule applies recursively throughout the class tree
For union types, this rule extends to all members, not just the first
Basically, if a type can be expressed in plain 'C' then it is probably a standard layout
The standard library predicate is_standard_layout<T> may be used to test the above rules
An aggregate type is an array (even an array of non-aggregate types) or a class (often a struct or a union) that has the following properties;-
An aggregate type may be aggregate initialised
An aggregate type may be copy list initialised
Defined in <cstddef> (note that the types defined in the std namespace are always the same as the 'C' equivalents; both are shown below where applicable);-
Type | Description |
---|---|
std::byte | Implements the concept of a byte in memory. It is the same size as a char but is not a character type. Only bitwise logical operations are defined for it (no arithmetic). Implemented as an empty enum class with an underlying type of unsigned char. Explicit casts must be used to convert to/from std::byte; an integral type can be converted to a byte by using std::byte{n}, and a std::byte can be converted to an integral with std::to_integer<T>(std::byte b). For example, int a = std::to_integer<int>(b); |
std::size_t size_t | An unsigned integer large enough to hold the size (in bytes) of any other type. It is returned from sizeof(), alignof() and offsetof(). size_t is a good choice for an array index as it is guaranteed to be large enough for any possible index value. Note that there is no std::ssize_t to match the 'C' (signed) ssize_t |
std::max_align_t max_align_t | A trivial type whose alignment requirement is at least as strict/large as any other scalar type. In practice this means that its alignment is often that of long double which is the largest scalar type |
std::ptrdiff_t ptrdiff_t | A signed integer large enough to hold the result of any pointer subtraction |
std::nullptr_t | The type of the literal nullptr. This is a distinct type and is not actually a pointer type. std::is_null_pointer<T>::value may be used to test if type T is a nullptr_t type. Defined in <type_traits> |
Defined in <cstdint>;-
Type | Description |
---|---|
std::intptr_t intptr_t std::uintptr_t uintptr_t | Signed and unsigned integer types large enough to hold a pointer value. A void* can be converted to and from these types with reinterpret_cast. Unlike a void*, these types support arithmetic and logical operations. These types are most useful for memory management applications. The min and max values of intptr_t are indicated by the macros INTPTR_MIN and INTPTR_MAX respectively. The max value of uintptr_t is indicated by the macro UINTPTR_MAX with the min value always being zero. Because a char is not guaranteed to be 8 bits, adding 1 to a char* is not guaranteed to actually increment it by 1. In contrast, adding 1 to an intptr_t or uintptr_t will always do precisely that. An implementation may opt to not define these types |
int8_t int16_t int32_t int64_t | Signed integer types of exactly the number of bits indicated. Min and max values are indicated by the macros INTn_MIN and INTn_MAX respectively for each data size. Negative values are implemented as 2's complement. An implementation may opt to not define these types, and may not be able to if the underlying architecture does not directly support them |
uint8_t uint16_t uint32_t uint64_t | Unsigned integer types of exactly the number of bits indicated. Max values are indicated by the macros UINTn_MAX for each data size. Min value is always zero. An implementation may opt to not define these types, and may not be able to if the underlying architecture does not directly support them |
int_fast8_t int_fast16_t int_fast32_t int_fast64_t | Fastest signed integer types of at least the number of bits indicated. Min and max values are indicated by the macros INT_FASTn_MIN and INT_FASTn_MAX respectively for each data size |
uint_fast8_t uint_fast16_t uint_fast32_t uint_fast64_t | Fastest unsigned integer types of at least the number of bits indicated. Max values are indicated by the macros UINT_FASTn_MAX for each data size. Min value is always zero |
int_least8_t int_least16_t int_least32_t int_least64_t | Smallest signed integer types of at least the number of bits indicated. Min and max values are indicated by the macros INT_LEASTn_MIN and INT_LEASTn_MAX respectively for each data size |
uint_least8_t uint_least16_t uint_least32_t uint_least64_t | Smallest unsigned integer types of at least the number of bits indicated. Max values are indicated by the macros UINT_LEASTn_MAX for each data size. Min value is always zero |
intmax_t | Maximum sized signed integer type supported by the implementation. Min and max values are indicated by the macros INTMAX_MIN and INTMAX_MAX respectively |
uintmax_t | Maximum sized unsigned integer type supported by the implementation. Max value is indicated by the macro UINTMAX_MAX. Min value is always zero |
auto | This is not actually a type at all (it is a keyword) but it is used in place of a type name. A variable of type auto must be initialised at definition; the actual type of the variable is selected by the compiler to something appropriate based on the type of the variable or literal that is assigned to it |
The following are considered incomplete types
In general terms, an incomplete type may not be used if the type's layout or size is required
In particular, a pointer to an incomplete type can be used, as long as it is not dereferenced. For example, the following is legal;-
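A small sketch of what is and is not permitted, using a hypothetical type Widget;-

```cpp
#include <cassert>

struct Widget;   // declared but not yet defined: an incomplete type here

// Legal: the pointer is only copied, never dereferenced, so neither the size
// nor the layout of Widget is needed.
Widget* pass_through(Widget* w) { return w; }

struct Widget { int value; };   // the type is complete from this point on

// Dereferencing needs the complete type, so this must follow the definition.
int read_value(const Widget* w) { return w->value; }
```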
Operator | Effect |
---|---|
type* var-name | Pointer |
type** var-name | Pointer to a pointer |
type* const var-name | Constant pointer |
type* volatile var-name | Volatile pointer |
type& var-name | lvalue reference (must be initialised) |
type&& var-name | rvalue reference (must be initialised) |
type var-name[] | Array |
type fn-name(args) | Function |
auto fn-name(args) -> type | Two operators; auto indicating a function with a suffix return-type and -> indicating the return type |
auto fn-name(args) | Function with deduced return type |
Modifier | Effect |
---|---|
type* var-name; | Pointer |
type var-name[n]; | Array |
type& var-name = initialiser; | lvalue reference (must be initialised) |
type&& var-name = initialiser; | rvalue reference (must be initialised) |
struct type-name {…}; | Structure |
union type-name {…}; | Union |
enum type-name {…}; | "Plain" Enumeration. Size is implementation-defined |
enum class type-name {…}; | Class enumeration. Size is implementation-defined |
class type-name {…}; | Class |
If a type is required but that type is not yet defined, a Forward Declaration may be specified. For example;-
There are several sets of rules for type deduction; used in a variety of scenarios;-
Class Templates constructor arguments
Generic lambda auto parameters
Manually determining a deduced type can become very difficult and complex. One method of getting the compiler to output a type is to deliberately cause an error. The following definition will do this;-
The above could be used like this;-
Don't use std::type_info, ie typeid(T).name(); it almost always gives the wrong answer! The reason for this is that std::type_info::name is specified to return a type as if the argument to type_info had been passed by value to a function template. This makes it unreliable because the function template type deduction rules strip references, const and volatile from such arguments. If you choose to ignore this advice, then note that some compilers provide a c++filt command that can interpret and present the name and type information returned from type_info::name
The Boost library provides type_index.hpp which defines type_id_with_cvr<> which can be used to retrieve run-time type information;-
Here is a complete list of function template argument constructs from which it is possible to deduce a type T or U, and a non-type argument N;-
Type Deduction Rules
When a function template is called, two type deductions take place; one for the type T and one for the function argument(s) based on T. How this is performed depends on the function argument declaration(s) and how the function is called. Consider;-
The above could be called with;-
The form of expr and arg-type interact to deduce the type of T and the type arg-type
If arg-type is a (non-forwarding) reference or a pointer;-
Function Decl. | Call | T | arg-type | Notes |
---|---|---|---|---|
void fn(T& a) | int b = 3; fn(b); | int | int& | |
void fn(T& a) | const int c = 3; fn(c); | const int | const int& | |
void fn(T& a) | const int& d = b; fn(d); | const int | const int& | |
void fn(T& a) | const int e[] = {1, 2, 3}; fn(e); | const int[3] | const int (&)[3] | #1 |
void fn(T& a) | void f(int, double); fn(f); | void (int, double) | void (&)(int, double) | #2 |
void fn(const T& a) | int b = 3; fn(b); | int | const int& | #3 |
void fn(const T& a) | const int c = 3; fn(c); | int | const int& | |
void fn(const T& a) | const int& d = b; fn(d); | int | const int& | |
void fn(T* a) | int b = 3; fn(&b); | int | int* | |
void fn(T* a) | const int* d = &b; fn(d); | const int | const int* | |
Note #1 Because fn() takes a reference, the array type passed to it does not decay to a pointer. This technique can be used (for example) to create a template that returns the number of elements in an array;-
Note #2 Because fn() takes a reference, the function type passed to it does not decay to a pointer
Note #3 The deduced type of T is always non-const even if a const value is supplied in the call. This is because the const-ness is taken care of, and guaranteed by the function declaration itself
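The array-size technique mentioned in Note #1 can be sketched as follows. The names array_size and primes are hypothetical; the standard library offers a comparable facility (std::size) from C++17;-

```cpp
#include <cstddef>

// The parameter is a reference to an array, so the array does not decay and
// both its element type T and its extent N can be deduced.
template<typename T, std::size_t N>
constexpr std::size_t array_size(T (&)[N]) noexcept { return N; }

constexpr int primes[] = {2, 3, 5, 7, 11};
static_assert(array_size(primes) == 5, "extent deduced from the array type");
```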
If arg-type is a forwarding reference;-
Function Decl. | Call | T | arg-type |
---|---|---|---|
void fn(T&& a) | int b = 3; fn(b); | int& | int& |
void fn(T&& a) | const int c = 3; fn(c); | const int& | const int& |
void fn(T&& a) | const int& d = b; fn(d); | const int& | const int& |
void fn(T&& a) | fn(83); | int | int&& |
Note b, c and d are all lvalues, but 83 is an rvalue. This distinction is maintained in the deduction of arg-type
If arg-type is neither a pointer nor a reference (ie, it is passed-by-value);-
Function Decl. | Call | T | arg-type | Notes |
---|---|---|---|---|
void fn(T a) | int b = 3; fn(b); | int | int | |
void fn(T a) | const int c = 3; fn(c); | int | int | |
void fn(T a) | const int& d = b; fn(d); | int | int | |
void fn(T a) | const int* const e = &b; fn(e); | const int* | const int* | #1 |
void fn(T a) | const int f[] = {1, 2, 3}; fn(f); | const int* | const int* | #2 |
void fn(T a) | void g(int, double); fn(g); | void (*)(int, double) | void (*)(int, double) | #3 |
Note #1 The actual pointer e is passed by value and therefore its const-ness can be discarded. However, the referenced target value is not copied or moved. Therefore, the target's const-ness is honoured
Note #2 An array passed by-value decays to a pointer
Note #3 A function passed by-value decays to a pointer
If expr is a uniform initialiser
Function Decl. | Call | Result |
---|---|---|
template<typename T> void fn(T a); | fn({1, 2, 3}); | Error: The type cannot be deduced |
template<typename T> void fn(std::initializer_list<T> a); | fn({1, 2, 3}); | Ok: T deduced to be an int |
The use of auto to indicate that a function's return type should be deduced, and the use of auto argument declarations in lambda expressions use the template type deduction rules, not the auto type deduction rules; ie, {…} does not automatically imply a std::initializer_list<> type
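The auto type deduction rules no longer deduce a std::initializer_list<> for direct-list-initialisation (auto a{…}), but the copy form (auto a = {…}) still does, so the distinction remains relevant for that form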
auto varname = expr; deduces an object's type from its initialiser. The type may be a variable type, a const or constexpr
auto becomes more useful the harder the actual type is to determine. For example;-
…is much easier to write (and read) than;-
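A sketch of the comparison, with hypothetical function names; arg follows the name used in the text. The first function uses auto, the second spells the iterator type out in full;-

```cpp
#include <cassert>
#include <vector>

// With auto: the iterator type is deduced from arg.begin().
int sum_auto(const std::vector<int>& arg) {
    int total = 0;
    for (auto it = arg.begin(); it != arg.end(); ++it) {
        total += *it;
    }
    return total;
}

// Without auto: the full iterator type must be written out, and edited by
// hand if the container type of arg ever changes.
int sum_explicit(const std::vector<int>& arg) {
    int total = 0;
    for (std::vector<int>::const_iterator it = arg.begin(); it != arg.end(); ++it) {
        total += *it;
    }
    return total;
}
```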
It is also more resilient to change; eg, changing arg to a list type would not break the above function
Example: auto can prevent problems resulting from lazy type specification. For example;-
The above will compile and will probably work as intended. But it may not. The type returned from a.size() is actually std::vector<int>::size_type. It is an unsigned integral, but depending on the platform, it may or may not be the same size as unsigned int; if it isn't, an implicit conversion must take place which could cause trouble. Using auto sz = a.size(); instead will ensure correct behaviour
Example: A very nasty but subtle problem that auto can avoid;-
This looks perfectly reasonable and will compile and run, but there is a problem. The type of the key for an unordered_map is const std::string, and not std::string. Therefore, n is the wrong type. As a result, the compiler will convert std::pair<const std::string, int> objects from the unordered_map container to std::pair<std::string, int> (the type of n). It can only achieve this by creating a temporary object on each iteration, binding n to that temporary, and then destroying the temporary at the end of each iteration; almost certainly not what was intended
This entire problem can be solved by using auto;-
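Both versions can be sketched as below. The function names are hypothetical and the loop bodies are kept trivial; the point of interest is only the declared type of n;-

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <unordered_map>
#include <utility>

// Wrong element type: the key should be const std::string, so a temporary
// pair is created (and the string copied) on every iteration.
int total_explicit(const std::unordered_map<std::string, int>& m) {
    int total = 0;
    for (const std::pair<std::string, int>& n : m) {
        total += n.second;
    }
    return total;
}

// auto deduces the real element type, so n binds directly to each element.
int total_auto(const std::unordered_map<std::string, int>& m) {
    int total = 0;
    for (const auto& n : m) {
        static_assert(std::is_same<decltype(n),
                          const std::pair<const std::string, int>&>::value,
                      "n refers to the stored pair, not a temporary");
        total += n.second;
    }
    return total;
}
```

Both functions return the same result; the difference lies in the hidden per-iteration copy made by the first.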
Assignment to an auto type implies type deduction only; there is no implicit type conversion performed. This can improve performance and reduce errors by preventing unintentional type conversions
Make extensive use of auto and prefer it to explicit types;-
Using auto with specific types
The following two expressions are (almost; see below) equivalent;-
However, the latter is the preferred style (called auto to stick). The reasons are that it still provides the advantages of using auto with no loss of control
Note that this preferred style implies a copy (actually, a move) operation; the expression initialises the object and then the object is moved to the new auto variable. In practice, this move is almost always optimised away by the compiler. This point is important in understanding the following situations where this preferred style cannot (or should not) be used;-
If the type is non-movable, then using auto will fail prior to C++17. From C++17, move-elision is guaranteed when declaring auto objects in this way; hence non-movable types can be used. See also Return Value Mechanics
If the type is expensive to move, then using auto could be (potentially) inefficient
The rules for auto type deduction are exactly as for function template type deduction (with one exception). With reference to the simple function template;-
…think of an auto expression as taking the form;-
Where auto takes the part of T from the template, and the deduced type of the variable a takes the part of arg-type
As with function template type deduction, there are three cases;-
If arg-type is a (non-forwarding) reference or pointer;-
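Given… | auto Decl. | Deduced Type Of a | Notes |
---|---|---|---|
int b = 0; const int c = 3; const int& d = b; | auto& a = b; | int& | |
int b = 0; const int c = 3; const int& d = b; | auto& a = c; | const int& | |
int b = 0; const int c = 3; const int& d = b; | auto& a = d; | const int& | |
int b = 0; const int c = 3; const int& d = b; | const auto& a = b; | const int& | |
int b = 0; const int c = 3; const int& d = b; | const auto& a = c; | const int& | |
int b = 0; const int c = 3; const int& d = b; | const auto& a = d; | const int& | |
int b = 0; int* c = &b; const int* d = &b; const int* const e = &b; | auto* a = c; | int* | #1 |
int b = 0; int* c = &b; const int* d = &b; const int* const e = &b; | auto* a = d; | const int* | |
int b = 0; int* c = &b; const int* d = &b; const int* const e = &b; | auto* a = e; | const int* | |
const int b[] = {1, 2, 3}; | auto a = b; | const int* | |
const int b[] = {1, 2, 3}; | auto& a = b; | const int (&)[3] | |
void f(int, double); | auto a = f; | void (*)(int, double) | |
void f(int, double); | auto& a = f; | void (&)(int, double) | |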
Note #1 The '*' is optional because it can be deduced
If arg-type is a forwarding reference;-
Given… | auto Decl. | Deduced Type Of a | Notes |
---|---|---|---|
int b = 0; | auto&& a = b; | int& | b is an int lvalue |
const int b = 0; | auto&& a = b; | const int& | b is a const int lvalue |
(none) | auto&& a = 22; | int&& | 22 is an int rvalue |
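If arg-type is neither a pointer nor a reference;-
Given… | auto Decl. | Deduced Type Of a | Notes |
---|---|---|---|
int b = 0; const int c = 3; const int& d = b; | auto a = b; | int | |
int b = 0; const int c = 3; const int& d = b; | const auto a = c; | const int | |
int b = 0; const int c = 3; const int& d = b; | auto a = c; | int | |
int b = 0; const int c = 3; const int& d = b; | auto a = d; | int | #1 |
int b = 0; const int c = 3; const int& d = b; | decltype(auto) a = d; | const int& | #2 |
(none) | auto a = 42; | int | |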
Note #1 Deduced type is int and not const int&. This is because of the rules concerning stripping of references from the initialiser before deducing the type
Note #2 The decltype preserves the reference from the initialiser
The one difference between template and auto type deduction comes about from the use of uniform initialisers (which get interpreted as std::initializer_list constructs);-
Non-auto case | auto equivalent | Deduced type of auto case |
---|---|---|
int b = 42; | auto a = 42; | int |
int b(42); | auto a(42); | int |
int b = {42}; | auto a = {42}; | std::initializer_list<int> containing the single int element of value 42 |
int b{42}; | auto a{42}; | std::initializer_list<int> containing the single int element of value 42 |
std::initializer_list<int> b = {42, 83, 11}; | auto a = {42, 83, 11}; | std::initializer_list<int> containing the three int elements of values 42, 83, 11 |
std::initializer_list<int> b{42, 83, 11}; | auto a{42, 83, 11}; | std::initializer_list<int> containing the three int elements of values 42, 83, 11 |
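Under the revised rules adopted for C++17, auto deduces a std::initializer_list type only for copy-list-initialisation (the = {…} form). With direct-list-initialisation (no =), a single element deduces to the type of that element and multiple elements are an error. The above examples therefore give the following results;-
Non-auto case | auto equivalent | Deduced type of auto case |
---|---|---|
int b = 42; | auto a = 42; | int |
int b(42); | auto a(42); | int |
int b = {42}; | auto a = {42}; | std::initializer_list<int> containing the single int element of value 42 |
int b{42}; | auto a{42}; | int |
std::initializer_list<int> b = {42, 83, 11}; | auto a = {42, 83, 11}; | std::initializer_list<int> containing the three int elements of values 42, 83, 11 |
std::initializer_list<int> b{42, 83, 11}; | auto a{42, 83, 11}; | Error |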
auto sometimes deduces a type other than one expects. A common example is when proxy classes are being used; classes that emulate some other type. For example;-
The above problem comes about because vector defines a specialisation for bool; packing single bit bool values into words. As a result, vector<bool>::operator[]() returns a proxy class to hide this fact and to provide a clean interface to the resulting expression, with the primary intention of making vector<bool>::operator[]() look like it returns a T& in the same way as the general vector<T>::operator[]() operation does
Such a proxy class is intended to be transparent and usually only used as an rvalue. In the above case, the type std::vector<bool>::reference probably contains an internal pointer which, if a were an rvalue (say, returned from a function), would be left dangling when assigned to c. Other proxy classes include 'smart' pointers std::unique_ptr and std::shared_ptr (though these are designed to be more visible), std::bitset (similar issue to vector<bool>), and many others.
In cases such as these, options include:-
There are three types of lambda capture;-
By-value This uses function template by-value argument type deduction rules except that const and volatile are retained. For example;-
Within the body of the lambda, the captured a value shall be const
This is the only case in C++ where by-value assignment retains const and volatile
Init Capture This uses the auto type deduction rules. Expanding on the previous example;-
Within the body of the lambda, the captured a value shall not be const
Normally, the difference between the way by-value and init-capture handle const (and volatile) is not important because the object that a lambda creates is const anyway. However, if the lambda is made mutable then the difference does become apparent
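The difference can be sketched with a mutable lambda, assuming C++14 for the init capture. The function name bump_copy() is hypothetical; a follows the name used in the text;-

```cpp
#include <cassert>

int bump_copy() {
    const int a = 5;

    // By-value capture: the captured copy keeps the const of 'a', so this
    // would not compile even though the lambda is mutable:
    //     auto f = [a]() mutable { return ++a; };

    // Init capture: 'b' is deduced with the auto rules, which drop the
    // top-level const, so the copy can be modified in a mutable lambda.
    auto g = [b = a]() mutable { return ++b; };
    return g();
}
```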
decltype(expr) deduces a type from an expression; that is, the declared type
decltype() is most useful in generic programming. The following example defines a function that adds two matrices with possibly different element types. The problem that decltype() resolves here is in determining an appropriate result type;-
See also Functions for an example of defining a function's return type in terms of some other expression
Given two functions such as;-
…we could write wrapper functions;-
…or we can automate the process by using decltype(auto);-
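A possible sketch of the decltype(auto) approach, assuming C++14. The functions make_label() and shared_label() and the wrapper call() are hypothetical stand-ins; one returns by value and the other by reference;-

```cpp
#include <string>
#include <type_traits>
#include <utility>

std::string make_label() { return "label"; }   // returns by value

std::string& shared_label() {                  // returns by reference
    static std::string s = "shared";
    return s;
}

// One wrapper serves both: decltype(auto) reproduces the exact return type
// of the wrapped call, including any reference.
template<typename F>
decltype(auto) call(F&& f) { return std::forward<F>(f)(); }

static_assert(std::is_same<decltype(call(make_label)), std::string>::value,
              "by-value return is preserved");
static_assert(std::is_same<decltype(call(shared_label)), std::string&>::value,
              "by-reference return is preserved");
```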
Given | decltype Expression | Yields Deduced Type |
---|---|---|
int a = 0; | decltype(a) | int |
const int b = 0; | decltype(b) | const int |
const int& c = b; | decltype(c) | const int& |
X d; char fn(const X& e); | decltype(d) | X |
X d; char fn(const X& e); | decltype(fn) | char(const X&) |
X d; char fn(const X& e); | decltype(fn(d)) | char |
Applying decltype to an lvalue expression more complex than just a name causes the type to be deduced in the same way as for a name, but then an lvalue reference is added to the result
This distinction usually doesn't matter because an lvalue expression is almost always by-reference already; for example, given int x[50];, the expression x[42] actually behaves as if it is of type int& rather than just int, and this is supported by the fact that an operator[] function usually returns a reference. If this were not so, it would not be possible to assign to such an expression; x[42] = 3; would be illegal
Therefore, the addition of the reference by decltype() fits the general use of most expressions and function calls
There is a case where this implicit reference behaviour can cause a problem. Given int a = 0;, the expression decltype(a) shall yield a type of int. However, decltype((a)) is taking an expression of '(a)' which is more complex than just a name and so yields a type of int&
This rarely matters for C++11, but for decltype(auto) return type deduction (C++14), it can cause real problems;-
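A sketch of the effect, assuming C++14 and using hypothetical names. A namespace-scope variable is used here so that the example itself is well defined; with a local variable, the second function would return a dangling reference;-

```cpp
#include <type_traits>

int counter = 0;   // hypothetical namespace-scope variable

// Plain name: decltype(counter) is int, so this returns by value.
decltype(auto) by_name() { return counter; }

// Parenthesised: decltype((counter)) is int&, so this returns a reference.
decltype(auto) by_paren() { return (counter); }

static_assert(std::is_same<decltype(by_name()), int>::value, "returns int");
static_assert(std::is_same<decltype(by_paren()), int&>::value, "returns int&");
```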
The most common use of decltype is probably in declaring the return type of a function template, based on the type of a supplied argument when such a return type cannot be easily derived. For example, given a container of type T, what is the type of the objects it contains?;-
See function return type deduction
Type | Family | Implied Type |
---|---|---|
n | Integral | int sign extended |
nL nl | Integral | long sign extended |
nLL nll | Integral | long long sign extended |
nU nu | Integral | unsigned |
nUL nul nLU nlu | Integral | unsigned long |
nULL null nLLU nllu | Integral | unsigned long long |
0Bb 0bb | Integral | int binary notation |
0o | Integral | int octal notation |
0Xh 0xh | Integral | int hexadecimal notation |
n. n.n .n nEn n.En n.nEn .nEn nE+n n.E+n n.nE+n .nE+n nE-n n.E-n n.nE-n .nE-n | Floating Point | double |
0xhPn 0xh.Pn 0xh.hPn 0x.hPn 0xhP+n 0xh.P+n 0xh.hP+n 0x.hP+n 0xhP-n 0xh.P-n 0xh.hP-n 0x.hP-n | Floating Point | double |
n.nL n.nl (etc. for other forms) | Floating Point | long double |
n.nF n.nf (etc. for other forms) | Floating Point | float |
true / false | Boolean | bool |
'c' | Character | char (int in 'C') |
"c…" | String | const char[n] where 'n' is the length of the string + 1 (for the NULL terminator) |
nullptr | Pointer | std::nullptr_t |
For the hexadecimal floating point format, 0x is case insensitive
The mantissa is specified as one or more hexadecimal digits. The exponent must always be specified and is expressed in decimal. The exponent represents an integer power of 2. So, 0x123.4p5 is interpreted as the hexadecimal fraction 123.4 (291.25 decimal) scaled by 2 to the power 5 (32) = 9320.0
The L/l (long double) and F/f (float) suffixes may also be used
This format is (clearly!) not really useful generally; it is mostly of use when specific bit patterns are required in specialist cases
The digit separator ' (single quote) may be used to make literals easier to read. For example;-
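A few illustrative literals, assuming C++14 or later. The constant names and values are hypothetical; the separators have no effect on the value;-

```cpp
constexpr long long large_count = 7'800'000'000;   // decimal, grouped in thousands
constexpr unsigned bit_pattern  = 0b1010'0101;     // binary, grouped in nibbles
constexpr unsigned word_mask    = 0xFF'FF;         // hexadecimal, grouped in bytes

static_assert(large_count == 7800000000, "separators do not change the value");
static_assert(bit_pattern == 0xA5, "same value written in hexadecimal");
```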
The following 'escape' (\) codes may be used in characters or strings
Name | ASCII Name | C++ Name | |
---|---|---|---|
Newline | NL (LF) | \n | |
Horizontal tab | HT | \t | |
Vertical tab | VT | \v | |
Backspace | BS | \b | |
Carriage return | CR | \r | |
Form feed | FF | \f | |
Alert | BEL or alert | \a | Emits a sound on some consoles |
Backslash | \ | \\ | |
Question mark | ? | \? | Can be useful for avoiding confusion with trigraph characters |
Single quote | ' | \' | |
Double quote | " | \" | |
Octal number | ooo | \ooo | One to three octal digits. There is no decimal escape form; \123 is read as octal |
Hexadecimal number | hh | \xhh | One or more hexadecimal digits |
Type | Representation | Description |
---|---|---|
char | 'c' | Plain char. Almost always ASCII |
wchar_t | L'c' | Wide character. Implementation-specific |
char | u8'c' | UTF-8 character |
char16_t | u'c' u'\uhhhh' u'\xhhhh' | UTF-16 character. \uhhhh expands to the 0000hhhh (16 bit Hex) Unicode Code-Point |
char32_t | U'c' U'\Uhhhhhhhh' U'\uhhhh' U'\xhhhhhhhh' | UTF-32 character. 32 bit Hex Unicode Code-Point (\U takes exactly eight hexadecimal digits, \u exactly four) |
char string | "c…" | Defines const char[n] where 'n' is the length of the string + 1 (for the NULL terminator) |
Raw char string | R"(c…)" | A char string but the normal escaped '\' characters are not interpreted. Useful for creating regex |
wchar_t string | L"c…" | Wide character string. Terminated with L'\0' |
Raw wchar_t string | LR"(c…)" | Raw wide character string. Terminated with L'\0' |
UTF-8 char string | u8"c…" | UTF-8 string. Terminated with '\0' |
Raw UTF-8 char string | u8R"(c…)" | Raw UTF-8 string. Terminated with '\0' |
UTF-16 char16_t string | u"c…" | UTF-16 string. Terminated with u'\0' |
Raw UTF-16 char16_t string | uR"(c…)" | Raw UTF-16 string. Terminated with u'\0' |
UTF-32 char32_t string | U"c…" | UTF-32 string. Terminated with U'\0' |
Raw UTF-32 char32_t string | UR"(c…)" | Raw UTF-32 string. Terminated with U'\0' |
A Raw string may include "real" newline characters. eg, the following two strings are equal;-
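A minimal sketch of two such equal strings; the names escaped, raw and check_raw_string are illustrative only.

```cpp
#include <cassert>
#include <cstring>

// An ordinary literal needs the \n escape to embed a newline...
const char* escaped = "Line one\nLine two";

// ...whereas a raw literal may contain the newline itself
const char* raw = R"(Line one
Line two)";

void check_raw_string()
{
    assert(std::strcmp(escaped, raw) == 0);   // identical contents
}
```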
Example | Description |
---|---|
const char* p = "Hello"; | Assign a string literal to a pointer to const |
const char16_t* p = uR"(Hello)"; | Assign a (raw) UTF-16 string literal to a const pointer |
char* p = "Hello"; | Illegal; the pointer is not a pointer to const |
char p[] = {"Hello"}; | Assign to an array. Receiving array does not have to be const and (in this case) is automatically sized to that of the string + 1 (for the NULL terminator); ie, 6. Note that the {} are optional |
char p[] = {"Goodbye " "Cruel " "World"}; | A string literal may be composed of one or more sub-strings (automatically) concatenated together; useful for specifying very long strings. In this example, the resulting string shall be 19 characters + 1 (for the NULL terminator) |
The namespace std::literals::string_literals defines operators for defining std::string literals;-
Example | Description |
---|---|
auto s = "Hello"s; | A std::string literal |
auto s = L"Hello"s; | A std::wstring literal |
auto s = u"Hello"s; | A std::u16string literal |
auto s = U"Hello"s; | A std::u32string literal |
The following compile-time operators are available to determine the physical size and alignment of an object;-
Operator | Description |
---|---|
std::size_t sizeof expression | Returns the number of bytes occupied by the physical object representation resulting from expression. No implicit type conversions are performed |
std::size_t sizeof(type) | Where type cannot be a function type, an incomplete type, void, or a bitfield. The returned value is the number of bytes occupied by the physical representation of the type |
std::size_t alignof(type) | The returned value is the alignment requirement of type, in bytes. If the type is a reference then the referenced type is assessed. If the type is an array then the array element type is assessed. See also max_align_t |
alignas(type) | Allows aligning of a variable to be the same as some other type. For example alignas(X) int data[42]; will set the alignment of the array data to be the same as that of type X |
std::size_t offsetof(type, member) | The returned value is the number of bytes member is displaced from the beginning of type (which must be a struct or class type). Padding bytes within the type will, naturally, affect the result. If the type is not a standard layout type then the result is undefined; from C++17, support for such types is optional (conditionally supported by the implementation). Example, given;-
struct X
{
char m1;
int m2;
};
…offsetof(X, m1) will return 0 (zero), and (depending on the types' sizes and alignment requirements), offsetof(X, m2) will (probably) return 4 or 8 |
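The operators above can be exercised at compile time. This sketch reuses the struct X from the table and checks only properties that hold on any implementation; the exact values of sizeof(X) and offsetof(X, m2) remain platform dependent.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct X
{
    char m1;
    int  m2;
};

// The first member of a standard layout struct is always at offset zero
static_assert(offsetof(X, m1) == 0, "m1 is at the start of X");
// m2 must sit at a multiple of its own alignment requirement
static_assert(offsetof(X, m2) % alignof(int) == 0, "m2 is suitably aligned");
// The struct is at least as large as its members, and a multiple of its alignment
static_assert(sizeof(X) >= sizeof(char) + sizeof(int), "padding can only add bytes");
static_assert(sizeof(X) % alignof(X) == 0, "size is a multiple of alignment");

// alignas: this array is aligned as strictly as X requires
alignas(X) int data[42];
```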
Every expression is either an lvalue or an rvalue, but not both
Type | Description |
---|---|
object | A contiguous region of memory/storage (this is a low level definition and not to be confused with class objects) |
lvalue | A name or expression that refers to an 'object'. An lvalue has an identity and cannot be moved. Generally, an lvalue is a name for which the address may be taken |
rvalue | A "value that is not an lvalue"; often (though not always; see below) a temporary object created as an artefact of an expression or returned from a function. An rvalue can be moved/copied (assigned to an lvalue). It is not possible to take the address of an rvalue |
glvalue, xvalue, prvalue | For completeness only; a glvalue (generalised lvalue) is any expression that has identity (an lvalue or an xvalue), an xvalue (extraordinary, or 'expiring') has identity and can be moved, and a prvalue (pure rvalue) has no identity but can be moved |
The value class (that is, lvalue, rvalue, etc) is a property of an expression; not of an object
Do not assume that all temporary objects are rvalues, or assume some other similar relationship; the two concepts are entirely separate
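One way to observe the value class of an expression is decltype((expr)), which yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue. The sketch below uses two hypothetical names, value and make.

```cpp
#include <type_traits>
#include <utility>

int value = 0;
int make() { return 42; }

// A name is an lvalue
static_assert(std::is_same<decltype((value)), int&>::value, "lvalue");
// std::move produces an xvalue: it has identity and can be moved
static_assert(std::is_same<decltype((std::move(value))), int&&>::value, "xvalue");
// A call that returns by value is a prvalue
static_assert(std::is_same<decltype((make())), int>::value, "prvalue");
```

Note that the same object (value) appears in both an lvalue and an xvalue expression, which illustrates that the value class belongs to the expression and not to the object.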
Type | Initialisation and Lifetime |
---|---|
automatic | (applicable only in function scope) Initialised on each encounter (if at all). Usually stack-based. Exist from definition to end of scope. The term 'automatic' is a legacy term and has now largely fallen out of use. It is not to be confused with the auto type |
static | When used within a function scope, initialised once only at definition and exists until end of program execution |
Free store | Exist from the time of an explicit new operation until destroyed with delete |
temporary object | Intermediate result in computation, or an object for a const reference. The lifetime depends on context. These are almost always stack-based objects |
thread-local object | An object declared thread_local. Lifetime is that of the host thread |
Storage Class | Effect on variable |
---|---|
extern type var | Indicates that var is defined elsewhere (possibly in some other compilation unit), and this is only a declaration. A variable declaration that includes an initial value is a definition regardless of the use of extern |
static type var |
If in file (global) scope, static prevents the variable from being accessed from an external file; ie, it creates an internal linkage. If the static variable is part of a class, it is common (shared) between all instances of that class. If in function scope, a static variable retains its value between function invocations. Such a variable will be initialised once only; on the first invocation of the function. Initialisation is undefined if performed recursively. For example;-
int fn(int a)
{
static int n = a; // Ok
static int m = fn(a + 1); // Undefined; fn is re-entered while m is still being initialised
return n + m;
}
|
register type var | Hint to compiler to give speed priority to variable. Cannot be referred to by a pointer. Often ignored by modern compilers. register is deprecated in C++11 and removed in C++17, though the keyword remains reserved |
volatile type var | All reads/writes from/to var shall be honoured (not optimised away) |
automatic | Variable is allocated on the stack and is destroyed at the end of the containing block's scope. Implicit when variable is defined within a function. Before C++11, the auto keyword could be used to explicitly identify an automatic variable; that meaning was removed in C++11, where auto now requests type deduction instead |
global | Implicit when variable is allocated at file scope, outside of any function |
thread_local type var | Indicates that each thread is allocated its own copy of var |
The volatile modifier may be applied to any object declaration. Some examples;-
The purpose of volatile is as an instruction to the compiler not to optimise-away what may seem like redundant read/write operations. For example;-
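A minimal sketch of such a case, assuming a hypothetical function pulse() that writes to a location (perhaps a memory-mapped hardware register) through a volatile pointer p.

```cpp
#include <cassert>

// Both stores must be performed, in order, because *p is volatile
void pulse(volatile int* p)
{
    *p = 0;   // appears redundant: it is overwritten on the next line
    *p = 1;
}

void check_pulse()
{
    int port = 5;        // stand-in for a hardware register
    pulse(&port);
    assert(port == 1);   // the final store is the visible one
}
```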
In the above example, had p not been declared volatile then the compiler would likely optimise-away the line *p = 0; because it would "know" that it was redundant. The volatile modifier prevents such optimisation
All variables and functions must be declared or defined before they are referenced
Declaration | Description |
---|---|
char c; | A variable of type char. Implicitly initialised to zero if in global scope, or static |
char c, d, e; | Three variables of type char. Implicitly initialised to zero if in global scope, or static |
char c = 'B'; | A char, assigned a value of 'B' |
int i = 123; | An int, assigned a value of 123 |
int i = j; | An int, initialised with the variable 'j'; the type of 'j' may or may not be the same; if it is not the same then some implicit conversion is required/expected |
double f {3.14}; | A double, assigned a value of 3.14 |
const char* a = "Hello"; | A char pointer, assigned a (const) string value of "Hello" |
const char* a[] = {"yes", "no"}; | An unsized array of char pointers, each assigned a separate (const) string value. The array is unsized; its actual size shall be determined by the number of values assigned |
auto a = 123; | A variable whose type is deduced from its initialiser, assigned a value of 123. The actual type shall be selected at compile time (in this example, it will be of type int) |
X y {123, "Hello", true}; | A user-defined type X, initialised with three values; an Integral, a const string, and a bool |
int stop() {…} | A function definition. The function takes no parameters and returns an int |
bool go(const colour_t colour); | A function declaration. The function takes a single parameter of type colour_t and returns a bool |
using T = int; | An alias for type int |
There is rarely a need to introduce a new variable before there is a value available for it; reducing variable scope and delaying name introduction within a scope can help minimise errors and reduce namespace pollution
This is essentially the RAII (Resource Acquisition Is Initialisation) principle; this improves performance as no default initialiser is called prior to assigning a 'real' value. The most common reason for NOT initialising a value to something other than its default at definition is that it needs to be passed to a function to initialise it
If a variable is required within a loop, such as;-
Whether it is best for the variable to be (1) declared once before the loop and then assigned a new value for each iteration, or whether it is best to be (2) declared within the loop and constructed/destroyed on each iteration, depends largely on the relative cost of assignment vs construction/destruction. The advantage of the second approach is that it narrows the scope of the variable, but if construction/destruction is relatively expensive then the first approach may be preferred
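The two approaches can be sketched as follows, using std::string as a type for which construction is not free. The function names are illustrative only, and which version is faster in practice would need to be measured.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// (1) Declared once before the loop: one construction, then an assignment per iteration
std::size_t total_length_outside(const std::vector<const char*>& words)
{
    std::size_t total = 0;
    std::string s;
    for (const char* w : words) {
        s = w;                  // may reuse storage that s already owns
        total += s.size();
    }
    return total;
}

// (2) Declared within the loop: narrower scope, constructed and destroyed per iteration
std::size_t total_length_inside(const std::vector<const char*>& words)
{
    std::size_t total = 0;
    for (const char* w : words) {
        std::string s{w};
        total += s.size();
    }
    return total;
}

void check_loops()
{
    assert(total_length_outside({"ab", "cde"}) == total_length_inside({"ab", "cde"}));
}
```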
One very neat example of employing the principles of limiting a variable's scope and delaying declaration until we have a value to initialise it with is to place the definition within an if statement;-
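A minimal sketch of a declaration inside an if condition. The helper first_negative() and its caller are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: returns a pointer to the first negative value, or nullptr
const int* first_negative(const int* first, const int* last)
{
    for (; first != last; ++first)
        if (*first < 0) return first;
    return nullptr;
}

int first_negative_or_zero(const int* first, const int* last)
{
    // p is declared, initialised and tested in one place, and is in scope
    // only within the if statement (including any else branch)
    if (const int* p = first_negative(first, last))
        return *p;
    return 0;
}

void check_if_declaration()
{
    int with[] = {3, -7, 2};
    assert(first_negative_or_zero(with, with + 3) == -7);
}
```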
Only a single declaration may be made in this way, and it must be initialised (which is implied by the fact that the if statement operates on an expression)
A similar technique may be used with for
A variable may be defined as inline. For example;-
An inline variable may be defined multiple times in multiple translation units (but only once per translation unit). This allows the variable to be defined in a header file
If different definitions exist, spread across different translation units then the program may still link, but the result is undefined
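A short sketch of inline variables as they might appear in a header; the names instance_count and Tracker are hypothetical.

```cpp
#include <cassert>

// Imagine these definitions live in a header included by several translation units

inline int instance_count = 0;     // one shared object across the whole program

struct Tracker
{
    static inline int live = 0;    // inline static member: no out-of-class definition needed
    Tracker()  { ++live; ++instance_count; }
    ~Tracker() { --live; }
};

void check_tracker()
{
    {
        Tracker a, b;
        assert(Tracker::live == 2);
    }
    assert(Tracker::live == 0);    // both destroyed at the end of the block
}
```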
An object may be variable (mutable) or constant (immutable). The former is the default. The latter can be achieved with the const specifier. The general form is;-
Some examples;-
Example | Description |
---|---|
const int life = 42; | A constant int |
const char* p = "Hello"; | A pointer to a constant |
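char* const p = buffer; | A constant pointer (here to a non-constant char; buffer is assumed to be a char array, as a string literal may not be assigned to a char*) |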
const char* const p = "Hello"; | A constant pointer to a constant |
A type alias generates a new name for an existing type. The new and old names are interchangeable; they are not distinctly different types. Aliases are useful to insulate the code from the underlying type details and allow for simpler modification. Not to be confused with object/variable aliasing
There are two methods of creating a type alias;-
Syntax | Details |
---|---|
typedef existing_type new_type; | 'C'-style |
using new_type = existing_type; | 'C++'-style |
Every container class in the standard library defines value_type which is a synonym for the type it has been instantiated with. For example;-
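A brief sketch of value_type in use. The function template sum() is a hypothetical example of generic code that names the element type without knowing the container.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

static_assert(std::is_same<std::vector<int>::value_type, int>::value, "element type");
// For a map, value_type is the key/value pair rather than the mapped type
static_assert(std::is_same<std::map<std::string, int>::value_type,
                           std::pair<const std::string, int>>::value, "pair type");

template<typename Container>
typename Container::value_type sum(const Container& c)
{
    typename Container::value_type total{};   // value initialised (zero for arithmetic types)
    for (const auto& v : c) total += v;
    return total;
}

void check_sum()
{
    assert(sum(std::vector<int>{1, 2, 3}) == 6);
}
```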
Note the difference in syntax when defining an alias for a function pointer;-
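The difference can be sketched as follows; add(), apply() and the alias names are illustrative only.

```cpp
#include <cassert>
#include <type_traits>

int add(int a, int b) { return a + b; }

typedef int (*binary_fn_t)(int, int);      // 'C'-style: the new name sits inside the declarator
using   binary_fn_u = int (*)(int, int);   // 'C++'-style: the new name is on the left

// Both aliases name exactly the same type
static_assert(std::is_same<binary_fn_t, binary_fn_u>::value, "interchangeable");

int apply(binary_fn_u fn, int a, int b) { return fn(a, b); }

void check_apply()
{
    binary_fn_t f = add;
    assert(apply(f, 2, 3) == 5);
}
```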
Structured bindings provide a method of decomposing a complex object into individual variables. The syntax is as follows;-
Note: The process of constructing the variables defined by identifier-list starts with the creation of a temporary object to hold the initialising values. It is to this temporary object that the [[attributes]] const volatile applies, and not to the identifier-list variables
Array binding
The number of names specified by identifier-list must match the size of the array. Each name is then bound to each array element in-turn
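A minimal sketch of array binding, showing the difference between binding to a copy and binding by reference. The function name is illustrative.

```cpp
#include <cassert>

int bind_array()
{
    int values[3] = {7, 11, 13};

    auto  [a, b, c] = values;   // names are bound to a copy of the array
    auto& [x, y, z] = values;   // names are bound to the array itself

    x = 1;                      // modifies values[0]; the copy (a) is unaffected

    return a + b + c + x + y + z;   // (7 + 11 + 13) + (1 + 11 + 13)
}

void check_bind_array()
{
    assert(bind_array() == 56);
}
```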
struct and class binding
The struct/class must either contain only public non-static members, or no non-static members at all, but one or more bases, only one of which (which must be public) may contain non-static members, all of which must be public
The struct/class and/or its bases may contain static members
The number of names specified by identifier-list must match the number of non-static members of the struct/class. Each name is then bound to each member in-turn (in the order they are defined in the struct/class)
Expanding on this example, showing a different (and common) context;-
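A sketch of struct binding using a hypothetical type Point, followed by a common context: iterating over a std::map. Note that the map's elements are std::pair objects, which are strictly bound via the tuple mechanism rather than as a struct, though the syntax is identical.

```cpp
#include <cassert>
#include <map>
#include <string>

struct Point { int x; int y; };   // only public non-static members

int manhattan(const Point& p)
{
    const auto& [x, y] = p;       // names are bound in member declaration order
    return (x < 0 ? -x : x) + (y < 0 ? -y : y);
}

int total_stock(const std::map<std::string, int>& stock)
{
    int total = 0;
    for (const auto& [name, quantity] : stock) {
        (void)name;               // not needed in this example
        total += quantity;
    }
    return total;
}

void check_struct_binding()
{
    assert(manhattan(Point{3, -4}) == 7);
    assert(total_stock({{"bolt", 2}, {"nut", 5}}) == 7);
}
```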
If a class/struct does not meet the criteria for structured binding, it can be modified to make it so. See tuple binding for details of how this works. For example;-
To address this, first create a specialisation of a tuple_size for type Z. There are 3 members we wish to consider;-
Next, add a get() function to Z. This could be added as a member function (as shown here) or as a friend function;-
Lastly, create a specialisation of a tuple_element for type Z. This uses the get() function already defined. This specialisation is independent of the number or type of members of Z;-
An easier to understand (but less maintainable) version of the above would be;-
With all the above in place, it is now possible to use structured binding on type Z;-
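The three steps can be put together as in the sketch below. The members of Z are assumptions (the text states only that there are three), and get() is written here as a member function that returns by value.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>

class Z   // private members, so not directly eligible for struct binding
{
public:
    Z(int i, double d, std::string s) : i_{i}, d_{d}, s_{std::move(s)} {}

    template<std::size_t N>
    auto get() const
    {
        if constexpr (N == 0)      return i_;
        else if constexpr (N == 1) return d_;
        else                       return s_;
    }

private:
    int         i_;
    double      d_;
    std::string s_;
};

namespace std
{
    template<>
    struct tuple_size<Z> : integral_constant<size_t, 3> {};

    // Independent of the number or type of members: the type is taken from get<N>()
    template<size_t N>
    struct tuple_element<N, Z>
    {
        using type = decltype(declval<Z>().template get<N>());
    };
}

std::string describe(const Z& z)
{
    auto [i, d, s] = z;     // works now that Z follows the tuple protocol
    (void)d;
    return s + std::to_string(i);
}

void check_describe()
{
    assert(describe(Z{42, 8.3, "Hello"}) == "Hello42");
}
```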
tuple binding
The number of names specified by identifier-list must match the size of the tuple. Each name is then bound to each tuple element in-turn (in the order they are defined in the tuple)
Further example;-
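A short sketch of tuple binding applied to a function's return value; fetch() and status_code() are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <tuple>

std::tuple<int, std::string, bool> fetch()
{
    return std::make_tuple(404, std::string{"Not Found"}, false);
}

int status_code()
{
    auto [code, text, ok] = fetch();   // one name per tuple element, in order
    (void)text;
    return ok ? 0 : code;
}

void check_status_code()
{
    assert(status_code() == 404);
}
```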
The initialising value for tuple element N is determined as follows (where I is the initialising object);-
C++ initialisation is notoriously complex; there are several forms, each with its own set of rules and caveats, and these are detailed below
However, in most normal use cases, the fine detail can be largely ignored (or at least not worried about too much). The point being, don't get bogged-down in the detail; in many cases, as long as the code is sensible, initialisation will behave sensibly. Mostly… maybe
In the following text, the term initialisation is synonymous with construction
If an object of type T is declared static or thread_local then it is zero initialised
If an object of type T is defined with no explicit initialiser at all, for example T a; then default initialisation takes place
If an object of type T is defined with an explicit empty initialiser, for example T a{};, T a = {};, or T a(); (though this last form will be parsed as a function declaration in most cases) then value initialisation takes place
This occurs when T is a (possibly const/volatile) type and it defines a non-default constructor that may be called with the arguments specified by the initialisation T a(arguments);
This behaves like a function call (to a constructor). Both explicit and non-explicit constructors are considered and overload resolution is employed to identify the best candidate, and (if required) implicit conversion is used to modify the argument(s) to match the constructor (unless the constructor is explicit)
Here describes the effects of direct initialisation on particular types
This occurs when T is a (possibly const/volatile) type and it defines a non-default constructor that may be called with the arguments specified by the initialisation T a = initialiser;
This is an initialisation and conversion operation. If the initialiser does not exactly match the object's type, then methods shall be sought to convert the initialiser to the correct type and overload resolution shall be employed to determine if the initialiser can be used to construct the object
Given the initialiser X a = b;, b is converted to a type X and the result passed to a copy constructor of X (or if b is an rvalue, possibly a move constructor). Both explicit and non-explicit constructors are considered for overload resolution purposes, but only non-explicit versions may be selected. In practice, the copy/move is usually optimised-away by the compiler. The effect of this behaviour when compared with direct initialisation can be seen here
The optimisation that can elide the call to the constructor is mandatory, even if invoking such a constructor would have side-effects. As a result, explicit constructors may be selected. Technically, the rule regarding not selecting explicit constructors still applies, but as the constructor is guaranteed to never be called, the rule never comes into play. See also Return Value Mechanics
This occurs when T is a (possibly const/volatile) type and it defines a non-default constructor that may be called with the arguments specified by the initialisation T a{arguments};
If the type defines one or more constructors, then this forms Direct List Initialisation. This behaves like a function call (which is what it is; a function call to a constructor). Both explicit and non-explicit constructors are considered and overload resolution is employed to identify the best candidate, and (if required) implicit conversion is used to modify the argument(s) to match the constructor (unless the constructor is explicit). If an initializer_list constructor is defined then this will usually be selected in preference to other constructors
Here describes the effects of direct list initialisation on particular types
This occurs when T is a (possibly const/volatile) type and it defines a non-default constructor that may be called with the arguments specified by the initialisation T a = {arguments};
This behaves similarly to direct list initialisation except that explicit constructors are considered for overload resolution purposes but never used; selection of an explicit constructor is an error
An aggregate type is eligible for aggregate initialisation
For example, the declaration;-
…can be aggregate initialised with;-
This will result in m1 == 42, m2 == 83, m3 == 3.14
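A sketch of a declaration and initialisation that would give this result. The member types of X are assumptions inferred from the values quoted.

```cpp
struct X          // an aggregate: no constructors, no private members, no virtual functions
{
    int    m1;
    int    m2;
    double m3;
};

constexpr X a{42, 83, 3.14};   // members are initialised in declaration order
constexpr X b{42};             // remaining members are value initialised (zero)

static_assert(a.m1 == 42 && a.m2 == 83 && a.m3 == 3.14, "all three supplied");
static_assert(b.m1 == 42 && b.m2 == 0 && b.m3 == 0.0, "trailing members are zero");
```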
In C++11, aggregate initialisation is not possible if any members of the type have default initialisers; this restriction was removed in C++14
Given the following type with some default initialisers defined;-
…an aggregate initialisation of X b{42, 83} shall result in m1 == 42, m2 == 83, m3 == 8.3. A member such as m4 that has neither an explicit nor a default initialiser is initialised from an empty initialiser list; for a built-in type this means it is set to zero
An aggregate class type may contain aggregate and non-aggregate members. Aggregate initialisation of such a type is performed as follows;-
An aggregate class type may have one or more bases (which may or may not also be aggregate types). Aggregate initialisation of such a type is performed as follows;-
An aggregate may have an empty base. Such a type is initialised as follows;-
An example of how the declaration (or not) of user-provided constructors can affect initialisation;-
The reason for the above result is as follows;-
The way to avoid the above problem is;-
The plain copy initialisation form, auto var = init_val, may be preferred when the type is auto. This is because, under the original C++11 rules, the form '{init_val}' or '{init_val, …}' always results in a deduced type of std::initializer_list<T> rather than just 'T' which is often not the intended result
From C++17, auto var{init_val} deduces 'T' and auto var{init_val, …} is an error, so this is no longer an issue for direct list initialisation. The copy form auto var = {init_val} still deduces std::initializer_list<T>
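The C++17 deduction rules for auto with braces can be checked with the sketch below. Current compilers generally apply the same rules in their C++11 and C++14 modes too, though that is an observation rather than a guarantee. The function name is illustrative.

```cpp
#include <initializer_list>
#include <type_traits>

void deduce()
{
    auto a = 42;        // int
    auto b{42};         // int (C++17 rule for a single element in direct list form)
    auto c = {42};      // std::initializer_list<int>
    auto d = {1, 2};    // std::initializer_list<int>
    // auto e{1, 2};    // error from C++17: more than one element in direct list form

    static_assert(std::is_same<decltype(a), int>::value, "plain copy form");
    static_assert(std::is_same<decltype(b), int>::value, "direct list form");
    static_assert(std::is_same<decltype(c), std::initializer_list<int>>::value, "copy list form");
    static_assert(std::is_same<decltype(d), std::initializer_list<int>>::value, "copy list form");

    (void)a; (void)b; (void)c; (void)d;
}
```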
For user-defined types, there can be differences between the behaviour of the different initialisation forms; it depends on the type's implementation
Example; given the following declarations;-
…the following definitions will invoke the indicated constructor(s)/operation(s);-
If the type Y were replaced with the following (ie, by adding the conversion operator X());-
…then the behaviour is as the original example except for the copy initialisation forms;-
If the conversion operator 3 is left in-place but 8 is deleted then the behaviour is as the original example except;-
There is a rare case where the (), rather than {}, notation must be used; that is to force initialisation by non-list constructor rather than list constructor (which {} also supports). For example;-
In theory, this distinction may be applied to any type, but in reality the situation shown above only manifests itself if an initializer_list constructor is defined for the type. This is because if the type has no such constructor then the distinction is not useful, or even possible, and because of the next point…
This distinction is not always uniform though. For example, if the vector<int> in the above example were to be changed to a vector<string>;-
The reason the {} notation in this case gives different behaviour to the first example is because there is no initialiser-list constructor for vector<string> that accepts 42 as an argument (it expects one or more strings instead). As a result, the construction resolves to the same as if () had been specified
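Both cases can be sketched side by side. The value 42 follows the text; the variable names are illustrative.

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<int> v1(42);           // non-list constructor: 42 elements, all zero
std::vector<int> v2{42};           // list constructor: 1 element with the value 42

std::vector<std::string> s1(42);   // 42 empty strings
std::vector<std::string> s2{42};   // also 42 empty strings: 42 cannot initialise a string

void check_vectors()
{
    assert(v1.size() == 42 && v1[0] == 0);
    assert(v2.size() == 1  && v2[0] == 42);
    assert(s1.size() == 42 && s2.size() == 42);
}
```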
If a class defines an initializer_list constructor, it is normally preferred over a "standard" constructor in the event of an overload resolution. Using the () notation will force use of the "standard" constructor. For example;-
The {} notation can only be used as an initialiser and on the right-hand side of an assignment. For example, assuming a and b are of type X, and the function is declared as fn(X j, X k);;-
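A sketch of where {} may and may not appear, assuming a simple aggregate for X. A function argument counts as an initialiser (of the parameter), so {} is accepted there too.

```cpp
#include <cassert>

struct X { int m1; int m2; };

int fn(X j, X k) { return j.m1 + j.m2 + k.m1 + k.m2; }

int use_braces()
{
    X a{1, 2};            // initialiser
    X b = {3, 4};         // initialiser
    a = {5, 6};           // right-hand side of an assignment
    // a = b + {1, 2};    // error: {} is not an expression, so it cannot be an operand
    return fn(a, {7, 8}) + b.m1;   // {7, 8} initialises the parameter k
}

void check_use_braces()
{
    assert(use_braces() == 5 + 6 + 7 + 8 + 3);
}
```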
An empty {} or () indicates the default value for the type;-
Any type, built-in or user-defined, can be default-initialised with any of;-
This works for arrays too. Either of the following will initialise all 42 elements to zero;-
Note that it is not the new operation that performs the initialisation; new simply allocates raw memory. The initialisation is applied to the returned memory pointer after new completes
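A minimal sketch of zero initialising arrays, on the stack and on the free store. std::unique_ptr is used here only so that the example releases its memory.

```cpp
#include <cassert>
#include <memory>

int sum_of_defaults()
{
    int a[42]{};                                // all 42 elements are zero
    std::unique_ptr<int[]> p{new int[42]{}};    // likewise, on the free store
    std::unique_ptr<int[]> q{new int[42]()};    // equivalent, using ()

    int total = 0;
    for (int i = 0; i < 42; ++i) total += a[i] + p[i] + q[i];
    return total;
}

void check_defaults()
{
    assert(sum_of_defaults() == 0);
}
```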
More complex and multi-element initialisation is achieved with an initialiser list
Example | Effect |
---|---|
int a[]{1, 2, 3}; | Array initialisation. Array is implicitly sized to 3 elements |
int a[42]{1, 2, 3}; | Array initialisation. Array is explicitly sized to 42 elements. The first three elements are initialised as specified and the remainder are initialised to zero |
int a[42]{}; | Array initialisation. Array is explicitly sized to 42 elements and all elements are initialised to zero |
struct S {int x; string s;}; S s{1, "Hello"}; | Structure initialisation |
struct S {int x; string s;}; S s{}; | Default structure initialisation; S s{} is equivalent to S s{{}, {}} which expands out to S s{{0}, {""}} |
complex<double> z{0, pi}; | Use of constructor |
vector<double> v{0.0, 1.1, 2.2}; | Use of list constructor |
vector<double> v(10, 8.3); | Invoking constructor with (…) (10 elements, all initialised to 8.3) |
vector<double> v{10, 8.3}; | Invoking constructor with {…} (2 elements, initialised to 10.0 and 8.3) |
complex<double> z(); | This is a function declaration! (because in a declaration, an empty () always indicates a function) |
complex<double> z{}; | …whereas this is a default initialiser |
Default initialisation may be specified as an empty list. For a type X, this could be X a{};. It is also possible to omit even the default initialiser; eg, X a;
When no initialiser is specified at all like this, there are special rules concerning the initialisation;-
For example;-
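A sketch of the behaviour when no initialiser is given, consistent with the reasoning that follows. All names are illustrative.

```cpp
#include <cassert>
#include <string>

int         g;     // global built-in type: zero initialised
std::string gs;    // global user-defined type: default constructed (empty)

int read_defaults()
{
    static int  s;      // static built-in type: zero initialised
    int         local;  // stack-based built-in type: NOT initialised; reading it now is undefined
    std::string ls;     // stack-based user-defined type: default constructed (empty)

    local = 1;          // give it a value before any use
    return g + s + local + static_cast<int>(gs.size() + ls.size());
}

void check_read_defaults()
{
    assert(read_defaults() == 1);   // everything except local contributes zero
}
```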
The reason why global and static variables of built-in types are always initialised is because it incurs no run-time cost
In contrast, a stack-based variable must be initialised at run-time. It is not acceptable for user-defined types not to be initialised (whether global, static or stack-based) and so they always are, despite any run-time cost. However, it may be acceptable for built-in types to not be initialised (hence, this is the default for stack-based variables)
This reasoning allows (for example) large amounts of raw memory (say, a byte buffer) to be allocated without the overhead of forcing initialisation of all its elements
Be safe by always explicitly initialising objects unless there is a specific reason not to; a classic case being a read/write buffer
NOTE: All the following examples use pointers. However, exactly the same rules and procedures can be applied to references as well
Aliasing is the process of accessing an object via multiple means (not to be confused with type aliasing which is quite different). For example;-
Here, the variable a can be accessed directly or via the alias b. There is an implicit assumption that the object's value will be the same regardless of whether it is accessed directly or via the alias, and that if the value is modified (by either means), its new value will become immediately available directly and via any and all aliases
There are a number of rules that dictate which types of expression (the alias) may be used to access a particular type of value. It is important that these rules are strictly followed (hence the often-quoted term strict aliasing rules, although this is not an officially used term) because the compiler will make assumptions based on them and if the rules are broken then undefined behaviour shall result and the compiler is likely to generate incorrect code, especially with more aggressive optimisation levels
The aliasing rules
A pointer b may alias a pointer a if;-
It is of the same type;-
It is a const and/or volatile of the same type;-
The two types are similar. That is;-
It is a signed or unsigned version of the same type;-
It is a const and/or volatile of a signed or unsigned version of the same type;-
It is a char, unsigned char (but not signed char) or std::byte;-
Being able to legally alias any type with char* is necessary for functions like memcpy() to work. It also allows any type to be handled in sizeof(char) parts, which is often necessary for functions such as serialisation, network traffic handling, etc
In theory, there is no reason why an int8_t or uint8_t type must be based on a char type. In reality, current implementations that define them typically make uint8_t a type alias of unsigned char and int8_t a type alias of signed char. Because of this, it is also possible in practice to alias any other type with uint8_t* (int8_t*, being signed char, is not covered by the rule above). If an implementation did not follow this convention then quite a lot of code would probably break
It is a (possibly const and/or volatile) base class of the other type;-
If a pointer a is a struct, class or union type and includes one of the above types among its elements or non-static data members (including, recursively, an element or non-static data member of a sub-struct/class or contained union), then a pointer b may alias such a member by employing the appropriate rule shown above
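Several of the permitted aliases can be sketched together. The types Base and Derived and the function alias_demo() are hypothetical, and the example counts how many of its checks hold.

```cpp
#include <cassert>
#include <cstddef>

struct Base    { int id; };
struct Derived : Base { int extra; };

int alias_demo()
{
    int value = 42;
    int*           same  = &value;                                    // same type
    const int*     cq    = &value;                                    // const version of the same type
    unsigned*      us    = reinterpret_cast<unsigned*>(&value);       // unsigned version of the same type
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&value);  // unsigned char may alias anything

    Derived d{};
    d.id = 7;
    Base* base = &d;                                                  // base class of the object's type

    // 42 fits in one byte, so the byte sum is 42 whatever the byte order
    unsigned byte_sum = 0;
    for (std::size_t i = 0; i < sizeof value; ++i) byte_sum += bytes[i];

    return (*same == 42) + (*cq == 42) + (*us == 42u) + (byte_sum == 42u) + (base->id == 7);
}

void check_alias_demo()
{
    assert(alias_demo() == 5);   // all five checks hold
}
```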
A pointer to a standard layout structure may be safely cast to the type of its first member, and vice versa. This works because the first member of any structure is aligned with the very start of the structure (there are no intervening padding bytes)
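A minimal sketch of such a cast, using two hypothetical types Header and Message.

```cpp
#include <cassert>
#include <type_traits>

struct Header  { int type; };
struct Message { Header header; int payload; };   // header is the first member

static_assert(std::is_standard_layout<Message>::value, "required for the cast to be safe");

int message_type(Message* m)
{
    Header* h = reinterpret_cast<Header*>(m);     // refers to m->header
    return h->type;
}

void check_message_type()
{
    Message m{{3}, 9};
    assert(message_type(&m) == 3);
}
```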
Similarly, a pointer may be safely cast between union members consisting of standard layout structures with common initial types, though clearly it's possible to easily get into trouble!
The above is only safe up to the point of accessing the start of the structures that match each other. In this example, only the first member matches both structures
Aliasing affects which optimisations may be performed by a compiler. For example;-
It may seem obvious that a compiler could optimise the above function to return the literal 1. However, it can't because a and b may refer to the same object, rendering such an assumption dangerous and potentially wrong; instead, the compiler must abandon an otherwise useful optimisation. This problem can be addressed by specifying restrict, if it's available
In contrast, given;-
…the compiler can assume that a and b do not refer to the same object, because to do so would be counter to the aliasing rules (it is not legal to alias an int* with a float*) and would lead to undefined behaviour. It is therefore possible to optimise the function and return the literal 1; the aliasing rules allow the compiler and optimiser to make certain assumptions and therefore generate better code
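The two functions discussed above might look like the following sketch; the function names and bodies are assumptions that match the description.

```cpp
#include <cassert>

// a and b may refer to the same int, so *a must be re-read after the store to *b
int same_types(int* a, int* b)
{
    *a = 1;
    *b = 2;
    return *a;      // 1, or 2 if a and b refer to the same object
}

// An int* and a float* cannot legally refer to the same object,
// so the compiler is free to return the literal 1 without re-reading *a
int different_types(int* a, float* b)
{
    *a = 1;
    *b = 2.0f;
    return *a;
}

void check_aliasing()
{
    int x = 0, y = 0;
    float f = 0.0f;
    assert(same_types(&x, &y) == 1);
    assert(same_types(&x, &x) == 2);       // legal aliasing changes the result
    assert(different_types(&x, &f) == 1);
}
```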
The aliasing rules can be deliberately subverted; for example, the above function could be called as;-
The above may even execute 'correctly' (as expected). However, as already explained, it represents undefined behaviour. If the size of a float is not the same as the size of an int then it is also likely to lead to memory corruption. See also Punning
Punning is the technique of forcing an otherwise illegal object alias to be used. In general use, this is (for obvious reasons) not a good thing to do as it deliberately circumvents the type system and can easily lead to undefined behaviour
However, there are cases where it is necessary; low-level driver interfaces to hardware, some low-level network operations, etc
The only safe method to employ in forcing such an operation is to use memcpy() to copy the raw bit pattern from one variable (of a particular type) to another (of another type). In all cases, the following conditions must be met, otherwise undefined behaviour will result;-
Note that in any event, the resulting value (the one copied-to) may or may not hold a value representation that makes sense
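A minimal sketch of punning with memcpy(), reading the bit pattern of a float as a 32 bit unsigned integer. The size check reflects one of the usual conditions; the IEEE 754 check is an assumption needed only for the expected value in check_bits_of().

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "source and destination sizes must match");
static_assert(std::numeric_limits<float>::is_iec559, "expected value below assumes IEEE 754");

std::uint32_t bits_of(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);   // copies the raw bit pattern; no aliasing involved
    return bits;
}

void check_bits_of()
{
    assert(bits_of(1.0f) == 0x3F800000u);  // sign 0, exponent 127, mantissa 0
}
```

Compilers typically reduce such a fixed-size memcpy() to a single register move, so the safe form rarely costs anything.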
A pointer type is specified by the syntax T* varname (a pointer to type T). A pointer holds an address. The address of an object is derived with the & (address-of) operator. A pointer is dereferenced (that is, access is gained to the object being pointed-to) by using the * (dereference) operator
A pointer may refer to any object that has identity; that is, an object that resides at a specific address
Example | Description |
---|---|
char c = 'a'; | |
char* p = &c; | p is a pointer and holds the address of c. '&' is the address-of operator |
char* q = p; | q is a pointer and holds the address of c |
char d = *p; | d == 'a'. '*' is the dereference operator |
int c[] = {7, 11, 13}; | |
int* p = c; | p is a pointer to c[0]. The array decays to a pointer |
int d = *p; | d == 7 |
int e = p[2]; | e == 13. See also Arrays |
int* a; | Pointer to an int |
const int* a; | Pointer to a const int. The object (int) referred to may not be modified via this pointer |
int* const a = b; | const pointer to an int (must be initialised) |
const int* const a = b; | const pointer to a const int (must be initialised) |
const volatile int* a; | Pointer to a const volatile int |
volatile int* const a; | const pointer to a volatile int (must be initialised) |
char** c; | Pointer to pointer to a char |
int* a[8]; | An array of 8 pointers to ints. This will decay to a pointer to pointer; ie, int** a |
int (*fn)(char* param); | Pointer to a function that takes a char* argument and returns an int. See also Function Pointers |
int* fn(char* param); | Function declaration that takes a char* argument and returns a pointer to an int |
void* a; | Pointer to an object of unknown type |
nullptr | A literal that represents a null pointer |
A pointer may point to another pointer. Such a type is specified by the syntax T** varname (a pointer to a pointer to type T)
Example | Description |
---|---|
char c = 'a'; | |
char* p = &c; | p is a pointer and holds the address of c. '&' is the address-of operator |
char** pp = &p; | pp is a pointer to a pointer and holds the address of the pointer p |
char* q = *pp; | q is a pointer and holds the address of c. The second '*' is the dereference operator |
char** qq = pp; | qq is a pointer to a pointer and holds the address of the pointer p |
int** a; | Pointer to a pointer to an int |
const int** a; | Pointer to a pointer to a const int |
int** const a = b; | const pointer to a pointer to an int (must be initialised) |
const int** const a = b; | const pointer to a (non-const) pointer to a const int (must be initialised) |
int* const * const a = b; | const pointer to a const pointer to a non-const int (must be initialised) |
const int* const * a; | Non-const pointer to a const pointer to a const int |
Pointer indirection may be extended beyond two levels;-
Use of more than 2 levels of indirection is probably an indication of a design fault
nullptr is a literal that represents a null pointer
Prefer nullptr to 0 or NULL
restrict is not a standard C++ keyword, but it is supported as an extension by several implementations. As such, if it is supported, it will actually be named __restrict or __restrict__
It may be applied to pointer and reference types, and it hints to the compiler that the particular pointer/reference is the only one used in the current scope that refers to a particular set of data; no other pointers/references are used by the current scope that refer to the same object. For example;-
In this example, the arguments a and b are both marked as __restrict__. This indicates a promise (to the compiler) that a is guaranteed to be the only reference to the data that it refers to (ie, no other pointers (in this case b) will ever refer to the same area of memory that a can), and similarly for b. This allows the compiler to make certain assumptions, which in this case will almost certainly mean reducing the loop within the function to two calls to memset (which is typically optimised for the platform to perform such an operation very efficiently)
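A minimal sketch of the kind of function described above, assuming a compiler that provides one of the two spellings. The function name clear_both and the macro GRIMOIRE_RESTRICT are hypothetical and exist only for this illustration;-

```cpp
#include <cassert>
#include <cstddef>

// Select whichever spelling the toolchain offers (an assumption about the compiler).
#if defined(_MSC_VER)
#define GRIMOIRE_RESTRICT __restrict
#else
#define GRIMOIRE_RESTRICT __restrict__
#endif

// The caller promises that the memory reached through a never overlaps the
// memory reached through b. The compiler may then treat the two stores
// independently (for example, as two bulk fill operations).
inline void clear_both(char* GRIMOIRE_RESTRICT a, char* GRIMOIRE_RESTRICT b, std::size_t n)
{
    assert(a != nullptr && b != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        a[i] = 0;
        b[i] = 0;
    }
}
```

Whether the loop is actually reduced to bulk fills depends on the compiler and optimisation level. Breaking the promise (passing overlapping buffers) is undefined behaviour.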
Example | Description |
---|---|
type varname[size]; | An array with 'size' number of elements is defined like this |
varname[n] | Array element 'n' is referenced like this |
*varname | Directly dereference the first element |
varname | Yields (decays to) a pointer to the first element of the array. For an array int array[4], array would yield a pointer of type int* referring to array[0]. Decay to a pointer causes the size information (number of elements) to be lost |
&varname | Yields a pointer to the whole array. For an array int array[4], &array would yield a pointer of type int(*)[4]. This will hold the same address as array but is not the same type. Incrementing such a pointer would step the address on by sizeof(int) * 4. Given auto parray = &array, dereferencing the pointer (ie, *parray or parray[0]) will yield the original array, and parray[0][0] or (*parray)[0] would yield array[0] |
type(*varname)[3] | Declares a pointer to an array of the specified length (here, 3 elements). See example |
using typename = type[3] typedef type(typename)[3] | An array alias. See example |
type* v = &varname[n] | Create a pointer to element 'n' of an array |
type varname[size_x][size_y]; | A 2 dimensional array (defined as an array of arrays) |
type* varname = new type[size]; | An array allocated on the heap |
int nums[3] = {1, 2, 3}; | A 3 element array populated with 3 values |
int nums[42] = {1, 2, 3}; | A 42 element array. The first 3 elements are populated with values and the remaining elements are initialised to zero |
int nums[] = {1, 2, 3}; | An array initialised with 3 numbers. The array is automatically sized to match the number of initial values (3) and populated accordingly |
char name[] = {'H', 'i'}; | An array initialised with 2 characters. The array is automatically sized to match the number of initial values (2) and populated accordingly |
char name[] = {"Hello"}; | An array initialised with a 'C-style' string literal. The array is automatically sized to that of the string + 1 (for the terminating null character '\0') and populated accordingly |
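The difference between a decayed array and a pointer to the whole array can be checked at compile time. This is a small sketch; the names sample and element_count are hypothetical;-

```cpp
#include <cstddef>
#include <type_traits>

int sample[4] = {1, 2, 3, 4};

// Decay: the array type converts to a pointer to its first element,
// and the element count is no longer part of the type.
static_assert(std::is_same<std::decay<decltype(sample)>::type, int*>::value,
              "sample decays to int*");

// &sample keeps the element count: it is a pointer to an array of 4 ints.
static_assert(std::is_same<decltype(&sample), int (*)[4]>::value,
              "&sample is int(*)[4]");

// While the array type is intact, its size is still known...
static_assert(sizeof(sample) == 4 * sizeof(int), "size of the whole array");

// ...and a reference-to-array parameter lets a template recover the count.
template <typename T, std::size_t N>
constexpr std::size_t element_count(T (&)[N])
{
    return N;
}
```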
A reference may refer to any object that has identity; that is, the object resides at a specific address. Unlike a pointer though, a reference is a named alias for the target object rather than a distinct variable. A reference is NOT an object. It is not unreasonable to view a reference as a const pointer to an object that is implicitly dereferenced when required
There are three 'types' of references; those that refer to lvalues (objects that we want to modify), those that refer to const lvalues (objects we don't want to modify), and those that refer to rvalues (generally, temporary references generated by the compiler/run-time environment). See also lvalues And rvalues
Example | Description |
---|---|
int i = 42; | |
int& r = i; | r is a reference to i |
int& r2 {i}; | r2 is a reference to i |
int j = r; | j == 42 (note implicit dereference; no special syntax) |
int* p = &r; | p is a pointer to i |
r = 83; | i == 83 |
++r; | i == 84 |
const int& s = r; | s is a const reference to i (via r); i may not be modified via this reference |
const int& t {42}; | A non-const lvalue reference cannot be initialised with an rvalue. Here, a temporary object is created to hold the (rvalue) literal 42, thus allowing the desired effect. The reference must be const. The temporary's lifetime is extended to match that of the reference |
int& f(char& param); | Function declaration that takes a char reference argument and returns a reference to an int. The object that param refers to may be modified from within the function |
int& f(const char& param); | Same as the previous example except that the object that param refers to may NOT be modified from within the function |
References and pointers differ in the following ways;-
It is very useful to know if an rvalue reference is pointing to a temporary object (which will not be used after the operation in-hand), because if it is and we want to save that temporary object, we can sometimes perform an inexpensive move operation rather than a (potentially) expensive copy operation
The most common example of this is a function return value where a temporary object being returned will be saved into a caller-specified variable and then the temporary object destroyed. If we can just move the temporary object to the destination variable rather than copying it then this is a "good thing". Another example is an object (such as a std::string or std::list) that is actually just a very small handle to a potentially huge amount of data
Example | Description |
---|---|
int&& r {f()}; | rvalue reference bound to the value returned by the function f() |
int&& r {var}; | Illegal; rvalue reference to an lvalue (var) |
string&& r {"Hello"}; | rvalue reference to a temporary |
void fn(string&& r); | A function that takes an rvalue string reference |
A classic example of where rvalue references become valuable is in a swap function. A traditional swap function would have to create a temporary and make at least one copy of the parameters offered to it. Using rvalue references, the copy operations are avoided;-
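A sketch of such a swap function, written with explicit casts. The name swap_by_cast is hypothetical (chosen to avoid clashing with std::swap), and the demo function is only there to show usage;-

```cpp
#include <cassert>
#include <string>

template <typename T>
void swap_by_cast(T& a, T& b)
{
    T tmp(static_cast<T&&>(a));   // move-construct tmp from a
    a = static_cast<T&&>(b);      // move-assign b into a
    b = static_cast<T&&>(tmp);    // move-assign tmp into b
}

inline void swap_by_cast_demo()
{
    std::string s1{"left"};
    std::string s2{"right"};
    swap_by_cast(s1, s2);         // the string buffers change hands; no characters are copied
    assert(s1 == "right" && s2 == "left");
}
```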
By using the static_cast<T&&> (resulting in a T&& type), the compiler is able to make use of any optimised operators for the type. In this example, that would be a move-constructor or a move-assignment. The standard library containers support these as well as rvalue versions of insert() and push_back() etc
Because the static_cast<T&&> construct is a little verbose and ugly, the standard library provides the function move(x) which means the same thing. Note that move() does not actually move anything; it is a somewhat misleading name. Regardless, we can improve the example above as follows;-
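The same idea expressed with std::move might look like this (swap_by_move is a hypothetical name);-

```cpp
#include <cassert>
#include <string>
#include <utility>

template <typename T>
void swap_by_move(T& a, T& b)
{
    T tmp(std::move(a));   // std::move only casts to T&&; the move constructor does the work
    a = std::move(b);
    b = std::move(tmp);
}

inline void swap_by_move_demo()
{
    std::string s1{"left"};
    std::string s2{"right"};
    swap_by_move(s1, s2);
    assert(s1 == "right" && s2 == "left");
}
```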
As it stands, the above swap function will only accept two lvalues as parameters. To allow it to accept an rvalue as a parameter as well, we could include the two overloads;-
The standard library containers deal with this issue in a different way, using shrink_to_fit() and clear()
It is possible to take a reference to a reference, though this is only syntactically legal via use of an alias or a template reference parameter
Following on from this point, reference collapsing is the mechanism that allows a function template that takes a forwarding reference to be called with an argument that is already a reference, producing (say) fn(T& && a) which then reduces-down to something that is syntactically legal; fn(T& a)
Reference collapsing occurs in four scenarios;-
Here are the possible combinations and the resulting type that is defined;-
Syntax | Interpretation |
---|---|
using lref = int& | int& |
using rref = int&& | int&& |
using lref_to_lref = lref& | int & & ≡ int& |
using lref_to_rref = rref& | int && & ≡ int& |
using rref_to_rref = rref&& | int && && ≡ int&& |
using rref_to_lref = lref&& | int & && ≡ int& |
int && & a = b | This and similar direct (non-alias) syntax is illegal |
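The collapsing rules in the table can be confirmed at compile time with std::is_same;-

```cpp
#include <type_traits>

using lref = int&;
using rref = int&&;

// Only rvalue-reference-to-rvalue-reference yields an rvalue reference;
// every other combination collapses to an lvalue reference.
static_assert(std::is_same<lref&,  int&>::value,  "& applied to &  gives &");
static_assert(std::is_same<rref&,  int&>::value,  "& applied to && gives &");
static_assert(std::is_same<lref&&, int&>::value,  "&& applied to & gives &");
static_assert(std::is_same<rref&&, int&&>::value, "&& applied to && gives &&");
```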
The syntax T&& can mean one of two things;-
A T&& is a forwarding reference (rather than an rvalue reference) in the following contexts;-
As a function template argument such as;-
As part of an auto declaration;-
The link between these two contexts is that they both employ type deduction
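A short sketch of both contexts. The names bound_to_lvalue and forwarding_demo are hypothetical;-

```cpp
#include <cassert>
#include <type_traits>

// T is deduced from the argument, so T&& is a forwarding reference here.
// For an lvalue argument T is deduced as U&, and T&& collapses to U&.
template <typename T>
constexpr bool bound_to_lvalue(T&&)
{
    return std::is_lvalue_reference<T&&>::value;
}

inline void forwarding_demo()
{
    int v = 1;
    auto&& a = v;    // lvalue initialiser: a is int&
    auto&& b = 42;   // rvalue initialiser: b is int&&
    static_assert(std::is_same<decltype(a), int&>::value, "deduced from an lvalue");
    static_assert(std::is_same<decltype(b), int&&>::value, "deduced from an rvalue");
    (void)a;
    (void)b;
    assert(bound_to_lvalue(v));
    assert(!bound_to_lvalue(42));
}
```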
If no type deduction takes place (ie, is not required) then T&& will be an rvalue reference. For example;-
Here, no type deduction is required for T&& because the type has already been determined by virtue of instantiating the parent class. Therefore T&& is an rvalue reference
Type-deduction is also not required (and therefore, forwarding references will not result) if the type is explicitly stated at instantiation, such as the call fn<int>(b);
Expanding on the previous example, the following will result in a forwarding reference because of the resulting type deduction;-
A structure is a user-defined type. It is defined thus;-
For example;-
Structures may be declared and used exactly like any other variable. Declare a structure variable thus (note that struct does not have to be specified again like it does in plain 'C');-
Structures may be initialised using {} notation. If there is no constructor defined, then initialisation of the structure elements is in the order they are defined;-
A field element is defined like this;-
Fields are most commonly used within a struct to allow several very small values to be packed, or to ease mapping onto an externally imposed layout such as a hardware interface. For example;-
A field as a function argument is always passed by-value. If a field argument is a reference then it must be const;-
A union is a user-defined type. Specifically, it is a struct in which all elements are allocated at the same address. For example;-
An element of a union is called a variant. The union may hold the value of only one variant at a time
The union-name is optional. If it is omitted then the union definition implicitly defines an object, and the members of the union are accessed as if they were members of the parent scope. This is called an anonymous union. For example;-
Additional objects of the same type as an anonymous union may not be defined (because the union type has no name that can be specified in such a definition)
See also tagged union
If union-name is specified then the union defines a type only, and objects of the type must be explicitly defined and the union members must be qualified in terms of the object(s) names. For example;-
Only one element may have a default initialiser, and this shall be used in all instantiations. For example;-
In addition, an anonymous union;-
Do not use a union type for type conversion; ie, writing to one element and then reading from a different element; it is ugly, error-prone, non-portable, and represents undefined behaviour. See also punning
It is possible to improve slightly on the raw union by encapsulating it in a class with accessor functions to maintain state and force correct usage by providing 'set' functions, and 'read' functions that check that the correct (ie, the last one that was written) element is being read. Such an arrangement is referred to as a tagged or discriminated union
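A minimal sketch of such an encapsulation for two simple element types. The class name IntOrDouble and its member names are hypothetical, and throwing std::logic_error on a mismatched read is just one possible policy;-

```cpp
#include <cassert>
#include <stdexcept>

class IntOrDouble {
public:
    // 'set' functions record which element was written last.
    void set_int(int v)       { value_.i = v; tag_ = Tag::Int; }
    void set_double(double v) { value_.d = v; tag_ = Tag::Double; }

    // 'read' functions refuse to read an element that was not the last one written.
    int get_int() const
    {
        if (tag_ != Tag::Int) throw std::logic_error{"int was not the last element written"};
        return value_.i;
    }
    double get_double() const
    {
        if (tag_ != Tag::Double) throw std::logic_error{"double was not the last element written"};
        return value_.d;
    }

private:
    enum class Tag { None, Int, Double };
    union Value { int i; double d; };

    Tag   tag_ = Tag::None;
    Value value_{};
};

// Usage: a read of the wrong element is detected rather than silently punned.
inline bool wrong_read_is_rejected()
{
    IntOrDouble u;
    u.set_int(3);
    assert(u.get_int() == 3);
    try {
        (void)u.get_double();
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```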
Encapsulating unions does NOT fix all their problems;-
An alternative to using unions would be to use a set of derived classes. This also has the advantage of not imposing the cost of the union size (the size of the largest element) on the other (smaller) elements
In short, unless all the union elements are simple types (types without user-defined constructors/destructors, copy or move operations), they can cause a lot of trouble and are best avoided. Even if all the elements are 'simple', think twice; there's usually a better alternative
An enumeration is a user-defined set of named integer values (enumerators). There are two types of enum;-
Individual enumerators may be explicitly set to an integral type constant expression (with successive values incrementing by 1);-
For both enum and enum class, the type may be explicitly stated, or changed;-
It is possible to subsequently retrieve the underlying type as follows;-
An enum class can be forward-declared in a similar way to a struct. This also works for plain enum types but only if the underlying type is explicitly stated;-
Forward-declaration of a plain enum whose underlying type is not explicitly stated is not possible
See also Deferred Types
All enumeration types ("plain" and "class") may have operators defined for them, though in the case of "plain" enumerations, the operator functions must be non-member functions. A classic example is;-
A "plain" or "unscoped" enumeration is a user-defined type, akin to a 'C'-type enum
"Plain" enums are termed unscoped because the names defined within them are in the enclosing scope rather than the scope of the enum definition
Example;-
An anonymous enum may be created if all that is required is a set of constants, rather than an enumeration;-
One use for "plain" enums is in declaring names for tuple elements; their ability to implicitly convert to an integral makes them less awkward to use than a similar "class" enum, though see this example
A "class" or "scoped" enumeration is a user-defined type that is scoped and strongly typed. For example;-
Sometimes, enumerator values are chosen to provide a bitmask. AND and OR functions can be created to safely manipulate these, such as the following. Note that explicit conversion is necessary because the enum class does not support implicit conversion;-
Because the '&' and '|' functions are constexpr, they can be used at compile time such as in a switch clause; case bitmask::BIT1 & bitmask::BIT3 {…}. Take care how the & and | functions are used though; they can (and do) return a bitmask value that is not actually legal!
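A sketch of such AND and OR functions. The enumerator values and the choice of std::uint8_t as the underlying type are assumptions made for illustration;-

```cpp
#include <cstdint>
#include <type_traits>

enum class bitmask : std::uint8_t { BIT0 = 0x01, BIT1 = 0x02, BIT2 = 0x04, BIT3 = 0x08 };

// Explicit conversion to the underlying type and back is required because
// an enum class does not convert implicitly.
constexpr bitmask operator|(bitmask a, bitmask b)
{
    using U = std::underlying_type<bitmask>::type;
    return static_cast<bitmask>(static_cast<U>(a) | static_cast<U>(b));
}

constexpr bitmask operator&(bitmask a, bitmask b)
{
    using U = std::underlying_type<bitmask>::type;
    return static_cast<bitmask>(static_cast<U>(a) & static_cast<U>(b));
}
```

Note that BIT1 | BIT3 yields the value 0x0A and BIT1 & BIT3 yields 0; neither corresponds to a named enumerator, which is the hazard mentioned above.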
"Class" enums can be used to index a tuple;-
Adding a helper function template reduces the syntax a little. This is general enough to work with any tuple and any enumeration regardless of underlying type;-
Use the helper function;-
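One possible shape for such a helper, together with its use. The names to_utype, Field, Person and age_of are hypothetical;-

```cpp
#include <string>
#include <tuple>
#include <type_traits>

enum class Field { Name, Age };

// Convert any enumerator to its underlying type at compile time.
template <typename E>
constexpr typename std::underlying_type<E>::type to_utype(E e) noexcept
{
    return static_cast<typename std::underlying_type<E>::type>(e);
}

using Person = std::tuple<std::string, int>;

// The enumerator names the tuple element; no magic numbers at the call site.
inline int age_of(const Person& p)
{
    return std::get<to_utype(Field::Age)>(p);
}
```

Without the helper, the same access would need std::get<static_cast<std::size_t>(Field::Age)>(p).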
Sometimes, especially at a low level or when implementing a container class, it is useful to be able to treat an object as 'plain old data' (POD); ie, just a bunch of structureless bytes. Doing so can massively increase copy performance for example as it avoids the need to call constructors for each of the object's members
A POD type can be safely passed between C++ and 'C' code
For an object to be successfully (ie, without breaking any C++ language guarantees) treated as a POD, it must be a scalar type, a class, struct or union that complies with the following constraints, or an array of such a type;-
A standard library type predicate is_pod<T> is defined in <type_traits> that returns whether a type is a POD or not. This is much more convenient than remembering the above rules and applying them correctly!
A related concept to a 'standard' type is a trivial type which is one that has a trivial default constructor and trivial copy and move operations
See also Regular Types
Example
Here is a generalised copy function that uses the standard library type predicate is_pod<T>::value;-
...or here is a better technique that uses std::enable_if;-
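A sketch of the enable_if approach. Note the substitution: std::is_trivially_copyable is used here in place of is_pod, because that is the property memcpy actually relies on; replace it with std::is_pod<T> to follow the text exactly. The name copy_n_items is hypothetical;-

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Selected when T may be copied as raw bytes.
template <typename T>
typename std::enable_if<std::is_trivially_copyable<T>::value>::type
copy_n_items(T* to, const T* from, std::size_t count)
{
    assert(count == 0 || (to != nullptr && from != nullptr));
    std::memcpy(to, from, count * sizeof(T));
}

// Selected otherwise: copy element by element using T's copy assignment.
template <typename T>
typename std::enable_if<!std::is_trivially_copyable<T>::value>::type
copy_n_items(T* to, const T* from, std::size_t count)
{
    assert(count == 0 || (to != nullptr && from != nullptr));
    for (std::size_t i = 0; i != count; ++i) {
        to[i] = from[i];
    }
}
```

Exactly one overload is viable for any given T, so the choice is made at compile time with no run-time test.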
Lists can be used for initialising named variables and may be used as expressions in many (but not all) cases. There are two forms;-
Form | Meaning |
---|---|
T {…} | Qualified. Means "create object of type T and initialise it with T{…}" |
{…} | Unqualified. Type must be determined from the context and must be unambiguous |
Example | Description |
---|---|
T{v} | Create a temporary object |
T a{v} | Create a local object and initialise |
T a = T{v} | Create a temporary object and use it to initialise a |
T* a = new T{v} | Create an object in free-store |
A list is interpreted as follows;-
The standard library std::initializer_list<T> type is used to construct variable length lists. It is mostly used for initialising user-defined containers. For example, the standard library vector has an initializer_list constructor, so this;-
…is actually interpreted as this;-
initializer_list can be used directly. One useful technique is for passing a varying size list of homogeneous values to a function, for example;-
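A small sketch of a function that accepts a list of any length. The function average_of is hypothetical;-

```cpp
#include <cassert>
#include <initializer_list>

// Accepts any number of doubles supplied as a braced list at the call site.
inline double average_of(std::initializer_list<double> values)
{
    assert(values.size() > 0);   // an empty list has no average
    double total = 0.0;
    for (double v : values) {
        total += v;
    }
    return total / static_cast<double>(values.size());
}
```

A call looks like average_of({1.0, 2.0, 3.0}). The elements of an initializer_list are const, so the function can read but not modify them.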
This is a complete list of statements;-
A label specifies a point to which program flow may be directed. A label is only used in two contexts;-
A label is defined as label:. However, the format of label is quite different between the two uses; see the appropriate sections for details
An if statement is defined thus;-
…or…
The if clause may (optionally) be specified as;-
The condition does not necessarily have to include an explicit operator. For example, these two statements will yield the same result where x is an integer;-
An example of how the optional init expression may be used;-
In this example, a mutex is acquired for access to a UART. If the UART is in the appropriate state, then it is written to. lock will go out of scope at the end of the block and so automatically release the mutex
Note that this is nothing that could not be done by other means; its main advantage is that it restricts the scope of a controlling object (in this example, the mutex)
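A sketch of the pattern just described, assuming C++17. The Uart type and its members are hypothetical stand-ins for a real device driver;-

```cpp
#include <cassert>
#include <mutex>

struct Uart {                     // hypothetical stand-in for a real device
    std::mutex mtx;
    bool ready = true;
    int bytes_written = 0;

    void write(char)
    {
        assert(ready);            // writing is only meaningful when the device is ready
        ++bytes_written;
    }
};

inline bool write_if_ready(Uart& uart, char c)
{
    bool written = false;
    // lock is declared by the init expression, so it exists only for the
    // duration of the if statement and releases the mutex when that ends.
    if (std::lock_guard<std::mutex> lock{uart.mtx}; uart.ready) {
        uart.write(c);
        written = true;
    }
    return written;
}
```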
A constexpr if statement is defined in the same way as a standard if statement, with the addition of the constexpr specifier;-
…or…
The condition is evaluated at compile time. Because of this, the branch containing the discarded statements may not even be evaluated by the compiler
One consequence of this is that the discarded statements may not even need to work (though they must be legal). For example;-
The above will compile and link even though c is not defined; the branch referring to c is discarded and does not exist in the final program
If the constexpr if statement is within a templated entity and the condition does not have a name dependency then the discarded statements will not be instantiated along with the template instantiation;-
Any return statements in a discarded branch will not participate in the function's return type deduction
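A sketch showing both effects inside a template, assuming C++17. The function value_of is hypothetical;-

```cpp
#include <type_traits>

// For a pointer argument the pointed-to value is returned; otherwise the
// argument itself. Only the selected branch is instantiated, and only its
// return statement takes part in deducing the return type.
template <typename T>
constexpr auto value_of(T t)
{
    if constexpr (std::is_pointer<T>::value) {
        return *t;   // would be ill-formed for T = int if it were instantiated
    } else {
        return t;
    }
}
```

With a plain (non-constexpr) if, value_of(5) would fail to compile because *t is not valid for an int.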
A switch statement selects from among a set of alternative values (indicated by the case labels). It is defined thus;-
The switch clause may (optionally) be specified as;-
Deliberately omitting a break can be indicated by using a [[fallthrough]] attribute. This should also prevent the compiler from warning about the omission. For example;-
[[fallthrough]] must be specified in isolation as shown above
Technically, the [[fallthrough]] attribute is applied to a null statement, hence the (required) trailing ';' (semicolon)
[[fallthrough]] must be followed by a case label (or default:). As such, it may not be applied at the end of the very last option in the switch
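A small sketch of the attribute in use, assuming C++17. The function stages_run is hypothetical; it counts how many stages execute when starting from a given stage;-

```cpp
constexpr int stages_run(int stage)
{
    int count = 0;
    switch (stage) {
    case 0:
        ++count;
        [[fallthrough]];   // deliberate: stage 0 continues into stage 1
    case 1:
        ++count;
        [[fallthrough]];   // deliberate: stage 1 continues into stage 2
    case 2:
        ++count;
        break;
    default:
        break;             // last option: nothing follows, so no [[fallthrough]] here
    }
    return count;
}
```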
An example of how the optional init expression may be used;-
Note that, as with the similar feature for if, this is nothing that could not be done by other means; it just allows the scope of a controlling object to be limited
Although not strictly always required by the syntax, if a declaration is made within a switch, always encapsulate the declaration within a block {…}. If you don't, the name will pollute the scope of the switch and can lead to subtle errors such as;-
It is not legal/possible to by-pass an initialisation. The compiler should detect that b and c have such initialisations and that if the switch expression x is 2 then that initialisation shall be by-passed. This is why the declaration of b and c will fail
It so happens that an int does not require initialisation and for this reason, the declaration of a will succeed. However, the compiler should catch the fact that if x is 2 then a shall be used without being initialised
There may be other subtle combinations of similar errors that may or may not be caught by the compiler. In contrast, if the declarations were within a block, then this would be the result, which is much more sensible;-
In short, always encapsulate switch declarations within a block, therefore limiting their scope to a single case label
Always add a comment (and/or use a [[fallthrough]] attribute) in place of any absent break statements to show the intentional omission; eg, // Fall through...
One instance when default should NOT be used is when the switch expression is an enumeration type and the intention is to provide a case label for each enumerator. In this case, including the default label would prevent the compiler from detecting if any of the enumerators had not been accounted for. If there is a need to test for an illegal enumerator value then this is possibly best done separately (prior to the switch) rather than using a default label
A for statement is defined thus;-
It is often useful to use auto or auto& as the loop control variable. eg;-
An alternative to a while statement is a for statement in the following form which combines the condition of the for with its expression. Like the while statement, this still allows a non-determinate number of iterations but it also has the advantage of not requiring a separate loop control variable; the element being operated on is used for this purpose, and that variable's scope is confined to the loop itself;-
A range-for statement is defined thus;-
In order to make a container type usable with the range-for statement, the following must be defined;-
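As an illustration, here is a minimal type that range-for will accept: it provides begin() and end(), and its iterator supports !=, prefix ++ and unary *. The Countdown class and sum_countdown function are hypothetical;-

```cpp
#include <cassert>

class Countdown {
public:
    class iterator {
    public:
        explicit iterator(int v) : value_{v} {}
        int operator*() const { return value_; }                 // current element
        iterator& operator++() { --value_; return *this; }       // advance (counting down)
        bool operator!=(const iterator& rhs) const { return value_ != rhs.value_; }
    private:
        int value_;
    };

    explicit Countdown(int from) : from_{from} { assert(from >= 0); }

    iterator begin() const { return iterator{from_}; }
    iterator end() const { return iterator{0}; }                 // one past the last value (1)

private:
    int from_;
};

// Visits from, from - 1, ..., 1.
inline int sum_countdown(int from)
{
    int total = 0;
    for (int v : Countdown{from}) {
        total += v;
    }
    return total;
}
```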
A while statement is defined thus;-
A do statement is defined thus;-
The iteration of any loop statement, for, range-for, while or do, may be modified from within the body of the loop in several ways
The continue statement is used exclusively to control loop iteration. The break statement is used likewise and also within switch statements. They are defined like this;-
A goto statement is defined thus;-
Don't use goto. It's hideous, it subverts the logical flow of the program, it's the source of countless bugs, it's NEVER necessary, EVER!
A return statement is defined thus;-
Observe the principle of "one point of entry, one point of exit"; a function has one point of entry; the top. It should also have one point of exit; the bottom. Using multiple return statements within a function is ugly, it subverts the logical flow of the program, and can be a cause of error
Many things are expressions; assignment, function calls, object construction, and many others
When parsing an expression, the compiler first extracts lexical tokens from the expression string. It does this following a 'greedy' technique; that is, each token is extracted to make it as long as possible while still being syntactically legal. Tokens are composed of the following elements;-
Order of evaluation of sub-expressions within an expression is unspecified. Given;-
…where @A and @B are arbitrary operators, it is unspecified in which order the expr expressions are evaluated. Operator precedence rules will affect the order in which the expressions are combined, however
In summary, in the following examples, the order of evaluation is A, then B;-
Example 1 The effect of indeterminate evaluation order can be significant;-
Because the value of each sub-expression relies on side-effects from the other, the value of c is indeterminate; actually this invokes undefined behaviour. The result could be (1 * 3) + (2 * 4) or (2 * 3) + (1 * 4)
Example 2
…invokes undefined behaviour because at least one of the sub-expressions (in this case, both of them), modifies a value used by the other
Example 3
Result is unspecified
Evaluation follows operator precedence rules resulting in expected left-to-right evaluation
See also Function Argument Evaluation Order
A conditional expression is a more direct alternative to an if statement;-
For example;-
When evaluating an expression, it may be necessary to create a temporary object. For example, with the expression (a + b) * c, the value (a + b) must be held somewhere before evaluating the rest of the expression
Unless it is bound to a const (lvalue) reference or non-const rvalue reference, or used to initialise a named object, any temporary object that comes into being shall be destroyed at the end of the whole expression (NOT just the sub-expression) in which it was created. For fundamental types, this process is largely irrelevant to the application. However, for more complex types, the lifetime of any temporary object may become an issue
Example | Description |
---|---|
const char* c = (s1 + s2).c_str(); // ...use c... | Error. s1 and s2 are of type string. The string class includes the method c_str() that returns a pointer to the raw plain 'C' string used internally by it. The problem here is that (s1 + s2) creates a temporary object, and so c points to the plain 'C' string of that temporary …which will be destroyed at the end of the expression! |
if (strlen(c = (s1 + s2).c_str()) < 42) { // ...Use c... } | Error. The if statement will actually work as intended because the comparison is part of the same expression that creates the temporary and therefore the temporary will still exist. However, subsequent use of c is undefined because, like the previous example, the temporary will be destroyed at the end of the expression |
const string& c = s1 + s2; // ...Use c... | Ok. By binding the temporary to a (const) reference, its lifetime is extended to that of the reference |
string& c = s1 + s2; // ...Use c... | Error. The reference is not const, and a non-const lvalue reference cannot bind to a temporary, so this will not compile (some compilers accept it as a non-standard extension) |
const string c = s1 + s2; // ...Use c... | Ok. Use the temporary to initialise a new object |
fn(s1 + s2); | Ok. The temporary will exist until the function call ends |
Structured binding follows the same temporary object scope extension rules as shown above
Do not assume that all temporary objects are rvalues, or assume some other similar relationship; the two concepts are entirely separate
Polymorphism Without a Virtual Destructor
The following example highlights a useful feature;-
Here, the call to the function get_x() yields a temporary object of type X. However, the reference a is of type const Y& (ie, the base of X). The point to note is that despite the base reference, when a goes out of scope and is destroyed at the end of the function fn(), the object that a refers to will still be destroyed correctly; that is, the X destructor shall be called first followed by the (base) Y destructor. This works correctly even without a virtual destructor being defined for Y
This technique only works for local const references; not for class member references or non-local references
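A sketch of the behaviour described above, using a log string to record the order of destruction. The log variable and the check function are hypothetical additions for illustration;-

```cpp
#include <cassert>
#include <string>

std::string destruction_log;   // records which destructors ran, in order

struct Y {                     // base class; note the destructor is NOT virtual
    ~Y() { destruction_log += 'Y'; }
};

struct X : Y {                 // derived class
    ~X() { destruction_log += 'X'; }
};

X get_x() { return X{}; }

void fn()
{
    const Y& a = get_x();      // temporary X bound to a local const base reference
    (void)a;
    destruction_log.clear();   // ignore anything logged while the temporary was created
}                              // the temporary is destroyed here as an X: ~X(), then ~Y()

inline void check_destruction_order()
{
    fn();
    assert(destruction_log == "XY");
}
```

This works because the compiler knows the complete type of the temporary at the point where its lifetime is extended; no dynamic dispatch is involved.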
A constant expression is defined thus;-
For an object constant expression;-
…it must satisfy the following;-
Example;-
For a function-type constant expression;-
…it must satisfy the following;-
It can only accept and return objects of literal types
In C++11, void is not considered a literal type and so may not, for example, be used as a constexpr function return type
From C++14, void is considered a literal type and so may be used by a constexpr function
If the function is a constructor then;-
The function body must be either deleted, defaulted or contain;-
…any statement except;-
If the function is a constructor then;-
In C++11, a constexpr function may only contain a single executable statement; return expression;. The expression may use the ?: conditional and may recurse. For example, a function to return an integer power;-
A constexpr function is much less restricted than in C++11. For example, a function to return an integer power;-
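The two styles can be sketched side by side. The function names ipow11 and ipow14 are hypothetical, and overflow is not checked;-

```cpp
// C++11 style: a single return statement, using ?: and recursion.
constexpr long long ipow11(long long base, unsigned exp)
{
    return exp == 0 ? 1 : base * ipow11(base, exp - 1);
}

// C++14 style: local variables and loops are permitted.
constexpr long long ipow14(long long base, unsigned exp)
{
    long long result = 1;
    for (unsigned i = 0; i < exp; ++i) {
        result *= base;
    }
    return result;
}

// Both can be evaluated at compile time.
static_assert(ipow11(2, 10) == 1024, "recursive form");
static_assert(ipow14(2, 10) == 1024, "iterative form");
```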
Use constexpr whenever possible; it can provide safer (compile-time) initialisation, and may allow optimisations and use-cases not possible by other means
Integral and floating-point types may be freely mixed in assignments and expressions, and implicit conversion between types is performed in such a way as to try and preserve information. Implicit type conversions that preserve value are called promotions
Sometimes, it is not possible to preserve value. In this case, a narrowing conversion is performed. For example;-
Basic conversion procedure and how narrowing is handled;-
Floating-point to integral conversion. First, the fractional part is discarded. Then;-
Try to avoid narrowing conversions, but if they are unavoidable (or possible, in (say) a template function), consider the use of run-time checked conversion functions such as narrow_cast<>. This tests for loss of data by comparing the result after the conversion with the original value, at the obvious cost of additional overhead
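A minimal sketch of a round-trip check of this kind. The helper name survives_conversion is hypothetical, and throwing std::runtime_error is just one possible failure policy. The sketch assumes the value is within the range the target type can represent for floating-point to integral conversions (outside that range the conversion itself is undefined), and a pure round-trip test does not detect a change of sign between signed and unsigned types of the same size;-

```cpp
#include <cstdint>
#include <stdexcept>

// True if converting v to To and back again reproduces the original value.
template <typename To, typename From>
constexpr bool survives_conversion(From v)
{
    return static_cast<From>(static_cast<To>(v)) == v;
}

// Checked conversion: refuses any conversion that changes the value.
template <typename To, typename From>
To narrow_cast(From v)
{
    if (!survives_conversion<To>(v)) {
        throw std::runtime_error{"narrow_cast: value changed by conversion"};
    }
    return static_cast<To>(v);
}
```

For example, narrow_cast<std::uint8_t>(200) returns 200, whereas narrow_cast<std::uint8_t>(300) throws.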
Implicit type conversions that preserve value are called promotions. Integral types are promoted before any arithmetic operation is performed. The main purpose is to convert numeric values to the 'natural' size of the underlying machine architecture (ie, int)
For an integral type, as long as the type is smaller than an int, promotion shall be performed. In contrast floating-point types are only converted if necessary (ie, if the types in an expression differ) by following the usual arithmetic conversion rules
For example, given the following;-
…the following types shall result for the respective arithmetic and bitwise logical operations;-
The implicit promotion of types smaller than int when performing arithmetic or bitwise operations can result in unexpected values being generated, especially when unsigned/signed conversion is implied or when negative signed values are being used. The following examples assume int is 32 bits
Example 1;-
The reason for the above result is that a is first promoted to an int, resulting in a value of 0x00000080. The ~ operation is then applied, resulting in the value 0xffffff7f. This is then right-shifted by 4, resulting in 0xfffffff7, and then implicitly cast back to a uint8_t resulting in a value of 0xf7
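The steps above can be reproduced at compile time. This sketch assumes the expression in question is ~a >> 4 with a == 0x80, a 32-bit int, and an arithmetic right shift for negative values (as on mainstream compilers). The function names are hypothetical, and the extra cast shown is only one possible correction;-

```cpp
#include <cstdint>

// a is promoted to int before ~ is applied, so the upper 24 bits become 1s
// and are shifted down into the result.
constexpr std::uint8_t shifted_naive(std::uint8_t a)
{
    return static_cast<std::uint8_t>(~a >> 4);
}

// Casting the result of ~ back to uint8_t discards the promoted upper bits
// before the shift takes place.
constexpr std::uint8_t shifted_fixed(std::uint8_t a)
{
    return static_cast<std::uint8_t>(static_cast<std::uint8_t>(~a) >> 4);
}

static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");
static_assert(shifted_naive(0x80) == 0xf7, "surprising result caused by promotion");
static_assert(shifted_fixed(0x80) == 0x07, "intended result");
```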
One way to correct this problem is to modify the expression as follows;-
Example 2;-
The result d is correct. It's worth understanding why. First, a is promoted to an int maintaining its value of -1. b is also promoted to an int, resulting in a value of 0x00008000 (32768). -1 is less than 32768, hence the result true
The result e is not correct. Here, a is promoted to an int maintaining its value of -1. c is already of a type whose size is ≥ int and so is not promoted, retaining its value of 0x80000000. As a result, a is now converted to the same type as c, resulting in the unsigned value 0xffffffff. This is not smaller than 0x80000000, hence the result false
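Both comparisons can be checked at compile time. This sketch assumes the operand types implied by the explanation (a 16-bit signed a, a 16-bit unsigned b, a 32-bit unsigned c) and a 32-bit int; the namespace name is hypothetical;-

```cpp
#include <cstdint>

namespace promotion_demo {

constexpr std::int16_t  a = -1;
constexpr std::uint16_t b = 0x8000;
constexpr std::uint32_t c = 0x80000000;

static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");

// Both operands are promoted to int: -1 < 32768.
constexpr bool d = a < b;

// a is converted to unsigned (0xffffffff) to match c: 0xffffffff < 0x80000000 is false.
constexpr bool e = a < c;

static_assert(d, "signed comparison after promotion to int");
static_assert(!e, "unsigned comparison after conversion to uint32_t");

}  // namespace promotion_demo
```

Many compilers can warn about the second comparison (for example, a sign-compare warning), which is worth enabling.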
The result of an arithmetic expression between two operands is determined by the "usual arithmetic conversion" rules; the general aim being to produce a result that is as large as the largest operand type. The rules are;-
Conversion from one type to another may be performed explicitly, using several syntax forms and operators (in vague order of 'niceness' and safety);-
Form | Description |
---|---|
{…} | Construction. This only allows safe conversions |
const_cast |
This takes the form type const_cast<type>(expr). Casts away const and volatile qualifiers. This is only safe if the original object was defined as non-const and/or non-volatile (and has since acquired these attributes). The type being cast from must be a pointer, reference, or pointer-to-data-member. The type being cast to must be the same as that being cast from (except for any const or volatile qualifiers). For example;-
const int* a{};
int* b = const_cast<int*>(a);
const_cast imposes no run-time overhead |
static_cast |
This takes the form type static_cast<type>(expr). Converts between related types such as one pointer type to another in the same class hierarchy, an integral type to an enumeration, or a floating-point type to an integral type. It also does conversions defined by constructors (§16.2.6, §18.3.3, §iso.5.2.9) and conversion operators (§18.4). For example;-
void* my_allocator(size_t sz);
int* p = static_cast<int*>(my_allocator(100));
For static_cast to work, there must be an implicit conversion available to convert from expr to type (in which case static_cast isn't actually needed), or from type to the type of the expr. static_cast may be used to add const or volatile to a type, but not remove them, eg, assuming X a{};, const X* b = static_cast<const X*>(&a);. static_cast performs no run-time check when downcasting within a polymorphic class hierarchy, and it cannot cast from a virtual base to a derived class; a dynamic_cast is needed in these cases. The address of the destination type may differ from that of the source type. For example, if casting to one of the bases of a multiply-inherited derived type, then clearly not all the bases can reside at the same address (unless they are empty). As a result, static_cast may impose a (typically very small) run-time overhead for some conversions |
reinterpret_cast |
This takes the form type reinterpret_cast<type>(expr). Changes the meaning of a bit pattern (does not change the actual data). Handles conversion between unrelated types such as a pointer to an integer. For example;-
char x[] = "1234";
int* y = static_cast<int*>(x); // Error: No implicit char* to int* conversion
int* y = reinterpret_cast<int*>(x); // Ok: Hope you know what you're doing!
An example of an "acceptable" use of reinterpret_cast is in mapping a literal representing (say) a hardware register to a usable type;-
uint32_t* reg = reinterpret_cast<uint32_t*>(0x10004120);
The following conversions are possible;-
reinterpret_cast is a dangerous (though sometimes necessary) operation. It provides no safeguards. Apart from the obvious issue of possibly casting to a type that yields nonsense results, some other issues that can arise are;-
Generally, it is best to consider the only guaranteed safe use of reinterpret_cast to be that of casting a pointer or reference back to its original type in the event that its type has become modified for some reason. reinterpret_cast imposes no run-time overhead, except (possibly) when casting between integrals and pointers, or (on some unconventional platforms that use different pointer representations, depending on type) conversion between pointer types |
dynamic_cast | Dynamically checked (at run-time) conversion of pointers and references within a class hierarchy. See dynamic_cast |
(type)value | 'C'-style cast. This uses a combination of const_cast, static_cast and reinterpret_cast to perform whatever cast is specified. As a result it is very dangerous; virtually any cast can be performed and with virtually no safeguards |
type(value) | Function-style cast. Note that for a built-in or "plain" enumeration type T, T(e) is interpreted in the same way as (T)e (ie, a 'C'-style cast), with all the dangers that brings with it |
The address of an object of a derived class may differ depending on whether it is being referred to by its derived type or its base type. Therefore, conversion between the two pointer types may involve (at run-time) the adjustment of an address by some offset;-
This example is just one reason why one should never make assumptions about the memory layout of objects
Think twice before using any cast. Casts are almost always avoidable, and if they are not, confine them to small, well-defined areas; consider providing a function specially for the purpose to isolate the operation and avoid the need to scatter cast operations throughout the application code
Don't cast-away const. Especially don't define a const member function that actually modifies member data by casting-away const from this or from such members. There is always a better way. See also mutable
One possible exception to this rule is if passing const data by reference/pointer to a legacy function that you have no control over and that is known not to modify the argument but which fails to specify it as const. There are many examples of such functions in the standard 'C' library
Casts, whether explicit or implicit, are usually performed at run-time and often (but not always) involve a call to a non-default constructor for the type being cast-to. That constructor will create a new object. This can lead to all sorts of problems;-
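A minimal sketch of one such problem, assuming a type X derived from Y and an object a of type X. The call counter and the function names are illustrative additions so that the effect can be observed;-

```cpp
struct Y
{
    int calls = 0;
    void fn() { ++calls; }
};

struct X : Y {};

// Casting to the value type Y copy-constructs a temporary Y from 'a';
// fn() runs on that temporary, which is destroyed at the end of the statement.
int calls_after_value_cast()
{
    X a;
    static_cast<Y>(a).fn();
    return a.calls;            // 'a' itself was never touched
}

// Casting to a reference type creates no new object; fn() runs on 'a'.
int calls_after_reference_cast()
{
    X a;
    static_cast<Y&>(a).fn();
    return a.calls;
}
```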
The static_cast will invoke the copy constructor for Y and return a newly constructed object. Therefore, the call to fn() will be invoked on a copy of 'a' and not on 'a' itself. When fn() returns, the Y object shall be destroyed. Why you would want to do this at all is another issue entirely!
Templates provide a means of avoiding casts altogether
reinterpret_cast is often non-portable, usually because of differences in type sizes
'C'-style casts and function-style casts are both rather dangerous and best avoided. There are no scenarios where they MUST be used
Here is an alternative explicit conversion function that handles possible narrowing (loss of data) of scalar types;-
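A minimal sketch of such a function, modelled on the well-known narrow_cast idea. The exception type and message are assumptions; the round-trip test is the essential part;-

```cpp
#include <stdexcept>

template <typename Target, typename Source>
Target narrow_cast(Source v)
{
    auto r = static_cast<Target>(v);       // convert to the target type
    if (static_cast<Source>(r) != v)       // convert back and compare with the original
        throw std::runtime_error("narrow_cast<>() failed");
    return r;
}
```

For example, narrow_cast<char>(65) succeeds, whereas narrow_cast<unsigned char>(300) throws because the value does not survive the round trip. This simple test is not guaranteed to detect a change of sign between signed and unsigned types of the same size.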
Note that this is more likely to throw with floating-point types because of rounding errors. In that case, a range test rather than a hard != is probably a better test. This can be achieved with operator overloading or traits. The standard library round() function is also available
The dynamic_cast operator takes the form type dynamic_cast<type>(expr) and provides a dynamically checked (at run-time) conversion of pointers and references within a class hierarchy. It is useful when it is not possible to determine the correct cast at compile-time. For example, this;-
…gives the following hierarchy;-
Given a pointer pz that refers to the base Z, we can derive a pointer to the base Y;-
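A minimal sketch of this cross-cast, assuming X derives publicly from the two polymorphic bases Y and Z. The function name to_y is illustrative only;-

```cpp
struct Y { virtual ~Y() = default; };
struct Z { virtual ~Z() = default; };
struct X : Y, Z {};

// Cross-cast from one base sub-object to a sibling base, via the complete object.
// Yields nullptr if *pz is not part of an object that also has a Y base.
Y* to_y(Z* pz)
{
    return dynamic_cast<Y*>(pz);
}
```

A static_cast cannot perform this conversion because Y and Z are unrelated types; only the run-time type of the complete object connects them.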
The above example shows a very simple hierarchy but it may be arbitrarily complex
Converting Pointers
Converting References
General
Avoid getting into a situation where you need to downcast or crosscast a non-polymorphic type. There is no guaranteed type-safe way of doing this
This is a complete list of operators, in decreasing order of precedence and with each block containing operators of the same precedence. The following definitions are used;-
In addition to the above, a user-defined type may also define literal operators
The Overload Impl. Type column indicates the prototype/signature of the operator function when implemented for a user-defined type. For further details, see Overloading Operators
For unary +expr and -expr operators, integral promotion is first applied to expr and then the operator is applied. The result is the type after promotion
With a typical implementation of a signed integer as a 2's complement representation, applying unary minus to std::numeric_limits<T>::min() will yield undefined behaviour owing to integer overflow
Unary plus may be applied to other built-in types such as arrays and functions
For binary expr + expr operator, both expr must be arithmetic types, or one must be a pointer type (not void*) and the other an integral type
For binary expr - expr operator, both expr must be arithmetic types, or the first must be a pointer type (not void*) and the second an integral type, or both expressions must be pointers of the same type (ignoring any modifiers) but not void*. The result of such an operation only makes sense if the two pointers refer to objects within the same array (or one element past the end of the array)
If both expr are arithmetic types then Usual Arithmetic Conversion rules are applied to each before performing the operation
For an unsigned integral type, arithmetic underflow/overflow shall result in the value wrapping-round in the normal binary fashion
For other types (in particular, signed integral types), arithmetic underflow/overflow is undefined behaviour. See also int vs unsigned int
For binary expr << expr and expr >> expr operators, both expr must be integral types. Integral promotion is applied to each before performing the operation. The result of the operation is that of the first expr after promotion
The second expr must be in the range zero to the size (in bits) of the first expr (after promotion) minus 1. Otherwise the operation is undefined
If the first expr is unsigned or positive signed then zero bits are shifted in to the ms (for >>) / ls (for <<) bit as required
If the first expr is signed then the result of << is undefined if (after promotion) the result cannot be fully represented (ie, if it would result in bits being 'lost')
If the first expr is signed and negative then the result of >> is implementation defined. With a typical implementation of a signed integer as a 2's complement representation, the result is very likely to involve extending the (ms) sign bit, maintaining a negative value (ie, an "arithmetic" shift is performed)
Some precedence examples;-
Example | Description |
---|---|
a + b * c | Means a + (b * c) because * has a higher precedence than + |
a = b = c | Means a = (b = c) |
a + b + c | Means (a + b) + c |
if (x & mask == 0) {…} | Means x & (mask == 0). Take care! |
if (0 <= x <= 42) {…} | Means (0 <= x) <= 42. This is interpreted as follows; 0 <= x yields a bool of true or false. This is implicitly converted to an int yielding 0 or 1. This is then compared with 42 which will always yield true |
a+++ 1 | Means (a++) + 1 |
Explicit use of bitwise operators such as &, |, ^ etc can sometimes be avoided by using a bitfield
The bitwise operators &, |, ^, etc can be used for logical 'set' manipulation. However, consider the higher-level standard library types set and bitset instead
Operator overloading allows conventional notation to be used to manipulate an object of a user-defined type. For example, given two objects of a user-defined class; a and b, it may be useful to check for equality a == b, or to be able to add a + b (whatever 'add' means in the context of the type)
The Overload Impl. Type column in the table of operators indicates the prototype/signature of the operator function when implemented for a user-defined type. Most operators follow the same few function signature/prototype patterns and these are indicated below. For any operators that deviate from the standard patterns, their specific function signature/prototype is indicated specifically in the table and are described in more detail in the following sections
The meaning of Overload Impl. Type is as follows
Note: The arguments and return types are flexible for most operators; in all cases, lhs and rhs can be any type, and is commonly another X type. In the case of a binary operator, rtn-type is often another X but may be a different type. For example, a != operator would probably return a bool;-
Overload Impl. Type | Description |
---|---|
- | Operator may not be overloaded in user-defined type |
Prefix Unary |
Prefix unary operator. This may be defined as either a non-static member function taking no arguments or as a non-member function taking one argument. The operator acts directly on and modifies the supplied object. An operator @ is defined with any of the following;-
X& X::operator@()
X& operator@(X& lhs)
Generally, a prefix unary operator member function should return a reference to *this, and a non-member function should return a reference to lhs |
Postfix Unary |
Postfix unary operator. This may be defined as either a non-static member function taking no arguments or as a non-member function taking one argument. The operator does not modify the supplied object but (typically) creates a copy of it and operates on (and returns) the copy. An operator @ is defined with any of the following;-
rtn-type X::operator@() const
rtn-type operator@(const X& lhs)
|
Binary |
Binary operator. This may be defined as either a non-static member function taking one argument or as a non-member function taking two arguments. An operator @ is defined with any of the following;-
rtn-type X::operator@(const rhs) const
rtn-type operator@(X lhs, const rhs)
rtn-type operator@(const lhs, X rhs)
An Arithmetic Binary operator member function often returns a modified copy of *this, and a non-member function often returns a modified copy of the X argument, though in both cases it may be appropriate to return some other type. A conventional Bitwise Binary operation on an integral type returns a similar integral type. For a user-defined type, the return type often follows that of an Arithmetic Binary operator. A Logical (Comparison) Binary operation is usually best defined as non-member function(s) so that the argument types may be specified either way. The operation usually returns a bool value |
Binary Assignment |
Binary Assignment operator. This may be defined as either a non-static member function taking one argument or as a non-member function taking two arguments. An operator @ is defined with any of the following;-
X& X::operator@(const rhs)
X& operator@(X& lhs, const rhs)
Generally, unless there is a good reason not to, a binary assignment operator member function should return a reference to *this, and a non-member function should return a reference to lhs. This allows chaining; a = b = c. An assignment operator that behaves differently will not operate in a 'normal' way and may cause problems within expressions |
Following is an example of one of each of the four main operator formats; operator '++' (prefix Unary), operator '!' (postfix Unary), operator '+' (Binary) and operator '=' (Binary Assignment);-
…or;-
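A minimal sketch of both forms follows. The types X and P and their members are illustrative only. Because operator = may only be a member function, the non-member form uses += as its Binary Assignment example;-

```cpp
// Form 1: member functions
class X
{
public:
    explicit X(int v = 0) : v_{v} {}
    int value() const { return v_; }

    X&   operator++()                  { ++v_; return *this; }          // Prefix Unary
    bool operator!() const             { return v_ == 0; }              // Postfix Unary
    X    operator+(const X& rhs) const { return X{v_ + rhs.v_}; }       // Binary
    X&   operator=(const X& rhs)       { v_ = rhs.v_; return *this; }   // Binary Assignment
private:
    int v_;
};

// Form 2: non-member functions, acting on a simple aggregate
struct P { int v; };

P&   operator++(P& lhs)                    { ++lhs.v; return lhs; }         // Prefix Unary
bool operator!(const P& lhs)               { return lhs.v == 0; }           // Postfix Unary
P    operator+(const P& lhs, const P& rhs) { return P{lhs.v + rhs.v}; }     // Binary
P&   operator+=(P& lhs, const P& rhs)      { lhs.v += rhs.v; return lhs; }  // Binary Assignment
```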
It is common for non-member operator functions to be defined as friends; here is an example of operators for X + int and int + X, with the latter defined as an inline friend function (see also an alternative approach below);-
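A minimal sketch of this arrangement, assuming X simply wraps an int. The member names are illustrative only;-

```cpp
class X
{
public:
    explicit X(int v) : v_{v} {}
    int value() const { return v_; }

    // X + int, as a member function
    X operator+(int rhs) const { return X{v_ + rhs}; }

    // int + X, as an inline friend (a non-member function defined inside the class),
    // implemented in terms of X + int
    friend X operator+(int lhs, const X& rhs) { return rhs + lhs; }

private:
    int v_;
};
```

With these definitions both a + 1 and 1 + a are valid for an object a of type X.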
For example, consider a binary operator @, an object x of type X, and an object y of type Y. The operation x @ y would be resolved thus;-
Following all these lookups, normal overload resolution will select the most appropriate version (if any) of operator@
User-defined conversions are considered during the search and subsequent overload resolution
Most operators that can be overloaded follow the same few patterns and can be done so with a member function or a non-member function. A few may only be member functions. This is reflected in the function signature/prototype of each operator in the table and is summarised as follows
The following operators cannot be defined as non-member functions;-
If an operator is to accept a built-in type as its first argument, then it cannot be defined as a member-function; it must be defined as a non-member function. This is because the compiler does not know what the operator means for the type and cannot assume it is commutative (if it could then it could silently swap the arguments around and call a member-function operator)
In practice, it is usually a good idea to define operators so they are commutative; ie, a + b should give the same result as b + a even if a and b are different types
Note that this example defines the 'int + X' operator in terms of 'X + int'. This only works if the operations are intended to be commutative. If they are not then distinct definitions would be required
An alternate solution is to define appropriate conversion constructors or type conversion operators
Passing arguments to an operator function;-
Returning a value from an operator function;-
It is acceptable (and common) to return a result as a reference as long as the reference is to one of the objects passed in as an argument. In the case of an assignment operator, this is the usual case. For example;-
Consider returning const values from operator functions, eg, const X operator+(const X& lhs, const X& rhs);. This makes incorrect use such as (a + b) = c; illegal; a syntax more commonly written (by mistake) as if (a + b = c). Note that this const behaviour is in line with that of built-in types
Returning a const value from an operator function can prevent the compiler from optimising and employing a move operation in assigning the returned value. See also Return Value Mechanics
The operators = (assignment), & (address-of) and , (sequencing) have predefined meaning when applied to class objects. These meanings can be changed by redefining the operators or revoked by deleting the operator member functions;-
Non-member functions are often preferable to member functions. In this spirit, it is best to limit member operator functions to those that inherently act directly on and modify a type. For example;-
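A minimal sketch of this split, assuming X wraps an int. The modifying operator += is a member and the non-modifying operator + is a non-member;-

```cpp
class X
{
public:
    explicit X(int v) : v_{v} {}
    int value() const { return v_; }

    // += acts directly on and modifies the object; return-by-reference
    X& operator+=(const X& rhs) { v_ += rhs.v_; return *this; }

private:
    int v_;
};

// + yields a new object. lhs is taken by value, so the copy is made as part of
// argument passing, and the result is returned by value.
X operator+(X lhs, const X& rhs)
{
    lhs += rhs;
    return lhs;
}
```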
Note that + requires the creation of a new object which is created as part of the argument passing and requires a return-by-value result, whereas += does not (hence, return-by-reference)
This example also demonstrates defining one (non-member) operator in terms of another (member) operator
Wherever possible, maintain the normal meaning of operators. For example, + should not (say) perform a square-root operation or something completely unrelated
One obvious exception to this is the way the standard I/O library overloads the operators << and >> to do something completely different. Even so, these still 'sort-of' keep the traditional meaning in an abstract way
Also, maintain the relationships between operators. For example, a = a + 1 conventionally means the same as a += 1 or a = a - -1 or a -= -1, but these are all different operators; = (assignment), + (addition), - (subtraction), += (add and assign), -= (subtract and assign) and -expr (unary minus)
It is usually a good idea to define operators so they are commutative; ie, a + b should give the same result as b + a even if a and b are different types
Operators should be carefully considered and defined as a whole to avoid any discrepancies between them
There are a number of operators that are considered "special" in that they do not follow the normal pattern of arguments and/or return values or their use is not standard. These are;-
All the above are described in the following sections
The following operators are also "special" but are described elsewhere;-
The subscripting operator [] allows a type (or more specifically, the member(s) of a type) to be indexed like an array. The argument is the index and may be of any type, thus allowing associative arrays to be constructed. For example;-
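A minimal sketch of an associative array indexed by a string, assuming a hypothetical WordCount type built on std::map;-

```cpp
#include <map>
#include <string>

class WordCount
{
public:
    // Returns a reference so the element can be read or assigned through [].
    // A key that does not yet exist is created with a count of zero.
    int& operator[](const std::string& word) { return counts_[word]; }

private:
    std::map<std::string, int> counts_;
};
```

An object wc of this type can then be used as wc["apple"] = 2; or ++wc["apple"];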
The function call operator () (also known as the Application Operator) allows a type to be directly used as a function name. That is, an object of the type can act like a function; a Function Object. For example;-
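A minimal sketch of a function object, assuming a hypothetical AddN type that adds a fixed amount to its argument;-

```cpp
class AddN
{
public:
    explicit AddN(int n) : n_{n} {}

    // Invoked when an AddN object is "called" like a function
    int operator()(int x) const { return x + n_; }

private:
    int n_;   // state carried between calls
};
```

Given AddN add5{5};, the expression add5(10) calls operator() and yields 15. Unlike a plain function, the object carries its own state.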
The dereferencing operator -> supports the important concept of indirection. In practice, this usually means returning a pointer to some member object or the internal pointer to some external resource. For example;-
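A minimal sketch, assuming a hypothetical Handle type that wraps a pointer to a Resource it does not own;-

```cpp
struct Resource { int value = 42; };

class Handle
{
public:
    explicit Handle(Resource* r) : r_{r} {}

    Resource* operator->() const { return r_; }    // yields the internal pointer
    Resource& operator*()  const { return *r_; }   // kept consistent with ->

private:
    Resource* r_;   // not owned in this sketch
};

int read_value()
{
    Resource res;
    Handle a{&res};
    return a->value;   // 'a' is an object, yet -> may be applied to it
}
```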
Note that the variable a is not a pointer, which is a deviation from the standard usage of ->
As usual, it is often best to maintain consistency between operators. In this respect, operators ->, * and [] are closely linked and it may be desirable to ensure the following is true;-
The increment/decrement operators are specified with ++ and -- respectively. They are unique in that they may be used in both prefix and postfix operations
Consider not providing postfix increment/decrement for a type. They are not as efficient as the prefix operators and are less frequently used anyway
It is, of course, possible to take the address of a class member in the normal way;-
However the above example is NOT what a pointer-to-member is
A pointer-to-member is best thought of as a named offset into a class (type) rather than as a normal pointer with a specific absolute address. The operators .* and ->* are used in this context to dereference such a pointer
Despite being thought of as a named offset, a pointer-to-member is most useful when the user of the pointer does not (cannot) know the exact member being referred to
Here is an example that uses a base class with a number of pure virtual functions defined (possibly the most common structure for such cases);-
Using the above definitions, this is how we might use pointers-to-data-members;-
…and this is how we might use pointers-to-function-members;-
The above examples are very simple. When they get more complex, it is easy to get the syntax wrong. The standard function std::invoke may be used as an alternative to the above syntax. Expanding on the previous example;-
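A minimal sketch covering all three uses, assuming a hypothetical base class Shape with two data members and two pure virtual functions. None of these names come from the original text;-

```cpp
#include <functional>   // std::invoke (C++17)

struct Shape
{
    int width  = 0;
    int height = 0;
    virtual int area() const = 0;
    virtual int perimeter() const = 0;
    virtual ~Shape() = default;
};

struct Rect : Shape
{
    Rect(int w, int h) { width = w; height = h; }
    int area() const override      { return width * height; }
    int perimeter() const override { return 2 * (width + height); }
};

// The caller chooses the member; these functions do not know which one it is.
int read_field(const Shape& s, int Shape::* field)
{
    return s.*field;                  // pointer-to-data-member with .*
}

int call_query(const Shape* s, int (Shape::* query)() const)
{
    return (s->*query)();             // pointer-to-function-member with ->*
}

int call_query_invoke(const Shape& s, int (Shape::* query)() const)
{
    return std::invoke(query, s);     // same call without the .* / ->* syntax
}
```

For a Rect r{2, 3}, read_field(r, &Shape::height) reads the height, and call_query(&r, &Shape::area) dispatches virtually to Rect::area. std::invoke also accepts a pointer-to-data-member, eg, std::invoke(&Shape::width, r).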
A pointer-to-member may not be initialised with a reference to a member that does not exist in that class and only exists in a derived class. For example;-
This is the rule of contravariance and ensures that a pointer only points to an object that can support what is implied by the pointer type
These two binary operators are of the form expr1 && expr2 and expr1 || expr2 respectively
Don't overload the operators && or ||; doing so will result in trouble at some point
This operator is of the form expr1 , expr2
The , (comma) operator has the lowest precedence of any operator. Consider;-
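A minimal sketch of the two groupings, assuming int variables a, b and c. The function names are illustrative only;-

```cpp
int assign_without_parens(int a, int b)
{
    int c = 0;
    c = a, b;       // parsed as (c = a), b; the value of b is discarded
    return c;
}

int assign_with_parens(int a, int b)
{
    int c = 0;
    c = (a, b);     // a is evaluated and discarded; c receives b
    return c;
}
```

A compiler may warn that one operand of the comma has no effect, which is a useful hint that the expression is not doing what was intended.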
The reason c == a and not c == b is because the precedence of operator = is higher than operator ,; c = a, b is the equivalent of (c = a), b. To achieve c == b, the expression would have to be written as c = (a, b)
Don't overload operator , (comma); doing so will result in trouble at some point
Notwithstanding the previous warning about not overloading , (comma), here is an example of doing exactly that. This method allows the subscripting [] operator to (at least give the impression of) taking multiple indices;-
A user-defined type may define conversion operators with the notation operator T(). Such operators convert from the user-defined type to the type T
Here is an example of a class X implementing an assignment operator that takes an int and a complementary conversion operator to convert an object of type X to an int;-
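A minimal sketch of such a class, assuming it simply stores the int in a member m_;-

```cpp
class X
{
public:
    X& operator=(int v)  { m_ = v; return *this; }   // int to X
    operator int() const { return m_; }              // X to int

private:
    int m_ = 0;
};
```

With this in place, X x; x = 42; int i = x; leaves i holding 42.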
A conversion operator may be declared explicit like a constructor. For example, here we define an operator that returns a bool to indicate whether the member m is set (to a non-zero value) or not;-
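A minimal sketch, assuming X holds an int member and a variable a of type X. The function classify is illustrative only;-

```cpp
class X
{
public:
    explicit X(int m) : m_{m} {}
    explicit operator bool() const { return m_ != 0; }

private:
    int m_;
};

int classify(const X& a)
{
    // int b = a;    // error: an explicit conversion is not applied implicitly
    if (a)           // ok: a condition is contextually converted to bool
        return 1;
    return 0;
}
```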
In this example, had operator bool not been declared explicit then the assignment to b would have succeeded and set b to a value of 1. This would almost certainly not be what was wanted
Expanding on this example, consider what would happen if we introduced another type Y that also implemented operator bool;-
If the two operator bool functions were not declared explicit then the test if (d == e) would compile and run, but on the face of it, this statement seems to be comparing two completely different types for equality, which is not allowed. By making the operator bool functions explicit, if (d == e) would not compile which is the safer option. Indeed, this approach is often called the Safe bool Idiom
It is not possible to define a template conversion operator such as;-
This does not work because T cannot be deduced as it does not feature as a function argument
Literal operators allow new literal notations to be defined for built-in and user-defined types so that a user-defined suffix applied to a "naked" literal value such as 123 or "Hello" or 1.23 can be interpreted specially
A literal operator function is defined with the notation operator"" _U, where U is the "unit" suffix. This does not fit into the normal pattern of operators as there isn't (by default) a "" operator
Unlike most other operators, a literal operator is very restricted in the types of arguments it may take. It must take one of the following forms;-
For an operator "unit" of blob, here are example uses;-
Integer literal operators may still be used with the existing radix notation. Expanding on the above example;-
Similarly, character and string literal operators may still be used with the existing character encoding notation. Expanding on the above example;-
Here is a more complete example;-
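A minimal sketch, assuming a hypothetical Metres type with the suffixes _m and _km. Both the type and the suffixes are illustrative only;-

```cpp
struct Metres { long double value; };

// Floating-point literals: 1.5_km, 2.0_m
constexpr Metres operator"" _m(long double v)  { return Metres{v}; }
constexpr Metres operator"" _km(long double v) { return Metres{v * 1000.0L}; }

// Integer literals: 250_m
constexpr Metres operator"" _m(unsigned long long v)
{
    return Metres{static_cast<long double>(v)};
}
```

The integer and floating-point forms are separate overloads; 250_m selects the unsigned long long version and 1.5_km selects the long double version.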
A template literal operator is one that takes its arguments as a variadic template parameter pack rather than as function arguments. For example;-
The implementation of a template literal operator typically needs to step through each character in-turn and construct a value. In order to do this, some helper functions are useful. The following example implementation of operator"" _bcd converts a string of digits into a BCD number;-
Note that this example checks that the first digit of the literal is not zero; if it is then that indicates a non-decimal (ie, binary 0b, octal 0n, or hex 0x) value (which makes no sense in this context)
A number of standard literal operators/suffixes are defined in the namespace std::literals;-
Literal | Description | Inline Namespace |
---|---|---|
operator""if operator""i operator""il | std::complex complex number specifiers | std::literals::complex_literals |
operator""h operator""min operator""s operator""ms operator""us operator""ns | std::chrono::duration specifiers | std::literals::chrono_literals |
operator""s | Converts a character array literal to a std::basic_string literal | std::literals::string_literals |
operator""sv | Creates a string view of a character array literal | std::literals::string_view_literals |
The operators new and delete allocate and de-allocate objects on the free store (heap) memory. They also have counterparts, new[] and delete[] for allocating/de-allocating arrays though the existence of the former is not obvious from the syntax. These operators are used as follows;-
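A minimal sketch of the two matching pairs, using int objects for simplicity. The function name is illustrative only;-

```cpp
int sum_via_free_store()
{
    int* p   = new int{5};            // single object
    int* arr = new int[3]{1, 2, 3};   // array of three objects

    int total = *p + arr[0] + arr[1] + arr[2];

    delete p;        // pairs with new
    delete[] arr;    // pairs with new[]
    return total;
}
```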
Using new and delete expressions in their raw form requires care; memory leaks, or incorrect application of delete on an allocated object can cause serious problems
An example of using a manager object;-
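A minimal sketch, assuming std::unique_ptr as the manager object and a hypothetical Widget type that counts its live instances so that the clean-up can be observed;-

```cpp
#include <memory>

struct Widget
{
    explicit Widget(int& live) : live_{live} { ++live_; }
    ~Widget() { --live_; }
    int& live_;   // number of Widgets currently alive
};

int live_after_scope()
{
    int live = 0;
    {
        auto w = std::make_unique<Widget>(live);   // allocates with new internally
        // ... use *w; no explicit delete is required, even if an exception is thrown
    }   // w goes out of scope here and deletes the Widget
    return live;
}
```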
Performing a new but not a subsequent delete will leak memory
Performing a delete and then later using the object at the deleted address, or performing a delete on an object not allocated by new, on an object that has already been deleted or on an invalid address are all undefined behaviour and usually disastrous
The function std::set_new_handler() may be called by an application to register a callback (ie, a new-handler) that shall be called if operator new fails to fulfil an allocation request. It returns the previously registered callback function and is declared as;-
If new fails to fulfil a memory request, it shall call the registered new-handler function. If the new-handler returns normally (ie, it does not throw an exception or terminate the program) then new shall try to perform the allocation again. The cycle repeats until the allocation succeeds or the new-handler does not return normally. This behaviour implies that the new-handler must do one or more of the following;-
Example: Set the new-handler so that all new operations that fail to allocate will terminate the application rather than throwing an exception or returning nullptr (an exception could still be thrown by a constructor);-
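A minimal sketch, assuming std::terminate is an acceptable way to end the application. The two function names are illustrative only;-

```cpp
#include <exception>   // std::terminate
#include <new>         // std::set_new_handler, std::new_handler

// Called by operator new when an allocation request cannot be satisfied.
// It never returns, so operator new does not retry.
void terminate_on_out_of_memory()
{
    std::terminate();
}

// Registers the handler and returns whichever handler was registered before.
std::new_handler install_terminating_handler()
{
    return std::set_new_handler(terminate_on_out_of_memory);
}
```

Calling install_terminating_handler() early in main() covers every subsequent allocation that goes through the standard operator new.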
The new expression and operator new
new is actually composed of what may be thought of as two functions; the 'new expression', and operator new. In general use, this distinction is not apparent, but it is useful in order to fully understand the role of set_new_handler and is critical in understanding overriding or overloading operator new;-
The new expression |
This is the function that is directly called by an expression such as X* a = new X; It will first call operator new to acquire some memory of size (eg) sizeof(X). If the memory acquisition is successful then the appropriate constructor is executed for the allocated type, within the acquired memory. If the construction fails (ie, throws an exception) then operator delete is executed to free the allocated memory and the exception is re-thrown. If the memory acquisition fails then a std::bad_alloc exception is thrown or (if the invoked operator new is a non-throwing version) nullptr is returned to the application. The new expression can also allocate groups of objects, as an array. This occurs with an expression such as X* b = new X[3]; In this case, an appropriate operator new[] is called to acquire some memory of size (eg) sizeof(X) * 3, and if successful, then it executes the constructor for each element of the array. If any constructor fails (ie, throws an exception), then the whole allocation is considered to have failed, the destructor is called for any successfully constructed elements, the appropriate operator delete[] is executed and the exception is re-thrown. It is not possible to override the new expression. It can be thought of as being aware of any overrides/overloads to operator new though and shall invoke the appropriate version |
operator new |
The sole purpose of this function is to allocate memory of the requested size. Exactly how it does this, and details such as alignment etc, is implementation-specific. operator new may be overridden; globally and/or specifically for a type. If it is, it should follow these conventions;-
|
operator new[] |
This can be, and often is, identical to (or simply calls) operator new It can be different to operator new though, should the application need to handle array allocation differently to single object allocation |
The delete expression
As with new, delete is also composed of what may be thought of as two distinct functions;-
The delete expression |
This is the function that is directly called by an expression such as delete a; It will first call the destructor for the object. This is why it is undefined behaviour to pass a void* to the delete expression. It will then call operator delete to free the memory previously allocated by operator new. This is one reason why a destructor should never throw an exception; doing so will almost certainly cause a memory leak. The delete expression can also de-allocate arrays of objects. This is performed by using the form delete[] b. First, the destructor is called for each element of the array, followed by a call to an appropriate version of operator delete[]. It is not possible to override the delete expression. It can be thought of as being aware of any overrides/overloads to operator delete though and shall usually invoke the appropriate version |
operator delete |
This function is expected to free the memory previously allocated by operator new. operator delete may be overridden and should be done so to match any override of operator new. If it is overridden, it should follow these conventions;-
|
operator delete[] |
This can be, and often is, identical to (or simply calls) operator delete, though like operator new[], it may do something else |
Some reasons why one may wish to replace operator new and operator delete operators include;-
Standard global operator new and delete prototypes
The standard global function prototypes are as follows. For each version of operator new, there is also an operator new[] (though it is not shown here for brevity). Similarly for operator delete and operator delete[];-
void* operator new(size_t size); N1
void* operator new(size_t size, const std::nothrow_t& tag) noexcept; N2
Invoked as auto a = new X; and auto a = new(std::nothrow) X; respectively
The array-allocation version is invoked as auto a = new X[3]; and auto a = new(std::nothrow) X[3];
The standard operator new throws the exception std::bad_alloc if it fails
A user-defined version may be assumed to do the same. The optional const std::nothrow_t& argument may be specified to indicate a non-throwing version (N2). The default implementation of N2 calls N1, catches any exceptions and returns nullptr on a failure
The default alignment provided by these functions is specified by __STDCPP_DEFAULT_NEW_ALIGNMENT__ and is suitable for allocating any fundamental type
The default implementation of the operator new[] versions of these prototypes simply call the operator new versions
If multiple versions are defined, call preference (in descending order) is MN1 and N1
void operator delete(void* ptr) noexcept; D1
void operator delete(void* ptr, const std::nothrow_t& tag) noexcept; D2
Invoked as delete a;
The standard operator delete. The default implementation deletes memory allocated by N1 and N2
The standard implementation of the non-throwing version behaves the same as the throwing version. It is invoked by the new expression if the standard non-throwing operator new (N2) is successfully called followed by the object's constructor throwing an exception
If multiple versions are defined, call preference (in descending order) is D3, D6 and D1
void operator delete(void* ptr, std::size_t size) noexcept; D3
Invoked as delete a;
The standard implementation behaves the same as D1
If multiple versions are defined, call preference (in descending order) is D3, D6 and D1
It is implementation-defined whether D1 or D3 is called when deleting objects of incomplete type and arrays of non-class and trivially-destructible class types
void* operator new(size_t size, std::align_val_t align); N3
void* operator new(size_t size, std::align_val_t align, const std::nothrow_t& tag) noexcept; N4
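Invoked automatically as auto a = new X; and auto a = new(std::nothrow) X; respectively when X is over-aligned (its alignment exceeds __STDCPP_DEFAULT_NEW_ALIGNMENT__), or explicitly as auto a = new(std::align_val_t{alignof(X)}) X; and auto a = new(std::align_val_t{alignof(X)}, std::nothrow) X;. Note that a bare alignof(X) argument has type std::size_t and does not match std::align_val_t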
As N1 and N2 but adds an alignment argument; useful if a memory alignment other than __STDCPP_DEFAULT_NEW_ALIGNMENT__ is required
If multiple versions are defined, call preference (in descending order) is MN1 and N3, or MN2 and N4 respectively
void operator delete(void* ptr, std::align_val_t align) noexcept; D4
void operator delete(void* ptr, std::align_val_t align, const std::nothrow_t& tag) noexcept; D5
Invoked as delete a;
Deletes memory allocated by N3 and N4
void operator delete(void* ptr, std::size_t size, std::align_val_t align) noexcept; D6
Invoked as delete a;
If multiple versions are defined, call preference (in descending order) is D3, D6 and D1
void* operator new(size_t size, void* ptr) noexcept; N5
Invoked as auto a = new(p) X;
The 'standard' placement operator new. The default version of this function simply returns ptr unmodified
void operator delete(void* ptr, void* place) noexcept; D7
Cannot be called from user code via the delete expression
Deletes memory allocated by N5
The default implementation does nothing
void* operator new(size_t size, placement-args); N6
Invoked as auto a = new(placement-args) X;
All custom placement operator new functions follow this prototype format
If multiple versions are defined, call preference (in descending order) is MN4, MN3, N7 and N6
void* operator new(size_t size, std::align_val_t align, placement-args); N7
Invoked as auto a = new(placement-args) X; when X is over-aligned, or explicitly as auto a = new(std::align_val_t{alignof(X)}, placement-args) X;
An extension of N6 that includes an alignment argument
If multiple versions are defined, call preference (in descending order) is MN4, MN3, N7 and N6
void operator delete(void* ptr, placement-args) noexcept; D8
Cannot be called from user code via the delete expression
Called by the new expression if N6, N7, MN3 or MN4 is successfully called followed by the object's constructor throwing an exception
If multiple versions are defined, call preference (in descending order) is MD6 and D8
If neither D8 nor MD6 is defined then no de-allocation will take place for the object
Overloading global operator new and delete
'Non-throwing' operator new and delete
Standard type-specific operator new and delete prototypes
It is also possible to overload operator new and operator delete for a specific user-defined class. For each version of operator new, there is also an operator new[]. Similarly for operator delete and operator delete[];-
static void* T::operator new(size_t size); MN1
Invoked as auto a = new X;
If multiple versions are defined call preference (in descending order) is MN1 and N1, or MN2, MN1 and N3
static void T::operator delete(void* ptr); MD1
Invoked as delete a;
Deletes memory allocated by N1 and MN1
If multiple versions are defined call preference (in descending order) is MD1 and MD2
static void T::operator delete(void* ptr, size_t size); MD2
Invoked as delete a;
Deletes memory allocated by N1 and MN1
If multiple versions are defined call preference (in descending order) is MD1 and MD2
static void* T::operator new(size_t size, std::align_val_t align); MN2
Invoked as auto a = new X; when X is over-aligned, or explicitly as auto a = new(std::align_val_t{alignof(X)}) X;
As MN1 but adds an alignment argument; useful if a memory alignment other than __STDCPP_DEFAULT_NEW_ALIGNMENT__ is required
If multiple versions are defined call preference (in descending order) is MN2 and MN1 and N3
static void T::operator delete(void* ptr, std::align_val_t align); MD4
Invoked as delete a;
Deletes memory allocated by MN2
If multiple versions are defined call preference (in descending order) is MD4 and MD5
static void T::operator delete(void* ptr, size_t size, std::align_val_t align); MD5
Invoked as delete a;
Deletes memory allocated by MN2
If multiple versions are defined call preference (in descending order) is MD4 and MD5
static void* T::operator new(size_t size, placement-args); MN3
Invoked as auto a = new(placement-args) X;
All custom type-specific placement operator new functions follow this prototype format
If multiple versions are defined call preference (in descending order) is MN3 and N6
static void* T::operator new(size_t size, std::align_val_t align, placement-args); MN4
Invoked as auto a = new(placement-args) X; when X is over-aligned, or explicitly as auto a = new(std::align_val_t{alignof(X)}, placement-args) X;
An extension of MN3 that includes an alignment argument
If multiple versions are defined call preference (in descending order) is MN4 and N7
static void T::operator delete(void* ptr, placement-args); MD6
Cannot be directly called from user code
Called by the new expression if N6 or MN3 is successfully called followed by the object's constructor throwing an exception
If multiple versions are defined call preference (in descending order) is MD6 and D8
If neither D8 nor MD6 is defined then no de-allocation will take place for the object
Overloading type-specific operator new and delete
General operator new and delete overloading considerations
Remember that names declared at class scope hide names declared at global scope (and names declared at derived-class scope hide names declared at base-class scope). Therefore, overloading new and delete within a class will hide the standard (global) versions. A way to correct this is to also declare the standard versions (if appropriate) and forward to them. Creating a base class that does just this is a convenient way of addressing the problem. For example;-
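A minimal sketch of the forwarding base class described above. The names StandardNewDeleteForms, Widget and demo() are hypothetical; the custom placement form taking an int& exists only to show that the standard forms remain usable alongside it;-

```cpp
#include <cstddef>
#include <new>

// Hypothetical mix-in: re-declares the standard forms at class scope and
// forwards each one to the corresponding global version.
struct StandardNewDeleteForms {
    // Normal form (N1 / D1)
    static void* operator new(std::size_t size) { return ::operator new(size); }
    static void operator delete(void* ptr) noexcept { ::operator delete(ptr); }

    // Non-throwing form (N2 / D2)
    static void* operator new(std::size_t size, const std::nothrow_t& tag) noexcept {
        return ::operator new(size, tag);
    }
    static void operator delete(void* ptr, const std::nothrow_t&) noexcept {
        ::operator delete(ptr);
    }

    // Standard placement form (N5 / D7)
    static void* operator new(std::size_t size, void* place) noexcept {
        return ::operator new(size, place);
    }
    static void operator delete(void* ptr, void* place) noexcept {
        ::operator delete(ptr, place);
    }
};

struct Widget : StandardNewDeleteForms {
    // Without these using-declarations the custom forms below would hide
    // every form inherited from the base class.
    using StandardNewDeleteForms::operator new;
    using StandardNewDeleteForms::operator delete;

    // Custom placement form (illustration only): counts allocations.
    static void* operator new(std::size_t size, int& counter) {
        ++counter;
        return ::operator new(size);
    }
    // Matching placement delete: only used if the constructor throws.
    static void operator delete(void* ptr, int& counter) noexcept {
        --counter;
        ::operator delete(ptr);
    }

    int value = 7;
};

int demo() {
    int counter = 0;
    Widget* a = new Widget;                 // standard form is still visible
    Widget* b = new (std::nothrow) Widget;  // so is the non-throwing form
    Widget* c = new (counter) Widget;       // custom placement form
    const int result = a->value + c->value + counter;
    delete a;
    delete b;  // safe even if b is nullptr
    delete c;  // memory came from ::operator new, so the normal delete is fine
    return result;
}
```

Here the custom form obtains its memory from the global operator new, which is why the ordinary delete expression can release it; a form that draws from its own pool would need a matching release path.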
The variety of operator new and operator delete functions can appear confusing
One way of looking at this is that the standard versions (N1, N2, MN1, D1 and MD1) are invoked by the new and delete expressions unless one of the other versions is defined
There are a couple of additional rules as indicated by the above prototype lists, but this is the general principle
Any version of operator new may be invoked (assuming it is defined) by providing the appropriate arguments to the new expression
However, the delete expression does not take any arguments (other than a pointer to the memory to delete) and so any explicit call to the delete expression will only behave correctly for non-placement allocations. That is, the delete expression will not behave correctly for allocations made with N6, N7, MN3 or MN4
The way around this is that if using one of the other versions of operator delete, it must be called directly, and not via the delete expression. This also means that the object's destructor must be explicitly called before deletion of the memory. For example;-
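A minimal sketch of the explicit destroy-then-deallocate sequence, assuming a hypothetical Pool type that merely counts live allocations;-

```cpp
#include <cstddef>
#include <new>

struct Pool {        // hypothetical bookkeeping object
    int live = 0;
};

struct X {
    int value = 1;
    ~X() { value = 0; }

    // Type-specific placement new (form MN3)
    static void* operator new(std::size_t size, Pool& pool) {
        ++pool.live;
        return ::operator new(size);
    }
    // Matching placement delete (form MD6)
    static void operator delete(void* ptr, Pool& pool) noexcept {
        --pool.live;
        ::operator delete(ptr);
    }
};

bool demo() {
    Pool pool;
    X* x = new (pool) X;             // allocation goes through the Pool form
    const bool allocated = (pool.live == 1);

    x->~X();                         // 1. call the destructor explicitly
    X::operator delete(x, pool);     // 2. call the matching operator delete directly

    return allocated && pool.live == 0;
}
```

Note that X declares no usual operator delete at class scope, so a plain delete x; would not compile here; the direct call is the only release path.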
There is also a more complete example
new and delete within a class hierarchy
If an operator delete is defined in a base class and another defined in a derived class, then as long as the base class defines a virtual destructor, the appropriate operator delete shall be called. For example;-
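A minimal sketch, using the class names X and Y and the markers 1 and 2 as assumptions. The global last_delete is a hypothetical aid that records which operator delete was invoked;-

```cpp
#include <cstddef>
#include <new>

int last_delete = 0;  // illustration only: 1 = X's version, 2 = Y's version

struct X {
    virtual ~X() = default;  // virtual destructor enables the correct lookup

    static void* operator new(std::size_t size) {          // 1
        return ::operator new(size);
    }
    static void operator delete(void* ptr) noexcept {
        last_delete = 1;
        ::operator delete(ptr);
    }
};

struct Y : X {
    int extra[4] = {};

    static void* operator new(std::size_t size) {          // 2
        return ::operator new(size);
    }
    static void operator delete(void* ptr) noexcept {
        last_delete = 2;
        ::operator delete(ptr);
    }
};

int demo() {
    X* b = new Y{};   // allocates with 2
    delete b;         // runs ~Y() then Y::operator delete, despite the X* type
    return last_delete;
}
```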
If, in the above example, Y did not define its own versions of operator new and operator delete, then the statement X* b = new Y{} would call 1
This would result in the base operator new 1 being requested to make an allocation that does not match the size of its host class X
Noting that a common reason for defining a custom operator new is to facilitate efficient memory allocation, such a 'non-standard' sized allocation could be a problem. One way of dealing with this is to defer requests of an unexpected size to the default global operator new;-
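A minimal sketch of the size check. The pooled counter is a hypothetical stand-in for a real fixed-size pool, and both branches obtain memory from the global operator new purely to keep the example short;-

```cpp
#include <cstddef>
#include <new>

struct X {
    inline static int pooled = 0;  // stand-in for a fixed-size pool

    virtual ~X() = default;

    static void* operator new(std::size_t size) {
        if (size != sizeof(X)) {
            return ::operator new(size);   // unexpected size: defer to global
        }
        ++pooled;                          // a real pool would allocate here
        return ::operator new(size);
    }

    // The sized form receives the size of the object actually being deleted.
    static void operator delete(void* ptr, std::size_t size) noexcept {
        if (size != sizeof(X)) {
            ::operator delete(ptr);        // was deferred, so defer again
            return;
        }
        --pooled;                          // a real pool would release here
        ::operator delete(ptr);
    }
};

struct Y : X {        // defines no operator new/delete of its own
    int extra[8] = {};
};

bool demo() {
    X* a = new X;                              // expected size
    const bool counted = (X::pooled == 1);
    X* b = new Y;                              // X::operator new sees sizeof(Y)
    const bool deferred = (X::pooled == 1);
    delete b;                                  // size argument is sizeof(Y)
    delete a;
    return counted && deferred && X::pooled == 0;
}
```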
A placement operator new function is one of the forms N5, N6, N7, MN3 or MN4
The general form of any placement operator new function is defined thus;-
The most common form of placement new is one that takes a pointer to some already-allocated memory;-
This particular form is defined in the standard header <new> and it is from this specific form that the term placement derives
The default version of this form simply returns the raw pointer passed to it. A custom version could do something more complex though
Another common choice of placement new is one that takes a reference to some sort of allocator object, with the operator new function passing on the task of allocation to that object and handling the return value;-
std::allocator provides a base for such an allocator type
A placement new function is called thus;-
The following example illustrates the basic use of placement new;-
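A minimal sketch using the standard pointer form from <new>. The type Point and the stack buffer are assumptions chosen for brevity;-

```cpp
#include <new>

struct Point {
    int x;
    int y;
};

int demo() {
    // Raw storage of suitable size and alignment; nothing is constructed yet.
    alignas(Point) unsigned char buffer[sizeof(Point)];

    // Constructs a Point inside buffer; no memory is allocated.
    Point* p = new (buffer) Point{3, 4};
    const int sum = p->x + p->y;

    // The storage was not obtained from operator new, so delete must not be
    // used. The destructor is called explicitly instead.
    p->~Point();
    return sum;
}
```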
Here is a more complete example. It defines an allocator object that performs the actual memory allocation. It then defines a placement operator new function that uses the allocator object;-
Using the above definitions, we could allocate memory from a specific allocator object as follows;-
Placement Delete
A placement operator delete function is one of the forms D7, D8 or MD6
It should be defined to complement the placement operator new function. However, this cannot be called via the delete expression; the freeing of memory allocated by a placement new must be handled in a special way
Because the delete expression cannot be used, the object destructor will not be called automatically. Therefore, any deleter function needs to take full responsibility for object destruction. This is normally achieved by calling the destructor explicitly; immediately before freeing the memory
Here is an example destroy() function that is suitable for deleting memory allocated with the above allocator;-
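A self-contained sketch, assuming a hypothetical Arena allocator exposing allocate() and deallocate(). The destroy() template calls the destructor and then returns the memory to the same allocator;-

```cpp
#include <cstddef>
#include <new>

class Arena {   // hypothetical allocator object
public:
    void* allocate(std::size_t size) {
        ++live_;
        return ::operator new(size);
    }
    void deallocate(void* ptr) noexcept {
        --live_;
        ::operator delete(ptr);
    }
    int live() const noexcept { return live_; }

private:
    int live_ = 0;
};

// Placement new (form N6) and its matching placement delete (form D8).
void* operator new(std::size_t size, Arena& arena) { return arena.allocate(size); }
void operator delete(void* ptr, Arena& arena) noexcept { arena.deallocate(ptr); }

// Takes full responsibility for destruction and de-allocation.
template <typename T>
void destroy(T* object, Arena& arena) noexcept {
    if (object != nullptr) {
        object->~T();               // explicit destructor call first
        arena.deallocate(object);   // then release the memory
    }
}

struct Item {
    int value;
};

bool demo() {
    Arena arena;
    Item* item = new (arena) Item{5};
    const bool allocated = (arena.live() == 1 && item->value == 5);
    destroy(item, arena);
    return allocated && arena.live() == 0;
}
```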
In the case of placement new 'appropriate' means a placement operator delete function that takes the same additional arguments as the operator new. So if the operator new function was;-
…then it shall expect an operator delete function with the prototype;-
Defined in the standard header <new> as;-
Example
Consider the following:-
The member m1 is const and so following the above, the compiler can assume that any subsequent reference to a.m1 will always have the value 42, and may perform some optimisations based on this assumption
The type X is a trivial type, so it is safe not to call its destructor. Therefore, it is legal to use placement new such as;-
This will simply create another instance of X, overwriting the original instance. But there is a problem; the compiler can assume that any subsequent reference to b->m1 will always have the value 83, and this would be correct. However, the original assumption that any reference to a.m1 yielding the value 42 is no longer (or at least, should no longer be) true
In reality, it is likely that the compiler will return 42 for a.m1 though. Essentially, the program is now broken; the const has been implicitly cast-away and the member variable modified, leading to undefined behaviour
This problem can be solved by accessing a.m1 as *std::launder(&a.m1). This effectively prevents the compiler from maintaining the assumption that it once held about the value of a.m1, and forces it to re-evaluate it
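A minimal sketch of the situation described above, following C++17 rules. As an assumption made for simplicity, the object lives on the free store and the whole object pointer is laundered, rather than the address of the member;-

```cpp
#include <new>

struct X {
    const int m1;
    int m2;
};

int demo() {
    X* a = new X{42, 0};

    // X is trivially destructible, so its storage may be reused directly.
    X* b = new (a) X{83, 0};

    // Under C++17 rules, reading a->m1 here would be undefined behaviour
    // because of the const member. std::launder yields a pointer that
    // refers to the new object.
    const int value = std::launder(a)->m1;

    delete b;
    return value;
}
```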
Example
Another case where std::launder is useful is in circumventing the rule that states that a new object cannot be accessed via a pointer to the old object if the two pointers are of different types. For example;-
Without the std::launder, the above would not guarantee a correct result
Most functions make some assumptions about the arguments passed to them and the state of any objects that the function may deal with. These preconditions may be implied (assumed) or explicitly tested for within the function. Similarly, the postconditions represent a guarantee that the function makes to its caller about the state of any values returned or any modifications it may make to any other objects
Preconditions and postconditions are important in maintaining invariants
There are several ways of dealing with preconditions;-
In reality, a combination of the above approaches may be suitable in any one situation
The standard C++ implementation supports two mechanisms for the checking of preconditions (and postconditions);-
The operation static_assert(A, "message"). This unconditionally evaluates the expression A at compile-time. If it resolves to true then no action is taken. Otherwise, the compiler shall output the specified message and the compilation shall fail. A must be a constant expression but there are few restrictions on where static_assert may be placed in the code
The message is optional (eg, static_assert(A)) and if omitted, a string is compiler-generated from the supplied expression (which is often all that is needed)
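A short sketch of both forms. The function halve() is hypothetical; the message-less form requires C++17;-

```cpp
#include <cstdint>
#include <type_traits>

// Namespace scope: checked once when the translation unit is compiled.
static_assert(sizeof(std::int32_t) == 4, "int32_t must be exactly 4 bytes");

template <typename T>
T halve(T value) {
    // Checked for each instantiation of the template.
    static_assert(std::is_integral<T>::value, "halve() requires an integral type");
    static_assert(sizeof(T) >= 2);   // C++17: message omitted
    return value / 2;
}
```

Calling halve(1.5) or halve('a') would fail to compile, with the first or second assertion reported respectively.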
The std::terminate() function is defined in the standard header <exception> and may be called explicitly from within normal code if other error handling techniques are not an option
By default, std::terminate() will call abort(). This behaviour may be changed by specifying an alternative handler function to std::set_terminate(). For example;-
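A minimal sketch; the handler name report_and_abort and the helper install_handler() are hypothetical. A terminate handler must not return, so it ends by calling std::abort();-

```cpp
#include <cstdlib>
#include <exception>
#include <iostream>

// Replacement handler: report, then end the program without returning.
[[noreturn]] void report_and_abort() {
    std::cerr << "Fatal: unrecoverable error, terminating\n";
    std::abort();
}

// Installs the handler and returns the one that was previously active.
std::terminate_handler install_handler() {
    return std::set_terminate(report_and_abort);
}
```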
The program is exited via an implicit call to std::terminate() if any of the following conditions are met;-
If a program terminates owing to an uncaught exception, it is implementation-specific as to whether destructors are called or not. This may depend on the environment; for example, if invoked from a debugger then it is probably desirable NOT to call destructors
It is not uncommon for a program (or part of a program) to be able to detect an error but have no idea how to deal with it. For example, a library function that has no concept of how it is being used
Exceptions provide a mechanism for such a program (or part of a program) to propagate the error back up the call stack in the general hope/expectation that some code, somewhere will know what to do about it; that is, the exception mechanism provides a means of getting error information from the point of detection to the point of handling
There are some key concepts that make the exception mechanism reliable/usable. These are the exception safety guarantees (which are central to effective recovery from run-time errors), and Resource Acquisition Is Initialisation (RAII). Both of these concepts rely on the specification of invariants
The construct for preparing to handle the possibility of an exception is the try block
The two basic constructs for propagating an exception and handling it are called throw and catch. Here is an example;-
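A minimal sketch using the names serious_error, do_something() and fn(). The int parameter and the returned strings are assumptions added so that the behaviour can be observed;-

```cpp
#include <stdexcept>
#include <string>

// A user-defined exception type.
struct serious_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

void do_something(int value) {
    if (value < 0) {
        throw serious_error{"negative value"};       // handled by fn()
    }
    if (value == 0) {
        throw std::invalid_argument{"zero value"};   // not handled by fn()
    }
}

std::string fn(int value) {
    try {
        do_something(value);
        return "ok";
    } catch (const serious_error& e) {
        // Only serious_error (and types derived from it) arrive here.
        return std::string{"handled: "} + e.what();
    }
}
```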
In the above example, if do_something() were to throw an exception other than serious_error (or it called some other function that threw some other exception) then the calling function fn() would not handle it. Instead, the exception would propagate further up the call stack to whatever called fn()
Type | Description |
---|---|
runtime_error | Constructor accepts a single string argument. A virtual function what() will return the string again |
out_of_range | Constructor accepts a single string argument. A virtual function what() will return the string again |
A function may be declared as not throwing any exceptions by using noexcept. For example;-
This declares a guarantee that the function will not throw or propagate any exceptions. If this guarantee is broken at run-time then the program will end immediately by calling std::terminate() (no destructors further up the call-tree shall be called)
It is possible to make noexcept conditional with noexcept(expression). The expression must be a constant expression that is convertible to bool. If it evaluates to true then the function will be declared noexcept. For example;-
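A minimal sketch of a conditionally noexcept fn(). As an assumption, the POD test is spelled as is_trivial combined with is_standard_layout, which is equivalent to std::is_pod but avoids a trait deprecated in later standards. The type Plain is hypothetical;-

```cpp
#include <type_traits>
#include <vector>

// noexcept only when T is a POD, since copying a POD cannot throw.
template <typename T>
T fn(const T& data) noexcept(std::is_trivial<T>::value &&
                             std::is_standard_layout<T>::value) {
    return data;   // makes a copy
}

struct Plain {
    int a;
    double b;
};

static_assert(noexcept(fn(Plain{})), "copying a POD cannot throw");
static_assert(!noexcept(fn(std::vector<int>{})), "copying a vector may throw");
```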
This example shall make fn() a noexcept if the data type is a POD; the rationale could be (say) that fn() will need to copy the data and copying a POD will not throw an exception, whereas copying (say) a vector<T> might
The test for noexcept is made by simply checking all the operations specified in expression and if they are all noexcept then the result is true
If the expression needs to be more complex, such as "if swap() is noexcept for this type", then use this form. This example tests the noexcept status of the version of swap() that matches the supplied types (taken from the standard library where this is common);-
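A sketch of the noexcept(noexcept(...)) form, where the outer noexcept is the specifier and the inner one is the operator. The names exchange_values() and Loud are hypothetical, and this is not the standard library's own code;-

```cpp
#include <utility>

// noexcept exactly when std::swap is noexcept for this T.
template <typename T>
void exchange_values(T& a, T& b) noexcept(noexcept(std::swap(a, b))) {
    std::swap(a, b);
}

// A type whose move operations are not declared noexcept.
struct Loud {
    Loud() = default;
    Loud(Loud&&) {}
    Loud& operator=(Loud&&) { return *this; }
};

static_assert(noexcept(exchange_values(std::declval<int&>(), std::declval<int&>())),
              "swapping ints cannot throw");
static_assert(!noexcept(exchange_values(std::declval<Loud&>(), std::declval<Loud&>())),
              "swapping Loud objects may throw");
```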
The expression may be more complex. Again, taken from the standard library;-
When is noexcept not noexcept?
Consider the following;-
The function takes an argument of type X by-value. On the face of it, the fact that it is declared noexcept is not unreasonable. However, consider what could happen if the constructor for X could throw an exception. When the function is called, a copy of b is made and it is this copy that is passed to fn(). What would happen if the constructor threw an exception at this point? The exception is not being thrown from fn() and so, technically, the noexcept condition is not violated. The question, though, is whether the noexcept is misleading
In cases like this, it may be more "in-the-spirit" to declare the function as not being noexcept
A general approach to this could be to change the function declaration to;-
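One possible form, sketched under the assumption that the concern is the copy made for the by-value argument. The member name and the return value are hypothetical;-

```cpp
#include <string>
#include <type_traits>
#include <utility>

struct X {
    std::string name;   // copying may allocate, and so may throw
};

// noexcept only if making the by-value copy of the argument cannot throw.
int fn(X b) noexcept(std::is_nothrow_copy_constructible<X>::value) {
    return static_cast<int>(b.name.size());
}

static_assert(!std::is_nothrow_copy_constructible<X>::value,
              "copying X may throw");
static_assert(!noexcept(fn(std::declval<X>())),
              "so fn() is not declared noexcept");
```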
It is possible to catch all exceptions of all types. This is a common feature in main() in order to output a debug report for any otherwise uncaught exceptions;-
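A minimal sketch; guarded_main() is a hypothetical wrapper that main() could delegate to, so that the pattern can be shown without a complete program;-

```cpp
#include <exception>
#include <iostream>

// Runs body and converts any escaping exception into a non-zero exit code.
int guarded_main(int (*body)()) {
    try {
        return body();
    } catch (const std::exception& e) {
        // Standard exceptions carry a message.
        std::cerr << "Unhandled exception: " << e.what() << '\n';
    } catch (...) {
        // Anything else: no variable is bound, but the exception can still
        // be captured for later inspection or re-throwing.
        std::exception_ptr unknown = std::current_exception();
        std::cerr << "Unhandled exception of unknown type\n";
        (void)unknown;
    }
    return 1;
}
```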
The "catch all" syntax does not allow direct assignment of the exception to a variable, but the current_exception() method is still available
There is a school of thought which says that catching any exceptions is more trouble than it's worth. That is, the program should allow any and all exceptions to make their way up to main() and terminate the program (possibly with some diagnostic output first)
The reason for this thinking is that even if an exception is caught, it is often very difficult to know what to do with it and to recover in any safe and meaningful sense from whatever error is being indicated. Often, the only exception that can be usefully handled is a new bad_alloc error, and even this can be avoided by using the non-throwing version of new and employing more traditional methods for handling the error
There are, of course, other philosophies
A caught exception may be re-thrown if it is decided it can't be dealt with after all, or if it is being caught only to allow local/partial handling. For example;-
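A minimal sketch of partial handling followed by a re-throw. The functions parse() and load(), and the error_log container, are hypothetical;-

```cpp
#include <stdexcept>
#include <string>
#include <vector>

std::vector<std::string> error_log;   // illustration only

void parse(const std::string& text) {
    if (text.empty()) {
        throw std::invalid_argument{"empty input"};
    }
}

void load(const std::string& text) {
    try {
        parse(text);
    } catch (const std::invalid_argument& e) {
        // Local handling: record the failure...
        error_log.push_back(std::string{"load failed: "} + e.what());
        // ...then pass the original exception object on to the caller.
        throw;
    }
}
```

A bare throw; re-throws the exception currently being handled, preserving its dynamic type, whereas throw e; would throw a copy sliced to the declared catch type.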
Note that this technique may indicate a flaw in the design of the function if the catch block is manually releasing resources; why is it doing so manually rather than relying on container/handler destructors? It is also potentially expensive
An exception within a thread can be transferred to some other thread by using std::current_exception() and a promise;-
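A minimal sketch; as an assumption, the message is returned as a string rather than printed, and run_in_thread() is a hypothetical name;-

```cpp
#include <exception>
#include <future>
#include <stdexcept>
#include <string>
#include <thread>

std::string run_in_thread() {
    std::promise<int> promise;
    std::future<int> future = promise.get_future();

    std::thread worker([&promise] {
        try {
            throw std::runtime_error{"Oops!"};
        } catch (...) {
            // Capture the in-flight exception and hand it to the promise.
            promise.set_exception(std::current_exception());
        }
    });

    std::string message;
    try {
        future.get();   // re-throws the stored exception in this thread
    } catch (const std::exception& e) {
        message = std::string{"Exception in thread: "} + e.what();
    }

    worker.join();
    return message;
}
```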
This example shall output Exception in thread: Oops!. This technique is used by std::packaged_task<T>
For various historical and practical reasons, there are cases where exceptions cannot be used (or it would be unwise to use them);-
In order to make best use of exceptions within a program, a solid, simple strategy needs to be put in place. Specifically, key functions or subsystems should be designed to either always succeed or fail in a controlled and well-defined way that leaves the state of the program consistent with no lost or broken resources
A function that throws an exception (or fails to catch an exception) should deal with any resource cleanup at the time. It should not rely on its caller to do it for it
When using external libraries, it may be necessary to convert from one error-handling strategy to another. For example, checking for errno after a system function call and throwing an exception if appropriate
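A minimal sketch of such a conversion, wrapping std::strtol. The wrapper parse_long() is hypothetical, and only the errno path is handled; input with no digits simply yields 0, as strtol itself does;-

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>
#include <system_error>

long parse_long(const std::string& text) {
    errno = 0;   // clear any stale error before the call
    const long value = std::strtol(text.c_str(), nullptr, 10);
    if (errno != 0) {
        // Convert the C-style error report into an exception.
        throw std::system_error{errno, std::generic_category(), "strtol failed"};
    }
    return value;
}
```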
Unless a function guarantees noexcept, assume it might throw an exception and protect against this when handling resources
Even (apparently) simple operations such as =, < and sort() may throw exceptions
The exception mechanism is intended to provide a consistent error handling method spanning multiple modules and libraries, possibly developed independently of each other. It also shifts error handling code out of the main flow of execution into specific (catch) blocks which keeps the main flow cleaner and makes the error handling more obvious and visible
The exception mechanism is intended to be used. That is, an error condition does not have to be particularly rare or particularly catastrophic in order to warrant the use of an exception. An error may be considered quite common, and/or not particularly disastrous (as is the case with many I/O operations), but the exception mechanism could still be appropriate. "Exception" should be interpreted as "something that the code was unable to do" rather than "we're all going to die!"
Most large programs will be expected to throw and catch at least some exceptions during a normal and successful run
Notwithstanding this, see also this option
Do not allow exceptions to be emitted from destructors
Only throw objects that are user-defined types specifically defined for the purpose, rather than (say) an int. This will minimise the chance of two exceptions (possibly from different libraries, for example) being confused
Do not use exceptions to perform non-error asynchronous tasks, such as (say) a key entry or I/O interrupt. There are other mechanisms to handle this sort of activity and using exceptions in the role is an abuse of the exception mechanism. It is worth noting that an implementation will typically optimise the exception handling mechanism based on the assumption that it is used only for ("out of band") error reporting
Older code may use the following syntax;-
Both these forms are now deprecated. The first has been replaced with noexcept, and the second proved unsuccessful and has been abandoned
It is possible to define an entire function body as a try block. For example;-
For most functions (such as the above example), there is very little to be gained by this syntax. However, function-try blocks are more useful in constructors
Normally, if an exception occurs within a base or member initialiser then the exception is passed-up to whatever invoked the constructor rather than the constructor itself. A function-try block allows the latter. For example;-
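A minimal sketch of a constructor function-try block; the types Member, Owner and construction_error are hypothetical. The handler cannot recover the object; it can only translate or report the exception, and one is always propagated to the caller;-

```cpp
#include <stdexcept>
#include <string>

struct Member {
    explicit Member(int size) {
        if (size < 0) {
            throw std::invalid_argument{"negative size"};
        }
    }
};

struct construction_error : std::runtime_error {
    using std::runtime_error::runtime_error;
};

class Owner {
public:
    explicit Owner(int size)
    try : member_{size} {
        // normal constructor body
    } catch (const std::invalid_argument& e) {
        // Reached if the member initialiser throws. Bases and members have
        // already been destroyed and must not be accessed here.
        throw construction_error{std::string{"Owner: "} + e.what()};
        // Reaching the end of this handler without a throw would re-throw
        // the original exception automatically.
    }

private:
    Member member_;
};
```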
A function that leaves the program in a valid state, with no resource leaks and no inconsistencies is considered exception-safe
Generally, for a class to be exception-safe, it must have an invariant. Objects that are not classes but have some relationship to each other (a relationship that is assumed at all times) must also have an invariant. If such invariants prove false (not maintained) then exception-safety will usually be compromised
Before an exception is thrown, all objects that may be affected must be placed into a valid state (a state that meets each object's invariant). Unfortunately, the state chosen, while valid, may not be the best one for the caller
A function should offer one of the following three exception-safety guarantees;-
Guarantee | Description |
---|---|
The basic guarantee | If an exception is thrown, the object (or objects) being operated on (and by extension, the whole program) will always be left in a state that meets its invariants. Any co-dependencies it shares with other objects shall remain valid and meet all the invariants for the object (though the state may have changed, possibly in unpredictable ways). No resources shall be leaked |
The strong guarantee | If an exception is thrown, the object (or objects) being operated on (and by extension, the whole program) will be left in exactly the same state it was in before the function started. It is as if the function had never been called, apart from the fact that there is now an exception working its way up the call stack! |
The nothrow guarantee | The function shall never throw an exception and shall always run to completion (and, one assumes, always leaves the object(s) being operated on in a valid state). All operations on any built-in or pointer type offer this guarantee |
Exception-Safe Construction
The nothrow guarantee is the most desirable but often impossible to provide if the function is anything other than trivial and/or deals with anything other than built-in types; many 'innocent looking' standard container operations may throw, for example
A general design technique that is often used to provide the strong guarantee is that of copy-and-swap; that is, copy the object (make a temporary), modify the copy (if an exception is thrown at this point then the original remains unchanged), and if all is well, swap the copy with the original using a non-throwing swap()
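A minimal sketch of copy-and-swap providing the strong guarantee; the class Scores and its rule that scores must be non-negative are hypothetical;-

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

class Scores {
public:
    // Either all of extra is appended, or the object is left unchanged.
    void append_all(const std::vector<int>& extra) {
        std::vector<int> copy = values_;     // 1. copy (may throw; no harm done)
        for (int value : extra) {            // 2. modify the copy only
            if (value < 0) {
                throw std::invalid_argument{"negative score"};
            }
            copy.push_back(value);
        }
        values_.swap(copy);                  // 3. commit with a non-throwing swap
    }

    std::size_t size() const noexcept { return values_.size(); }

private:
    std::vector<int> values_;
};
```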
Offering the strong guarantee rather than the basic guarantee is highly desirable and is often relatively simple as long as the function is dealing with data that it has full visibility of, and control over. If it needs to invoke other functions that may themselves throw then offering such a guarantee becomes much harder even if those functions offer the strong guarantee as well; what happens if the first function call succeeds and the second one fails and throws an exception? Is it possible to undo the effect of the first function?
Another possible obstacle to offering the strong guarantee is that there may be an unacceptable cost involved; the copy-and-swap technique is not free—by definition it involves creating a temporary object
The presence of noexcept is not an indication of a function offering the nothrow guarantee. It is an indication that if an exception is thrown then something has gone seriously wrong and the program is now in an undefined state. The nothrow guarantee is a feature of a function's implementation, not its declaration. See also unexpected() and set_unexpected()
The only time no exception safety guarantee can be offered at all is if the function relies on some other (legacy) function that is itself not exception-safe
The exception safety of a function is only as strong as that of the weakest operation it performs (assuming recovery from the effects of the weakest operation is not possible)
Whenever possible, use pointer manager objects
Although inferior to following true RAII principles, if it is absolutely necessary to use raw resources ("naked" pointers etc), then the following technique can provide exception safety;-
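One common form of such a technique is catch, clean up and re-throw, sketched here under that assumption. The functions process() and use_raw_buffer() are hypothetical;-

```cpp
#include <stdexcept>

void process(int* data, bool fail) {
    if (fail) {
        throw std::runtime_error{"processing failed"};
    }
    data[0] = 1;
}

int use_raw_buffer(bool fail) {
    int* buffer = new int[16]{};   // raw resource, owned by this function
    try {
        process(buffer, fail);
    } catch (...) {
        delete[] buffer;           // release on the failure path...
        throw;                     // ...and let the exception continue
    }
    const int result = buffer[0];
    delete[] buffer;               // release on the normal path
    return result;
}
```

The release code has to appear on both paths, which is exactly the duplication that a managing object's destructor would remove.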
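The function [[noreturn]] void unexpected() is called in the event that an exception is thrown from a function in violation of its dynamic exception specification (the older throw(...) syntax). An exception escaping a function marked as noexcept results in a call to std::terminate() instead. unexpected() and set_unexpected() were deprecated in C++11 and removed in C++17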
Two different syntax styles are available for declaring a function; the traditional 'C'-style;-
…or this (suffix return-type syntax);-
A function is defined like this ('C'-style syntax); all three forms follow the same pattern;-
…or this (suffix return-type syntax);-
All functions consist of the following components;-
A return-type. This may be void
A return-type is not specified for constructors or type conversion functions, but must be specified for all other functions
Optional [[attributes]], zero or more attributes
The following attributes may be applied to a function; [[carries_dependency]], [[deprecated]], [[maybe_unused]], [[nodiscard]], [[noreturn]]
A function may also be declared with a previously defined type. For example;-
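A minimal sketch; the alias name Fn and the global last_product (used only to make the call observable) are assumptions;-

```cpp
#include <type_traits>

using Fn = void(int, const double);   // alias for a function type

Fn fn;   // declares: void fn(int, const double)

double last_product = 0.0;   // illustration only

// The definition cannot use the alias; it is written in the normal way.
void fn(int a, const double b) {
    last_product = a * b;
}

static_assert(std::is_same<decltype(fn), void(int, double)>::value,
              "the top-level const on the parameter is not part of the type");
```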
This technique may only be applied to a function's declaration; its definition must be made in the normal way; void fn(int a, const double b) { /* ... */ }. See also Aliases and Function Pointers
In addition to the above, a member function may also be specified as;-
Here is a rather complex example using many of the above options;-
Arguments
Arguments are specified as a comma-separated list in the form;-
In a function declaration, the argument names are optional, and if specified are ignored (they do not even have to match the names used in the definition, or other declarations of the same function)
Return Type
An auto return type shall never return a reference type; in accordance with the function template type deduction rules, any reference is stripped from the initialising expression before deducing the type. Therefore, if a reference type is specified by the return statement then the result is return-by-value
This example returns an element from a supplied container; it is usual to return a reference to such an element so that it can be assigned-to by the caller. The expression container[index] will return a reference, but the function itself will return by-value;-
The way to preserve a reference return is to use the following form; following the decltype rules, this will preserve the reference in the deduced type, and will still behave correctly if an object rather than a reference is returned;-
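A minimal sketch contrasting the two deduction forms (C++14 onwards). The function names get_copy() and get_ref() are hypothetical;-

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// auto: the reference returned by container[index] is stripped.
template <typename Container>
auto get_copy(Container& container, std::size_t index) {
    return container[index];
}

// decltype(auto): the reference is preserved.
template <typename Container>
decltype(auto) get_ref(Container& container, std::size_t index) {
    return container[index];
}

bool demo() {
    std::vector<int> values{1, 2, 3};
    static_assert(std::is_same<decltype(get_copy(values, 0)), int>::value,
                  "auto returns by value");
    static_assert(std::is_same<decltype(get_ref(values, 0)), int&>::value,
                  "decltype(auto) returns a reference");

    get_ref(values, 1) = 20;   // assigns to the element in the container
    return values[1] == 20;
}
```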
See also this important warning regarding using decltype in this way
As an aside, the above can be improved by making it accept a forwarding reference for the container (so that it can also accept rvalues). This also implies (if one wants it to work correctly) the use of perfect forwarding;-
The C++11 version of the above is similar but the return type needs to be more explicit;-
Suffix return-type syntax may be used for any function, but its main use is in specifying the return type for a parameterised function template (ie, where the return type is expressed in terms of the deduced argument types). For example;-
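A minimal sketch of product(), written in the C++11 style where the return type must be spelled out;-

```cpp
#include <type_traits>

// x and y are in scope after the argument list, so the return type can be
// expressed in terms of them.
template <typename T, typename U>
auto product(const T& x, const U& y) -> decltype(x * y) {
    return x * y;
}

static_assert(std::is_same<decltype(product(2, 1.5)), double>::value,
              "int * double yields double");
```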
This doesn't work for prefix return-type syntax (where the arguments are parsed after the return type, thus preventing the return type being deduced)
This example also demonstrates how a return type can be defined in terms of some other type by using decltype; that is, the return type of product() is defined in terms of the expression x * y
A function's type is that of the argument-list and the return-type. If it is a class member, then the class name is also a component of the function type
Prior to C++17, a function's noexcept state does not form part of the function's type. Therefore, void fn() and void fn() noexcept are treated as the same type
From C++17, a function's noexcept state forms part of the function's type. Therefore, void fn() is not the same type as void fn() noexcept. This allows the compiler to more easily identify incorrect calls; ie, a noexcept function calling a non-noexcept function
Function Declaration | Type |
---|---|
int fn(int a, const bool& b) | int(int, const bool&) |
char& String::operator[](int index) | char& String::(int) |
void fn(int index); void fn(const int index); 1 | void(int) |
void fn(const int* index); void fn(const int* volatile index); 2 | void(const int*) |
void fn(int); | void(int) (C++11/14); void(int) (C++17) |
void fn(int) noexcept; | void(int) (C++11/14); void(int) noexcept (C++17) |
The examples 1 and 2 above demonstrate an important issue; that is, top level const and/or volatile qualifiers are ignored when determining a function's type
When determining a function's type, all top/first level qualifiers are discarded. Any secondary qualifiers are honoured. Therefore the following functions all have the same type of void fn(int*);-
The following shows how to determine the number of, and types of arguments of a function (and a lambda expression), and its return type, given only a pointer to the function;-
The above could be invoked as follows and the function 'fn' would be able to determine the supplied function/lambda arguments etc;-
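A minimal sketch of one way to do this with a traits template. The names function_traits and sample() are hypothetical, and as a simplifying assumption only plain function pointers and non-generic, non-mutable lambdas are covered;-

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

// Primary template: treats T as a lambda or functor and inspects its
// operator().
template <typename T>
struct function_traits : function_traits<decltype(&T::operator())> {};

// Pointer to an ordinary function.
template <typename R, typename... Args>
struct function_traits<R (*)(Args...)> {
    using return_type = R;
    static constexpr std::size_t arity = sizeof...(Args);

    template <std::size_t N>
    using argument = std::tuple_element_t<N, std::tuple<Args...>>;
};

// Pointer to a const call operator, as generated for a lambda.
template <typename C, typename R, typename... Args>
struct function_traits<R (C::*)(Args...) const> : function_traits<R (*)(Args...)> {};

// Reports the number of arguments of whatever callable it is given.
template <typename F>
constexpr std::size_t fn(F) {
    return function_traits<F>::arity;
}

int sample(int, double) { return 0; }

static_assert(std::is_same<function_traits<decltype(&sample)>::return_type, int>::value,
              "return type is int");
static_assert(std::is_same<function_traits<decltype(&sample)>::argument<1>, double>::value,
              "second argument is double");
```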
Example 1
In the above example, the order in which the three calls to get_int() shall be executed is unspecified
Example 2
In the above example, the order in which the two calls to get_int() shall be executed is unspecified. However, it is guaranteed that the call to get_fn() shall be evaluated first
Example 3
Prior to C++17, because at least one of the two expressions a++ and a modifies a value used by the other, the result is undefined behaviour
From C++17, the result is unspecified
Example 4
The above call to fn() actually contains 4 function calls in its arguments; two calls to new and two calls to std::unique_ptr
The execution order of the four function calls is indeterminate; it could be any of;-
Up to C++14, only the first two sequences are exception-safe. The others will leak memory if the second new in the sequence throws an exception. This is because the result of the first new has not yet been committed to its std::unique_ptr object that would protect it from leaking
From C++17, each function argument is evaluated in its entirety before the evaluation of any other argument begins (there is no 'interleaving' of evaluation). Therefore, only the first two evaluation orders can occur, making this function call exception-safe
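The following sketch contrasts the form discussed above with one that is exception-safe regardless of language version. The type X and the functions fn, risky and safe are hypothetical; std::make_unique (C++14) is not mentioned in the text above and is shown here only as one well-known way of keeping each allocation and its owning object within a single function call;-

```cpp
#include <memory>

struct X { int v = 1; };

int fn(std::unique_ptr<X> a, std::unique_ptr<X> b) { return a->v + b->v; }

// Before C++17, both new-expressions may be evaluated before either
// std::unique_ptr is constructed, so a throw from the second new can leak
int risky() { return fn(std::unique_ptr<X>(new X), std::unique_ptr<X>(new X)); }

// Each std::make_unique call allocates and takes ownership in one step
int safe() { return fn(std::make_unique<X>(), std::make_unique<X>()); }
```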
See also Expression Evaluation Order
A function may be declared as inline. For example;-
It is much more common for an inline function to just be defined (often, but not necessarily, within a header file). This ensures that at the point of use, the definition is known and so actual inlining is more likely
The historical meaning of inline is as a hint to the compiler that the function body should be placed in-situ at the point the function is called, rather than instantiating the function once and performing a normal function call to it
A function that is declared inline takes-on certain characteristics;-
An inline function must be defined for any translation unit that refers to it, though only its declaration need be specified before its use (just like a non-inline function)
Any default arguments specified in the function declaration must be identical in all cases
If different definitions exist, spread across different translation units then the program may still link, but the result is undefined
Actual inlining of a function (ie, substituting calls to it with a copy of the function) is left to the implementation to decide. Indeed, the existence (or absence) of the inline specifier generally has no effect on an implementation's decision to inline (or not) a function
Some criteria that an implementation may use to not inline a function are;-
In summary, apply inline to those functions that are known to be truly trivial, and are called frequently. Treat everything else as an optimisation (that in most cases will make virtually no difference to the total performance, and is likely to be ignored by the compiler anyway!). Beware of bloat
If a function is declared as a constexpr then as long as its arguments are also constexpr and it only uses constexpr internally (which, by definition, it MUST), then it can be invoked and its value determined at compile-time. See constant expressions
A function is invoked thus;-
When a function is called, a new stack-frame is created. Formal arguments are allocated within this and are initialised from the function's actual arguments. Local variables are also created within the function's stack-frame
A function is normally exited with the return statement;-
There are actually five ways to exit a function;-
Returning an object from a function, rather than writing to it via an argument reference, is often not as expensive as it looks on the surface. move rather than copy operations are used wherever possible so if the returned object is a container/handler then it can be passed-by-value back to the caller relatively inexpensively
Some functions, notably many operators, return references
However, returning (and then using) a pointer or reference to an object that was created within the function on the stack will result in undefined behaviour as all such objects are destroyed when the function exits
Consider the following;-
The logical sequence of operations of the above example is as follows;-
In practice, a compiler will typically optimise-away all the temporary objects and, via RVO, initialise the variable a directly. However, the logical sequence must still be adhered to (because it may not always be possible to elide all of the move operations)
Any copy/move elision occurs even if the copy/move constructor would create side-effects (for example, logging debug information). For this reason, do not rely on copy and move constructors to be executed in any particular case; they may not be!
If the move constructor is deleted by redefining X as follows;-
…then the program will fail to compile, with an error relating to the deleted move constructor
One consequence of this is that it is not possible to return a non-movable object by-value from a function. However, it is possible to achieve a similar effect by changing the above example to the following;-
The logical sequence of operations of the above (first) example is as follows;-
In reality, when the call to fn() is made, a reference to the variable a is also passed to it. The object can therefore be initialised in-situ
This mechanism cascades, and multiple initialisers may be defined at different points. So in the following example, there are still no temporary objects created; a is initialised directly;-
The return {arguments} form returns an initialiser rather than an object. For example;-
This mechanism cascades, and multiple initialisers may be defined at different points. So in the following example, there are still no temporary objects created; a is initialised directly;-
The return value optimisation (RVO) is a technique used by implementations to avoid the copying of a function's local variable that is returned. For example;-
Here, a is not copied back to the caller. What happens is that the compiler recognises that b will be set to a and therefore, at 1, sets a to refer to the same location as b. When fn() returns, b is already set to the required value and so nothing is actually returned/copied
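Whether the optimisation has been applied can be observed with a type that counts its copy and move operations. The Tracker type below is a hypothetical sketch; the number of moves depends on whether the implementation elides the return, but a copy should never be needed because a returned local is treated as an rvalue;-

```cpp
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

Tracker fn()
{
    Tracker a;    // with RVO, a is constructed directly in the caller's object
    return a;     // at worst this is a move, never a copy
}
```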
There may still be cases where an explicit std::move in the return can be useful. If the value to be returned;-
For example;-
Function arguments may be passed by value, or by reference. Pointers are passed by value and explicitly dereferenced within the function. For example;-
Arguments may be passed as const to prevent the function modifying them;-
Arguments may also be qualified as volatile in addition to, or instead of const
An Analysis Of Pass-By-Value
(Consider pass-by-value for copyable arguments that are cheap to move and are always copied)
Example Consider a function that always makes a copy of its supplied argument(s);-
Solution 1: For efficiency, we might define two versions of the function; one that takes an lvalue and one that takes an rvalue;-
This works, but requires two functions to be maintained and (unless the functions are always inline) results in more code than is necessarily desirable, resulting in bloat
Solution 2: An alternative would be to define a single function that takes a forwarding reference;-
Again, this works, but forwarding references are not without their issues and, depending on the range of arguments passed to the function can, again, result in bloat
Solution 3: Another solution is to define a single function and use pass-by-value;-
In this case, pass-by-value is efficient because if the caller supplies an lvalue then the local variable 'a' is copy-constructed, and if an rvalue is supplied then it is move-constructed
Because the argument is a copy of the supplied value, and it won't be used again within the function, the move() is safe
This technique is not quite as efficient as the solutions 1 or 2; these both require a single copy construction for lvalue arguments and a single move construction for rvalue arguments. The pass-by-value technique requires the same copy/move construction to create the local variable 'a', plus an additional move operation to add 'a' to the animals container object
If the supplied argument is not a copyable type (for example, a std::unique_ptr) then solution 1 would be more efficient because the function overload that takes an lvalue would never be used
If the move operation for the argument type is expensive then having to perform two moves (as the solution 3 requires) may not be an option
Pass-by-value arguments may be subject to slicing
Pass-by-value should only be considered if the supplied argument is always copied/stored anyway. If it isn't then the cost of destroying the local 'a' object would also be incurred
If the function copies/stores the supplied argument by-assignment rather than by-construction then there could be the additional cost of destroying the original value (which may or may not be expensive) before re-assigning the new value. For example, consider a std::string that uses heap storage: it generally tries to avoid de-allocating/re-allocating its storage if a new value is copy-assigned to it (as would happen for lvalue arguments with solutions 1 and 2), but this optimisation is not possible if the object is destroyed and reconstructed (as would happen with solution 3)
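The three solutions discussed above can be sketched side by side as follows. The function names add1, add2 and add3 and the use of std::string as the element type are assumptions made for illustration; only the container name animals and the local name a come from the text;-

```cpp
#include <string>
#include <utility>
#include <vector>

std::vector<std::string> animals;

// Solution 1: one overload for lvalues, one for rvalues
void add1(const std::string& a) { animals.push_back(a); }
void add1(std::string&& a)      { animals.push_back(std::move(a)); }

// Solution 2: a single function template taking a forwarding reference
template <typename T>
void add2(T&& a) { animals.push_back(std::forward<T>(a)); }

// Solution 3: pass-by-value; a is copy- or move-constructed by the caller,
// then moved into the container (one extra move compared to 1 and 2)
void add3(std::string a) { animals.push_back(std::move(a)); }
```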
In accordance with the rules for reference initialisation, a literal, constant, or a value that requires conversion can be passed as a const T& argument, but NOT as a (non-const) T&. Allowing conversions for a const T& argument ensures that it can be given exactly the same set of values as a (pass-by-value) T argument by passing the value in a temporary, if necessary
A function may also take rvalue references as arguments. The main use for these is in defining move-constructor, move-assignment operations, and forwarding functions. For example;-
For user-defined types, pass-by-reference is usually preferable to pass-by-value
For small objects (say, up to 4 words), it can be more efficient to pass-by-value. Accessing an object that has been passed-by-reference is almost always slower than accessing one that is passed-by-value so if the object is small and is accessed several times from within the function, then pass-by-value may be the faster method. This is very platform and compiler-dependent though
Non-const pass-by-reference can often be eliminated by using suitable move-constructor and move-assignment operations and returning the result in the standard way instead
Pass-by-pointer is useful when 'no object' (indicated by nullptr) is a valid option. Compared to non-const pass-by-reference, it is also more explicit even if 'no object' is not a requirement
Passing an array as a function argument will implicitly pass a pointer to the start of the array; that is, an argument of type T[] will decay to the type T*. Therefore, the following are (mostly) equivalent;-
Passing an array pointer in such a way that the number of elements is enforced (the array size becomes part of the argument type) can be achieved with;-
Expanding on this, an alias can be created for the array type (though see this point);-
Similarly, preservation of the array size can also be achieved by passing a reference;-
One use for the above technique is in templates where the number of elements must be deduced. For example;-
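A minimal sketch of such a template is shown below (the name array_size is hypothetical). Because the argument is a reference to an array, the array does not decay to a pointer and the element count N remains available for deduction;-

```cpp
#include <cstddef>

// N is deduced from the array type of the supplied argument
template <typename T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N]) noexcept
{
    return N;
}
```

Passing a pointer rather than an array to array_size() fails to compile, which is usually the desired outcome.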
The following example demonstrates a nasty error that can occur when using "naked" arrays to maintain a hierarchy of class objects;-
The above will compile but will result in a catastrophic failure. What will happen is that the array dogs[8] shall be implicitly converted to a dog* and then implicitly to an animal* (because a dog is a type of animal) in the call to fn(). The problem is sizeof(dog) is not sizeof(animal) which has obvious implications when fn() iterates through the pointer p and tries to access *p. Using containers such as std::array can help avoid these problems
Be extremely wary of any interface of the form (T*, count); if T is a base class then the results can be fatal
A list (indicated with {}) may be passed as a function argument as long as the values in the list can be used to initialise the specified argument type. For example;-
A variable number (and in some cases, type) of arguments may be expressed in three different ways;-
The first two methods are described elsewhere. Here is an example of the last method; the standard 'C' printf() function. This takes at least one parameter, a plain string reference. It may also have zero or more additional parameters;-
Within such a function, the variable arguments are accessed thus;-
va_list, va_start(), va_arg(), va_end() and va_copy() are usually implemented as macros. They are defined in <cstdarg>. va_list is a type. The others are functions and are declared as follows;-
void va_start(va_list ap, paramN) |
type va_arg(va_list ap, type) |
void va_end(va_list ap) |
void va_copy(va_list dest, va_list source) |
It is sometimes necessary to forward variadic parameters on to some other function as-is. This can be achieved if the function being called takes a va_list rather than ..., as follows;-
Default values may be specified for function arguments, with the default(s) being used in the event the caller does not provide a value for that argument. For example;-
Default argument values do not have to be literals. For example;-
The above example would fail if my_name were a non-static variable; such default assignment is not allowed
Only trailing arguments may have default values assigned. For example, in isolation, this is illegal (but see the next point);-
A function declaration may be redeclared with additional default arguments, with each additional declaration inheriting any previously set defaults. For example, the following sequence of declarations is legal;-
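A sketch of such a sequence, using a hypothetical function fn, is given below. Each redeclaration adds a default to the right-most argument that does not yet have one;-

```cpp
int fn(int a, int b, int c = 3);   // only the trailing argument has a default
int fn(int a, int b = 2, int c);   // legal: c inherits its default from above
int fn(int a = 1, int b, int c);   // legal: b and c inherit their defaults

int fn(int a, int b, int c) { return a * 100 + b * 10 + c; }
```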
An ellipsis may follow a default value;-
…or in the case of a function template;-
The following default assignments are not allowed, or are restricted;-
Assignment from a local (non-static) variable
Use of a local variable is allowed if used in an unevaluated context, eg;-
When calling a function with default arguments, only trailing arguments may be omitted. For example, this is illegal;-
Be aware of subtle syntax issues such as the following; the space between * and = is important because *= is an illegal operator in this context;-
Two or more functions declared within the same scope and with the same name, but different arguments are said to be "overloaded". This is useful if the functions conceptually perform the same task. For example;-
Which version of an overloaded function is called is determined by overload resolution, applied to the candidate functions found by name lookup (including argument-dependent lookup)
When an overloaded function is called, the compiler determines which version of the function to resolve to by comparing the type of each overloaded function with the caller's argument(s). The criteria for "best match" of each argument are as follows, in this order;-
In addition;-
If the function takes a single argument then the one with the "best match" is called. If the function takes multiple arguments then the called function is the one that has a "best match" for one of the arguments and a better or equal match for the others
If two functions match equally well, except for the const-ness and/or volatile-ness of their argument(s), then the non-const/volatile version shall be chosen over the const/volatile version unless the const-ness/volatile-ness of the supplied argument(s) dictates otherwise (for example, an already const reference)
If more than one function matches at the same level then the call is considered ambiguous and a compiler error is raised. One exception to this is if a templated function expands to give the exact same signature as a standard (non-template) function, then overload resolution will favour the standard function
Searching the namespaces of a call's argument types for candidate functions is referred to as Argument Dependent Name Lookup (ADL) or Koenig Lookup
Ambiguities in overload resolution can often be eliminated by providing a version that explicitly resolves the ambiguity. For example;-
The above ambiguity could be solved by adding this;-
Another option would be to use explicit type conversion but this is a somewhat ugly solution
ADL will break-out of the current scope where namespaces are involved. If a function is not found in the current scope then the namespaces of its argument types are also searched (the namespace itself must be in-scope). This follows the idea that functions and the types they deal with tend to be in the same namespace. For example;-
The call fn(b) shall locate N::fn(X a) because it is in the same namespace as X. This is despite N::fn(X a) not actually being within scope. See also namespaces
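A minimal sketch of this lookup is shown below. The names N, X, fn and b follow the description above; the member v and the wrapper function call_it are hypothetical;-

```cpp
namespace N {
    struct X { int v; };
    int fn(X a) { return a.v; }
}

int call_it()
{
    N::X b{7};
    return fn(b);   // unqualified call: N::fn is found via the namespace of X
}
```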
A base class is a different scope to a derived class, but when a derived class member references a name, any base class declaration of that name shall be considered in preference to any external version. For example;-
The above shows that the base class name fn(char a) is called in preference to the global fn(int a) even though the chosen function call involves an implicit type conversion. Y::fn() would still be selected if it were declared private, though a compiler error would result
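The behaviour can be sketched as follows. The global fn(int) and the base class member Y::fn(char) follow the description above; the derived class name X, the member call() and the return values are assumptions used only to make the selected function observable;-

```cpp
int fn(int) { return 1; }       // global version: an exact match for an int

struct Y {
    int fn(char) { return 2; }  // base class version: requires int -> char
};

struct X : Y {
    // Name lookup finds Y::fn in the base class scope and stops there,
    // so the global fn(int) is never considered
    int call() { return fn(65); }
};
```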
Avoid overloading on forwarding references
Consider two overloaded functions;-
The above functions could be called with;-
In the last case, fn(T&& a) was a better match than fn(int a) as the latter would require a promotion of the supplied argument
A case where overloading of a function taking a forwarding reference is particularly problematic is if a class contains a function template constructor that takes a forwarding reference (ie, a perfect-forwarding constructor). In addition to the general overloading problem described above, a compiler may also generate default copy and move constructors which make the situation even worse because it can result in something like this;-
The definition X e{d}; will invoke 3 rather than 4 because the latter would require conversion to const, whereas the former requires no conversion
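The effect can be reproduced with the following sketch. The member which is hypothetical and exists only to record the constructor that was selected; the copy constructor is written out explicitly here rather than compiler-generated so that it can record its use;-

```cpp
struct X {
    int which = 0;

    X() = default;

    template <typename T>
    explicit X(T&&) : which(1) {}   // perfect-forwarding constructor

    X(const X&) : which(2) {}       // copy constructor
};
```

Given a non-const X d, the definition X e{d}; deduces T as X& and selects the template (which == 1). Copying from a const X selects the copy constructor (which == 2) because, for an equally good match, a non-template function is preferred.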
Alternatives to overloading on forwarding references
Possible options are;-
None of the above alternatives allows perfect forwarding within the function (for that, one must use a forwarding reference). If this is a requirement then one solution is to use Tag Dispatch. Expanding on the above example;-
In the case of a perfect-forwarding constructor, compiler-generated copy and move functions may still conspire to make a Tag Dispatch technique fail. In this case, we only want to invoke the perfect-forwarding constructor if the argument is not an X type. This can be achieved with std::enable_if;-
The first check, !std::is_base_of<X, typename std::decay<T>::type>::value disables the function if T is an X (or a derivation of X). std::decay is used to remove any reference, const and/or volatile type modifiers
The second check, !std::is_integral<typename std::remove_reference<T>::type>::value is similar to that used in the previous example and disables the function if T is an integral type
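Putting the two checks together gives something like the following sketch. The data members, the accessor functions and the X(int) overload are assumptions added so that the example is complete; only the two enable_if conditions are taken from the text;-

```cpp
#include <string>
#include <type_traits>
#include <utility>

class X {
public:
    // Enabled only if T is neither an X (or derived from X) nor an integral type
    template <typename T,
              typename = typename std::enable_if<
                  !std::is_base_of<X, typename std::decay<T>::type>::value &&
                  !std::is_integral<typename std::remove_reference<T>::type>::value
              >::type>
    explicit X(T&& name) : name_(std::forward<T>(name)), from_template_(true) {}

    // Integral arguments are handled here instead
    explicit X(int index) : name_("#" + std::to_string(index)) {}

    const std::string& name() const { return name_; }
    bool from_template() const { return from_template_; }

private:
    std::string name_;
    bool from_template_ = false;
};
```

With the constraint in place, copying a non-const X uses the compiler-generated copy constructor, and a short argument is promoted and passed to X(int).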
However, some arguments cannot be perfect-forwarded and perfect-forwarding can generate some incomprehensible error reports from the compiler when things go wrong, especially if an argument is passed down through several layers before it is finally used. This can actually be serious enough to avoid the technique except where performance is a real concern. Use of static_assert can (maybe) help catch errors earlier. For example;-
It is possible to take the address of a function and assign it to a pointer in the same way as for an object. Two function pointer forms are supported;-
The function pointer may be used to call the function. For example;-
A function pointer alias is defined like this (using the above example);-
…or using typedef;-
A function pointer may refer to a noexcept function;-
Example of a function taking a function pointer;-
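The points above are drawn together in the following sketch. All of the names (twice, p1 to p3, fn_ptr, fn_ptr_t and apply) are hypothetical;-

```cpp
int twice(int x) noexcept { return 2 * x; }

int (*p1)(int) = &twice;          // explicit address-of form
int (*p2)(int) = twice;           // implicit function-to-pointer conversion
int (*p3)(int) noexcept = twice;  // pointer that may only refer to noexcept functions

using fn_ptr = int (*)(int);      // alias for the pointer type
typedef int (*fn_ptr_t)(int);     // the same alias using typedef

// A function taking a function pointer
int apply(fn_ptr f, int v) { return f(v); }
```

A call may be written as either p1(3) or (*p1)(3); the two forms are equivalent.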
A lambda expression facilitates the definition of an anonymous function object (though it can also be named). It is a shorthand to the notion of defining a class with an operator(), making an object of that type and then invoking it. Lambda expressions may be passed to functions as an operation for the function to execute
A lambda expression is defined like this;-
All lambda expressions consist of the following components;-
A (possibly empty) capture-list defined with []. This must always be specified and defines which names from the defining environment can be used within the lambda expression and how they are captured (by value or by reference);-
Syntax | Description |
---|---|
[] | No local variables are captured. Only non-local variables and arguments are available to the lambda |
[&] | All local (stack-based) variables are captured by reference. See warning |
[=] | All local (stack-based) variables are captured by value. See warning |
[capture-list] | Explicit capture of a subset of local variables, individually named. Variable names in the form &name are captured by reference. Other variables are captured by value. The list may contain this (which will be captured by reference). (From C++17, the list may contain *this, which will capture a copy of the host object). A variadic template's parameter pack may be captured with the form name.... For example [&a, b, c] captures a by reference, and captures b and c by value |
[&, capture-list] | All local variables are captured by reference except those listed in the capture-list which are captured by value. None of the listed names may be preceded by &. The list may contain this (which will be captured by reference). (From C++17, the list may contain *this, which will capture a copy of the host object) |
[=, capture-list] | All local variables are captured by value except those listed in the capture-list which are captured by reference. All of the listed names must be preceded by &. The list cannot include &this. (From C++17, the list may contain *this, which will capture a copy of the host object) |
[init-capture-list] | Init Capture (also sometimes called generalised capture). This is a much more flexible capture mode and allows, in addition to the by-value and by-reference capture modes, by-move and almost anything else by virtue of being able to take an arbitrary expression. Init capture allows a local (to the lambda expression) name to be created and defines an expression to initialise it with. For example, to move a std::unique_ptr (which cannot be copied) into a lambda expression: auto px = std::make_unique<X>(); auto fn = [px = std::move(px)] { /* ...use px... */ }; The general form of the init capture is capture-name = expr where capture-name and expr are actually in different scopes; the former being in the scope of the closure class, and the latter being in the scope where the lambda is being defined. Therefore, as demonstrated above, it is possible to give capture-name the same name as the variable being captured. So, what the above example actually means is “create a data member px in the closure class and initialise it with the expression std::move(px)”. expr is arbitrary and may not even refer to any variables at all; this example creates a new object and passes it directly into the lambda expression: auto fn = [px = std::make_unique<X>()] { /* ...use px... */ }; The form &capture-name = expr captures by reference. The form capture-name = expr captures by value. A variadic template's parameter pack may be captured with the form &capture-name = name... Init capture arguments are treated as if declared as auto types and initialised with the specified expression |
Capture list arguments are subject to type deduction rules
An optional mutable specifier. If present, this removes the const from the operator() function of the closure object, allowing the lambda expression's body to modify the capture-list parameters (and therefore the (implicit) closure object) that was captured by value
Technically, owing to a fault in the C++ specification, it is not legal to specify mutable without also specifying the argument list (), even if it is empty, thus the third format shown at the start of this section is not actually legal. However, an implementation may (helpfully) allow it
An optional return type declaration of the form -> type
If no return type is specified and the body consists of just a single return statement with an expression, then the return type can be deduced
If no return type is specified and all return statements return the same type, then the return type can be deduced
In both cases, return type deduction is performed in the same way as for a function whose return type is declared auto. The return type is deduced using the function template type deduction rules. It may make more sense to refer to the auto type deduction rules (…but remember the difference in the way uniform_initialisers are interpreted)
It is possible to emulate init capture in C++11 using std::bind;-
This works because std::bind move-constructs any of its members initialised from rvalues (which is exactly what std::move() produces). The lambda expression takes an lvalue reference to the captured pointer px. Note that it does not take an rvalue reference because although the initialisation value (returned from std::move()) is an rvalue, the member inside the bind object is an lvalue. Therefore, when the lambda expression executes (ie, when the closure's operator() is invoked), it operates on the move-constructed px member of the bind object
By default, the px member of the bind object is not const, and so in order to prevent the lambda expression from modifying it (ie, to maintain the same behaviour as a stand-alone lambda expression), px is passed to the lambda explicitly as const
Because std::bind maintains copies of all its arguments, the lifetime of the closure is the same as its parent bind object. It is therefore possible to treat objects within the bind object as if they were within the closure
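A sketch of the technique, restricted to C++11 features, is shown below. The type X and the function use_bound are hypothetical; px follows the naming used in the text;-

```cpp
#include <functional>
#include <memory>
#include <utility>

struct X { int value = 42; };

int use_bound()
{
    std::unique_ptr<X> px(new X);   // C++11: std::make_unique is not available

    // The bind object move-constructs its own px member from the rvalue;
    // the lambda then receives that member as a const lvalue reference
    auto fn = std::bind(
        [](const std::unique_ptr<X>& px) { return px->value; },
        std::move(px));

    return fn();
}
```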
The following example shows a function object that outputs all values from a vector that meet the criteria (v[i] % m) == 0
This works because the for_each() function template implicitly appends a () to its third argument (ie, it calls operator() for the objects it iterates through). Therefore, the example first constructs a modulo_print object with the initialisers os and m and then uses that object repeatedly by calling operator()(int x) where x is an element of vector v. This demonstrates an extremely useful technique
Defining operator() as const is the usual case, but not compulsory. Here is the equivalent lambda expression for the above code;-
…or we could name the lambda…
Defining the lambda expression as mutable would be the equivalent of defining the above operator() as non-const
Here is the same function using a range-for loop. For this simple example, this could be considered the best option;-
A lambda may be used to initialise an object of type auto or std::function<R(AL)>, where R is the return type and AL is the argument list of types. This latter type is useful if the lambda recurses; it is not possible to use auto until the type can be deduced. For example;-
Incidentally, the above lambda expression could be used like this;-
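As an illustration of a recursive lambda held in a std::function, consider the following sketch (the names factorial and fact are hypothetical). The std::function object has a known type before its initialiser is evaluated, so the lambda is able to capture it by reference and call itself;-

```cpp
#include <functional>

int factorial(int n)
{
    // auto could not be used here: fact's type would not yet be known
    // at the point it is named in the capture list
    std::function<int(int)> fact = [&fact](int k) {
        return k <= 1 ? 1 : k * fact(k - 1);
    };
    return fact(n);
}
```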
Avoid using default capture modes
There are two default capture modes; by-reference [&], and by-value [=]
The problem with the former is that it can lead to dangling references. The problem with the latter is that it implies that the resulting closure is self-contained. This is not necessarily the case
By-reference capture can lead to a dangling reference if the lifetime of the closure exceeds that of the (usually local) referenced variable or argument
There is nothing magic about (say) [&var1, &var2] over the default capture mode [&], but the former is explicit (and therefore doesn't ‘hide’ the references to the specific variables), narrows scope, and is generally good practice
By-value capture does not always isolate the lambda expression from relative lifetime problems; capturing a pointer by value does not prevent the pointed-to object being prematurely deleted
A lambda expression capture, ie [...] only applies to (ie, only captures) non-static, local variables and function arguments that are within scope where the lambda expression is defined. Therefore, given the following example where a lambda expression is passed to fn2();-
The above code will compile and run. However, because the capture [=] applies only to local variables, the single variable actually captured is this, and not this->m. Within the lambda, this-> is implied and is automatically applied to the reference of member m. It is as if the above were written as;-
Incidentally, a capture of [m] would fail (m is not a local variable), as would an empty capture [] (not capturing this)
Because the member m is referenced from this, the lambda expression in this example is dependent on the lifetime of the this object. Therefore, if fn2() saved the supplied closure for later use before returning, it would contain a dangling this pointer if the X object were deleted prematurely
The fix for this problem is to rework fn() so that it makes a local copy of the member before defining the lambda expression;-
Here is a tidier approach using init capture;-
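The difference between the two captures can be sketched as follows. The class X and its member m follow the text; the member functions risky() and safe(), and the use of std::function to return the closure, are assumptions. Init capture requires C++14;-

```cpp
#include <functional>

struct X {
    int m = 3;

    // [=] captures this, not m; the closure is only valid while *this exists
    std::function<int(int)> risky() { return [=](int v) { return v * m; }; }

    // The init capture copies m into the closure, making it self-contained
    std::function<int(int)> safe() { return [m = m](int v) { return v * m; }; }
};
```

A closure obtained from safe() may be invoked after the X object has been destroyed; one obtained from risky() may not.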
A similar issue occurs with objects that are statically allocated; global and static variables (defined at file, namespace, function or class scope). Such objects cannot be captured by a lambda expression, but they can nonetheless be accessed from within it, and are, at the same time, modifiable from outside of the lambda's scope. Using a default by-value capture [=] may give the false impression that such a lambda expression has acquired local copies of all its values and is self-contained, when in reality, any global or static variables are used as if captured by-reference
A lambda expression may take arguments of type auto. For example;-
The above lambda expression will take two arguments of any type as long as the operator < may be applied to them
Note also the use of an auto return type. In the above example, the result will always be of type bool but this may not always be the case as shown in the following example;-
Another example;-
If the function fn() in the last example above needs to make a distinction between lvalue and rvalue arguments then b would need to be made a forwarding reference and perfect forwarded;-
Note that the expression std::forward<decltype(b)> is itself rather interesting (if one cares to work it out); it works by virtue of the reference collapsing rules, but generates the correct result via a rather unconventional route
Lambda expressions can be variadic;-
This example also uses generic arguments, and may be called like any other variadic function
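A sketch of a variadic generic lambda (C++14) is given below; the names count_args, sum3 and call_sum3 are hypothetical. The second lambda also shows the std::forward<decltype(...)> idiom applied to a parameter pack;-

```cpp
#include <cstddef>
#include <utility>

// Accepts any number of arguments of any types
auto count_args = [](auto&&... args) -> std::size_t { return sizeof...(args); };

int sum3(int a, int b, int c) { return a + b + c; }

// Perfect-forwards every argument on to sum3()
auto call_sum3 = [](auto&&... args) {
    return sum3(std::forward<decltype(args)>(args)...);
};
```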
A class is a user-defined type that provides a framework for defining data elements along with the functions and operations that relate directly to the data elements. A class allows a specific concept to be encapsulated into a single entity with a finite well-defined set of interfaces that abstract-away the internal complexities of the concept
Compartmentalising functionality in this way also helps the compiler detect incorrect use, and improves subsequent understanding
A class is declared like this;-
Here is an example of a simple class;-
…and a simple use of it;-
Empty Base Class Optimisation
There is a caveat to the rule stating that sizeof(X) is always > zero even if X is an empty class. That is, a compiler may optimise-away the non-zero size of an empty base class. This allows an empty base class to be used without any overhead. This is referred to as the Empty-Base Optimisation or EBO
Don't make assumptions about the internal layout of objects in memory. The address of a derived class object may not be the same as the address of its base. Members may not be laid-out in memory in the order specified in the source code
The exception is that of POD types
All members of a class fall into one of three access control specifiers;-
Specifier | Description |
---|---|
private: | Member is visible to member functions of the same class and to friends of the class |
protected: | Member is visible to member functions of the same class and to friends of the class, and to any member functions and friends of derived classes |
public: | Member is visible to all, including from outside the class |
The specifiers are used like this;-
A derived class member function may only access protected members for objects of its own type. That is;-
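The rule can be sketched as follows, using the hypothetical classes Base and Derived. The line that would fail to compile is left commented-out;-

```cpp
class Base {
protected:
    int value = 1;
};

class Derived : public Base {
public:
    int own() const { return value; }                      // OK: this object
    int peer(const Derived& d) const { return d.value; }   // OK: object of own type
    // int base(const Base& b) const { return b.value; }   // error: via a Base object
};
```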
Take special care when using protected. Such members are more open to abuse than private members
In almost all cases, a protected interface should be restricted to exposing internal types, member functions and constants/enumerations. The need to define protected data members is usually a sign of a design error
Like exposing too many public members, protected members can easily lead to maintenance issues because of the scope of external access and external reliance on the members
Notwithstanding the above, protected member functions can sometimes provide an efficient and more closely controlled implementation platform for derived classes than could be achieved by other means
An example that makes use of protected is also available
More generally, members are accessed using '.' (dot) notation for objects, and '->' notation for pointers to objects. For example;-
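A short sketch of both notations, using the hypothetical type Point;-

```cpp
#include <cassert>

struct Point {
    int x = 0;
    int y = 0;
    int sum() const { return x + y; }
};

inline int demo_access() {
    Point p;            // an object
    Point* pp = &p;     // a pointer to the object
    Point& rp = p;      // a reference behaves like the object itself

    p.x = 3;            // '.' on an object
    pp->y = 4;          // '->' on a pointer; equivalent to (*pp).y = 4
    return rp.sum();    // '.' on a reference
}
```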
If a class is derived from multiple base classes, then ambiguities can arise that need to be resolved. For example;-
If a reference were made to a.m then it would be unclear as to which m was being specified. One method would be to qualify the reference. For example, a.Z::m or a.Y::m
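The situation can be sketched like this. The bases Z and Y and the member m follow the names used above; the derived class A and the function demo_qualified() are assumptions made for illustration;-

```cpp
#include <cassert>

struct Z { int m = 1; };
struct Y { int m = 2; };
struct A : Z, Y {};     // A inherits two distinct members called m

inline int demo_qualified() {
    A a;
    // a.m = 0;         // error: the reference to m is ambiguous
    a.Z::m = 10;        // qualified: the m inherited from Z
    a.Y::m = 20;        // qualified: the m inherited from Y
    return a.Z::m + a.Y::m;
}
```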
There are other (and often better) methods as well
Enumerations may be defined within a class like any other type;-
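As a sketch, with the hypothetical class TrafficLight, the nested enumeration is named from outside the class by qualifying it with the class name;-

```cpp
#include <cassert>

class TrafficLight {
public:
    enum class Colour { red, amber, green };   // nested, scoped enumeration

    Colour state() const { return state_; }

    // Cycle red -> green -> amber -> red.
    void advance() {
        state_ = (state_ == Colour::red)   ? Colour::green
               : (state_ == Colour::green) ? Colour::amber
                                           : Colour::red;
    }

private:
    Colour state_ = Colour::red;
};
```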
A data member is declared exactly like any non-member variable. For example;-
A class data member may be declared static;-
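A sketch using the hypothetical class Widget, which counts its live objects. The static member is only declared inside the class; its definition appears outside the class;-

```cpp
#include <cassert>

class Widget {
public:
    Widget() { ++live_; }
    Widget(const Widget&) { ++live_; }
    ~Widget() { --live_; }

    static int live() { return live_; }

private:
    static int live_;      // declaration only: one variable shared by all Widgets
};

int Widget::live_ = 0;     // the definition, outside the class (normally in one source file)
```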
There is an exception to the rule that a static data member must be defined outside the class. If a static data member is const and of an integral type, or it is a constexpr of a literal type (which could itself just be an integral type), and (in all cases) it is initialised with a constant expression, then it can be directly defined and initialised within the class definition. For example;-
static constants defined in this way typically take up no memory, because the compiler can substitute their values directly wherever they are used. If only integer constants are required then an alternative would be to use an enumeration instead
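A sketch using the hypothetical class Buffer. The out-of-class definition of capacity is included only because capacity_address() takes the member's address;-

```cpp
#include <cassert>

class Buffer {
public:
    static const int capacity = 16;         // const integral type, constant initialiser
    static constexpr double scale = 0.5;    // constexpr of a literal type

    int data[capacity] = {};                // usable wherever a constant expression is needed
};

// Required only because the address of capacity is taken below.
// The initialiser is not repeated here.
const int Buffer::capacity;

inline const int* capacity_address() { return &Buffer::capacity; }
```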
If the static member is to be used in a way that requires it to be stored as an object in memory (eg, its address is taken), then it can still be initialised in this way but it must also be defined externally to the class in the normal way, except that the external definition must not repeat the initialiser
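static const data members are not themselves deprecated. What C++17 deprecates is the redundant out-of-class redeclaration of a static constexpr data member, because such members are now implicitly inline. Where C++17 is available, prefer inline static (or static constexpr) data members, which need no external definition
One use of static members is to hold a default initial value for an object. This can be set statically or via some set() function. It is then picked up by the class's default constructor (eg, X{}) and used to initialise any new object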
One consequence of this approach is that it is not necessary to provide a separate function to read the default value; simply creating a default-constructed object is sufficient
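The technique might be sketched as follows. The class Volume and its members are hypothetical names chosen for illustration;-

```cpp
#include <cassert>

class Volume {
public:
    Volume() : level_{default_level_} {}             // picks up the current default
    explicit Volume(int level) : level_{level} {}

    int level() const { return level_; }

    static void set_default(int level) { default_level_ = level; }

private:
    int level_;
    static int default_level_;                       // shared default for new objects
};

int Volume::default_level_ = 5;                      // initial default, set statically
```

Reading the current default needs no dedicated function; Volume{}.level() is sufficient.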
As illustrated here, a static const data member may be defined (not just declared) and initialised directly within the class body
However, the type of the variable must be relatively simple, and the technique cannot be applied to non-const static data members
A solution to this is to declare the member inline and static. For example;-
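A sketch that requires C++17, using the hypothetical class Logger. It shows one non-const member and one member of a non-literal type, both defined inside the class;-

```cpp
#include <cassert>
#include <string>

class Logger {
public:
    inline static int messages = 0;                      // non-const, yet defined in the class
    inline static const std::string prefix = "[log] ";   // non-literal type, also allowed

    static std::string format(const std::string& text) {
        ++messages;
        return prefix + text;
    }
};
// No out-of-class definition of messages or prefix is needed.
```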
The major advantage of inline static over plain static members is that the former doesn't require an external definition. This allows such use in header-file-only class definitions, which is not possible otherwise
A class member function is within the scope of the parent class and has access to all members of the class. Unless it is declared static, it must be invoked with reference to an object of the class type (ie, it has a this pointer)
A member function may also be defined as a constexpr
A constexpr member is also often static but this is not a requirement
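Both forms can be sketched with the hypothetical class Celsius; the static_assert lines confirm that the calls are evaluated at compile time;-

```cpp
#include <cassert>

class Celsius {
public:
    constexpr explicit Celsius(double degrees) : degrees_{degrees} {}

    // Non-static constexpr member: usable at compile time on a constant object.
    constexpr double fahrenheit() const { return degrees_ * 9.0 / 5.0 + 32.0; }

    // Static constexpr member: needs no object at all.
    static constexpr double freezing() { return 0.0; }

private:
    double degrees_;
};

static_assert(Celsius{100.0}.fahrenheit() == 212.0, "evaluated at compile time");
static_assert(Celsius::freezing() == 0.0, "evaluated at compile time");
```

A constexpr member function may still be called at run time with values that are not constant expressions.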
As is the case with namespaces, a member function may be defined within the class or it may be only declared within the class and defined outside it. For example;-
Here, the member function put_m() is declared but not defined. To define it externally, its name is associated with the type with the :: syntax;-
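A sketch of both forms. Only the name put_m() comes from the text above; the class name X, the member m, the function get_m() and the parameter are assumptions made for illustration;-

```cpp
#include <cassert>

class X {
public:
    int get_m() const { return m; }   // defined within the class (implicitly inline)
    void put_m(int value);            // declared only

private:
    int m = 0;
};

// Defined outside the class: X:: places the name in the scope of class X,
// so the body can refer to the member m directly.
void X::put_m(int value) { m = value; }
```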
In addition to the inline and constexpr qualifiers that may be applied to any function, non-static member functions also support the following additional qualifiers;-
const and/or volatile Member Functions
Member functions differing only in their const-ness, ie, void fn(int a); vs void fn(int a) const; are also eligible for overloading
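Given two or more function overloads that differ only in their const/volatile qualification, overload resolution prefers the version whose qualification most closely matches that of the object. The non-const version is chosen for a non-const object, and the const version is chosen for a const object (for which it is the only viable candidate)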
An example of where the const-ness of the return type is useful is;-
If the size of the functions warrants it, code duplication can be reduced by having the non-const version of the function call the const one, and cast-away the const of the return type. Function 2 could be written as;-
Casting-away the const in the return is safe because the non-const function must have been called on a non-const object
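The technique might be sketched as follows, using the hypothetical class Text. The labels "function 1" and "function 2" in the comments are an assumption intended to mirror the numbering used above;-

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

class Text {
public:
    explicit Text(std::string s) : chars_{std::move(s)} {}

    // Function 1: chosen for const objects; the element cannot be modified.
    const char& operator[](std::size_t i) const { return chars_[i]; }

    // Function 2: chosen for non-const objects; forwards to function 1.
    char& operator[](std::size_t i) {
        const Text& self = *this;             // add const to select function 1
        return const_cast<char&>(self[i]);    // then cast the const away again
    }

private:
    std::string chars_;
};
```

The forwarding must go in this direction. Calling the non-const version from the const one would risk modifying an object that really is const.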
Function Reference Qualifiers
Example;-
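As a sketch, the hypothetical class Name provides one get() for lvalues and another for rvalues;-

```cpp
#include <cassert>
#include <string>
#include <utility>

class Name {
public:
    explicit Name(std::string s) : value_{std::move(s)} {}

    // Called on lvalues: the object lives on, so hand out a read-only reference.
    const std::string& get() const & { return value_; }

    // Called on rvalues: the object is about to expire, so move the string out.
    std::string get() && { return std::move(value_); }

private:
    std::string value_;
};

inline Name make_name() { return Name{"temporary"}; }
```

Because the rvalue overload returns by value, make_name().get() yields a string that outlives the temporary it came from.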
Combining const, volatile, & and && qualifiers
Note the difference between how the const and volatile qualifiers are handled compared to the & and && qualifiers; the former modify the behaviour of the function, whereas the latter modify where and when the function may be called
The two sets of qualifiers may be combined. For example;-
As usual, if there is no exact overload match then the next best match may be used. For example, if function 7 were not defined then call 1 would call 8 instead
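The combination and the fallback can be sketched with the hypothetical types Probe and Fallback. The return values identify the chosen overload and are not related to the function numbers used above;-

```cpp
#include <cassert>

struct Probe {
    int which() &       { return 1; }   // non-const lvalues
    int which() const & { return 2; }   // const lvalues
    int which() &&      { return 3; }   // non-const rvalues
};

// The same set without the non-const lvalue overload.
struct Fallback {
    int which() const & { return 2; }   // now the best match for every lvalue
    int which() &&      { return 3; }
};
```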
Member function arguments and return values
Values passed to functions are only eligible for implicit type conversion if they correspond to an entry in the function's parameter list. The upshot of this is that for a member function, the object on which it is called (the one that this points to) is never eligible for implicit type conversion. This is why;-
The solution to the above problem is to declare operator+ a non-member function. That way, all its arguments are eligible for implicit type conversion. This solution assumes availability of a member operator+= function;-
This also demonstrates the notion of defining operator+ in terms of operator+=. This pattern also extends to other operators, of course
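A sketch of the pattern, using a hypothetical and deliberately simplified Rational class (no normalisation of the fraction is attempted);-

```cpp
#include <cassert>

class Rational {
public:
    Rational(int n = 0, int d = 1) : n_{n}, d_{d} {}   // non-explicit: allows int -> Rational

    Rational& operator+=(const Rational& rhs) {
        n_ = n_ * rhs.d_ + rhs.n_ * d_;
        d_ *= rhs.d_;
        return *this;
    }

    int num() const { return n_; }
    int den() const { return d_; }

private:
    int n_, d_;
};

// Non-member: both operands appear in the parameter list, so either may be converted.
inline Rational operator+(Rational lhs, const Rational& rhs) {
    lhs += rhs;     // operator+ implemented in terms of operator+=
    return lhs;
}
```

With this arrangement both half + 1 and 1 + half compile. Had operator+ been a member, 1 + half would not, because the left-hand operand would have to be a Rational already.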
Avoid returning/creating "handles" to object internals
A handle can take the form of a reference, pointer, iterator, or some other type. Returning a handle to a type's internal representation from a member function breaks encapsulation, especially if the handle refers to a private data member (or member function). When returned from a const member function, it also negates the ethos of the const because the caller can use the returned handle to change the internal representation
A const handle could be returned, and doing so controls the extent of the encapsulation breach somewhat; it may be perfectly reasonable for the application to recover details of the type's internals (at least some of them). However, if the returned reference outlives the member it refers to then undefined behaviour will result if the reference is used. This situation is almost inevitable if the host object is a temporary formed as part of an expression: the temporary is destroyed at the end of the full expression, leaving any handle that was acquired from it dangling. The same applies if the caller binds a reference to part of a temporary object returned by a function. For example;-
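The hazard might be sketched like this, using the hypothetical class Person. The dangerous line is left commented-out so that the sketch itself stays well defined;-

```cpp
#include <cassert>
#include <string>
#include <utility>

class Person {
public:
    explicit Person(std::string name) : name_{std::move(name)} {}
    const std::string& name() const { return name_; }   // const handle to an internal

private:
    std::string name_;
};

inline Person make_person() { return Person{"Ada"}; }

inline std::string safe_copy() {
    // const std::string& bad = make_person().name();
    // 'bad' would dangle: the temporary Person is destroyed at the end of that
    // statement, and binding to its member does not extend its lifetime.

    std::string good = make_person().name();   // copy made while the temporary still exists
    return good;
}
```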
Of course, some functions must return a handle, such as operator[] and operator-> but such functions are the exception
It can sometimes be useful to be able to chain member function calls. To achieve this, return a reference to the object operated on. For example;-
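A sketch using the hypothetical class Accumulator, in which each modifying function returns *this by reference;-

```cpp
#include <cassert>

class Accumulator {
public:
    Accumulator& add(int n)      { total_ += n; return *this; }
    Accumulator& multiply(int n) { total_ *= n; return *this; }

    int total() const { return total_; }

private:
    int total_ = 0;
};
```

The calls are applied left to right, so acc.add(2).add(3).multiply(4) leaves a total of 20.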
It is possible to delete a member function by specifying = delete. For example;-
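As a sketch, the hypothetical class Handle deletes its copy operations and also one unwanted overload. A deleted function still takes part in overload resolution, so selecting it is a compile-time error;-

```cpp
#include <cassert>
#include <type_traits>

class Handle {
public:
    explicit Handle(int id) : id_{id} {}

    Handle(const Handle&) = delete;              // no copy construction
    Handle& operator=(const Handle&) = delete;   // no copy assignment

    void resize(int size) { size_ = size; }
    void resize(double) = delete;                // reject a call such as resize(2.5)

    int id() const { return id_; }
    int size() const { return size_; }

private:
    int id_;
    int size_ = 0;
};

static_assert(!std::is_copy_constructible<Handle>::value, "copying is disabled");
static_assert(!std::is_copy_assignable<Handle>::value, "assignment is disabled");
```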