Igor's C++ Grimoire
13/12/2017

Introduction

Igor's C++ Grimoire aims to be a reasonably complete reference to C++11, C++14 (and C++17 when I get around to it)

Many sources have been used to compile it, but the major ones, and some of particular note are as follows. The C++ Programming Language 4th Ed, by Bjarne Stroustrup. Effective C++ 3rd Ed, and Effective Modern C++ 1st Ed, November 2014, both by Scott Meyers. On-line resources include ISO C++ FAQ, cppreference.com, stackoverflow.com, CppCon and many others


The original intention was to just present the facts; the syntax, some examples, and a thorough description of the language with the aim of being as complete and unambiguous as possible whilst keeping the fluff to a minimum. As time has marched on, considerable 'other stuff' has been added, particularly in the area of advice on avoiding common (and sometimes obscure) pitfalls, particular techniques, and general advice that should make life easier. Notable exceptions to the original ethos of only the facts are the Templates and Meta-Programming sections, which almost entirely describe the application of technique rather than syntax, and the Design Considerations section, which is mostly made up of (often other people's) rambling thoughts and ideas

The standard library is not described in any detail. Exceptions are the Concurrency support (which is almost entirely implemented by standard library components), and some basic components and/or particular types/functions that are referred to by other items. These are all described in the Standard Techniques section and the sections that follow it. Be aware though, that most of this latter section provides only cursory descriptions, and those descriptions that are more complete may not be absolutely comprehensive

This document is in no way a tutorial and should not be treated as such. It is a reference document that describes how to use C++ and, hopefully along the way, answers some of those "what is…?" and "how do I…?" and "why on Earth is it doing that…?" questions

Hints, tips, and any musings that wander away from the narrow criteria of simply facts are highlighted in a box like this one. None of this information is necessary in order to understand C++, but there are definitely some very good ideas and a number of "gotchas" here that are worth knowing about. Useful hints are shown with the symbol shown at the start of this paragraph

Warnings and general observations of a potentially problematic nature are shown with this symbol

Significant or obscure causes of error are shown with this symbol

Always do this if you want your code to work correctly and/or avoid future problems

Never do this if you want your code to work correctly and/or avoid future problems

Specific points in the main text may also be highlighted with one of the above symbols as appropriate

Items that only relate to a specific C++ version are identified with and as appropriate. While no attempt is made to document versions earlier than C++11, there is the odd item that is specific to C++98/03. These are identified with

At the time of writing C++17 (C++1z) has not been published and the description of it in here is severely lacking in many areas. There are some references to it though, and these are marked with

Features that have been introduced or modified in a particular version and retained in subsequent versions are identified with or , though in the former case, only major features are identified this way

Any points that are incomplete, seemingly inconsistent or based on ambiguous source material are marked with . Hovering over these markers should display further details

Code snippets are shown like this; int a = fn();.
Individual code elements are shown like this; a.
Keywords are often shown like this; virtual.
If a particular part of the syntax is being highlighted or referred to, it is shown like this; class colour {…}.
Any optional parts of the syntax (or parts that may not be applicable in all cases) are indicated like this; short int a;.
Pseudocode or syntax which may represent a number of forms is shown like this; while(condition) {…}

All but the most trivial of code examples have been compiled, run, and shown to work. However, they typically omit the required header file inclusions, and may require some using directives where components from the std namespace have not been qualified

References within this document may be textual links or may be shown as just tiny links like (which should pop-up a destination hint if hovered over). Many links refer to very specific items, paragraphs or examples appropriate to the context of the referrer

If you spot any errors or inconsistencies, or think something has been missed out or is incomplete or ambiguous, then please let Igor know, no matter how trivial the issue may seem. You can email Igor at igorknockknock.org.uk (take care spelling 'knockknock'!)



Code Structure

C++ is written in plain ASCII text and any one program is implemented as one or more text files

Whitespace

Whitespace (' ', TAB, and NEWLINE) must be used to separate type/variable/function names and reserved words. It is generally not required between names/keywords and syntactic elements and operators such as ( or ->. It is significant within string literals. Some operators are composed of multiple characters such as +=. Adding whitespace to these (ie, + =) would result in a syntax error. Other than these points and the occasional special case, whitespace may be freely used (or not) as required

Source Files

Apart from the maintainability and management improvements that come from dividing a program into multiple files, the technique improves modularity, enhances logical structure, and allows separation of interfaces from implementation

The text files are generally divided into two groups; implementation files and interface (header) files. Implementation files will typically include one or more header files, and header files may also include other interface files

A compiler shall typically deal with each implementation file in turn. It will first invoke the preprocessor to create a translation unit. It is to this that the C++ language rules are applied. The translation unit is compiled into object code. All the individual object code parts are then passed to a linker to form the final executable code

Minimise compilation dependencies between files

A class declaration is usually also its definition, and that includes its interface plus significant implementation detail in the form of data member declarations and even private member functions

Therefore, when a client wanting to use the class includes the header file that provides the definition, it implicitly creates a dependency between itself and the types and values used in the class' implementation details. This creates a number of problems;-

The basic principle that leads to reduced compilation dependency is to replace definitions with declarations wherever possible. To achieve this;-
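As a hedged illustration of replacing a definition with a declaration, the sketch below forward-declares a class and holds it through a pointer, so that clients of the interface need not see its definition. The names widget and widget_impl are hypothetical, and the two 'files' are shown in a single listing only for brevity;-

```cpp
#include <memory>

// --- widget.h (interface): only a declaration of widget_impl is needed ---
class widget_impl;                      // Forward declaration, not a definition

class widget {
public:
    widget();
    ~widget();                          // Defined where widget_impl is complete
    int value() const;
private:
    std::unique_ptr<widget_impl> impl;  // A pointer to an incomplete type is fine
};

// --- widget.cpp (implementation): the full definition lives here ---
class widget_impl {
public:
    int value() const { return 42; }
};

widget::widget() : impl(new widget_impl) {}
widget::~widget() = default;
int widget::value() const { return impl->value(); }
```

With this arrangement, a change to widget_impl should only require the implementation file to be recompiled, not every client that includes the interface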

Template Code Organisation

There are some special considerations when dealing with templates. In particular, there are two rules with regard to template compilation;-

Probably the most common code organisation approach is to include the same template definition into all translation units and rely on the compiler to optimise-away all duplicate specialisations; the "include everywhere" technique;-

#include <my_template.h>
// ...use the template...

One problem with this approach is that it tends to (unintentionally) encourage undesirable dependencies to grow between the user-code and the template definition

This problem can be mitigated by taking the approach of "include template definitions later (after they are used)". This can be achieved by dividing the template into a declaration .h file, and a definition .cpp file, and then arranging the translation unit thus;-

#include <my_template.h>
// ...use the template...
#include <my_template.cpp>

This minimises the chances of the template definition having some unanticipated and detrimental effect on the user code, but makes the reverse risk greater

Although most will, an implementation is not required to be able to delete duplicate/redundant copies of a template instantiation. This can lead to "multiple definition" errors at link-time

An implementation is not required to analyse duplicate/redundant copies of a template instantiation prior to deleting duplicates. This highlights the importance of ensuring that all instantiations for a specialisation are identical, so that whichever ones are discarded, the result will be the same

Regardless of how clever the compiler is, in a large application, building the multiple instantiations only to throw them away later can increase build times considerably

Standard Headers

An implementation may be "hosted" or "free-standing"; the former includes all the standard library headers by default. The latter does not, but must support at least the header files highlighted (eg, <cstddef>) as a minimum

Everything defined in these headers is in the std namespace, so the definitions within them must be explicitly qualified or appropriate using-declarations and/or a using-directive must be used to bring them into scope

Header                Description
C Library (these all follow the same naming pattern based on the 'C' header file name)
<cassert>             Diagnostics (assert.h)
<cctype>              Character handling functions (ctype.h)
<cerrno>              Errors (errno.h)
<cfenv>               Floating-point environment (fenv.h)
<cfloat>              Characteristics of floating-point types (float.h)
<cinttypes>           Integer types (inttypes.h)
<ciso646>             ISO 646 alternative operator spellings (iso646.h)
<climits>             Sizes of integral types (limits.h)
<clocale>             Localization library (locale.h)
<cmath>               Maths library (math.h)
<csetjmp>             Non-local jumps (setjmp.h)
<csignal>             Signal handling library (signal.h)
<cstdalign>           __alignas_is_defined (stdalign.h)
<cstdarg>             Variable arguments handling (stdarg.h)
<cstdbool>            Boolean type (stdbool.h)
<cstddef>             Standard definitions (stddef.h)
<cstdint>             Integer types (stdint.h)
<cstdio>              Standard I/O (stdio.h)
<cstdlib>             Standard general utilities library (stdlib.h)
<cstring>             Strings and memcpy (string.h)
<ctgmath>             Type-generic maths (tgmath.h)
<ctime>               Time Library (time.h)
<cuchar>              Unicode characters (uchar.h)
<cwchar>              Wide characters (wchar.h)
<cwctype>             Wide character type (wctype.h)
Containers
<array>               array
<bitset>              bitset
<deque>               deque
<forward_list>        forward_list
<list>                list
<map>                 map, multimap
<queue>               queue, priority_queue
<set>                 set, multiset
<stack>               stack
<unordered_map>       unordered_map, unordered_multimap
<unordered_set>       unordered_set, unordered_multiset
<vector>              vector, vector<bool>
I/O Streams
<ios>                 ios_base, ios
<istream>             istream, iostream
<ostream>             ostream
<streambuf>           streambuf
<iostream>            cin, cout, cerr, clog
<fstream>             ifstream, fstream, ofstream, filebuf
<sstream>             istringstream, stringstream, ostringstream, stringbuf
Concurrency
<atomic>              atomic, atomic_flag, memory_order
<condition_variable>  condition_variable, condition_variable_any, cv_status
<future>              future
<mutex>               mutex
<thread>              thread, this_thread
Miscellaneous
<algorithm>           Standard algorithms
<chrono>              duration, time_point, system_clock, steady_clock, high_resolution_clock
<codecvt>             Unicode conversion facets
<complex>             Complex numbers library
<exception>           Standard exceptions
<functional>          Function objects
<initializer_list>    Initializer list
<iterator>            Iterators
<limits>              Numeric limits
<locale>              Localization
<memory>              allocator, allocator_arg, etc, auto_ptr, shared_ptr, weak_ptr, unique_ptr, default_delete, make_shared, etc
<new>                 Dynamic memory handling
<numeric>             Generalized numeric operations
<random>              Random number generation
<ratio>               ratio
<regex>               Regular Expressions
<stdexcept>           Standard exception types
<string>              string, u16string, u32string, wstring
<system_error>        System errors
<tuple>               tuple
<typeindex>           type_index
<typeinfo>            Type information
<type_traits>         type_traits
<utility>             Utilities; pair, relational operators, rvalue handling (forward, move, move_if_noexcept, etc), swap
<valarray>            valarray - supports arrays of numeric values

'C' Functions

To access an external 'C' function from C++, use the following;-

extern "C" int fn(void);

A group declaration may also be made in order to create a linkage block;-

extern "C" {
    int a;
    void fn2(int a);
    // ...etc...
}
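Headers that are shared between plain 'C' and C++ code commonly guard the linkage block with __cplusplus, so that a 'C' compiler never sees the extern "C" syntax. A minimal sketch, assuming a hypothetical function c_add() (it is defined here in C++ only so that the example is self-contained);-

```cpp
// --- c_maths.h: usable from both 'C' and C++ ---
#ifdef __cplusplus
extern "C" {
#endif

int c_add(int a, int b);    // Has 'C' linkage when compiled as C++

#ifdef __cplusplus
}
#endif

// --- Normally found in a separate 'C' source file ---
extern "C" int c_add(int a, int b)
{
    return a + b;
}
```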

Inline Assembler

Assembler code may be embedded into C++ source code with the asm statement;-

asm(string);

main()

The entry point of any C++ program is the function main(). Its prototype is the same as for plain 'C'. In fact, two prototypes are supported. A program must specify only one of them;-

int main();                       // Short form
int main(int argc, char* argv[]); // Full/long form

Where;-

Program Termination

A program shall terminate if any of the following occurs;-

In any implementation, there are probably other ways of terminating a program such as division by zero, illegal memory access, etc

The plain 'C' (and C++) standard library function std::atexit() may be used to register a function that should be executed on normal program termination. For example;-

// My cleanup function
void my_cleanup();

// ...

// Register my cleanup function
if (!atexit(&my_cleanup)) {
    // Cleanup function registered ok
} else {
    // Error: Too many cleanup functions registered
}

Pre-Processor

There are a number of pre-processor directives. Parsing these can (and likely does) result in code modification or compiler parameter modification. Only after all pre-processor directives are parsed is the compiler presented with the resulting code

All pre-processor directives start with the character #. Here is a list of them;-

Directive Description
#include filename

Replace the #include line with the contents of the specified file. This is used to bring header files into code files, or into other header files. #include directives can be nested (a header file can be 'included' that itself includes other header files)

There are two formats for specifying filename; by using <name> or by using "name". The former syntax uses the compiler's include path. The latter is searched for in an implementation-defined manner; typically relative to the directory of the including file first, falling back to the include path if it is not found there. This distinction can be somewhat blurred in some implementations. Here are some examples;-

#include <cstddef>                  // Include the standard header file cstddef
#include "./includes/common_defs.h" // Include local header file ./includes/common_defs.h

The standard header names do not have a .h extension, hence <cstddef> and not <cstddef.h>

#define symbol value

Define a symbol with a specified value. Equivalent to using the -D option on the command line of most compilers. For example;-

#define HELLO Hello-World

…would define the symbol HELLO with the value Hello-World. Whether the value makes sense or not will depend on the context it is subsequently used in

#define symbol(arguments) value

Define a macro; a pre-processor symbol that takes arguments. For example;-

#define FN_CALL(fn, a, b) \
    (fn(a * b))

// Use the macro
FN_CALL(sleep, 5, 7); // This expands out to '(sleep(5 * 7));'

It is possible to make a macro take a variable number of arguments by using ... in the argument list and __VA_ARGS__ within the macro body

All the above examples define their values within (). Not doing this is legal but often causes errors as the expanded result creates an unexpected expression. The above also shows the use of \ as a continuation character. Macros may extend over many lines by terminating all the lines (except the last) with \

Macros cannot recurse and macro names cannot be overloaded. If adding comments to macros, use the /**/ syntax as some older tools may not understand the //… form

Macros may use the # and ## operations (described below)

#undef symbol Undefine a symbol/macro previously defined with #define
#line line-num
#line line-num filename
#line other

Overrides the values returned by __FILE__/__LINE__, and reported by the compiler. line-num is a positive integer and filename follows normal preprocessor rules for a string constant. If the supplied parameters do not match the standard formats then the supplied format is macro-expanded; the result being expected to match one of the two standard formats

#if expression

Defines a section of code that is to be parsed if the specified expression equates to true. The section of code extends to the next #else, #elif or #endif. If the expression equates to false then the section of code is removed/ignored. The following forms are allowed for expression (the () are optional but improve readability);-

The logical operators || and &&, and the relational operators ==, !=, <, >, <= and >= may be used in the expression. () may be used to force evaluation ordering

(1)          If the literal 1 ≠ zero (obviously it is)
(A)          If symbol A is defined as having an integral value of ≠ zero
(B == 42)    If symbol B = 42
defined(A)   If symbol A is defined. Equivalent to #ifdef A

Note that expression must resolve to an integral value; specifying a string or floating point shall either yield a pre-processor error or shall parse but will not generate the intended result. Note specifically that operations such as sizeof() are not allowed; they are not understood by the pre-processor

#elif expression Defines an 'else if'. This is generally used to create 'if-else-if' ladders and defines an alternate branch to a preceding #if or #elif pre-processor statement
#ifdef symbol Defines a section of code that is to be parsed if the specified symbol has been previously defined by a #define pre-processor statement. The section of code extends to the next #else or #endif. If the symbol has not been defined then the section of code is removed/ignored
#ifndef symbol This is the same as #ifdef except the logic is reversed
#else Defines an alternate (false) branch to a preceding #if (or one of the variants)
#endif Terminates the optional section indicated by a preceding #if (or one of the variants) or #else
# symbol

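Convert symbol into a string by bracketing it with " characters. This operator is only valid within the body of a macro that takes arguments, where symbol is one of those arguments. For example, given an argument of Hello-World;-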

#Hello-World

…would yield;-

"Hello-World"
symbol ## symbol

Concatenates two symbols. For example;-

Hello- ## World

…would yield;-

Hello-World
#pragma option Sets a compiler-specific option. The format of option depends on the option and the compiler. Unrecognised/unsupported #pragma directives should be ignored by the compiler
symbol Specifying a symbol (previously defined by a #define statement) on its own shall cause it to be expanded/replaced in-situ to its value
#error text Generates a compiler error, typically outputting text
#warning text

Generates a compiler warning, typically outputting text

This is a non-standard directive but is supported by several implementations. Consider using the more widely supported #pragma message instead

#pragma message ("text")

Outputs text at compile-time

This is a non-standard directive but is widely supported. Some implementations will simply output the message during compilation. Some will output the message and treat it as a compilation warning. Some will treat the message as a warning only if text begins with the text warning

This has the advantage over using #warning in that if it is not supported, it won't cause an error (unrecognised #pragma directives should be ignored by the compiler)
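The #, ## and __VA_ARGS__ facilities listed above are easiest to follow inside working macros. The following is a minimal sketch only; the macro names STRINGIFY, CONCAT and FORMAT_INTO, and the function macro_demo(), are hypothetical and chosen purely for illustration;-

```cpp
#include <cstdio>
#include <cstring>

// '#' turns a macro argument into a string literal
#define STRINGIFY(x) #x

// '##' pastes two tokens together to form a single new token
#define CONCAT(a, b) a ## b

// '...' and __VA_ARGS__ forward a variable number of arguments
#define FORMAT_INTO(buf, ...) (std::snprintf((buf), sizeof(buf), __VA_ARGS__))

int CONCAT(my_, counter) = 7;   // Declares 'int my_counter = 7;'

bool macro_demo()
{
    char text[32];
    FORMAT_INTO(text, "%s=%d", STRINGIFY(my_counter), my_counter);
    return std::strcmp(text, "my_counter=7") == 0;
}
```

Note that ## must produce a single valid token; pasting my_ and counter forms the identifier my_counter, whereas pasting two tokens that do not form a valid token is not portable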


Using the pre-processor #define statement to define macros (ie, something that will expand to a piece of C++ code) is ugly and can be the cause of many errors. Don't do it; use constant expressions instead

Having said this, there are a couple of legitimate uses of macros; to support conditional compilation and to protect against recursive inclusion

To support conditional compilation. Limit #define statements to setting control values that will only be used by subsequent #if statements, and NOT to embed into C++ code. For example;-

// Compile-in experimental feature by setting this to '1'
#define USE_EXPERIMENTAL_FEATURE (1)

#if (USE_EXPERIMENTAL_FEATURE == 1)
// Some code that implements our experimental feature
#endif

How to protect against repeated inclusion in a header file
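A common way of achieving this is the 'include guard'. The sketch below simulates a hypothetical header point.h being included twice into the same translation unit; the guard symbol POINT_H_INCLUDED is an arbitrary name that is assumed to be unique within the program;-

```cpp
// --- First inclusion of point.h ---
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED

struct point {
    int x;
    int y;
};

#endif // POINT_H_INCLUDED

// --- Second inclusion of point.h: POINT_H_INCLUDED is already defined,
// --- so everything up to the matching #endif is discarded
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED

struct point {
    int x;
    int y;
};

#endif // POINT_H_INCLUDED
```

Without the guard, the second copy of struct point would be reported as a redefinition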

Pre-Defined Macros

The compiler defines a number of macros which may be used within code. Some of these are very useful, especially for debugging

Macro Description
__cplusplus Indication of C++ compilation (rather than plain 'C'). Its value is 201103L / 201402L
__DATE__ Current date in the format Mmm dd yyyy
__TIME__ Current time in the format hh:mm:ss
__FILE__ Name of current source file. See also #line
__LINE__ Line number of source code within current file. See also #line
__func__ Name of the current function. This is a 'C'-style string and implementation defined
__STDC_HOSTED__ Has a value of 1 if the implementation is hosted, 0 (zero) otherwise
__STDCPP_DEFAULT_NEW_ALIGNMENT__ Integer of type std::size_t that defines the minimum byte alignment guaranteed by the default implementation of operator new
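As an illustration of how these macros can help with debugging, the sketch below builds a "file:line (function)" string. WHERE(), report_location() and language_version() are hypothetical names, and the exact text of __func__ is implementation-defined (although it is the plain function name on common implementations);-

```cpp
#include <string>

// Build a "file:line (function)" string describing the point of use
#define WHERE() (std::string(__FILE__) + ":" + std::to_string(__LINE__) + \
                 " (" + __func__ + ")")

std::string report_location()
{
    return WHERE();     // __func__ refers to report_location here
}

long language_version()
{
    return __cplusplus; // eg, 201103L for C++11
}
```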

Some other macros are conditionally defined by the implementation;-

Macro Description
__STDC__ Indication of plain 'C' compilation (rather than C++)
__STDC_VERSION__ May or may not be defined. Implementation-specific
__STDC_MB_MIGHT_NEQ_WC__ Set to 1 if, in the encoding for wchar_t, a member of the basic character set might have a code value that differs from its value as an ordinary character literal
__STDC_ISO_10646__ An integer in the format yyyymmL. If defined then it indicates that all characters in the Unicode required set, when stored in a wchar_t type, have the same value as the short identifier of each character. The Unicode required set is specified by ISO/IEC 10646; the version being adhered to being specified by yyyymm
__STDCPP_STRICT_POINTER_SAFETY__ Set to 1 if the implementation has strict pointer safety. Otherwise it is undefined. The function std::get_pointer_safety() returns an enumeration indicating similar information. This is only of relevance if the implementation supports and uses garbage collection
__STDCPP_THREADS__ Set to 1 if a program can have more than one thread of execution. Otherwise it is undefined. See also Concurrency

Comments

Comments are defined as follows. There are two forms; the traditional 'C' comments which start with /* and end with */. Such comments may extend for as many lines as required. There is also the C++ comment style. This allows single-line comments only, starts with // and ends with the end-of-line character

/* This is a 'C'-style single-line comment */

/* This is a 'C'-style multi-
 * line comment */

// This is a 'C++'-style single-line comment

Reserved Words

Here is a complete list of reserved keywords;-

alignas 
alignof 
and
and_eq
asm
auto 
bitand
bitor
bool
break 
case
catch
char
char16_t 
char32_t 
class 
compl
const 
constexpr 
const_cast
continue
decltype 
default 
delete 
do
double
dynamic_cast 
else
enum
explicit
extern
false
float
for
friend
goto
if
inline
int
long
mutable 
namespace
new
noexcept 
not
not_eq
nullptr 
operator
or
or_eq
private 
protected 
public 
register
reinterpret_cast
return
short
signed
sizeof 
static  
static_assert 
static_cast
struct
switch
template
this
thread_local 
throw
true
try
typedef
typeid
typename
union
unsigned
using    
virtual 
void
volatile 
wchar_t
while
xor
xor_eq

In addition, the keyword export is reserved but currently not used. It was originally intended to facilitate template definition/declaration separation, but the idea failed

The following contextual keywords are defined;-

final
override
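Being contextual, these two words only have a special meaning in particular positions within a declaration; elsewhere they may still be used as ordinary identifiers. A small sketch, using the hypothetical types shape and triangle;-

```cpp
struct shape {
    virtual ~shape() = default;
    virtual int sides() const { return 0; }
};

// 'final' on the class: nothing may derive from triangle
struct triangle final : shape {
    // 'override': a compile-time error if no base class function matches
    int sides() const override { return 3; }
};

// Outside of those positions, both words are ordinary identifiers
int count_sides(const shape& s)
{
    int final = s.sides();
    int override = final;
    return override;
}
```

Using them as identifiers is legal but does nothing for readability; it is shown here only to demonstrate the point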

The following attributes are defined (there are also several non-standard attributes in common use);-

[[carries_dependency]]
[[deprecated]] 
[[deprecated("reason")]] 
[[noreturn]]
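The following sketch shows two of these attributes applied to functions. All the function names are hypothetical, and [[deprecated]] assumes a C++14 (or later) compiler;-

```cpp
#include <stdexcept>

// [[noreturn]]: control never returns to the caller (this always throws)
[[noreturn]] void fail(const char* reason)
{
    throw std::runtime_error(reason);
}

// [[deprecated("reason")]]: callers should receive a compile-time warning
[[deprecated("use scaled() instead")]]
int doubled(int x) { return 2 * x; }

int scaled(int x, int factor = 2) { return factor * x; }

int checked_divide(int a, int b)
{
    if (b == 0) {
        fail("division by zero");   // No 'return' is needed after this call
    }
    return a / b;
}
```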

Names

A Name refers to any of the following;-


Scope

A declaration (or a definition, if no previous declaration exists) introduces a name into a scope. Here is a list of possible scopes and details of what qualifies a name to be considered to be within each scope;-

Scope       Details
Local       A name declared within a function or lambda, or as an argument to the function/lambda. The scope of the name extends from its point of declaration to the end of the enclosing block
Class       A ('member') name defined within a class, outside of any function, embedded class, enum, or namespace. The scope of the name extends to all parts of the class
Namespace   A ('namespace member') name defined within a namespace, outside of any function, lambda, class, enum, or other namespace. The scope of the name extends from the point of declaration to the end of the namespace, but may be made accessible to other translation units
Global      A ('global') name defined outside of any function, lambda, class, enum, or namespace. The scope of the name extends from the point of declaration to the end of the file it is declared in, but may be made accessible to other translation units
Statement   A 'local' name defined within the {} block of a for, while, if, or switch statement, or naked; with no preceding statement. The scope of the name extends from the point of declaration to the end of the enclosing statement block
Function    A label defined within a function. Its scope extends from the start of the function to the end
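To make the scopes listed above a little more concrete, the sketch below declares the same name, value, in several of them. All names are hypothetical; re-using one name like this is done purely to demonstrate scoping and is not recommended practice;-

```cpp
int value = 1;                  // Global scope

namespace config {
    int value = 2;              // Namespace scope
}

struct holder {
    int value = 3;              // Class scope
    int get() const { return value; }
};

int scope_demo()
{
    int value = 4;              // Local scope: hides ::value
    for (int i = 0; i < 3; ++i) {
        int value = 10 * i;     // Statement scope: hides the local 'value'
        (void)value;
    }
    return value;               // The local 'value' again
}
```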

Any name may be a type name or a non-type name. A type name is a struct, class, enum or union. A non-type name is a variable, function or an argument to a function

For the purposes of the following discussion, alias, typedef and template names do not feature (it's not possible to do so without causing an error)

Name Hiding And Qualification

Namespaces

A Namespace defines a named scope. This allows collections of logically related (type/function/object) names to be grouped together; they become members of the namespace. This notion allows the same names to exist in multiple parts of the codebase. Because they are in different scopes, the names do not interfere with each other. A namespace is declared like this;-

namespace namespace-name { …member declarations… }

Nested Namespaces

Namespaces may be nested (this is used in the standard library for the std::chrono and std::rel_ops namespaces);-

namespace my_namespace {
    // ...
    namespace my_sub_namespace {
        // Declaration
        void sub_fn();
    }
}

// Use sub_fn()
my_namespace::my_sub_namespace::sub_fn();

Inline Namespaces

Any members within a namespace declared as inline will take on the scope of the enclosing namespace. For example;-

// File: ver_3.h
namespace ver3 {
    void fn();
}

// File: ver_4.h
inline namespace ver4 {
    void fn();  // Improved version of ver3::fn()
    void fn2(); // New feature
}

namespace my_library {
    #include "ver_3.h"

    // Because namespace 'ver4' is inline, the scope of all its members behave as if declared
    // directly at this point here (within namespace 'my_library')
    #include "ver_4.h"
}

// Use the default versions of fn() and fn2() - ie, my_library::ver4::fn() and
// my_library::ver4::fn2()
my_library::fn();
my_library::fn2();

// Use a specific version of fn() and fn2()
my_library::ver3::fn();
my_library::ver4::fn();
my_library::ver4::fn2();

Unnamed Namespaces

A namespace can be created without a name such as;-

namespace {
    // ...definitions...
}

Namespace Hierarchies

Namespaces can include other namespaces. This can lead to naming conflicts. using-declarations, using-directives and namespace aliases can resolve these issues and are described in the following sections

There is often a trade-off when using (or not using) using-declarations and using-directives; a trade-off between convenience, verbosity and clarity (of where a referenced object comes from). This must be dealt-with on a case-by-case basis

Generally, if references to many names from the same namespace are being made, then a using-directive may be appropriate. If there are multiple references to only a single (or a few) members of a particular namespace then a using-declaration is probably more appropriate. For infrequent references to individual names, explicit qualification is probably better

A number of using-declarations provides much finer-grained control than a single using-directive

The using qualifier should be restricted to small scopes to avoid confusion and accidental misuse. Overuse can also cause the very name clashes that namespaces are intended to avoid
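As a sketch of keeping using confined to a small scope, both forms below are placed inside a function body, so their effect ends with the function. The functions total_length() and join() are hypothetical;-

```cpp
#include <cstddef>
#include <string>
#include <vector>

std::size_t total_length(const std::vector<std::string>& words)
{
    using std::string;          // using-declaration: only 'string', only in here

    std::size_t total = 0;
    for (const string& word : words) {
        total += word.size();
    }
    return total;
}

std::string join(const std::vector<std::string>& words)
{
    using namespace std;        // using-directive: all of std, but only in here

    string result;
    for (const string& word : words) {
        result += word;
    }
    return result;
}
```

Code outside these two functions must still write std::string in full, so the risk of an accidental name clash elsewhere is unchanged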

'using' Declaration

If a namespace-scoped name is used often, a synonym (a using-declaration) may be defined via the using qualifier. This eliminates the need to constantly explicitly qualify the name with ::. Rather than this;-

std::string s;

…it is possible to do this;-

using std::string;

// From this point on, any unqualified use of 'string' will refer to std::string...
string s;

'using' Directive

A using-directive may be used to bring into scope all members of a namespace. For example;-

// Bring in the whole standard library namespace
using namespace std;

// From this point on, we can refer to any member of std:: without qualification...
string s;
vector<int> v;

Using 'using'

Both using-declarations and using-directives may be used within other namespaces. Apart from bringing-in common external namespaces, this can be useful if a hierarchy of namespaces are being defined, such as a 'user' part and an 'implementation' part

The technique can also be used to construct local collections of other namespaces. For example;-

namespace my_library {
    using namespace external_lib1;
    using namespace external_lib2;
    using external_lib2::Z;
    using external_lib3::fn1;
    using external_lib3::X;
}

Namespace Aliases
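A namespace alias gives a long (often nested) namespace name a shorter local synonym. A minimal sketch, in which every namespace and function name is hypothetical;-

```cpp
namespace my_library {
    namespace implementation_details {
        namespace version_2 {
            inline int answer() { return 42; }
        }
    }
}

// 'impl' is now another name for the long nested namespace
namespace impl = my_library::implementation_details::version_2;

int use_alias()
{
    // Same as my_library::implementation_details::version_2::answer()
    return impl::answer();
}
```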

Fundamental Data Types

Type            Family          Guaranteed Minimum Data Size (bits)
char            Character       8 - may be signed or unsigned
signed char     Character       8 - signed
unsigned char   Character       8 - unsigned
wchar_t         Character       Implementation-specific but ≤ sizeof(long)
char16_t        Character       16 (for UTF-16)
char32_t        Character       32 (for UTF-32)
short int       Integral        16 - signed by default
int             Integral        16 (usually native arch. size) - signed by default
unsigned        Integral        as int but unsigned
long int        Integral        32 - signed by default
long long int   Integral        64 - signed by default
float           Floating Point  32
double          Floating Point  64
long double     Floating Point  Implementation-specific but at least that of double (commonly 64, 80 or 128)
bool            Boolean         Implementation-specific but ≤ sizeof(long)
void            no data         Implementation-specific
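The integral minimums in the table can be checked at compile-time on any given implementation. A sketch, assuming a hypothetical helper width_in_bits(); CHAR_BIT (from <climits>) is the number of bits in a char;-

```cpp
#include <climits>
#include <cstddef>

// Width of a type in bits
template <typename T>
constexpr std::size_t width_in_bits()
{
    return sizeof(T) * CHAR_BIT;
}

static_assert(CHAR_BIT >= 8,                    "char is at least 8 bits");
static_assert(width_in_bits<short>() >= 16,     "short is at least 16 bits");
static_assert(width_in_bits<int>() >= 16,       "int is at least 16 bits");
static_assert(width_in_bits<long>() >= 32,      "long is at least 32 bits");
static_assert(width_in_bits<long long>() >= 64, "long long is at least 64 bits");
static_assert(width_in_bits<char16_t>() >= 16,  "char16_t is at least 16 bits");
static_assert(width_in_bits<char32_t>() >= 32,  "char32_t is at least 32 bits");
```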

Fundamental Type Modifiers

The following modifiers may be applied to char, short, int, long and long long types

Modifier                      Effect
const signed type varname     Signed
const unsigned type varname   Unsigned

Regular Types

Regular type is a loose term but generally refers to a type that;-


Literal Types

A user-defined type can be used as a constant expression if it is sufficiently simple. 'Sufficiently simple' means the constructor must have an empty body and all members must be able to be initialised with constant expressions. Such a type is called a Literal Type

Here is an example of a literal type being used by several constant expressions;-

struct vector {
    int x, y, z;
    constexpr int calc(int q) const { return (q * (x + y + z)); }
};

// Use the vector type in a constant expression...
constexpr vector a {1, 2, 3};
constexpr int b = a.z;
constexpr vector c[] = {a, vector{4, 5, 6}, vector{8, 7, 6}};
constexpr int d = c[1].y;    // d == 5
constexpr int e = a.calc(d); // e == 30

In C++11, declaring a non-static member function constexpr implied const, so for calc() in the above example the const would have been redundant. C++14 removed that implication, so const must be stated explicitly if the function is to be callable on const (and therefore constexpr) objects such as a. This is why the 'Clang/LLVM' compiler complains if const is omitted from code compiled as C++11; not because it is wrong but because the function would silently change meaning under C++14

Trivial Types

A trivial type is one that has standard copy semantics; ie, it must be trivially copyable and movable. It must also have a trivial destructor. A copy/move/destructor operation is trivial if;-

The standard library predicate is_trivial<T> may be used to test the above rules

Standard Layout Types

A standard layout type is one that;-

Basically, if a type can be expressed in plain 'C' then it is probably a standard layout type

The standard library predicate is_standard_layout<T> may be used to test the above rules
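The two predicates can be exercised at compile time. The following is a minimal sketch only; the types Plain, WithCtor and MixedAccess are hypothetical and exist purely to illustrate one passing and one failing case for each predicate;-

```cpp
#include <type_traits>

// Hypothetical types, named only for this illustration
struct Plain            // Expressible in plain 'C': trivial and standard layout
{
    int x;
    double y;
};

struct WithCtor         // A user-provided default constructor makes the type non-trivial
{
    int x;
    WithCtor() : x(0) {}
};

struct MixedAccess      // Data members with differing access control: not standard layout
{
    int a;
private:
    int b;
};

static_assert(std::is_trivial<Plain>::value, "Plain is trivial");
static_assert(std::is_standard_layout<Plain>::value, "Plain is standard layout");
static_assert(!std::is_trivial<WithCtor>::value, "user-provided constructor");
static_assert(std::is_standard_layout<WithCtor>::value, "layout is unaffected by the constructor");
static_assert(!std::is_standard_layout<MixedAccess>::value, "mixed access control");
```

Because these are static_assert checks, a mistaken assumption about a type shows up as a compile error rather than as a run-time surprise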

Miscellaneous Types

Type | Description
size_t | An unsigned integer type large enough to hold the size (in bytes) of any object. It is defined in <cstddef> and its width is implementation-defined
ptrdiff_t | A signed integer type large enough to hold the result of any pointer subtraction. It is defined in <cstddef> and its width is implementation-defined
auto | This is not actually a type at all (it is a keyword) but it is used in place of a type name. A variable declared auto must be initialised at definition; the actual type of the variable is deduced by the compiler from the type of the variable or literal that is assigned to it

User-defined Data Types

Declarator Operators

Operator | Effect
type* var-name | Pointer
type* const var-name | Constant pointer
type* volatile var-name | Volatile pointer
type& var-name | lvalue reference (must be initialised)
type&& var-name | rvalue reference (must be initialised)
type var-name[] | Array
type fn-name(args) | Function
auto fn-name(args) -> type | Two operators; auto indicating a function with a suffix return-type and -> indicating the return type
auto fn-name(args) | Function with deduced return type

Derived Types

Modifier | Effect
type* var-name; | Pointer
type var-name[n]; | Array
type& var-name = initialiser; | lvalue reference (must be initialised)
type&& var-name = initialiser; | rvalue reference (must be initialised)
struct type-name {}; | Structure
union type-name {}; | Union
enum type-name {}; | "Plain" Enumeration. Size is implementation-defined
enum class type-name {}; | Class enumeration. Size is implementation-defined
class type-name {}; | Class

Deferred Type Definitions

If a type is required but that type is not yet defined, a Forward Declaration may be specified. For example;-

    struct T1;                      // Only declared; defined later
    struct T2 { T1* a; T1* b; };    // Ok: Use T1 before it is defined
    void fn(T1 a);                  // Ok: Declaration of a function that takes a T1 type
    struct T1 c;                    // Error: T1 is not defined

    // ...at some point later...
    struct T1 { int y; int z; };    // Define T1
    struct T1 d;                    // Ok

Deducing Types

There are six sets of rules for type deduction; used in a variety of scenarios;-

Manually determining a deduced type can become very difficult and complex. One method of getting the compiler to output a type is to deliberately cause an error. The following definition will do this;-

    // Declare a template with no definition
    template<typename T> class TD;

The above could be used like this;-

    template<typename U>
    void fn(U& a)
    {
        // The following will cause two errors to be generated
        // that should include the types for 'U' and 'a'
        TD<U> u_type;
        TD<decltype(a)> a_type;
    }

    // Similarly for determining the type of an 'auto'
    auto& c = x;
    TD<decltype(c)> c_type;

Don't rely on std::type_info, ie typeid(T).name(); it frequently gives a misleading answer! The reason for this is that std::type_info::name is specified to return a type as if the argument to typeid had been passed by value to a function template. This makes it unreliable because the function template type deduction rules strip references, const and volatile from such arguments. If you choose to ignore this advice, then note that some toolchains provide a c++filt command that can interpret and present the (mangled) name returned from type_info::name

The Boost library provides type_index.hpp which defines type_id_with_cvr<> which can be used to retrieve run-time type information;-

    using boost::typeindex::type_id_with_cvr;
    cout << "Type = " << type_id_with_cvr<T>().pretty_name() << endl;

Function Template Type Deduction


Type Deduction Rules

When a function template is called, two type deductions take place; one for the type T and one for the function argument(s) based on T. How this is performed depends on the function argument declaration(s) and how the function is called. Consider;-

    template<typename T> void fn(arg-type a);

The above could be called with;-

fn(expr);

The form of expr and arg-type interact to deduce the type of T and the type arg-type

Note #1 Because fn() takes a reference, the array type passed to it does not decay to a pointer. This technique can be used (for example) to create a template that returns the number of elements in an array;-

    template<typename T, std::size_t N>
    constexpr std::size_t array_size(T (&)[N]) noexcept
    {
        return N;
    }

    // Use array_size()
    int x[] = {1, 2, 3, 4, 5};
    int y[array_size(x)];   // Same number of elements as x[]

Note #2 Because fn() takes a reference, the function type passed to it does not decay to a pointer

Note #3 The deduced type of T is always non-const even if a const value is supplied in the call. This is because the const-ness is taken care of, and guaranteed by the function declaration itself
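The effect of the parameter declaration on the deduced T can be observed directly. The following is a minimal sketch; the three probe templates and the check_deduction() driver are hypothetical names introduced only for this illustration, and each probe simply reports whether the T it was instantiated with is const;-

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical probes: each reports whether the deduced T is const
template<typename T>
bool deduced_const_by_ref(T&) { return std::is_const<T>::value; }

template<typename T>
bool deduced_const_by_const_ref(const T&) { return std::is_const<T>::value; }

template<typename T>
bool deduced_const_by_value(T) { return std::is_const<T>::value; }

// Hypothetical driver
inline void check_deduction()
{
    const int cx = 0;

    assert(deduced_const_by_ref(cx));          // T is const int; const is part of T
    assert(!deduced_const_by_const_ref(cx));   // T is int; const comes from the declaration
    assert(!deduced_const_by_value(cx));       // T is int; the copy is independent of cx
}
```

Only the plain T& form carries const into T; in the other two forms T is deduced as int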




auto Object Types

auto varname = expr; deduces an object's type from its initialiser. The type may be a variable type, a const or constexpr

Make extensive use of auto and prefer it to explicit types; it makes code cleaner, less error-prone, and easier to maintain

auto Type Deduction

The rules for auto type deduction are exactly as for function template type deduction, with one exception. With reference to the simple function template;-

    template<typename T> void fn(arg-type a);

    // fn() may be called with...
    fn(expr);

…think of an auto expression as taking the form;-

auto a = expr;

Where auto takes the part of T from the template, and the deduced type of the variable a takes the part of arg-type

As with function template type deduction, there are three cases;-


The one difference between template and auto type deduction comes about from the use of uniform initialisers (which auto interprets as std::initializer_list constructs, whereas a function template refuses to deduce from them);-

Non-auto case | auto equivalent | Deduced type of auto case
int b = 42; | auto a = 42; | int
int b(42); | auto a(42); | int
int b = {42}; | auto a = {42}; | std::initializer_list<int> containing the single int element of value 42
int b{42}; | auto a{42}; | Originally std::initializer_list<int> containing the single int element of value 42; int under the revised rule adopted for C++17 (N3922)
int b = {42, 83, 11}; | auto a = {42, 83, 11}; | std::initializer_list<int> containing the three int elements of values 42, 83, 11
int b{42, 83, 11}; | auto a{42, 83, 11}; | Originally std::initializer_list<int> containing the three int elements of values 42, 83, 11; Error under the revised rule adopted for C++17 (N3922)
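The copy form (with =) behaves the same under every language version and so is the safe one to demonstrate. The following is a minimal sketch; braced_element_count() is a hypothetical helper used only to give the checks somewhere to live;-

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <type_traits>

// Hypothetical helper: returns the element count of a brace-deduced 'auto'
inline std::size_t braced_element_count()
{
    auto plain = 42;               // int
    auto single = {42};            // std::initializer_list<int>, one element
    auto several = {42, 83, 11};   // std::initializer_list<int>, three elements

    static_assert(std::is_same<decltype(plain), int>::value, "");
    static_assert(std::is_same<decltype(single), std::initializer_list<int>>::value, "");
    static_assert(std::is_same<decltype(several), std::initializer_list<int>>::value, "");

    assert(plain == 42);
    assert(single.size() == 1);
    return several.size();
}
```

The direct form auto a{42}; is deliberately left out because its deduced type depends on which version of the rule the compiler implements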

When auto Deduces The Wrong Type

auto sometimes deduces a type other than one expects. A common example is when proxy classes are being used; classes that emulate some other type. For example;-

    std::vector<bool> a(8);
    bool b = a[3];   // 'b' is a bool and contains the status of bit 3 from the vector
    auto c = a[3];   // Beware: 'c' is of type std::vector<bool>::reference and may exhibit undefined behaviour

The above problem comes about because vector defines a specialisation for bool; packing single bit bool values into words. As a result, vector<bool>::operator[]() returns a proxy class to hide this fact and to provide a clean interface to the resulting expression, with the primary intention of making vector<bool>::operator[]() look like it returns a T& in the same way as the general vector<T>::operator[]() operation does

Such a proxy class is intended to be transparent and usually only used as an rvalue. In the above case, the type std::vector<bool>::reference probably contains an internal pointer which, if a were an rvalue (say, returned from a function), would be left dangling when assigned to c. Other proxy classes include 'smart' pointers std::unique_ptr and std::shared_ptr (though these are designed to be more visible), std::bitset (similar issue to vector<bool>), and many others.

In cases such as these, options include:-
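One widely used option is to keep auto but force the conversion with an explicit cast at the point of initialisation, so that the proxy never outlives the full expression. The following is a minimal sketch; make_flags(), fourth_flag() and check_fourth_flag() are hypothetical names introduced only for this illustration;-

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical helper, used only to give the example an rvalue vector<bool>
inline std::vector<bool> make_flags()
{
    return std::vector<bool>{true, false, true, true};
}

inline bool fourth_flag()
{
    // auto c = make_flags()[3];   // 'c' would be a proxy into a destroyed temporary
    auto c = static_cast<bool>(make_flags()[3]);   // Converted before the temporary is destroyed

    static_assert(std::is_same<decltype(c), bool>::value, "c is a real bool");
    return c;
}

inline void check_fourth_flag()
{
    assert(fourth_flag());
}
```

The cast documents the intent (a bool is wanted, not whatever operator[] happens to return) while still leaving the declaration in auto form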

Lambda Capture Type Deduction

There are three types of lambda capture;-

Normally, the difference between the way by-value and init-capture handle const (and volatile) is not important because the object that a lambda creates is const anyway. However, if the lambda is made mutable then the difference does become apparent
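The difference can be made visible with a mutable lambda. The following is a minimal sketch (init-capture requires C++14); check_capture_constness() is a hypothetical name used only to hold the example;-

```cpp
#include <cassert>

inline void check_capture_constness()
{
    const int x = 1;

    // By-value capture: the closure's copy keeps the const of 'x', so even a
    // mutable lambda may not modify it
    auto by_value = [x]() mutable { /* ++x; would not compile */ return x; };

    // Init-capture: 'y' is deduced as if by 'auto y = x;', which drops the const
    auto by_init = [y = x]() mutable { return ++y; };

    assert(by_value() == 1);
    assert(by_init() == 2);
    assert(by_init() == 3);   // The closure's own copy persists between calls
}
```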

decltype

decltype(expr) deduces a type from an expression; that is, the declared type

decltype Type Deduction

Given | decltype Expression | Yields Deduced Type
int a = 0; | decltype(a) | int
const int b = 0; | decltype(b) | const int
const int& c = b; | decltype(c) | const int&
X d; char fn(const X& e); | decltype(d) | X
X d; char fn(const X& e); | decltype(fn) | char(const X&)
X d; char fn(const X& e); | decltype(fn(d)) | char
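These results can be confirmed by the compiler itself. The following is a minimal sketch; the empty struct X is a stand-in for whatever user-defined type is intended, and fn() is declared but never defined because decltype does not evaluate its operand. The final check, on a parenthesised name, is an addition not taken from the table;-

```cpp
#include <type_traits>

struct X {};            // Stand-in for the user-defined type
char fn(const X& e);    // Declaration only; never called

int a = 0;
const int b = 0;
const int& c = b;
X d;

static_assert(std::is_same<decltype(a), int>::value, "");
static_assert(std::is_same<decltype(b), const int>::value, "");
static_assert(std::is_same<decltype(c), const int&>::value, "");
static_assert(std::is_same<decltype(d), X>::value, "");
static_assert(std::is_same<decltype(fn), char(const X&)>::value, "");
static_assert(std::is_same<decltype(fn(d)), char>::value, "");

// A parenthesised name is an lvalue expression rather than a name,
// so decltype yields a reference
static_assert(std::is_same<decltype((a)), int&>::value, "");
```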

Function Return Type Deduction

See function return type deduction

Literals

Literal | Family | Implied Type
n | Integral | int sign extended
nL nl | Integral | long sign extended
nLL nll | Integral | long long sign extended
nU nu | Integral | unsigned
nUL nul nLU nlu | Integral | unsigned long
nULL null nLLU nllu | Integral | unsigned long long
0Bb 0bb | Integral | int binary notation
0o | Integral | int octal notation
0Xh 0xh | Integral | int hexadecimal notation
n.n n.nEn n.nE-n n.nen n.ne-n | Floating Point | double
n.nL n.nl | Floating Point | long double
n.nF n.nf | Floating Point | float
true / false | Boolean | bool
'c' | Character | char (int in 'C')
"c…" | String | const char[n] where 'n' is the length of the string + 1 (for the NULL terminator)
nullptr | Pointer | std::nullptr_t (converts implicitly to any pointer type)

Character and String Literals

The following 'escape' (\) codes may be used in characters or strings

Name | ASCII Name | C++ Name | Notes
Newline | NL (LF) | \n
Horizontal tab | HT | \t
Vertical tab | VT | \v
Backspace | BS | \b
Carriage return | CR | \r
Form feed | FF | \f
Alert | BEL or alert | \a | Emits a sound on some consoles
Backslash | \ | \\
Question mark | ? | \? | Can be useful for avoiding confusion with trigraph characters
Single quote | ' | \'
Double quote | " | \"
Octal number | ooo | \ooo | One to three octal digits. A leading zero is not required, and there is no decimal form
Hexadecimal number | hh | \xhh
Type | Representation | Description
char | 'c' | Plain character. Almost always ASCII
wchar_t | L'c' | Wide character. Implementation-specific
char16_t | u'\uhhhh' u'\xhhhh' | 16 bit Hex Unicode Code-Point. \uhhhh is shorthand for \U0000hhhh
char32_t | U'\Uhhhhhhhh' U'\uhhhh' U'\xhhhhhhhh' | 32 bit Hex Unicode Code-Point
char string | "c…" | Defines const char[n] where 'n' is the length of the string + 1 (for the NULL terminator)
Raw char string | R"(c…)" | A char string but the normal escaped '\' characters are not interpreted. Useful for creating regex
wchar_t string | L"c…" | Wide character string. Terminated with L'\0'
wchar_t string (raw) | LR"(c…)" | Raw wide character string. Terminated with L'\0'
UTF-8 char string | u8"c…" | UTF-8 string. Terminated with '\0'
UTF-8 char string (raw) | u8R"(c…)" | Raw UTF-8 string. Terminated with '\0'
UTF-16 char16_t string | u"c…" | UTF-16 string. Terminated with u'\0'
UTF-16 char16_t string (raw) | uR"(c…)" | Raw UTF-16 string. Terminated with u'\0'
UTF-32 char32_t string | U"c…" | UTF-32 string. Terminated with U'\0'
UTF-32 char32_t string (raw) | UR"(c…)" | Raw UTF-32 string. Terminated with U'\0'

String Literals

Example | Description
const char* p = "Hello"; | Assign a string literal to a pointer to const char
const char16_t* p = uR"(Hello)"; | Assign a (raw) UTF-16 string literal to a pointer to const char16_t
char* p = "Hello"; | Illegal; not a pointer to const
char p[] = {"Hello"}; | Assign to an array. Receiving array does not have to be const and (in this case) is automatically sized to that of the string + 1 (for the NULL terminator); ie, 6. Note that the {} are optional
char p[] = {"Goodbye " "Cruel " "World"}; | A string literal may be composed of one or more sub-strings (automatically) concatenated together; useful for specifying very long strings. In this example, the resulting string shall be 19 characters + 1 (for the NULL terminator)

The namespace std::literals::string_literals defines operators for defining std::string literals;-

Example | Description
auto s = "Hello"s; | A std::string literal
auto s = L"Hello"s; | A std::wstring literal
auto s = u"Hello"s; | A std::u16string literal
auto s = U"Hello"s; | A std::u32string literal
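The s suffix only becomes available once the literal operators have been brought into scope. The following is a minimal sketch (C++14 or later); check_string_literals() is a hypothetical name used only to hold the example;-

```cpp
#include <cassert>
#include <string>
#include <type_traits>

inline void check_string_literals()
{
    using namespace std::string_literals;   // Brings operator""s into scope

    auto narrow = "Hello"s;   // std::string
    auto wide = L"Hello"s;    // std::wstring
    auto plain = "Hello";     // No suffix: decays to const char*

    static_assert(std::is_same<decltype(narrow), std::string>::value, "");
    static_assert(std::is_same<decltype(wide), std::wstring>::value, "");
    static_assert(std::is_same<decltype(plain), const char*>::value, "");

    assert(narrow.size() == 5);
    assert(wide.size() == 5);
    assert(plain[0] == 'H');

    // The operator receives the literal's length, so an embedded '\0' is preserved
    assert("ab\0cd"s.size() == 5);
}
```

Confining the using-directive to function scope keeps the suffix from leaking into unrelated code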

Word Alignment

The alignof(X) operator returns the alignment requirement, in bytes, of the specified type (some compilers also accept a variable or expression as an extension)

The alignment of a variable can be forced to be the same as some other type by using the alignas() specifier. For example alignas(X) int data[42]; will set the alignment of the array data to be the same as that of type X (provided this is not weaker than the natural alignment of int)
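Both facilities can be checked with a few assertions. The following is a minimal sketch; X, Block and check_alignment() are hypothetical names, and the value 16 is an arbitrary choice assumed to be supported by the target platform;-

```cpp
#include <cassert>
#include <cstdint>

struct X { double d; };                        // Hypothetical type to borrow an alignment from
struct alignas(16) Block { char bytes[4]; };   // Hypothetical over-aligned type

static_assert(alignof(Block) == 16, "alignas sets the alignment of the type");
static_assert(sizeof(Block) % alignof(Block) == 0, "size is padded to a multiple of the alignment");

inline void check_alignment()
{
    alignas(X) int data[42];   // Array of int, aligned as strictly as an X

    auto address = reinterpret_cast<std::uintptr_t>(&data[0]);
    assert(address % alignof(X) == 0);
}
```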

Variable Declarations and Definitions

lvalues And rvalues

Every expression is either an lvalue or an rvalue, but not both

Term | Description
object | A contiguous region of memory/storage (this is a low level definition and not to be confused with class objects)
lvalue | A name or expression that refers to an 'object'. An lvalue has an identity and can not be moved. Generally, an lvalue is a name for which the address may be taken
rvalue | A "value that is not an lvalue". An rvalue can be moved/copied (assigned to an lvalue). It is not possible to take the address of an rvalue or assign to an rvalue
glvalue, xvalue, prvalue | For completeness only; a glvalue (generalised lvalue) is any expression that has identity (an lvalue or an xvalue), an xvalue (extraordinary) has identity and can be moved, and a prvalue (pure rvalue) has no identity but can be moved

Object Lifetime

Type | Initialisation and Lifetime
automatic | (applicable only in function scope) Initialised on each encounter (if at all). Usually stack-based. Exist from definition to end of scope. The term 'automatic' is a legacy term and has now largely fallen out of use. It is not to be confused with the auto type
static | When used within a function scope, initialised once only at definition and exists until end of program execution
Free store | Exist from the time of an explicit new operation until destroyed with delete
Temporary object | Intermediate result in computation, or an object for a const reference. The lifetime depends on context. These are almost always stack-based objects
Thread-local object | An object declared thread_local. Lifetime is that of the host thread

Storage Classes

Storage Class | Effect on variable
extern type var | Indicates that var is defined elsewhere (possibly in some other compilation unit), and that this is only a declaration. A variable declaration that includes an initial value is a definition regardless of the use of extern
static type var | Implicitly initialised to binary zeros in the absence of an explicit initialiser. If in file (global) scope, prevents the variable from being accessed from an external file; ie, it creates an internal linkage. If the variable is part of a class, it is common (shared) between all instances of that class. If in function scope, the variable retains its value between function invocations. Such a variable will be initialised once only; on the first invocation of the function. Initialisation is undefined if performed recursively. For example;-

    int fn(int a)
    {
        static int n = a;           // Ok
        static int m = fn(a + 1);   // Undefined: fn() is re-entered while 'm' is being initialised
        return n + m;
    }

register type var | Hint to compiler to give speed priority to variable. Can not be referred to by a pointer. Often ignored by modern compilers. register is deprecated in C++11 and removed in C++17
volatile type var | All reads/writes from/to var shall be honoured (not optimised away)
automatic | Variable is allocated on the stack and is destroyed at the end of the containing block's scope. Implicit when variable is defined within a function. The use of the auto keyword to explicitly identify an automatic variable was removed in C++11
global | Implicit when variable is allocated at file scope, outside of any function. Implicitly initialised to zero in the absence of an explicit initialiser
thread_local type var | Indicates that each thread is allocated its own copy of var

volatile

The volatile modifier may be applied to any object declaration. Some examples;-

    volatile int a;    // The int 'a' is volatile
    volatile int* b;   // The int referred to by 'b' is volatile
    volatile X c;      // The object c is volatile

The purpose of volatile is as an instruction to the compiler not to optimise-away what may seem like redundant read/write operations. For example;-

    volatile int* p;
    // ...
    *p = 0;
    *p = 1;

In the above example, had p not been declared volatile then the compiler would likely optimise-away the line *p = 0; because it would "know" that it was redundant. The volatile modifier prevents such optimisation

Declarations and Definitions

All variables and functions must be declared or defined before they are referenced

Declaration | Description
char c; | A variable of type char. Implicitly initialised to zero if in global scope, or static
char c, d, e; | Three variables of type char. Implicitly initialised to zero if in global scope, or static
char c = 'B'; | A char, assigned a value of 'B'
int i = 123; | An int, assigned a value of 123
int i = j; | An int, initialised with the variable 'j'; the type of 'j' may or may not be the same; if it is not the same then some implicit conversion is required/expected
double f {3.14}; | A double, assigned a value of 3.14
const char* a = "Hello"; | A char pointer, assigned a (const) string value of "Hello"
const char* a[] = {"yes", "no"}; | An unsized array of char pointers, each assigned a separate (const) string value. The array is unsized; its actual size shall be determined by the number of values assigned
auto a = 123; | A variable of type auto, assigned a value of 123. The actual type shall be selected at compile time (in this example, it will be of type int)
X y {123, "Hello", true}; | A user-defined type X, initialised with three values; an Integral, a const string, and a bool
int stop() {…} | A function definition. The function takes no parameters and returns an int
bool go(const colour_t colour); | A function declaration. The function takes a single parameter of type colour_t and returns a bool
using T = int; | An alias for type int

There is rarely a need to introduce a new variable before there is a value available for it; reducing variable scope and delaying name introduction within a scope can help minimise errors and reduce namespace pollution

This is in the spirit of the RAII (Resource Acquisition Is Initialisation) principle; it also improves performance as no default initialiser is called prior to assigning a 'real' value. The most common reason for NOT giving a variable its 'real' value at definition is that it needs to be passed to a function that initialises it

One very neat example of employing the principles of limiting a variable's scope and delaying declaration until we have a value to initialise it with is to place the definition within the condition of an if statement;-

    if (int a = fn())
    {
        // We are free to manipulate 'a' but its scope is limited to this block
    }
    else
    {
        // The scope of 'a' extends to here too
    }

Only a single declaration may be made in this way, and it must be initialised (which is implied by the fact that the if statement operates on an expression)

Constants

An object may be variable (mutable) or constant (immutable). The former is the default. The latter can be achieved with the const specifier. The general form is;-

const DataType VarName = n; // 'VarName' is a constant value 'n'

Some examples;-

Example | Description
const int life = 42; | A constant int
const char* p = "Hello"; | A pointer to a constant
char* const p = buf; | A constant pointer (buf is assumed to be a non-const char array; a string literal may not be used here)
const char* const p = "Hello"; | A constant pointer to a constant
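The position of const relative to the * decides whether it is the pointer or the thing pointed to that is protected. The following is a minimal sketch; check_const_pointers() and its local names are hypothetical;-

```cpp
#include <cassert>
#include <type_traits>

static_assert(!std::is_const<const char*>::value, "const char*: the pointer itself is not const");
static_assert(std::is_const<char* const>::value, "char* const: the pointer is const");
static_assert(std::is_const<std::remove_pointer<const char*>::type>::value,
              "const char*: the object pointed to is const");

inline void check_const_pointers()
{
    char first[] = "Hello";
    char second[] = "World";

    const char* to_const = first;   // May be re-pointed, may not be written through
    to_const = second;
    // *to_const = 'w';             // Would not compile

    char* const fixed = first;      // May be written through, may not be re-pointed
    *fixed = 'J';
    // fixed = second;              // Would not compile

    assert(to_const[0] == 'W');
    assert(first[0] == 'J');
}
```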

Aliases

A type alias generates a new name for an existing type. The new and old names are interchangeable; they are not distinctly different types. Aliases are useful to insulate the code from the underlying type details and allow for simpler modification

There are two methods of creating a type alias;-

Syntax | Details
typedef existing_type new_type; | 'C'-style
using new_type = existing_type; | 'C++'-style

Every container in the standard library defines value_type which is a synonym for the type it has been instantiated with. For example;-

    template<class T>
    class vector
    {
    public:
        using value_type = T;
        // ...etc...
    };
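That the two alias forms name one and the same type can be confirmed at compile time. The following is a minimal sketch; all of the alias names are hypothetical;-

```cpp
#include <type_traits>
#include <vector>

typedef unsigned long counter_c;      // 'C'-style
using counter_cpp = unsigned long;    // 'C++'-style
static_assert(std::is_same<counter_c, counter_cpp>::value, "both name unsigned long");

// The 'C++'-style reads left to right, which helps with function pointers
typedef int (*handler_c)(const char*);
using handler_cpp = int (*)(const char*);
static_assert(std::is_same<handler_c, handler_cpp>::value, "both name the same pointer type");

// value_type recovers the element type from a container alias
using samples = std::vector<double>;
static_assert(std::is_same<samples::value_type, double>::value, "element type is double");
```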

Initialisation

C++ initialisation is notoriously complex; there are several forms, each with its own set of rules and caveats, and these are detailed below

However, in most normal use cases, the fine detail can be largely ignored (or at least not worried about too much). The point being, don't get bogged-down in the detail; in many cases, as long as the code is sensible, initialisation will behave sensibly. Mostly… maybe 

Zero Initialisation

If an object of type T is declared static or thread_local then it is zero initialised


Default Initialisation

If an object of type T is defined with no explicit initialiser at all, for example T a; then default initialisation takes place


Value Initialisation

If an object of type T is defined with an explicit empty initialiser, for example T a{};, T a = {};, or T a(); (though this last form will be parsed as a function declaration in most cases) then value initialisation takes place
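The three forms described so far can be compared side by side. The following is a minimal sketch; Pair, file_scope_pair and check_initialisation() are hypothetical names, and the default initialised case is left commented out because reading such a value is undefined;-

```cpp
#include <cassert>

struct Pair { int a; int b; };   // Hypothetical aggregate

Pair file_scope_pair;            // Static storage duration: zero initialised

inline void check_initialisation()
{
    static int call_count;       // Function-scope static: zero initialised
    int value{};                 // Value initialised: 0
    Pair pair{};                 // Each member is value initialised: 0
    // int raw;                  // Default initialised: indeterminate, must not be read

    assert(file_scope_pair.a == 0 && file_scope_pair.b == 0);
    assert(call_count == 0);
    assert(value == 0);
    assert(pair.a == 0 && pair.b == 0);
}
```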


Direct Initialisation

This occurs when T is a (possibly const/volatile) type and it defines a non-default constructor that may be called with the arguments specified by the initialisation T a(arguments);

This behaves like a function call (to a constructor). Both explicit and non-explicit constructors are considered and overload resolution is employed to identify the best candidate, and (if required) implicit conversion is used to modify the argument(s) to match the constructor (unless the constructor is explicit)

Here describes the effects of direct initialisation on particular types


Copy Initialisation

This occurs when T is a (possibly const/volatile) type and it defines a non-default constructor that may be called with the arguments specified by the initialisation T a = initialiser;

This is an initialisation and conversion operation. If the initialiser does not exactly match the object's type, then methods shall be sought to convert the initialiser to the correct type and overload resolution shall be employed to determine if the initialiser can be used to construct the object

Given the initialiser X a = b;, b is converted to a type X and the result passed to a copy constructor of X (or if b is an rvalue, possibly a move constructor). Both explicit and non-explicit constructors are considered for overload resolution purposes, but only non-explicit versions may be selected. In practice, the copy/move is usually optimised-away by the compiler. The effect of this behaviour when compared with direct initialisation can be seen here

Since C++17, the optimisation that elides the call to the copy/move constructor is mandatory when the initialiser is a temporary of the same type, even if invoking such a constructor would have side-effects. As a result, explicit constructors may be selected. Technically, the rule regarding not selecting explicit constructors still applies, but as the constructor is guaranteed to never be called, the rule never comes into play


Direct List Initialisation

This occurs when T is a (possibly const/volatile) type and it defines a non-default constructor that may be called with the arguments specified by the initialisation T a{arguments};

If the type defines one or more constructors, then this forms Direct List Initialisation. This behaves like a function call (which is what it is; a function call to a constructor). Both explicit and non-explicit constructors are considered and overload resolution is employed to identify the best candidate, and (if required) implicit conversion is used to modify the argument(s) to match the constructor (unless the constructor is explicit). If an initializer_list constructor is defined then this will usually be selected in preference to other constructors

Here describes the effects of direct list initialisation on particular types


Copy List Initialisation

This occurs when T is a (possibly const/volatile) type and it defines a non-default constructor that may be called with the arguments specified by the initialisation T a = {arguments};

This behaves similarly to direct list initialisation except that explicit constructors are considered for overload resolution purposes but never used; selection of an explicit constructor is an error
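The practical consequence of explicit can be seen by trying each form in turn. The following is a minimal sketch; the types Implicit and Explicit are hypothetical, and the forms that are rejected are left commented out;-

```cpp
#include <type_traits>

// Hypothetical types differing only in 'explicit'
struct Implicit { Implicit(int) {} };
struct Explicit { explicit Explicit(int) {} };

Implicit a1(1);        // Direct initialisation
Implicit a2{1};        // Direct list initialisation
Implicit a3 = 1;       // Copy initialisation
Implicit a4 = {1};     // Copy list initialisation

Explicit b1(1);        // Ok
Explicit b2{1};        // Ok
// Explicit b3 = 1;    // Error: the constructor is explicit
// Explicit b4 = {1};  // Error: the explicit constructor is selected, which is ill-formed

// The same distinction expressed as type traits
static_assert(std::is_constructible<Explicit, int>::value, "direct forms are allowed");
static_assert(!std::is_convertible<int, Explicit>::value, "copy forms are not");
static_assert(std::is_convertible<int, Implicit>::value, "both forms are allowed");
```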


Aggregate Initialisation

If T is a (possibly const/volatile) struct or class and it does not define any user-provided constructors at all, or virtual member functions (or inherited or explicit constructors, or virtual or private or protected base classes), then it is eligible for aggregate initialisation


An example of how the declaration (or not) of user-provided constructors can affect initialisation;-

    struct X { int m1; X() = default; };
    struct Y { int m2; Y(); };
    Y::Y() = default;

    X a{};
    Y b{};
    // At this point, a.m1 == 0, and b.m2 == undefined!

The reason for the above result is as follows;-

The way to avoid the above problem is;-



For user-defined types, there can be differences between the behaviour of the different initialisation forms; it depends on the type's implementation

Example; given the following declarations;-

    class X;

    class Y
    {
    public:
        Y();                          // Default constructor 1
        Y(int a);                     // Constructor 2
    };

    class X
    {
    public:
        X(int a);                     // Constructor 4
        X(const X& rhs);              // Copy constructor 5
        X(X&& rhs);                   // Move constructor 6
        X(const Y& rhs);              // Copy constructor 7
        X(Y&& rhs);                   // Move constructor 8
        X& operator=(const X& rhs);   // Copy operator 9
        X& operator=(X&& rhs);        // Move operator 10
        X& operator=(const Y& rhs);   // Copy operator 11
        X& operator=(Y&& rhs);        // Move operator 12
    };

…the following definitions will invoke the indicated constructor(s)/operation(s);-

    X x{1};          // Calls 4

    X a{x};          // These four definitions shall call 5
    X b = {x};
    X c = x;
    X d(x);

    // In principle, these four definitions shall call 4 and then take that
    // result to call 6 (or maybe 5). In practice, only the call to 4 is
    // likely to be made; the call to 6 (or 5) is optimised-away
    X e{X{3}};
    X f = {X{3}};
    X g = X{3};
    X h(X{3});

    Y v;             // These two definitions shall call 1
    Y w{};
    Y z();           // This shall be parsed as a function declaration

    Y y{3};          // Calls 2

    X j{y};          // These four definitions shall call 7
    X k = {y};
    X m = y;
    X n(y);

    // These four definitions shall call 2 and then take that result to call 8
    X p{Y{3}};
    X q = {Y{3}};
    X r = Y{3};
    X s(Y{3});

    p = q;           // Calls 9
    p = X{3};        // Calls 4 and with the result, calls 10
    p = y;           // Calls 11
    p = Y{3};        // Calls 2 and with the result, calls 12

() vs {} in initialisers




More complex and multi-element initialisation is achieved with an initialiser list

Example | Effect
int a[]{1, 2, 3}; | Array initialisation. Array is implicitly sized to 3 elements
int a[42]{1, 2, 3}; | Array initialisation. Array is explicitly sized to 42 elements. The first three elements are initialised as specified and the remainder are initialised to zero
int a[42]{}; | Array initialisation. Array is explicitly sized to 42 elements and all elements are initialised to zero
struct S {int x; string s;}; S s{1, "Hello"}; | Structure initialisation
struct S {int x; string s;}; S s{}; | Default structure initialisation; S s{} is equivalent to S s{{}, {}} which expands out to S s{{0}, {""}}
complex<double> z{0, pi}; | Use of constructor
vector<double> v{0.0, 1.1, 2.2}; | Use of list constructor
complex<double> z(0, pi); | Use of constructor ('function' style)
vector<double> v(10, 8.3); | Use of constructor (10 elements, all initialised to 8.3). Note that v{10, 8.3} would instead select the list constructor and create 2 elements
complex<double> z(); | This is a function declaration! (because in a declaration, an empty () always indicates a function)
complex<double> z{}; | …whereas this is a default initialiser
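The choice between {} and () matters most for types that define an initializer_list constructor, std::vector being the usual example. The following is a minimal sketch; check_brace_vs_paren() is a hypothetical name used only to hold the example;-

```cpp
#include <cassert>
#include <vector>

inline void check_brace_vs_paren()
{
    std::vector<double> listed{10, 8.3};   // List constructor: the 2 elements 10.0 and 8.3
    std::vector<double> filled(10, 8.3);   // Count/value constructor: 10 elements of 8.3

    assert(listed.size() == 2);
    assert(listed[0] == 10.0);
    assert(filled.size() == 10);
    assert(filled[9] == 8.3);
}
```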

Unspecified Initialisation

Default initialisation may be specified as an empty list. For a type X, this could be X a{};. It is also possible to omit even the default initialiser; eg, X a;

When no initialiser is specified at all like this, there are special rules concerning the initialisation;-

For example;-

    int* a = new int;     // *a will be uninitialised
    int* b = new int{};   // *b will be initialised to zero

    struct X { int a; int b; };
    X* c = new X;         // c->a and c->b will both be uninitialised
    X* d = new X{};       // d->a and d->b will both be initialised to zero

Be safe by always explicitly initialising objects unless there is a specific reason not to; a classic case being a read/write buffer

Pointers

A pointer may refer to any object that has identity; that is, the object resides at a specific address

Example | Description
char c = 'a'; char* p = &c; | p is a pointer and holds the address of c. '&' is the 'address-of' operator
char d = *p; | d == 'a'. '*' is the dereference operator
int* a; | Pointer to an int
const int* a; | Pointer to a const int. The object (int) referred to may not be modified via this pointer
int* const a = b; | const pointer to an int (must be initialised)
const int* const a = b; | const pointer to a const int (must be initialised)
volatile int* a; | Pointer to a volatile int
volatile int* const a = b; | const pointer to a volatile int (must be initialised)
char** c; | Pointer to a pointer to a char
int* a[8]; | An array of 8 pointers to ints
int (*fn)(char* param); | Pointer to a function that takes a char* argument and returns an int. See also Function Pointers
int* fn(char* param); | Function declaration that takes a char* argument and returns a pointer to an int
void* a; | Pointer to an object of unknown type
nullptr | A literal that represents a null pointer

nullptr

nullptr is a literal that represents a null pointer

Prefer nullptr to 0 or NULL
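The main practical reason for this advice is overload resolution: 0 is an int before it is a null pointer. The following is a minimal sketch; Chosen, which() and check_nullptr_overloads() are hypothetical names introduced only for this illustration;-

```cpp
#include <cassert>

enum class Chosen { integer, pointer };   // Hypothetical result type

inline Chosen which(int)   { return Chosen::integer; }
inline Chosen which(void*) { return Chosen::pointer; }

inline void check_nullptr_overloads()
{
    assert(which(0) == Chosen::integer);         // 0 is an exact match for int
    assert(which(nullptr) == Chosen::pointer);   // nullptr can never select the int overload
    // which(NULL);   // Selects the int overload or fails to compile, depending on how NULL is defined
}
```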

Arrays

Example | Description
type varname[size]; | An array with 'size' number of elements is defined like this
varname[n] | Array element 'n' is referenced like this
*varname | Directly dereference the first element
type* v = &varname[n] | Create a pointer to element 'n' of an array
type varname[size_x][size_y]; | A 2 dimensional array (defined as an array of arrays)
type* varname = new type[size]; | An array allocated on the heap
int nums[3] = {1, 2, 3}; | A 3 element array populated with 3 values
int nums[42] = {1, 2, 3}; | A 42 element array. The first 3 elements are populated with values and the remaining elements are initialised to zero
int nums[] = {1, 2, 3}; | An array initialised with 3 numbers. The array is automatically sized to match the number of initial values (3) and populated accordingly
char name[] = {'H', 'i'}; | An array initialised with 2 characters. The array is automatically sized to match the number of initial values (2) and populated accordingly
char name[] = {"Hello"}; | An array initialised with a 'C-style' string literal. The array is automatically sized to that of the string + 1 (for a NULL terminator) and populated accordingly

References

A reference may refer to any object that has identity; that is, the object resides at a specific address. Unlike a pointer though, a reference is a named alias for the target object rather than a distinct variable. A reference is NOT an object. It is not unreasonable to view a reference as a const pointer to an object that is implicitly dereferenced when required

There are three 'types' of references; those that refer to lvalues (objects that we want to modify), those that refer to const lvalues (objects we don't want to modify), and those that refer to rvalues (generally, temporary references generated by the compiler/run-time environment). See also lvalues And rvalues

Example | Description
int i = 42; int& r = i; | r is a reference to i
int& r2 {i}; | r2 is a reference to i
int j = r; | j == 42 (note implicit dereference; no special syntax)
int* p = &r; | p is a pointer to i
r = 83; | i == 83
++r; | i == 84
const int& s = r; | s is a const reference to i (i may not be modified via this reference)
const int& t {42}; | It is not possible to initialise a non-const lvalue reference with an rvalue. Here, a temporary object is created to hold the (rvalue) literal 42, thus allowing the desired effect. The reference must be const. The temporary's lifetime is the same as that of the reference
int& f(char& param); | Function declaration that takes a char reference argument and returns a reference to an int. The object that param refers to may be modified from within the function
int& f(const char& param); | Same as the previous example except that the object that param refers to may NOT be modified from within the function

References and pointers differ in the following ways;-

rvalue References

It is very useful to know whether a reference refers to a temporary object (one which will not be used after the operation in hand). If it does, and we want to keep that object's value, we can often perform an inexpensive move operation rather than a (potentially) expensive copy operation. An rvalue reference provides exactly this information

The most common example of this is a function return value where a temporary object being returned will be saved into a caller-specified variable and then the temporary object destroyed. If we can just move the temporary object to the destination variable rather than copying it then this is a "good thing". Another example is an object (such as a std::string or std::list) that is actually just a very small handle to a potentially huge amount of data

Example                   Description
int&& r {f()};            rvalue reference to the temporary returned by f() (assuming f() returns an int by value)
int&& r {var};            Illegal; rvalue reference to an lvalue (var)
string&& r {"Hello"};     rvalue reference to a temporary
void fn(string&& r);      A function that takes an rvalue string reference

A classic example of where rvalue references become valuable is in a swap function. A traditional swap function would have to create a temporary and make at least one copy of the parameters offered to it. Using rvalue references, the copy operations are avoided;-

// A "perfect" swap (almost)
template<class T>
void swap(T& a, T& b)
{
    T tmp {static_cast<T&&>(a)};   // The initialization may write to a
    a = static_cast<T&&>(b);       // The assignment may write to b
    b = static_cast<T&&>(tmp);     // The assignment may write to tmp
}

By using the static_cast<T&&> (resulting in a T&& type), the compiler is able to make use of any optimised operators for the type. In this example, that would be a move-constructor or a move-assignment. The standard library containers support these as well as rvalue versions of insert() and push_back() etc

Because the static_cast<T&&> construct is a little verbose and ugly, the standard library provides the function move(x) which means the same thing. Note that move() does not actually move anything; it is a somewhat misleading name. Regardless, we can improve the example above as follows;-

// An "improved" swap
template<class T>
void swap(T& a, T& b)
{
    T tmp {move(a)};   // Move from a
    a = move(b);       // Move from b
    b = move(tmp);     // Move from tmp
}

As it stands, the above swap function will only accept two lvalues as parameters. To allow it to accept an rvalue as a parameter as well, we could include the two overloads;-

void swap(T&& a, T& b); void swap(T& a, T&& b);

The standard library containers deal with this issue in a different way, using shrink_to_fit() and clear()
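
To see the effect of a move-based swap, the following sketch exchanges two vectors and checks that the underlying buffers changed hands rather than being copied. The names move_swap and buffers_exchanged are illustrative only, and the buffer check assumes the usual behaviour of std::vector with the default allocator;-

```cpp
#include <cassert>
#include <string>
#include <utility>   // std::move
#include <vector>

// Illustrative name; deliberately not called swap, to avoid clashing with std::swap
template<class T>
void move_swap(T& a, T& b)
{
    T tmp {std::move(a)};   // Move-construct tmp from a
    a = std::move(b);       // Move-assign from b
    b = std::move(tmp);     // Move-assign from tmp
}

// Returns true if the buffers (not merely the values) were exchanged
inline bool buffers_exchanged()
{
    std::vector<int> x(1000, 1);
    std::vector<int> y(2000, 2);
    const int* px = x.data();
    const int* py = y.data();

    move_swap(x, y);

    assert(x.size() == 2000 && y.size() == 1000);
    return x.data() == py && y.data() == px;
}
```

No element is copied here; each move simply transfers ownership of a buffer from one small handle to another.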

Reference Collapsing

It is possible to take a reference to a reference, though this is only syntactically legal via use of an alias or a template reference parameter

Following on from this point, reference collapsing is the mechanism that allows a function template that takes a forwarding reference to be called with an argument that is already a reference, producing (say) fn(T& && a) which then reduces down to something that is syntactically legal; fn(T& a)

Reference collapsing occurs in four scenarios;-

Here are the possible combinations and the resulting type that is defined;-

Syntax                          Interpretation
using lref = int&               int&
using rref = int&&              int&&
using lref_to_lref = lref&      int & &, which collapses to int&
using lref_to_rref = rref&      int && &, which collapses to int&
using rref_to_rref = rref&&     int && &&, which collapses to int&&
using rref_to_lref = lref&&     int & &&, which collapses to int&
int && & a = b                  This and similar direct (non-alias) syntax is illegal
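
The collapsing rules can be confirmed with std::is_same from <type_traits>. This is a minimal sketch; bound_to_lvalue and demo_collapsing are hypothetical names used purely for illustration;-

```cpp
#include <cassert>
#include <type_traits>

using lref = int&;
using rref = int&&;

static_assert(std::is_same<lref&,  int&>::value,  "& and & collapse to &");
static_assert(std::is_same<rref&,  int&>::value,  "&& and & collapse to &");
static_assert(std::is_same<lref&&, int&>::value,  "& and && collapse to &");
static_assert(std::is_same<rref&&, int&&>::value, "&& and && collapse to &&");

// Hypothetical function template taking a forwarding reference
template<typename T>
constexpr bool bound_to_lvalue(T&&) noexcept
{
    return std::is_lvalue_reference<T&&>::value;
}

inline void demo_collapsing()
{
    int v = 0;
    assert(bound_to_lvalue(v));     // T is deduced as int&, so T&& collapses to int&
    assert(!bound_to_lvalue(42));   // T is deduced as int, so T&& is int&&
}
```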

Forwarding References

The syntax T&& can mean one of two things;-


Structures

A structure is a user-defined type. It is defined thus;-

struct structure-name
{
    type1 member1;
    type2 member2;
    // ...etc...
};

For example;-

struct S
{
    type1 member1;
    type2 member2;
    S* member3;
    // ...etc...
};

Fields (Bitfields)

A field element is defined like this;-

type field-name: number-of-bits;

Fields are most commonly used within a struct to allow several very small values to be packed, or to ease mapping onto an externally imposed layout such as a hardware interface. For example;-

struct cache
{
    unsigned int page_num: 12;
    bool valid: 1;
    int: 3;             // Nameless / unused
    bool dirty: 1;
    bool global: 1;
};

Unions

A union is a user-defined type. Specifically, it is a struct in which all elements are allocated at the same address. For example;-

union union-name
{
    type1 member1;
    type2 member2;
    // ...etc...
};

Encapsulated Unions

It is possible to improve slightly on the raw union by encapsulating it in a class with accessor functions to maintain state and force correct usage by providing 'set' functions, and 'read' functions that check that the correct (ie, the last one that was written) element is being read. Such an arrangement is referred to as a tagged or discriminated union
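
As a rough illustration of the idea, here is a minimal sketch of a tagged union restricted to two simple types. The class Number and all of its members are hypothetical names, and throwing std::logic_error on a mismatched read is just one possible policy;-

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical tagged union holding either an int or a double
class Number {
public:
    enum class Tag {integer, real};

    explicit Number(int v)    : tag_{Tag::integer} { u_.i = v; }
    explicit Number(double v) : tag_{Tag::real}    { u_.d = v; }

    // 'set' functions record which element was written last
    void set(int v)    { tag_ = Tag::integer; u_.i = v; }
    void set(double v) { tag_ = Tag::real;    u_.d = v; }

    Tag tag() const noexcept { return tag_; }

    // 'read' functions refuse to read an element that was not the last one written
    int as_int() const
    {
        if (tag_ != Tag::integer) throw std::logic_error("Number: not an int");
        return u_.i;
    }

    double as_double() const
    {
        if (tag_ != Tag::real) throw std::logic_error("Number: not a double");
        return u_.d;
    }

private:
    Tag tag_;
    union { int i; double d; } u_;
};
```

Because both elements are simple types, no special handling of construction, destruction or copying is needed here; that is not the case in general.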

Encapsulating unions does NOT fix all their problems;-

An alternative to using unions would be to use a set of derived classes. This also has the advantage of not imposing the cost of the union size (the size of the largest element) on the other (smaller) elements

In short, unless all the union elements are simple types (types without user-defined constructors/destructors, copy or move operations), they can cause a lot of trouble and are best avoided. Even if all the elements are 'simple', think twice; there's usually a better alternative

Enumerations

An enumeration is a user-defined set of named integer values (enumerators). There are two types of enum;-

enum animal {dog, cat, polar_bear};       // A "plain" enum
enum class vehicle {car, bike, boat};     // An enum class
enum struct vehicle {car, bike, boat};    // Exactly equivalent to enum class

Plain Enumerations

A "plain" or "unscoped" enumeration is a user-defined type, akin to a 'C'-type enum

"Plain" enums are termed unscoped because the names defined within them are in the enclosing scope rather than the scope of the enum definition

Example;-

enum warning {green, yellow, orange, red};
enum traffic_light {red, yellow, green};   // Error: 'red', 'yellow' and 'green' already defined in this scope

warning a = 7;                // Error: No int to warning conversion
int b = green;                // Ok: green in scope and converts to int
int c = warning::green;       // Ok: warning to int conversion supported
warning d = warning::green;   // Ok
traffic_light e = d;          // Error: No warning to traffic_light conversion

An anonymous enum may be created if all that is required is a set of constants, rather than an enumeration;-

enum {green, yellow, orange, red};

One use for "plain" enums is in declaring names for tuple elements; their ability to implicitly convert to an integral makes them less awkward to use than a similar "class" enum, though see this example
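
A minimal sketch of this use follows. The tuple alias person, the enumerators pf_name, pf_age and pf_score, and the function age_of are all hypothetical names chosen for illustration;-

```cpp
#include <cassert>
#include <string>
#include <tuple>

// Hypothetical record: name, age, score
using person = std::tuple<std::string, int, double>;

// Unscoped enumerators convert implicitly to the std::size_t that std::get expects
enum person_field {pf_name, pf_age, pf_score};

inline int age_of(const person& p)
{
    return std::get<pf_age>(p);   // Equivalent to std::get<1>(p)
}
```

The enumerator prefix (pf_) is one way of reducing the scope pollution that comes with a "plain" enum.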

Class Enumerations

A "class" or "scoped" enumeration is a user-defined type that is scoped and strongly typed. For example;-

enum class warning {green, yellow, orange, red};
enum class traffic_light {red, yellow, green};

warning a = 7;                // Error: No int to warning conversion
int b = green;                // Error: green not in scope
int c = warning::green;       // Error: No warning to int conversion
warning d = warning::green;   // Ok
traffic_light e = d;          // Error: No warning to traffic_light conversion

Sometimes, enumerator values are chosen to provide a bitmask. AND and OR functions can be created to safely manipulate these, such as the following. Note that explicit conversion is necessary because the enum class does not support implicit conversion;-

enum class bitmask {BIT0 = 0x01, BIT1 = 0x02, BIT2 = 0x04, BIT3 = 0x08};

constexpr bitmask operator&(bitmask a, bitmask b)
{
    return static_cast<bitmask>(static_cast<int>(a) & static_cast<int>(b));
}

constexpr bitmask operator|(bitmask a, bitmask b)
{
    return static_cast<bitmask>(static_cast<int>(a) | static_cast<int>(b));
}

void test_flags(const bitmask status)
{
    // An enum class does not convert to bool, so compare against the empty mask
    if ((status & (bitmask::BIT1 | bitmask::BIT3)) != bitmask{}) {…}
}

Because the '&' and '|' functions are constexpr, they can be used at compile time such as in a case label; case bitmask::BIT1 & bitmask::BIT3: {…}. Take care how the & and | functions are used though; they can (and do) return a bitmask value that is not actually one of the declared enumerators!

"Class" enums can be used to index a tuple;-

using colours = std::tuple<int, int, int>;   // hue, saturation, transparency
colours col;
// ...
enum class fields_t {hue, saturation, transparency};

// The following are equivalent
auto b = std::get<1>(col);
auto c = std::get<static_cast<std::size_t>(fields_t::saturation)>(col);

Adding a helper function template reduces the syntax a little. This is general enough to work with any tuple and any enumeration regardless of underlying type;-

// C++11 version
template<typename T>
constexpr typename std::underlying_type<T>::type to_integral(T enumerator) noexcept
{
    return static_cast<typename std::underlying_type<T>::type>(enumerator);
}

// C++14 version
template<typename T>
constexpr auto to_integral(T enumerator) noexcept
{
    return static_cast<std::underlying_type_t<T>>(enumerator);
}

Use the helper function;-

auto d = std::get<to_integral(fields_t::saturation)>(col);

Plain Old Data (POD)

Sometimes, especially at a low level or when implementing a container class, it is useful to be able to treat an object as 'plain old data' (POD); ie, just a bunch of structureless bytes. Doing so can massively increase copy performance for example as it avoids the need to call constructors for each of the object's members

A POD type can be safely passed between C++ and 'C' code

For an object to be successfully (ie, without breaking any C++ language guarantees) treated as a POD, it must be a scalar type, a class, struct or union that complies with the following constraints, or an array of such a type;-

A standard library type predicate is_pod<T> is defined in <type_traits> that returns whether a type is a POD or not. This is much more convenient than remembering the above rules and applying them correctly!


A related concept to a 'standard' type is a trivial type which is one that has a trivial default constructor and trivial copy and move operations

See also Regular Types

Manipulating POD Types

Example

Here is a generalised copy function that uses the standard library type predicate is_pod<T>::value;-

template<typename T>
void mycopy(T* to, const T* from, int count)
{
    if (is_pod<T>::value) {
        // Fast copy
        memcpy(to, from, count * sizeof(T));
    }
    else {
        // Slow copy using copy assignment for each element
        for (int x = 0; x < count; ++x) {
            to[x] = from[x];
        }
    }
}

...or here is a better technique that uses std::enable_if;-

// The enable_if is placed in the return type. Two templates that differ only in a
// default template argument would be treated as the same declaration (a redefinition error)

// Fast copy
template<typename T>
typename std::enable_if<is_pod<T>::value>::type mycopy(T* to, const T* from, int count)
{
    memcpy(to, from, count * sizeof(T));
}

// Slow copy using copy assignment for each element
template<typename T>
typename std::enable_if<!is_pod<T>::value>::type mycopy(T* to, const T* from, int count)
{
    for (int x = 0; x < count; ++x) {
        to[x] = from[x];
    }
}

Lists

Lists can be used for initialising named variables and may be used as expressions in many (but not all) cases. There are two forms;-

Form    Meaning
T {}    Qualified. Means "create object of type T and initialise it with T{}"
{}      Unqualified. Type must be determined from the context and must be unambiguous

A list is interpreted as follows;-

std::initializer_list<T>

The standard library std::initializer_list<T> type is used to construct variable length lists. It is mostly used for initialising user-defined containers. For example, the standard library vector has an initializer_list constructor, so this;-

vector<double> v = {1, 2, 3.14};

…is actually interpreted as this;-

const double temp[] = {double{1}, double{2}, 3.14};
const initializer_list<double> tmp(temp, sizeof(temp) / sizeof(double));
vector<double> v(tmp);

initializer_list can be used directly. One useful technique is passing a variable-size list of homogeneous values to a function, for example;-

void fn(initializer_list<int> values)
{
    if (!values.size()) {
        // Empty list passed-in
    }
    else {
        for (auto x : values) {…process…}
    }
}

// Call function with a list
fn({1, 3, 7, 9, 22, 83});

Statements

This is a complete list of statements;-

statement:Ref.
declaration
expression;
{statement-list}
try {statement-list} handler-list
case constant-expression : statement
default: statement
break;
continue;
return expression;
goto label;
label: statement
selection-statementsee below
iteration-statementsee below

selection-statement:
if (condition) statement
if (condition) statement else statement
switch (condition) statement

iteration-statement:
for (for-init-statement; condition; expression) statement
for (for-init-declaration : expression) statement
while (condition) statement
do statement while (expression);

statement-list:
statement statement-list

for-init-statement:
declaration
expression-statement (an expression terminated with a ; (semi-colon), courtesy of the for syntax)

for-init-declaration:
Declaration of a single, uninitialised variable

condition:
expression
type-specifier declarator = expression
type-specifier declarator {expression}

handler-list:
handler handler-listsee below

handler:
catch (exception-declaration) {statement-list}

Labels

A label specifies a point to which program flow may be directed. A label is only used in two contexts;-

A label is defined as label:. However, the format of label is quite different between the two uses; see the appropriate sections for details

Selection Statements

if Statement

An if statement is defined thus;-

if (condition) {
    statement(s)   // Execute this if condition is true
}

…or…

if (condition) {
    statement(s)   // Execute this if condition is true
}
else {
    statement(s)   // Execute this if condition is false
}

switch Statement

A switch statement selects from among a set of alternative values (indicated by the case labels). It is defined thus;-

switch (expression) {
    case value: {
        // Execute this if expression == value
    }
    break;

    case value2: {
        // Execute this if expression == value2
    }
    break;

    …etc…

    default: {
        // Execute this if expression != any of the above values
    }
    break;
}

Although not strictly always required by the syntax, if a declaration is made within a switch, always encapsulate the declaration within a block {}. If you don't, the name will pollute the scope of the switch and can lead to subtle errors such as;-

switch (x) {
    case 1:
        int a;         // Ok - Uninitialised
        int b = 42;    // This will fail (explicit initialisation)
        my_class c;    // This will fail (implicit initialisation)
        // Fall through...
    case 2:
        a += 1;        // Error: a is in scope but is not initialised
        b = 83;        // Ok, but original initialisation will fail anyway
        c.fn();        // Goodness knows!
}

It is not legal/possible to by-pass an initialisation. The compiler should detect that b and c have such initialisations and that if the switch expression x is 2 then that initialisation shall be by-passed. This is why the declaration of b and c will fail

It so happens that an int does not require initialisation and for this reason, the declaration of a will succeed. However, the compiler should catch the fact that if x is 2 then a shall be used without being initialised

There may be other subtle combinations of similar errors that may or may not be caught by the compiler. In contrast, if the declarations were within a block, then this would be the result, which is much more sensible;-

switch (x) {
    case 1: {
        int a;         // Ok - Uninitialised
        int b = 42;    // Ok - Explicit initialisation
        my_class c;    // Ok - Implicit initialisation
    }
    // Fall through...
    case 2: {
        a += 1;        // Error: a not defined
        b = 83;        // Error: b not defined
        c.fn();        // Error: c not defined
    }
}

In short, always encapsulate switch declarations within a block, therefore limiting their scope to a single case label

Always add a comment in-place of any absent break statements to show the intentional omission; eg, // fall through...

One instance when default should NOT be used is when the switch expression is an enumeration type and the intention is to provide a case label for each enumerator. In this case, including the default label would prevent the compiler from detecting if any of the enumerators had not been accounted for. If there is a need to test for an illegal enumerator value then this is possibly best done separately (prior to the switch) rather than using a default label
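
The following sketch illustrates this arrangement. The functions is_valid and to_text are hypothetical, and whether a missing enumerator is actually reported depends on the compiler and its warning settings (GCC and Clang, for example, provide -Wswitch for this purpose);-

```cpp
#include <cassert>
#include <string>

enum class warning {green, yellow, orange, red};

// Hypothetical validity test, performed separately from the switch
inline bool is_valid(warning w)
{
    const int v = static_cast<int>(w);
    return (v >= static_cast<int>(warning::green)) && (v <= static_cast<int>(warning::red));
}

inline std::string to_text(warning w)
{
    std::string text = "invalid";

    if (is_valid(w)) {
        switch (w) {   // No default label, so an unhandled enumerator can be diagnosed
            case warning::green:  { text = "green"; }  break;
            case warning::yellow: { text = "yellow"; } break;
            case warning::orange: { text = "orange"; } break;
            case warning::red:    { text = "red"; }    break;
        }
    }
    return text;
}
```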

Iteration Statements

for Statement

A for statement is defined thus;-

for (for-init-statement; condition; expression) {
    // Loop body. Executed repeatedly until 'condition' becomes false or explicitly terminated
    statement(s);
}

An alternative to a while statement is a for statement in the following form, which combines the condition of the for with its expression. Like the while statement, this still allows an indeterminate number of iterations, but it also has the advantage of not requiring a separate loop control variable; the element being operated on is used for this purpose, and that variable's scope is confined to the loop itself;-

for (int x; x = get_next_x();) {
    // Process x
}

Range-for Statement

A range-for statement is defined thus;-

for (for-init-declaration : expression) {
    statement(s)   // Loop body. Executed for each element that expression yields or until explicitly terminated
}

while Statement

A while statement is defined thus;-

while (condition) {
    statement(s)   // Loop body. Executed repeatedly until 'condition' becomes false or explicitly terminated
}

do Statement

A do statement is defined thus;-

do {
    statement(s)   // Loop body. Executed repeatedly until 'condition' becomes false or explicitly terminated
} while (condition);

Loop Iteration Control

The iteration of any loop statement, for, range-for, while or do, may be modified from within the body of the loop in several ways

The continue statement is used exclusively to control loop iteration. The break statement is used likewise and also within switch statements. They are defined like this;-

break;       // Terminate loop (or switch)
continue;    // Skip loop iteration
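
As a small illustration of the difference between the two, the sketch below sums the even values in a sequence and stops at the first negative value. The function name and the use of a negative value as a terminator are assumptions made purely for this example;-

```cpp
#include <cassert>
#include <vector>

// Sums the even values, stopping at the first negative value (a hypothetical terminator)
inline int sum_even_until_negative(const std::vector<int>& values)
{
    int total = 0;

    for (int v : values) {
        if (v < 0) break;            // Terminate the loop entirely
        if (v % 2 != 0) continue;    // Skip the remainder of this iteration only
        total += v;
    }
    return total;
}
```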

Control Flow Statements

goto Statement

A goto statement is defined thus;-

goto label;
// ...The 'goto' statement shall skip over this code...
label:
// The 'goto' statement shall jump to here

Don't use goto. It's hideous, it subverts the logical flow of the program, it's the source of countless bugs, it's NEVER necessary, EVER!

return Statement

A return statement is defined thus;-

return expression;

Observe the principle of "one point of entry, one point of exit"; a function has one point of entry; the top. It should also have one point of exit; the bottom. Using multiple return statements within a function is ugly, it subverts the logical flow of the program, and can be a cause of error

Expressions

Many things are expressions; assignment, function calls, object construction, and many others

When parsing an expression, the compiler first extracts lexical tokens from the expression string. It does this following a 'greedy' technique; that is, each token is extracted to make it as long as possible while still being syntactically legal. Tokens are composed of the following elements;-

Token ClassExamplesRef.
Identifiervector, banana, count
Keywordint, for, class
Character literal'a', '\n', U'\U1234abcd'
Integral literal1, 42, 0x83
Floating-point literal1.2, 1.2e-3, 1.2L
String literal"Hello", R"("World")"
Operator+, +=, >>
Punctuation;, {, }, (, ), [, ]
Preprocessor operation#, ##

Order of evaluation of sub-expressions within an expression is, in general, unspecified. Therefore;-

Conditional (ternary) Expressions

A conditional expression is a more direct alternative to an if statement;-

condition ? expression : expression

For example;-

int a = (b > c) ? b : c; // If b > c then b shall be assigned to a, otherwise c shall be assigned to a

Temporary Objects

When evaluating an expression, it may be necessary to create a temporary object. For example, with the expression (a + b) * c, the value (a + b) must be held somewhere before evaluating the rest of the expression

Unless it is to be bound to a reference or used to initialise a named object, any temporary object that comes into being shall be destroyed at the end of the whole expression (NOT just the sub-expression) in which it was created. For fundamental types, this process is largely irrelevant to the application. However, for more complex types, the lifetime of any temporary object may become an issue

ExampleDescription
const char* c = (s1 + s2).c_str();
// ...use c...
s1 and s2 are of type string. The string class includes the method c_str() that returns a pointer to the raw plain 'C' string used internally by it. The problem here is that (s1 + s2) creates a temporary object, and so c points to the plain 'C' string of that temporary …which will be destroyed at the end of the expression!
if (strlen(c = (s1 + s2).c_str()) < 42)
{
  // ...Use c...
}
The if statement will actually work as intended because the comparison is part of the same expression that creates the temporary and therefore the temporary will still exist. However, subsequent use of c is undefined because, like the previous example, the temporary will be destroyed at the end of the expression
const string& c = s1 + s2;
// ...Use c...
Ok. By binding the temporary to a (const) reference, its lifetime is extended to match that of the reference
const string c = s1 + s2;
// ...Use c...
Ok. Use the temporary to initialise a new object
fn(s1 + s2);
Ok. The temporary will exist until the function call ends
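
These lifetime rules can be observed with a type that counts its live instances. The following is only a sketch; the type tracked and the functions make and demo_lifetimes are hypothetical names;-

```cpp
#include <cassert>

// Hypothetical type that counts how many instances are currently alive
struct tracked {
    static int alive;
    tracked()               { ++alive; }
    tracked(const tracked&) { ++alive; }
    ~tracked()              { --alive; }
    int value() const { return 42; }
};
int tracked::alive = 0;

inline tracked make() { return tracked{}; }

inline void demo_lifetimes()
{
    int v = make().value();          // The temporary exists only within this full expression
    assert(v == 42);
    assert(tracked::alive == 0);     // ...and has already been destroyed here

    {
        const tracked& r = make();   // Binding to a const reference extends the lifetime
        assert(tracked::alive == 1);
        assert(r.value() == 42);
    }                                // ...until the reference goes out of scope
    assert(tracked::alive == 0);
}
```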

Constant Expressions

A constant expression is defined thus;-

// Standard (object) form
constexpr type name = expression;

// Functional form
constexpr type name(arguments) { return expression; }

Use constexpr whenever possible; it can provide safer (compile-time) initialisation, and may allow optimisations and use-cases not possible by other means
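
A minimal sketch of both forms follows. The names square, side, area, buffer and demo_constexpr are illustrative only;-

```cpp
#include <cassert>

// Functional form. A single return statement keeps this valid in C++11
constexpr int square(int x) { return x * x; }

// Object form. The initialiser must be evaluable at compile time
constexpr int side = 12;
constexpr int area = square(side);

static_assert(area == 144, "evaluated at compile time");

int buffer[square(4)];   // Usable wherever a constant expression is required

inline void demo_constexpr()
{
    int n = 5;                  // Not a constant expression
    assert(square(n) == 25);    // A constexpr function may also be called at run-time
}
```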

Implicit Type Conversion

Integral and floating-point types may be freely mixed in assignments and expressions, and implicit conversion between types is performed in such a way as to try and preserve information. Implicit type conversions that preserve value are called promotions

Sometimes, it is not possible to preserve value. In this case, a narrowing conversion is performed. For example;-

float a = 123456.789;
char b = a;    // Ok. This is legal but clearly will lose information
char c {a};    // Error. The {} initialiser syntax does not allow narrowing conversion

Basic conversion procedure and how narrowing is handled;-

Try to avoid narrowing conversions, but if they are unavoidable (or possible, in (say) a template function), consider the use of run-time checked conversion functions such as narrow_cast<>. This tests for loss of data by comparing the result after the conversion with the original value, at the obvious cost of additional overhead

Type Promotion

Implicit type conversions that preserve value are called promotions. Integral types are promoted before any arithmetic operation is performed. The main purpose is to convert numeric values to the 'natural' size of the underlying machine architecture (ie, int)

For an integral type, as long as the type is smaller than an int, promotion shall be performed. In contrast floating-point types are only converted if necessary (ie, if the types in an expression differ) by following the usual arithmetic conversion rules

Integral Promotion

For example, given the following;-

unsigned char a; unsigned char b; auto c = @ a; auto d = a @ b;

…the following types shall result for the respective arithmetic and bitwise logical operations;-

The implicit promotion of types smaller than int when performing arithmetic or bitwise operations can result in unexpected values being generated, especially when unsigned/signed conversion is implied or when negative signed values are being used. The following examples assume int is 32 bits

Example 1;-

uint8_t a = 0x80U;
uint8_t b = ~a >> 4;   // Error: 'b == 0xf7' and NOT '0x07'

The reason for the above result is that a is first promoted to an int, resulting in a value of 0x00000080. The ~ operation is then applied, resulting in the value 0xffffff7f. This is then right-shifted by 4, resulting in 0xfffffff7, and then implicitly cast back to a uint8_t resulting in a value of 0xf7

One way to correct this problem is to modify the expression as follows;-

uint8_t b = static_cast<uint8_t>(~a) >> 4;
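
Both the promoted type and the values from Example 1 can be checked directly. This is a sketch only; the alias u8 and the two function names are hypothetical. Note that right-shifting a negative int is implementation-defined before C++20, so the first result assumes the usual arithmetic (sign-propagating) shift;-

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>   // std::declval

using u8 = std::uint8_t;

// Operands narrower than int are promoted, so the results are of type int
static_assert(std::is_same<decltype(std::declval<u8>() + std::declval<u8>()), int>::value, "a + b is int");
static_assert(std::is_same<decltype(~std::declval<u8>()), int>::value, "~a is int");

// Complement and shift are both performed on the promoted int
inline u8 shift_after_promotion(u8 a)
{
    return static_cast<u8>(~a >> 4);
}

// The complement is truncated back to 8 bits before the shift
inline u8 shift_after_truncation(u8 a)
{
    return static_cast<u8>(static_cast<u8>(~a) >> 4);
}
```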

Example 2;-

int8_t a = -1;
uint16_t b = 0x8000U;
uint32_t c = 0x80000000U;
bool d = a < b;   // Ok: 'd == true'
bool e = a < c;   // Error: 'e == false'

The result d is correct. It's worth understanding why. First, a is promoted to an int maintaining its value of -1. b is also promoted to an int, resulting in a value of 0x00008000 (32768). -1 is less than 32768, hence the result true

The result e is not correct. Here, a is promoted to an int maintaining its value of -1. c is already of a type whose size is ≥ int and so is not promoted, retaining its value of 0x80000000. As a result, a is now converted to the same type as c, resulting in the unsigned value 0xffffffff. This is not smaller than 0x80000000, hence the result false

Usual Arithmetic Conversion

The result of an arithmetic expression between two operands is determined by the "usual arithmetic conversion" rules; the general aim being to produce a result that is as large as the largest operand type. The rules are;-


Explicit Type Conversion (Casting)

Conversion from one type to another may be performed explicitly, using several syntax forms and operators (in vague order of 'niceness' and safety);-

FormDescription
{}Construction. This only allows safe conversions
const_cast

Cast-away const and volatile qualifiers. This is only safe if the original object was defined as non-const and/or non-volatile (and has since acquired these attributes)

The type being cast from must be a pointer, reference, or pointer-to-data-member

The type being cast to must be the same as that being cast from (except for any const or volatile qualifiers). For example;-

const int* a{}; int* b = const_cast<int*>(a);

const_cast imposes no run-time overhead

static_cast

Converts between related types such as one pointer type to another in the same class hierarchy, an integral type to an enumeration, or a floating-point type to an integral type. It also does conversions defined by constructors (§16.2.6, §18.3.3, §iso.5.2.9) and conversion operators (§18.4). For example;-

void* my_allocator(size_t sz); int* p = static_cast<int*>(my_allocator(100));

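For a static_cast to work, the types must be related; an implicit conversion must exist between them in at least one direction (as with int* and void* in the example above), or a suitable constructor or conversion operator must be available. If not, then a reinterpret_cast is required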

static_cast<> may be used to add const or volatile to a type, but not remove them, eg, assuming X a{};, const X b = static_cast<const X>(a);

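static_cast performs no run-time check. It can downcast within a class hierarchy, but the result is only valid if the object really is of the target type, and it cannot cast from a virtual base or perform a crosscast. Prefer dynamic_cast on a polymorphic class hierarchy when the actual type cannot be guaranteed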

static_cast imposes a (typically) small run-time overhead

reinterpret_cast

Changes the meaning of a bit pattern (does not change the actual data). Handles conversion between unrelated types such as a pointer to an integer. For example;-

char x[] = "1234";
int* y = static_cast<int*>(x);        // Error: No implicit char* to int* conversion
int* y = reinterpret_cast<int*>(x);   // Ok: Hope you know what you're doing!

reinterpret_cast imposes no run-time overhead

dynamic_castDynamically checked (at run-time) conversion of pointers and references within a class hierarchy. See also dynamic_cast
(type)value'C'-style cast. This uses a combination of const_cast, static_cast and reinterpret_cast to perform whatever cast is specified. As a result it is very dangerous; virtually any cast can be performed and with virtually no safeguards
type(value)Function-style cast. Note that for a built-in or "plain" enumeration type T, T(e) is interpreted in the same way as (T)e (ie, a 'C'-style cast), with all the dangers that brings with it

Think twice before using any cast. Casts are almost always avoidable, and if they are not, confine them to small, well-defined areas; consider providing a function specially for the purpose to isolate the operation and avoid the need to scatter cast operations throughout the application code

Casts, whether explicit or implicit, are usually performed at run-time and often (but not always) involve a call to a non-default constructor for the type being cast-to. That constructor will create a new object. This can lead to all sorts of problems;-

class Y { public: void fn(); };
class X : public Y { /* ... */ };

X a{};
static_cast<Y>(a).fn();   // Error: This line will compile and run, but it's wrong

The static_cast will invoke the copy constructor for Y and return a newly constructed object. Therefore, the call to fn() will be invoked on a copy of 'a' and not on 'a' itself. When fn() returns, the Y object shall be destroyed. Why you would want to do this at all is another issue entirely!

Templates provide a means of avoiding casts altogether

reinterpret_cast is often non-portable, usually because of differences in type sizes

'C'-style casts and function-style casts are both rather dangerous and best avoided. There are no scenarios where they MUST be used

Here is an alternative explicit conversion function that handles possible narrowing (loss of data) of scalar types;-

template<class Target, class Source>
Target narrow_cast(Source v)
{
    auto r = static_cast<Target>(v);   // convert the value to the target type
    if (static_cast<Source>(r) != v)
        throw runtime_error("narrow_cast<>() failed");
    return r;
}

// Example use
auto c1 = narrow_cast<char>(42);    // Ok
auto c2 = narrow_cast<char>(342);   // Will throw an exception if char ≤ 8 bits

Note that this is more likely to throw with floating-point types because of rounding errors. In that case, a range test rather than a hard != is probably a better test. This can be achieved with operator overloading or traits. The standard library round() function is also available

dynamic_cast

The dynamic_cast operator provides a dynamically checked (at run-time) conversion of pointers and references within a class hierarchy. It is useful when it is not possible to determine the correct cast at compile-time. For example, this;-

struct Z { virtual ~Z() {} };   // dynamic_cast requires a polymorphic type
struct Y { virtual ~Y() {} };
struct X : Y, Z {};

…gives the following hierarchy;-

[Diagram: X derives from both Y and Z; Y and Z are otherwise unrelated]

Given a pointer pz that refers to the base Z, we can derive a pointer to the base Y;-

X a;
Z* pz = &a;
Y* py = dynamic_cast<Y*>(pz);
// ...or even...
auto py = dynamic_cast<Y*>(pz);

The above example shows a very simple hierarchy but it may be arbitrarily complex
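
A runnable sketch of the crosscast is shown below, including what happens when the conversion is not possible. The class W and the function demo_crosscast are hypothetical additions; the bases are given virtual destructors because dynamic_cast requires a polymorphic source type;-

```cpp
#include <cassert>
#include <typeinfo>   // std::bad_cast

struct Z { virtual ~Z() {} };
struct Y { virtual ~Y() {} };
struct X : Y, Z {};
struct W : Z {};      // Hypothetical class that has no Y base

inline void demo_crosscast()
{
    X a;
    Z* pz = &a;
    Y* py = dynamic_cast<Y*>(pz);              // Succeeds; the object really is an X
    assert(py == static_cast<Y*>(&a));

    W b;
    Z* pw = &b;
    assert(dynamic_cast<Y*>(pw) == nullptr);   // Pointer form yields nullptr on failure

    bool threw = false;
    try {
        (void)dynamic_cast<Y&>(*pw);           // Reference form throws on failure
    }
    catch (const std::bad_cast&) {
        threw = true;
    }
    assert(threw);
}
```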


Converting Pointers


Converting References


General

Avoid getting into a situation where you need to downcast or crosscast a non-polymorphic type. There is no guaranteed type-safe way of doing this

Operators

This is a complete list of operators, with each block containing operators of the same precedence. The following definitions are used;-

OperationSyntaxOverload Impl. Type Ref.
Parenthesized expression( expr )-
Lambda[ capture-list ] lambda-declarator
  { statement-list }
-
Scope resolutionclass-name :: member-
Scope resolutionnamespace-name :: member-
Global:: name-
Member selectionobject . member-
Member selectionpointer -> memberY* X::operator->()
const Y* X::operator->() const
Subscriptingpointer [ expr ]Y& X::operator[](index)
const Y& X::operator[](index) const
Function callexpr ( expr-list )type X::operator()(expr-list)
Value constructiontype { expr-list }-
Function-style type conversiontype ( expr-list )X::operator type() const
Post incrementlvalue++X X::operator++(int)
X operator++(X& lhs, int)
Post decrementlvalue--X X::operator--(int)
X operator--(X& lhs, int)
Type identificationtypeid( type )-
Run-time type identificationtypeid( expr )-
Run-time checked conversiondynamic_cast < type > ( expr )-
Compile-time checked conversionstatic_cast < type > ( expr )-
Unchecked conversionreinterpret_cast < type > ( expr )-
const conversionconst_cast < type > ( expr )-
Size of objectsizeof expr-
Size of typesizeof ( type )-
Size of parameter packsizeof...( name )-
Alignment of typealignof ( type )-
Pre increment++lvaluePrefix Unary
Pre decrement--lvaluePrefix Unary
Bitwise complement~expr
compl expr
Postfix Unary
Logical Not!expr
not expr
Postfix Unary
Unary minus- exprPostfix Unary
Unary plus+ exprPostfix Unary
Address of&lvalueY* X::operator&()
const Y* X::operator&() const
Y* operator&(X& lhs)
const Y* operator&(const X& lhs)
Dereference*exprY& X::operator*()
const Y& X::operator*() const
Y& operator*(X& lhs)
const Y& operator*(const X& lhs)
Create (allocate)new typestatic void* X::operator new(size_t sz)
void* operator new(size_t sz)
static void* X::operator new[](size_t sz)
void* operator new[](size_t sz)
Create (allocate and initialise)new type ( expr-list )
new type { expr-list }
Create (place)new ( expr-list ) typestatic void* X::operator new(size_t sz, expr-list)
void* operator new(size_t sz, expr-list)
Create (place and initialise)new ( expr-list ) type ( expr-list )
new ( expr-list ) type { expr-list }
Destroy (de-allocate)delete pointervoid X::operator delete(void* p)
void operator delete(void* p)
Destroy arraydelete[] pointervoid X::operator delete[](void* p, size_t sz)
void operator delete[](void* p, size_t sz)
Can expression throw?noexcept ( expr )-
Cast (type conversion)( type ) expr-
Member selectionobject .* pointer-to-member-
Member selectionpointer ->* pointer-to-memberY* X::operator->*(Y m)
const Y* X::operator->*(Y m) const
Y* operator->*(X& lhs, Y m)
const Y* operator->*(const X& lhs, Y m)
Multiplicationexpr * exprBinary (Arithmetic)
Divisionexpr / exprBinary (Arithmetic)
Moduloexpr % exprBinary (Arithmetic)
Additionexpr + exprBinary (Arithmetic)
Subtractionexpr - exprBinary (Arithmetic)
Shift left 'expr' number of bitsexpr << exprBinary (Bitwise)
Shift right 'expr' number of bitsexpr >> exprBinary (Bitwise)
Less thanexpr < exprBinary (Logical)
Less than or equalexpr <= exprBinary (Logical)
Greater thanexpr > exprBinary (Logical)
Greater than or equalexpr >= exprBinary (Logical)
Equalexpr == exprBinary (Logical)
Not equalexpr != expr
expr not_eq expr
Binary (Logical)
Bitwise ANDexpr & expr
expr bitand expr
Binary (Bitwise)
Bitwise XORexpr ^ expr
expr xor expr
Binary (Bitwise)
Bitwise ORexpr | expr
expr bitor expr
Binary (Bitwise)
Logical ANDexpr && expr
expr and expr
Binary (Logical)
Logical ORexpr || expr
expr or expr
Binary (Logical)
Conditional expressionexpr ? expr : expr-
List{ expr-list }-
Throw exceptionthrow expr-
Assignlvalue = exprBinary Assignment
Multiply and assignlvalue *= exprBinary Assignment
Divide and assignlvalue /= exprBinary Assignment
Modulo and assignlvalue %= exprBinary Assignment
Add and assignlvalue += exprBinary Assignment
Subtract and assignlvalue -= exprBinary Assignment
Shift left and assignlvalue <<= exprBinary Assignment
Shift right and assignlvalue >>= exprBinary Assignment
Bitwise AND and assignlvalue &= expr
lvalue and_eq expr
Binary Assignment
Bitwise OR and assignlvalue |= expr
lvalue or_eq expr
Binary Assignment
Bitwise XOR and assignlvalue ^= expr
lvalue xor_eq expr
Binary Assignment
Sequencingexpr, exprZ X::operator,(Y& rhs)
Z operator,(X& lhs, Y& rhs)

In addition to the above, a user-defined type may also define literal operators

The Overload Impl. Type column indicates the prototype/signature of the operator function when implemented for a user-defined type. For further details, see Overloading Operators

Some precedence examples;-

Example                   Description
a + b * c                 Means a + (b * c) because * has a higher precedence than +
a = b = c                 Means a = (b = c)
a + b + c                 Means (a + b) + c
if (x & mask == 0) {…}    Means x & (mask == 0). Take care!
if (0 <= x <= 42) {…}     Means (0 <= x) <= 42. This is interpreted as follows; 0 <= x yields a bool of true or false. This is implicitly converted to an int yielding 0 or 1. This is then compared with 42 which will always yield true
a+++ 1                    Means (a++) + 1
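As a sketch of how the two 'Take care!' cases might be written so that they mean what was probably intended, compare each correct form with an explicitly parenthesized version of what the compiler actually parses. All four function names are hypothetical and exist only for this illustration;-

```cpp
// What was probably intended by 'x & mask == 0'
constexpr bool low_bits_clear(unsigned int x, unsigned int mask)
{
    return (x & mask) == 0;
}

// What 'x & mask == 0' actually means; x & (mask == 0)
constexpr bool low_bits_clear_as_parsed(unsigned int x, unsigned int mask)
{
    return (x & static_cast<unsigned int>(mask == 0)) != 0;
}

// What was probably intended by '0 <= x <= 42'
constexpr bool in_range(int x)
{
    return 0 <= x && x <= 42;
}

// What '0 <= x <= 42' actually means; (0 <= x) <= 42, which is always true
constexpr bool in_range_as_parsed(int x)
{
    return static_cast<int>(0 <= x) <= 42;
}
```

With x == 0x8 and mask == 0x7, low_bits_clear() yields true but the as-parsed form yields false. Similarly, in_range(100) yields false but the as-parsed form yields true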

Explicit use of bitwise operators such as &, |, ^ etc can sometimes be avoided by using a bitfield

The bitwise operators &, |, ^, etc can be used for logical 'set' manipulation. However, consider the higher-level standard library types set and bitset instead

Overloading Operators

Operator overloading allows conventional notation to be used to manipulate an object of a user-defined type. For example, given two objects of a user-defined class; a and b, it may be useful to check for equality a == b, or to be able to add a + b (whatever 'add' means in the context of the type)

The Overload Impl. Type column in the table of operators indicates the prototype/signature of the operator function when implemented for a user-defined type. Most operators follow the same few function signature/prototype patterns and these are indicated below. For any operators that deviate from the standard patterns, their specific function signature/prototype is indicated specifically in the table and are described in more detail in the following sections

The meaning of Overload Impl. Type is as follows

Note: The arguments and return types are flexible for most operators; in all cases, lhs and rhs can be any type, and are commonly another X type. In the case of a binary operator, rtn-type is often another X but may be a different type. For example, a != operator would probably return a bool;-

Overload Impl. Type       Description
-                         Operator may not be overloaded in user-defined type
Prefix Unary

Prefix unary operator. This may be defined as either a non-static member function taking no arguments or as a non-member function taking one argument. The operator acts directly on and modifies the supplied object. An operator @ is defined with any of the following;-

X& X::operator@()
X& operator@(X& lhs)

Generally, a prefix unary operator member function should return a reference to *this, and a non-member function should return a reference to lhs

Postfix Unary

Postfix unary operator. Note that, despite the category name, the operators in this category (~, !, unary - and unary +) are written before their operand. This may be defined as either a non-static member function taking no arguments or as a non-member function taking one argument. The operator does not modify the supplied object but (typically) creates a copy of it and operates on (and returns) the copy. An operator @ is defined with any of the following;-

rtn-type X::operator@() const
rtn-type operator@(const X& lhs)
Binary

Binary operator. This may be defined as either a non-static member function taking one argument or as a non-member function taking two arguments. An operator @ is defined with any of the following;-

rtn-type X::operator@(const rhs) const
rtn-type operator@(X lhs, const rhs)
rtn-type operator@(const lhs, X rhs)

An Arithmetic Binary operator member function often returns a modified copy of *this, and a non-member function often returns a modified copy of the X argument, though in both cases it may be appropriate to return some other type

A conventional Bitwise Binary operation on an integral type returns a similar integral type. For a user-defined type, the return type often follows that of an Arithmetic Binary operator

A Logical Binary operation usually returns a bool value

Binary Assignment

Binary Assignment operator. This may be defined as either a non-static member function taking one argument or as a non-member function taking two arguments. The exception is plain assignment (=), which may only be defined as a non-static member function. An operator @ is defined with any of the following;-

X& X::operator@(const rhs)
X& operator@(X& lhs, const rhs)

Generally, unless there is a good reason not to, a binary assignment operator member function should return a reference to *this, and a non-member function should return a reference to lhs. This allows chaining; a = b = c. An assignment operator that behaves differently will not operate in a 'normal' way and may cause problems within expressions

Following is an example of one of each of the four main operator formats; operator '++' (prefix Unary), operator '!' (postfix Unary), operator '+' (Binary) and operator '=' (Binary Assignment);-

class X {
public:
    X& operator++();                // Prefix unary
    Y operator!() const;            // Postfix unary
    X operator+(const rhs) const;   // Binary
    X& operator=(const rhs);        // Binary Assignment
};

…or, as non-member functions (operator '=' must be a member function, so operator '+=' is shown in its place);-

X& operator++(X& lhs);                  // Prefix unary
Y operator!(const X& lhs);              // Postfix unary
X operator+(const X& lhs, const rhs);   // Binary
X operator+(const lhs, const X& rhs);
X& operator+=(X& lhs, const rhs);       // Binary Assignment

Wherever possible, maintain the normal meaning of operators. For example, + should not (say) perform a square-root operation or something completely unrelated

One obvious exception to this is the way the standard I/O library overloads the operators << and >> to do something completely different. Even so, these still 'sort-of' keep the traditional meaning in an abstract way

Also, maintain the relationships between operators. For example, a = a + 1 conventionally means the same as a += 1 or a = a - -1 or a -= -1, but these are all different operators; = (assignment), + (addition), - (subtraction), += (add and assign), -= (subtract and assign) and -expr (unary minus)

It is usually a good idea to define operators so they are commutative; ie, a + b should give the same result as b + a even if a and b are different types

Operators should be carefully considered and defined as a whole to avoid any discrepancies between them
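One common way of keeping a family of operators consistent is to implement the binary forms in terms of the assignment forms, and != in terms of ==. The following is a minimal sketch of that approach; the type Metres is hypothetical and is not taken from elsewhere in this document;-

```cpp
// Hypothetical value type used only to illustrate a consistent operator set
struct Metres {
    int value;

    // The assignment forms do the real work and return *this
    Metres& operator+=(const Metres& rhs) { value += rhs.value; return *this; }
    Metres& operator-=(const Metres& rhs) { value -= rhs.value; return *this; }

    // Unary minus returns a modified copy
    Metres operator-() const { return Metres{-value}; }
};

// The binary forms are defined in terms of the assignment forms (lhs is taken by value)
inline Metres operator+(Metres lhs, const Metres& rhs) { return lhs += rhs; }
inline Metres operator-(Metres lhs, const Metres& rhs) { return lhs -= rhs; }

// != is defined in terms of == so that the two can never disagree
inline bool operator==(const Metres& a, const Metres& b) { return a.value == b.value; }
inline bool operator!=(const Metres& a, const Metres& b) { return !(a == b); }
```

With these definitions, a + one, one + a and a - -one all compare equal, and the assignment forms give the same results as their binary counterparts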

Don't overload the operators &&, || or , (comma); doing so will result in trouble at some point

Notwithstanding the previous warning about not overloading , (comma), here is an example of doing exactly that. This method allows the subscripting [] operator to (at least give the impression of) taking multiple indices;-

enum city {london, paris, tokyo, new_york, …};

pair<city, city> operator,(city from, city to)
{
    return make_pair(from, to);
}

map< pair<city, city>, unsigned int> distance_km;
distance_km[london, tokyo] = 9554;

Special Operators

There are a number of operators that are considered "special" in that they do not follow the normal pattern of arguments and/or return values or their use is not standard. These are;-

All the above are described in the following sections

The following operators are also "special" but are described elsewhere;-

Subscripting []

The subscripting operator [] allows a type (or more specifically, the member(s) of a type) to be indexed like an array. The argument is the index and may be of any type, thus allowing associative arrays to be constructed. For example;-

class X {
    int m[10];
public:
    const int& operator[](int index) const { return m[index]; }
    int& operator[](int index) { return m[index]; }
};

// Use the '[]' operators
X a;
int b = a[3];
a[4] = 42;
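As a sketch of the associative array case mentioned above, the index need not be an integer. The class WordCount below is hypothetical; it simply forwards to a std::map and, in the const version, treats an unknown word as having a count of zero;-

```cpp
#include <map>
#include <string>

// Hypothetical class; indexed by a string rather than an integer
class WordCount {
    std::map<std::string, int> counts_;
public:
    // Non-const access creates the entry on demand (as std::map does)
    int& operator[](const std::string& word) { return counts_[word]; }

    // Const access never modifies the map; unknown words count as zero
    int operator[](const std::string& word) const
    {
        auto it = counts_.find(word);
        return it == counts_.end() ? 0 : it->second;
    }
};
```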

Function Call ()

The function call operator () (also known as the Application Operator) allows a type to be directly used as a function name. That is, an object of the type can act like a function; a Function Object. For example;-

class X {
    int m;
public:
    bool operator()();                  // Takes no arguments
    int operator()(int y);              // Takes one argument of type int
    void operator()(int y, bool z);     // Takes two arguments
};

// Use the '()' operators
X a;
bool b = a();
int c = a(42);
a(83, true);
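Function objects are most often seen as arguments to algorithms, where they carry some state along with the call. The following sketch assumes a hypothetical InRange type and count_in_range() helper; std::count_if is the only standard library facility used;-

```cpp
#include <algorithm>
#include <vector>

// Hypothetical function object; remembers a range and tests values against it
class InRange {
    int lo_;
    int hi_;
public:
    InRange(int lo, int hi) : lo_{lo}, hi_{hi} {}

    // Called as if the object were a function
    bool operator()(int v) const { return lo_ <= v && v <= hi_; }
};

// Count how many values lie within [lo, hi]
inline int count_in_range(const std::vector<int>& values, int lo, int hi)
{
    return static_cast<int>(std::count_if(values.begin(), values.end(), InRange{lo, hi}));
}
```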

Dereferencing ->

The dereferencing operator -> supports the important concept of indirection. In practice, this usually means returning a pointer to some member object or the internal pointer to some external resource. For example;-

class Y {
public:
    int m1{};
    int m2{};
};

class X {
    Y m;
public:
    Y* operator->() { return &m; }
};

// Use the '->' operator
X a;
int b = a->m1; // Access a member of the returned Y object

Note that the variable a is not a pointer, which is a deviation from the standard usage of ->

Increment/Decrement ++/--

The increment/decrement operators are specified with ++ and -- respectively. They are unique in that they may be used in both prefix and postfix operations

class X {
    int m;
public:
    // Constructor
    X(int a) : m{a} {}

    // Pre-increment/decrement
    X& operator++() { ++m; return *this; }
    X& operator--() { --m; return *this; }

    // Post-increment/decrement
    X operator++(int) { X a{m}; ++m; return a; }
    X operator--(int) { X a{m}; --m; return a; }
};

// Use the '++' and '--' operators
X a{0};
++a;
--a;
a++;
a--;

Consider not providing postfix increment/decrement for a type. They are not as efficient as the prefix operators and are less frequently used anyway

Pointers To Members .*/->*

It is, of course, possible to take the address of a class member in the normal way;-

class X {
public:
    int m;
};

X a;
int* b = &a.m; // b is a pointer to member m of object a

However the above example is NOT what a pointer-to-member is

A pointer-to-member is best thought of as a named offset into a class (type) rather than as a normal pointer with a specific absolute address. The operators .* and ->* are used in this context to dereference such a pointer

Despite being thought of as a named offset, a pointer-to-member is most useful when the user of the pointer does not (cannot) know the exact member being referred to

Here is an example that uses a base class with a number of pure virtual functions defined (possibly the most common structure for such cases);-

class Y {
public:
    int m1{};
    const char* m2{};
    virtual bool fn1(int a) = 0;
    virtual bool fn2(int a) = 0;
    virtual bool fn3(int a) = 0;
};

class X : public Y {
public:
    bool fn1(int a) override { /* ... */ }
    bool fn2(int a) override { /* ... */ }
    bool fn3(int a) override { /* ... */ }
};

// Create an object of type X and also take its address
X a;
X* ap = &a;

Using the above definitions, this is how we might use pointers-to-data-members;-

// Create aliases for pointer-to-data-member types (an alias makes subsequent use a bit cleaner)
using pint = int Y::*;
using pcharp = const char* Y::*;

// Create two pointer-to-members and initialise them to point to the two base data members
pint m_int = &Y::m1;
pcharp m_charp = &Y::m2;

// Use the pointer-to-members
a.*m_int = 81;            // a.m1 == 81
ap->*m_charp = "Hello";   // a.m2 == "Hello"

…and this is how we might use pointers-to-function-members;-

// Create alias for pointer-to-function-member type
using pfn = bool (Y::*)(int);

// Create a pointer-to-member and initialise it to point to one of the member functions
pfn fn = &Y::fn2;

// Use the pointer-to-member
bool r1 = (a.*fn)(42);    // This will call X::fn2() for object a
bool r2 = (ap->*fn)(42);  // This will also call X::fn2() for object a

The above function pointer examples are very simple. When they get more complex, it is easy to get the syntax wrong. The standard function std::invoke (C++17, declared in the standard header <functional>) may be used as an alternative to the above syntax. Expanding on the previous example;-

bool r3 = std::invoke(fn, a, 42); // This will call X::fn2() for object a
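The following self-contained sketch shows the situation where the code making the call does not know which member it has been given. The Motor type and the run() and read_field() helpers are hypothetical, and std::invoke requires a C++17 compiler;-

```cpp
#include <functional>   // std::invoke (C++17)

// Hypothetical type with two member functions of the same signature
struct Motor {
    int position = 0;
    bool forward(int steps) { position += steps; return true; }
    bool reverse(int steps) { position -= steps; return position >= 0; }
};

// Alias for a pointer-to-function-member of Motor
using Command = bool (Motor::*)(int);

// run() has no idea which member function 'cmd' refers to
inline bool run(Motor& m, Command cmd, int steps)
{
    return std::invoke(cmd, m, steps);      // Same as (m.*cmd)(steps)
}

// std::invoke also works with a pointer-to-data-member
inline int read_field(const Motor& m, int Motor::* field)
{
    return std::invoke(field, m);           // Same as m.*field
}
```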

Conversion Operators

A user-defined type may define conversion operators with the notation operator T(). Such operators convert from the user-defined type to the type T

Here is an example of a class X implementing an assignment operator that takes an int and a complementary conversion operator to convert an object of type X to an int;-

class X {
    int m{};
public:
    // Assignment from an int
    X& operator=(int val) noexcept { m = val; return *this; }

    // Convert an X into an int
    operator int() const noexcept { return m; }
};

// Use the assignment and the conversion operators
X a;
a = 42;
int b = a; // Implicit conversion; b == 42
cout << "a = " << a << ", b = " << b << endl; // Implicit conversion for a; Outputs 'a = 42, b = 42'
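Implicit conversions can sometimes apply where they are not wanted. Since C++11 a conversion operator may be declared explicit, in which case it is only applied by a cast or in a context that requires a bool (such as an if condition). The following is a sketch of this; the Handle type and describe() function are hypothetical;-

```cpp
// Hypothetical type wrapping an integer descriptor; negative means 'invalid'
class Handle {
    int fd_;
public:
    explicit Handle(int fd = -1) : fd_{fd} {}

    // Usable in conditions, but will not silently convert to an arithmetic value
    explicit operator bool() const noexcept { return fd_ >= 0; }

    // Requires an explicit cast at the point of use
    explicit operator int() const noexcept { return fd_; }
};

// Returns the descriptor value, or -1 if the handle is invalid
inline int describe(const Handle& h)
{
    if (h) {                            // Contextual conversion to bool
        return static_cast<int>(h);     // Explicit conversion to int
    }
    return -1;
}
```

With these definitions, a statement such as int n = h; does not compile, which is usually the desired outcome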

Literal Operators

Literal operators allow new literal notations to be defined for built-in and user-defined types so that a user-defined suffix applied to a "naked" literal value such as 123 or "Hello" or 1.23 can be interpreted specially

A literal operator function is defined with the notation operator"" _U, where U is the "unit" suffix. This does not fit into the normal pattern of operators as there isn't (by default) a "" operator

Unlike most other operators, a literal operator is very restricted in the types of arguments it may take. It must take one of the following forms;-

constexpr rtn-type operator"" _U(unsigned long long n);            // Unsigned integer
constexpr rtn-type operator"" _U(long double n);                   // Floating point
constexpr rtn-type operator"" _U(const char* p);                   // 'C' string representation of integer or floating point

template<char... chars>                                            // Template literal representation of integer or floating point
constexpr rtn-type operator"" _U();

constexpr rtn-type operator"" _U(const char-type* p, size_t sz);   // 'C' string literal
constexpr rtn-type operator"" _U(char-type c);                     // Character

For an operator "unit" of blob, here are example uses;-

123_blob             // Unsigned integer - calls operator"" _blob(123)

// A very long unsigned integer - calls operator"" _blob("1234567890123456789012345678901234567890")
1234567890123456789012345678901234567890_blob

123.45_blob          // Floating point - calls operator"" _blob(123.45)
"Hello\n"_blob       // String - calls operator"" _blob("Hello\n", 6)
R"(Hello\n)"_blob    // (Raw) string - calls operator"" _blob("Hello\\n", 7)
'A'_blob             // Character - calls operator"" _blob('A')

Here is a more complete example;-

class X {
    double time;
public:
    // Constructor
    constexpr X(double t = 0) : time{t} {}
};

// Literal operators to give an X with a s, ms, or us time scaling
constexpr X operator"" _s(unsigned long long t) { return X{static_cast<double>(t)}; }
constexpr X operator"" _ms(unsigned long long t) { return X{static_cast<double>(t) / 1000}; }
constexpr X operator"" _us(unsigned long long t) { return X{static_cast<double>(t) / 1000000}; }

// Use the literal operators
X a{123_s};     // a.time == 123.0
X b = 123_ms;   // b.time == 0.123
X c;
c = 123_us;     // c.time == 0.000123

A template literal operator is one that takes its arguments as a variadic template parameter pack rather than as function arguments. For example;-

// Template literal operator
template<char... chars>
constexpr int operator"" _blip();

// Using the template literal operator, the following would result in the call
// operator"" _blip<'1', '2', '3'>()
123_blip

The implementation of a template literal operator typically needs to step through each character in-turn and construct a value. In order to do this, some helper functions are useful. The following example implementation of operator"" _bcd converts a string of digits into a BCD number;-

// Check that string represents a decimal value (the digits are template arguments so that
// static_assert can test them; a function parameter is not a constant expression)
template<char first, char... rest>
constexpr unsigned int bcd_is_decimal()
{
    static_assert(first != '0', "BCD literal is not decimal");
    return 0; // Return a dummy int to fulfil constexpr requirements
}

// Helper function; expect only one digit and extract it
template<typename T>
constexpr unsigned int bcd_helper(T c)
{
    return static_cast<unsigned int>(c - '0');
}

// Helper function; extract one digit and then process remainder
template<typename T, typename... U>
constexpr unsigned int bcd_helper(T c, U... tail)
{
    return ((static_cast<unsigned int>(c - '0') << (4 * sizeof...(tail))) | bcd_helper(tail...));
}

// Operator function
template<char... chars>
constexpr unsigned int operator"" _bcd()
{
    return (bcd_is_decimal<chars...>(), bcd_helper(chars...));
}

// Use the operator function; this will set 'a' to 74565 decimal (0x12345, ie, 12345 in BCD)
constexpr unsigned int a = 12345_bcd;

Note that this example checks that the first digit of the literal is not zero; if it is then that indicates a non-decimal (ie, binary 0b, octal 0n, or hex 0x) value (which makes no sense in this context)
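For comparison, the 'C' string form of literal operator (the one taking a single const char*) receives the same characters as a nul-terminated string and can simply loop over them. The following is a sketch only; the suffix _bits is hypothetical, and the loop within a constexpr function assumes C++14 or later;-

```cpp
#include <stdexcept>

// Interpret the digits of the literal as a binary number
constexpr unsigned int operator""_bits(const char* digits)
{
    unsigned int value = 0;
    for (const char* p = digits; *p != '\0'; ++p) {
        if (*p != '0' && *p != '1') {
            // Reaching this in a constant expression is a compile-time error
            throw std::invalid_argument("_bits literal must contain only 0 and 1");
        }
        value = (value << 1) | static_cast<unsigned int>(*p - '0');
    }
    return value;
}

// Evaluated entirely at compile-time
static_assert(1011_bits == 11u, "1011 binary is 11 decimal");
```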

Memory Management

Free Store (Heap)

The operators new and delete allocate and de-allocate objects on the free store (heap) memory. They also have counterparts, new[] and delete[] for allocating/de-allocating arrays though the existence of the former is not obvious from the syntax. These operators are used as follows;-

T* var = new T{initialiser};
T* var {new T{initialiser}};
delete var;

T* array_var = new T[n]{initialiser};
T* array_var {new T[n]{initialiser}};
delete[] array_var;

Using new and delete expressions in their raw form requires care; memory leaks, or incorrect application of delete on an allocated object can cause serious problems

Performing a new but not a subsequent delete will leak memory

Performing a delete and then later using the object at the deleted address, or performing a delete on an object not allocated by new, on an object that has already been deleted or on an invalid address are all undefined behaviour and usually disastrous

set_new_handler

The function std::set_new_handler() may be called by an application to register a callback (ie, a new-handler) that shall be called if operator new fails to fulfil an allocation request. It returns the previously registered callback function and is declared as;-

using new_handler = void(*)();
new_handler set_new_handler(new_handler p) noexcept;

If new fails to fulfil a memory request, it shall call the registered new-handler function. If the new-handler returns normally (ie, it does not throw an exception or terminate the program) then new shall try to perform the allocation again. The cycle repeats until the allocation succeeds or the new-handler does not return normally. This behaviour implies that the new-handler must do one or more of the following;-


Overloading new And delete

The new expression and operator new

new is actually composed of what may be thought of as two functions. In general use, this distinction is not apparent, but it is useful in order to fully understand the role of set_new_handler and is critical in understanding overriding or overloading operator new;-

The new expression

This is the function that is directly called by an expression such as X* a = new X;

It will first call operator new to acquire some memory of size (eg) sizeof(X)

If the memory acquisition is successful then the appropriate constructor is executed for the allocated type, within the acquired memory. If the construction fails (ie, throws an exception) then operator delete is executed to free the allocated memory and the exception is re-thrown

If the memory acquisition fails then a std::bad_alloc exception is thrown or (if the invoked operator new is a non-throwing version) nullptr is returned to the application
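The steps performed by the new expression for a single object can be sketched as an ordinary function template. This is an approximation only; the names create_like_new(), destroy_like_delete() and Fussy are hypothetical, and details such as class-specific operator new and over-aligned types are ignored;-

```cpp
#include <new>
#include <stdexcept>
#include <utility>

// Roughly what 'new T(args...)' does
template<typename T, typename... Args>
T* create_like_new(Args&&... args)
{
    void* mem = ::operator new(sizeof(T));      // 1. Acquire raw memory (throws std::bad_alloc on failure)
    try {
        return ::new (mem) T(std::forward<Args>(args)...);  // 2. Construct the object within that memory
    }
    catch (...) {
        ::operator delete(mem);                 // 3. Construction failed; free the memory...
        throw;                                  //    ...and re-throw the exception
    }
}

// Roughly what 'delete p' does
template<typename T>
void destroy_like_delete(T* p) noexcept
{
    if (p == nullptr) return;
    p->~T();                                    // 1. Destroy the object
    ::operator delete(p);                       // 2. Free the memory
}

// Hypothetical type whose constructor can be made to fail
struct Fussy {
    static int live;
    explicit Fussy(bool fail)
    {
        if (fail) throw std::runtime_error("construction failed");
        ++live;
    }
    ~Fussy() { --live; }
};
int Fussy::live = 0;
```

If the Fussy constructor throws, the memory is released before the exception reaches the caller, so nothing is leaked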


The new expression can also allocate groups of objects, as an array. This occurs with an expression such as X* b = new X[3];

In this case, an appropriate operator new[] is called to acquire some memory of size (eg) sizeof(X) * 3, and if successful, then it executes the constructor for each element of the array

If any constructor fails (ie, throws an exception), then the whole allocation is considered to have failed, the destructor is called for any successfully constructed elements and the appropriate operator delete[] is executed and the exception is re-thrown


It is not possible to override the new expression. It can be thought of as being aware of any overrides/overloads to operator new though and shall invoke the appropriate version

operator new

The sole purpose of this function is to allocate memory of the requested size. Exactly how it does this, and details such as alignment etc, is implementation-specific

operator new may be overridden; globally and/or specifically for a type. If it is, it should follow these conventions;-

  • On success, it should return a non-null void* to the allocated memory
  • On failure, if the new-handler is not nullptr (ref. std::get_new_handler()) then it should be called, followed by a further attempt to fulfil the allocation request. This process repeats until the allocation succeeds or the new-handler is nullptr. If the allocation has failed and the new-handler is nullptr, it should throw a std::bad_alloc exception (the new-handler itself may also throw this)
  • A valid pointer should be returned even if zero bytes are requested. This is usually achieved by allocating a single byte instead
  • It should be able to handle an unexpected size (ie, a size that does not match that of the object it is designed to allocate - see example)
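The conventions listed above can be sketched as a single allocation function. The names malloc_like() and allocate_following_convention() are hypothetical; the underlying allocator is passed in as a parameter purely so that the failure path can be exercised, and a real replacement operator new would call its allocator directly;-

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Default underlying allocator for this sketch
inline void* malloc_like(std::size_t size)
{
    return std::malloc(size);
}

// Follows the conventions expected of operator new
inline void* allocate_following_convention(std::size_t size,
                                           void* (*raw_alloc)(std::size_t) = malloc_like)
{
    if (size == 0) {
        size = 1;                               // A zero-byte request must still yield a valid pointer
    }
    for (;;) {
        if (void* p = raw_alloc(size)) {
            return p;                           // Success; non-null pointer
        }
        std::new_handler handler = std::get_new_handler();
        if (handler == nullptr) {
            throw std::bad_alloc{};             // No handler installed; report failure
        }
        handler();                              // Give the handler a chance to free memory, then retry
    }
}
```

A new-handler that can do nothing more should either throw std::bad_alloc itself or call std::set_new_handler(nullptr) so that the loop terminates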
operator new[]

This can be, and often is, identical to (or simply calls) operator new

It can be different to operator new though should the application need to handle array allocation differently to single object allocation


The delete expression

As with new, delete is also composed of what may be thought of as two distinct functions;-

The delete expression

This is the function that is directly called by an expression such as delete a;

It will first call the destructor for the object. This is why it is undefined behaviour to pass a void* to the delete expression

It will then call operator delete to free the memory previously allocated by operator new

This is one reason why a destructor should never throw an exception; doing so will almost certainly cause a memory leak


The delete expression can also de-allocate arrays of objects. This is performed by using the form delete[] b

First, the destructor is called for each element of the array, followed by a call to an appropriate version of operator delete[]


It is not possible to override the delete expression. It can be thought of as being aware of any overrides/overloads to operator delete though and shall usually invoke the appropriate version

operator delete

This function is expected to free the memory previously allocated by operator new

operator delete may be overridden and should be done so to match any override of operator new. If it is overridden, it should follow these conventions;-

  • A request to delete nullptr is legal and should always succeed
  • Like operator new, it should be able to handle unexpected size values
operator delete[]

This can be, and often is, identical to (or simply calls) operator delete, though like operator new[], it may do something else
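As a small sketch of a matching type-specific pair, the hypothetical class Tracked below counts its live allocations and forwards the real work to the global functions;-

```cpp
#include <cstddef>
#include <new>

// Hypothetical class that counts how many of its objects are on the free store
struct Tracked {
    static int live;

    // Type-specific operator new; forwards to the global version
    static void* operator new(std::size_t sz)
    {
        void* p = ::operator new(sz);   // May throw std::bad_alloc
        ++live;
        return p;
    }

    // Matching type-specific operator delete; deleting nullptr is harmless
    static void operator delete(void* p) noexcept
    {
        if (p != nullptr) {
            --live;
        }
        ::operator delete(p);
    }

    int value{};
};
int Tracked::live = 0;
```

The new and delete expressions pick up these versions automatically for Tracked objects; no change is needed at the point of use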


Some reasons why one may wish to replace operator new and operator delete operators include;-

Standard global operator new and delete prototypes

The standard global function prototypes are as follows. For each version of operator new, there is also an operator new[] (though it is not shown here for brevity). Similarly for operator delete and operator delete[];-

Overloading global operator new and delete


'Non-throwing' operator new and delete


Standard type-specific operator new and delete prototypes

It is also possible to overload operator new and operator delete for a specific user-defined class. For each version of operator new, there is also an operator new[]. Similarly for operator delete and operator delete[];-

Overloading type-specific operator new and delete

General operator new and delete overloading considerations

The variety of operator new and operator delete functions can appear confusing

One way of looking at this is that the standard versions (N1, N2, MN1 and D1 and MD1) are invoked by the new and delete expressions unless one of the other versions is defined

There are a couple of additional rules as indicated by the above prototype lists, but this is the general principle

Any version of operator new may be invoked (assuming it is defined) by providing the appropriate arguments to the new expression

However, the delete expression does not take any arguments (other than a pointer to the memory to delete) and so any explicit call to the delete expression will only behave correctly for non-placement allocations. That is, the delete expression will not behave correctly for allocations made with N6, N7, MN3 or MN4

The way around this is that if using one of the other versions of operator delete, it must be called directly, and not via the delete expression. This also means that the object's destructor must also be explicitly called before deletion of the memory. For example;-

void* p = ::operator new(sizeof(X));   // Allocate only - no construction
X* a = new(p) X{};                     // Placement-new - construction
a->~X();                               // Destruction
::operator delete(p);                  // Free memory

There is also a more complete example


new and delete within a class hierarchy

Placement new And delete

A placement operator new function is one of the forms N5, N6, N7, MN3 or MN4


The following example illustrates the use of placement new. It defines an allocator object that performs the actual memory allocation. It then defines a placement operator new function that uses the allocator object;-

// A custom allocator object
class allocator {
    // ...Whatever members are required to implement the allocator...
public:
    // Allocate
    void* alloc(std::size_t size)
    {
        // ...allocate some memory of size 'size' and return a pointer to it...
        return p;
    }

    // Free
    void free(void* p)
    {
        // ...free the memory specified by p...
    }
};

// placement operator new
void* operator new(std::size_t size, allocator& alloc)
{
    // Pass the allocation request on to the supplied allocator object
    return alloc.alloc(size);
}

Using the above definitions, we could allocate memory from a specific allocator object as follows;-

extern allocator persistent;
extern allocator shared;

// Allocate two 'X' objects in different 'allocator' areas
X* pp = new(persistent) X{initialiser};
X* ps = new(shared) X{initialiser};

Placement Delete

A placement operator delete function is one of the forms D7, D8 or MD6


Here is an example destroy() function that is suitable for deleting memory allocated with the above allocator;-

// Destroy object allocated with placement new using an allocator object
template<typename A, typename T>
void destroy(A& alloc, T* ptr)
{
    ptr->~T();          // Explicitly call destructor
    alloc.free(ptr);    // Free memory
}

// Delete the two objects allocated in the previous example
destroy(persistent, pp);
destroy(shared, ps);
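The allocator outline above leaves the actual allocation strategy open. The following self-contained sketch fills it in with one possible (and deliberately simplistic) choice; a fixed-size buffer from which memory is handed out sequentially and never individually reclaimed. The names arena, point and destroy_from() are hypothetical;-

```cpp
#include <cstddef>
#include <new>

// Hypothetical allocator; hands out memory sequentially from a fixed buffer
class arena {
    alignas(std::max_align_t) unsigned char buf_[256];
    std::size_t used_ = 0;
public:
    void* alloc(std::size_t size)
    {
        constexpr std::size_t align = alignof(std::max_align_t);
        std::size_t start = (used_ + align - 1) / align * align;    // Round up to keep alignment
        if (start + size > sizeof buf_) {
            throw std::bad_alloc{};
        }
        used_ = start + size;
        return buf_ + start;
    }
    void free(void*) noexcept {}        // Individual blocks are not reclaimed in this sketch
    std::size_t used() const noexcept { return used_; }
};

// Placement operator new and its matching placement operator delete
// (the latter is called automatically if the constructor throws)
void* operator new(std::size_t size, arena& a) { return a.alloc(size); }
void operator delete(void* p, arena& a) noexcept { a.free(p); }

// Destroy an object that was created with new(allocator)
template<typename A, typename T>
void destroy_from(A& alloc, T* ptr)
{
    ptr->~T();          // Explicitly call destructor
    alloc.free(ptr);    // Free memory
}

// Hypothetical type to allocate
struct point { int x; int y; };
```

An object is then created with (for example) point* p = new(my_arena) point{1, 2}; and released with destroy_from(my_arena, p);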

launder

Defined in the standard header <new> as;-

template<typename T> constexpr T* launder(T* p) noexcept; // In namespace std (C++17)

Example

Consider the following:-

struct X { const int m1; };
X a{42};

The member m1 is const and so following the above, the compiler can assume that any subsequent reference to a.m1 will always have the value 42, and may perform some optimisations based on this assumption

The type X has a trivial destructor, so it is safe not to call it. Therefore, it is legal to use placement new such as;-

X* b = new(&a) X{83};

This will simply create another instance of X, overwriting the original instance. But there is a problem; the compiler can assume that any subsequent reference to b->m1 will always have the value 83, and this would be correct. However, the original assumption that any reference to a.m1 yielding the value 42 is no longer (or at least, should no longer be) true

In reality, it is likely that the compiler will return 42 for a.m1 though. Essentially, the program is now broken; the const has been implicitly cast-away and the member variable modified, leading to undefined behaviour

This problem can be solved by accessing the member as std::launder(&a)->m1. This effectively prevents the compiler from maintaining the assumption that it once held about the value of a.m1, and forces it to re-evaluate it


Example

Another case where std::launder is useful is in circumventing the rule that states that a new object can not be accessed via a pointer to the old object if the two pointers are of different types. For example;-

aligned_storage<sizeof(int), alignof(int)>::type c;
new(&c) int{54};
int* d = std::launder(reinterpret_cast<int*>(&c));

Without the std::launder, the above would not guarantee a correct result


Error Handling

Preconditions And Postconditions

Most functions make some assumptions about the arguments passed to them and the state of any objects that the function may deal with. These preconditions may be implied (assumed) or explicitly tested for within the function. Similarly, the postconditions represent a guarantee that the function makes to its caller about the state of any values returned or any modifications it may make to any other objects

Preconditions and postconditions are important in maintaining invariants
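As a small illustration, a function might test its precondition explicitly and report a violation with an exception, whilst checking its postcondition with an assert() that is only active in debug builds. This is just one possible combination; the function isqrt() is hypothetical;-

```cpp
#include <cassert>
#include <stdexcept>

// Integer square root; returns the largest r such that r * r <= n
inline int isqrt(int n)
{
    // Precondition; explicitly tested and reported to the caller
    if (n < 0) {
        throw std::invalid_argument("isqrt: n must not be negative");
    }

    int r = 0;
    while ((r + 1) <= n / (r + 1)) {    // Equivalent to (r + 1) * (r + 1) <= n, but cannot overflow
        ++r;
    }

    // Postcondition; the guarantee made to the caller (checked in debug builds only)
    assert(r * r <= n);
    return r;
}
```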

There are several ways of dealing with preconditions;-

In reality, a combination of the above approaches may be suitable in any one situation

The standard C++ implementation supports two mechanisms for the checking of preconditions (and postconditions);-

std::terminate()

The std::terminate() function is defined in the standard header <exception> and may be called explicitly from within normal code if other error handling techniques are not an option

By default, std::terminate() will call abort(). This behaviour may be changed by specifying an alternative handler function to std::set_terminate(). For example;-

// From <exception>...
using terminate_handler = void(*)();

// Our new terminate() handler
[[noreturn]] void my_handler()
{
    // Handle termination (a terminate handler must not return to its caller)
    abort();
}

// Establish custom handler
terminate_handler old = set_terminate(my_handler);

// Do things that may cause terminate() to be called...

// Restore original handler
set_terminate(old);

The program is exited via an implicit call to std::terminate() if any of the following conditions are met;-

If a program terminates owing to an uncaught exception, it is implementation-defined as to whether destructors are called or not. This may depend on the environment; for example, if invoked from a debugger then it is probably desirable NOT to call destructors

Exceptions

It is not uncommon for a program (or part of a program) to be able to detect an error but have no idea how to deal with it. For example, a library function that has no concept of how it is being used

Exceptions provide a mechanism for such a program (or part of a program) to propagate the error back up the call stack in the general hope/expectation that some code, somewhere will know what to do about it; that is, the exception mechanism provides a means of getting error information from the point of detection to the point of handling

There are some key concepts that make the exception mechanism reliable/usable. These are the exception-safety guarantees (which are central to effective recovery from run-time errors), and Resource Acquisition Is Initialisation (RAII). Both of these concepts rely on the specification of invariants

The construct for preparing to handle the possibility of an exception is the try block

The two basic constructs for propagating an exception and handling it are called throw and catch. Here is an example;-

// Our exception type. In this case, we have no actual data to pass
struct serious_error {};

void do_something()
{
    // ...
    if (something_has_gone_badly_wrong) {
        throw serious_error{};
    }
}

void fn()
{
    // Do something and catch any exceptions if necessary
    try {
        do_something();
        // No error - carry on...
    }
    catch (const serious_error& err) {
        // Error - handle it (caught by const reference to avoid a copy)...
    }
}

In the above example, if do_something() were to throw an exception other than serious_error (or it called some other function that threw some other exception) then the calling function fn() would not handle it. Instead, the exception would propagate further up the call stack to whatever called fn()

Throwing An Exception

Promising Not To Throw

A function may be declared as not throwing any exceptions by using noexcept. For example;-

void fn() noexcept;

This declares a guarantee that the function will not throw or propagate any exceptions. If this guarantee is broken at run-time then the program will end immediately by calling std::terminate() (no destructors further up the call-tree shall be called)



Catching An Exception

There is a school of thought which says that catching any exceptions is more trouble than it is worth. That is, the program should allow any and all exceptions to make their way up to main() and terminate the program (possibly with some diagnostic output first)

The reason for this thinking is that even if an exception is caught, it is often very difficult to know what to do with it and to recover in any safe and meaningful sense from whatever error is being indicated. Often, the only exception that can be usefully handled is the bad_alloc error thrown by new, and even this can be avoided by using the non-throwing version of new and employing more traditional methods for handling the error
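
A minimal sketch of the non-throwing form of new follows. The helper name try_make_buffer is hypothetical; the point is only that the caller tests for nullptr rather than catching bad_alloc;-

```cpp
#include <cstddef>
#include <new>

// Hypothetical helper: allocate a zero-initialised buffer without risking
// a bad_alloc exception. The caller owns the result and must delete[] it
int* try_make_buffer(std::size_t count)
{
    // new(std::nothrow) returns nullptr on failure rather than throwing
    return new (std::nothrow) int[count]{};
}
```

The returned pointer must always be compared against nullptr before use, which is exactly the 'traditional' style of error handling that exceptions are intended to replace.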

Rethrowing An Exception

Exceptions Within Threads

When Not To Use Exceptions

For various historical and practical reasons, there are cases where exceptions can not be used (or it would be unwise to use them);-

In order to make best use of exceptions within a program, a solid, simple strategy needs to be put in place. Specifically, key functions or subsystems should be designed to either always succeed or fail in a controlled and well-defined way that leaves the state of the program consistent with no lost or broken resources

A function that throws an exception (or fails to catch an exception) should deal with any resource cleanup at the time. It should not rely on its caller to do it for it

When using external libraries, it may be necessary to convert from one error-handling strategy to another. For example, checking for errno after a system function call and throwing an exception if appropriate
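
The following is a sketch of such a conversion, wrapping the standard 'C' function strtol(). The wrapper name parse_long and the decision to throw std::system_error are assumptions for the purpose of illustration;-

```cpp
#include <cerrno>
#include <cstdlib>
#include <system_error>

// Hypothetical wrapper that reports a strtol() range error by throwing
// an exception rather than leaving the caller to inspect errno
long parse_long(const char* text)
{
    errno = 0;  // strtol() only sets errno on failure, so clear it first
    const long value = std::strtol(text, nullptr, 10);

    if (errno == ERANGE) {
        // Convert the errno value into an exception
        throw std::system_error{errno, std::generic_category(), "parse_long"};
    }
    return value;
}
```

Only the ERANGE case is handled here; a production wrapper would probably also want to detect input that contains no digits at all.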

Unless a function guarantees noexcept, assume it might throw an exception and protect against this when handling resources

Even (apparently) simple operations such as =, < and sort() may throw exceptions

The exception mechanism is intended to provide a consistent error handling method spanning multiple modules and libraries, possibly developed independently of each other. It also shifts error handling code out of the main flow of execution into specific (catch) blocks which keeps the main flow cleaner and makes the error handling more obvious and visible

The exception mechanism is intended to be used. That is, an error condition does not have to be particularly rare or particularly catastrophic in order to warrant the use of an exception. An error may be considered quite common, and/or not particularly disastrous (as is the case with many I/O operations), but the exception mechanism could still be appropriate. "Exception" should be interpreted as "something that the code was unable to do" rather than "we're all going to die!"

Most large programs will be expected to throw and catch at least some exceptions during a normal and successful run

Having said this, see also this option

Do not allow exceptions to be emitted from destructors

Only throw objects that are user-defined types specifically defined for the purpose, rather than (say) an int. This will minimise the chance of two exceptions (possibly from different libraries, for example) being confused

Do not use exceptions to perform non-error asynchronous tasks, such as (say) a key entry or I/O interrupt. There are other mechanisms to handle this sort of activity and using exceptions in the role is an abuse of the exception mechanism. It is worth noting that an implementation will typically optimise the exception handling mechanism based on the assumption that it is used only for ("out of band") error reporting

Older code may use the following syntax;-

// Specifies that no exceptions shall be thrown
void fn() throw();

// Specifies the types of exception that may be thrown
void fn() throw(exc1, exc2);

Both these forms are now deprecated. The first has been replaced with noexcept, and the second proved unsuccessful and has been abandoned (it is removed from the language entirely in C++17)

Function Try Blocks

It is possible to define an entire function body as a try block. For example;-

void fn()
try {
    // Do some work...
}
catch (...) {
    // Catch all exceptions...
}

For most functions (such as the above example), there is very little to be gained by this syntax. However, function-try blocks are more useful in constructors

Normally, if an exception occurs within a base or member initialiser then it cannot be caught by the constructor itself; it is passed up to whatever invoked the constructor. A function-try block allows the constructor to catch it. Note that the handler cannot suppress the exception; unless it throws a different one, the original exception is rethrown automatically when the handler finishes. For example;-

class X {
    vector<int> v;
public:
    X(int);
};

X::X(int size)
try
    : v(size)       // Member construction
{
    // Rest of construction...
}
catch (std::exception& err) {
    // If an exception is thrown by the vector<T> construction then we will
    // catch it here before it is passed up to the caller of X(). It is
    // rethrown automatically at the end of this handler
}

Exception Guarantees

A function that leaves the program in a valid state, with no resource leaks and no inconsistencies is considered exception-safe

Generally, for a class to be exception-safe, it must have an invariant. Objects that are not classes but have some relationship to each other (a relationship that is assumed at all times) must also have an invariant. If such invariants prove false (not maintained) then exception-safety will usually be compromised

Before an exception is thrown, all objects that may be affected must be placed into a valid state (a state that meets each object's invariant). Unfortunately, the state chosen, while valid, may not be the best one for the caller


A function should offer one of the following three exception-safety guarantees;-

Guarantee | Description
The basic guarantee | If an exception is thrown, the object (or objects) being operated on (and by extension, the whole program) will always be left in a state that meets its invariants. Any co-dependencies it shares with other objects shall remain valid and meet all the invariants for the object (though the state may have changed, possibly in unpredictable ways). No resources shall be leaked
The strong guarantee | If an exception is thrown, the object (or objects) being operated on (and by extension, the whole program) will be left in exactly the same state it was in before the function started. It is as if the function had never been called, apart from the fact that there is now an exception working its way up the call stack!
The nothrow guarantee | The function shall never throw an exception and shall always run to completion (and, one assumes, always leaves the object(s) being operated on in a valid state). All operations on any built-in or pointer type offer this guarantee

Exception-Safe Construction

The nothrow guarantee is the most desirable but often impossible to provide if the function is anything other than trivial and/or deals with anything other than built-in types; many 'innocent looking' standard container operations may throw, for example

A general design technique that is often used to provide the strong guarantee is that of copy-and-swap; that is, copy the object (make a temporary), modify the copy (if an exception is thrown at this point then the original remains unchanged), and if all is well, swap the copy with the original using a non-throwing swap()
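
The copy-and-swap technique might be sketched as follows. The class samples, its invariant, and the member function append_all are hypothetical and exist only to demonstrate the three steps;-

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical class whose invariant is "all stored values are non-negative"
class samples {
    std::vector<int> values;
public:
    // Offers the strong guarantee: either every value in 'extra' is added,
    // or 'values' is left exactly as it was
    void append_all(const std::vector<int>& extra)
    {
        std::vector<int> copy = values;     // 1. Copy (original is untouched)
        for (int v : extra) {               // 2. Modify the copy (may throw)
            if (v < 0) {
                throw std::invalid_argument{"samples: negative value"};
            }
            copy.push_back(v);
        }
        values.swap(copy);                  // 3. Commit with a non-throwing swap
    }

    std::size_t size() const noexcept { return values.size(); }
};
```

If the exception is thrown part-way through step 2 then only the temporary copy has been modified, and it is destroyed as the exception leaves the function.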

Offering the strong guarantee rather than the basic guarantee is highly desirable and is often relatively simple as long as the function is dealing with data that it has full visibility of, and control over. If it needs to invoke other functions that may themselves throw then offering such a guarantee becomes much harder even if those functions offer the strong guarantee as well; what happens if the first function call succeeds and the second one fails and throws an exception? Is it possible to undo the effect of the first function?

Another possible obstacle to offering the strong guarantee is that there may be an unacceptable cost involved; the copy-and-swap technique is not free—by definition it involves creating a temporary object

The presence of noexcept is not an indication of a function offering the nothrow guarantee. It is an indication that if an exception is thrown then something has gone seriously wrong and the program is now in an undefined state. The nothrow guarantee is a feature of a function's implementation, not its declaration. See also unexpected() and set_unexpected()

The only time no exception safety guarantee can be offered at all is if the function relies on some other (legacy) function that is itself not exception-safe

The exception safety of a function is only as strong as that of the weakest operation it performs (assuming recovery from the effects of the weakest operation is not possible)

Whenever possible, use pointer manager objects
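
As a sketch of this advice, and assuming that 'pointer manager objects' includes the standard smart pointers such as std::unique_ptr, the following shows resources being released automatically when an exception leaves a function. The type tracked and the function use_resources are hypothetical;-

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical resource type that counts how many instances are alive
struct tracked {
    static int live;
    tracked()  { ++live; }
    ~tracked() { --live; }
};
int tracked::live = 0;

// Each resource is owned by a std::unique_ptr, so it is released
// automatically when the exception leaves the function
void use_resources()
{
    std::unique_ptr<tracked> p1{new tracked};
    std::unique_ptr<tracked> p2{new tracked};

    throw std::runtime_error{"use_resources: something went wrong"};
}
```

No explicit cleanup code is required; once the exception has propagated out of use_resources(), tracked::live is back to zero.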

Although inferior to following true RAII principles, if it is absolutely necessary to use raw resources ("naked" pointers etc), then the following technique can provide exception safety;-

// Define a class that will call an arbitrary function from its destructor
template<typename F>
struct final_action {
    F clean;
    final_action(F f): clean{f} {}
    ~final_action() { clean(); }
};

// Define a function that deduces the type of an action
template<class F>
final_action<F> finally(F f)
{
    return final_action<F>(f);
}

// A function that uses our 'finally' mechanism
void fn()
{
    int* p1 {};
    int* p2 {};

    // Protect our "naked" resources (arrays must be released with delete[])
    auto cleanup = finally([&] { delete[] p1; delete[] p2; });

    // Create some resources (new may throw an exception)
    p1 = new int[42];
    p2 = new int[83];

    // Carry on. When 'cleanup' goes out of scope, it shall tidy-up our resources
}

unexpected() and set_unexpected()

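The function [[noreturn]] void unexpected() is called in the event that an exception is thrown from a function in violation of that function's (deprecated) dynamic exception specification, such as throw() or throw(exc1, exc2). By default it calls std::terminate(); an alternative handler may be installed with set_unexpected(). An exception that escapes a function marked as noexcept results in a direct call to std::terminate() instead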

Functions

Two different syntax styles are available for declaring a function; the traditional 'C'-style;-

// Using prefix return-type syntax
[[noreturn]] static extern inline constexpr return-type name(argument-list) noexcept;
[[noreturn]] static extern inline constexpr auto name(argument-list) noexcept;
[[noreturn]] static extern inline constexpr decltype(auto) name(argument-list) noexcept;

…or this (suffix return-type syntax);-

// Using suffix return-type syntax (noexcept precedes the return type)
[[noreturn]] static extern inline constexpr auto name(argument-list) noexcept -> return-type;

A function is defined like this ('C'-style syntax); all three forms follow the same pattern;-

// Using prefix return-type syntax
[[noreturn]] static extern inline constexpr return-type name(argument-list) noexcept
{
    // body
    return return-expression;
}

…or this (suffix return-type syntax);-

// Using suffix return-type syntax (noexcept precedes the return type)
[[noreturn]] static extern inline constexpr auto name(argument-list) noexcept -> return-type
{
    // body
    return return-expression;
}

All functions consist of the following components;-

In addition to the above, a member function may also be specified as;-

Here is a rather complex example using many of the above options;-

struct S { [[noreturn]] inline auto f(const unsigned long int* const param) const noexcept -> void; };

Arguments

Return Type


Deduced Return Types

Suffix Return Type

Function's Type

Top Level Argument Qualifiers

The last couple of examples above demonstrate an important issue; that is, top level const and/or volatile qualifiers are ignored when determining a function's type
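
This rule can be checked at compile-time. The following sketch uses std::is_same to compare function types; the function name triple is hypothetical;-

```cpp
#include <type_traits>

// Top level const on a by-value argument is not part of the function's type
static_assert(std::is_same<void(int), void(const int)>::value,
              "top level const is ignored");

// const applied to the pointed-to type is NOT top level, so the types differ
static_assert(!std::is_same<void(int*), void(const int*)>::value,
              "pointee const is significant");

// These two declarations therefore refer to the same function
int triple(int value);
int triple(const int value) { return value * 3; }
```

A practical consequence is that a declaration in a header may omit a top level const that the definition uses to protect its own local copy of the argument.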


The following shows how to determine the number of, and types of arguments of a function (and a lambda expression), and its return type, given only a pointer to the function;-

#include <cstddef>
#include <tuple>
#include <type_traits>

// A traits type used to extract the function attributes
template <typename T>
struct func_traits : public func_traits<decltype(&T::operator())> {};

// Specialisation for function pointers
template <typename RTN, typename... ARGS>
struct func_traits<RTN(*)(ARGS...)> {
    using return_type = RTN;
    enum { num_args = sizeof...(ARGS) };
    template <std::size_t N>
    struct arg {
        using type = typename std::tuple_element<N, std::tuple<ARGS...>>::type;
    };
};

// Specialisation for lambdas
template <typename LMB, typename RTN, typename... ARGS>
struct func_traits<RTN(LMB::*)(ARGS...) const> {
    using return_type = RTN;
    enum { num_args = sizeof...(ARGS) };
    template <std::size_t N>
    struct arg {
        using type = typename std::tuple_element<N, std::tuple<ARGS...>>::type;
    };
};

// A function template that takes an arbitrary function/lambda as an argument
template <typename T>
void fn(T&& t)
{
    using traits = func_traits<typename std::decay<T>::type>;

    // Determine the function/lambda attributes
    using return_t = typename traits::return_type;          // Return type
    auto num_args = traits::num_args;                       // Number of arguments
    using arg0_t = typename traits::template arg<0>::type;  // First arg type
    // ...
}

The above could be invoked as follows and the function 'fn' would be able to determine the supplied function/lambda arguments etc;-

// A simple function
void my_function(int a, float b, int c) {}

// A lambda
auto my_lambda = [](int a, int b) { return 3.14; };

// Function 'fn' may be called with an arbitrary function/lambda reference
fn(my_function);
fn(my_lambda);

Inline Functions

A function may be defined as inline. For example;-

inline int fn(int p);

Specifying inline is a hint to the compiler that the function body should be placed in-situ at the point the function is called, rather than instantiating the function once and performing a normal function call to it

In summary, apply inline to those functions that are known to be truly trivial, and are called frequently. Treat everything else as an optimisation (that in most cases will make virtually no difference to the total performance!). Beware of bloat

Constant Expression Functions

If a function is declared constexpr then, as long as its arguments are constant expressions and its body uses only operations that are permitted in a constant expression (which, by definition, it MUST), it can be invoked and its value determined at compile-time. See constant expressions
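
A minimal sketch, written in the restricted C++11 style (a single return statement) so that it is valid in both C++11 and C++14. The function names are hypothetical;-

```cpp
// A C++11-style constexpr function (a single return statement)
constexpr int factorial(int n)
{
    return n <= 1 ? 1 : n * factorial(n - 1);
}

// Evaluated at compile-time because the argument is a constant expression
static_assert(factorial(5) == 120, "factorial(5) should be 120");

// The same function may also be called at run-time with a non-constant argument
int factorial_of(int n) { return factorial(n); }
```

The static_assert demonstrates the compile-time evaluation; factorial_of() shows that constexpr does not prevent ordinary run-time use.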

Function Invocation

A function is invoked thus;-

return-value = expression(arguments)

When a function is called, a new stack-frame is created. Formal arguments are allocated within this and are initialised from the function's actual arguments. Local variables are also created within the function's stack-frame

Returning From A Function

A function is normally exited with the return statement;-

// Returning from a function with a void return type
return;

// Returning from a function with a non-void return type
return expression;

There are actually five ways to exit a function;-

Returning an object from a function, rather than writing to it via an argument reference, is often not as expensive as it looks on the surface. Move rather than copy operations are used wherever possible, so if the returned object is a container or similar resource handle then it can be returned by value to the caller relatively inexpensively

Some functions, notably many operators, return references

However, returning (and then using) a pointer or reference to an object that was created within the function on the stack will result in undefined behaviour as all such objects are destroyed when the function exits

Return Value Optimisation (RVO)

Argument Passing

Function arguments may be passed by value, or by reference. Pointers are passed by value and explicitly dereferenced within the function. For example;-

// a is passed by value, b is passed by reference and c is a pointer
void fn(int a, int& b, int* c)
{
    ++a;        // Local variable
    ++b;        // Implicitly dereferenced
    ++(*c);     // Explicitly dereferenced
}

// Use the function
int x = 0;
int y = 10;
int z = 20;
fn(x, y, &z);

// At this point, x == 0, y == 11 and z == 21

Arguments may be passed as const to prevent the function modifying them;-

void fn(int a, const int b, const int* c)
{
    ++a;        // Ok
    ++b;        // Error: b is const
    ++c;        // Ok
    *c = 42;    // Error: *c is const
}

Arguments may also be qualified as volatile in addition to, or instead of const

Argument Passing By Value

An Analysis Of Pass-By-Value

(Consider pass-by-value for copyable arguments that are cheap to move and are always copied)
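
A sketch of the situation this advice refers to; a 'sink' function that always keeps its own copy of the argument. The class name_list is hypothetical;-

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class that always stores a copy of the name it is given
class name_list {
    std::vector<std::string> names;
public:
    // 'name' is copy-constructed from an lvalue or move-constructed from an
    // rvalue at the call site; it is then moved (cheaply) into the container
    void add(std::string name) { names.push_back(std::move(name)); }

    const std::string& at(std::size_t i) const { return names.at(i); }
    std::size_t size() const noexcept { return names.size(); }
};
```

A single pass-by-value function replaces the pair of overloads (const std::string& and std::string&&) that would otherwise be needed, at the cost of one extra move.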

Argument Passing By Reference

In-line with the rules for reference initialisation, a literal, constant, or a value that requires conversion can be passed as a const T& argument, but NOT as a (non-const) T&. Allowing conversions for a const T& argument ensures that it can be given exactly the same set of values as a (pass-by-value) T argument by passing the value in a temporary, if necessary

A function may also take rvalue references as arguments. The main use for these is in defining move-constructor, move-assignment operations, and forwarding functions. For example;-

// Three functions (overloaded) that take an lvalue reference,
// a const lvalue reference and an rvalue reference respectively
void fn(vector<int>& v);
void fn(const vector<int>& v);
void fn(vector<int>&& v);

// Use the function
vector<int> u1 {1, 2, 3, 4};
const vector<int> u2 {5, 6, 7, 8};
fn(u1);                         // Invokes fn(vector<int>&)
fn(u2);                         // Invokes fn(const vector<int>&)
fn(vector<int> {1, 2, 3, 4});   // Invokes fn(vector<int>&&)

For small objects (say, up to 4 words), it can be more efficient to pass-by-value. Accessing an object that has been passed-by-reference is almost always slower than accessing one that is passed-by-value so if the object is small and is accessed several times from within the function, then pass-by-value may be the faster method. This is very platform and compiler-dependent though

Non-const pass-by-reference can often be eliminated by using suitable move-constructor and move-assignment operations and returning the result in the standard way instead

Pass-by-pointer is useful when 'no object' (indicated by nullptr) is a valid option. Compared to non-const pass-by-reference, it is also more explicit even if 'no object' is not a requirement

Array Arguments

Passing an array as a function argument will implicitly pass a pointer to the start of the array; that is, an argument of type T[] will be converted to T*. Therefore, the following are (mostly) equivalent;-

void fn(int* p);    // p is pointer. An array (or not) structure is not specified
void fn(int a[]);   // a is pointer to an array of unspecified size
void fn(int v[42]); // v is pointer to an array (the number of elements is not enforced)

Passing an actual array, rather than a pointer to it, can be achieved by passing a reference. Such a reference makes the array size part of the argument type;-

// Function taking a reference to an array of 3 elements
void fn(int(&x)[3]);

// Use the function
int a[] = {1, 2, 3};
int b[] = {1, 2};
fn(a);  // Ok
fn(b);  // Error: Wrong number of elements

One use for the above technique is in templates where the number of elements must be deduced. For example;-

template<class T, int N>
void fn(T(&a)[N])
{
    // N will be the number of elements
}

The following example demonstrates a nasty error that can occur when using "naked" arrays to maintain a hierarchy of class objects;-

class animal { /* ... */ };
class dog : public animal { /* ... */ };

void fn(animal* p, int len)
{
    // Iterate through an array of animal objects
}

dog dogs[8];
fn(dogs, 8);

The above will compile but will result in a catastrophic failure. What will happen is that the array dogs[8] shall be implicitly converted to a dog* and then implicitly to an animal* (because a dog is a type of animal) in the call to fn(). The problem is sizeof(dog) is not sizeof(animal) which has obvious implications when fn() iterates through the pointer p and tries to access *p. Using containers such as std::array can help avoid these problems

Be extremely wary of any interface of the form (T*, count); if T is a base class then the results can be fatal

List Arguments

A list (indicated with {}) may be passed as a function argument as long as the values in the list can be used to initialise the specified argument type. For example;-

// Function 1, first version (overloaded)
template<class T> void f1(std::initializer_list<T> x);

struct S {
    int a;
    string s;
};

// Function 1, second version (overloaded)
void f1(S x);

// Function 2 (an rvalue reference is needed in order to bind to a list)
template<class T, int N> void f2(T(&&x)[N]);

// Function 3
void f3(int x);

// Use the above functions
f1({1,2,3,4});      // T is int and the initializer_list has size() 4
f1({1,"Hello"});    // f1(S{1,"Hello"})
f2({1,2,3,4});      // T is int and N is 4
f3({1});            // f3(int{1});

// This is potentially ambiguous as it could resolve to either version of f1().
// In such cases, the version with the initializer_list takes priority
f1({1});            // T is int and the initializer_list has size() 1

Variable Numbers Of Arguments

A variable number (and in some cases, type) of arguments may be expressed in three different ways;-

The first two methods are described elsewhere. Here is an example of the last method; the standard 'C' printf() function. This takes at least one parameter, a pointer to a 'C'-style format string. It may also have zero or more additional parameters;-

int printf(const char* format, ...);

Within such a function, the variable arguments are accessed thus;-

// va_list, va_start(), va_arg() and va_end() are defined in <cstdarg>
#include <cstdarg>

int printf(const char* format, ...)
{
    // Set up environment for accessing variable arguments
    va_list varg;

    // Specify the va_list and the last formal argument to va_start()
    va_start(varg, format);

    // Variable arguments are accessed like this (the type must be known!)...
    type v = va_arg(varg, type);

    // Subsequent calls to va_arg() will return the next argument(s) in turn

    // Cleanup
    va_end(varg);
}

It is sometimes necessary to forward variadic parameters on to some other function as-is. This can be achieved if the function being called takes a va_list rather than ..., as follows;-

// Standard 'C' vprintf()
int vprintf(const char* format, va_list args);

int my_printf(const char* format, ...)
{
    va_list varg;
    va_start(varg, format);
    int count = vprintf(format, varg);  // Forward the variable arguments as-is
    va_end(varg);
    return count;
}

Default Arguments

Default values may be specified for function arguments, with the default(s) being used in the event the caller does not provide a value for that argument. For example;-

// Declare function with (some) default argument values
void fn(int a, int b = 2, string c = "Hello");

// Use the function
fn(38, 42, "Boing");    // All function arguments are as specified
fn(38, 42);             // Equiv. to fn(38, 42, "Hello")
fn(38);                 // Equiv. to fn(38, 2, "Hello")

Default argument values do not have to be literals. For example;-

// Declare function with (some) default argument values
static string my_name;
void fn(int a, int b = 2, string c = my_name);

// Use the function
my_name = "Bob";
fn(38, 42);     // Equiv. to fn(38, 42, "Bob") or fn(38, 42, my_name)

Overloaded Functions

Two or more functions declared within the same scope and with the same name, but different arguments are said to be "overloaded". This is useful if the functions conceptually perform the same task. For example;-

void print(int a);
void print(string s);
void print(char* p, int len);

Which version of an overloaded function is called is determined by the overload resolution rules described below

Argument-Dependent Name Lookup

When an overloaded function is called, the compiler determines which version of the function to resolve to by comparing the type of each overloaded function with the caller's argument(s). The criteria for "best match" of each argument is as follows, in this order;-

In addition;-

If the function takes a single argument then the one with the "best match" is called. If the function takes multiple arguments then the called function is the one that has a "best match" for one of the arguments and a better or equal match for the others

If two functions match equally well, except for the const-ness and/or volatile-ness of their argument(s), then the non-const/volatile version shall be chosen over the const/volatile version unless the const-ness/volatile-ness of the supplied argument(s) dictates otherwise (for example, an already const reference)

If more than one function matches at the same level then the call is considered ambiguous and a compiler error is raised. One exception to this is if a templated function expands to give the exact same signature as a standard (non-template) function, then overload resolution will favour the standard function
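
The following sketch illustrates some of these rules. The overload sets which() and pick() are hypothetical; each version simply reports which one was chosen;-

```cpp
#include <string>

// Hypothetical overload set; each version reports which one was chosen
std::string which(int)         { return "int"; }
std::string which(double)      { return "double"; }
std::string which(const char*) { return "const char*"; }

// A template that can produce an exact match for any argument type,
// overloaded with a standard (non-template) function
template<typename T>
std::string pick(T)   { return "template"; }
std::string pick(int) { return "non-template"; }
```

Here, which('a') and which(1.5f) resolve to the int and double versions respectively (by promotion), and which("hi") resolves to the const char* version. The call pick(1) matches both versions of pick() exactly and so the non-template function is favoured, whereas pick(1.5) resolves to the template as it is an exact match for double.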

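The selection process described above is known as overload resolution. It should not be confused with Argument Dependent Name Lookup (ADL), or Koenig Lookup, which is the related rule that an unqualified function name is also looked up in the namespaces of its arguments' types, and so determines which candidate functions are considered in the first place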

Overloading On Forwarding References

Avoid overloading on forwarding references 

Consider two overloaded functions;-

// fn() taking a forwarding reference
template<typename T> void fn(T&& a);    // 1

// fn() taking an int
void fn(int a);                         // 2

The above functions could be called with;-

X b{};
short c = 3;
fn(b);  // Ok: Calls 1
fn(3);  // Ok: Calls 2
fn(c);  // Error?: Calls 1 rather than 2 - probably not what was wanted

In the last case, fn(T&& a) was a better match than fn(int a) as the latter would require a promotion of the supplied argument


Alternatives to overloading on forwarding references

Possible options are;-

None of the above alternatives allows perfect forwarding within the function (for that, one must use a forwarding reference). If this is a requirement then one solution is to use Tag Dispatch. Expanding on the above example;-

// Implementation of fn() taking a forwarding reference
template<typename T>
void fn_impl(T&& a, std::false_type);

// Implementation of fn() taking an int
void fn_impl(int a, std::true_type);

// fn() taking a forwarding reference
template<typename T>
void fn(T&& a)
{
    fn_impl(std::forward<T>(a),
            std::is_integral<typename std::remove_reference<T>::type>());
}
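
The declarations above can be fleshed out into a complete sketch. The std::string return values are an addition made purely so that the chosen implementation can be observed; they are not part of the technique;-

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Implementation for non-integral arguments (perfectly forwarded)
template<typename T>
std::string fn_impl(T&&, std::false_type) { return "generic"; }

// Implementation for integral arguments
inline std::string fn_impl(int, std::true_type) { return "int"; }

// The single forwarding-reference entry point dispatches on a tag type
template<typename T>
std::string fn(T&& a)
{
    return fn_impl(std::forward<T>(a),
                   std::is_integral<typename std::remove_reference<T>::type>());
}
```

With this in place, a call with a short lvalue now reaches the int implementation, because the tag is selected by the type trait and is not subject to the usual preference for the forwarding reference.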

Function Pointers

It is possible to take the address of a function and assign it to a pointer in the same way as for an object. Two function pointer forms are supported;-

rtn-type (*fp)(args)    // 'fp' is the pointer variable's name
rtn-type fp(args)       // Function arguments only; equivalent to the above

The function pointer may be used to call the function. For example;-

// Define a function
void fn(int a, string b);

// Define a pointer of the appropriate type to hold the function's address
void (*fn_p)(int, string);

// Call the function
fn_p = &fn;
fn_p(42, "Hello");
(*fn_p)(83, "Goodbye");

A function pointer alias is defined like this (using the above example);-

// Define a function pointer type
using fp = void (*)(int, string);

// Use the type
fp fn_p = &fn;

…or using typedef;-

// Define a function pointer type
typedef void (*fp)(int, string);

A function pointer may refer to a noexcept function;-

// Define a function
void fn(int a, string b) noexcept;

// Define and initialise a pointer of the appropriate type to hold the function's address
void (*fn_p)(int, string) noexcept = fn;

Example of a function taking a function pointer;-

void fn(int (*fnp)(double))
{
    int v = fnp(1.23);
}

// ...or using the alternate form...
void fn(int fnp(double))
{
    int v = fnp(1.23);
}

Lambda Expressions

A lambda expression facilitates the definition of an anonymous function object (though it can also be named). It is a shorthand to the notion of defining a class with an operator(), making an object of that type and then invoking it. Lambda expressions may be passed to functions as an operation for the function to execute

A lambda expression is defined like this;-

auto lambda-name = [capture-list](argument-list) mutable noexcept -> type {body}
auto lambda-name = [capture-list](argument-list) mutable noexcept {body}
auto lambda-name = [capture-list] mutable noexcept {body}   // See note below re 'mutable'

All lambda expressions consist of the following components;-


It is possible to emulate init capture in C++11 using std::bind;-

auto px = std::make_unique<X>();
auto fn = std::bind([](const std::unique_ptr<X>& px) { /* ...use px... */ },
                    std::move(px));

This works because std::bind move-constructs any of its members initialised from rvalues (which is exactly what std::move() produces). The lambda expression takes an lvalue reference to the captured pointer px. Note that it does not take an rvalue reference because although the initialisation value (returned from std::move()) is an rvalue, the member inside the bind object is an lvalue. Therefore, when the lambda expression executes (ie, when the closure's operator() operator is invoked), it operates on the move-constructed px member of the bind object

By default, the px member of the bind object is not const, and so in order to prevent the lambda expression from modifying it (ie, to maintain the same behaviour as a stand-alone lambda expression), px is passed to the lambda explicitly as const

Because std::bind maintains copies of all its arguments, the lifetime of the closure is the same as its parent bind object. It is therefore possible to treat objects within the bind object as if they were within the closure


The following example shows a function object that outputs all values from a vector that meet the criteria (v[i] % m) == 0

class modulo_print {
    // Members to hold the capture list
    ostream& os;
    int m;
public:
    modulo_print(ostream& s, int mm) : os(s), m(mm) {}  // Capture
    void operator()(int x) const
    {
        if (!(x % m)) {
            os << x << endl;
        }
    }
};

// Output v[i] to os if (v[i] % m) == 0
void print_modulo(const vector<int>& v, ostream& os, int m)
{
    for_each(begin(v), end(v), modulo_print{os, m});
}

This works because the for_each() function template invokes its third argument as a function for each element in the range (ie, it calls the function object's operator() with each element it iterates through). Therefore, the example first constructs a modulo_print object with the initialisers os and m and then uses that object repeatedly by calling operator()(int x) where x is an element of vector v. This demonstrates an extremely useful technique

Defining operator() as const is the usual case, but not compulsory. Here is the equivalent lambda expression for the above code;-

// Output v[i] to os if (v[i] % m) == 0 void print_modulo(const vector<int>& v, ostream& os, int m) { for_each(begin(v), end(v), [&os, m](int x){ if (!(x % m)) { os << x << endl; } }); }

…or we could name the lambda…

// Output v[i] to os if (v[i] % m) == 0 void print_modulo(const vector<int>& v, ostream& os, int m) { auto modulo_print = [&os, m](int x){ if (!(x % m)) { os << x << endl; } }; for_each(begin(v), end(v), modulo_print); }

Defining the lambda expression as mutable would be the equivalent of defining the above operator() as non-const
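As a small sketch of the effect of mutable (the Counts type and the function name are hypothetical), the closure below modifies its own by-value copy of n while the original variable is left untouched;-

```cpp
// Hypothetical result type: the value seen inside the closure and the
// value of the original variable afterwards
struct Counts { int from_closure; int outer; };

inline Counts count_with_mutable_lambda()
{
    int n = 0;
    // 'mutable' makes the closure's operator() non-const, so the
    // by-value copy of n held inside the closure may be modified
    auto next = [n]() mutable { return ++n; };
    next();
    next();
    return Counts{next(), n};
}
```

Without mutable, the ++n inside the lambda body would not compile because operator() would be const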

Here is the same function using a range-for loop. For this simple example, this could be considered the best option;-

// Output v[i] to os if (v[i] % m) == 0 void print_modulo(const vector<int>& v, ostream& os, int m) { for(auto x : v) { if (!(x % m)) { os << x << endl; } } }

Avoid using default capture modes

There are two default capture modes; by-reference [&], and by-value [=]

The problem with the former is that it can lead to dangling references. The problem with the latter is that it implies that the resulting closure is self-contained. This is not necessarily the case

By-reference capture can lead to a dangling reference if the lifetime of the closure exceeds that of the (usually local) referenced variable or argument

There is nothing magic about (say) [&var1, &var2] over the default capture mode [&], but the former is explicit (and therefore doesn't ‘hide’ the references to the specific variables), narrows scope, and is generally good practice

By-value capture does not always isolate the lambda expression from relative lifetime problems; capturing a pointer by value does not prevent the pointed-to object being prematurely deleted
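A minimal sketch of the lifetime issue, assuming a hypothetical factory function make_divisible_by(). The closure outlives the function argument it depends on, so the argument is captured explicitly by value;-

```cpp
#include <functional>

// Returns a predicate that outlives this function call
inline std::function<bool(int)> make_divisible_by(int divisor)
{
    // Capturing [&divisor] or [&] here would leave the closure holding a
    // dangling reference, because 'divisor' is destroyed on return.
    // An explicit by-value capture keeps the closure self-contained
    return [divisor](int x) { return x % divisor == 0; };
}
```

Naming the capture explicitly also makes it obvious at a glance what the closure depends on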

A lambda expression capture, ie [...] only applies to (ie, only captures) non-static, local variables and function arguments that are within scope where the lambda expression is defined. Therefore, given the following example where a lambda expression is passed to fn2();-

class X { int m = 3; public: void fn() { fn2([=](int a) { return a / m; }); } };

The above code will compile and run. However, because the capture [=] applies only to local variables, the single variable actually captured is this, and not this->m. Within the lambda, this-> is implied and is automatically applied to the reference of member m. It is as if the above were written as;-

class X { int m = 3; public: void fn() { auto p_obj = this; fn2([p_obj](int a) { return a / p_obj->m; }); } };

Incidentally, a capture of [m] would fail (m is not a local variable), as would an empty capture [] (not capturing this)

Because the member m is referenced from this, the lambda expression in this example is dependent on the lifetime of the this object. Therefore, if fn2() saved the supplied closure for later use before returning, it would contain a dangling this pointer if the X object were deleted prematurely

The fix for this problem is to rework fn() so that it makes a local copy of the member before defining the lambda expression;-

void fn() { auto b = m; fn2([b](int a) { return a / b; }); }

Here is a tidier approach using init capture;-

void fn() { fn2([m = m](int a) { return a / m; }); }

A similar issue occurs with objects that are statically allocated; global and static variables (defined at file, namespace, function or class scope). Such objects cannot be captured by a lambda expression, but they can nonetheless be accessed from within it, and are, at the same time, modifiable from outside of the lambda's scope. Using a default by-value capture [=] may give the false impression that such a lambda expression has acquired local copies of all its values and is self-contained, when in reality, any global or static variables are used as if captured by-reference
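The sketch below illustrates the point about static variables; the function name and the variable factor are hypothetical. Although the lambda uses [=], only x is copied into the closure, while factor is read directly from the static object at the time the closure is called;-

```cpp
inline int scale_with_static(int x, int new_factor)
{
    static int factor = 2;

    // x is captured by value; factor is static and so is NOT captured
    auto scale = [=] { return x * factor; };

    // Changing the static after the closure is created still affects it
    factor = new_factor;
    return scale();
}
```

A call such as scale_with_static(5, 10) therefore uses the updated factor rather than the value it held when the lambda was defined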

Generic Lambda Expressions

A lambda expression may take arguments of type auto. For example;-

auto is_lt = [](const auto& p1, const auto& p2) { return p1 < p2; };

The above lambda expression will take two arguments of any type as long as the operator < may be applied to them

Note also the use of an auto return type. In the above example, the result will always be of type bool but this may not always be the case as shown in the following example;-

auto get_least = [](const auto& p1, const auto& p2) { return (p1 < p2) ? p1 : p2; };

Another example;-

template<typename T> int fn(T a); auto c = [](auto b) { fn(b); };

Variadic Lambda Expressions

Lambda expressions can be variadic;-

auto d = [](auto&&... args) { return fn(std::forward<decltype(args)>(args)...); };

This example also uses generic arguments, and may be called like any other variadic function
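Here is a short sketch of generic and variadic lambdas in use. The function sum3() is a hypothetical target for the forwarded arguments;-

```cpp
#include <utility>

// Hypothetical target function for the forwarded arguments
inline int sum3(int a, int b, int c) { return a + b + c; }

// Generic lambda: accepts any two types that support operator<
const auto is_lt = [](const auto& p1, const auto& p2) { return p1 < p2; };

// Variadic generic lambda: perfectly forwards all arguments to sum3()
const auto call_sum3 = [](auto&&... args)
{
    return sum3(std::forward<decltype(args)>(args)...);
};
```

The closure type's operator() is a member function template in both cases, so each distinct set of argument types produces its own instantiation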

Classes

A class is a user-defined type that provides a framework for defining data elements along with the functions and operations that relate directly to the data elements. A class allows a specific concept to be encapsulated into a single entity with a finite well-defined set of interfaces that abstract-away the internal complexities of the concept

Compartmentalising functionality in this way also helps the compiler detect incorrect use, and improves subsequent understanding

A class is declared like this;-

class class-name
{
    // Class members are declared here
};

Here is an example of a simple class;-

class X
{
private:
    // All members declared here represent the implementation
    int m;                        // A data member
public:
    // All members declared here represent the interface
    X(int a) : m{a} {}            // A constructor
    void put_m(int a);            // A member function declaration
    int get_m() { return m; }     // A member function definition
};

…and a simple use of it;-

// Create an object of type X and explicitly initialise it X my_x{42}; // Call a member function int x_m = my_x.get_m();

Empty Base Class Optimisation

There is a caveat to the rule stating that sizeof(X) is always > zero even if X is an empty class. That is, a compiler may optimize-away the non-zero size of an empty base class. This allows an empty base class to be used without any overhead. This is referred to as the Empty-Base Optimisation or EBO
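The effect can be observed with sizeof. The following sketch uses hypothetical type names and assumes a mainstream compiler; the exact sizes are implementation-defined;-

```cpp
// An empty class still has a non-zero size when used on its own
struct Empty {};

// Held as a member, the empty object occupies space (plus padding)
struct HoldsEmpty
{
    Empty e;
    int i;
};

// Used as a base, the empty class typically adds nothing to the size
struct DerivesFromEmpty : Empty
{
    int i;
};

static_assert(sizeof(Empty) >= 1, "an empty class has a non-zero size");
```

On common implementations sizeof(DerivesFromEmpty) equals sizeof(int), whereas sizeof(HoldsEmpty) is larger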


Don't make assumptions about the internal layout of objects in memory. The address of a derived class object may not be the same as the address of its base. Members may not be laid-out in memory in the order specified in the source code

The exception is that of POD types

Class Structure

Member Access Control

All members of a class fall into one of three access control specifiers;-

Specifier     Description
private:      Member is visible to member functions of the same class and to friends of the class
protected:    Member is visible to member functions of the same class and to friends of the class, and to any member functions and friends of derived classes
public:       Member is visible to all, including from outside the class

The specifiers are used like this;-

class X { private: // All members declared here will be private protected: // All members declared here will be protected public: // All members declared here will be public };

Take special care when using protected. Such members are more open to abuse than private members

In almost all cases, a protected interface should be restricted to exposing internal types, member functions and constants/enumerations. The need to define protected data members is usually a sign of a design error

Like exposing too many public members, protected members can easily lead to maintenance issues because of the scope of external access and external reliance on the members

Notwithstanding the above, protected member functions can sometimes provide an efficient and more closely controlled implementation platform for derived classes than could be achieved by other means

An example that makes use of protected is also available

Member Access

If a class is derived from multiple base classes, then ambiguities can arise that need to be resolved. For example;-

class Z { public: int m; };
class Y { public: int m; };
class X : public Y, public Z { };
X a;

If a reference were made to a.m then it would be unclear as to which m was being specified. One method would be to qualify the reference. For example, a.Z::m or a.Y::m
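A minimal sketch of the qualified form, assuming public inheritance so that both members are accessible from outside the class (sum_both() is a hypothetical helper);-

```cpp
struct Z { int m = 1; };
struct Y { int m = 2; };
struct X : Y, Z {};

// a.m alone would be ambiguous; qualifying with the base class name
// selects which inherited member is meant
inline int sum_both(const X& a)
{
    return a.Y::m + a.Z::m;
}
```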

There are other (and often better) methods as well

Type Members

Nested Enumerations

Enumerations may be defined within a class like any other type;-

class X
{
    enum colours { red, blue, yellow };   // Ok
    enum animals;                         // Error: Declaration with no base type specified
    enum fruit : unsigned char;           // Ok: Declaration with base type specified
    enum class buildings;                 // Ok: enum class declaration
};

// External definitions of the valid X enum declarations above
enum X::fruit : unsigned char { orange, apple, banana };
enum class X::buildings { house, shed, igloo };

Nested Classes

Data Members

A data member is declared exactly like any non-member type. For example;-

class Y {}; class X { int a; // A member of type int Y b; // A member of type Y };

Static Data Members

A class data member may be declared static;-

class X { static int m1; // A static data member }; // Define m1 and give it a default value int X::m1{42};

One use of static members is to hold a default initial value for an object. This can be set statically or via some set() function and picked-up by the class' default constructor, eg, X{} and used to initialise any new object

One consequence of this approach is that it is not necessary to provide a separate function to read the default value; simply creating a default-constructed object is sufficient
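A sketch of this technique follows. The Temperature class and its member names are hypothetical;-

```cpp
class Temperature
{
    static int default_value;    // Shared default for all new objects
    int value;
public:
    // The default constructor picks up the current default
    Temperature() : value{default_value} {}
    explicit Temperature(int v) : value{v} {}

    // Change the default used by subsequent default constructions
    static void set_default(int v) { default_value = v; }

    int get() const { return value; }
};

// Define the static member and give it an initial value
int Temperature::default_value{20};
```

Reading the current default is then simply a matter of writing Temperature{}.get()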

Member Functions

A class member function is within the scope of the parent class, has access to all members of the class, and must be invoked with reference to an object of the class type (ie, it has a this pointer)


const Member Functions


Function Reference Qualifiers

To make a function accept only an lvalue argument, declare it with a non-const lvalue reference such as void fn(X& a);. Similarly, to make a function accept only an rvalue argument, declare it with an rvalue reference such as void fn(X&& a);

Function reference qualifiers specify the same criteria but apply it to a member function's host object;-

class Y {};

class X
{
private:
    Y m{};
public:
    // Returns a copy of this 'X'
    X get_x() { return *this; }

    // This version of fn() will be used if *this is an lvalue
    // It uses copy semantics
    Y fn() & { return m; }                // (1)

    // This version of fn() will be used if *this is an rvalue
    // It uses move semantics
    Y fn() && { return std::move(m); }    // (2)
};

// Use the above functions
X a{};
Y b = a.fn();           // Invokes (1)
Y c = a.get_x().fn();   // Invokes (2)
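To see which overload is selected, here is a small sketch using a hypothetical Probe type whose two overloads simply report the value category of the host object;-

```cpp
#include <string>
#include <utility>

struct Probe
{
    // Selected when the host object is an lvalue
    std::string which() &  { return "lvalue"; }

    // Selected when the host object is an rvalue
    std::string which() && { return "rvalue"; }
};

// Returns a temporary Probe
inline Probe make_probe() { return Probe{}; }
```

A named object selects the & overload, while a temporary (or the result of std::move()) selects the && overload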


Values passed to functions are only eligible for implicit type conversion if they are listed in the function's argument list. The upshot of this is that for a member function, this is never eligible for implicit type conversion. This is why;-

class X
{
public:
    X();
    X(int a);
    X operator+(const X& rhs);
};

class Y
{
public:
    operator X();   // Conversion function from 'Y' to 'X'
};

X a{};
Y b{};

X c = a + 3; // Ok: This works because '3' is implicitly converted to an 'X'
X d = 3 + a; // Error: This fails because the built-in type 'int' does not
             // support conversion of 'X' to 'int', and the alternative of '3'
             // being converted to an 'X' fails because '3' is 'this', and
             // therefore is not implicitly converted to an 'X', despite 'X'
             // declaring a suitable constructor, X(int a)
X e = a + b; // Ok: This works because 'b' is implicitly converted to an 'X'
X f = b + a; // Error: This fails because 'b' is 'this', remains a type 'Y',
             // and is not implicitly converted to an 'X', despite 'Y'
             // declaring a suitable conversion operator
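One common way around this restriction is to make the operator a non-member function, so that both operands appear in the argument list and are eligible for implicit conversion. A minimal sketch, using a hypothetical Num class;-

```cpp
class Num
{
    int v;
public:
    Num(int a = 0) : v{a} {}    // Deliberately not explicit
    int value() const { return v; }
};

// Non-member: both lhs and rhs are ordinary arguments, so either may be
// implicitly converted from int via Num(int)
inline Num operator+(const Num& lhs, const Num& rhs)
{
    return Num{lhs.value() + rhs.value()};
}
```

With this arrangement both a + 3 and 3 + a compile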

Avoid returning/creating "handles" to object internals

A handle can take the form of a reference, pointer, iterator, or some other type. Returning a handle to a type's internal representation from a member function breaks encapsulation, especially if the handle refers to a private data member (or member function). When returned from a const member function, it also negates the ethos of the const because the caller can use the returned handle to change the internal representation

A const handle could be returned, and doing so controls the extent of the encapsulation breach somewhat; it may be perfectly reasonable for the application to recover details of the type's internals (at least some of them). However, if the returned reference outlives the member it refers to then undefined behaviour will result if the reference is used. This situation is almost inevitable if the host object is a temporary formed as part of an expression. The temporary shall be destroyed at the end of the expression leaving any handle that was acquired from it as part of the expression, dangling. The same applies if the caller forms a handle to a temporary object that was returned by value. For example;-

// The 'X' object shall be destroyed after get_ptr_to_y_member() // returns, leaving 'hndl1' dangling Y* hndl1 = X{}.get_ptr_to_y_member(); // Here, the 'Y' object returned from get_copy_of_y_member() shall be // destroyed at the end of the expression, leaving 'hndl2' dangling Y* hndl2 = &X{}.get_copy_of_y_member();

Of course, some functions must return a handle, such as operator[] and operator-> but such functions are the exception

'this' Pointer

Deleting Member Functions

It is possible to delete a member function by specifying = delete. For example;-

class X
{
public:
    void fn() = delete; // Delete fn(); must appear on the first declaration
};

This is most often used to delete unwanted compiler-generated special member functions, but it can be used to delete other functions as well. Here are some other common cases;-
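As a sketch of two such uses (the Handle class is hypothetical), the following deletes the copy operations to make a type non-copyable, and deletes an overload to reject an unwanted implicit conversion;-

```cpp
#include <type_traits>

class Handle
{
    int value = 0;
public:
    Handle() = default;

    // Make the type non-copyable
    Handle(const Handle&) = delete;
    Handle& operator=(const Handle&) = delete;

    void set(int v) { value = v; }
    void set(double) = delete;    // Reject set(3.14) instead of truncating

    int get() const { return value; }
};

static_assert(!std::is_copy_constructible<Handle>::value,
              "Handle cannot be copy-constructed");
```

A deleted function still takes part in overload resolution, which is why set(3.14) is an error rather than a silent conversion to int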


Static Member Functions

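A class member function may be declared static. A static member function is within the scope of the parent class and has access to all members of the class, but it is not invoked on an object (ie, it has no this pointer), so it can only refer to non-static members through an explicitly supplied object;-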

class X
{
    static int m1;           // A static data member
    int m2;                  // A non-static data member
    static void fn(int a);   // A static member function
};

// Define fn()
void X::fn(int a)
{
    m1 = a; // Ok
    m2 = a; // Error: fn() has no access to non-static members
}

Friends

A friend function has access to all members of the specified class but is not a member of the class and is not within the scope of the class. That is, the class explicitly grants the function access to its protected and private members

A friend is most useful when operations involving two different classes are required. For example, given the classes X and Y, it may be required to provide an addition function of the form a + b (where a is of type X and b is of type Y)

One way of handling this would be to provide an operator+ member function for X that takes a Y as a second argument. This would work but adds a dependency on Y for X

A non-member function that takes an X and Y argument could also work but without direct access to the internals of both types, it could suffer from efficiency issues

A friend function expands on this last option by allowing direct access to the internals of both types. For example;-

class Y; // Forward declaration so that X's friend declaration can name Y

class X
{
    friend X operator+(const X& lhs, const Y& rhs);
};

class Y
{
    friend X operator+(const X& lhs, const Y& rhs);
};

// Define the friend operator+ function
X operator+(const X& lhs, const Y& rhs) { /* ...can directly access members of lhs and rhs */ }

// Use our operator function
X a;
Y b;
X c = a + b;

Friend functions are often thought of as a kludge or a symptom of a poor design. This is a misunderstanding of how they should be viewed. Think of them more as a complement/alternative to a member public interface. In some cases, they can provide the cleanest solution to a class inter-dependency requirement
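Here is a complete sketch of a friend operator+ shared by two classes. The names Total and Delta are hypothetical;-

```cpp
class Delta;   // Forward declaration, needed by Total's friend declaration

class Total
{
    int value;
public:
    explicit Total(int v) : value{v} {}
    int get() const { return value; }
    friend Total operator+(const Total& lhs, const Delta& rhs);
};

class Delta
{
    int amount;
public:
    explicit Delta(int a) : amount{a} {}
    friend Total operator+(const Total& lhs, const Delta& rhs);
};

// Being a friend of both classes, the function can read the private
// members of each operand directly
Total operator+(const Total& lhs, const Delta& rhs)
{
    return Total{lhs.value + rhs.amount};
}
```

Note that Delta needs no public accessor for amount; the friend declaration is the only access it grants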

Friend classes allow closely related concepts to be expressed. A complex pattern of friends is a likely indication of a design error

Invariants

An invariant is a condition or state that we can assert to be true at all times. In the case of a class, it essentially refers to the notion of a guarantee that a class object always contains sane and consistent values. The invariant is established by the class constructor, is maintained by all functions that operate on the class and exists until the object is destroyed. This concept is central to the ethos of a class and is especially important for exception handling
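A minimal sketch of a class that establishes and maintains an invariant. The Percentage class, its range and the use of std::out_of_range are assumptions made for illustration;-

```cpp
#include <stdexcept>

class Percentage
{
    int value;   // Invariant: 0 <= value <= 100

    // Single point of validation used by every mutating operation
    static int checked(int v)
    {
        if (v < 0 || v > 100) throw std::out_of_range("Percentage");
        return v;
    }
public:
    // The constructor establishes the invariant, or fails by throwing
    explicit Percentage(int v) : value{checked(v)} {}

    // Validation happens before assignment, so a failed set() leaves
    // the object in its previous (valid) state
    void set(int v) { value = checked(v); }

    int get() const { return value; }
};
```

Because no Percentage can exist with an out-of-range value, code that receives one never needs to re-check it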

Connected to this is the idea of preconditions and postconditions with regard to functions

Class Life Cycle

An object's life begins at the end of its construction and continues until the start of its destruction. Between these two operations, the object may be copied (to create a new object) and moved from one place to another

The constructor, copy, move and destructor operations work together; to be efficient and error-free, they must be considered as a logically connected whole, rather than as four individual parts

For details of what happens if an object is only partially constructed or partially destroyed, see Throwing An Exception

There are five situations in which an object is copied or moved (unless the operation is optimised-away);-

Special Member Functions

The compiler may generate some functions automatically. These functions are called special member functions. Here is a summary of them (where X is a user-defined class);-

Function Prototype              Description
X();                            Default constructor; initialise an object
X(argument-list);               Constructor; initialise an object
X(const X& rhs);                Copy constructor
X& operator=(const X& rhs);     Copy assignment; clean-up target and copy
X(X&& rhs);                     Move constructor
X& operator=(X&& rhs);          Move assignment; clean-up target and move
~X();                           Destructor; clean-up

There are some inter-dependency rules that dictate when the compiler will and will not generate the special member functions. Much of the rationale here is that if a user-defined version of these functions is defined, then it is almost always because there is some special resource management to be done. This being the case, that 'special resource management' probably needs doing for all the functions (or at least the function's 'twin'). Unfortunately, legacy support issues prevent these rules being as consistent as they ought to be;-

The above rules do not apply to member function templates for constructor and/or copy/move operations. Consider;-

class X
{
    // ...
public:
    template<typename T> X(T& rhs);
    template<typename T> X& operator=(T& rhs);
};

Adopting a policy of always declaring the copy and move operations and the destructor as = default, and changing them as and when necessary, can avoid some subtle future problems. One of these problems is the silent loss of compiler-generated move operations if a copy operation or destructor is added; the resulting code will probably (but not necessarily) still work but with a performance hit
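The silent loss of the move operations can be demonstrated with a sketch such as the following, where the hypothetical Tracer member counts how many times it has been copied;-

```cpp
#include <utility>

// Hypothetical member type that records how many times it has been copied
struct Tracer
{
    int copies = 0;
    Tracer() = default;
    Tracer(const Tracer& rhs) : copies{rhs.copies + 1} {}
    Tracer(Tracer&& rhs) noexcept : copies{rhs.copies} {}
    Tracer& operator=(const Tracer&) = default;
    Tracer& operator=(Tracer&&) = default;
};

// User-declared destructor: no move operations are generated, so a
// "move" silently falls back to the (still generated) copy constructor
struct WithDestructor
{
    Tracer t;
    ~WithDestructor() {}
};

// Everything declared = default: the move operations are retained
struct AllDefaulted
{
    Tracer t;
    AllDefaulted() = default;
    ~AllDefaulted() = default;
    AllDefaulted(const AllDefaulted&) = default;
    AllDefaulted(AllDefaulted&&) = default;
    AllDefaulted& operator=(const AllDefaulted&) = default;
    AllDefaulted& operator=(AllDefaulted&&) = default;
};
```

Constructing a WithDestructor from std::move(a) compiles without complaint but performs a copy, whereas the same operation on an AllDefaulted object performs a move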

It may be useful to point out that this;-

struct X { int a{}; int b{}; X() {} // Default constructor ~X() noexcept {} // Destructor X(const X& rhs) : a{rhs.a}, b{rhs.b} {} // Copy constructor X(X&& rhs) : a{move(rhs.a)}, b{move(rhs.b)} {} // Move constructor X& operator=(const X& rhs) // Copy assignment { a = rhs.a; b = rhs.b; return *this; } X& operator=(X&& rhs) // Move assignment { a = move(rhs.a); b = move(rhs.b); return *this; } };

…is equivalent to;-

struct X { int a{}; int b{}; X() = default; // Default constructor ~X() = default; // Destructor X(const X& rhs) = default; // Copy constructor X(X&& rhs) = default; // Move constructor X& operator=(const X& rhs) = default; // Copy assignment X& operator=(X&& rhs) = default; // Move assignment };

Special member functions and the pimpl idiom

When using the pimpl idiom, define special member functions in the implementation file; not the header file (assuming use of std::unique_ptr)

The pimpl idiom essentially moves the implementation out of the (primary) class into some other (implementation) class. The original (primary) class then becomes just a 'handle' and (generally) only contains a single member that references the implementation class

The removal of the dependency on the types used by the implementation often leads to incomplete types being declared in the header file. This in itself is not a problem, but it can cause problems for the primary class' special member functions if defined in the header file. Such functions only 'see' the incomplete data types and this often leads to compiler errors. Compiler-generated special functions are implicitly inline and therefore can unwittingly cause this problem too

Here are some cases that explain what can happen and how to fix each one;-

Note: None of the above precautions are necessary if the implementation class is held by a std::shared_ptr; it is perfectly safe to implement the special functions in the primary class' header and for them to use incomplete types. This works because of the different way a std::shared_ptr implements custom deleters, compared to a std::unique_ptr; the deleter is part of a std::unique_ptr's type, which makes std::unique_ptr types more efficient but at the cost of requiring a complete type at the point of destruction
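For the std::unique_ptr case, the arrangement might look like the following single-file sketch. The comments mark where the hypothetical header and implementation files would be split; the Widget and Impl names are assumptions;-

```cpp
#include <memory>
#include <utility>

// ---- widget.h (hypothetical header) ----
class Widget
{
public:
    Widget();
    ~Widget();                            // Declared only; Impl is incomplete here
    Widget(Widget&&) noexcept;            // Likewise for the move operations
    Widget& operator=(Widget&&) noexcept;
    int value() const;
private:
    struct Impl;                          // Incomplete type
    std::unique_ptr<Impl> pimpl;
};

// ---- widget.cpp (hypothetical implementation file) ----
struct Widget::Impl { int value = 42; };  // Impl is now a complete type

Widget::Widget() : pimpl{new Impl{}} {}
Widget::~Widget() = default;              // Defined where Impl is complete
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;
int Widget::value() const { return pimpl->value; }
```

The special member functions can still be = default; what matters is that their definitions appear after the definition of Impl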

User-Provided Functions

The term User-Provided only applies to Special Member Functions

A function is not User-Provided if it is not explicitly declared at all in the class declaration, or it is declared as = default or = delete in the class declaration

Note that final point! A function may be declared as = default in the declaration;-

struct X { X() = default; };

…or in the definition;-

// Declaration struct Y { Y(); };
// Implementation Y::Y() = default;

In the above examples, X::X() is not considered User-Provided, but Y::Y() is
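One observable difference is triviality, which can be checked with the standard type traits. The sketch below repeats the two declarations so that it is self-contained;-

```cpp
#include <type_traits>

// Defaulted on first declaration: NOT user-provided
struct X { int m; X() = default; };

// Declared in the class, defaulted in the definition: user-provided
struct Y { int m; Y(); };
Y::Y() = default;

static_assert(std::is_trivially_default_constructible<X>::value,
              "X's default constructor is not user-provided");
static_assert(!std::is_trivially_default_constructible<Y>::value,
              "Y's default constructor is user-provided");
```

It also affects value-initialisation: X{} zero-initialises m, whereas Y{} runs the user-provided constructor and leaves m with an indeterminate value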

Whether or not a constructor is User-Provided can have important (possibly adverse) effects on initialisation; see this example

To avoid possible serious issues with initialisation;-

Constructors

A constructor is a member function that provides a mechanism for always ensuring an object of a particular class is initialised. For example;-

class X { public: // Some constructors... X(); X(int a); X(int a, float b, bool c = false); };

As long as the pattern of establishing an invariant from the very start is adhered to, and following through with other member functions maintaining that invariant, the class members do not have to handle the possibility of uninitialised data members or data members with illegal values; something that can simplify the code a great deal

Copy Constructors

See Copying

Move Constructors

See Moving

Constant Expression Constructors

A constructor can be declared constexpr. This allows a user-defined literal type to be defined. For example;-

class X { int m1; float m2; public: constexpr X(int a = 0, float b = 0.0) : m1{a}, m2{b} {} }; // Create some literals of type X X a{3, 83.7}; constexpr X b{8, 42.9}; // Guaranteed compile-time construction
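That construction really does happen at compile time can be checked with static_assert, as in this sketch (the Point class is hypothetical);-

```cpp
class Point
{
    int x_;
    int y_;
public:
    constexpr Point(int x = 0, int y = 0) : x_{x}, y_{y} {}
    constexpr int sum() const { return x_ + y_; }
};

// A constexpr object must be initialised by a constant expression
constexpr Point p{3, 4};

// Only compiles if p was constructed and sum() evaluated at compile time
static_assert(p.sum() == 7, "constructed and evaluated at compile time");
```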

explicit Constructors

If required, when a constructor is invoked its argument(s) are subject to implicit type conversion just like any other function. If the constructor takes exactly one argument then this implicit type conversion can lead to obscure but legal uses. For example;-

// A class definition with a constructor taking a single argument class X { public: X(int a); }; // A function that takes type X as an argument void fn(X a); X b{83}; // Explicit construction X c = {83}; // Implicit (copy) construction // Call fn() fn(42); // ...or... fn({42});

In the above example, the call fn(42) implicitly constructs an X object, initialises it with 42 and then passes the object to fn(). However, taken in isolation this is not at all clear from the simple call fn(42)

To prevent implicit type conversion and obscure uses such as the above example, define the constructor explicit. For example;-

class X { public: explicit X(int a); }; X b{83}; // Explicit construction X c = {83}; // Error: Implicit (copy) construction // These two calls will now fail (implicit conversion) fn(42); fn({42}); // To call fn(), we must now be explicit in our construction of X fn(X{42});
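The difference can also be expressed with the standard type traits. In this sketch, Loose and Strict are hypothetical types that differ only in whether the constructor is explicit;-

```cpp
#include <type_traits>

struct Loose  { Loose(int) {} };
struct Strict { explicit Strict(int) {} };

// Implicit conversion from int is available only for Loose
static_assert(std::is_convertible<int, Loose>::value,   "implicit conversion allowed");
static_assert(!std::is_convertible<int, Strict>::value, "implicit conversion blocked");

// Explicit construction is still available for both
static_assert(std::is_constructible<Loose, int>::value,  "explicit construction allowed");
static_assert(std::is_constructible<Strict, int>::value, "explicit construction allowed");
```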

initializer_list Constructors

An initialiser-list constructor is one that takes a single argument of type std::initializer_list<T>

Such a constructor is used to initialise a class using a list {} as an initialiser value. For example;-

vector<int> v1{1, 3, 5, 7, 11};

Here is an example of how a container class might define and use an initializer_list constructor;-

template<class E> class Vector
{
public:
    Vector(std::initializer_list<E> s);   // initializer_list constructor
    // ...
private:
    int sz;
    E* elem;
};

template<class E> Vector<E>::Vector(std::initializer_list<E> s)
    : sz{static_cast<int>(s.size())}                  // Set vector size
{
    reserve(sz);                                      // Acquire some space
    uninitialized_copy(s.begin(), s.end(), elem);     // Initialise all elements in elem
}

An initializer_list constructor provided with an empty {} list that does something different to a default constructor is probably a design fault
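The following is a small, self-contained sketch of a container with an initializer_list constructor. It is not the Vector above; the SmallVec name and the use of std::unique_ptr for storage are assumptions made to keep the example short;-

```cpp
#include <algorithm>
#include <cstddef>
#include <initializer_list>
#include <memory>

template<class E>
class SmallVec
{
    std::size_t sz;
    std::unique_ptr<E[]> elem;
public:
    // initializer_list constructor: allocate, then copy the listed values
    SmallVec(std::initializer_list<E> s)
        : sz{s.size()}, elem{new E[s.size()]}
    {
        std::copy(s.begin(), s.end(), elem.get());
    }

    std::size_t size() const { return sz; }
    const E& operator[](std::size_t i) const { return elem[i]; }
};
```

An object can then be created with a brace-enclosed list, for example SmallVec<int> v{1, 3, 5};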

Delegating Constructors

Often, overloaded constructors need to do the same thing. Options for dealing with this are;-

Here is an example of delegating to another constructor;-

class X
{
    int m1;
    int m2;    // Default-initialised in this example
public:
    // A constructor that checks its argument before assigning it to m1
    X(int a) { if (a > 0 && a < 100) m1 = a; else throw Bad_X(a); }
    // A constructor that invokes the previous constructor
    X() : X{42} {}
};

Delegation is NOT the same as explicitly calling a constructor like this;-

X() { X{42}; }

What this does is create a new unnamed object and then does nothing with it
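A runnable sketch of delegation follows. The Level class, its valid range and the use of std::out_of_range are assumptions for illustration;-

```cpp
#include <stdexcept>

class Level
{
    int m;
public:
    // The target constructor performs the validation
    explicit Level(int a) : m{a}
    {
        if (a <= 0 || a >= 21) throw std::out_of_range("Level");
    }

    // The default constructor delegates, so the check is written only once
    Level() : Level{10} {}

    int get() const { return m; }
};
```

A delegating constructor's initialiser list may contain only the delegation; it cannot also initialise members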

Conversion Constructors

Note: This section does not describe a particular variation of a constructor; what is described is really just a normal constructor. What is highlighted here is a particular concept of using a constructor

A constructor may be defined that takes a number of arguments of various types and builds a user-defined type X from them. For example;-

class X { int m1; float m2; public: // Digression: These first two constructors could be eliminated if the last // constructor had default argument values X() { m1 = 0; m2 = 0.0; } X(int a) { m1 = a; m2 = 0.0; } X(float b) { m1 = 0; m2 = b; } X(int a, float b) { m1 = a; m2 = b; } };

Unless a constructor is defined explicit, it may also be invoked implicitly;-

void fn(X a);
fn({});          // Uses X()
fn({1});         // Uses X(int a)
fn({2.3f});      // Uses X(float b)
fn({4, 5.6f});   // Uses X(int a, float b)

In this way, a constructor can act as a conversion function at the cost of creating a new object

By relying on these constructors, we are able to reduce these operators;-

X operator+(X, X); X operator+(X, int); X operator+(int, X);

…to just one (the first one). The conversion provided by the constructor X(int a) will create an X-type object thus eliminating the need for the second and third functions

Destructors

A destructor is a member function that provides a mechanism for dismantling an object and releasing any resources it may hold. It must complement the constructor exactly if problems are to be avoided. It is declared thus;-

class X { public: ~X(); // Destructor };

Virtual destructors

When and when not to use a virtual destructor

Base And Member Construction And Destruction

Constructors and destructors need to handle class hierarchies. In this respect, a constructor builds a class from the "bottom up";-

A destructor dismantles a class in the reverse order;-

This procedure ensures that no members are accessed before they are initialised or after they have been destroyed. See also Destructors
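The ordering can be observed with a sketch such as this one, in which each constructor and destructor appends a letter to a shared string (trace_log() and the type names are hypothetical);-

```cpp
#include <string>

// Shared trace of construction/destruction events
inline std::string& trace_log()
{
    static std::string s;
    return s;
}

struct Member
{
    Member()  { trace_log() += "M"; }
    ~Member() { trace_log() += "m"; }
};

struct Base
{
    Base()  { trace_log() += "B"; }
    ~Base() { trace_log() += "b"; }
};

struct Derived : Base
{
    Member mem;
    Derived()  { trace_log() += "D"; }
    ~Derived() { trace_log() += "d"; }
};
```

Creating and then destroying one Derived object records the base first, then the member, then the derived constructor body, followed by the exact reverse for destruction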

See also derived classes and slicing


A derived class' constructors must explicitly pass initialisation arguments down to its base class(es). If it does not then the base class(es) shall be constructed with its default constructor (if one is defined; if not, then a compiler error shall result). Default construction of the base may or may not be what is wanted. For example;-

class Y
{
    int m1{};
public:
    Y() {}                                      // Default constructor (1)
    Y(int a) : m1{a} {}                         // (2)
    Y(const Y& rhs) : m1{rhs.m1} {}             // Copy constructor (3)
    Y& operator=(const Y& rhs)                  // Assignment operator (A1)
        { m1 = rhs.m1; return *this; }
};

class X : public Y
{
    int m2{};
public:
    X() {}                                      // Default constructor (4)
    X(int b) : m2{b} {}                         // (5)
    X(int a, int b) : Y{a}, m2{b} {}            // (6)
    X(const X& rhs) : Y{rhs}, m2{rhs.m2} {}     // Copy constructor (7)
    X& operator=(const X& rhs)                  // Assignment operator (A2)
        { Y::operator=(rhs); m2 = rhs.m2; return *this; }
};

X c{};     // Calls (4), and implicitly (1); 'm1' default-initialised
X d{42};   // Calls (5), and implicitly (1); 'm1' default-initialised
X e{3, 8}; // Calls (6), and (2); 'm1' initialised to '3'
X f{e};    // Calls (7), and Y's copy constructor (3); 'm1' initialised to '3'
c = e;     // Calls (A2), which in turn calls Y's copy-assignment operator (A1);
           // 'c.m1' set to '3', 'c.m2' set to '8'

Direct Member Initialisation

A class defined without a constructor may be initialised with a list of values for its members just like a struct can. It is valid for memberwise, default and copy initialisation;-

class Z { public: int m1; float m2; string m3; }; // Memberwise initialisation Z a{42, 3.14, "Hello"}; // m1 = 42, m2 = 3.14, m3 = "Hello" // Default initialisation (when specified without an initialiser or with an empty {}) Z b{}; // Equiv. to 'Z b{{}, {}, {}}' which equates to 'Z b{{0}, {0.0}, {""}}'; // m1 = 0, m2 = 0.0, m3 = "" // Copy initialisation Z c{a}; // m1 = 42, m2 = 3.14, m3 = "Hello"

Data Member Initialisation

A constructor may initialise its members via code within the body;-

class Z { int m; public: Z(int a) { m = a; } };

Member Initialiser List

Constructor arguments may be passed on for base and member initialisation directly as a member initialiser list, using a qualifying : notation. For example;-

// Base class class X { public: X(int a) { /* ... */ } }; // Class with a base class class Z : X { int m1; float m2; // Default initialised public: Z(int a, int b) : X{a}, m1{b} { /* ... */ } };

To avoid confusion, list members in constructor initialisation lists in the same order as the members are declared. Many compilers will issue a warning if this is not the case

In-Class Member Initialisation

Class data members may be initialised with default values just like any other object;-

class Y
{
    // Set default values for all Y's data members
    int m1{42};
    float m2{3.1416};
public:
    // Some constructors that may perform varying amounts of explicit initialisation
    Y() {}
    Y(int a) : m1{a} {}
    Y(int a, float b) : m1{a}, m2{b} {}
};

// Create some objects of above type...
Y a{};               // m1 == 42, m2 == 3.1416
Y b{83};             // m1 == 83, m2 == 3.1416
Y c{83, 6.626e-34};  // m1 == 83, m2 == 6.626e-34

In this example, an object of type Y may be constructed with various combinations of specific values. For member initialisation not explicitly dealt with in each constructor though, the member(s) shall be initialised with default values

In-class initialisation is a good method for reducing clutter and ensuring consistency in the constructors if initialisation to the same values is required in each case. It becomes more useful as the constructors increase in number and complexity

Copying

A copy operation of the form x = y sets the value of x to be equal to that of y and leaves y unaffected

There are two copy operations; the copy constructor and the copy assignment operator;-

X(const X& rhs); // Copy constructor X& operator=(const X& rhs); // Copy assignment, pass-by-reference X& operator=(X rhs); // Copy assignment, pass-by-value (this version will not be compiler-generated)

Example invocations for the above are;-

X a{}; // Default construction X b{a}; // Copy construction b = a; // Copy assignment

A copy operation should always copy all object members. Failure to do so can be hidden because the 'gaps' will often be filled by appropriate default constructors. Chances of error increase for types with many members and/or types that are likely to be modified in the future

This is one major reason to use the default copy operations wherever possible; they don't forget!

Slicing can also be a cause of errors as can notions of "deep" and "shallow" copy (or similar)

Sometimes, a class may contain members that are not considered part of its value; that is, they do not affect the == operation. Whether these members are copied or not depends on the type

Take care to ensure that all copy operations provide the properties of equivalence and independence. Not providing these properties can be the cause of some nasty bugs. Also note that the standard library relies on these two properties for correct operation
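The two properties can be sketched as follows. The type samples and the two checking functions are hypothetical; std::vector already copies deeply, so the compiler-generated copy operations satisfy both properties here;-

```cpp
#include <cassert>
#include <vector>

// Hypothetical value type for illustration only.
struct samples {
    std::vector<int> values;
};

bool operator==(const samples& a, const samples& b) { return a.values == b.values; }

// Equivalence: immediately after a copy, the copy compares equal to the original.
bool copy_is_equivalent(const samples& x) {
    samples y = x;
    return x == y;
}

// Independence: modifying the copy leaves the original unchanged.
bool copy_is_independent(const samples& x) {
    const samples before = x;
    samples y = x;
    y.values.push_back(99);
    return x == before && !(x == y);
}

void demo() {
    samples s{{1, 2, 3}};
    assert(copy_is_equivalent(s));
    assert(copy_is_independent(s));
}
```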

Breaking the independence property should be avoided as such use often requires some form of garbage-collection and/or handler object such as a shared_ptr

The independence property can usually be safely relaxed for mutable members; maybe implementing some form of copy-on-write scheme if required

Although it is possible to define the copy (and move) operators with different argument and return types such as X(X& rhs); or const X& operator=(volatile X& rhs);, don't do it. It will only cause confusion and error

Copy-to-Self

Copy-to-self can easily happen unintentionally as a result of using aliases or pointers. Another possibility is that two pointers of differing types, one a pointer to a base class of the other's type, actually refer to the same object

One obvious technique for handling copy-to-self is to compare the resource handles and act appropriately

Another technique provides safe copy-to-self while avoiding explicit checking
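Both techniques can be sketched as below. The types checked and unchecked are hypothetical and each owns a single int; the second is only one possible way of avoiding the explicit check (copy the new resource first, release the old one last);-

```cpp
#include <cassert>

// Technique 1: compare the resource handles and do nothing on copy-to-self.
class checked {
    int* p;
public:
    explicit checked(int v) : p{new int{v}} {}
    checked(const checked& rhs) : p{new int{*rhs.p}} {}
    ~checked() { delete p; }

    checked& operator=(const checked& rhs) {
        if (p != rhs.p) {                  // same handle means same object
            int* fresh = new int{*rhs.p};
            delete p;
            p = fresh;
        }
        return *this;
    }
    int value() const { return *p; }
};

// Technique 2: no explicit check. The source is copied before the old
// resource is released, so copy-to-self is merely a wasted copy.
class unchecked {
    int* p;
public:
    explicit unchecked(int v) : p{new int{v}} {}
    unchecked(const unchecked& rhs) : p{new int{*rhs.p}} {}
    ~unchecked() { delete p; }

    unchecked& operator=(const unchecked& rhs) {
        int* fresh = new int{*rhs.p};      // may throw; *this is still intact
        delete p;
        p = fresh;
        return *this;
    }
    int value() const { return *p; }
};

void demo() {
    checked a{7};
    checked& alias_a = a;
    a = alias_a;                           // copy-to-self via an alias
    assert(a.value() == 7);

    unchecked b{9};
    unchecked& alias_b = b;
    b = alias_b;
    assert(b.value() == 9);
}
```

Had unchecked deleted p before reading *rhs.p, copy-to-self would read freed memory; the ordering is what makes the check unnecessary.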

Slicing

A pointer to a derived class is implicitly convertible to a pointer to any of its public base classes

It is therefore possible to perform a copy operation in terms of two base-class pointers. For example, if class X were derived from Y then two pointers to objects of type X could be passed to a function copy(Y* lhs, const Y* rhs). That function could then perform the operation *lhs = *rhs

This operation would execute but it would only copy the Y members and not the X members. This effect is called slicing and results in an object that is only half-copied. This can be the intended (correct) outcome but it can also be a bug; possibly in violation of the object's invariant and possibly broken in other ways
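A minimal sketch of the effect, using hypothetical members y and x for the base and derived parts respectively;-

```cpp
#include <cassert>

struct Y     { int y{0}; };   // base class
struct X : Y { int x{0}; };   // derived class adds one member

// Copies through base-class pointers, so only the Y part is assigned.
void copy(Y* lhs, const Y* rhs) { *lhs = *rhs; }

void demo() {
    X from; from.y = 1;  from.x = 2;
    X to;   to.y = 10;   to.x = 20;

    copy(&to, &from);    // X* converts implicitly to Y*

    assert(to.y == 1);   // base part was copied
    assert(to.x == 20);  // derived part was not: 'to' is now part-old, part-new
}
```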

If you don't want to allow slicing for a type then there are two ways of preventing it;-
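As a hedged sketch, two approaches commonly cited for this (for example in The C++ Programming Language) are to prohibit copying of the base class, or to make the base private or protected so that outside code cannot convert a pointer-to-derived into a pointer-to-base. All type names below are hypothetical;-

```cpp
#include <type_traits>

// Approach 1: prohibit copying of the base class.
class base1 {
public:
    base1() = default;
    base1(const base1&) = delete;
    base1& operator=(const base1&) = delete;
};
class derived1 : public base1 {};

// Approach 2: derive privately (or protected) so that code outside the
// class cannot convert derived2* to base2*.
class base2 {};
class derived2 : private base2 {};

static_assert(!std::is_copy_assignable<base1>::value,
              "*lhs = *rhs through base1 pointers will not compile");
static_assert(!std::is_convertible<derived2*, base2*>::value,
              "derived2* cannot be passed where base2* is expected");
```

Note that the first approach also makes derived1 non-copyable unless it defines its own copy operations.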

Copy Example

Probably the most common reason for redefining the copy operations is to handle the case where a member holds a reference to some resource. The compiler-generated copy operations will copy the members (ie, just the reference) leaving the assigned-to object with a reference to the same resource as the assigned-from object. This is almost always not what is required. In this case, the copy operations can be redefined to copy not only the members but also the referenced resources

The following example demonstrates this. This is a very simple container class that exhibits this problem. This example also uses a base class to demonstrate how to correctly handle this as well. For brevity, the base class details are not shown but the remainder of this example assumes the base class Y is essentially a repeat of the shown class X;-

class Y {
    int* yp{};
    // ...a similar set of member functions to those shown for X below...
};

class X : public Y {
    int* xp{};
public:
    // Constructor
    X(int a, int b) : Y{a}, xp{new int{b}} {}
    // Copy constructor
    X(const X& rhs) = default;
    // Copy assignment
    X& operator=(const X& rhs) = default;
    // Destructor
    ~X() { delete xp; xp = nullptr; }
};

As it stands, the above example will not work correctly because on a copy, only the resource pointers are copied and not the pointer contents; ie, the copy operation does not uphold the property of independence. On destruction, run-time errors are likely because of multiple attempts to delete the same resource pointer. In addition, resources that were allocated at construction could be leaked because the pointer is re-assigned without first deleting the original referred-to object

These problems are caused because the compiler-generated copy constructor and copy assignment operations for X and Y do not do what is required. They copy X::xp and Y::yp but not what X::xp and Y::yp point to. To correct this, we must define our own copy constructor and copy assignment for X (and Y, though this is just a repeat of X);-

public:
    // Copy constructor
    X(const X& rhs) : Y{rhs}, xp{new int{*rhs.xp}} {}
    // Copy assignment
    X& operator=(X rhs) {
        Y::operator=(rhs);  // Not needed if there is no base class
        *xp = *rhs.xp;
        return *this;
    }

With the above modifications, the copy operations for X will now behave correctly

Note the different copy operations used; copy construction must deal with uninitialised memory. Copy assignment does not, but it must deal with the possibility that the object being assigned-to already contains resources that must be released before re-assigning new resources (though such complexity is not apparent in this simple example)

The example above provides a basic exception guarantee for the copy-assignment operator; the copied-to object will be left with part-old/part-new data if an exception is thrown part-way through copying the members or the data that the members refer to

A strong exception guarantee can be achieved by defining a swap() function and modifying the copy-assignment operator as follows. The cost of this is that it uses a temporary object;-

public:
    // Copy assignment
    X& operator=(X rhs) noexcept {
        cout << "X::op= 1" << endl;
        swap(rhs);
        cout << "X::op= 2" << endl;
        return *this;
    }

    void swap(X& rhs) noexcept {
        cout << "X::swap()" << endl;
        using std::swap;
        Y::swap(rhs);  // Not needed if there is no base class
        swap(xp, rhs.xp);
    }

The technique requires the creation of a temporary object. This being the case, the example uses the pass-by-value version of operator= rather than the pass-by-reference version. This will create the temporary object at the point of call (rather than within the function body with (say) X tmp{rhs};) which in some cases may produce better code; for example, if the rhs argument was an rvalue then the compiler may be able to optimize-away the temporary or choose a move operation instead which would be more efficient

The copy assignment operations shown here do not check for copy-to-self. As they are, they will still work in such cases. Checking for copy-to-self adds overhead for what is a rare case; it is often preferable to avoid the check unless it is really needed. Obviously, copy-to-self should always work (even if it is not as efficient as it could be)

See swap() for full details on correctly implementing swap()

Handling copy in a resource-management class

For most resource-managing classes, the appropriate way to deal with copying is either;-

In summary, the copying behaviour of the controlled resource should dictate the copying behaviour of the resource-managing class

Consider the following example of a socket resource manager; when the socket_hdl is destroyed, it ensures that the underlying socket handle is closed;-

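#include <sys/socket.h>   // ::socket()
#include <netinet/in.h>   // IPPROTO_UDP
#include <unistd.h>       // ::close()

class socket_hdl {
    int hdl{-1};
public:
    socket_hdl() : hdl{ ::socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) } {}
    ~socket_hdl() { if (hdl != -1) ::close(hdl); }  // Only close a handle that was actually opened
};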

However, if such an object were copied (which is not an unreasonable operation for a socket handle) then the close() operation would have to be deferred until the last instance were destroyed. This is probably best handled with an internal reference count (not shown) with the destructor's call to close() being conditional on the reference count indicating the last instance

std::unique_ptr and std::shared_ptr both allow an optional deleter to be specified (for std::shared_ptr as a constructor argument; for std::unique_ptr the deleter's type is also a template argument). The deleter is called in place of delete when the controlled object is to be destroyed, and is generally used to address the same type of problem as the previous socket_hdl example addresses. The following example shows an alternative to socket_hdl that makes use of std::shared_ptr reference counting to control when to close the socket handle; the close() operation will only ever be called once;-

std::shared_ptr<int> a{
    new int{ ::socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) },
    [](int* hdl) { ::close(*hdl); delete hdl; }  // Close the socket, then free the int that held the handle
};
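The "called once" behaviour can be checked without a real socket. In this sketch the deleter simply counts its own invocations; the function name and the counter are hypothetical stand-ins for the ::close() call;-

```cpp
#include <cassert>
#include <memory>

// Returns how many times the deleter ran for a resource shared by two owners.
int deleter_calls_for_two_owners() {
    int calls = 0;
    {
        // The deleter replaces the default 'delete', so it must free the int itself.
        std::shared_ptr<int> a{ new int{3}, [&calls](int* hdl) { ++calls; delete hdl; } };
        std::shared_ptr<int> b = a;      // second owner of the same resource
        assert(a.use_count() == 2);
    }   // both owners are destroyed here; the deleter runs when the last one goes
    return calls;
}

void demo() {
    assert(deleter_calls_for_two_owners() == 1);
}
```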

Another example is std::lock_guard which ensures that the mutex it maintains is unlocked before being destroyed; this is important because the behaviour of a std::mutex that is destroyed whilst still locked is undefined
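A minimal sketch of the std::lock_guard pattern, using a hypothetical counter class; the guard locks the mutex on construction and unlocks it when the guard goes out of scope, including when leaving via an exception;-

```cpp
#include <cassert>
#include <mutex>

// Hypothetical class for illustration only.
class counter {
    std::mutex m;
    int total{0};
public:
    void add(int n) {
        std::lock_guard<std::mutex> lock{m};  // m is locked here
        total += n;
    }                                         // lock is destroyed, m is unlocked

    int get() {
        std::lock_guard<std::mutex> lock{m};
        return total;
    }
};

void demo() {
    counter c;
    c.add(2);
    c.add(3);   // would block forever had the first call left m locked
    assert(c.get() == 5);
}
```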

Moving

A move operation of the form x = y sets the value of x to be equal to that of y prior to the move, and sets y to some moved-from value/state

The primary goal of move over copy is to improve performance. However, move operations do not guarantee better performance, and may not exist at all for a particular type

There are two move operations; the move constructor and the move assignment operator;-

X(X&& rhs);             // Move constructor
X& operator=(X&& rhs);  // Move assignment

A valid moved-from value/state does not necessarily require that all members of the moved-from object must be zero'd/nulled. A member that is a pointer to some resource probably should be nulled, but there is often no need to zero/null (say) a member that is a fundamental type unless it happens to be associated in some critical way with a more complex object. The key thing is to maintain the class invariant and object independence and (preferably) leave the moved-from object in a state that can be easily handled by the destructor

Don't implement move in terms of swap(). It is not as efficient as a simple copy; it typically involves additional, unnecessary assignment. It also means that the moved-from object hangs-on to resources longer than it needs to; the resource would then typically only be released at the end of the calling expression. If the resource is very expensive/large, even this small delay could cause issues for the program

The correct way to implement move is to perform a copy (move) of the resources and then nullify the moved-from resources, all within the move operation
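A sketch of this for a hypothetical type holder that owns a single heap-allocated int (the copy operations are deleted only to keep the sketch short);-

```cpp
#include <cassert>
#include <utility>

class holder {
    int* p{nullptr};
public:
    explicit holder(int v) : p{new int{v}} {}
    ~holder() { delete p; }                   // deleting a null pointer is harmless

    holder(const holder&) = delete;
    holder& operator=(const holder&) = delete;

    // Move constructor: take the pointer, then nullify the moved-from object.
    holder(holder&& rhs) noexcept : p{rhs.p} { rhs.p = nullptr; }

    // Move assignment: release our own resource, take the source's, nullify the source.
    holder& operator=(holder&& rhs) noexcept {
        if (this != &rhs) {
            delete p;
            p = rhs.p;
            rhs.p = nullptr;
        }
        return *this;
    }

    bool empty() const { return p == nullptr; }
    int value() const { return *p; }
};

void demo() {
    holder a{42};
    holder b{std::move(a)};
    assert(a.empty());
    assert(b.value() == 42);

    holder c{7};
    c = std::move(b);
    assert(b.empty());
    assert(c.value() == 42);
}
```

The moved-from objects are left empty, which the destructor handles without any special case.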

Built-in types and pointers are considered to have implicit move operations defined as copy-assignments. The moved-from member is NOT zero'd/nulled
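This is easy to confirm with a small sketch (variable names are arbitrary);-

```cpp
#include <cassert>
#include <utility>

void demo() {
    int a = 5;
    int b = std::move(a);      // for a built-in type this is just a copy
    assert(a == 5);
    assert(b == 5);

    int value = 1;
    int* p = &value;
    int* q = std::move(p);     // pointers are copied too; p is NOT nulled
    assert(p == &value);
    assert(q == &value);
}
```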

In principle, this can raise the danger of two objects being entangled via member pointers which could lead to an attempt to delete a member pointer twice at destruction. However, a default destructor would not attempt to delete such a pointer anyway and because of the inter-dependency rules, a user-defined destructor and move operation(s) have to be defined together. So, if there is still a problem, it is the application writer's fault!

Construction, Copy And Move Predicates

There are a number of predicates that may be used to test whether or not certain operations are allowed of a specified type;-

Predicate | Description
std::is_default_constructible<typename T> | Returns true if T is default constructible
std::is_trivially_default_constructible<typename T> | Returns true if T is trivially default constructible (ie, will not call any non-trivial constructors)
std::is_nothrow_default_constructible<typename T> | Returns true if T is noexcept default constructible
std::is_constructible<typename T, typename... ARGS> | Returns true if T is constructible from the specified arguments
std::is_trivially_constructible<typename T, typename... ARGS> | Returns true if T is trivially constructible (ie, will not call any non-trivial constructors) from the specified arguments
std::is_nothrow_constructible<typename T, typename... ARGS> | Returns true if T is noexcept constructible from the specified arguments
std::is_assignable<typename T, typename RHS> | Returns true if T is assignable from the specified object
std::is_trivially_assignable<typename T, typename RHS> | Returns true if T is trivially assignable (ie, will not call any non-trivial assignment operations) from the specified object
std::is_nothrow_assignable<typename T, typename RHS> | Returns true if T is noexcept assignable from the specified object

In addition, the following predicates test for copy-constructible, copy-assignable, move-constructible, and/or move-assignable. They all follow the same format. In each case, substitute OP with copy or move, and substitute CA with constructible or assignable as required to give all 12 combinations;-

Predicate | Description
std::is_OP_CA<typename T> | Returns true if T is OP-CA
std::is_trivially_OP_CA<typename T> | Returns true if T is trivially OP-CA (ie, will not call any non-trivial OP constructors/assignment operations)
std::is_nothrow_OP_CA<typename T> | Returns true if T is noexcept OP-CA

If an operation (construction, copy, or move) is described as trivial, it means that the data can be copied manually as if by std::memmove(). All POD types are trivially copyable
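As an illustration of how these predicates are typically used, each one exposes its result as a compile-time ::value member. The types pod and no_copy below are hypothetical;-

```cpp
#include <string>
#include <type_traits>

struct pod { int a; double b; };   // trivially copyable

struct no_copy {                   // move-only
    no_copy() = default;
    no_copy(const no_copy&) = delete;
    no_copy(no_copy&&) noexcept = default;
};

static_assert(std::is_default_constructible<pod>::value, "pod: default constructible");
static_assert(std::is_trivially_copy_constructible<pod>::value, "pod: trivial copy");

static_assert(std::is_copy_constructible<std::string>::value, "string: copyable");
static_assert(!std::is_trivially_copy_constructible<std::string>::value, "string: copy is not trivial");
static_assert(std::is_constructible<std::string, const char*>::value, "string from const char*");

static_assert(!std::is_copy_constructible<no_copy>::value, "no_copy: copy is deleted");
static_assert(std::is_nothrow_move_constructible<no_copy>::value, "no_copy: noexcept move");

// The first argument of is_assignable is the type of the left-hand expression,
// so a reference type is needed to test assignment to an lvalue.
static_assert(std::is_assignable<int&, double>::value, "int lvalue = double");
static_assert(!std::is_assignable<int, double>::value, "int rvalue = double is not valid");
```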

A type T is considered to be trivially constructible/copyable/movable if the following holds true;-

As it is good policy to ensure that all copy and move construction and assignment operations behave consistently, in reality, a type that behaves trivially for one construction/copy/move operation should behave likewise for all of them

Mutability

An object may be variable (mutable) or constant (immutable)

A non-member function promises not to modify an object by declaring the corresponding argument(s) const

A member function may be specified to promise not to modify the state of its parent object with the const suffix. In this example, the function get_m() has no need to modify its parent object and so may be declared const;-

class X {
    int m;
    // ...
    int get_m() const { return m; }
};

Sometimes, it is desirable for a member function to be logically const, but for it actually to modify the object it is operating on (but modifying it in a way that will not be outwardly visible and does not affect the object's state). For example, for debugging purposes it may be desirable to keep a count of the number of times a particular const member function is called. Maintaining such a count does not actually alter the object's state in any important way (except for debugging). Similarly, a const function may return a value that is expensive to calculate and so it wishes to cache the final value so it can be used again. Storing this value in the object does not actually affect the object's state in any meaningful way (from the user's point of view)

Allowing a const member function to modify members of its object is achieved by marking those members mutable. In this example, the member function get_m() is able to modify the data member count despite being a const function;-

class X {
    int m;
    mutable int count{0};
    // ...
    int get_m() const { ++count; return m; }  // Increment before returning; code after return is never executed
};

Make const member functions thread-safe unless you're certain that they will not be used in a multi-threaded context

If they are not thread-safe, then assumptions made by multiple, simultaneous callers will be invalid, and the function is likely to trip itself up
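One possible way of making the call-counting example safe for concurrent callers is to make the mutable member atomic. This is only a sketch; the constructor and the calls() function are additions for illustration, and a std::mutex would be the more general tool when several members must be updated together;-

```cpp
#include <atomic>
#include <cassert>

class X {
    int m{0};
    mutable std::atomic<int> count{0};   // may be updated by concurrent const calls
public:
    explicit X(int v) : m{v} {}

    int get_m() const {
        count.fetch_add(1, std::memory_order_relaxed);  // atomic increment
        return m;
    }
    int calls() const { return count.load(std::memory_order_relaxed); }
};

void demo() {
    const X x{3};
    assert(x.get_m() == 3);
    assert(x.get_m() == 3);
    assert(x.calls() == 2);
}
```

Be aware that a std::atomic member makes the class non-copyable unless copy operations are written by hand.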

mutable is best used for small, specific cases

If the same effect is needed but involving a greater amount of member data then it may be best to move the mutable data to some other object, maintain a reference to that other object and manipulate it indirectly
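A sketch of that indirection, loosely following the well-known date/cache illustration; the types cache and date and their members are hypothetical. Within a const member function the pointer itself is const, but the object it refers to is not, so no member needs to be marked mutable;-

```cpp
#include <cassert>
#include <memory>
#include <string>

// Holds all the state that is not part of the object's logical value.
struct cache {
    bool valid{false};
    std::string text;
};

class date {
    int d, m, y;
    std::unique_ptr<cache> c;   // const functions may still modify *c
public:
    date(int dd, int mm, int yy) : d{dd}, m{mm}, y{yy}, c{new cache} {}

    // Logically const: the string is built once and then reused.
    const std::string& str() const {
        if (!c->valid) {
            c->text = std::to_string(d) + "/" + std::to_string(m) + "/" + std::to_string(y);
            c->valid = true;
        }
        return c->text;
    }
};

void demo() {
    const date dt{13, 12, 2017};
    assert(dt.str() == "13/12/2017");
    assert(&dt.str() == &dt.str());   // the same cached string is returned each time
}
```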

Clearly, mutable can easily be used inappropriately. Take care!

Concrete Types

The definition of a concrete type is somewhat malleable, but the term generally refers to a type that exhibits the following characteristics;-

Concrete types are important and most reasonably sized programs should define a good set of them. They help reduce ad-hoc use of fundamental types in ways that can lead to maintenance issues, and they help in creating more specific and less error-prone interfaces

The lack of concrete types can also lead to a proliferation of more complex types. This can lead to overly-complex structures and inefficiencies

Derived Classes

Two cornerstones of object oriented programming are explicit interfaces and run-time polymorphism. Inheritance plays a critical role in facilitating these two mechanisms; the functions of the base classes provide an interface (which may be augmented by the derived class), and virtual functions provide run-time polymorphism

It is important to distinguish between what the various inheritance constructs actually mean;-

Trying to make the above constructs behave in any way other than intended will lead to problems


Defining a class directly in terms of some other class provides an is-a notion (actually, only true for public inheritance). For example, cat and dog are both animals. Therefore, we may wish to define types cat and dog such that they have an is-a relationship with the type animal

The class acting as the common concept (animal in this case) is called the base class, or superclass

The class(es) that depend on the base class (cat and dog in this case) are called derived classes, or subclasses

There are two main reasons for deriving one type from another;-

A derivation of a base class is specified with the : notation, and it is possible to derive from more than one base by separating the base names with , (comma). The general format is;-

class derived-name : access-control-specifier base-name, access-control-specifier base-name … { };

For example;-

class animal {
    int weight;
    int height;
    int number_of_legs;
public:
    void print_info() { /* Print common animal information */ }
};

class cat : public animal {
    cat_breed_t breed;
    int length_of_whiskers;
    bool grumpy;
public:
    void print_info() {
        // Call base class version of print_info()
        animal::print_info();
        // Print cat-specific information
    }
};

class dog : public animal {
    dog_breed_t breed;
    int max_tail_wagging_speed;
    bool floppy_ears;
public:
    void print_info() {
        // Call base class version of print_info()
        animal::print_info();
        // Print dog-specific information
    }
};

The above example will generate the following inheritance graph;-

     animal
     /    \
   cat    dog

There are four basic methods that can be employed to elicit a derived class from a pointer to a base class;-

The first and last options are strongly preferred over the other two, though use of dynamic_cast is sometimes unavoidable
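As a hedged sketch of the dynamic_cast approach alongside a plain virtual call (which avoids needing to know the derived type at all), using simplified, hypothetical versions of the animal classes;-

```cpp
#include <cassert>
#include <string>

struct animal {
    virtual ~animal() = default;   // dynamic_cast needs a polymorphic base
    virtual std::string sound() const { return "..."; }
};
struct cat : animal {
    std::string sound() const override { return "meow"; }
    bool grumpy{true};
};
struct dog : animal {
    std::string sound() const override { return "woof"; }
};

// Virtual function: the correct override is chosen at run-time.
std::string speak(const animal* a) { return a->sound(); }

// dynamic_cast: yields nullptr if *a is not actually a cat.
bool is_grumpy_cat(const animal* a) {
    const cat* c = dynamic_cast<const cat*>(a);
    return c != nullptr && c->grumpy;
}

void demo() {
    cat c;
    dog d;
    assert(speak(&c) == "meow");
    assert(speak(&d) == "woof");
    assert(is_grumpy_cat(&c));
    assert(!is_grumpy_cat(&d));
}
```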

Avoid hiding inherited names