[c++] How do I use arrays in C++?

5. Common pitfalls when using arrays.

5.1 Pitfall: Trusting type-unsafe linking.

OK, you’ve been told, or have found out yourself, that globals (namespace scope variables that can be accessed outside the translation unit) are Evil™. But did you know how truly Evil™ they are? Consider the program below, consisting of two files [main.cpp] and [numbers.cpp]:

// [main.cpp]
#include <iostream>

extern int* numbers;

int main()
{
    using namespace std;
    for( int i = 0;  i < 42;  ++i )
    {
        cout << (i > 0? ", " : "") << numbers[i];
    }
    cout << endl;
}

// [numbers.cpp]
int numbers[42] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

In Windows 7 this compiles and links fine with both MinGW g++ 4.4.1 and Visual C++ 10.0.

Since the types don't match, the program crashes when you run it.

The Windows 7 crash dialog

In-the-formal explanation: the program has Undefined Behavior (UB), and instead of crashing it can therefore just hang, or perhaps do nothing, or it can send threating e-mails to the presidents of the USA, Russia, India, China and Switzerland, and make Nasal Daemons fly out of your nose.

In-practice explanation: in main.cpp the array is treated as a pointer, placed at the same address as the array. For 32-bit executable this means that the first int value in the array, is treated as a pointer. I.e., in main.cpp the numbers variable contains, or appears to contain, (int*)1. This causes the program to access memory down at very bottom of the address space, which is conventionally reserved and trap-causing. Result: you get a crash.

The compilers are fully within their rights to not diagnose this error, because C++11 §3.5/10 says, about the requirement of compatible types for the declarations,

[N3290 §3.5/10]
A violation of this rule on type identity does not require a diagnostic.

The same paragraph details the variation that is allowed:

… declarations for an array object can specify array types that differ by the presence or absence of a major array bound (8.3.4).

This allowed variation does not include declaring a name as an array in one translation unit, and as a pointer in another translation unit.

5.2 Pitfall: Doing premature optimization (memset & friends).

Not written yet

5.3 Pitfall: Using the C idiom to get number of elements.

With deep C experience it’s natural to write …

#define N_ITEMS( array )   (sizeof( array )/sizeof( array[0] ))

Since an array decays to pointer to first element where needed, the expression sizeof(a)/sizeof(a[0]) can also be written as sizeof(a)/sizeof(*a). It means the same, and no matter how it’s written it is the C idiom for finding the number elements of array.

Main pitfall: the C idiom is not typesafe. For example, the code …

#include <stdio.h>

#define N_ITEMS( array ) (sizeof( array )/sizeof( *array ))

void display( int const a[7] )
{
    int const   n = N_ITEMS( a );          // Oops.
    printf( "%d elements.\n", n );
}

int main()
{
    int const   moohaha[]   = {1, 2, 3, 4, 5, 6, 7};

    printf( "%d elements, calling display...\n", N_ITEMS( moohaha ) );
    display( moohaha );
}

passes a pointer to N_ITEMS, and therefore most likely produces a wrong result. Compiled as a 32-bit executable in Windows 7 it produces …

7 elements, calling display...
1 elements.

  1. The compiler rewrites int const a[7] to just int const a[].
  2. The compiler rewrites int const a[] to int const* a.
  3. N_ITEMS is therefore invoked with a pointer.
  4. For a 32-bit executable sizeof(array) (size of a pointer) is then 4.
  5. sizeof(*array) is equivalent to sizeof(int), which for a 32-bit executable is also 4.

In order to detect this error at run time you can do …

#include <assert.h>
#include <typeinfo>

#define N_ITEMS( array )       (                               \
    assert((                                                    \
        "N_ITEMS requires an actual array as argument",        \
        typeid( array ) != typeid( &*array )                    \
        )),                                                     \
    sizeof( array )/sizeof( *array )                            \
    )

7 elements, calling display...
Assertion failed: ( "N_ITEMS requires an actual array as argument", typeid( a ) != typeid( &*a ) ), file runtime_detect ion.cpp, line 16

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

The runtime error detection is better than no detection, but it wastes a little processor time, and perhaps much more programmer time. Better with detection at compile time! And if you're happy to not support arrays of local types with C++98, then you can do that:

#include <stddef.h>

typedef ptrdiff_t   Size;

template< class Type, Size n >
Size n_items( Type (&)[n] ) { return n; }

#define N_ITEMS( array )       n_items( array )

Compiling this definition substituted into the first complete program, with g++, I got …

M:\count> g++ compile_time_detection.cpp
compile_time_detection.cpp: In function 'void display(const int*)':
compile_time_detection.cpp:14: error: no matching function for call to 'n_items(const int*&)'

M:\count> _

How it works: the array is passed by reference to n_items, and so it does not decay to pointer to first element, and the function can just return the number of elements specified by the type.

With C++11 you can use this also for arrays of local type, and it's the type safe C++ idiom for finding the number of elements of an array.

5.4 C++11 & C++14 pitfall: Using a constexpr array size function.

With C++11 and later it's natural, but as you'll see dangerous!, to replace the C++03 function

typedef ptrdiff_t   Size;

template< class Type, Size n >
Size n_items( Type (&)[n] ) { return n; }

with

using Size = ptrdiff_t;

template< class Type, Size n >
constexpr auto n_items( Type (&)[n] ) -> Size { return n; }

where the significant change is the use of constexpr, which allows this function to produce a compile time constant.

For example, in contrast to the C++03 function, such a compile time constant can be used to declare an array of the same size as another:

// Example 1
void foo()
{
    int const x[] = {3, 1, 4, 1, 5, 9, 2, 6, 5, 4};
    constexpr Size n = n_items( x );
    int y[n] = {};
    // Using y here.
}

But consider this code using the constexpr version:

// Example 2
template< class Collection >
void foo( Collection const& c )
{
    constexpr int n = n_items( c );     // Not in C++14!
    // Use c here
}

auto main() -> int
{
    int x[42];
    foo( x );
}

The pitfall: as of July 2015 the above compiles with MinGW-64 5.1.0 with -pedantic-errors, and, testing with the online compilers at gcc.godbolt.org/, also with clang 3.0 and clang 3.2, but not with clang 3.3, 3.4.1, 3.5.0, 3.5.1, 3.6 (rc1) or 3.7 (experimental). And important for the Windows platform, it does not compile with Visual C++ 2015. The reason is a C++11/C++14 statement about use of references in constexpr expressions:

C++11 C++14 $5.19/2 nineth dash

A conditional-expression e is a core constant expression unless the evaluation of e, following the rules of the abstract machine (1.9), would evaluate one of the following expressions:
        ?

  • an id-expression that refers to a variable or data member of reference type unless the reference has a preceding initialization and either
    • it is initialized with a constant expression or
    • it is a non-static data member of an object whose lifetime began within the evaluation of e;

One can always write the more verbose

// Example 3  --  limited

using Size = ptrdiff_t;

template< class Collection >
void foo( Collection const& c )
{
    constexpr Size n = std::extent< decltype( c ) >::value;
    // Use c here
}

… but this fails when Collection is not a raw array.

To deal with collections that can be non-arrays one needs the overloadability of an n_items function, but also, for compile time use one needs a compile time representation of the array size. And the classic C++03 solution, which works fine also in C++11 and C++14, is to let the function report its result not as a value but via its function result type. For example like this:

// Example 4 - OK (not ideal, but portable and safe)

#include <array>
#include <stddef.h>

using Size = ptrdiff_t;

template< Size n >
struct Size_carrier
{
    char sizer[n];
};

template< class Type, Size n >
auto static_n_items( Type (&)[n] )
    -> Size_carrier<n>;
// No implementation, is used only at compile time.

template< class Type, size_t n >        // size_t for g++
auto static_n_items( std::array<Type, n> const& )
    -> Size_carrier<n>;
// No implementation, is used only at compile time.

#define STATIC_N_ITEMS( c ) \
    static_cast<Size>( sizeof( static_n_items( c ).sizer ) )

template< class Collection >
void foo( Collection const& c )
{
    constexpr Size n = STATIC_N_ITEMS( c );
    // Use c here
    (void) c;
}

auto main() -> int
{
    int x[42];
    std::array<int, 43> y;
    foo( x );
    foo( y );
}

About the choice of return type for static_n_items: this code doesn't use std::integral_constant because with std::integral_constant the result is represented directly as a constexpr value, reintroducing the original problem. Instead of a Size_carrier class one can let the function directly return a reference to an array. However, not everybody is familiar with that syntax.

About the naming: part of this solution to the constexpr-invalid-due-to-reference problem is to make the choice of compile time constant explicit.

Hopefully the oops-there-was-a-reference-involved-in-your-constexpr issue will be fixed with C++17, but until then a macro like the STATIC_N_ITEMS above yields portability, e.g. to the clang and Visual C++ compilers, retaining type safety.

Related: macros do not respect scopes, so to avoid name collisions it can be a good idea to use a name prefix, e.g. MYLIB_STATIC_N_ITEMS.

Examples related to c++

Method Call Chaining; returning a pointer vs a reference? How can I tell if an algorithm is efficient? Difference between opening a file in binary vs text How can compare-and-swap be used for a wait-free mutual exclusion for any shared data structure? Install Qt on Ubuntu #include errors detected in vscode Cannot open include file: 'stdio.h' - Visual Studio Community 2017 - C++ Error How to fix the error "Windows SDK version 8.1" was not found? Visual Studio 2017 errors on standard headers How do I check if a Key is pressed on C++

Examples related to arrays

PHP array value passes to next row Use NSInteger as array index How do I show a message in the foreach loop? Objects are not valid as a React child. If you meant to render a collection of children, use an array instead Iterating over arrays in Python 3 Best way to "push" into C# array Sort Array of object by object field in Angular 6 Checking for duplicate strings in JavaScript array what does numpy ndarray shape do? How to round a numpy array?

Examples related to pointers

Method Call Chaining; returning a pointer vs a reference? lvalue required as left operand of assignment error when using C++ Error: stray '\240' in program Reference to non-static member function must be called How to convert const char* to char* in C? Why should I use a pointer rather than the object itself? Function stoi not declared C pointers and arrays: [Warning] assignment makes pointer from integer without a cast Constant pointer vs Pointer to constant How to get the real and total length of char * (char array)?

Examples related to multidimensional-array

what does numpy ndarray shape do? len() of a numpy array in python What is the purpose of meshgrid in Python / NumPy? Convert a numpy.ndarray to string(or bytes) and convert it back to numpy.ndarray Typescript - multidimensional array initialization How to get every first element in 2 dimensional list How does numpy.newaxis work and when to use it? How to count the occurrence of certain item in an ndarray? Iterate through 2 dimensional array Selecting specific rows and columns from NumPy array

Examples related to c++-faq

What are the new features in C++17? Why should I use a pointer rather than the object itself? Why is enum class preferred over plain enum? gcc/g++: "No such file or directory" What is an undefined reference/unresolved external symbol error and how do I fix it? When is std::weak_ptr useful? What XML parser should I use in C++? What is a lambda expression in C++11? Why should C++ programmers minimize use of 'new'? Iterator invalidation rules