[c++] Detecting endianness programmatically in a C++ program

Is there a programmatic way to detect whether or not you are on a big-endian or little-endian architecture? I need to be able to write code that will execute on an Intel or PPC system and use exactly the same code (i.e. no conditional compilation).


Please see this article:

Here is some code to determine the endianness of your machine:

int num = 1;
if(*(char *)&num == 1)
{
    printf("Little-Endian\n");
}
else
{
    printf("Big-Endian\n");
}

This is normally done at compile time (especially for performance reasons) by using the header files available from the compiler, or by creating your own. On Linux you have the header file "/usr/include/endian.h".


Ehm... It surprises me that no one has realized that the compiler will simply optimize the test out and put in a fixed result as the return value. This renders all of the code examples above effectively useless. The only thing that would be returned is the endianness at compile time! And yes, I tested all of the above examples. Here's an example with MSVC 9.0 (Visual Studio 2008).

Pure C code

int32 DNA_GetEndianness(void)
{
    union 
    {
        uint8  c[4];
        uint32 i;
    } u;

    u.i = 0x01020304;

    if (0x04 == u.c[0])
        return DNA_ENDIAN_LITTLE;
    else if (0x01 == u.c[0])
        return DNA_ENDIAN_BIG;
    else
        return DNA_ENDIAN_UNKNOWN;
}

Disassembly

PUBLIC  _DNA_GetEndianness
; Function compile flags: /Ogtpy
; File c:\development\dna\source\libraries\dna\endian.c
;   COMDAT _DNA_GetEndianness
_TEXT   SEGMENT
_DNA_GetEndianness PROC                 ; COMDAT

; 11   :     union 
; 12   :     {
; 13   :         uint8  c[4];
; 14   :         uint32 i;
; 15   :     } u;
; 16   : 
; 17   :     u.i = 1;
; 18   : 
; 19   :     if (1 == u.c[0])
; 20   :         return DNA_ENDIAN_LITTLE;

    mov eax, 1

; 21   :     else if (1 == u.c[3])
; 22   :         return DNA_ENDIAN_BIG;
; 23   :     else
; 24   :        return DNA_ENDIAN_UNKNOWN;
; 25   : }

    ret
_DNA_GetEndianness ENDP
END

Perhaps it is possible to turn off ANY compile-time optimization for just this function, but I don't know. Otherwise it may be possible to hardcode it in assembly, although that's not portable. And even then it might get optimized out. It makes me think I need some really crappy assembler, implement the same code for all existing CPUs/instruction sets, and well.... never mind.

Also, someone here said that endianness does not change during run-time. WRONG. There are bi-endian machines out there. Their endianness can vary during execution. ALSO, there's not only Little Endian and Big Endian, but also other endiannesses (what a word).
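
To illustrate the point about "other endiannesses", here is a minimal sketch that classifies the byte layout of the value 0x01020304 as little-endian, big-endian, or PDP (middle) endian. The names ByteOrder, classify_layout and host_byte_order are made up for this example, and it assumes a 32-bit integer with 8-bit bytes.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical names for this sketch; they do not come from any library.
enum class ByteOrder { Little, Big, Pdp, Unknown };

// Classify how the four bytes of the value 0x01020304 are laid out in memory.
inline ByteOrder classify_layout(const unsigned char b[4])
{
    if (b[0] == 4 && b[1] == 3 && b[2] == 2 && b[3] == 1) return ByteOrder::Little;
    if (b[0] == 1 && b[1] == 2 && b[2] == 3 && b[3] == 4) return ByteOrder::Big;
    // PDP-11 style: 16-bit words stored high word first, each word little-endian.
    if (b[0] == 2 && b[1] == 1 && b[2] == 4 && b[3] == 3) return ByteOrder::Pdp;
    return ByteOrder::Unknown;
}

// Inspect the host's own layout by copying the object representation out.
inline ByteOrder host_byte_order()
{
    const std::uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);
    return classify_layout(bytes);
}
```

As noted above, an optimizing compiler is free to fold host_byte_order() into a constant, so this reports the byte order the code was compiled for.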

I hate and love coding at the same time...


How about this?

#include <cstdio>

int main()
{
    unsigned int n = 1;
    char *p = 0;

    p = (char*)&n;
    if (*p == 1)
        std::printf("Little Endian\n");
    else 
        if (*(p + sizeof(int) - 1) == 1)
            std::printf("Big Endian\n");
        else
            std::printf("What the crap?\n");
    return 0;
}

bool isBigEndian()
{
    static const uint16_t m_endianCheck(0x00ff);
    return ( *((uint8_t*)&m_endianCheck) == 0x0); 
}

int i=1;
char *c=(char*)&i;
bool littleendian = (*c == 1);  // test the first byte, not the pointer itself

Here's another C version. It defines a macro called wicked_cast() for inline type punning via C99 union literals and the non-standard __typeof__ operator.

#include <limits.h>

#if UCHAR_MAX == UINT_MAX
#error endianness irrelevant as sizeof(int) == 1
#endif

#define wicked_cast(TYPE, VALUE) \
    (((union { __typeof__(VALUE) src; TYPE dest; }){ .src = VALUE }).dest)

_Bool is_little_endian(void)
{
    return wicked_cast(unsigned char, 1u);
}

If integers are single-byte values, endianness makes no sense and a compile-time error will be generated.


You can do it by setting an int and masking off bits, but probably the easiest way is just to use the built-in network byte conversion ops (since network byte order is always big-endian).

if ( htonl(47) == 47 ) {
  // Big endian
} else {
  // Little endian.
}

Bit fiddling could be faster, but this way is simple, straightforward and pretty impossible to mess up.
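
For completeness, a self-contained sketch of the htonl idea. It assumes a POSIX system, where htonl is declared in <arpa/inet.h> (on Windows it lives in <winsock2.h>). The function name is_big_endian_via_htonl is made up for this example, and the probe value 0x01020304 is used instead of 47 only so that all four bytes differ.

```cpp
#include <arpa/inet.h>  // htonl (POSIX)
#include <cstdint>

// Hypothetical helper name. htonl converts host order to network (big-endian)
// order, so it leaves the value unchanged only on a big-endian host.
inline bool is_big_endian_via_htonl()
{
    const std::uint32_t probe = 0x01020304u;
    return htonl(probe) == probe;
}
```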


Declare:

My initial post incorrectly described this as "compile time". It is not; that is even impossible in the current C++ standard. constexpr does NOT mean the function is always evaluated at compile time. Thanks to Richard Hodges for the correction.

compile time, non-macro, C++11 constexpr solution:

union {
  uint16_t s;
  unsigned char c[2];
} constexpr static  d {1};

constexpr bool is_little_endian() {
  return d.c[0] == 1;
}

Unless the endian header is GCC-only, it provides macros you can use.

#include "endian.h"
...
if (__BYTE_ORDER == __LITTLE_ENDIAN) { ... }
else if (__BYTE_ORDER == __BIG_ENDIAN) { ... }
else { throw std::runtime_error("Sorry, this version does not support PDP Endian!"); }
...

While there is no quick and standard way to determine it, this will output it:

#include <stdio.h> 
int main()  
{ 
   unsigned int i = 1; 
   char *c = (char*)&i; 
   if (*c)     
       printf("Little endian"); 
   else
       printf("Big endian"); 
   getchar(); 
   return 0; 
} 

union {
    int i;
    char c[sizeof(int)];
} x;
x.i = 1;
if(x.c[0] == 1)
    printf("little-endian\n");
else
    printf("big-endian\n");

This is another solution. Similar to Andrew Hare's solution.


The way C compilers (at least every one I know of) work, the endianness has to be decided at compile time. Even for bi-endian processors (like ARM and MIPS) you have to choose endianness at compile time. Furthermore, the endianness is defined in all common file formats for executables (such as ELF). Although it is possible to craft a binary blob of bi-endian code (for some ARM server exploit, maybe?), it probably has to be done in assembly.


I would do something like this:

bool isBigEndian() {
    static unsigned long x(1);
    static bool result(reinterpret_cast<unsigned char*>(&x)[0] == 0);
    return result;
}

Along these lines, you would get a time-efficient function that only does the calculation once.


Declare an int variable:

int variable = 0xFF;

Now use char* pointers to various parts of it and check what is in those parts.

char* startPart = reinterpret_cast<char*>( &variable );
char* endPart = reinterpret_cast<char*>( &variable ) + sizeof( int ) - 1;

Depending on which one points to 0xFF byte now you can detect endianness. This requires sizeof( int ) > sizeof( char ), but it's definitely true for the discussed platforms.
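
A minimal sketch of the check described above, wrapped in a function (the name is_little_endian_by_ends is made up for this example). It uses unsigned char rather than plain char, because where char is signed a byte holding 0xFF reads back as -1 and would not compare equal to 0xFF. As stated, it assumes sizeof(int) > 1.

```cpp
// Hypothetical helper name for this sketch.
inline bool is_little_endian_by_ends()
{
    int variable = 0xFF;

    // Lowest-address and highest-address bytes of the int.
    const unsigned char* startPart = reinterpret_cast<const unsigned char*>(&variable);
    const unsigned char* endPart = startPart + sizeof(int) - 1;

    // Little-endian: the 0xFF byte sits at the lowest address.
    // Big-endian: it sits at the highest address instead.
    return *startPart == 0xFF && *endPart == 0x00;
}
```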


You can also do this via the preprocessor, using something like the Boost header file, which can be found here: boost endian


As pointed out by Coriiander, most (if not all) of the code examples here will be optimized away at compile time, so the generated binaries won't check endianness at run time.

It has been observed that a given executable shouldn't run in two different byte orders, but I have no idea if that is always the case, and checking at compile time seems like a hack to me. So I coded this function:

#include <stdint.h>
#include <stdlib.h>  /* malloc, free */

int* _BE = 0;

int is_big_endian() {
    if (_BE == 0) {
        uint16_t* teste = (uint16_t*)malloc(4);
        *teste = (*teste & 0x01FE) | 0x0100;
        uint8_t teste2 = ((uint8_t*) teste)[0];
        free(teste);
        _BE = (int*)malloc(sizeof(int));
        *_BE = (0x01 == teste2);
    }
    return *_BE;
}

MinGW wasn't able to optimize this code, even though it does optimize away the other code examples here. I believe that is because I leave the "random" value that was allocated in the low-order byte as it was (at least 7 of its bits), so the compiler can't know what that random value is and doesn't optimize the function away.

I've also coded the function so that the check is only performed once, and the return value is stored for next tests.


See Endianness - C-Level Code illustration.

// assuming target architecture is 32-bit = 4-Bytes
enum ENDIANNESS{ LITTLEENDIAN , BIGENDIAN , UNHANDLE };


ENDIANNESS CheckArchEndianalityV1( void )
{
    int Endian = 0x00000001; // assuming target architecture is 32-bit    

    // as Endian = 0x00000001 so MSB (Most Significant Byte) = 0x00 and LSB (Least Significant Byte) = 0x01
    // casting down to a single byte value LSB discarding higher bytes    

    return (*(char *) &Endian == 0x01) ? LITTLEENDIAN : BIGENDIAN;
} 

I don't like the method based on type punning; compilers will often warn about it. That's exactly what unions are for!

bool is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};

    return bint.c[0] == 1; 
}

The principle is equivalent to the type cast suggested by others, but this is clearer and, according to C99, is guaranteed to be correct. gcc prefers this to the direct pointer cast.

This is also much better than fixing the endianness at compile time: for an OS that supports multiple architectures (fat binaries on Mac OS X, for example), this will work for both ppc and i386, whereas it is very easy to mess things up otherwise.


You can use std::endian if you have access to a C++20 compiler such as GCC 8+ or Clang 7+.

Note: std::endian began in <type_traits> but was moved to <bit> at the 2019 Cologne meeting. GCC 8 and Clang 7, 8 and 9 have it in <type_traits>, while GCC 9+ and Clang 10+ have it in <bit>.

#include <bit>

if constexpr (std::endian::native == std::endian::big)
{
    // Big endian system
}
else if constexpr (std::endian::native == std::endian::little)
{
    // Little endian system
}
else
{
    // Something else
}

The C++ way has been to use boost, where preprocessor checks and casts are compartmentalized away inside very thoroughly-tested libraries.

The Predef Library (boost/predef.h) recognizes four different kinds of endianness.

The Endian Library was planned to be submitted to the C++ standard, and supports a wide variety of operations on endian-sensitive data.

As stated in the answers above, endianness support will be a part of C++20.


If you don't want conditional compilation you can just write endian independent code. Here is an example (taken from Rob Pike):

Reading an integer stored in little-endian on disk, in an endian independent manner:

i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);

The same code, trying to take into account the machine endianness:

i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif
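
A self-contained sketch of the same endian-independent idea, covering both reading and writing. The function names load_le32 and store_le32 are made up for this example. Each byte is widened to an unsigned 32-bit type before shifting so that the shift by 24 cannot overflow a signed int.

```cpp
#include <cstdint>

// Hypothetical helper: read a 32-bit value stored in little-endian order.
// Works identically on any host because it only uses shifts, never casts
// the buffer to an integer pointer.
inline std::uint32_t load_le32(const unsigned char* data)
{
    return  static_cast<std::uint32_t>(data[0])
         | (static_cast<std::uint32_t>(data[1]) << 8)
         | (static_cast<std::uint32_t>(data[2]) << 16)
         | (static_cast<std::uint32_t>(data[3]) << 24);
}

// Hypothetical helper: write a 32-bit value in little-endian order.
inline void store_le32(unsigned char* out, std::uint32_t value)
{
    out[0] = static_cast<unsigned char>(value & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[3] = static_cast<unsigned char>((value >> 24) & 0xFFu);
}
```

Neither function needs to know the host byte order, which is the point of the approach.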

I was going through the textbook Computer Systems: A Programmer's Perspective, and there is a problem that asks you to determine the machine's endianness with a C program.

I used a pointer to do that, as follows:

#include <stdio.h>

int main(void){
    int i=1;
    unsigned char* ii = (unsigned char*)&i;  /* explicit cast: int* does not convert implicitly */

    printf("This computer is %s endian.\n", ((ii[0]==1) ? "little" : "big"));
    return 0;
}

An int takes up 4 bytes and a char takes up only 1 byte, so we can use a char pointer to point at an int with the value 1. If the computer is little-endian, the char that the pointer points to has the value 1; otherwise, its value is 0.


Do not use a union!

C++ does not permit type punning via unions!
Reading from a union field that was not the last field written to is undefined behaviour!
Many compilers support doing so as an extension, but the language makes no guarantee.

See this answer for more details:

https://stackoverflow.com/a/11996970


There are only two valid answers that are guaranteed to be portable.

The first answer, if you have access to a system that supports C++20,
is to use std::endian from the <bit> header (early drafts had it in <type_traits>).

(At the time of writing, C++20 has not yet been released, but unless something happens to affect std::endian's inclusion, this shall be the preferred way to test the endianness at compile time from C++20 onwards.)

C++20 Onwards

constexpr bool is_little_endian = (std::endian::native == std::endian::little);

Prior to C++20, the only valid answer is to store an integer and then inspect its first byte through type punning.
Unlike the use of unions, this is expressly allowed by C++'s type system.

It's also important to remember that for optimum portability static_cast should be used,
because reinterpret_cast is implementation defined.

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: ... a char or unsigned char type.

C++11 Onwards

enum class endianness
{
    little = 0,
    big = 1,
};

inline endianness get_system_endianness()
{
    const int value { 0x01 };
    const void * address = static_cast<const void *>(&value);
    const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
    return (*least_significant_address == 0x01) ? endianness::little : endianness::big;
}

C++11 Onwards (without enum)

inline bool is_system_little_endian()
{
    const int value { 0x01 };
    const void * address = static_cast<const void *>(&value);
    const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
    return (*least_significant_address == 0x01);
}

C++98/C++03

inline bool is_system_little_endian()
{
    const int value = 0x01;
    const void * address = static_cast<const void *>(&value);
    const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
    return (*least_significant_address == 0x01);
}
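
As an aside, and not part of the answer above: a related technique is to copy the first byte out with std::memcpy instead of reading it through a pointer. Copying an object's bytes this way is well defined for trivially copyable types such as int. A minimal sketch, with a made-up function name:

```cpp
#include <cstring>

// Hypothetical helper name. Copies the lowest-address byte of the integer
// into an unsigned char and checks whether it holds the low-order byte.
inline bool is_system_little_endian_memcpy()
{
    const unsigned int value = 0x01;
    unsigned char first_byte = 0;
    std::memcpy(&first_byte, &value, 1);
    return first_byte == 0x01;
}
```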

Unless you're using a framework that has been ported to PPC and Intel processors, you will have to do conditional compiles, since PPC and Intel platforms have completely different hardware architectures, pipelines, busses, etc. This renders the assembly code completely different between the two.

As for finding endianness, do the following:

short temp = 0x1234;
char* tempChar = (char*)&temp;

*tempChar will be either 0x12 or 0x34, from which you will know the endianness.
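
A minimal sketch that finishes the check, assuming short is 2 bytes wide. The function names are made up for this example.

```cpp
// Hypothetical helper: returns the byte stored at the lowest address of 0x1234.
inline unsigned char first_byte_of_0x1234()
{
    short temp = 0x1234;
    const unsigned char* tempChar = reinterpret_cast<const unsigned char*>(&temp);
    return *tempChar;
}

// 0x34 first means the low-order byte comes first (little-endian);
// 0x12 first means the high-order byte comes first (big-endian).
inline bool is_little_endian_from_short()
{
    return first_byte_of_0x1234() == 0x34;
}
```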


I'm surprised no one has mentioned the macros which the preprocessor defines by default. While these will vary depending on your platform, they are much cleaner than having to write your own endian check.

For example, if we look at the built-in macros which GCC defines (on an X86-64 machine):

:| gcc -dM -E -x c - |grep -i endian
#define __LITTLE_ENDIAN__ 1

On a PPC machine I get:

:| gcc -dM -E -x c - |grep -i endian
#define __BIG_ENDIAN__ 1
#define _BIG_ENDIAN 1

(The :| gcc -dM -E -x c - magic prints out all built-in macros).
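
A sketch of how such predefined macros might be used in code. It relies on __BYTE_ORDER__, __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__, which GCC and Clang predefine; other compilers may not, so the sketch fails the build rather than guessing. HOST_IS_LITTLE_ENDIAN and host_is_little_endian are names made up for this example.

```cpp
// Assumption: the compiler is GCC or Clang (or one that mimics their macros).
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && defined(__ORDER_BIG_ENDIAN__)
#  if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#    define HOST_IS_LITTLE_ENDIAN 1   /* hypothetical macro name */
#  elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#    define HOST_IS_LITTLE_ENDIAN 0
#  else
#    error "Unsupported byte order (neither little- nor big-endian)"
#  endif
#else
#  error "This compiler does not predefine __BYTE_ORDER__"
#endif

// Hypothetical wrapper so ordinary code can branch on the result.
inline bool host_is_little_endian()
{
    return HOST_IS_LITTLE_ENDIAN != 0;
}
```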


For further details, you may want to check out this codeproject article Basic concepts on Endianness:

How to dynamically test for the Endian type at run time?

As explained in Computer Animation FAQ, you can use the following function to see if your code is running on a Little- or Big-Endian system:

#define BIG_ENDIAN      0
#define LITTLE_ENDIAN   1
int TestByteOrder()
{
   short int word = 0x0001;
   char *byte = (char *) &word;
   return(byte[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

This code assigns the value 0x0001 to a 16-bit integer. A char pointer is then assigned to point at the first (lowest-address) byte of the integer value. If the first byte of the integer is 0x01, then the system is Little-Endian (the least-significant byte, 0x01, is at the lowest address). If it is 0x00, then the system is Big-Endian.


As stated above, use union tricks.

There are a few problems with the ones advised above though, most notably that unaligned memory access is notoriously slow on most architectures, and some compilers won't even recognize such constant predicates at all unless they are word-aligned.

Because a mere endian test is boring, here is a (template) function which will flip the input/output of an arbitrary integer according to your spec, regardless of host architecture.

#include <stdint.h>

#define BIG_ENDIAN 1
#define LITTLE_ENDIAN 0

template <typename T>
T endian(T w, uint32_t endian)
{
    // this gets optimized out into if (endian == host_endian) return w;
    union { uint64_t quad; uint32_t islittle; } t;
    t.quad = 1;
    if (t.islittle ^ endian) return w;
    T r = 0;

    // decent compilers will unroll this (gcc)
    // or even convert straight into single bswap (clang)
    for (int i = 0; i < sizeof(r); i++) {
        r <<= 8;
        r |= w & 0xff;
        w >>= 8;
    }
    return r;
}

Usage:

To convert from given endian to host, use:

host = endian(source, endian_of_source)

To convert from host endian to given endian, use:

output = endian(hostsource, endian_you_want_to_output)

The resulting code is as fast as hand-written assembly on clang; on gcc it's a tad slower (unrolled &, <<, >>, | for every byte) but still decent.
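
To make the byte-flipping loop easier to try in isolation, here is a standalone sketch of just that part, without the host-endianness test. The name reverse_bytes is made up for this example, and it assumes T is an unsigned integer type.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reverse the byte order of an unsigned integer.
// Each iteration moves the lowest byte of w onto the low end of r,
// so the first byte taken ends up as the most significant byte of r.
template <typename T>
T reverse_bytes(T w)
{
    T r = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        r = static_cast<T>((r << 8) | (w & 0xFF));
        w = static_cast<T>(w >> 8);
    }
    return r;
}
```

Applying it twice gives back the original value, which is a handy sanity check.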


Untested, but in my mind this should work, because the first byte will be 0x01 on little-endian and 0x00 on big-endian:

bool runtimeIsLittleEndian(void)
{
 volatile uint16_t i=1;
 return  ((uint8_t*)&i)[0]==0x01;//0x01=little, 0x00=big
}
