Purpose of Unions in C and C++

Question

I have used unions earlier comfortably; today I was alarmed when I read this post and came to know that this code

    union ARGB
    {
        uint32_t colour;

        struct componentsTag
        {
            uint8_t b;
            uint8_t g;
            uint8_t r;
            uint8_t a;
        } components;

    } pixel;

    pixel.colour = 0xff040201;  // ARGB::colour is the active member from now on

    // somewhere down the line, without any edit to pixel

    if (pixel.components.a)     // accessing the non-active member ARGB::components

is actually undefined behaviour, i.e. reading from a member of the union other than the one most recently written to leads to undefined behaviour. If this isn't the intended usage of unions, what is? Can someone please explain it elaborately?

Update: I wanted to clarify a few things in hindsight.

The answer to the question isn't the same for C and C++; my ignorant younger self tagged it as both C and C++.

After scouring through C++11's standard I couldn't conclusively say that it calls out accessing/inspecting a non-active union member as undefined/unspecified/implementation-defined. All I could find was 9.5/1:

    "If a standard-layout union contains several standard-layout structs that share a common initial sequence, and if an object of this standard-layout union type contains one of the standard-layout structs, it is permitted to inspect the common initial sequence of any of standard-layout struct members. 9.2/19: Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members."

While in C (C99 TC3 - DR 283 onwards) it's legal to do so (thanks to Pascal Cuoq for bringing this up). However, attempting to do it can still lead to undefined behavior if the value read happens to be invalid (a so-called "trap representation") for the type it is read through. Otherwise, the value read is implementation defined.

C89/90 called this out under unspecified behavior (Annex J), and K&R's book says it's implementation defined. Quote from K&R:

    "This is the purpose of a union - a single variable that can legitimately hold any of one of several types. [...] so long as the usage is consistent: the type retrieved must be the type most recently stored. It is the programmer's responsibility to keep track of which type is currently stored in a union; the results are implementation-dependent if something is stored as one type and extracted as another."

Extract from Stroustrup's TC++PL (emphasis mine):

    "Use of unions can be essential for compactness of data [...] sometimes misused for "type conversion"."

Above all, this question (whose title remains unchanged since my ask) was posed with an intention of understanding the purpose of unions AND not on what the standard allows. E.g. using inheritance for code reuse is, of course, allowed by the C++ standard, but it wasn't the purpose or the original intention of introducing inheritance as a C++ language feature. This is the reason Andrey's answer continues to remain as the accepted one.

User · Answer

You can use a union for two main reasons:

1. A handy way to access the same data in different ways, like in your example.
2. A way to save space when there are different data members of which only one can ever be "active".

Reason 1 is really more of a C-style hack to short-cut writing code on the basis that you know how the target system's memory architecture works. As already said, you can normally get away with it if you don't actually target lots of different platforms. I believe some compilers might let you use packing directives also (I know they do on structs)?

A good example of 2 can be found in the VARIANT type used extensively in COM.

User · Answer

bobobobo's code is correct, as @Joshua pointed out (sadly I'm not allowed to add comments, so I'm doing it here; IMO a bad decision to disallow it in the first place).

https://en.cppreference.com/w/cpp/language/data_members#Standard_layout says that it is fine to do so, at least since C++14:

    "In a standard-layout union with an active member of non-union class type T1, it is permitted to read a non-static data member m of another union member of non-union class type T2 provided m is part of the common initial sequence of T1 and T2 (except that reading a volatile member through non-volatile glvalue is undefined)."

This applies since, in the current case, T1 and T2 denote the same type anyway.
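A minimal sketch of the common-initial-sequence rule quoted above. The `KeyEvent`, `MouseEvent` and `Event` types and both helper functions are hypothetical names invented for illustration; the only point is that both structs begin with a member of the same type, so `kind` may be read through either one.

```cpp
#include <cstdint>

// Hypothetical event types. Both are standard-layout structs whose first
// member has the same type, so they share a common initial sequence
// consisting of `kind`.
struct KeyEvent {
    std::uint8_t kind;
    std::uint8_t key_code;
};

struct MouseEvent {
    std::uint8_t kind;
    std::int16_t x;
    std::int16_t y;
};

union Event {
    KeyEvent key;
    MouseEvent mouse;
};

// Reads `kind` through `key` regardless of which struct is active.
// This is permitted only because `kind` lies in the common initial sequence.
inline std::uint8_t event_kind(const Event& e) {
    return e.key.kind;
}

// Builds an Event whose active member is `mouse` (kind 2 is an arbitrary tag).
inline Event make_mouse_event(std::int16_t x, std::int16_t y) {
    Event e;
    e.mouse = MouseEvent{2, x, y};
    return e;
}
```

Reading `e.key.key_code` while `mouse` is active would not be covered by this rule, because `key_code` and `x` are not part of the shared prefix.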

User · Answer

Others have mentioned the architecture differences (little vs. big endian).

I read the problem as follows: since the memory for the variables is shared, writing to one changes the others and, depending on their type, the value could be meaningless. E.g.

    union {
        float f;
        int   i;
    } x;

Writing to x.i would be meaningless if you then read from x.f - unless that is what you intended in order to look at the sign, exponent or mantissa components of the float.

I think there is also an issue of alignment: if some variables must be word aligned then you might not get the expected result. E.g.

    union {
        char c[4];
        int  i;
    } x;

If, hypothetically, on some machine a char had to be word aligned, then c[0] and c[1] would share storage with i but not c[2] and c[3].
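For the case where inspecting the sign, exponent and mantissa really is the intent, here is a hedged sketch that copies the bytes with `std::memcpy` instead of reading the other union member. It assumes `float` is a 32-bit IEEE 754 binary32 value, which the language does not guarantee; `FloatParts` and `decompose` are hypothetical names.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical holder for the three fields of a binary32 float.
struct FloatParts {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, still biased by 127
    std::uint32_t mantissa;  // 23 bits, without the implicit leading 1
};

// Assumption: float uses the IEEE 754 binary32 layout.
inline FloatParts decompose(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // copy the object representation
    return FloatParts{bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

Because the shifts operate on the integer value rather than on individual bytes, the field extraction does not depend on byte order as long as floats and integers share the same endianness on the target.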

User · Answer

For one more example of the actual use of unions, the CORBA framework serializes objects using the tagged union approach. All user-defined classes are members of one (huge) union, and an integer identifier tells the demarshaller how to interpret the union.

User · Answer

In the C language as it was documented in 1974, all structure members shared a common namespace, and the meaning of "ptr->member" was defined as adding the member's displacement to "ptr" and accessing the resulting address using the member's type. This design made it possible to use the same ptr with member names taken from different structure definitions but with the same offset; programmers used that ability for a variety of purposes.

When structure members were assigned their own namespaces, it became impossible to declare two structure members with the same displacement. Adding unions to the language made it possible to achieve the same semantics that had been available in earlier versions of the language (though the inability to have names exported to an enclosing context may have still necessitated using a find/replace to replace foo->member with foo->type1.member). What was important was not so much that the people who added unions had any particular target usage in mind, but rather that they provided a means by which programmers who had relied upon the earlier semantics, for whatever purpose, should still be able to achieve the same semantics even if they had to use a different syntax to do it.

User · Answer

The purpose of unions is rather obvious, but for some reason people miss it quite often.

The purpose of a union is to save memory by using the same memory region for storing different objects at different times. That's it.

It is like a room in a hotel. Different people live in it for non-overlapping periods of time. These people never meet, and generally don't know anything about each other. By properly managing the time-sharing of the rooms (i.e. by making sure different people don't get assigned to one room at the same time), a relatively small hotel can provide accommodations to a relatively large number of people, which is what hotels are for.

That's exactly what a union does. If you know that several objects in your program hold values with non-overlapping value-lifetimes, then you can "merge" these objects into a union and thus save memory. Just like a hotel room has at most one "active" tenant at each moment of time, a union has at most one "active" member at each moment of program time. Only the "active" member can be read. By writing into another member you switch the "active" status to that other member.

For some reason, this original purpose of the union got "overridden" with something completely different: writing one member of a union and then inspecting it through another member. This kind of memory reinterpretation (aka "type punning") is not a valid use of unions. It is described as producing implementation-defined behavior in C89/90.

EDIT: Using unions for the purposes of type punning (i.e. writing one member and then reading another) was given a more detailed definition in one of the Technical Corrigenda to the C99 standard (see DR 257 and DR 283). However, keep in mind that formally this does not protect you from running into undefined behavior by attempting to read a trap representation.
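The hotel-room idea can be sketched as follows. `ParseState`, `RenderState`, `Separate`, `Shared` and `use_in_phases` are hypothetical names chosen for illustration; the sketch only shows that the union is sized for its largest member and that each member is read solely while it is the active one.

```cpp
#include <cstddef>

// Two hypothetical objects whose useful lifetimes never overlap.
struct ParseState  { char buffer[256]; };
struct RenderState { double matrix[16]; };

// Keeping both alive at once costs the sum of their sizes (plus padding).
struct Separate {
    ParseState parse;
    RenderState render;
};

// Sharing one "room": storage for only the larger of the two.
union Shared {
    ParseState parse;
    RenderState render;
};

constexpr std::size_t larger_member =
    sizeof(ParseState) > sizeof(RenderState) ? sizeof(ParseState)
                                             : sizeof(RenderState);

static_assert(sizeof(Shared) >= larger_member,
              "a union is at least as large as its largest member");
static_assert(sizeof(Shared) < sizeof(Separate),
              "the union is smaller than holding both members side by side");

// Each member is written before it is read, and only the member written
// most recently is read.
inline double use_in_phases() {
    Shared s;
    s.parse.buffer[0] = 'x';                        // `parse` is the tenant
    const char first = s.parse.buffer[0];
    s.render.matrix[0] = (first == 'x') ? 1.0 : 0.0;  // `render` moves in
    return s.render.matrix[0];
}
```

Note that `first` is copied out of the union before `render` is written; after that write, `parse` may no longer be read.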

User · Answer

The most common use of union I regularly come across is aliasing.

Consider the following:

    union Vector3f
    {
        struct { float x, y, z; };
        float elts[3];
    };

What does this do? It allows clean, neat access of a Vector3f vec's members by either name:

    vec.x = vec.y = vec.z = 1.f;

or by integer access into the array:

    for (int i = 0; i < 3; i++)
        vec.elts[i] = 1.f;

In some cases, accessing by name is the clearest thing you can do. In other cases, especially when the axis is chosen programmatically, the easier thing to do is to access the axis by numerical index - 0 for x, 1 for y, and 2 for z.
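For comparison, a sketch of one way to get both named and indexed access without the union. This swaps in a different technique (an `operator[]` on a plain struct) rather than reproducing the answer's approach; it sidesteps the anonymous struct, which standard C++ does not provide (compilers commonly accept it as an extension), and it never reads an inactive member.

```cpp
#include <cassert>
#include <cstddef>

struct Vector3f {
    float x, y, z;

    // Index 0 maps to x, 1 to y, 2 to z.
    float& operator[](std::size_t i) {
        assert(i < 3);
        return i == 0 ? x : (i == 1 ? y : z);
    }

    const float& operator[](std::size_t i) const {
        assert(i < 3);
        return i == 0 ? x : (i == 1 ? y : z);
    }
};

// Sets every axis through the indexed interface.
inline void fill(Vector3f& v, float value) {
    for (std::size_t i = 0; i < 3; ++i) {
        v[i] = value;
    }
}
```

The trade-off is a branch per indexed access instead of direct array indexing; whether that matters depends on the compiler and the call site.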

User · Answer

You could use unions to create structs like the following, which contains a field that tells us which component of the union is actually used:

    struct VAROBJECT
    {
        enum o_t { Int, Double, String } objectType;

        union
        {
            int intValue;
            double dblValue;
            char *strValue;
        } value;
    } object;
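A short sketch of how such a tagged struct might be used, so that only the member named by the tag is ever read. `VarObject`, the `make_*` helpers and `describe` are hypothetical names, and the string member is assumed to point at storage that outlives the object.

```cpp
#include <string>

struct VarObject {
    enum Type { Int, Double, String } objectType;

    union {
        int intValue;
        double dblValue;
        const char* strValue;  // assumption: points at storage owned elsewhere
    } value;
};

// Each helper sets the tag and the matching union member together.
inline VarObject make_int(int v) {
    VarObject o;
    o.objectType = VarObject::Int;
    o.value.intValue = v;
    return o;
}

inline VarObject make_double(double v) {
    VarObject o;
    o.objectType = VarObject::Double;
    o.value.dblValue = v;
    return o;
}

inline VarObject make_string(const char* s) {
    VarObject o;
    o.objectType = VarObject::String;
    o.value.strValue = s;
    return o;
}

// Dispatches on the tag and reads only the member the tag names.
inline std::string describe(const VarObject& o) {
    switch (o.objectType) {
        case VarObject::Int:
            return "int: " + std::to_string(o.value.intValue);
        case VarObject::Double:
            return "double: " + std::to_string(o.value.dblValue);
        case VarObject::String:
            return std::string("string: ") + o.value.strValue;
    }
    return "unknown";
}
```

Keeping the tag and the union write inside the same helper is a design choice that makes it harder for the two to drift apart.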

User · Answer

The behaviour may be undefined, but that just means there isn't a "standard". All decent compilers offer #pragmas to control packing and alignment, but may have different defaults. The defaults will also change depending on the optimisation settings used.

Also, unions are not just for saving space. They can help modern compilers with type punning. If you reinterpret_cast<> everything, the compiler can't make assumptions about what you are doing. It may have to throw away what it knows about your type and start again (forcing a write back to memory, which is very inefficient these days compared to CPU clock speed).

User · Answer

In C++, Boost.Variant implements a safe version of the union, designed to prevent undefined behavior as much as possible.

Its performance is identical to the enum + union construct (stack allocated too, etc.), but it uses a template list of types instead of the enum.
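As an illustration of the same idea without an external dependency, the sketch below uses `std::variant`, the C++17 standard-library counterpart, rather than Boost.Variant itself. The `Value` alias and `type_name` function are hypothetical names.

```cpp
#include <string>
#include <variant>

// The list of alternatives plays the role of the enum in an enum + union pair.
using Value = std::variant<int, double, std::string>;

// Reports which alternative is currently held.
inline std::string type_name(const Value& v) {
    if (std::holds_alternative<int>(v)) {
        return "int";
    }
    if (std::holds_alternative<double>(v)) {
        return "double";
    }
    return "string";
}

// Returns the held int, or `fallback` when another alternative is active.
inline int int_or(const Value& v, int fallback) {
    const int* p = std::get_if<int>(&v);
    return p != nullptr ? *p : fallback;
}
```

Unlike a raw union, asking for the wrong alternative with `std::get` throws `std::bad_variant_access` instead of silently reinterpreting the bytes.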

User · Answer

Although this is strictly undefined behaviour, in practice it will work with pretty much any compiler. It is such a widely used paradigm that any self-respecting compiler will need to do "the right thing" in cases such as this. It's certainly to be preferred over type-punning, which may well generate broken code with some compilers.

User · Answer

In C it was a nice way to implement something like a variant.

    enum possibleTypes {
        eInt,
        eDouble,
        eChar
    };

    struct Value {
        /* The union tag is omitted: in C, "struct Value" and "union Value"
           would clash because tags share one namespace. */
        union {
            int iVal;
            double dval;
            char cVal;
        } value;
        enum possibleTypes discriminator;
    };

    struct Value val;

    switch (val.discriminator)
    {
        case eInt:
            val.value.iVal;  /* use the int member */
            break;
        /* ... likewise for eDouble and eChar ... */
    }

In times of little memory, this structure uses less memory than a struct that holds all the members.

By the way, C provides

    typedef struct {
        unsigned int mantissa_low:32;   /* mantissa */
        unsigned int mantissa_high:20;
        unsigned int exponent:11;       /* exponent */
        unsigned int sign:1;
    } realVal;

to access bit values.

User · Answer

As others mentioned, unions combined with enumerations and wrapped into structs can be used to implement tagged unions. One practical use is to implement Rust's Result<T, E>, which is originally implemented using a pure enum (Rust can hold additional data in enumeration variants). Here is a C++ example:

    #include <cstdint>

    // Note: assigning into the union members like this is only suitable
    // when T and E are trivial types.
    template <typename T, typename E> struct Result {
        public:
        enum class Success : std::uint8_t { Ok, Err };
        Result(T val) {
            m_success = Success::Ok;
            m_value.ok = val;
        }
        Result(E val) {
            m_success = Success::Err;
            m_value.err = val;
        }
        // Compares only the Ok/Err state, not the stored values.
        inline bool operator==(const Result& other) {
            return other.m_success == this->m_success;
        }
        inline bool operator!=(const Result& other) {
            return other.m_success != this->m_success;
        }
        inline T expect(const char* errorMsg) {
            if (m_success == Success::Err) throw errorMsg;
            else return m_value.ok;
        }
        inline bool is_ok() {
            return m_success == Success::Ok;
        }
        inline bool is_err() {
            return m_success == Success::Err;
        }
        // Returns a pointer to the stored value, or nullptr on Err.
        inline const T* ok() {
            if (is_ok()) return &m_value.ok;
            else return nullptr;
        }
        // Returns a pointer to the stored error, or nullptr on Ok.
        inline const E* err() {
            if (is_err()) return &m_value.err;
            else return nullptr;
        }

        // Other methods from https://doc.rust-lang.org/std/result/enum.Result.html

        private:
        Success m_success;
        union _val_t { T ok; E err; } m_value;
    };

User · Answer

The behavior is undefined from the language point of view. Consider that different platforms can have different constraints in memory alignment and endianness. The code on a big endian versus a little endian machine will update the values in the struct differently. Fixing the behavior in the language would require all implementations to use the same endianness (and memory alignment constraints...), limiting use.

If you are using C++ (you are using two tags) and you really care about portability, then you can just use the struct and provide a setter that takes the uint32_t and sets the fields appropriately through bitmask operations. The same can be done in C with a function.

Edit: I was expecting AProgrammer to write down an answer to vote for and close this one. As some comments have pointed out, endianness is dealt with in other parts of the standard by letting each implementation decide what to do, and alignment and padding can also be handled differently. Now, the strict aliasing rules that AProgrammer implicitly refers to are an important point here. The compiler is allowed to make assumptions on the modification (or lack of modification) of variables. In the case of the union, the compiler could reorder instructions and move the read of each color component over the write to the colour variable.
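The setter suggested above might look roughly like this. `ARGB` here is a plain struct rather than the question's union, and `unpack_argb` and `pack_argb` are hypothetical helper names; the sketch assumes the 0xAARRGGBB packing implied by the question's example value.

```cpp
#include <cstdint>

// Plain struct: no union, so no inactive member is ever read.
struct ARGB {
    std::uint8_t b;
    std::uint8_t g;
    std::uint8_t r;
    std::uint8_t a;
};

// Splits a packed 0xAARRGGBB value into its components with shifts and masks.
inline ARGB unpack_argb(std::uint32_t colour) {
    ARGB p;
    p.a = static_cast<std::uint8_t>((colour >> 24) & 0xFFu);
    p.r = static_cast<std::uint8_t>((colour >> 16) & 0xFFu);
    p.g = static_cast<std::uint8_t>((colour >> 8) & 0xFFu);
    p.b = static_cast<std::uint8_t>(colour & 0xFFu);
    return p;
}

// Rebuilds the packed value from the components.
inline std::uint32_t pack_argb(const ARGB& p) {
    return (static_cast<std::uint32_t>(p.a) << 24) |
           (static_cast<std::uint32_t>(p.r) << 16) |
           (static_cast<std::uint32_t>(p.g) << 8) |
           static_cast<std::uint32_t>(p.b);
}
```

Because the shifts work on the value of the integer, the result is the same on little-endian and big-endian machines.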

User · Answer

As you say, this is strictly undefined behaviour, though it will "work" on many platforms. The real reason for using unions is to create variant records.

    union A {
        int i;
        double d;
    };

    A a[10];    // records in "a" can be either ints or doubles
    a[0].i = 42;
    a[1].d = 1.23;

Of course, you also need some sort of discriminator to say what the variant actually contains. And note that in C++ (prior to C++11) unions are not much use because they can only contain POD types - effectively those without constructors and destructors.

User · Answer

Technically it's undefined, but in reality most (all?) compilers treat it exactly the same as using a reinterpret_cast from one type to the other, the result of which is implementation defined. I wouldn't lose sleep over your current code.
