How to allocate aligned memory only using the standard library

Question

I just finished a test as part of a job interview, and one question stumped me, even using Google for reference. I'd like to see what the StackOverflow crew can do with it:

The memset_16aligned function requires a 16-byte aligned pointer passed to it, or it will crash.

a) How would you allocate 1024 bytes of memory, and align it to a 16-byte boundary?
b) Free the memory after the memset_16aligned has executed.

    {
        void *mem;
        void *ptr;

        // answer a) here

        memset_16aligned(ptr, 0, 1024);

        // answer b) here
    }

User · Answer

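    uintptr_t add;
    mem = malloc(1024 + 15);
    add = (uintptr_t)mem + 15;   /* step past the next 16-byte boundary */
    add = add - (add % 16);      /* round down: align to 16-byte boundary */
    ptr = (void *)add;           /* stays inside the block, unlike rounding mem itself down */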

User · Answer

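You can also allocate 16 extra bytes and then push the original pointer up to a 16-byte boundary by adding (16 - mod) to it, as below:

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *mem1 = malloc(1024 + 16);
        void *mem = (char *)mem1 + 1;   /* force misalignment (my computer always aligns) */
        printf(" ptr = %p \n", mem);
        void *ptr = (void *)(((uintptr_t)mem + 16) & ~(uintptr_t)0x0F);
        printf(" aligned ptr = %p \n", ptr);

        printf(" ptr after adding diff mod %p (same as above) \n",
               (void *)((uintptr_t)mem1 + (16 - ((uintptr_t)mem1 % 16))));

        free(mem1);
        return 0;
    }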

User · Answer

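    size = 1024;
    alignment = 16;
    aligned_size = size + (alignment - (size % alignment));
    /* This pads the size only; the address is aligned only if malloc
       itself returns 16-byte-aligned memory, which is not guaranteed. */
    mem = malloc(aligned_size);
    memset_16aligned(mem, 0, 1024);
    free(mem);

I hope this is the simplest implementation; let me know your comments.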

User · Answer

For this solution I used padding, which aligns the memory at the cost of a few extra bytes.

If the constraint is that you cannot waste a single byte, there are two ways out. On many platforms every pointer returned by malloc is already 16-byte aligned, although the C standard does not guarantee it. And if C11 is supported, you can just call aligned_alloc(16, size).

    void *mem = malloc(1024 + 16);
    void *ptr = (void *)(((uintptr_t)mem + 16) & ~(uintptr_t)0x0F);
    memset_16aligned(ptr, 0, 1024);
    free(mem);

User · Answer

On the 16 vs 15 byte-count padding front, the actual number you need to add to get an alignment of N is max(0, N-M), where M is the natural alignment of the memory allocator (and both are powers of 2).

Since the minimal memory alignment of any allocator is 1 byte, 15 = max(0, 16-1) is a conservative answer. However, if you know your memory allocator is going to give you 32-bit int aligned addresses (which is fairly common), you could have used 12 as a pad.

This isn't important for this example, but it might be important on an embedded system with 12K of RAM where every single int saved counts.

The best way to implement it, if you're actually going to try to save every byte possible, is as a macro so you can feed it your native memory alignment. Again, this is probably only useful for embedded systems where you need to save every byte.

In the example below, on most systems, the value 1 is just fine for MEMORY_ALLOCATOR_NATIVE_ALIGNMENT; however, for our theoretical embedded system with 32-bit aligned allocations, the following could save a tiny bit of precious memory:

    #define MEMORY_ALLOCATOR_NATIVE_ALIGNMENT    4
    #define ALIGN_PAD2(N,M) (((N)>(M)) ? ((N)-(M)) : 0)
    #define ALIGN_PAD(N) ALIGN_PAD2((N), MEMORY_ALLOCATOR_NATIVE_ALIGNMENT)
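To see why max(0, N-M) is enough padding, here is a small sketch that brute-forces the worst case. The helpers align_up and worst_case_skip are hypothetical names introduced only for this illustration, and the value 4 for the native alignment is the answer's assumed embedded allocator, not a property of any particular system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the allocator returns blocks aligned to at least this many bytes. */
#define MEMORY_ALLOCATOR_NATIVE_ALIGNMENT 4
#define ALIGN_PAD2(N, M) (((N) > (M)) ? ((N) - (M)) : 0)
#define ALIGN_PAD(N) ALIGN_PAD2((N), MEMORY_ALLOCATOR_NATIVE_ALIGNMENT)

/* Hypothetical helper: round addr up to the next multiple of n (a power of two). */
static uintptr_t align_up(uintptr_t addr, uintptr_t n)
{
    return (addr + n - 1) & ~(n - 1);
}

/* Hypothetical check: the most bytes ever skipped when an m-aligned
   address is rounded up to a multiple of n. */
static size_t worst_case_skip(uintptr_t n, uintptr_t m)
{
    size_t worst = 0;
    for (uintptr_t addr = 0; addr < 4 * n; addr += m) {
        size_t skip = (size_t)(align_up(addr, n) - addr);
        if (skip > worst)
            worst = skip;
    }
    return worst;
}
```

With a 4-byte-aligned allocator the worst case for 16-byte alignment is 12 skipped bytes, which matches ALIGN_PAD(16); with a 1-byte-aligned allocator it is the familiar 15.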

User · Answer

Original answer

    {
        void *mem = malloc(1024+16);
        void *ptr = ((char *)mem+16) & ~ 0x0F;
        memset_16aligned(ptr, 0, 1024);
        free(mem);
    }

Fixed answer

    {
        void *mem = malloc(1024+15);
        void *ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);
        memset_16aligned(ptr, 0, 1024);
        free(mem);
    }

Explanation as requested

The first step is to allocate enough spare space, just in case. Since the memory must be 16-byte aligned (meaning that the leading byte address needs to be a multiple of 16), adding 16 extra bytes guarantees that we have enough space. Somewhere in the first 16 bytes, there is a 16-byte aligned pointer. (Note that malloc() is supposed to return a pointer that is sufficiently well aligned for any purpose. However, the meaning of 'any' is primarily for things like basic types: long, double, long double, long long, and pointers to objects and pointers to functions. When you are doing more specialized things, like playing with graphics systems, they can need more stringent alignment than the rest of the system; hence questions and answers like this.)

The next step is to convert the void pointer to a char pointer; GCC notwithstanding, you are not supposed to do pointer arithmetic on void pointers (and GCC has warning options to tell you when you abuse it). Then add 16 to the start pointer. Suppose malloc() returned you an impossibly badly aligned pointer: 0x800001. Adding the 16 gives 0x800011. Now I want to round down to the 16-byte boundary, so I want to reset the last 4 bits to 0. 0x0F has the last 4 bits set to one; therefore, ~0x0F has all bits set to one except the last four. ANDing that with 0x800011 gives 0x800010. You can iterate over the other offsets and see that the same arithmetic works.

The last step, free(), is easy: you always, and only, return to free() a value that one of malloc(), calloc() or realloc() returned to you; anything else is a disaster. You correctly provided mem to hold that value; thank you. The free releases it.

Finally, if you know about the internals of your system's malloc package, you could guess that it might well return 16-byte aligned data (or it might be 8-byte aligned). If it was 16-byte aligned, then you'd not need to dink with the values. However, this is dodgy and non-portable: other malloc packages have different minimum alignments, and therefore assuming one thing when it does something different would lead to core dumps. Within broad limits, this solution is portable.

Someone else mentioned posix_memalign() as another way to get the aligned memory; that isn't available everywhere, but could often be implemented using this as a basis. Note that it was convenient that the alignment was a power of 2; other alignments are messier.

One more comment: this code does not check that the allocation succeeded.

Amendment

Windows Programmer pointed out that you can't do bit mask operations on pointers, and, indeed, GCC (3.4.6 and 4.3.1 tested) does complain like that. So, an amended version of the basic code, converted into a main program, follows. I've also taken the liberty of adding just 15 instead of 16, as has been pointed out. I'm using uintptr_t since C99 has been around long enough to be accessible on most platforms. If it wasn't for the use of PRIXPTR in the printf() statements, it would be sufficient to #include <stdint.h> instead of using #include <inttypes.h>. [This code includes the fix pointed out by C.R., which was reiterating a point first made by Bill K a number of years ago, which I managed to overlook until now.]

    #include <assert.h>
    #include <inttypes.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void memset_16aligned(void *space, char byte, size_t nbytes)
    {
        assert((nbytes & 0x0F) == 0);
        assert(((uintptr_t)space & 0x0F) == 0);
        memset(space, byte, nbytes);  // Not a custom implementation of memset()
    }

    int main(void)
    {
        void *mem = malloc(1024+15);
        void *ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);
        printf("0x%08" PRIXPTR ", 0x%08" PRIXPTR "\n", (uintptr_t)mem, (uintptr_t)ptr);
        memset_16aligned(ptr, 0, 1024);
        free(mem);
        return(0);
    }

And here is a marginally more generalized version, which will work for alignments which are a power of 2:

    #include <assert.h>
    #include <inttypes.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void memset_16aligned(void *space, char byte, size_t nbytes)
    {
        assert((nbytes & 0x0F) == 0);
        assert(((uintptr_t)space & 0x0F) == 0);
        memset(space, byte, nbytes);  // Not a custom implementation of memset()
    }

    static void test_mask(size_t align)
    {
        uintptr_t mask = ~(uintptr_t)(align - 1);
        void *mem = malloc(1024+align-1);
        void *ptr = (void *)(((uintptr_t)mem+align-1) & mask);
        assert((align & (align - 1)) == 0);
        printf("0x%08" PRIXPTR ", 0x%08" PRIXPTR "\n", (uintptr_t)mem, (uintptr_t)ptr);
        memset_16aligned(ptr, 0, 1024);
        free(mem);
    }

    int main(void)
    {
        test_mask(16);
        test_mask(32);
        test_mask(64);
        test_mask(128);
        return(0);
    }

To convert test_mask() into a general purpose allocation function, the single return value from the allocator would have to encode the release address, as several people have indicated in their answers.

Problems with interviewers

Uri commented: "Maybe I am having [a] reading comprehension problem this morning, but if the interview question specifically says: 'How would you allocate 1024 bytes of memory' and you clearly allocate more than that. Wouldn't that be an automatic failure from the interviewer?"

My response won't fit into a 300-character comment...

It depends, I suppose. I think most people (including me) took the question to mean "How would you allocate a space in which 1024 bytes of data can be stored, and where the base address is a multiple of 16 bytes". If the interviewer really meant how can you allocate 1024 bytes (only) and have it 16-byte aligned, then the options are more limited.

Clearly, one possibility is to allocate 1024 bytes and then give that address the 'alignment treatment'; the problem with that approach is that the actual available space is not properly determinate (the usable space is between 1008 and 1024 bytes, but there wasn't a mechanism available to specify which size), which renders it less than useful.

Another possibility is that you are expected to write a full memory allocator and ensure that the 1024-byte block you return is appropriately aligned. If that is the case, you probably end up doing an operation fairly similar to what the proposed solution did, but you hide it inside the allocator.

However, if the interviewer expected either of those responses, I'd expect them to recognize that this solution answers a closely related question, and then to reframe their question to point the conversation in the correct direction. (Further, if the interviewer got really stroppy, then I wouldn't want the job; if the answer to an insufficiently precise requirement is shot down in flames without correction, then the interviewer is not someone for whom it is safe to work.)

The world moves on

The title of the question has changed recently. It was "Solve the memory alignment in C interview question that stumped me". The revised title ("How to allocate aligned memory only using the standard library?") demands a slightly revised answer; this addendum provides it.

C11 (ISO/IEC 9899:2011) added function aligned_alloc():

    7.22.3.1 The aligned_alloc function

    Synopsis

        #include <stdlib.h>
        void *aligned_alloc(size_t alignment, size_t size);

    Description
    The aligned_alloc function allocates space for an object whose alignment is specified by alignment, whose size is specified by size, and whose value is indeterminate. The value of alignment shall be a valid alignment supported by the implementation and the value of size shall be an integral multiple of alignment.

    Returns
    The aligned_alloc function returns either a null pointer or a pointer to the allocated space.

And POSIX defines posix_memalign():

        #include <stdlib.h>
        int posix_memalign(void **memptr, size_t alignment, size_t size);

    DESCRIPTION
    The posix_memalign() function shall allocate size bytes aligned on a boundary specified by alignment, and shall return a pointer to the allocated memory in memptr. The value of alignment shall be a power of two multiple of sizeof(void *).

    Upon successful completion, the value pointed to by memptr shall be a multiple of alignment.

    If the size of the space requested is 0, the behavior is implementation-defined; the value returned in memptr shall be either a null pointer or a unique pointer.

    The free() function shall deallocate memory that has previously been allocated by posix_memalign().

    RETURN VALUE
    Upon successful completion, posix_memalign() shall return zero; otherwise, an error number shall be returned to indicate the error.

Either or both of these could be used to answer the question now, but only the POSIX function was an option when the question was originally answered.

Behind the scenes, the new aligned memory functions do much the same job as outlined in the question, except they have the ability to force the alignment more easily, and keep track of the start of the aligned memory internally so that the code doesn't have to deal with it specially; it just frees the memory returned by the allocation function that was used.
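As a rough illustration of the C11 route, the sketch below answers parts (a) and (b) with aligned_alloc() alone. The function name alloc_1024_aligned16 is made up for this example, the memset_16aligned shown is only a stand-in for the interview's routine, and the code assumes a C11 library that actually provides aligned_alloc().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the interview's routine, assumed to need a 16-byte-aligned pointer. */
static void memset_16aligned(void *space, int byte, size_t nbytes)
{
    assert(((uintptr_t)space & 0x0F) == 0);
    memset(space, byte, nbytes);
}

/* Hypothetical helper: returns a zeroed, 16-byte-aligned 1024-byte block,
   or NULL if the allocation failed. The caller releases it with free(). */
static void *alloc_1024_aligned16(void)
{
    /* 1024 is a multiple of 16, as C11 requires of the size argument. */
    void *ptr = aligned_alloc(16, 1024);
    if (ptr == NULL)
        return NULL;
    memset_16aligned(ptr, 0, 1024);
    return ptr;
}
```

Because the library tracks the real start of the block, the pointer returned here is the same one handed to free(); there is no second pointer to keep.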

User · Answer

Here's an alternate approach to the 'round up' part. It's not the most brilliantly coded solution, but it gets the job done, and this type of syntax is a bit easier to remember (plus it would work for alignment values that aren't a power of 2). The uintptr_t cast was necessary to appease the compiler; pointer arithmetic isn't very fond of division or multiplication.

    void *mem = malloc(1024 + 15);
    void *ptr = (void *)(((uintptr_t)mem + 15) / 16 * 16);   /* divide and multiply the integer, then cast back */
    memset_16aligned(ptr, 0, 1024);
    free(mem);
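The divide-then-multiply rounding can be pulled out into a tiny function to check that it behaves for alignments that are not powers of two. round_up_to_multiple is a hypothetical name used only for this sketch; the caller would still need to allocate size + align - 1 bytes and keep the original pointer for free().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to the next multiple of align.
   Works for any non-zero align, not only powers of two. */
static uintptr_t round_up_to_multiple(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) / align * align;
}
```

For instance, an address of 0x800001 rounds up to 0x800010 for a 16-byte alignment, and 25 rounds up to 36 for an alignment of 12, where a bit mask would not work.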

User · Answer

Unfortunately, in C99 it seems pretty tough to guarantee alignment of any sort in a way which would be portable across any C implementation conforming to C99. Why? Because a pointer is not guaranteed to be the "byte address" one might imagine with a flat memory model. Neither is the representation of uintptr_t so guaranteed, which itself is an optional type anyway.

We might know of some implementations which use a representation for void * (and by definition, also char *) which is a simple byte address, but by C99 it is opaque to us, the programmers. An implementation might represent a pointer by a set {segment, offset} where offset could have who-knows-what alignment "in reality." Why, a pointer could even be some form of hash table lookup value, or even a linked-list lookup value. It could encode bounds information.

In a recent C1X draft for a C Standard, we see the _Alignas keyword. That might help a bit.

The only guarantee C99 gives us is that the memory allocation functions will return a pointer suitable for assignment to a pointer pointing at any object type. Since we cannot specify the alignment of objects, we cannot implement our own allocation functions with responsibility for alignment in a well-defined, portable manner.

It would be good to be wrong about this claim.
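For what it is worth, the _Alignas keyword did make it into C11, along with the alignas and alignof macros in <stdalign.h>. The sketch below shows it applied to a static buffer rather than a heap block; aligned_buffer and is_16_aligned are names invented for this example. The check still converts the pointer to uintptr_t, so it relies on exactly the flat-address assumption the answer warns about.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* C11: ask the implementation for 16-byte alignment of a static buffer. */
static alignas(16) unsigned char aligned_buffer[1024];

/* Hypothetical check: nonzero if p, converted to an integer, is a multiple
   of 16. The result of the conversion is implementation-defined. */
static int is_16_aligned(const void *p)
{
    return ((uintptr_t)p & 0x0F) == 0;
}
```

Here the alignment guarantee comes from the language itself, so no padding arithmetic is needed; the trade-off is that the buffer's size and lifetime are fixed at compile time.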

User · Answer

Perhaps they would have been satisfied with a knowledge of memalign? And as Jonathan Leffler points out, there are two newer preferable functions to know about.

Oops, florin beat me to it. However, if you read the man page I linked to, you'll most likely understand the example supplied by an earlier poster.

User · Answer

The first thing that popped into my head when reading this question was to define an aligned struct, instantiate it, and then point to it. Is there a fundamental reason I'm missing, since no one else suggested this?

As a sidenote, since I used an array of char (assuming the system's char is 8 bits, i.e. 1 byte), I don't see the need for the __attribute__((packed)) necessarily (correct me if I'm wrong), but I put it in anyway.

This works on two systems I tried it on, but it's possible that there is a compiler optimization that I'm unaware of giving me false positives vis-a-vis the efficacy of the code. I used gcc 4.9.2 on OSX and gcc 5.2.1 on Ubuntu.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *mem;
        void *ptr;

        // answer a) here
        // Note: packed removes padding; it does not request 16-byte alignment,
        // so the check below reports whatever malloc happened to return.
        struct __attribute__((packed)) s_CozyMem {
            char acSpace[16];
        };

        mem = malloc(sizeof(struct s_CozyMem));
        ptr = mem;

        // memset_16aligned(ptr, 0, 1024);

        // Check if it's aligned
        if (((unsigned long)ptr & 15) == 0) printf("Aligned to 16 bytes.\n");
        else printf("Rubbish.\n");

        // answer b) here
        free(mem);

        return 0;
    }

User · Answer

Several slightly different answers, depending how you look at the question:

1) Good enough for the exact question asked is Jonathan Leffler's solution, except that to round up to 16-aligned, you only need 15 extra bytes, not 16.

A:

    /* allocate a buffer with room to add 0-15 bytes to ensure 16-alignment */
    void *mem = malloc(1024+15);
    ASSERT(mem); // some kind of error-handling code
    /* round up to multiple of 16: add 15 and then round down by masking */
    void *ptr = (void *)(((uintptr_t)mem+15) & ~(uintptr_t)0x0F);

B:

    free(mem);

2) For a more generic memory allocation function, the caller doesn't want to have to keep track of two pointers (one to use and one to free). So you store a pointer to the 'real' buffer below the aligned buffer.

A:

    void *mem = malloc(1024+15+sizeof(void*));
    if (!mem) return mem;
    void *ptr = (void *)(((uintptr_t)mem+sizeof(void*)+15) & ~(uintptr_t)0x0F);
    ((void**)ptr)[-1] = mem;
    return ptr;

B:

    if (ptr) free(((void**)ptr)[-1]);

Note that unlike (1), where only 15 bytes were added to mem, this code could actually reduce the alignment if your implementation happens to guarantee 32-byte alignment from malloc (unlikely, but in theory a C implementation could have a 32-byte aligned type). That doesn't matter if all you do is call memset_16aligned, but if you use the memory for a struct then it could matter.

I'm not sure off-hand what a good fix is for this (other than to warn the user that the buffer returned is not necessarily suitable for arbitrary structs), since there's no way to determine programmatically what the implementation-specific alignment guarantee is. I guess at startup you could allocate two or more 1-byte buffers, and assume that the worst alignment you see is the guaranteed alignment. If you're wrong, you waste memory. Anyone with a better idea, please say so...

[Added: The 'standard' trick is to create a union of 'likely to be maximally aligned types' to determine the requisite alignment. The maximally aligned types are likely to be (in C99) 'long long', 'long double', 'void *', or 'void (*)(void)'; if you include <stdint.h>, you could presumably use 'intmax_t' in place of long long (and, on Power 6 (AIX) machines, intmax_t would give you a 128-bit integer type). The alignment requirements for that union can be determined by embedding it into a struct with a single char followed by the union:

    struct alignment
    {
        char     c;
        union
        {
            intmax_t      imax;
            long double   ldbl;
            void         *vptr;
            void        (*fptr)(void);
        }        u;
    } align_data;
    size_t align = (char *)&align_data.u.imax - &align_data.c;

You would then use the larger of the requested alignment (in the example, 16) and the align value calculated above.

On (64-bit) Solaris 10, it appears that the basic alignment for the result from malloc() is a multiple of 32 bytes.]

In practice, aligned allocators often take a parameter for the alignment rather than it being hardwired. So the user will pass in the size of the struct they care about (or the least power of 2 greater than or equal to that) and all will be well.

3) Use what your platform provides: posix_memalign for POSIX, _aligned_malloc on Windows.

4) If you use C11, then the cleanest option, both portable and concise, is to use the standard library function aligned_alloc that was introduced in this version of the language specification.
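Putting option (2) together with the remark that real allocators take the alignment as a parameter, a minimal sketch might look like the following. The names aligned_malloc_sketch and aligned_free_sketch are hypothetical, the alignment is assumed to be a power of two, and overflow of the size computation is not checked.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns size bytes aligned to align (a power of two),
   or NULL on failure. Release the result with aligned_free_sketch() only. */
static void *aligned_malloc_sketch(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* Room for the payload, worst-case padding, and the stashed pointer. */
    void *mem = malloc(size + (align - 1) + sizeof(void *));
    if (mem == NULL)
        return NULL;
    uintptr_t start = (uintptr_t)mem + sizeof(void *);
    void *ptr = (void *)((start + align - 1) & ~(uintptr_t)(align - 1));
    ((void **)ptr)[-1] = mem;   /* remember what to hand back to free() */
    return ptr;
}

static void aligned_free_sketch(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);   /* recover the pointer malloc returned */
}
```

The caller keeps a single pointer, at the cost of sizeof(void *) extra bytes per block; passing the result to plain free() would be an error, which is the main hazard of this design.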

User · Answer

We do this sort of thing all the time for Accelerate.framework, a heavily vectorized OS X / iOS library, where we have to pay attention to alignment all the time. There are quite a few options, one or two of which I didn't see mentioned above.

The fastest method for a small array like this is just stick it on the stack. With GCC / clang:

    void my_func( void )
    {
        uint8_t array[1024] __attribute__ ((aligned(16)));
        /* ... */
    }

No free() required. This is typically two instructions: subtract 1024 from the stack pointer, then AND the stack pointer with -alignment. Presumably the requester needed the data on the heap because the lifespan of the array exceeded the stack, or recursion is at work, or stack space is at a serious premium.

On OS X / iOS all calls to malloc/calloc/etc. are always 16 byte aligned. If you needed 32 byte aligned for AVX, for example, then you can use posix_memalign:

    void *buf = NULL;
    int err = posix_memalign( &buf, 32 /*alignment*/, 1024 /*size*/ );
    if( err )
        RunInCirclesWaivingArmsWildly();
    /* ... */
    free(buf);

Some folks have mentioned the C++ interface that works similarly.

It should not be forgotten that pages are aligned to large powers of two, so page-aligned buffers are also 16 byte aligned. Thus, mmap() and valloc() and other similar interfaces are also options. mmap() has the advantage that the buffer can be allocated preinitialized with something non-zero in it, if you want. Since these have page aligned size, you will not get the minimum allocation from these, and it will likely be subject to a VM fault the first time you touch it.

Cheesy: Turn on guard malloc or similar. Buffers that are n*16 bytes in size such as this one will be n*16 bytes aligned, because VM is used to catch overruns and its boundaries are at page boundaries.

Some Accelerate.framework functions take in a user supplied temp buffer to use as scratch space. Here we have to assume that the buffer passed to us is wildly misaligned and the user is actively trying to make our life hard out of spite. (Our test cases stick a guard page right before and after the temp buffer to underline the spite.) Here, we return the minimum size we need to guarantee a 16-byte aligned segment somewhere in it, and then manually align the buffer afterward. This size is desired_size + alignment - 1. So, in this case that is 1024 + 16 - 1 = 1039 bytes. Then align as so:

    #include <stdint.h>
    void My_func( uint8_t *tempBuf /* , ... */ )
    {
        uint8_t *alignedBuf = (uint8_t*)
                              (((uintptr_t) tempBuf + ((uintptr_t)alignment-1))
                                            & -((uintptr_t) alignment));
        /* ... */
    }

Adding alignment-1 will move the pointer past the first aligned address and then ANDing with -alignment (e.g. 0xfff...ff0 for alignment=16) brings it back to the aligned address.

As described by other posts, on other operating systems without 16-byte alignment guarantees, you can call malloc with the larger size, set aside the pointer for free() later, then align as described immediately above and use the aligned pointer, much as described for our temp buffer case.

As for aligned_memset, this is rather silly. You only have to loop in up to 15 bytes to reach an aligned address, and then proceed with aligned stores after that with some possible cleanup code at the end. You can even do the cleanup bits in vector code, either as unaligned stores that overlap the aligned region (providing the length is at least the length of a vector) or using something like maskmovdqu. Someone is just being lazy. However, it is probably a reasonable interview question if the interviewer wants to know whether you are comfortable with stdint.h, bitwise operators and memory fundamentals, so the contrived example can be forgiven.
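The temp buffer case above can be sketched as a pair of small helpers. The names align_up and scratch_size are invented for this example, and the code assumes (as the answer does) that converting a pointer to uintptr_t yields a flat byte address, which the C standard does not strictly promise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round p up to the next multiple of alignment (must be a power of two). */
static uint8_t *align_up(uint8_t *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    uintptr_t addr = (uintptr_t)p;
    uintptr_t mask = (uintptr_t)alignment - 1;

    /* Add the padding to the original pointer rather than casting back. */
    return p + (((addr + mask) & ~mask) - addr);
}

/* Scratch bytes a caller must supply to guarantee `size` aligned bytes. */
static size_t scratch_size(size_t size, size_t alignment)
{
    return size + alignment - 1;
}
```

A caller would allocate scratch_size(1024, 16) bytes (1039), pass the buffer in, and the callee would use align_up(tempBuf, 16) as the start of its 1024 aligned bytes.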

User · Answer

I'm surprised no one's voted up Shao's answer that, as I understand it, it is impossible to do what's asked in standard C99, since converting a pointer to an integral type formally gives an implementation-defined result (and undefined behavior if the value does not fit). (Apart from the standard allowing conversion of uintptr_t <-> void *, but the standard does not seem to allow doing any manipulations of the uintptr_t value and then converting it back.)

User · Answer

Three slightly different answers depending how you look at the question:

1. Good enough for the exact question asked is Jonathan Leffler's solution, except that to round up to 16-aligned, you only need 15 extra bytes, not 16.

A:

    /* allocate a buffer with room to add 0-15 bytes to ensure 16-alignment */
    void *mem = malloc(1024+15);
    ASSERT(mem); // some kind of error-handling code
    /* round up to multiple of 16: add 15 and then round down by masking */
    void *ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);

B:

    free(mem);

2. For a more generic memory allocation function, the caller doesn't want to have to keep track of two pointers (one to use and one to free). So you store a pointer to the "real" buffer below the aligned buffer.

A:

    void *mem = malloc(1024+15+sizeof(void*));
    if (!mem) return mem;
    void *ptr = (void *)(((uintptr_t)mem+sizeof(void*)+15) & ~ (uintptr_t)0x0F);
    ((void**)ptr)[-1] = mem;
    return ptr;

B:

    if (ptr) free(((void**)ptr)[-1]);

Note that unlike (1), where only 15 bytes were added to mem, this code could actually reduce the alignment if your implementation happens to guarantee 32-byte alignment from malloc (unlikely, but in theory a C implementation could have a 32-byte aligned type). That doesn't matter if all you do is call memset_16aligned, but if you use the memory for a struct then it could matter.

I'm not sure off-hand what a good fix is for this (other than to warn the user that the buffer returned is not necessarily suitable for arbitrary structs), since there's no way to determine programmatically what the implementation-specific alignment guarantee is. I guess at startup you could allocate two or more 1-byte buffers, and assume that the worst alignment you see is the guaranteed alignment. If you're wrong, you waste memory. Anyone with a better idea, please say so.

[Added: The "standard" trick is to create a union of "likely to be maximally aligned types" to determine the requisite alignment. The maximally aligned types are likely to be (in C99) long long, long double, void *, or void (*)(void); if you include <stdint.h>, you could presumably use intmax_t in place of long long (and, on Power 6 (AIX) machines, intmax_t would give you a 128-bit integer type). The alignment requirements for that union can be determined by embedding it into a struct with a single char followed by the union:

    struct alignment
    {
        char     c;
        union
        {
            intmax_t      imax;
            long double   ldbl;
            void         *vptr;
            void        (*fptr)(void);
        }        u;
    } align_data;
    size_t align = (char *)&align_data.u.imax - &align_data.c;

You would then use the larger of the requested alignment (in the example, 16) and the align value calculated above. On (64-bit) Solaris 10, it appears that the basic alignment for the result from malloc() is a multiple of 32 bytes.]

In practice, aligned allocators often take a parameter for the alignment rather than it being hardwired. So the user will pass in the size of the struct they care about (or the least power of 2 greater than or equal to that) and all will be well.

3. Use what your platform provides: posix_memalign for POSIX, _aligned_malloc on Windows.

4. If you use C11, then the cleanest (portable and concise) option is to use the standard library function aligned_alloc that was introduced in this version of the language specification.
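Option (2) with the alignment passed as a parameter might look roughly like the sketch below. The function names aligned_malloc_sketch and aligned_free_sketch are hypothetical, and the code assumes that sizeof(void *) is a power of two and that uintptr_t values behave like flat byte addresses; neither is guaranteed by the C standard.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: alignment must be a power of two. */
static void *aligned_malloc_sketch(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Keep the slot for the saved pointer itself suitably aligned. */
    if (alignment < sizeof(void *))
        alignment = sizeof(void *);

    /* Worst-case padding plus room for the pointer malloc() returned. */
    size_t extra = alignment - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra)
        return NULL;

    unsigned char *raw = malloc(size + extra);
    if (raw == NULL)
        return NULL;

    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~((uintptr_t)alignment - 1);
    unsigned char *user = raw + (aligned - (uintptr_t)raw);

    ((void **)user)[-1] = raw;   /* stored just below the aligned block */
    return user;
}

static void aligned_free_sketch(void *ptr)
{
    if (ptr != NULL)
        free(((void **)ptr)[-1]);
}
```

The caller then only tracks one pointer: p = aligned_malloc_sketch(1024, 16), use it, and later aligned_free_sketch(p). Passing such a pointer to plain free() would be a bug.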

User · Answer

Here's an alternate approach to the "round up" part. Not the most brilliantly coded solution but it gets the job done, and this type of syntax is a bit easier to remember (plus would work for alignment values that aren't a power of 2). The uintptr_t cast was necessary to appease the compiler; pointer arithmetic isn't very fond of division or multiplication.

    void *mem = malloc(1024 + 15);
    void *ptr = (void *)(((uintptr_t)mem + 15) / 16 * 16);
    memset_16aligned(ptr, 0, 1024);
    free(mem);
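To see why the divide-and-multiply form also copes with alignments that are not powers of two, here is the arithmetic pulled out into a small function. The name round_up_to_multiple is made up for this sketch, and it works on plain integers so the behaviour is easy to check by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of alignment. The alignment may be any
   nonzero value, not just a power of two. */
static uintptr_t round_up_to_multiple(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0);
    return (addr + alignment - 1) / alignment * alignment;
}
```

For example, 0x800001 rounded up to a multiple of 16 gives 0x800010, and 100 rounded up to a multiple of 24 gives 120. The mask version cannot do the second case.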

User · Answer

You could also try posix_memalign() (on POSIX platforms, of course).
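A minimal sketch of what that could look like, assuming a POSIX system. The wrapper name alloc_aligned_posix is invented here; posix_memalign itself reports failure through its return value rather than errno.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns `size` bytes aligned to `alignment`, or NULL on failure.
   alignment must be a power of two and a multiple of sizeof(void *). */
static void *alloc_aligned_posix(size_t alignment, size_t size)
{
    void *p = NULL;
    int err = posix_memalign(&p, alignment, size);

    return err == 0 ? p : NULL;
}
```

For the question as asked: ptr = alloc_aligned_posix(16, 1024), then memset_16aligned(ptr, 0, 1024), then free(ptr). The pointer goes straight to free(), so there is no second pointer to keep around.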

User · Answer

Unfortunately, in C99 it seems pretty tough to guarantee alignment of any sort in a way which would be portable across any C implementation conforming to C99. Why? Because a pointer is not guaranteed to be the "byte address" one might imagine with a flat memory model. Neither is the representation of uintptr_t so guaranteed, which itself is an optional type anyway.

We might know of some implementations which use a representation for void * (and by definition, also char *) which is a simple byte address, but by C99 it is opaque to us, the programmers. An implementation might represent a pointer by a set {segment, offset} where offset could have who-knows-what alignment "in reality." Why, a pointer could even be some form of hash table lookup value, or even a linked-list lookup value. It could encode bounds information.

In a recent C1X draft for a C Standard, we see the _Alignas keyword. That might help a bit.

The only guarantee C99 gives us is that the memory allocation functions will return a pointer suitable for assignment to a pointer pointing at any object type. Since we cannot specify the alignment of objects, we cannot implement our own allocation functions with responsibility for alignment in a well-defined, portable manner.

It would be good to be wrong about this claim.
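As a small illustration of the _Alignas keyword mentioned above, here is how it ended up looking in C11 (via the alignas macro from <stdalign.h>). This only covers objects the compiler lays out itself, not memory from malloc(), and the buffer name is made up for the example.

```c
#include <assert.h>
#include <stdalign.h>  /* alignas (C11) */
#include <stdint.h>

/* A statically allocated 1024-byte buffer that the compiler is asked to
   place on a 16-byte boundary. */
static alignas(16) unsigned char aligned_buffer[1024];
```

Checking the result with ((uintptr_t)aligned_buffer & 15) == 0 still leans on the implementation-defined pointer-to-integer conversion discussed in this answer, so it is a sanity check on common platforms rather than a portable proof.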

User · Answer

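If there are constraints that you cannot waste a single byte, then this solution works. Note: There is a case where this may be executed infinitely :D (after free(), the next malloc() of the same size will often hand back the very same block).

    void *mem;
    void *ptr;
    try:
        mem = malloc(1024);
        if ((uintptr_t)mem % 16 != 0) {
            free(mem);
            goto try;
        }
        ptr = mem;
        memset_16aligned(ptr, 0, 1024);
        free(mem);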

User · Answer

Using memalign (Aligned-Memory-Blocks) might be a good solution for the problem.

User · Answer

Perhaps they would have been satisfied with a knowledge of memalign? And as Jonathan Leffler points out, there are two newer preferable functions to know about.

Oops, florin beat me to it. However, if you read the man page I linked to, you'll most likely understand the example supplied by an earlier poster.

User · Answer

Original answer

    {
        void *mem = malloc(1024+16);
        void *ptr = ((char *)mem+16) & ~ 0x0F;
        memset_16aligned(ptr, 0, 1024);
        free(mem);
    }

Fixed answer

    {
        void *mem = malloc(1024+15);
        void *ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);
        memset_16aligned(ptr, 0, 1024);
        free(mem);
    }

Explanation as requested

The first step is to allocate enough spare space, just in case. Since the memory must be 16-byte aligned (meaning that the leading byte address needs to be a multiple of 16), adding 16 extra bytes guarantees that we have enough space. Somewhere in the first 16 bytes, there is a 16-byte aligned pointer. (Note that malloc() is supposed to return a pointer that is sufficiently well aligned for any purpose. However, the meaning of "any" is primarily for things like basic types: long, double, long double, long long, and pointers to objects and pointers to functions. When you are doing more specialized things, like playing with graphics systems, they can need more stringent alignment than the rest of the system; hence questions and answers like this.)

The next step is to convert the void pointer to a char pointer; GCC notwithstanding, you are not supposed to do pointer arithmetic on void pointers (and GCC has warning options to tell you when you abuse it). Then add 16 to the start pointer. Suppose malloc() returned you an impossibly badly aligned pointer: 0x800001. Adding the 16 gives 0x800011. Now I want to round down to the 16-byte boundary, so I want to reset the last 4 bits to 0. 0x0F has the last 4 bits set to one; therefore, ~0x0F has all bits set to one except the last four. Anding that with 0x800011 gives 0x800010. You can iterate over the other offsets and see that the same arithmetic works.

The last step, free(), is easy: you always, and only, return to free() a value that one of malloc(), calloc() or realloc() returned to you; anything else is a disaster. You correctly provided mem to hold that value, thank you. The free releases it.

Finally, if you know about the internals of your system's malloc package, you could guess that it might well return 16-byte aligned data (or it might be 8-byte aligned). If it was 16-byte aligned, then you'd not need to dink with the values. However, this is dodgy and non-portable; other malloc packages have different minimum alignments, and therefore assuming one thing when it does something different would lead to core dumps. Within broad limits, this solution is portable.

Someone else mentioned posix_memalign() as another way to get the aligned memory; that isn't available everywhere, but could often be implemented using this as a basis. Note that it was convenient that the alignment was a power of 2; other alignments are messier.

One more comment: this code does not check that the allocation succeeded.

Amendment

Windows Programmer pointed out that you can't do bit mask operations on pointers, and, indeed, GCC (3.4.6 and 4.3.1 tested) does complain like that. So, an amended version of the basic code, converted into a main program, follows. I've also taken the liberty of adding just 15 instead of 16, as has been pointed out. I'm using uintptr_t since C99 has been around long enough to be accessible on most platforms. If it wasn't for the use of PRIXPTR in the printf() statements, it would be sufficient to #include <stdint.h> instead of using #include <inttypes.h>. [This code includes the fix pointed out by C.R., which was reiterating a point first made by Bill K a number of years ago, which I managed to overlook until now.]

    #include <assert.h>
    #include <inttypes.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void memset_16aligned(void *space, char byte, size_t nbytes)
    {
        assert((nbytes & 0x0F) == 0);
        assert(((uintptr_t)space & 0x0F) == 0);
        memset(space, byte, nbytes);  // Not a custom implementation of memset()
    }

    int main(void)
    {
        void *mem = malloc(1024+15);
        void *ptr = (void *)(((uintptr_t)mem+15) & ~ (uintptr_t)0x0F);
        printf("0x%08" PRIXPTR ", 0x%08" PRIXPTR "\n", (uintptr_t)mem, (uintptr_t)ptr);
        memset_16aligned(ptr, 0, 1024);
        free(mem);
        return(0);
    }

And here is a marginally more generalized version, which will work for alignments which are a power of 2:

    #include <assert.h>
    #include <inttypes.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void memset_16aligned(void *space, char byte, size_t nbytes)
    {
        assert((nbytes & 0x0F) == 0);
        assert(((uintptr_t)space & 0x0F) == 0);
        memset(space, byte, nbytes);  // Not a custom implementation of memset()
    }

    static void test_mask(size_t align)
    {
        uintptr_t mask = ~(uintptr_t)(align - 1);
        void *mem = malloc(1024+align-1);
        void *ptr = (void *)(((uintptr_t)mem+align-1) & mask);
        assert((align & (align - 1)) == 0);
        printf("0x%08" PRIXPTR ", 0x%08" PRIXPTR "\n", (uintptr_t)mem, (uintptr_t)ptr);
        memset_16aligned(ptr, 0, 1024);
        free(mem);
    }

    int main(void)
    {
        test_mask(16);
        test_mask(32);
        test_mask(64);
        test_mask(128);
        return(0);
    }

To convert test_mask() into a general purpose allocation function, the single return value from the allocator would have to encode the release address, as several people have indicated in their answers.

Problems with interviewers

Uri commented: Maybe I am having [a] reading comprehension problem this morning, but if the interview question specifically says: "How would you allocate 1024 bytes of memory" and you clearly allocate more than that. Wouldn't that be an automatic failure from the interviewer?

My response won't fit into a 300-character comment...

It depends, I suppose. I think most people (including me) took the question to mean "How would you allocate a space in which 1024 bytes of data can be stored, and where the base address is a multiple of 16 bytes". If the interviewer really meant how can you allocate 1024 bytes (only) and have it 16-byte aligned, then the options are more limited.

Clearly, one possibility is to allocate 1024 bytes and then give that address the "alignment treatment"; the problem with that approach is that the actual available space is not properly determinate (the usable space is between 1008 and 1024 bytes, but there wasn't a mechanism available to specify which size), which renders it less than useful.

Another possibility is that you are expected to write a full memory allocator and ensure that the 1024-byte block you return is appropriately aligned. If that is the case, you probably end up doing an operation fairly similar to what the proposed solution did, but you hide it inside the allocator.

However, if the interviewer expected either of those responses, I'd expect them to recognize that this solution answers a closely related question, and then to reframe their question to point the conversation in the correct direction. (Further, if the interviewer got really stroppy, then I wouldn't want the job; if the answer to an insufficiently precise requirement is shot down in flames without correction, then the interviewer is not someone for whom it is safe to work.)

The world moves on

The title of the question has changed recently. It was "Solve the memory alignment in C interview question that stumped me". The revised title ("How to allocate aligned memory only using the standard library?") demands a slightly revised answer; this addendum provides it.

C11 (ISO/IEC 9899:2011) added function aligned_alloc():

    7.22.3.1 The aligned_alloc function

    Synopsis

        #include <stdlib.h>
        void *aligned_alloc(size_t alignment, size_t size);

    Description
    The aligned_alloc function allocates space for an object whose alignment is specified by alignment, whose size is specified by size, and whose value is indeterminate. The value of alignment shall be a valid alignment supported by the implementation and the value of size shall be an integral multiple of alignment.

    Returns
    The aligned_alloc function returns either a null pointer or a pointer to the allocated space.

And POSIX defines posix_memalign():

        #include <stdlib.h>
        int posix_memalign(void **memptr, size_t alignment, size_t size);

    DESCRIPTION
    The posix_memalign() function shall allocate size bytes aligned on a boundary specified by alignment, and shall return a pointer to the allocated memory in memptr. The value of alignment shall be a power of two multiple of sizeof(void *).

    Upon successful completion, the value pointed to by memptr shall be a multiple of alignment.

    If the size of the space requested is 0, the behavior is implementation-defined; the value returned in memptr shall be either a null pointer or a unique pointer.

    The free() function shall deallocate memory that has previously been allocated by posix_memalign().

    RETURN VALUE
    Upon successful completion, posix_memalign() shall return zero; otherwise, an error number shall be returned to indicate the error.

Either or both of these could be used to answer the question now, but only the POSIX function was an option when the question was originally answered.

Behind the scenes, the new aligned memory functions do much the same job as outlined in the question, except they have the ability to force the alignment more easily, and keep track of the start of the aligned memory internally so that the code doesn't have to deal with it specially; it just frees the memory returned by the allocation function that was used.
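With a C11 library, the whole exercise shrinks to something like the sketch below. The wrapper name alloc_zeroed_aligned is invented for this example, plain memset() stands in for memset_16aligned(), and the rounding is there only because C11 asks for size to be an integral multiple of alignment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Zero-filled block of at least `size` bytes aligned to `alignment`,
   or NULL if the allocation fails. */
static void *alloc_zeroed_aligned(size_t alignment, size_t size)
{
    assert(alignment != 0);

    /* Round size up to a multiple of alignment, as C11 requires. */
    size_t rounded = (size + alignment - 1) / alignment * alignment;
    void *p = aligned_alloc(alignment, rounded);

    if (p != NULL)
        memset(p, 0, rounded);
    return p;
}
```

Usage for the original question would be ptr = alloc_zeroed_aligned(16, 1024) followed later by free(ptr); the same pointer is used for both, which is the main practical gain over the malloc-plus-mask approach.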

User · Answer

You could also try posix_memalign() (on POSIX platforms, of course).
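
For what it's worth, a minimal sketch of that call might look like the following. The wrapper name alloc_aligned_posix is invented for the example, and memset() again stands in for the question's memset_16aligned(). Note that posix_memalign() reports failure through its return value, not through errno.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: on success stores a zero-filled, aligned block in
 * *out and returns 0; on failure stores NULL and returns the error number
 * from posix_memalign(). `alignment` must be a power of two multiple of
 * sizeof(void *). The block is released with plain free(). */
int alloc_aligned_posix(void **out, size_t alignment, size_t size)
{
    assert(out != NULL);
    int err = posix_memalign(out, alignment, size);
    if (err != 0) {
        *out = NULL;            /* contents of *out are unspecified on failure */
        return err;
    }
    memset(*out, 0, size);      /* stand-in for memset_16aligned() */
    return 0;
}
```

Unlike aligned_alloc(), this interface does not require the size to be a multiple of the alignment.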

User · Answer

MacOS X specific:

1. All pointers allocated with malloc are 16-byte aligned.
2. C11 is supported, so you can just call aligned_alloc(16, size).
3. MacOS X picks code that is optimised for individual processors at boot time for memset, memcpy and memmove, and that code uses tricks that you've never heard of to make it fast. There is a 99% chance that memset runs faster than any hand-written memset16, which makes the whole question pointless.

If you want a 100% portable solution, before C11 there is none, because there is no portable way to test the alignment of a pointer. If it doesn't have to be 100% portable, you can use

    char *p = malloc(size + 15);
    p += (-(unsigned int)p) % 16;

This assumes that the alignment of a pointer is stored in the lowest bits when converting a pointer to unsigned int. Converting to unsigned int loses information and is implementation-defined, but that doesn't matter because we don't convert the result back to a pointer.

The horrible part is of course that the original pointer must be saved somewhere to call free() with it. So all in all I would really doubt the wisdom of this design.
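
One common way around the "original pointer must be saved somewhere" problem is to stash it just below the aligned block. The sketch below is only an illustration under stated assumptions: the names padded_aligned_malloc and padded_aligned_free are hypothetical, the alignment is assumed to be a power of two, and, like the snippet above, it assumes the low bits of the converted pointer reflect its alignment (it uses uintptr_t for the conversion instead of unsigned int).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator: returns `size` usable bytes aligned to
 * `alignment` (a power of two), or NULL on failure. */
void *padded_aligned_malloc(size_t alignment, size_t size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Worst-case padding plus room for one saved pointer. */
    void *raw = malloc(size + (alignment - 1) + sizeof raw);
    if (raw == NULL)
        return NULL;

    /* Skip the slot reserved for the saved pointer, then round up.
     * Only the offset is computed from the integer value; the pointer
     * itself is never rebuilt from an integer. */
    char *p = (char *)raw + sizeof raw;
    p += ((uintptr_t)0 - (uintptr_t)p) & (alignment - 1);

    /* Store the malloc() result immediately below the aligned block;
     * memcpy avoids any alignment requirement on the slot itself. */
    memcpy(p - sizeof raw, &raw, sizeof raw);
    return p;
}

/* Hypothetical counterpart: recovers the saved pointer and frees it. */
void padded_aligned_free(void *ptr)
{
    if (ptr != NULL) {
        void *raw;
        memcpy(&raw, (char *)ptr - sizeof raw, sizeof raw);
        free(raw);
    }
}
```

The cost is one extra pointer per allocation, and blocks obtained this way must never be handed to free() directly.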

User · Answer

The first thing that popped into my head when reading this question was to define an aligned struct, instantiate it, and then point to it.

Is there a fundamental reason I'm missing, since no one else suggested this?

As a sidenote, since I used an array of char (assuming the system's char is 8 bits, i.e. 1 byte), I don't see the need for the __attribute__((packed)) necessarily (correct me if I'm wrong), but I put it in anyway.

This works on two systems I tried it on, but it's possible that there is a compiler optimization that I'm unaware of giving me false positives vis-a-vis the efficacy of the code. I used gcc 4.9.2 on OSX and gcc 5.2.1 on Ubuntu.

    #include <stdio.h>
    #include <stdlib.h>

    int main ()
    {
        void *mem;
        void *ptr;

        // answer a) here
        struct __attribute__((packed)) s_CozyMem {
            char acSpace[16];
        };

        mem = malloc(sizeof(struct s_CozyMem));
        ptr = mem;

        // memset_16aligned(ptr, 0, 1024);

        // Check if it's aligned
        if(((unsigned long)ptr & 15) == 0) printf("Aligned to 16 bytes.\n");
        else printf("Rubbish.\n");

        // answer b) here
        free(mem);

        return 1;
    }
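
On the "fundamental reason" question, my understanding (hedged, not authoritative) is that malloc() knows nothing about the type it is allocating for, so a struct declaration by itself does not make the returned pointer any more aligned than the allocator's default; the check above most likely passes because those allocators already hand out 16-byte aligned blocks. Where a struct can carry the requirement is for objects with static or automatic storage. A minimal C11 sketch, with the names cozy_mem and cozy_static_block invented for the example, and assuming the compiler supports 16-byte alignment for such objects:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: alignas(16) on the member raises the alignment of
 * the whole struct to 16. The size matches the 1024 bytes in the question. */
struct cozy_mem {
    alignas(16) char space[1024];
};

/* Checked at compile time rather than at run time. */
static_assert(alignof(struct cozy_mem) == 16, "struct should be 16-byte aligned");

/* Static storage: the compiler and linker place this on a 16-byte boundary. */
static struct cozy_mem static_block;

/* Hypothetical accessor returning the aligned 1024-byte buffer. */
void *cozy_static_block(void)
{
    return static_block.space;
}
```

This sidesteps allocation altogether, so there is nothing to free; it does not answer the interview question as posed, which asks for allocated memory.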

User · Answer

Perhaps they would have been satisfied with a knowledge of memalign? And as Jonathan Leffler points out, there are two newer preferable functions to know about.

Oops, florin beat me to it. However, if you read the man page I linked to, you'll most likely understand the example supplied by an earlier poster.

