[c] What is size_t in C?

I am getting confused with size_t in C. I know that it is returned by the sizeof operator. But what exactly is it? Is it a data type?

Let's say I have a for loop:

for(i = 0; i < some_size; i++)

Should I use int i; or size_t i;?

This question is related to c int size-t

The answer is


Since nobody has yet mentioned it, the primary linguistic significance of size_t is that the sizeof operator yields a value of that type. Likewise, the primary significance of ptrdiff_t is that subtracting one pointer from another yields a value of that type. Library functions accept size_t because it lets them work with objects whose size exceeds UINT_MAX on systems where such objects can exist, without forcing callers to waste code passing a value wider than unsigned int on systems where unsigned int would suffice for all possible objects.


size_t is an unsigned integer data type, so it can hold only 0 and positive integer values. It measures the size of any object in bytes and is the type of the value produced by the sizeof operator. size_t variables are often declared const, but const is an ordinary qualifier that has nothing specific to do with size_t, and the program works without it.

const size_t number = 10;

size_t is regularly used for array indexing and loop counting. On a typical 32-bit target it is usually unsigned int; on a typical 64-bit target it is usually unsigned long or unsigned long long. Therefore the maximum value of size_t depends on the implementation.

size_t is defined in the <stdio.h> header, and also in the <stddef.h>, <stdlib.h>, <string.h>, <time.h>, and <wchar.h> headers.

  • Example (with const)
#include <stdio.h>

int main(void)
{
    const size_t value = 200;
    size_t i;
    int arr[value];   /* variable-length array (C99) */

    for (i = 0 ; i < value ; ++i)
    {
        arr[i] = (int)i;
    }

    size_t size = sizeof(arr);
    printf("size = %zu\n", size);
    return 0;
}

Output (assuming a 4-byte int): size = 800


  • Example (without const)
#include <stdio.h>

int main(void)
{
    size_t value = 200;
    size_t i;
    int arr[value];   /* variable-length array (C99) */

    for (i = 0 ; i < value ; ++i)
    {
        arr[i] = (int)i;
    }

    size_t size = sizeof(arr);
    printf("size = %zu\n", size);
    return 0;
}

Output (assuming a 4-byte int): size = 800


size_t is a platform-specific typedef. For example, on a particular machine, it might be unsigned int or unsigned long. You should use size_t rather than the underlying type to make your code more portable.


size_t and int are not interchangeable. For instance, on 64-bit Linux size_t is 64 bits wide (the same as sizeof(void*)) but int is 32 bits wide.

Also note that size_t is unsigned. If you need a signed version, there is ssize_t on some platforms (it is defined by POSIX), and it would be more relevant to your example.

As a general rule I would suggest using int for most general cases and only use size_t/ssize_t when there is a specific need for it (with mmap() for example).


The manpage for types.h says:

size_t shall be an unsigned integer type


size_t is an unsigned integer data type. On systems using the GNU C Library, this will be unsigned int or unsigned long int. size_t is commonly used for array indexing and loop counting.


size_t is a type that can hold any array index.

Depending on the implementation, it can be any of:

unsigned char

unsigned short

unsigned int

unsigned long

unsigned long long

Here's how size_t is defined in stddef.h of my machine:

typedef unsigned long size_t;

In general, if you are starting at 0 and counting upward, prefer an unsigned type so that the counter can never become negative. This matters when the loop bound is larger than the maximum value of the counter's type: a signed counter such as int then overflows, which is undefined behavior in C and in practice often produces a negative index and a segmentation fault (SIGSEGV). So, for a loop starting at 0 and going upward, avoid int and use an unsigned type.


If you are the empirical type,

echo | gcc -E -xc -include 'stddef.h' - | grep size_t

Output for Ubuntu 14.04 64-bit GCC 4.8:

typedef long unsigned int size_t;

Note that stddef.h is provided by GCC, not glibc; in GCC 4.2 it lives under src/gcc/ginclude/stddef.h.

Interesting C99 appearances

  • malloc takes size_t as an argument, so it determines the maximum size that may be allocated.

    And since it is also returned by sizeof, I think it limits the maximum size of any array.

    See also: What is the maximum size of an array in C?


To go into why size_t needed to exist and how we got here:

In pragmatic terms, size_t and ptrdiff_t are 64 bits wide on a typical 64-bit implementation, 32 bits wide on a typical 32-bit implementation, and so on. The Standard could not force any existing type to mean that, on every compiler, without breaking legacy code.

A size_t or ptrdiff_t is not necessarily the same width as an intptr_t or uintptr_t. They differed on certain architectures that were still in use when size_t and ptrdiff_t were added to the Standard in the late ’80s, and that were becoming obsolete but had not yet disappeared (16-bit Windows, for example) when C99 added many new types. The x86 in 16-bit protected mode had segmented memory in which the largest possible array or structure was only 65,536 bytes, but a far pointer needed to be 32 bits wide, wider than the registers. On those systems, intptr_t would have been 32 bits wide, but size_t and ptrdiff_t could be 16 bits wide and fit in a register. And who knew what kind of operating system might be written in the future? In theory, the i386 architecture offers a 32-bit segmentation model with 48-bit pointers that no operating system has ever actually used.

The type of a memory offset could not be long because far too much legacy code assumes that long is exactly 32 bits wide. This assumption was even built into the UNIX and Windows APIs. Unfortunately, a lot of other legacy code also assumed that a long is wide enough to hold a pointer, a file offset, the number of seconds that have elapsed since 1970, and so on. POSIX now provides a standardized way to force the latter assumption to be true instead of the former, but neither is a portable assumption to make.

It couldn’t be int because only a tiny handful of compilers in the ’90s made int 64 bits wide. Then they really got weird by keeping long 32 bits wide. The next revision of the Standard declared it illegal for int to be wider than long, but int is still 32 bits wide on most 64-bit systems.

It couldn’t be long long int, which anyway was added later, since that was created to be at least 64 bits wide even on 32-bit systems.

So, a new type was needed. Even if it weren’t, all those other types meant something other than an offset within an array or object. And if there was one lesson from the fiasco of 32-to-64-bit migration, it was to be specific about what properties a type needed to have, and not use one that meant different things in different programs.


size_t, or any other unsigned type, is often used for loop variables, because loop variables are typically greater than or equal to 0.

When we use a size_t object, we have to make sure that in every context where it is used, including arithmetic, we want only non-negative values. For instance, the following program gives an unexpected result:

// C program to demonstrate that size_t or
// any unsigned int type should be used
// carefully when used in a loop

#include <stdio.h>
int main()
{
    const size_t N = 10;
    int a[N];

    // This is fine
    for (size_t n = 0; n < N; ++n)
        a[n] = n;

    // But reverse cycles are tricky for unsigned
    // types as can lead to infinite loop
    for (size_t n = N-1; n >= 0; --n)
        printf("%d ", a[n]);
}

Output
Infinite loop and then segmentation fault

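
From my understanding, size_t is an unsigned integer type that, on common flat-memory architectures, is wide enough to hold a pointer value. The C standard does not guarantee this, and it was not true on some segmented architectures.

So, on most current platforms:

sizeof(size_t) >= sizeof(void*)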

size_t is an unsigned type, so it cannot represent negative values (< 0). You use it when you are counting something and are sure that the count cannot be negative. For example, strlen() returns a size_t because the length of a string has to be at least 0.

In your example, if your loop index is always going to be greater than or equal to 0, it might make sense to use size_t, or any other unsigned data type.

When you use a size_t object, you have to make sure that in all the contexts it is used, including arithmetic, you want non-negative values. For example, let's say you have:

size_t s1 = strlen(str1);
size_t s2 = strlen(str2);

and you want to find the difference of the lengths of str2 and str1. You cannot do:

int diff = s2 - s1; /* bad */

This is because the subtraction is done with unsigned types, so the result of s2 - s1 is never negative: when s2 < s1 it wraps around to a very large positive value, and converting that value to int is implementation-defined. In this case, depending upon what your use case is, you might be better off using int (or long long) for s1 and s2.

There are some functions in C/POSIX that could/should use size_t, but don't because of historical reasons. For example, the second parameter to fgets should ideally be size_t, but is int.