[c] C Programming: How to read the whole file contents into a buffer

I want to write the full contents of a file into a buffer. The file actually only contains a string which i need to compare with a string.

What would be the most efficient option which is portable even on linux.

ENV: Windows

This question is related to c file-io

The answer is


Here is what I would recommend.

It should conform to C89, and be completely portable. In particular, it works also on pipes and sockets on POSIXy systems.

The idea is that we read the input in large-ish chunks (READALL_CHUNK), dynamically reallocating the buffer as we need it. We only use realloc(), fread(), ferror(), and free():

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>

/* Size of each input chunk to be
   read and allocate for. */
#ifndef  READALL_CHUNK
#define  READALL_CHUNK  262144
#endif

#define  READALL_OK          0  /* Success */
#define  READALL_INVALID    -1  /* Invalid parameters */
#define  READALL_ERROR      -2  /* Stream error */
#define  READALL_TOOMUCH    -3  /* Too much input */
#define  READALL_NOMEM      -4  /* Out of memory */

/* This function returns one of the READALL_ constants above.
   If the return value is zero == READALL_OK, then:
     (*dataptr) points to a dynamically allocated buffer, with
     (*sizeptr) chars read from the file.
     The buffer is allocated for one extra char, which is NUL,
     and automatically appended after the data.
   Initial values of (*dataptr) and (*sizeptr) are ignored.
*/
int readall(FILE *in, char **dataptr, size_t *sizeptr)
{
    char  *data = NULL, *temp;
    size_t size = 0;
    size_t used = 0;
    size_t n;

    /* None of the parameters can be NULL. */
    if (in == NULL || dataptr == NULL || sizeptr == NULL)
        return READALL_INVALID;

    /* A read error already occurred? */
    if (ferror(in))
        return READALL_ERROR;

    while (1) {

        if (used + READALL_CHUNK + 1 > size) {
            size = used + READALL_CHUNK + 1;

            /* Overflow check. Some ANSI C compilers
               may optimize this away, though. */
            if (size <= used) {
                free(data);
                return READALL_TOOMUCH;
            }

            temp = realloc(data, size);
            if (temp == NULL) {
                free(data);
                return READALL_NOMEM;
            }
            data = temp;
        }

        n = fread(data + used, 1, READALL_CHUNK, in);
        if (n == 0)
            break;

        used += n;
    }

    if (ferror(in)) {
        free(data);
        return READALL_ERROR;
    }

    temp = realloc(data, used + 1);
    if (temp == NULL) {
        free(data);
        return READALL_NOMEM;
    }
    data = temp;
    data[used] = '\0';

    *dataptr = data;
    *sizeptr = used;

    return READALL_OK;
}

Above, I've used a constant chunk size, READALL_CHUNK == 262144 (256*1024). This means that in the worst case, up to 262145 chars are wasted (allocated but not used), but only temporarily. At the end, the function reallocates the buffer to the optimal size. Also, this means that we do four reallocations per megabyte of data read.

The 262144-byte default in the code above is a conservative value; it works well for even old minilaptops and Raspberry Pis and most embedded devices with at least a few megabytes of RAM available for the process. Yet, it is not so small that it slows down the operation (due to many read calls, and many buffer reallocations) on most systems.

For desktop machines at this time (2017), I recommend a much larger READALL_CHUNK, perhaps #define READALL_CHUNK 2097152 (2 MiB).

Because the definition of READALL_CHUNK is guarded (i.e., it is defined only if it is at that point in the code still undefined), you can override the default value at compile time, by using (in most C compilers) -DREADALL_CHUNK=2097152 command-line option -- but do check your compiler options for defining a preprocessor macro using command-line options.


A portable solution could use getc.

#include <stdio.h>

char buffer[MAX_FILE_SIZE];
size_t i;

for (i = 0; i < MAX_FILE_SIZE; ++i)
{
    int c = getc(fp);

    if (c == EOF)
    {
        buffer[i] = 0x00;
        break;
    }

    buffer[i] = c;
}

If you don't want to have a MAX_FILE_SIZE macro or if it is a big number (such that buffer would be to big to fit on the stack), use dynamic allocation.