[c] How to replicate vector in c?

In the days before c++ and vector/lists, how did they expand the size of arrays when they needed to store more data?

This question is related to c

The answer is


They would start by hiding the defining a structure that would hold members necessary for the implementation. Then providing a group of functions that would manipulate the contents of the structure.

Something like this:

typedef struct vec
{
    unsigned char* _mem;
    unsigned long _elems;
    unsigned long _elemsize;
    unsigned long _capelems;
    unsigned long _reserve;
};

vec* vec_new(unsigned long elemsize)
{
    vec* pvec = (vec*)malloc(sizeof(vec));
    pvec->_reserve = 10;
    pvec->_capelems = pvec->_reserve;
    pvec->_elemsize = elemsize;
    pvec->_elems = 0;
    pvec->_mem = (unsigned char*)malloc(pvec->_capelems * pvec->_elemsize);
    return pvec;
}

void vec_delete(vec* pvec)
{
    free(pvec->_mem);
    free(pvec);
}

void vec_grow(vec* pvec)
{
    unsigned char* mem = (unsigned char*)malloc((pvec->_capelems + pvec->_reserve) * pvec->_elemsize);
    memcpy(mem, pvec->_mem, pvec->_elems * pvec->_elemsize);
    free(pvec->_mem);
    pvec->_mem = mem;
    pvec->_capelems += pvec->_reserve;
}

void vec_push_back(vec* pvec, void* data, unsigned long elemsize)
{
    assert(elemsize == pvec->_elemsize);
    if (pvec->_elems == pvec->_capelems) {
        vec_grow(pvec);
    }
    memcpy(pvec->_mem + (pvec->_elems * pvec->_elemsize), (unsigned char*)data, pvec->_elemsize);
    pvec->_elems++;    
}

unsigned long vec_length(vec* pvec)
{
    return pvec->_elems;
}

void* vec_get(vec* pvec, unsigned long index)
{
    assert(index < pvec->_elems);
    return (void*)(pvec->_mem + (index * pvec->_elemsize));
}

void vec_copy_item(vec* pvec, void* dest, unsigned long index)
{
    memcpy(dest, vec_get(pvec, index), pvec->_elemsize);
}

void playwithvec()
{
    vec* pvec = vec_new(sizeof(int));

    for (int val = 0; val < 1000; val += 10) {
        vec_push_back(pvec, &val, sizeof(val));
    }

    for (unsigned long index = (int)vec_length(pvec) - 1; (int)index >= 0; index--) {
        int val;
        vec_copy_item(pvec, &val, index);
        printf("vec(%d) = %d\n", index, val);
    }

    vec_delete(pvec);
}

Further to this they would achieve encapsulation by using void* in the place of vec* for the function group, and actually hide the structure definition from the user by defining it within the C module containing the group of functions rather than the header. Also they would hide the functions that you would consider to be private, by leaving them out from the header and simply prototyping them only in the C module.


You can see implementation vc_vector:

struct vc_vector {
  size_t count;
  size_t element_size;
  size_t reserved_size;
  char* data;
  vc_vector_deleter* deleter;
};

...

vc_vector* vc_vector_create_copy(const vc_vector* vector) {
  vc_vector* new_vector = vc_vector_create(vector->reserved_size / vector->count,
                                           vector->element_size,
                                           vector->deleter);
  if (unlikely(!new_vector)) {
    return new_vector;
  }

  if (memcpy(vector->data,
             new_vector->data,
             new_vector->element_size * vector->count) == NULL) {
    vc_vector_release(new_vector);
    new_vector = NULL;
    return new_vector;
  }

  new_vector->count = vector->count;
  return new_vector;
}

To use it:

vc_vector* v1 = vc_vector_create(0, sizeof(int), NULL);
for (int i = 0; i < 10; ++i) {
  vc_vector_push_back(v1, &i);
}

// v1 = 0 1 2 3 4 5 6 7 8 9

vc_vector* v2 = vc_vector_create_copy(v1);

// v2 = 0 1 2 3 4 5 6 7 8 9 (copy of v1)

// to get pointer to int:

const int* v2_data = vc_vector_data(v1);

https://github.com/jakubgorny47/baku-code/tree/master/c_vector

Here's my implementation. It's basicaly a struct containing pointer to the data, size (in elements), overall allocated space and a size of the type that's being stored in vector to allow use of void pointer.


You can use "Gena" library. It closely resembles stl::vector in pure C89.

From the README, it features:

  • Access vector elements just like plain C arrays: vec[k][j];
  • Have multi-dimentional arrays;
  • Copy vectors;
  • Instantiate necessary vector types once in a separate module, instead of doing this every time you needed a vector;
  • You can choose how to pass values into a vector and how to return them from it: by value or by pointer.

You can check it out here:

https://github.com/cher-nov/Gena


A lot of C projects end up implementing a vector-like API. Dynamic arrays are such a common need, that it's nice to abstract away the memory management as much as possible. A typical C implementation might look something like:

typedef struct dynamic_array_struct
{
  int* data;
  size_t capacity; /* total capacity */
  size_t size; /* number of elements in vector */
} vector;

Then they would have various API function calls which operate on the vector:

int vector_init(vector* v, size_t init_capacity)
{
  v->data = malloc(init_capacity * sizeof(int));
  if (!v->data) return -1;

  v->size = 0;
  v->capacity = init_capacity;

  return 0; /* success */
}

Then of course, you need functions for push_back, insert, resize, etc, which would call realloc if size exceeds capacity.

vector_resize(vector* v, size_t new_size);

vector_push_back(vector* v, int element);

Usually, when a reallocation is needed, capacity is doubled to avoid reallocating all the time. This is usually the same strategy employed internally by std::vector, except typically std::vector won't call realloc because of C++ object construction/destruction. Rather, std::vector might allocate a new buffer, and then copy construct/move construct the objects (using placement new) into the new buffer.

An actual vector implementation in C might use void* pointers as elements rather than int, so the code is more generic. Anyway, this sort of thing is implemented in a lot of C projects. See http://codingrecipes.com/implementation-of-a-vector-data-structure-in-c for an example vector implementation in C.