How To Implement a Vector Class in Pure C?


When you need a collection or container with more flexibility than an Array provides, the first data structure you’ll usually go to is a Vector. Vectors are part of the STL in C+ as std::vector<T>, where T stands for the type you want the collection to be of.

Because we don’t have this luxury in C I will show you today how we can implement our own Vector type written purely in C. We will rely on some concepts from my article on how to implement Classes in C, so read it first if you are interested.

If you need an overview of different containers and their advantages and disadvantages you can find them in my article here.

How a Vector works

Normally you would think of a vector in 2D space as something with a length and a direction. However, this is not the case when it comes to the Vector container.

A Vector is a container that allocates adjacent memory for a certain data type in Heap Memory. It has some information about capacity and size. Size is the currently available memory for new elements and size is the currently used amount of that available memory.

Vector Memory Layout Allocation
Vector Memory Map

A certain amount of memory is pre-allocated (it’s capacity) in order to make the insert operations fast. If this memory is all used by the vector than more memory is reserved. There are some strategies how much and when the memory is reserved. However, once allocated the memory is never released until the destruction of the Vector.

Vector Memory Layout Reallocation
Vector Memory Map after Reallocation

The great advantage compared to a normal array is that a Vector is flexible in it’s size as it can append and remove elements. Adjacent memory saves space and makes access relatively fast. The payoff is it’s slightly bigger size.

Goals of the Vector Class Implementation in C

We will implement a Vector Class in C that is capable of the following features:

  • Initialization of Vector (and clean up!)
  • Adding Elements to Vector
  • Accessing Elements
  • Removing Elements from Vector

We will only implement a Vector for the Integer (int) type. The C language is missing the <template> stuff that C++ provides so you have to make an implementation for every type that you need. (You maybe could use void-Pointers but that is a rabbit whole of it’s own…)

Vector Initialization in C

We will need three files for our implementation:

  • vector.h – here we define the Interface for our Vector
  • vector.c – the actual Vector implementation
  • main.c – here we write a little program to test our vector

The vector_t struct contains three members of which one may confuse you a bit. The first one, called elements, will point to all the elements within the vector when they are added. The third one keeps track of the current size of the array. But why capacity?

The vector reserves some memory up front in oder to store the elements. If this memory is used then the vector reallocates more memory and copies all the elements into the new reserved memory block. There are different strategies on how much memory to reserve, we will use a very simple one.

Our interface declares the struct for our vector objects as well as the initialization, construction and destruction functions. Note that the struct is only declared here but not defined. This keeps the members private. The downside to this is that you have to allocate memory with the vector_new() function in order to use it.

#pragma once
#include <stdlib.h>

typedef struct vector_t vector_t;

vector_t* vector_new();
void vector_ctr(vector_t* obj);
void vector_dtr(vector_t* obj);

The implementation of memory allocation and class initialization is exactly as I described it in my article about classes in C. We define the default capacity of our vector with 4 which is totally arbitrary but works well for our examples. In the destructor everything is freed, so do not forget to call it!

#include "vector.h"
#include <stdlib.h>

static const int DEFAULT_CAPACITY = 4;

typedef struct vector_t {
    int* elements;
    int capacity;
    int size;
} vector_t;

vector_t* vector_new()
{
    return (vector_t*)  malloc(sizeof(vector_t));
}

void vector_ctr(vector_t* obj)
{
    obj->elements = calloc(DEFAULT_CAPACITY, sizeof(int));
    obj->capacity = DEFAULT_CAPACITY;
    obj->size = 0;
}

void vector_dtr(vector_t* obj)
{
    free(obj->elements);
    free(obj);
}

Our test program is very boring so far as it only creates, allocates and destroys our vector:

#include "vector.h"

int main()
{
    vector_t* my_vector = vector_new();
    vector_ctr(my_vector);

    vector_dtr(my_vector);

    return 0;
}

Adding an Element to the Vector in C

We use the Vector class from the C++ STL as a model. Therefore, the function for adding an element will also be called push_back. We also want to keep track of the current size and the current capacity of our Vector. Therefore, the Interface is defined as follows:

...
int vector_size(vector_t* obj);
int vector_capacity(vector_t* obj);

void vector_push_back(vector_t* obj, const int element);
...

The size and capacity functions are simply getters of the private variables of our object:

...
int vector_size(vector_t* obj)
{
    return obj->size;
}

int vector_capacity(vector_t* obj)
{
    return obj->capacity;
}
...

I use the term Object here, although there are no real objects in the C Language. Anyway, the concept is the same as I already explained in this article.

And now the push_back function. We simply add the passed item to the internal array of the vector_t struct. But can only do this until our capacity is reached, then we have to reallocate more memory in order to store the elements properly.

Also note that we have to keep track of the array size ourselves. In C, the sizeof(…) function does not work as intended for dynamic arrays but only for its static counterparts.

...
void vector_push_back(struct vector_t* obj, const int element)
{
    if(obj->size > 0 && obj->size % obj->capacity == 0) {    /* Vector is full */
        obj->capacity = (obj->size / DEFAULT_CAPACITY + 1) * DEFAULT_CAPACITY; /* Increase in Steps of DEFAULT_CAPACITY */
        obj->elements = realloc(obj->elements, obj->capacity * sizeof(int));
        printf("New Allocation: %d\n", obj->capacity);
    }
    obj->elements[obj->size++] = element;
}
...

Now we test our new functionality by putting them into our test program in main:

#include "vector.h"

int main()
{
    vector_t* my_vector = vector_new();
    vector_ctr(my_vector);

    printf("Vector Size: %d\n", vector_size(my_vector));
    printf("Vector Capacity: %d\n", vector_capacity(my_vector));

    vector_push_back(my_vector, 42);

    printf("Vector Size: %d\n", vector_size(my_vector));
    printf("Vector Capacity: %d\n", vector_capacity(my_vector));

    vector_push_back(my_vector, 55);
    vector_push_back(my_vector, 34);
    vector_push_back(my_vector, 67);
    vector_push_back(my_vector, 70);

    printf("Vector Size: %d\n", vector_size(my_vector));
    printf("Vector Capacity: %d\n", vector_capacity(my_vector));

    vector_dtr(my_vector);

    return 0;
}

Our test program then has the following output:

Vector Size: 0
Vector Capacity: 4
Vector Size: 1    
Vector Capacity: 4
New Allocation: 8 
Vector Size: 5
Vector Capacity: 8

The initial size is 0 and its capacity is 4. After adding the first element, the size is 1 but the capacity remains unchanged. It will change when we try to add the fifth element – then the capacity is increased to 8 (two times the default capacity) and the size then is 5.

Accessing Vector Elements in C

In general there are two ways how to access elements in a vector. You can access one specific element through its index with the at(…) function. The other possibility is by accessing the elements with the begin() and end() iterators. Both possibilities are implement and shown next.

Accessing Vector Elements with at(…)

First we implement the possibility to access an element by its index with the at(…) function.

We pass the index of the element that we want to access. Be careful not to pass an invalid index. The return type is int because we have a vector of type int and the function returns a single element.

...
int vector_at(vector_t* obj, const int index);
...

The implementation is again very simple:

...

int vector_at(vector_t* obj, const int index)
{
    return obj->elements[index];
}
...

Accessing Vector Elements with begin() and end() Iterators

An iterator is not the element itself but a pointer to an element. In the C++ STL there are two functions that return such pointers, namely begin() and end().

The begin() points to the first element and can be dereferenced to its value, but the end() iterator points to a theoretical past-the-end element and thus shall not be dereferenced! It only serves as a marker for the end of the vector. Iterators can be increased and decreased and so you can traverse through the vector and access its elements. This is possible because of the adjacent memory of the vector.

The functions themselves return pointers to the type of the elements:

..
int* vector_begin(vector_t* obj);
int* vector_end(vector_t* obj);
..

Another very simple implementation and again the warning notice that the end() element is not really a valid element:

...
int* vector_begin(vector_t* obj)
{
    return &obj->elements[0];
}

int* vector_end(vector_t* obj)
{
    /* Returns a THEORETICAL past-the-end Element 
        and thus shall not be dereferenced! */
    return &obj->elements[obj->size];
}
...

In our test program we add the direct access to the first element via iterator, looping through the whole vector via iterators and also accessing each element of the vector with the at(…) function and all indices:

..
printf("Vector Element begin(): %d\n", *vector_begin(my_vector));
/* printf("Vector Element end(): %d\n", *vector_end(my_vector));  /* NEVER! do this */
...
for(int i = 0; i<5; i++) {
    printf("Element at %d: %d\n", i, vector_at(my_vector, i));
}

for(int* it=vector_begin(my_vector); it != vector_end(my_vector); it++) {
    printf("Element iterated: %d\n", *it);
}
...

The output of the additional printf(…) commands in out test program is as follows:

...
Vectore Element begin(): 42
Element at 0: 42
Element at 1: 55
Element at 2: 34
Element at 3: 67
Element at 4: 70
Element iterated: 42
Element iterated: 55
Element iterated: 34
Element iterated: 67
Element iterated: 70
...

Removing Vector Elements in C

The last feature of Vectors that we will cover in this article is removing elements from the vector. Removing an element will never shrink the reserved memory but only decrease its size. This means that the capacity stays the same after removing on or more elements from the vector.

Again there are two possibilities to do this. First we can pop_back() which means that the last element will be removed, or we can erase(..) elements from the array. With the latter function we can delete either one element or a range of elements between two given iterators.

Our header declaration for these two functions look like the following:

...
void vector_pop_back(vector_t* obj);
void vector_erase(vector_t* obj, int* first, int* last); 
...

The implementation does nothing that we didn’t expect:

pop_back() simply determines the index of the last element, resets it to zero and reduces the size by one. (The reset to 0 is optional as we will see soon).

erase() reorganizes the array and resets the size but does not shrink the capacity. Because of that you don’t have to free or reset any values. If the memory is used again by the vector it will be overwritten and otherwise the whole vector will be freed altogether.

...
void vector_pop_back(vector_t* obj)
{
    int lastIndex = obj->size - 1;
    obj->elements[lastIndex] = 0;
    obj->size--;
}
...
void vector_erase(vector_t* obj, int* first, int* last)
{
    int lastElement = (obj->size - 1);
    int firstElement = first - &obj->elements[0];

    for(int i=firstElement; i < lastElement; i++) {
        obj->elements[i] = obj->elements[i+1];
    }
    obj->size -= (last - first) > 0 ? (last - first) : 1;
}
...

We extend our test program where we left off. It shows the different usages of pop_back() and erase(…):

  printf("Vector Size before Pop: %d\n", vector_size(my_vector));
  printf("Vector Capacity before Pop: %d\n", vector_capacity(my_vector));
    
  vector_pop_back(my_vector);

  for(int i = 0; i<4; i++) {
      printf("Element at %d: %d\n", i, vector_at(my_vector, i));
  }

  printf("Vector Size after Pop: %d\n", vector_size(my_vector));
  printf("Vector Capacity after Pop: %d\n", vector_capacity(my_vector));

  /* Remove first element only */
  vector_erase(my_vector, vector_begin(my_vector), NULL); 
  /* Remove all Elements from second to last */
  vector_erase(my_vector, vector_begin(my_vector) + 1, vector_end(my_vector)); 
  
  for(int i = 0; i<vector_size(my_vector); i++) {
      printf("Element at %d: %d\n", i, vector_at(my_vector, i));
  }

  printf("Vector Size after Erase: %d\n", vector_size(my_vector));
  printf("Vector Capacity after Erase: %d\n", vector_capacity(my_vector));

The output is extended by the following lines:

...
Vector Size before Pop: 5
Vector Capacity before Pop: 8
Element at 0: 42
Element at 1: 55
Element at 2: 34
Element at 3: 67
Vector Size after Pop: 4
Vector Capacity after Pop: 8
Element at 0: 55
Vector Size after Erase: 1
Vector Capacity after Erase: 8

As you can see only one Element is left in the vector after all erasing operations are done. This is exactly what we expected.

Summary & Further Ideas

We have now the basic functionality for our Vector Container Class in the C language. There are indeed many ways to improve it and also we are missing some functionality (e.g. empty(), clear(), assign(), etc.). It is also only for int variables, but what about floats or chars?

Please go ahead and extend and optimize this class for your needs!

This article was first published by Marco Lieblang on moderncprogramming.com. If you are reading this somewhere else, it may be plagiarism.

Marco Lieblang

Professional Programmer since 2003, passionate Programmer since the mid 90's. Developing in many languages from C/C++ to Java, C#, Python and some more. And I also may know a bit about Assembly Languages and Retro Systems.

Recent Posts