Reference Counting in C

Ever since computer programming has been around, the problem of dynamic memory management has plagued many a programs.

Now a days most programming languages rely on either Garbage Collection or Reference Counting, but what is a C programmer to do?

Turns out, there is no magic to reference counting, it literally is what it sounds like. Each piece of memory has a counter to how many references it has. Each function that wants to keep the value outside of the given scope, should retain the value, and once it's done with it release it.

You can easily create thin wrappers around your allocation modules to build this capability.

Of course, without the help from the compiler, you cannot do Automatic Reference Counting (ARC), and need to make sure your code properly calls retain and release and also avoid retention cycles.

// A header for each memory block
typedef struct MEMHDR {
	int retain_cnt;
} MEMHDR;

#define MEMHDR_SIZ	     ALIGN(sizeof(MEMHDR))
#define MEMHDR_PTR(ptr)  ((MEMHDR *)((char *)(ptr) - MEMHDR_SIZ))

// align the size to word-boundry
#define ALIGN(siz)       ((siz) + sizeof(int) - ((siz) % sizeof(int)))

// Allocate extra header at the head of the memory pointer,
// but return the offset pointer.
// This gives you an internal header to maintain any attributes
// for this block of memory
#define allocate(siz) ((char *)malloc(MEMHDR_SIZ + ALIGN(siz))) + MEMHDR_SIZ))

// Simply increment ths retention counter and return the original pointer
// Allows the use like this:
//   str1 = retain(str)
#define retain(ptr)   (MEMHDR_PTR(ptr)->retain_cnt++, ptr)

// Release is a little more involved.
// decrement the retention counter and free the block,
// if the retain_cnt is zero
#define release(ptr)  ( \
	--(MEMHDR_PTR(ptr)->retain_cnt), \
    MEMHDR_PTR(ptr)->retain_cnt \
    	?ptr \
        :(free(ptr),NULL) \
)

Essentially, all we are doing is prepending a header to each piece of allocated memory and tracking a retain_cnt variable in it. When the retain_cnt hits zero, we will free the block.

Let's see how you would use this with an example. We will build an API to read a file.

typedef struct FILER {
    FILE *fd;
    char *buf;
} FILER;

FILER * filer_open(const char *path) {
    FILE *fd = fopen(path, "r");
    FILER *filer = allocate(sizeof(FILER));
    filer.fd = fd;
    return filer;
}

void filer_close(FILER *filer) {
    fclose(filer->fd);
    if (filer->buf)
        release(filer->buf);
}

char * filer_read(FILER *filer) {
    if (!filer->buf)
        filer->buf = allocate(BUFSIZ);
    fgets(filer->buf, BUFSIZ, filer->fd);
    return filer->buf;
}

filer.c
int main(void) {
    FILER *f = filer_open("/home/user/.profile");

    char *line = filer_read(f);
    char *first_line = retain(line);

    filer_close(f);

    // first_line is still valid here

    ...

    // clean up
    release(first_line);
}
main.c

As you can see in this example, a major side-benefit of using reference counting is the locality of memory operations. Both filer.c and main.c have equal numbers of allocate/retain and release, making reviewing the code for memory leaks very easy.

That's all there is to it!

Manuj Bhatia

Manuj Bhatia