Low-Overhead Weak Pointers

Abstract: The usual linked-backreference method of implementing weak pointers has awkward copying semantics and involves a linear-time search whenever a pointer is deallocated. This entry describes a better weak pointer implementation method.

What is a weak pointer? Consider this C++ code:

    Foo* x = new Foo;
    Foo* y = x;
    delete x;
    cout << y;

This outputs the location at which a Foo was allocated, but that value is now as good as garbage, because that Foo was deallocated. It would be useful to have y automatically set to NULL so that code could check whether the object is still valid. A pointer that is set to NULL when its target is deallocated is called a weak pointer.

Weak pointers are usually implemented by storing, in the target, a linked list of locations that should be set to NULL when the object is deleted. Consider the bookkeeping this implementation requires: whenever a pointer is copied, a new reference must be added to the list, and whenever a pointer is deallocated (e.g. when the pointer itself goes out of scope), the list must be searched in linear time to remove it. That hurts. Obviously there are ways to speed it up, but the asymptotic bound is still there.
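That bookkeeping can be sketched like this. This is a hedged illustration of the scheme being criticized, not code from the post; the names Tracked and NaiveWeak are mine:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Tracked;  // forward declaration

// Each weak pointer registers itself with its target; destroying the
// target walks the list and nulls every registered pointer.
struct NaiveWeak {
    Tracked* target = nullptr;
    NaiveWeak(Tracked* t);
    NaiveWeak(const NaiveWeak& other);
    NaiveWeak& operator=(const NaiveWeak&) = delete;  // omitted in this sketch
    ~NaiveWeak();
};

struct Tracked {
    std::vector<NaiveWeak*> backrefs;  // locations to null on death
    ~Tracked() {
        for (NaiveWeak* w : backrefs) w->target = nullptr;
    }
};

NaiveWeak::NaiveWeak(Tracked* t) : target(t) {
    if (target) target->backrefs.push_back(this);     // copy-in: O(1) append
}
NaiveWeak::NaiveWeak(const NaiveWeak& other) : target(other.target) {
    if (target) target->backrefs.push_back(this);
}
NaiveWeak::~NaiveWeak() {
    if (target) {                                     // destroy: O(n) search
        auto& v = target->backrefs;
        v.erase(std::find(v.begin(), v.end(), this));
    }
}
```

The linear-time search in the destructor is exactly the cost the rest of this post is trying to avoid.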

Here is my solution: First, set up a small object pool with entries that look like this:

    struct Entry {
        unsigned int age;
        union {
            void* ptr;
            int next;
        };
    };
This is just the typical small object pool entry with one extra field: age, which is initialized to zero when the pool is created. If you don’t know what a small object pool is, the next paragraph is for you!

A small object pool is a way of quickly allocating memory of predefined size. Most allocators today use this method for sufficiently small objects (say 64 bytes or less). You implement it by allocating a big array of Entrys, and initializing them to form a sort of linked list. That is, pool[0].next = 1; pool[1].next = 2; etc. This linked list forms a “free list”, a list of unallocated space. The pool also keeps track of the first free entry in the list (call it head), in this case 0. Then when it is asked to allocate some space, it allocates head, and then sets head to the next entry in the linked list: pool[head].next. When it is asked to deallocate a location, it sets that location’s .next to head and head in turn to that location. It’s just standard linked list calculations with a slightly clever twist, which gives you minimal space overhead and constant time allocate and delete.
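A minimal pool along those lines might look like the following. This is an illustrative sketch; the Pool name and its method names are mine, and the age bump on deallocation anticipates the weak pointer use described below:

```cpp
#include <cassert>
#include <vector>

struct Pool {
    struct Entry {
        unsigned int age = 0;   // bumped on every deallocation (see below)
        union {
            void* ptr;          // payload while allocated
            int next;           // free-list link while unallocated
        };
    };

    std::vector<Entry> entries;
    int head;                   // index of first free entry, -1 if exhausted

    explicit Pool(int size) : entries(size) {
        // Thread the entries into a free list: 0 -> 1 -> 2 -> ... -> -1
        for (int i = 0; i < size; ++i)
            entries[i].next = (i + 1 < size) ? i + 1 : -1;
        head = size > 0 ? 0 : -1;
    }

    int allocate(void* p) {     // O(1): pop the head of the free list
        assert(head != -1 && "pool exhausted");
        int slot = head;
        head = entries[slot].next;
        entries[slot].ptr = p;
        return slot;
    }

    void deallocate(int slot) { // O(1): push the slot back onto the list
        entries[slot].age++;    // stale weak pointers will now mismatch
        entries[slot].next = head;
        head = slot;
    }
};
```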

Okay, so about this age field. A weak pointer, instead of storing a location in memory, stores an index into this pointer pool. A weak pointer also has an age field which refers to the age of the cell it is pointing to when the pointer was created. To access what the pointer is pointing to, you look up that index in the pool. To see whether the object has been destroyed, you check the pointer’s age against the cell’s age. If the two ages are equal, then the pointer is valid, otherwise the object has been destroyed while the pointer was referring to it.

To allocate a weak pointer to an object, first allocate a space in the pool. Set that space’s ptr field to the object, set the pointer’s index to the index of the new space, and set the pointer’s age to the new space’s age. When you delete the object, free it from the pool and increment that cell’s age field. If a new object is allocated there in the future, it will have a greater age so that the pointers to the old object won’t be fooled into thinking the new one is the old one.
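Putting the last two paragraphs together, here is a hedged sketch of the index-plus-age scheme. AgePool and WeakRef are illustrative names of my own, not the post's API:

```cpp
#include <cassert>
#include <vector>

struct AgePool {
    struct Entry {
        unsigned int age = 0;
        union { void* ptr; int next; };
    };
    std::vector<Entry> entries;
    int head;

    explicit AgePool(int n) : entries(n) {
        for (int i = 0; i < n; ++i)
            entries[i].next = (i + 1 < n) ? i + 1 : -1;
        head = n > 0 ? 0 : -1;
    }
    int alloc(void* p) {            // O(1) allocation from the free list
        assert(head != -1);
        int s = head;
        head = entries[s].next;
        entries[s].ptr = p;
        return s;
    }
    void free(int s) {              // O(1); bumping age invalidates old refs
        entries[s].age++;
        entries[s].next = head;
        head = s;
    }
};

struct WeakRef {
    int index;                      // index into the pool, not an address
    unsigned int age;               // cell's age when this ref was created

    WeakRef(AgePool& pool, void* target)
        : index(pool.alloc(target)), age(pool.entries[index].age) {}

    // Valid only while the cell's age still matches the recorded age.
    void* get(const AgePool& pool) const {
        const AgePool::Entry& e = pool.entries[index];
        return (e.age == age) ? e.ptr : nullptr;
    }
};
```

Note the double indirection: a dereference costs one extra memory access through the pool, which is the price paid for constant-time copy and destroy.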

This scheme is only guaranteed to work as long as you don’t allocate 2^(8·sizeof(int)) times in the same location. While that is a huge number and it is unlikely that any real application would do that, it may be more likely than you think. If the weak pointers follow a cycle of alloc, delete, alloc, delete, then the same space is used over and over again and the rest of the pool is untouched. It is conceivable that you could do that four billion times. A solution is to check on deallocation whether an increment would roll over, and if so, invalidate that pool entry and never use it again. This creates a memory leak; however, it would be so gradual (in the worst case, 8 bytes every 4 billion allocations on 32-bit machines) that it would be negligible. I feel silly even addressing the memory leak issue.
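The rollover guard could look like this. The helper name bump_age_or_retire is hypothetical, standing in for whatever the pool's deallocation path does:

```cpp
#include <limits>

// Illustrative entry mirroring the pool entry from the post.
struct Entry {
    unsigned int age = 0;
    int next = -1;
};

// Bump the age on deallocation, but report failure instead of wrapping
// to zero. A slot that would wrap is simply never pushed back onto the
// free list: a tiny, deliberate, bounded leak.
bool bump_age_or_retire(Entry& e) {
    if (e.age == std::numeric_limits<unsigned int>::max())
        return false;   // retire: do not reuse this slot
    ++e.age;
    return true;
}
```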

If you’re writing this in C++ as I am, you’ll want to hide all this behind a pointer wrapper class. Either that or you’ll want to use the one I wrote. I will post a link as soon as I write it.

5 thoughts on “Low-Overhead Weak Pointers”

  1. True enough. It’s unclear how that will affect things in the long run. In C++ programs, pointers are passed around enough for it to be worth it. Using this for, say, Perl’s weak pointers may not be such a win, since they are copied with reference semantics.

  2. or you could stick with the original implementation and keep the list of pointers sorted in a binary tree. log n insert/delete

  3. Oh goody. I had a nice response to your comment, Anonymous, if that is your real name. It got lost in the bitstream somewhere. Anyway, the only non-speculative content of the comment was that I have an untested implementation in the soylib library: http://soylentsoft.net/svn/open/soylib . One might say that the word “untested” makes that part speculative, too. :-)

  4. Ah, here it is. WordPress thought it might be spam, so it asked me to check. I determined that it was, because it had too many links and was essentially nonsense.

    Ah, yes, that was one of the ways I blanketed over when I said “there are ways to speed it up”, though I hadn’t thought of that explicitly. It does reduce the asymptotic bound; however, I don’t have a good intuition for the real expense of that. To really be able to tell, I think benchmarks would be necessary. I think the correct implementation, as with most things, depends on common usage. If you are dereferencing pointers much more than you are creating and destroying them, then that will probably be a win. In my code, I find that I dereference a pointer only once or twice for each create/destroy pair. That may be because I like the feel of pass-by-value better than pass-by-reference.

    My initial (yet untested, because my RTS found a different way to solve its problem) implementation of this is in the soylib library. On the surface, I really like this algorithm: its time and space complexity are top-notch. I would like to observe the double-dereference behavior regarding cache misses and other things for which my algorithmic intuition evades me.
