Daniel Beer Atom | RSS | About

Tiny AVL trees

tavl is an implementation of AVL trees that keeps all nodes in a flat statically-allocated array. It’s intended for use on embedded systems where one may want to index a set of items with a fixed upper bound, and where the set of items is too large for a linear search to be fast enough.

The implementation consists of two files and a test:

You can build and run the test with:

gcc -O1 -Wall -o test test_tavl.c tavl.c
./test

Usage

You will likely need to tweak the implementation to suit your use-case. By default, keys are of type uint32_t, and a tree can consist of up to 32,767 nodes.

Nodes are expected to be kept in a flat array. Nodes refer to each other by array index, rather than by pointer. This enables a memory saving, since for a limit of 32,767 nodes, an index will take up less space.

Each node contains only internal book-keeping (left and right indices, and balancing flags), and a key. Values are associated with nodes by keeping a parallel array (or multiple arrays) of value data. Matching indices associate keys with values.

You should, in your firmware, declare storage for nodes and a root pointer:

#define MAX_NODES   1024

struct tavl_node nodes[MAX_NODES];
struct value_type values[MAX_NODES];
tavl_index_t root = TAVL_NONE;

To insert a node:

tavl_index_t i = /* choose something unused from node_array */;

nodes[i].key = my_key;
values[i] = my_value;
tavl_insert(nodes, &root, i);

To delete a node:

tavl_delete(nodes, &root, i);
/* Node i is no longer reachable from the root */

And to find a node given a key:

tavl_index_t i = tavl_find(nodes, root, my_key);

if (i != TAVL_NONE) {
        /* values[i] contains the associated data */
}

Maintaining a contiguous free-list at the end of the array

The examples above assume a method for tracking nodes that are unallocated. This could be done by an auxiliary free-stack (perhaps by reusing the index pointers in struct tavl_node).

Another method, which has the side-effect of making for easy enumeration, is to divide the array into two contiguous portions: one of allocated nodes, and another of unallocated. A single extra index points to the next free node:

tavl_index_t freeptr; /* initially 0: all nodes are free */

Then we can allocate a new node as follows:

if (freeptr < MAX_NODES) {
        tavl_index_t i = freeptr++;

        nodes[i].key = my_key;
        values[i] = my_value;
        tavl_insert(nodes, &root, i);
}

When deleting, we may need to move an item from the end to fill the gap. Moving nodes requires them to be unlinked and relinked:

tavl_delete(nodes, &root, i);
freeptr--;

if (i != freeptr) {
        /* fill the gap */
        tavl_delete(nodes, &root, freeptr);
        memcpy(&nodes[i], &nodes[freeptr], sizeof(nodes[i]));
        memcpy(&values[i], &values[freeptr], sizeof(values[i]));
        tavl_insert(nodes, &root, i);
}

Performance and memory consumption

Being a balanced binary tree, all operations (insert, delete and find) are bounded by O(log n).

With the default configuration, each node consumes 8 bytes of RAM, plus whatever data you choose to associate.

Stack usage and section sizes when compiled for an ARM Cortex M4 with GCC 8.3.0 (-O1):

text           data     bss     dec     hex filename
1090              0       0    1090     442 tavl.o

  Func                               Cost    Frame  Inherit   Height
------------------------------------------------------------------------
> tavl_delete                         232       72      232        3
> tavl_insert                         224       64      224        3
  fix_path                            160      112      160        2
  rotate_and_fix                       48       48       48        1
> tavl_find                             8        8        8        1