Saturday, June 11, 2016

Storing a Graph in Memory

Graphs are undoubtedly one of the most important objects in computer science. There are several algorithms that operate on graphs and these algorithms might take different amounts of execution time depending on the graph representation being used. In this post, we will explore some of the ways in which a graph can be stored in memory. Throughout the rest of this post we will assume that the underlying graph contains $n$ nodes $0,1,\dots,n-1$ and $m$ edges $0,1,\dots,m-1$.

Edge List

The simplest representation is to store the edges of the graph in a list. This representation is useful when adjacency information (which edges go out of a given node) is not important. For example, Kruskal's algorithm for minimum spanning tree does not use adjacency information but it needs the edges sorted by weight and so this representation can be used to implement Kruskal.

Adjacency Matrix

When the graph is dense: the number of edges is close to the maximal number of edges, an adjacency matrix can be used. The edges are stored as entries in a $n \times n$ matrix $A$ so that: $$A_{i,j} = \begin{cases}1 \text{, if there is an edge from i to j} \\0 \text{, elsewise}\end{cases}$$

Adjacency List

The adjacency list representation is the one that is used the most. In this representation every node stores a list of outgoing edges. Traversing the graph by using this representation takes time $\mathcal{O} (n+m)$ whereas traversing it by using an adjacency matrix takes time $\mathcal{O} (n^2)$. When the graph is dense it does not really matter because $m=\mathcal{O} (n^2)$ but when it is sparse then algorithms that use the adjacency list representation are way faster. The following is an example implementation.


A Triple-Array Representation

The adjacency list implementation uses dynamic arrays (vectors) and there is a call to push_back each time an edge is added. We can further optimize by using static arrays instead of vectors. In this representation every node remembers the last edge out of it and every edge remembers the previous edge (with respect to the node on which both edges are incident) and the node to which it is pointing. Here is an example implementation.