Choosing a representation for the data is a matter of trade-offs.
To choose a good representation for the data, we always need to know:
With respect to the graph size, we have dense graphs, where m = Θ(n2), sparse graphs, where m = O(n), and some intermediate graphs.
In dense graphs, the degrees of most vertices are of the order of Θ(n). In sparse graphs, the degrees of most vertices are small (O(1)), but we can sometimes have a few vertices of very high degree.
For instance, the graph corresponding to a road network is a sparse graph. If we represent intersections by vertices, each vertex (intersection) usually has 3 or 4 neighbours (out of the hundreds of millions intersections in the world).
Typically, we have the following operations to be performed:
We have a n×n matrix with 0-1 or true-false values, defined as: ax,y=1 if there is an edge from x to y, and 0 otherwise.
.Memory: Θ(n2)
Test edge: O(1)
Parse neighbours: Θ(n)
Summary: Adjacency matrix is good for dense graphs, but bad for sparse graphs. Imagine a graph with 108 vertices and 4×108
edges, but which occupies 1016 bits (or around 1000TB).It involves keeping a collection containing the edges (as pairs of vertices). It is compact for sparse graphs, but all operations need to parse the full collection.
Memory: Θ(m)
Test edge: O(m)
Parse neighbours: O(m)
For each vertex, we keep a collection of its neighbours (inbound or outbound or both).
The collection of neighbors may be a vector, a linked list, or a set. The set allows to quickly test if (x,y) is an edge if x has a lot of outbound neighbors; the vector is more compact and works reasonably if the above test is not very often performed or if the number of outbound neighbours is small.
To get from the vertex to the set of neighbours, we can use a vector where the vertex is the index, or a map (dictionary) where the vertex is the key.
The vector is more compact and faster, but requires the vertices to be consecutive integers (which, in turn, means that removing a vertex requires the re-numbering of all the vertices following it).
Memory: Θ(n+m)
Test edge: O(deg(x))
Parse neighbours: Θ(deg(x))
Read-only operations
Operations concerning edge costs
Example, Python:
class Vertex: def __eq__(self, other): if not isinstance(other, self.__class__): return False ... def __ne__(self, other): return not self.__eq__(other) def __hash__(self): ...
Example, Java:
class Vertex: boolean equals(Object other) { if (! other instanceof Vertex) return false; Vertex otherVertex = (Vertex)other; ... } int hashCode() { ... } }
Example, C++
class Vertex: bool operator==(Vertex const& other) { ... } bool operator<(Vertex const& other) { ... } }
Example, Python:
class Graph: # Return by reference; beware of possible change by user def parseNout(self, x): return self.__out[x] # return a copy def parseNout(self, x): l = [] for y in self.__out[x]: l.append(y) return l # return a copy def parseNout(self, x): return [y for y in self.__out[x]] # return an iterable def parseNout(self, x): for y in self.__out[x]: yield y for y in g.parseNout(x): ... # Beware: s = g.parseNout(x) s.append(...)
Example, Java:
class Graph: // return by reference public Iterable parseNout(Vertex x) { return _out.get(x); } // return a copy public Iterable<Vertex> parseNout(Vertex x) { return new ArrayList<Vertex>(_out.get(x)); } // return a read-only wrapper over the direct reference public Iterable parseNout(Vertex x) { return Collections.unmodifyableList(_out.get(x)); } private Map > _out; }
Example, C++
class Graph: // Standard C++ collection class iterator {...} iterator parseNout_begin(x){..} iterator parseNout_end(x){..} // ad-hoc - return by value listRadu-Lucian LUPŞAparseNout(Vertex x) }