Ligra is a lightweight graph processing framework for shared memory. It is particularly suited for implementing parallel graph traversal algorithms where only a subset of the vertices are processed in an iteration. The project was motivated by the fact that the largest publicly available real-world graphs all fit in shared memory. When graphs fit in shared-memory, processing them using Ligra can give performance improvements of up to orders of magnitude compared to distributed-memory graph processing systems.
Ligra supports two data types, one representing agraph
, and another for representing subsets of the vertices, which is referred to asvertexSubsets
. Other than constructors and size queries, the interface supplies only two functions, one for mapping over vertices (vertexMap
) and the other for mapping over edges (edgeMap
). Since a vertexSubset is a subset of vertices, the vertexMap can be used to map over any subset of the original vertices, and hence its utility in traversal algorithms---or more generally in any algorithm in which only (possibly small) subsets of the graph are processed on each round. The edgeMap also processes a subset of the edges, which is specified using a vertexSubset to indicate the valid sources, and a Boolean function to indicate the valid targets of each edge.
For example, a parallel breadth-first search (BFS) algorithm is implemented using vertexSubsets and edgeMap:
The BFS uses a Parents array (initialized all to -1, except for the rootr
) in which each vertex will point to its parent in a BFS tree. As with standard parallel versions of BFS, on each stepi
(starting at 0) the algorithm maintains a frontier of all vertices reachable from the rootr
steps. Initially a vertexSubset containing just the root vertex is created to represent the frontier. Using edgeMap, each step checks the neighbors of the frontier to see which have not been visited, updates those to point to their parent in the frontier, and adds them to the next frontier. The user-supplied functionUpdate
atomically checks to see if a vertex has been visited using a compare-and-swap and returns true if not previously visited (Parents[i
] == -1). TheCond
function tells edgeMap to consider only target vertices which have not been visited. The edgeMap function returns a new vertex set containing the target vertices for whichUpdate
, all the vertices in the next frontier. The BFS completes when the frontier is empty and hence no more vertices are reachable.
Ligra uses an optimization for edgeMap which switches between read-based and write-based processing of frontier vertices based on the size of the frontier. This optimization was introduced for BFS byBeamer, Asanovic and Patterson
, and we generalize it to a variety of other graph algorithms besides BFS.
Ligra+ is an extension to Ligra that supports graph compression. The interface of Ligra+ is exactly the same as Ligra's interface, so code written in Ligra can run in Ligra+ without modification. Ligra+ currently supports three compression schemes---byte codes, byte codes with run-length encoding, and nibble codes. We show that using graph compression, we are able to reduce the space usage of graph algorithms and improve their parallel performance at the same time.
The source code for Ligra and Ligra+ is publicly availableon Github
. Implementations for breadth-first search, approximate betweenness centrality, graph radii estimation, connected components, PageRank and Bellman-Ford single-source shortest paths are currently available. Please contactJulian Shun
if you experience any problems running it. Feedback will be appreciated. We plan to add more algorithms implemented using Ligra. We welcome others to develop their own applications in Ligra, and would be happy to add some of them to the Github repository. Please send your implementations toJulian Shun