Gunrock is a CUDA library for graph processing designed specifically for the GPU. It uses a high-level, bulk-synchronous, data-centric abstraction focused on operations on a vertex or edge frontier. Gunrock achieves a balance between performance and expressiveness by coupling high-performance GPU computing primitives and optimization strategies with a high-level programming model that allows programmers to quickly develop new graph primitives with small code size and minimal GPU programming knowledge.
For more details, please visit our website, read Why Gunrock, our PPoPP 2016 paper, Gunrock: A High-Performance Graph Processing Library on the GPU, and check out the Publications section. See Release Notes to keep up with our latest changes.
- For Frequently Asked Questions, see the FAQ.
- For information on building Gunrock, see Building Gunrock.
- The "tests" subdirectory included with Gunrock has a comprehensive test application for most of the functionality of Gunrock.
- For the programming model we use in Gunrock, see Programming Model.
- To use our stats logging and performance chart generation pipeline, please check out Gunrock-to-JSON.
- We have also provided code samples for how to use Gunrock's C interface and how to call Gunrock primitives from Python, as well as annotated code for two typical graph primitives.
- For details on upcoming changes and features, see the Road Map.
To report Gunrock bugs or request features, please file an issue directly using GitHub.
Leyuan Wang, Yangzihao Wang, Carl Yang, and John D. Owens. A Comparative Study on Exact Triangle Counting Algorithms on the GPU. In Proceedings of the 1st High Performance Graph Processing Workshop, HPGP '16, May 2016. [DOI | http]
Yuechao Pan, Yangzihao Wang, Yuduo Wu, Carl Yang, and John D. Owens. Multi-GPU Graph Analytics. CoRR, abs/1504.04804(1504.04804v2), April 2016. [arXiv]
Yangzihao Wang, Andrew Davidson, Yuechao Pan, Yuduo Wu, Andy Riffel, and John D. Owens. Gunrock: A High-Performance Graph Processing Library on the GPU. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '16, pages 11:1--11:12, March 2016. Distinguished Paper. [DOI | http]
Yuduo Wu, Yangzihao Wang, Yuechao Pan, Carl Yang, and John D. Owens. Performance Characterization for High-Level Programming Models for GPU Graph Analytics. In IEEE International Symposium on Workload Characterization, IISWC2015, October 2015. [DOI | http]
Carl Yang, Yangzihao Wang, and John D. Owens. Fast Sparse Matrix and Sparse Vector Multiplication Algorithm on the GPU. In Graph Algorithms Building Blocks, GABB 2015, May 2015. [DOI | http]
Afton Geil, Yangzihao Wang, and John D. Owens. WTF, GPU! Computing Twitter's Who-To-Follow on the GPU. In Proceedings of the Second ACM Conference on Online Social Networks, COSN '14, pages 63--68, October 2014. [DOI | http]
GTC 2016, Gunrock: A Fast and Programmable Multi-GPU Graph Processing Library, April 2016. [slides]
NVIDIA webinar, April 2016. [slides]
GPU Technology Theater at SC15, Gunrock: A Fast and Programmable Multi-GPU Graph Processing Library, November 2015. [slides | video]
GTC 2014, High-Performance Graph Primitives on the GPU: Design and Implementation of Gunrock, March 2014. [slides | video]
- Yangzihao Wang, University of California, Davis
- Yuechao Pan, University of California, Davis
- Yuduo Wu, University of California, Davis
- Carl Yang, University of California, Davis
- Leyuan Wang, University of California, Davis
- Weitang Liu, University of California, Davis
- Muhammad Osama, University of California, Davis
- Chenshan Shari Yuan, University of California, Davis
- Andy Riffel, University of California, Davis
- Huan Zhang, University of California, Davis
- John Owens, University of California, Davis
Thanks to the following developers who contributed code:
- The connected-component implementation was derived from code written by Jyothish Soman, Kothapalli Kishore, and P. J. Narayanan and described in their IPDPSW '10 paper A Fast GPU Algorithm for Graph Connectivity (DOI).
- The breadth-first search implementation and many of the utility functions in Gunrock are derived from the b40c library of Duane Merrill. The algorithm is described in his PPoPP '12 paper Scalable GPU Graph Traversal (DOI).

Thanks to Erich Elsen and Vishal Vaidyanathan from Royal Caliber and the Onu Team for their discussion on library development and the dataset auto-generating code. Thanks to Adam McLaughlin for his technical discussion. Thanks to Oded Green for his technical discussion and an optimization in the CC primitive.
This work was funded by the DARPA XDATA program under AFRL Contract FA8750-13-C-0002, by NSF awards CCF-1017399, OCI-1032859, and CCF-1629657, by DARPA STTR award D14PC00023, and by DARPA SBIR award W911NF-16-C-0020. Our XDATA principal investigator is Eric Whyne of Data Tactics Corporation and our DARPA program managers are Dr. Christopher White (2012--2014) and Mr. Wade Shen (2015--present).
Gunrock is copyright The Regents of the University of California, 2013--2016. The library, examples, and all source code are released under Apache 2.0.