
pattenmatching_sharedmemory's Introduction

PruneJuice on shared memory

This is a single-node implementation of the pattern matching pipeline presented in Reza, T., Ripeanu, M., Tripoul, N., Sanders, G., and Pearce, R. PruneJuice: Pruning Trillion-edge Graphs to a Precise Pattern-Matching Solution. In Proceedings of The IEEE/ACM International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'18), Dallas, Texas, 11 - 16 November, 2018.

Its main contribution relative to the distributed-memory system is the ability to dynamically aggregate topological information from the graph to inform the pruning heuristic. This is described in Tripoul, N., Halawa, H., Reza, T., Ripeanu, M., Sanders, G., and Pearce, R. There are Trillions of Little Forks in the Road. Choose Wisely! - Estimating the Cost and Likelihood of Success of Constrained Walks to Optimize a Graph Pruning Pipeline. In Proceedings of The 8th Workshop on Irregular Applications: Architectures and Algorithms (IA^3'18), co-located with SC'18, Dallas, Texas, 11 - 16 November, 2018.

Installation

This software requires CMake >= 3.10.

sudo apt install cmake
cmake --version

The TOTEM framework, on which the pattern matching algorithm is built, requires both CUDA >= 9.1 and the Intel Threading Building Blocks (TBB) library.

sudo apt install nvidia-cuda-toolkit gcc-6
nvcc --version

sudo mkdir -p /usr/local/cuda/bin
sudo ln -s /usr/bin/nvcc /usr/local/cuda/bin/nvcc
sudo apt-get install libtbb-dev
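
The minimum versions stated above (CMake >= 3.10, CUDA >= 9.1) can be checked with a short script. The following is a minimal sketch, not part of the repository: it assumes Python 3.7 or later, and the names `REQUIREMENTS`, `parse_version`, `meets_minimum`, and `check_tool` are hypothetical helpers introduced here for illustration. It relies on `cmake --version` printing `cmake version X.Y.Z` and `nvcc --version` printing a line containing `release X.Y`.

```python
import re
import shutil
import subprocess

# Minimum versions stated in this README, paired with a regex for each
# tool's `--version` output. These helper names are illustrative only.
REQUIREMENTS = {
    "cmake": ((3, 10), r"cmake version (\d+(?:\.\d+)+)"),
    "nvcc": ((9, 1), r"release (\d+(?:\.\d+)+)"),
}


def parse_version(output, pattern):
    """Extract a dotted version from tool output as a tuple of ints, or None."""
    match = re.search(pattern, output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, required):
    """Compare only as many components as the requirement specifies."""
    return found is not None and found[:len(required)] >= required


def check_tool(tool):
    """Return (found_version, ok); found_version is None if the tool is missing."""
    required, pattern = REQUIREMENTS[tool]
    if shutil.which(tool) is None:
        return None, False
    result = subprocess.run([tool, "--version"], capture_output=True, text=True)
    found = parse_version(result.stdout, pattern)
    return found, meets_minimum(found, required)


if __name__ == "__main__":
    for tool in REQUIREMENTS:
        found, ok = check_tool(tool)
        print(tool, found, "OK" if ok else "missing or too old")
```

The script only reports what it finds; it does not check for the TBB library, which is installed through the package manager above.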

Once the prerequisites are installed, you can clone the GitHub repository and start running an example.

git clone https://github.com/tahsinreza/pattenmatching_sharedmemory.git

Then run make from the cloned repository to generate the CMake configuration and build the various executables into the build/ folder.

make all

Testing the pattern matching algorithm

For convenience, two Python scripts run the experiments from the following paper and display their results: http://www.ece.ubc.ca/~matei/papers/ia3-nicolas.pdf

python run_patternmatching.py --help 

usage: run_patternmatching.py [-h] [--experiment {1,2,3,all}]
                              [--dataset {test_1,test_2,test_3,test_4,test_all,youtube,patent,IMDB,reddit,all}]

Run patternmatching tests.

optional arguments:
  -h, --help            show this help message and exit
  --experiment {1,2,3,all}
                        experiment to run (default="all")
  --dataset {test_1,test_2,test_3,test_4,test_all,youtube,patent,IMDB,reddit,all}

The test datasets are already included in the repository at 'data/patternmatching/test_*'. To launch the experiments on those datasets, run the following command. It should take a few seconds to finish.

python run_patternmatching.py --dataset test_all --experiment all

When this is done, the results can be displayed with

python show_patternmatching_results.py --dataset test_all --experiment all

It accepts the same arguments as run_patternmatching.py.
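
Because both scripts take the same arguments, the run-then-show sequence can be wrapped in one helper. The following is a minimal sketch under stated assumptions: the wrapper functions `build_commands` and `run_and_show` are hypothetical and not part of the repository, the choices are copied from the `--help` output above, and the scripts are assumed to be invoked from the repository root with `python` on the PATH.

```python
import subprocess

# Choices copied from the --help output of run_patternmatching.py.
EXPERIMENTS = {"1", "2", "3", "all"}
DATASETS = {"test_1", "test_2", "test_3", "test_4", "test_all",
            "youtube", "patent", "IMDB", "reddit", "all"}

SCRIPTS = ("run_patternmatching.py", "show_patternmatching_results.py")


def build_commands(dataset, experiment="all"):
    """Return the two command lines (run, then show) for one dataset."""
    if dataset not in DATASETS:
        raise ValueError("unknown dataset: %s" % dataset)
    if experiment not in EXPERIMENTS:
        raise ValueError("unknown experiment: %s" % experiment)
    args = ["--dataset", dataset, "--experiment", experiment]
    return [["python", script] + args for script in SCRIPTS]


def run_and_show(dataset, experiment="all"):
    """Run the experiments, then display the results; stop on the first failure."""
    for command in build_commands(dataset, experiment):
        subprocess.run(command, check=True)
```

For example, `run_and_show("test_all")` would issue the same two commands shown above, one after the other.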

Additional datasets

The additional datasets are large (1 GB to 64 GB). They can be downloaded here:

They should be put inside the folder data/patternmatching/.
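
Before launching a long run, it can help to confirm that the downloaded datasets are where the scripts expect them. The following is a minimal sketch, assuming each dataset sits in its own subfolder of data/patternmatching/ named after the dataset (by analogy with the bundled test_* folders; the actual layout of the additional datasets is not confirmed here). The function `missing_datasets` is a hypothetical helper.

```python
from pathlib import Path


def missing_datasets(names, root="data/patternmatching"):
    """Return the dataset names that have no folder under root.

    Assumes one subfolder per dataset, named after the dataset.
    """
    base = Path(root)
    return [name for name in names if not (base / name).is_dir()]
```

Calling `missing_datasets(["youtube", "patent", "IMDB", "reddit"])` from the repository root would list the datasets that still need to be downloaded under this assumed layout.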

The youtube and patent datasets use simple patterns. On a large-memory (512 GB) machine with 4-socket Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz (40 cores), they take less than a minute to complete.

The IMDB dataset has harder patterns and can take several hours to complete.

The Reddit dataset is huge and uses very complex patterns. A large-memory machine is required (more than 128 GB of RAM). A run can take tens of hours to terminate.
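
Since a Reddit run can occupy a machine for a long time, a quick memory check beforehand may save a failed attempt. The following is a minimal sketch for Linux, not part of the repository: `total_memory_bytes` and `has_enough_memory` are hypothetical helpers, and the 128 GB figure is interpreted here as 128 GiB, which is an assumption.

```python
import os

# README: the Reddit dataset needs "more than 128 GB of RAM".
# Interpreting GB as GiB is an assumption of this sketch.
REQUIRED_GIB = 128


def total_memory_bytes():
    """Physical RAM in bytes, read through POSIX sysconf (available on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


def has_enough_memory(total_bytes, required_gib=REQUIRED_GIB):
    """True only if the machine has strictly more RAM than the requirement."""
    return total_bytes > required_gib * 2**30
```

A caller could evaluate `has_enough_memory(total_memory_bytes())` and skip `--dataset reddit` when it returns False.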

