sbeamer / gapbs Goto Github PK

View Code? Open in Web Editor NEW

303.0 303.0 129.0 139 KB

GAP Benchmark Suite

Home Page: http://gap.cs.berkeley.edu/benchmark.html

License: Other

Makefile 6.23% C++ 92.87% Emacs Lisp 0.90%

benchmark bfs graph-algorithms openmp

gapbs's People

Contributors

Stargazers

Watchers

Forkers

mminutoli danteisalive vesvis tgoodyear yunmingzhang17 jspark1105 whu-zhigao manishsat sampollard jleidel alonamid faintfay daidong jagadeesh545 mortezaramezani wfang michaelsutton gr-rahimi alenks cpape129 david19han ogreen ledif westernmagic sandy4321 sizhuo-zhang cookiestuf x-y-z mbrukman arithmetica foxjas dpreed jesunsahariar petosa lushl9301 rdfriese gurbinder533 szarnyasg orrrrtem warsier juliantoya marjanfariborz wheatman guoqinglei kacpro davidoster qianlong-zhang itsfrank amoghlonkar darchr liuzhidan azurite rasviitanen dz333 tejashah94 kchester12 spruett connormas abhishekkumarjain akatsukis seanzw utkarshsaxena1 libaoke subbubhimala sg0 dhschall louisjenkinscs mandeep-khadka artorhem maiablanco brightclark laruim l-hoang manish-22 ahn9807 louis-yankang soupy710 jiangjiaqi6 zongwuwang shuiliusheng gshimansky unikernellinux heshpdx mrutt92 zooooey marfariborz owm111 bsg-external sid-agrawal kevins981 stardust-sjf alisemi pongstorn-xilinx kromancer yjsoh michaelleerilee jessdagostini multi-v-vm naruto110 deglaus

gapbs's Issues

Adding new kernels to the framework

Good morning folks!

I am writing this to you as a question/suggestion. Currently working on graph-processing and going through recent literature on the topic, I figured out that most papers make use of either the Ligra framework or this one. It turns out, Ligra implements some kernels that are not available here (e.g.: Raddi, MIS).

However, this framework supporting a widely available graph file format, which I believe makes it a better fit for adoption and I do believe that we would all benefit from the addition of some extra kernels. So there comes the question: do plan including some more kernels?

bfs verification fails on kronecker graphs scale >= 16

I apologize if this is a picnic problem (i.e. mine, not gapbs) especially since I recall but can't find something about the bfs kernel run time being limited, but I have experienced bfs verification failures on Kronecker graphs of scale 16 and larger (smaller graphs pass).

gapbs-master$ make clean all
rm -f bc bfs cc pr sssp tc converter test/out/*
g++ -std=c++11 -O3 -Wall -fopenmp src/bc.cc -o bc
g++ -std=c++11 -O3 -Wall -fopenmp src/bfs.cc -o bfs
g++ -std=c++11 -O3 -Wall -fopenmp src/cc.cc -o cc
g++ -std=c++11 -O3 -Wall -fopenmp src/pr.cc -o pr
g++ -std=c++11 -O3 -Wall -fopenmp src/sssp.cc -o sssp
g++ -std=c++11 -O3 -Wall -fopenmp src/tc.cc -o tc
g++ -std=c++11 -O3 -Wall -fopenmp src/converter.cc -o converter

gapbs-master$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu116.04.4' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu116.04.4)

gapbs-master$ ./bfs -g 15 -n 1 -v
Generate Time: 0.07998
Build Time: 0.02079
Graph has 32767 nodes and 441438 undirected edges for degree: 13
Source 6549.00000
i 0.00004
td 146 0.00005
e 0.00001
bu 14340 0.00022
bu 9608 0.00011
bu 109 0.00005
c 0.00002
td 0 0.00001
Trial Time: 0.00059
Verification: PASS
Verification Time: 0.00266
Average Time: 0.00059

gapbs-master$ ./bfs -g 16 -n 1 -v
Generate Time: 0.10371
Build Time: 0.05079
Graph has 65536 nodes and 909645 undirected edges for degree: 13
Source 13098.00000
i 0.00006
td 23 0.00007
td 7636 0.00007
e 0.00004
bu 36959 0.00032
bu 2062 0.00012
c 0.00005
td 7 0.00002
td 0 0.00001
Trial Time: 0.00084
Reachability mismatch
Verification: FAIL
Verification Time: 0.00605
Average Time: 0.00084

It doesn't seem to be a problem with uniform graphs...

gapbs-master$ ./bfs -u 20 -n 1 -v
Generate Time: 0.28557
Build Time: 0.90005
Graph has 1048576 nodes and 16776968 undirected edges for degree: 15
Source 209629.00000
i 0.00079
td 29 0.00006
td 872 0.00002
td 27534 0.00037
td 579473 0.00761
e 0.00592
bu 440667 0.00727
bu 0 0.00041
c 0.00055
Trial Time: 0.02335
Verification: PASS
Verification Time: 0.52839
Average Time: 0.02335

"make test" dependency is missing

make -j test sometimes fails because the directory test/out is not built yet.
Here's a sample output (the output is not deterministic because I used make -j):

...
./bc -g10 -vn1 > test/out/verify-bc-g10.out
./bfs -g10 -vn1 > test/out/verify-bfs-g10.out
 PASS Build
./cc -g10 -vn1 > test/out/verify-cc-g10.out
./cc_sv -g10 -vn1 > test/out/verify-cc_sv-g10.out
/bin/sh: 1: cannot create test/out/verify-bc-g10.out: Directory nonexistent
test/test.mk:77: recipe for target 'test/out/verify-bc-g10.out' failed

This problem can be fixed by adding the test/out dependency in the following rule:

test/out/verify-%-$(TEST_GRAPH).out: test/out %
    ./$* -$(TEST_GRAPH) -vn1 > $@

Out-of-bounds access in builder.h?

Hi Scott,

It seems like when symmetrizing a graph, MakeCSRInPlace() accesses an out-of-bounds element and miscalculates total_missing_inv. At line 252:

      for (NodeID_ n = 0; n <= num_nodes_; n++) {
        offsets[n] += total_missing_inv;
        total_missing_inv += invs_needed[n]; // n will eventually be num_nodes_
      }

Range of n is [0, num_nodes_]. This is fine for offsets with a size of num_nodes_ + 1. But invs_needed[num_nodes_] goes out of bounds. Consequently an undetermined value is added to total_missing_inv, and I think it affects the second step of symmetrizing as well (at line 268).

My current fix is just executing the last round of loop separately (as below). I would like to confirm with you whether I understand the code correctly and if this fix would solve it.

      for (NodeID_ n = 0; n < num_nodes_; n++) {
        offsets[n] += total_missing_inv;
        total_missing_inv += invs_needed[n];
      }
      offsets[num_nodes_] += total_missing_inv;

Thank you for your time!

The calculation of num_edges may cause problems.

gapbs/src/graph.h

Line 134 in ad3be9e

num_edges_ = (out_index_[num_nodes_] - out_index_[0]) / 2;

Hi, I wonder why should we divide by 2 when calculating the number of edges. I guess it probably means the edges are bi-directional?
However, when using the num_edges value could bring errors.

Please kindly check this part.

Thanks

Request help for additional function about get the start vertex from the file

Dear Maintainer，

I hope to add a feature to Gapbs that can read a list of nodes from a file or traverse all nodes. I compiled the new code locally (ubuntu 20.04) and added a command line option.

I would like to confirm with you that my modifications will not impact the performance of GAPBS and will ensure that BFS operates in the correct mode. This is crucial for our performance evaluation of GAPBS, and we greatly appreciate your attention to this matter. If you have any concerns about our modifications, please also let us know here.

I support the pull request for detial information. #39

gzip error during make bench-graphs

Hello, I am trying to build the input graphs using the command make bench-graphs but I encounter an error related to gzip as follows:

gunzip -k benchmark/graphs/raw/USA-road-d.USA.gr.gz
gzip: invalid option -- 'k'
Try 'gzip --help' for more information.
make: *** [benchmark/graphs/raw/USA-road-d.USA.gr] Error 1

My current gzip version is gzip 1.5 , do you require a specific version? Thanks!

How to prove relabelling in triangle counting is effective？

Is there any relevant literature describing the validity of relabelling?

urand graph generation test fails on Ubuntu 22

I upgraded my system from Ubuntu 20 to 22 and now all tests pass except for test-generate-u10.
This test runs ./bfs -u10 -n0 and compares it to a reference output.
The number of edges is 16104 on Ubuntu 20 and 16104 on Ubuntu 22.
Digging deeper, I think that the problem is that the uniform_int_distribution of libstdc++ was changed.
(Ubuntu 22 default gcc version is 11.2.0 whereas Ubuntu 20 uses gcc 9.4.0.)

To demonstrate the problem I slightly modified the code:

diff --git a/src/generator.h b/src/generator.h
index 3127f4b..77dbfb5 100644
--- a/src/generator.h
+++ b/src/generator.h
@@ -72,6 +72,7 @@ class Generator {
         rng.seed(kRandSeed + block/block_size);
         for (int64_t e=block; e < std::min(block+block_size, num_edges_); e++) {
           el[e] = Edge(udist(rng), udist(rng));
+          std::cout << "edge(" << e << ")=" << el[e].u << "," << el[e].v << std::endl;
         }
       }
     }

The diff between Ubuntu 20 and 22 outputs is:

4874c4874
< edge(4873)=662,102
---
> edge(4873)=661,102
5160c5160
< edge(5159)=942,269
---
> edge(5159)=941,269
7620c7620
< edge(7619)=263,264
---
> edge(7619)=263,263
15027c15027
< edge(15026)=641,856
---
> edge(15026)=641,855
16385,16387c16385,16387
< Generate Time:       0.07683 
< Build Time:          0.04481 
< Graph has 1024 nodes and 16104 undirected edges for degree: 15
---
> Generate Time:       0.06717 
> Build Time:          0.01616 
> Graph has 1024 nodes and 16103 undirected edges for degree: 15

You can see that some edges changed, probably due to the uniform random number engine. Specifically, Edge 7619 turned into a self loop, so it got squished, reducing the number of edges by 1.

I'm not sure how to fix this issue. Perhaps adding some tolerance to the edge count comparison in the tests?

terminate called after throwing an instance of 'std::bad_array_new_length'

Hi sbeamer:

I tried to compile and run the gapbs benchmark. I get the error of std::bad_array_new_length

The steps to reproduce:

I run make bench-graphs to download the Graph.

ls benchmark/graphs/raw/ -l
total 31882424
lrwxrwxrwx 1 14 Sep 6 10:37 twitter.el -> twitter_rv.net
-rw-r----- 1 26172210176 Sep 5 18:27 twitter_rv.net
-rw-r----- 1 6475352982 Feb 17 2010 twitter_rv.tar.gz

Then I run make bench-run, and get the following error:

./converter -f benchmark/graphs/raw/twitter.el -b benchmark/graphs/twitter.sg
Read Time:           57.28654
terminate called after throwing an instance of 'std::bad_array_new_length'
  what():  std::bad_array_new_length
make: *** [benchmark/graphs/twitter.sg] Aborted

I am able to trace and the error coming at this place:

2  0x0000000000408850 in BuilderBase<long, long, long, true>::MakeGraph() () at src/builder.h:235
#3  0x0000000000403560 in main () at src/converter.cc:26
(gdb) n
53	    while (in >> u >> v) {
(gdb) l
48	  }
49	
50	  EdgeList ReadInEL(std::ifstream &in) {
51	    EdgeList el;
52	    NodeID_ u, v;
53	    while (in >> u >> v) {
**54	      el.push_back(Edge(u, v))**;
55	    }
56	    return el;
57	  }

Strange program names due to additional '$' in Makefile

Hi,
When I compile using make, the resulting names of the executables become: 'bc$', 'tc$', 'pr$' etc.
If I remove the last '$' in this line of the Makefile, the executable names are as expected bc, tc, pr.
What is this $ meant for? Could this be a typo?
Best,
Arnau

Atomic insert

Hello, I have a minor concern for bfs implementation. I am sorry if I miss something but I think in bottom-up function for the bfs (line 57) insertion should be done atomically. Otherwise that might cause a race condition when two threads try to update same word in the bitmap.

Segmentation fault found on SSSP

Hi, I am using GAP for a performance study. I found a segmentation fault on SSSP while testing it on SuiteSparse matrices, such as scircuit.mtx, shipsec1.mtx, consph.mtx, cop20k_A.mtx, and many others. The segmentation fault occurs at here, in which the index "dest_bin" can be '-1' in many cases, and the variable "new_dist" can be '-1' as well. The SSSP did pass on evaluated matrices like webbase-1M.mtx tho.
Could you please check? Thank you very much :)

How to set the path before we do CXX=g++-8 make

Distributed Version ?

Hi,
Is there distributed version? I only found the openMP version which is constrained to single machine.

Unable to make bench-graphs

This is on a machine with 64GB RAM and 934GB of free disk space.

asamara@server:~/gapbs$ make bench-graphs
./converter -g27 -k16 -wb benchmark/graphs/kron.wsg
Generate Time:       176.26653
Build Time:          181.72024
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc

Generation of Kronecker graph

I want to use this repository for the generation of Kronecker graph.
Can anyone help me from where I shall start?

In readme.md it is given that:
./bfs -g 10 -n 1
-g 20 generates a Kronecker graph with 2^20 vertices (Graph500 specifications)
But where does this generated graph gets stored?

Thanks in advance

Issue in `converter` for machine with low config, can there other raw graph avaliable except twiter/sk/USA road?

Certificate of the "web" dataset has expired

Running make bench-graphs yields:

...
wget -P benchmark/graphs/raw https://sparse.tamu.edu/MM/LAW/sk-2005.tar.gz
--2022-08-07 13:17:01--  https://sparse.tamu.edu/MM/LAW/sk-2005.tar.gz
Resolving sparse.tamu.edu (sparse.tamu.edu)... 54.237.159.171, 52.21.227.162, 3.226.182.14, ...
Connecting to sparse.tamu.edu (sparse.tamu.edu)|54.237.159.171|:443... connected.
ERROR: cannot verify sparse.tamu.edu's certificate, issued by ‘CN=InCommon RSA Server CA,OU=InCommon,O=Internet2,L=Ann Arbor,ST=MI,C=US’:
  Issued certificate has expired.
To connect to sparse.tamu.edu insecurely, use `--no-check-certificate'.
make: *** [benchmark/bench.mk:68: benchmark/graphs/raw/sk-2005.tar.gz] Error 5

Segmentation fault while running on large dataset

Hi, i have run the bfs benchmark using a large configuration and it returns segmentation fault after running awhile.

[root@local gapbs]# ./bfs -g 32 - n 1
Segmentation fault

Would you let me know where i can find the detailed logs?

thanks,

Dong

Using graphs from LDBC

Good morning,

As stated by @sbeamer in #25, having a variety of inputs is more important than having a handful of kernels. As I am trying to use more kernels for my research projects, I came across the LDBC Graphalytics Datasets library which provides a handful of graphs (and I believe some are overlapping with the ones provided by GAP).

So far I have been trying to convert the graph to the .sg and .wsg formats support by GAP, but unsuccessfully.

For each graph that one may want to use, the library provides a .zip file containing a .e file which is supposed to be an edge-list or a weighted edge-list format. Attempting to convert them to the serialized formats I have been trying the following commands:

$ ./converter -sf benchmark/graphs/raw/datagen-8_2-zf.wel -wb benchmark/graphs/datagen-8_2-zf.wsg
$ ./converter -sf benchmark/graphs/raw/datagen-8_2-zf.el -b benchmark/graphs/datagen-8_2-zf.sg

But it turns out I consistently get the following output:

Read Time:           0.00003
Build Time:          0.00000
Graph has 1 nodes and 0 directed edges for degree: 0

Which is a bit odd knowing that the graph is supposed to be reasonably sized.

Do you know why I get that kind of behaviour from the converter? Am I doing something wrong here?

I believe that the file format matches. For instance, when I run the head command on the edge-list file, I get the following output:

6 13194202950341 1
10 13194193460473 1
41 10995156189111 1.01186
41 10995167006541 1
41 13194202501832 1
41 13194202844825 1
48 10995127351388 1.30612
48 13194202613934 1
50 13194202934774 1
59 13194202476756 1

running on server with 32GB of RAM

Hi Community,
I'm a student and I'm researching on a way to improve cache. And I want run GAP Benchmark on my server to test that way. But this server has only 32GB of RAM memory.
Is there any way to run this GAP benchmark on the server?
Is there any lighter version of the benchmark that be eligible for 32GB of RAM memory?

Thank you,
With all due respect

Let's make the world better