netbench / gpcnet
License: Other
Building GPCNeT fails at link time with multiple definitions of the symbols table_outerbar, table_innerbar, and print_buffer:
$ make clean
rm -f *.o
rm -f network_test
rm -f network_load_test
$ make
cc -c -o network_test.o network_test.c -I .
cc -c -o random_ring.o random_ring.c -I .
cc -c -o collectives.o collectives.c -I .
cc -c -o subcomms.o subcomms.c -I .
cc -c -o utils.o utils.c -I .
cc -o network_test utils.o random_ring.o collectives.o subcomms.o network_test.o -I . -lm
/usr/bin/ld: random_ring.o:(.bss+0x0): multiple definition of `table_outerbar'; utils.o:(.bss+0x0): first defined here
/usr/bin/ld: random_ring.o:(.bss+0x60): multiple definition of `table_innerbar'; utils.o:(.bss+0x60): first defined here
/usr/bin/ld: random_ring.o:(.bss+0xc0): multiple definition of `print_buffer'; utils.o:(.bss+0xc0): first defined here
...
I suggest the following changes:
$ diff network_test.h.orig network_test.h
34c34
< char table_outerbar[TBLSIZE+1], table_innerbar[TBLSIZE+1], print_buffer[TBLSIZE+1];
---
> extern char table_outerbar[TBLSIZE+1], table_innerbar[TBLSIZE+1], print_buffer[TBLSIZE+1];
$ diff utils.c.orig utils.c
21a22,23
> char table_outerbar[TBLSIZE+1], table_innerbar[TBLSIZE+1], print_buffer[TBLSIZE+1];
>
$ make clean; make
rm -f *.o
rm -f network_test
rm -f network_load_test
cc -c -o network_test.o network_test.c -I .
cc -c -o random_ring.o random_ring.c -I .
cc -c -o collectives.o collectives.c -I .
cc -c -o subcomms.o subcomms.c -I .
cc -c -o utils.o utils.c -I .
cc -o network_test utils.o random_ring.o collectives.o subcomms.o network_test.o -I . -lm
$
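The pattern behind the suggested change can be sketched in a single file: the header carries only an extern declaration, and exactly one source file provides the definition. A likely reason the original header stopped linking is that GCC 10 and later default to -fno-common, so a tentative definition repeated in every translation unit is no longer merged by the linker. In the sketch below, the TBLSIZE value and the fill_bar() helper are assumptions made for illustration and are not taken from the GPCNeT sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TBLSIZE 20 /* assumed value, for illustration only */

/* What the header (network_test.h in the diff above) would carry:
 * a declaration only, which reserves no storage. */
extern char table_outerbar[TBLSIZE + 1];

/* What exactly one source file (utils.c in the diff above) would carry:
 * the single definition that reserves the storage. */
char table_outerbar[TBLSIZE + 1];

/* Hypothetical helper: fill a bar buffer with `width` copies of `fill`,
 * terminate it, and return the resulting string length. */
size_t fill_bar(char *bar, size_t width, char fill)
{
    assert(bar != NULL);
    memset(bar, fill, width);
    bar[width] = '\0';
    return strlen(bar);
}
```

Calling fill_bar(table_outerbar, TBLSIZE, '-') from any file that sees the extern declaration uses the one shared buffer, however many object files include the header.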
Hello,
I got the table below after running network_test.
I have two questions:
Kind regards,
Lucian Anton
Network Tests v1.3
Test with 14320 MPI ranks (1790 nodes)
Legend
RR = random ring communication pattern
Nat = natural ring communication pattern
Lat = latency
BW = bandwidth
BW+Sync = bandwidth with barrier
+------------------------------------------------------------------------------------------------------------------------------------------+
| Isolated Network Tests |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| Name | Min | Max | Avg | Avg(Worst) | 99% | 99.9% | Units |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| RR Two-sided Lat (8 B) | 1.2 | 22.2 | 1.5 | 4.7 | 3.6 | 5.1 | usec |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| RR Get Lat (8 B) | 1.3 | 22.3 | 1.9 | 3.7 | 2.2 | 3.6 | usec |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| RR Two-sided BW (131072 B) | 549.7 | 3015.1 | 1199.2 | 764.5 | 460.4 | 335.0 | MiB/s/rank |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| RR Put BW (131072 B) | 7.4 | 22134.8 | 2598.8 | 7.4 | 0.9 | 0.9 | MiB/s/rank |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| RR Two-sided BW+Sync (131072 B) | 336.2 | 2031.9 | 916.5 | 769.7 | 335.5 | 186.9 | MiB/s/rank |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| Nat Two-sided BW (131072 B) | 650.0 | 4913.7 | 1899.5 | 1124.1 | 1142.5 | 883.4 | MiB/s/rank |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| Multiple Allreduce (8 B) | 37.3 | 78.3 | 45.5 | 78.3 | 113.3 | 999.9 | usec |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
| Multiple Alltoall (4096 B) | 838.9 | 1003.9 | 901.6 | 838.9 | 479.3 | 186.3 | MiB/s/rank |
+---------------------------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+
Hello all,
We are working to run GPCNeT on the new ARM offerings from AWS, using EFA, and are getting segmentation faults in the congestor portion of the tests. First question: has anyone successfully compiled and run the benchmarks on ARM, and can they share any lessons learned? Second, any tips on where to start when looking into the segmentation faults in just the congestion portion?
Thanks!
I tried to do this by setting latency iterations, but malloc fails for larger (> 10,000,000) latency iterations.
Failed to allocate perf_vals in random_ring()
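One possible workaround, sketched here under the assumption that only the minimum, maximum, and average are needed, is to accumulate statistics as samples arrive instead of storing every sample. This is not how GPCNeT is written: the running_stats type and its functions are hypothetical, and the 99% and 99.9% columns of the results table could not be produced this way, since percentiles need the stored samples or a separate estimator.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical constant-memory accumulator; not part of GPCNeT. */
typedef struct {
    size_t count; /* number of samples seen so far */
    double min;   /* smallest sample */
    double max;   /* largest sample */
    double mean;  /* running average */
} running_stats;

/* Reset the accumulator before the first sample. */
void running_stats_init(running_stats *s)
{
    assert(s != NULL);
    s->count = 0;
    s->min = DBL_MAX;
    s->max = -DBL_MAX;
    s->mean = 0.0;
}

/* Fold one sample in; memory use does not grow with the iteration count. */
void running_stats_add(running_stats *s, double x)
{
    assert(s != NULL);
    s->count++;
    if (x < s->min) s->min = x;
    if (x > s->max) s->max = x;
    /* Incremental mean update: avoids keeping a large running sum. */
    s->mean += (x - s->mean) / (double)s->count;
}
```

A timing loop would call running_stats_init() once and running_stats_add() after each measured iteration, so the memory footprint stays fixed regardless of how many latency iterations are requested.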
Sometimes users would like to just run congestors without first running canaries.
Dear colleagues,
Thank you for the very interesting article about GPCNeT presented at SC19!
In my opinion, GPCNeT looks like a good attempt to fill the existing gap in congestion control studies of HPC networking.
I'm not sure whether GitHub is the right place to ask questions about GPCNeT, but why not?
Since congestion control is also one of my personal research interests, I decided to evaluate GPCNeT on a typical small cluster system with 32 nodes:
I ran network_load_test on 28 of the 32 nodes in several scenarios and got the results presented below:
I'm curious why there is no congestion impact in any scenario (except some random noise from time to time). I came up with several hypotheses:
In my opinion, the first hypothesis, headroom in the switch, is the case here.
What do you think? Maybe there were attempts to run GPCNeT on small clusters that are not mentioned in the paper?
In any case, I would be grateful for any discussion or explanation :-)
Mikhail