bench2's Introduction

Benchmark on Neoverse-V1 (Graviton 3)

CFLAGS=-Wall -O3 -march=armv8-a+sve:

❯ ./main 300 10000
v_sparse_dot ------ Value: 3.028991, Time: 907.99ns/iter
v_sparse_dot_omp -- Value: 3.028991, Time: 1215.47ns/iter
v_sparse_dot_sve -- Value: 3.028991, Time: 1110.44ns/iter

CFLAGS=-Wall -O3 -mcpu=neoverse-v1:

❯ ./main 300 10000
v_sparse_dot ------ Value: 4.321673, Time: 1696.15ns/iter
v_sparse_dot_omp -- Value: 4.321673, Time: 1218.40ns/iter
v_sparse_dot_sve -- Value: 4.321673, Time: 1085.07ns/iter

bench2's People

Contributors

silver-ymz

bench2's Issues

The Makefile is missing the -fopenmp C flag, so the compiler ignores the related OpenMP pragmas.

I experimented with the code here because there was a concern about the -mcpu=neoverse-v1 versus -march=armv8-a+sve flags.
With GCC-11, two of the tests are faster than with clang under either cpu/arch flag:

v_sparse_dot ------ Value: 2.551899, Time: 494.75ns/iter
v_sparse_dot_omp -- Value: 2.551900, Time: 2318.14ns/iter
v_sparse_dot_sve -- Value: 2.551899, Time: 951.34ns/iter

It appears that, for the function v_sparse_dot_omp, the #pragma omp directives are ignored when the Makefile is used as-is.
When -fopenmp is passed to enable OpenMP, the build fails:

clang-15 -Wall -O3 -mcpu=neoverse-v1 -fopenmp -c -o lib.o lib.c
lib.c:35:13: error: expected an OpenMP directive
#pragma omp reduction(| : m1) reduction(| : m2)
            ^
lib.c:33:3: error: statement after '#pragma omp parallel for' must be a for loop
  while (lhs_pos < lhs_loop_len && rhs_pos < rhs_loop_len) {
  ^
2 errors generated.
make: *** [Makefile:11: lib.o] Error 1

The same is true with -march=armv8-a+sve.
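
The two errors have separate causes: reduction(...) is a clause that must be attached to a directive such as parallel for, not a directive on its own, and parallel for only accepts a for loop in canonical form, so it cannot be applied to the while loop. The sketch below shows one possible shape that does compile with -fopenmp. It is not the code from lib.c, which is not reproduced in this issue: the function names, the parameter layout, and the assumption that both index arrays are sorted in ascending order are all hypothetical. It replaces the sequential two-pointer merge with a per-element binary search, which is a different algorithm and may well be slower for short vectors.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: two-pointer merge over sorted index arrays.
 * Each step depends on the previous one, so this loop cannot be
 * handed to "#pragma omp parallel for". */
float sparse_dot_merge(const unsigned *lhs_idx, const float *lhs_val, size_t lhs_len,
                       const unsigned *rhs_idx, const float *rhs_val, size_t rhs_len)
{
    size_t i = 0, j = 0;
    float sum = 0.0f;
    while (i < lhs_len && j < rhs_len) {
        if (lhs_idx[i] == rhs_idx[j]) {
            sum += lhs_val[i] * rhs_val[j];
            i++;
            j++;
        } else if (lhs_idx[i] < rhs_idx[j]) {
            i++;
        } else {
            j++;
        }
    }
    return sum;
}

/* Binary search in a sorted index array.
 * Returns the position of key, or n if key is absent. */
static size_t find_index(const unsigned *idx, size_t n, unsigned key)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (idx[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < n && idx[lo] == key) ? lo : n;
}

/* Hypothetical OpenMP-friendly variant: a canonical for loop whose
 * iterations are independent, with the reduction written as a clause
 * on the directive. Without -fopenmp the pragma is ignored and the
 * loop simply runs serially. */
float sparse_dot_omp_sketch(const unsigned *lhs_idx, const float *lhs_val, size_t lhs_len,
                            const unsigned *rhs_idx, const float *rhs_val, size_t rhs_len)
{
    assert(lhs_len == 0 || (lhs_idx != NULL && lhs_val != NULL));
    float sum = 0.0f;
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < (long)lhs_len; i++) {
        size_t j = find_index(rhs_idx, rhs_len, lhs_idx[i]);
        if (j < rhs_len)
            sum += lhs_val[i] * rhs_val[j];
    }
    return sum;
}
```

With a parallel floating-point reduction the summation order is not fixed, so the last digits of the result can differ slightly from the serial version on general inputs.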
