GPU samples

Parallel reduction and pointer jumping algorithms that sum the values of an array, adapted to run on a GPU using Aparapi and JOCL (Java frontends to OpenCL).
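
As a rough illustration of the two algorithms named above, here is a minimal CPU-only sketch in plain Java. It is not the code from this repository and does not use Aparapi or JOCL: the class and method names (`ReductionSketch`, `treeReduce`, `pointerJumpingSum`) are hypothetical, and each inner `for` loop stands in for what would be a set of independent GPU work-items. The comments mark where a GPU version would need a synchronization point between passes.

```java
import java.util.stream.IntStream;

// Hypothetical illustration only: simulates the GPU passes sequentially on the CPU.
class ReductionSketch {

    /** Tree-shaped reduction: every pass roughly halves the number of active elements. */
    static long treeReduce(int[] input) {
        if (input.length == 0) return 0;
        long[] data = new long[input.length];
        for (int i = 0; i < input.length; i++) data[i] = input[i];
        int active = data.length;
        while (active > 1) {
            int half = (active + 1) / 2;  // round up so an odd middle element survives
            // On a GPU, each i would be a separate work-item; iterations are independent.
            for (int i = 0; i < active / 2; i++) {
                data[i] += data[i + half];
            }
            // A GPU kernel must synchronize here before starting the next pass.
            active = half;
        }
        return data[0];
    }

    /** Pointer jumping: each element adds its successor's value, then skips over it. */
    static long pointerJumpingSum(int[] input) {
        int n = input.length;
        if (n == 0) return 0;
        long[] value = new long[n];
        int[] next = new int[n];
        for (int i = 0; i < n; i++) {
            value[i] = input[i];
            next[i] = (i + 1 < n) ? i + 1 : -1;  // -1 marks the end of the list
        }
        boolean changed = true;
        while (changed) {
            changed = false;
            // Double buffering plays the role of the sync between passes.
            long[] newValue = value.clone();
            int[] newNext = next.clone();
            for (int i = 0; i < n; i++) {
                if (next[i] != -1) {
                    newValue[i] = value[i] + value[next[i]];
                    newNext[i] = next[next[i]];
                    changed = true;
                }
            }
            value = newValue;
            next = newNext;
        }
        return value[0];  // the head of the list now holds the total
    }

    public static void main(String[] args) {
        int[] input = IntStream.rangeClosed(1, 10).toArray();
        System.out.println(treeReduce(input));         // prints 55
        System.out.println(pointerJumpingSum(input));  // prints 55
    }
}
```

Both methods finish in a logarithmic number of passes, which is why the cost of synchronizing between passes matters on a GPU.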

Building and running the comparison of various sync methods in OpenCL parallel reduction

First, make sure that an OpenCL driver for your GPU is installed: Nvidia, AMD on Linux (on Windows, the AMD driver should hopefully be available by default).

./mvnw package
java -jar target/pointer-jumping-gpu-1.0-SNAPSHOT-jar-with-dependencies.jar

This runs the parallel reduction kernels using 3 different approaches to synchronization on arrays of various sizes from 32k to 128M elements, 50 times for each size. On my machine it takes about 5 minutes. For each size it outputs the average time for each sync method.
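
The averaging described above can be sketched as follows. This is a minimal sketch, not the benchmark from this repository: the class `AverageTimingSketch` and its methods are hypothetical, a plain sequential CPU sum stands in for the kernels, and measuring in nanoseconds via `System.nanoTime()` is an assumption (the unit of the reported numbers is not stated here). The output format only mimics the layout of the results listing.

```java
import java.util.Arrays;
import java.util.function.ToLongFunction;

// Hypothetical illustration only: times a reducer several times and averages the runs.
class AverageTimingSketch {

    // Written on every run so the JIT cannot discard the computed sum.
    static volatile long sink;

    /** Runs the reducer the given number of times and returns the mean elapsed time. */
    static long averageNanos(ToLongFunction<int[]> reducer, int[] input, int repetitions) {
        long total = 0;
        for (int run = 0; run < repetitions; run++) {
            long start = System.nanoTime();
            sink = reducer.applyAsLong(input);
            total += System.nanoTime() - start;
        }
        return total / repetitions;
    }

    /** Plain sequential sum, standing in for a real reduction kernel. */
    static long sequentialSum(int[] input) {
        long sum = 0;
        for (int v : input) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        int[] sizes = {32 * 1024, 128 * 1024, 1024 * 1024};  // a few example sizes
        for (int size : sizes) {
            int[] input = new int[size];
            Arrays.fill(input, 1);
            long avg = averageNanos(AverageTimingSketch::sequentialSum, input, 50);
            System.out.printf("%d elements:%n%7s average: %10d%n", size, "CPU", avg);
        }
    }
}
```

The first few runs of any Java benchmark include JIT warm-up, so averages over a small number of repetitions tend to overstate the steady-state time.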

These are the times I got on my integrated Intel GPU:

32k element array:

BARRIER average:     403076
   SIMD average:     295953
 HYBRID average:     269073
    CPU average:      62924

128k:

BARRIER average:     768170
   SIMD average:     483343
 HYBRID average:     433704
    CPU average:     175977

256k:

BARRIER average:    1018578
   SIMD average:     793267
 HYBRID average:     738423
    CPU average:     367999

512k:

BARRIER average:    1191166
   SIMD average:    1019678
 HYBRID average:     828609
    CPU average:     780270

1M:

BARRIER average:    1759843
   SIMD average:    1580668
 HYBRID average:    1366559
    CPU average:    1288948

2M:

BARRIER average:    3406786
   SIMD average:    3070155
 HYBRID average:    2398054
    CPU average:    2674748

3M:

BARRIER average:    4166284
   SIMD average:    4192948
 HYBRID average:    3480526
    CPU average:    3575055

4M-4k:

BARRIER average:    6573353 (1 recursive step on HYBRID)
   SIMD average:    6758205
 HYBRID average:    5653419
    CPU average:    5582159

4M:

BARRIER average:   13797841
   SIMD average:   13367851
 HYBRID average:   12600975
    CPU average:    5427631

32M:

BARRIER average:  102840013
   SIMD average:  103991061
 HYBRID average:   95481061
    CPU average:   41226782

128M:

BARRIER average:  363563970
   SIMD average:  387534517
 HYBRID average:  344870087
    CPU average:  160136923

255M:

BARRIER average:  878887550 (1 recursive step on HYBRID)
   SIMD average:  819652415
 HYBRID average:  730983353
    CPU average:  323803437

Contributors

morgwai
