gpu-programming-mn-matrices's Introduction

Assignment III

GPU Programming – Fall 2021

  1. Change into the 08-intro-to-cuda directory.
  2. Examine the source code in add-vectors.cu until you are comfortable with its operation. In particular, be sure you can identify which parts of the program correspond with each part of the pattern described in the program's heading comments.
  3. Compile and run the program:

nvcc -o add-vectors add-vectors.cu

./add-vectors

  4. The output will probably not be too exciting, but it should convince you that the program is working correctly. Try running the program with different vector lengths:

./add-vectors 5

./add-vectors 50

./add-vectors 10000

./add-vectors 100000000

The program doesn't display vectors longer than 100 elements, so the last two commands won't produce any output. Notice, however, that the computation is correct for a range of sizes, even though our block size was set to 16.
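The reason a fixed block size can handle any vector length can be sketched as follows. This is a minimal illustration, not the contents of add-vectors.cu: the kernel name add_vectors_sketch, the test data, and the mismatch count are assumptions made for this example. The number of blocks is rounded up so that there is at least one thread per element, and a bounds check in the kernel makes the surplus threads in the last block do nothing.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel (not taken from add-vectors.cu): one thread per element.
__global__ void add_vectors_sketch( float *c, const float *a, const float *b, int n )
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if ( i < n )                                    // surplus threads do nothing
        c[i] = a[i] + b[i];
}

int main( int argc, char *argv[] )
{
    int n = ( argc > 1 ) ? atoi( argv[1] ) : 50;    // vector length
    if ( n <= 0 ) { fprintf( stderr, "vector length must be positive\n" ); return 1; }
    size_t bytes = (size_t) n * sizeof( float );

    // Host vectors with simple, easily checked contents.
    float *a = (float *) malloc( bytes );
    float *b = (float *) malloc( bytes );
    float *c = (float *) malloc( bytes );
    if ( !a || !b || !c ) { fprintf( stderr, "host allocation failed\n" ); return 1; }
    for ( int i = 0; i < n; i++ ) { a[i] = (float) i; b[i] = 1.0f; }

    // Device vectors and host-to-device copies.
    float *a_d, *b_d, *c_d;
    cudaMalloc( (void **) &a_d, bytes );
    cudaMalloc( (void **) &b_d, bytes );
    cudaMalloc( (void **) &c_d, bytes );
    cudaMemcpy( a_d, a, bytes, cudaMemcpyHostToDevice );
    cudaMemcpy( b_d, b, bytes, cudaMemcpyHostToDevice );

    // Round up: enough 16-thread blocks to cover all n elements.
    int block_size = 16;
    int num_blocks = ( n - 1 + block_size ) / block_size;
    add_vectors_sketch<<< num_blocks, block_size >>>( c_d, a_d, b_d, n );
    cudaError_t err = cudaGetLastError();           // catches launch errors
    if ( err == cudaSuccess )
        err = cudaMemcpy( c, c_d, bytes, cudaMemcpyDeviceToHost );
    if ( err != cudaSuccess )
    {
        fprintf( stderr, "CUDA error: %s\n", cudaGetErrorString( err ) );
        return 1;
    }

    // Compare against the same sum computed on the host.
    int mismatches = 0;
    for ( int i = 0; i < n; i++ )
        if ( c[i] != a[i] + b[i] ) mismatches++;
    printf( "n = %d, blocks = %d, mismatches = %d\n", n, num_blocks, mismatches );

    cudaFree( a_d ); cudaFree( b_d ); cudaFree( c_d );
    free( a ); free( b ); free( c );
    return 0;
}
```

With a vector length of 50, for instance, the rounding gives ( 50 - 1 + 16 ) / 16 = 4 blocks, or 64 threads, and the last 14 threads fail the bounds check.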

  5. CUDA SDKs since version 5.0 have included a profiler. You do not need to instrument or recompile your code; just run the profiler with your program and any arguments:

nvprof ./add-vectors 1000

The output will show timing information for each CUDA function. Notice that the program spends most of its time allocating memory on the device when the vector length is 1000. Now try

nvprof ./add-vectors 100000000

and you should find very different behavior; the time to copy memory to and from the device is the dominant time.

Now it's your turn

Exercise: Write a program that initializes two M×N matrices and computes the sum of the two matrices on the GPU device. After copying the result back to the host, your program should print out the result matrix if N≤10. You may use add-vectors.cu as a starting point or start from scratch.

It is natural to use a 2D grid for a matrix. In this case the block_size and num_blocks variables should be of type dim3. The kernel launch code shown below accomplishes this:

dim3 block_size( 16, 16 );
dim3 num_blocks( ( n - 1 + block_size.x ) / block_size.x,
                 ( m - 1 + block_size.y ) / block_size.y );
add_matrices<<< num_blocks, block_size >>>( c_d, a_d, b_d, m, n );

Of course, the kernel code will need to work correctly with a 2D grid rather than the 1D grid used in add-vectors.cu.
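The index arithmetic for a 2D grid can be sketched without solving the exercise itself. In the example below, each thread writes its own linear index into an M×N integer matrix instead of adding anything. The kernel name fill_linear_index, the row-major storage layout, and the default sizes are all assumptions made for this illustration; the assignment does not prescribe them.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: maps a 2D grid onto an m x n matrix stored in
// row-major order (an assumption) and records each element's linear index.
__global__ void fill_linear_index( int *out, int m, int n )
{
    int col = blockIdx.x * blockDim.x + threadIdx.x;  // x runs across columns
    int row = blockIdx.y * blockDim.y + threadIdx.y;  // y runs down rows
    if ( row < m && col < n )                         // skip threads outside the matrix
        out[row * n + col] = row * n + col;
}

int main( int argc, char *argv[] )
{
    int m = ( argc > 1 ) ? atoi( argv[1] ) : 20;      // rows
    int n = ( argc > 2 ) ? atoi( argv[2] ) : 40;      // columns
    if ( m <= 0 || n <= 0 ) { fprintf( stderr, "m and n must be positive\n" ); return 1; }
    size_t bytes = (size_t) m * n * sizeof( int );

    int *out = (int *) malloc( bytes );
    if ( !out ) { fprintf( stderr, "host allocation failed\n" ); return 1; }
    int *out_d;
    cudaMalloc( (void **) &out_d, bytes );

    // Same launch geometry as in the assignment text. For the defaults
    // m = 20, n = 40 this gives ( 40 - 1 + 16 ) / 16 = 3 blocks in x and
    // ( 20 - 1 + 16 ) / 16 = 2 blocks in y.
    dim3 block_size( 16, 16 );
    dim3 num_blocks( ( n - 1 + block_size.x ) / block_size.x,
                     ( m - 1 + block_size.y ) / block_size.y );
    fill_linear_index<<< num_blocks, block_size >>>( out_d, m, n );
    cudaError_t err = cudaGetLastError();             // catches launch errors
    if ( err == cudaSuccess )
        err = cudaMemcpy( out, out_d, bytes, cudaMemcpyDeviceToHost );
    if ( err != cudaSuccess )
    {
        fprintf( stderr, "CUDA error: %s\n", cudaGetErrorString( err ) );
        return 1;
    }

    // Every element should have been written exactly once with its own index.
    int mismatches = 0;
    for ( int k = 0; k < m * n; k++ )
        if ( out[k] != k ) mismatches++;
    printf( "grid = %u x %u blocks, mismatches = %d\n",
            num_blocks.x, num_blocks.y, mismatches );

    cudaFree( out_d );
    free( out );
    return 0;
}
```

Note that the x dimension is paired with the column count n and the y dimension with the row count m, matching the order of the arguments in the num_blocks expression above.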

Test your code with a range of values of M and N. For each case, run your program both without and with the profiler.

What to turn in

Please turn in a printout of your final matrix-addition source code along with a short report summarizing the profiling data.
