
Comments (4)

alazzaro commented on July 22, 2024

I'm currently working with matrices where >99% of the values are 0, and it's very unlikely most of the blocks will be larger than 1x1 in size. However, the libsmm_acc code makes me think it could still be useful there, since SMM is used for the smaller blocks and there's some CUDA optimization.

The <1% occupancy is fine for the library. However, as you pointed out, the library is really optimized for block-sparsity (DBCSR = distributed block CSR), so you will not get any special optimization for the multiplication of such blocks (they will simply run on the CPU, i.e. no GPU involved).

I was able to create large matrices by modifying the example dbcsr_example_3.cpp and got 1 teraflop performance, but with occupation of 1.

For this particular case of dense matrices, we run a "densification" algorithm.

It looks like when I assign values with c_dbcsr_iterator_next_2d_block_d, it treats the blocks like full 2D matrices. I could still calculate the 1x1 blocks and their resulting locations, but I feel like that wouldn't be very fast.

This is correct: we always assume blocks, so single elements are treated as 1x1 blocks. We don't run any special optimization for that (it is a bit out of scope for DBCSR). The library will still work, but I would expect the performance to be low.

In conclusion, sparsity is fine (we usually run tests at 0.01% occupancy), but we do expect block-sparse matrices to get the best performance out of the library.

from dbcsr.

SimLeek commented on July 22, 2024

Fair. I guess 'maximum local sparsity' should be something like 75% for a block size, and it might be useful to do a game of life style convolve, 'if 3 values in a square of 4 involving this location are non-zero, then count this zero as non-zero', then build an R-tree for 100% dense rectangles, if you know there's going to be grouping but you don't know exactly where.
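The "3 out of 4 in a square" rule above could be sketched roughly as follows. This is only an illustration of the idea in plain Python, not anything DBCSR provides: the function name `fill_near_dense` and the boolean-mask representation are assumptions, and the later R-tree step for finding dense rectangles is not covered.

```python
def fill_near_dense(mask):
    """Return a copy of `mask` where a zero cell is marked non-zero if some
    2x2 square containing it already has its other three cells non-zero.

    `mask` is a list of equal-length lists of booleans (True = non-zero).
    This is a single pass: decisions are based on the original mask only.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    out = [row[:] for row in mask]
    for i in range(rows):
        for j in range(cols):
            if mask[i][j]:
                continue  # already non-zero, nothing to decide
            # Top-left corners of the (up to four) 2x2 squares containing (i, j).
            for r0 in (i - 1, i):
                for c0 in (j - 1, j):
                    if r0 < 0 or c0 < 0 or r0 + 1 >= rows or c0 + 1 >= cols:
                        continue  # square would fall outside the matrix
                    count = sum(
                        mask[r][c]
                        for r in (r0, r0 + 1)
                        for c in (c0, c0 + 1)
                    )
                    if count >= 3:
                        out[i][j] = True
    return out


pattern = [
    [True, True, False],
    [True, False, False],
    [False, False, False],
]
filled = fill_near_dense(pattern)
```

Here only the centre cell gets filled in, because the top-left 2x2 square is the only one with three non-zero neighbours.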

One of my matrices might be like that in a best-case scenario, with most non-zero elements along the diagonal, and I might be able to just set some grouping on the diagonal ahead of time, but the other matrices will actually avoid clumping.

Anyway, I think that answers the question in this issue: libsmm_acc does not optimize 1x1 blocks, nor does any other part of the library, since that's outside the scope of DBCSR. If that's correct, I think this can be closed.


alazzaro commented on July 22, 2024

Anyway, I think that answers the question in this issue: libsmm_acc does not optimize 1x1 blocks, nor does any other part of the library, since that's outside the scope of DBCSR. If that's correct, I think this can be closed.

This is correct. Note that the library will work for 1x1 elements, but there will be a massive overhead from assuming "blocks" (lookup of the dimensions and calculation of the position), which is useless for single elements... On the other hand, we could think of building the matrices with blocks of a given dimension, where the blocks are sparse inside. Then what we need is a way to group 1x1 blocks into larger blocks...
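To make the grouping idea concrete, here is a minimal sketch of binning single elements into fixed-size blocks that are sparse inside. It is plain Python and does not use the DBCSR API; `group_into_blocks` and `block_fill_ratio` are hypothetical helper names, and the coordinate-list input format is an assumption.

```python
def group_into_blocks(entries, block_size):
    """Bin (row, col, value) entries into dense block_size x block_size blocks.

    Returns a dict mapping (block_row, block_col) to a dense block stored as
    a list of lists. Only blocks with at least one entry are created.
    """
    blocks = {}
    for row, col, value in entries:
        key = (row // block_size, col // block_size)
        if key not in blocks:
            blocks[key] = [[0.0] * block_size for _ in range(block_size)]
        # Position of the element inside its block.
        blocks[key][row % block_size][col % block_size] = value
    return blocks


def block_fill_ratio(blocks, block_size):
    """Fraction of stored block cells that actually hold a non-zero value."""
    stored = len(blocks) * block_size * block_size
    if stored == 0:
        return 0.0
    nonzero = sum(
        1 for block in blocks.values() for r in block for v in r if v != 0.0
    )
    return nonzero / stored


entries = [(0, 0, 1.0), (1, 1, 2.0), (5, 4, 3.0)]
blocks = group_into_blocks(entries, block_size=2)
ratio = block_fill_ratio(blocks, block_size=2)
```

The fill ratio gives a rough sense of the trade-off: larger blocks mean fewer block lookups but more stored zeros, so a low ratio suggests the blocks are mostly padding.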

Out of curiosity, what's your use case?


SimLeek commented on July 22, 2024

I'm trying to get optimized sparsity working for linear layers in PyTorch, preferably with each layer being variational to maintain sparsity, and then for conv layers. I'm now looking at smaller SpGEMM and SpMSpV libraries that look hopeful, but I'll have to test them.

Unfortunately, that variational rule is likely what's going to make it so that most of the blocks, even 2x2, will be sparse. And now that I think about it, that might apply to the synapses as well as the input, since having meaningfully different input connections would help keep an output from activating at the same time as another output.
