What levels of sparsity is this useful for? about dbcsr HOT 4 OPEN

SimLeek commented on September 23, 2024

What levels of sparsity is this useful for?

from dbcsr.

Comments (4)

alazzaro commented on September 23, 2024 1

I'm currently working with matrices where >99% of the values of 0, and it's very unlikely most of the blocks will be larger than 1x1 in size. However, the libsmm_acc code makes me think it could still be useful there, since SMM is used for the smaller blocks and there's some cuda optimization.

The <1% occupancy is fine for the library. However, as you pointed out, the library is really optimized for block-sparsity (DBCSR = distributed block CSR), so you will not get any special optimization for the multiplication of such blocks (they will simply run on the CPU, i.e. no GPU involved).

I was able to create large matrices by modifying the example dbcsr_example_3.cpp and got 1 teraflop performance, but with occupation of 1.

For this particular case of dense matrices, we run a "densification" algorithm.

It looks like when I assign values with c_dbcsr_iterator_next_2d_block_d, it treats the blocks like full 2D matrices. I could still calculate the 1x1 blocks and their resulting locations, but I feel like that wouldn't be very fast.

This is correct, we always assume blocks, so single elements are treated as 1x1 blocks. We don't run any special optimizationf for that (it is a bit out of the scope for DBCSR). Still, the library will work, but I would assume the performance would be low.

In conclusion, sparsity is fine (we use to run tests with 0.01% of occupancy), but we do expect blocks-sparse to get the best performance out of the library.

from dbcsr.

SimLeek commented on September 23, 2024

Fair. I guess 'maximum local sparsity' should be something like 75% for a block size, and it might be useful to do a game of life style convolve, 'if 3 values in a square of 4 involving this location are non-zero, then count this zero as non-zero', then build an R-tree for 100% dense rectangles, if you know there's going to be grouping but you don't know exactly where.

One of my matrices might be like that in a best case scenario, with most non-zero elements along the diagonal, sand I might be able to just set some grouping on the diagonal ahead of time, but the other matrices will actually avoid clumping.

Anyway, I think that answers the question in this issue: libsmm_acc does not optimize 1x1 blocks, nor does any other part of the library, since that's outside the scope of DBCSR. If that's correct, I think this can be closed.

from dbcsr.

alazzaro commented on September 23, 2024

Anyway, I think that answers the question in this issue: libsmm_acc does not optimize 1x1 blocks, nor does any other part of the library, since that's outside the scope of DBCSR. If that's correct, I think this can be closed.

This is correct. Note that the library will work for 1x1 elements, but there will a massive overhead for assuming "blocks" (lookup of the dimensions and calculation of the position) that are useless for single elements... On the other hand, we could think to build the matrices with blocks of a given dimension, where the blocks are sparse inside. Then what we need is a way to group 1x1 blocks in larger blocks...

Out of curiosity, what's your use case?

from dbcsr.

SimLeek commented on September 23, 2024

I'm trying to get optimized sparsity working for linear layers in pytorch, preferably with each layer being variational to maintain sparsity, then conv layers. I'm looking at smaller SPGeMM and SpMSpV libraries now that look hopeful, but I'll have to test them.

Unfortunately, that variational rule is likely what's going to make it so that most of the blocks, even 2x2, will be sparse. And now that I think about it, that might apply to the synapses as well as the input, since having meaningfully different input connections would help keep an output from activating at the same time as another output.

from dbcsr.

What levels of sparsity is this useful for? about dbcsr HOT 4 OPEN

Comments (4)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent