spbla's Introduction

spbla

spbla is a sparse Boolean linear algebra library that provides primitives and operations for working with sparse matrices on CPU, CUDA and OpenCL platforms. The primary goal of the library is the implementation, testing and profiling of algorithms for solving formal-language-constrained problems, such as context-free and regular path queries with various semantics for graph databases. The library provides a C-compatible API written in the GraphBLAS style. It ships with the Python package pyspbla, a wrapper for the spbla C API. This package exposes library features and primitives in a high-level form with automated resource management and syntactic sugar.

Features summary

  • Python package for every-day tasks
  • C API for performance-critical computations
  • CUDA backend for computations
  • OpenCL backend for computations
  • CPU (fallback) backend for computations
  • Matrix creation (empty, from data, with random data)
  • Matrix-matrix operations (multiplication, element-wise addition, kronecker product)
  • Matrix operations (equality, transpose, reduce to vector, extract sub-matrix)
  • Matrix data extraction (as lists, as list of pairs)
  • Matrix syntax sugar (pretty string printing, slicing, iterating through non-zero values)
  • IO (import/export matrix from/to .mtx file format)
  • GraphViz (export single matrix or set of matrices as a graph with custom color and label settings)
  • Debug (matrix string debug markers, logging)
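
To make the matrix operations in the list above concrete, here is a minimal pure-Python model of a Boolean sparse matrix as a set of (row, column) pairs. It is only a sketch of the semantics: the function names are hypothetical, it does not use or mirror the pyspbla API, and the reduction direction (row-wise) is an assumption.

```python
# Pure-Python model: a Boolean sparse matrix is the set of its true (row, col) cells.
# All helper names are hypothetical and are not part of pyspbla.

def ewise_add(a, b):
    """Element-wise Boolean addition (logical OR) is a set union."""
    return a | b

def transpose(a):
    """Swap row and column of every stored cell."""
    return {(j, i) for (i, j) in a}

def kronecker(a, b, b_shape):
    """Kronecker product: every true a[i, j] places a copy of b at block (i, j)."""
    b_rows, b_cols = b_shape
    return {(i * b_rows + k, j * b_cols + l) for (i, j) in a for (k, l) in b}

def reduce_to_vector(a):
    """Row-wise OR reduction (assumed direction): rows with at least one true cell."""
    return {i for (i, _) in a}

a = {(0, 0), (1, 1)}        # 2 x 2 identity
b = {(0, 1), (1, 0)}        # 2 x 2 anti-diagonal
b_shape = (2, 2)

print(sorted(ewise_add(a, b)))           # [(0, 0), (0, 1), (1, 0), (1, 1)]
print(sorted(transpose({(0, 2)})))       # [(2, 0)]
print(sorted(kronecker(a, b, b_shape)))  # [(0, 1), (1, 0), (2, 3), (3, 2)]
print(sorted(reduce_to_vector({(0, 1), (0, 2), (3, 0)})))  # [0, 3]
```

Storing only the true cells is what makes the Boolean case compact: no value array is needed, only coordinates.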

Platforms

  • Linux based OS (tested on Ubuntu 20.04)

Installation

Get the latest package version from PyPI package index:

$ python3 -m pip install pyspbla

Simple example

Create sparse matrices, compute matrix-matrix product and print the result to the output:

import pyspbla as sp

a = sp.Matrix.empty(shape=(2, 3))
a[0, 0] = True
a[1, 2] = True

b = sp.Matrix.empty(shape=(3, 4))
b[0, 1] = True
b[0, 2] = True
b[1, 3] = True
b[2, 1] = True

print(a, b, a.mxm(b), sep="\n")
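
To check by hand what the product in this example should contain, the following sketch recomputes it in plain Python without pyspbla. The mxm function here is a hypothetical reference implementation over sets of (row, column) pairs, not the library's algorithm.

```python
def mxm(a, b):
    """Boolean product: result[i, j] is true if a[i, k] and b[k, j] hold for some k."""
    # Index b by row so that each a[i, k] only visits row k of b.
    rows_of_b = {}
    for k, j in b:
        rows_of_b.setdefault(k, set()).add(j)
    return {(i, j) for (i, k) in a for j in rows_of_b.get(k, ())}

a = {(0, 0), (1, 2)}                  # 2 x 3, same cells as in the example
b = {(0, 1), (0, 2), (1, 3), (2, 1)}  # 3 x 4, same cells as in the example

print(sorted(mxm(a, b)))  # [(0, 1), (0, 2), (1, 1)]
```

Row 0 of a selects row 0 of b, giving cells (0, 1) and (0, 2); row 1 of a selects row 2 of b, giving cell (1, 1).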

Performance

Sparse Boolean matrix-matrix multiplication evaluation results are listed below. Machine configuration: PC with Ubuntu 20.04, Intel Core i7-6700 3.40GHz CPU, 64 GB DDR4 RAM, GeForce GTX 1070 GPU with 8 GB VRAM.

[Figures: time, mem]

The matrix data is taken from the SuiteSparse Matrix Collection.

Matrix name               # Rows     Nnz M      Nnz/row  Max Nnz/row  Nnz M^2
SNAP/amazon0312           400,727    3,200,440  7.9      10           14,390,544
LAW/amazon-2008           735,323    5,158,388  7.0      10           25,366,745
SNAP/web-Google           916,428    5,105,039  5.5      456          29,710,164
SNAP/roadNet-PA           1,090,920  3,083,796  2.8      9            7,238,920
SNAP/roadNet-TX           1,393,383  3,843,320  2.7      12           8,903,897
SNAP/roadNet-CA           1,971,281  5,533,214  2.8      12           12,908,450
DIMACS10/netherlands_osm  2,216,688  4,882,476  2.2      7            8,755,758
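
The Nnz/row column can be derived from the # Rows and Nnz M columns. The sketch below recomputes it from the table; the listed values appear to be truncated (not rounded) to one decimal place, which is an observation from the numbers rather than something the source states.

```python
# (rows, nnz) pairs copied from the table above.
matrices = {
    "SNAP/amazon0312": (400_727, 3_200_440),
    "LAW/amazon-2008": (735_323, 5_158_388),
    "SNAP/web-Google": (916_428, 5_105_039),
    "SNAP/roadNet-PA": (1_090_920, 3_083_796),
    "SNAP/roadNet-TX": (1_393_383, 3_843_320),
    "SNAP/roadNet-CA": (1_971_281, 5_533_214),
    "DIMACS10/netherlands_osm": (2_216_688, 4_882_476),
}

def nnz_per_row(rows, nnz):
    """Average non-zeros per row, truncated to one decimal place."""
    return (10 * nnz // rows) / 10

for name, (rows, nnz) in matrices.items():
    print(f"{name}: {nnz_per_row(rows, nnz)}")
```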

A detailed comparison is available in the full paper text.

Directory structure

spbla
├── .github - GitHub Actions CI setup 
├── docs - documents, text files and various helpful stuff
├── scripts - short utility programs 
├── spbla - library core source code
│   ├── include - library public C API 
│   ├── sources - source-code for implementation
│   │   ├── core - library core and state management
│   │   ├── io - logging and i/o stuff
│   │   ├── utils - auxiliary classes shared among modules
│   │   ├── backend - common interfaces
│   │   ├── cuda - cuda backend
│   │   ├── opencl - opencl backend
│   │   └── sequential - fallback cpu backend
│   ├── utils - testing utilities
│   └── tests - gtest-based unit-tests collection
├── python - pyspbla related sources
│   ├── pyspbla - spbla library wrapper for python (similar to pygraphblas)
│   ├── tests - regression tests for python wrapper
│   └── data - generate data for pyspbla regression tests
├── deps - project dependencies
│   ├── clbool - OpenCL based matrix operations for dcsr, csr and coo matrices
│   ├── cub - cuda utility, required for nsparse
│   ├── gtest - google test framework for unit testing
│   └── nsparse - SpGEMM implementation for csr matrices (with unified memory, configurable)
└── CMakeLists.txt - library cmake config, add this as sub-directory to your project

Contributing

If you want to contribute to this project, follow our short and simple open-source contributors guide. Also have a look at the code of conduct.

Citation

@online{spbla,
  author = {Orachyov, Egor and Karpenko, Maria and Alimov, Pavel and Grigorev, Semyon},
  title = {spbla: sparse Boolean linear algebra for CPU, Cuda and OpenCL computations},
  year = 2021,
  url = {https://github.com/JetBrains-Research/spbla},
  note = {Version 1.0.0}
}

License

This project is licensed under the MIT License. The license text can be found in the license file.

Acknowledgments

This is a research project of the Programming Languages and Tools Laboratory at JetBrains-Research.

spbla's People

Contributors

egororachyov, gsvgit, mkarpenkospb

spbla's Issues

Add API documentation to repository

As per JOSS review guidelines, "Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?", a level of API documentation is required for the available functionality of the library.

This would be best enabled following the guides of https://joss.readthedocs.io/en/latest/review_criteria.html#api-documentation

As many C++ methods are documented using Doxygen compatible syntax, it may be possible to autogenerate the API documentation from the repository, and host locally (either as part of a Github Pages site, or a more extensive readthedocs site).

Paper feedback

In ref to review openjournals/joss-reviews#3743
Summary feedback on the manuscript

First off, the paper is well written overall: the API and implementation are well described, and the library is well documented.

This feedback is in three parts: writing (minor to no issues), a question on the memory cost of addition, and evaluation (some comments).

Writing

Introduction:
I'd rewrite the first sentence into something like:

Answering research questions in data analysis often involves expressing the solution in terms of matrix/vector operations.

I understand what you're saying in its current form, just concerned you're underselling the contribution.

This way it is possible to employ a set  -->  ... to leverage a set of powerful ...

Section IV

applicability --> utility

Method

End of section III

"it can negatively affect the memory consumption for large matrices with lots of duplicated non-zero values at the same positions."

Can this not be done without extra allocation by doing a sparse vector elementwise multiplication, reusing your own API?
It could be of interest to review (for future work, not this review necessarily)

Evaluation

Table II+III

  • If you report mean of 10 (or in general k) runs, please include variance and/or confidence intervals. Means can obscure interesting effects that can be exploited to optimize the API.
  • Consider that the arithmetic mean is not always appropriate in performance/standardized benchmarks; see Fleming's seminal paper for background [1]. That said, I understand that you're bound to how others report on standard benchmark datasets. I don't expect you to redo or review the current results/experiments; this is a suggestion for future analysis only.
  • Please boldface best / worst results, humans take a long time to parse tabular numerical data, helping with highlights makes it faster to draw conclusions from the tabular data
  • Consider using a lineplot / regression (LOESS) with mean + confidence interval, this would clearly show differences and patterns, and highlight the difference in performance
  • Did you consider the effect of a hot/cold cache? My assumption would be that the first invocation is the slowest if that is a factor. Did you see the same?
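
As a rough illustration of the reporting suggested above, the sketch below summarizes k = 10 runs with mean, standard deviation, a 95% confidence interval and the median, and contrasts the mean with and without a cold first run. All timings and speedups are hypothetical values made up for the example, and the t-based interval assumes roughly normal measurements, which a cold-start outlier violates.

```python
import math
import statistics

# Hypothetical wall-clock timings (ms) for k = 10 runs; the first run is a cold start.
runs_ms = [30.5, 12.1, 11.8, 12.4, 12.0, 11.9, 12.2, 12.3, 11.7, 12.1]

mean = statistics.mean(runs_ms)
stdev = statistics.stdev(runs_ms)      # sample standard deviation (k - 1 denominator)
median = statistics.median(runs_ms)

# Two-sided 95% confidence interval for the mean, Student's t with 9 degrees of freedom.
T_CRIT_95_DF9 = 2.262
half_width = T_CRIT_95_DF9 * stdev / math.sqrt(len(runs_ms))

# Dropping the first run shows how much a cold cache inflates the mean.
warm_mean = statistics.mean(runs_ms[1:])

# Hypothetical speedup ratios over a baseline; normalized numbers are
# summarized with the geometric mean rather than the arithmetic mean.
speedups = [1.8, 2.4, 0.9, 3.1]
geo_speedup = statistics.geometric_mean(speedups)

print(f"mean = {mean:.2f} ms +/- {half_width:.2f} (95% CI), stdev = {stdev:.2f}")
print(f"median = {median:.2f} ms, warm mean = {warm_mean:.2f} ms")
print(f"geometric mean speedup = {geo_speedup:.2f}x")
```

With a single slow first invocation, the mean sits well above the median, which is the kind of effect a bare mean would hide.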

Thanks,

Ben

[1] https://doi.org/10.1145/5666.5673

CUB build issues with CUDA 11

When using CUDA 11.1+ the following build error is observed:

In file included from /opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/detail/execution_policy.h:33,
                 from /opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/pointer.h:25,
                 from /opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/memory_resource.h:25,
                 from /opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/memory.h:24,
                 from /home/mlxd/DELME/review_spbla/spbla/spbla/sources/cuda/details/device_allocator.cuh:30,
                 from /home/mlxd/DELME/review_spbla/spbla/spbla/sources/cuda/cuda_matrix.hpp:30,
                 from /home/mlxd/DELME/review_spbla/spbla/spbla/sources/cuda/cuda_backend.cu:26:
/opt/cuda/bin/../targets/x86_64-linux/include/thrust/system/cuda/config.h:78:2: error: #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.
   78 | #error The version of CUB in your include path is not compatible with this release of Thrust. CUB is now included in the CUDA Toolkit, so you no longer need to use your own checkout of CUB. Define THRUST_IGNORE_CUB_VERSION_CHECK to ignore this.

This seems to stem from CUB being included as part of CUDA 11 and is reported elsewhere (see IBM/aihwkit#56, tensorflow/tensorflow#41803 for examples). The resolution seems to be not to download CUB if using CUDA 11.

Missing header declarations

Building the library using GCC 11 with CMake 3.21 gives the following error:

Consolidate compiler generated dependencies of target spbla
[ 43%] Building CXX object spbla/CMakeFiles/spbla.dir/sources/utils/csr_utils.cpp.o
In file included from ./spbla/spbla/sources/utils/csr_utils.cpp:25:
./spbla/spbla/sources/utils/csr_utils.hpp:35:35: error: ‘size_t’ has not been declared
   35 |         static void buildFromData(size_t nrows, size_t ncols,

This can be fixed by adding #include <cstddef> to the header.

Next:

[ 43%] Building CXX object spbla/CMakeFiles/spbla.dir/sources/utils/csr_utils.cpp.o
/home/mlxd/DELME/review_spbla/spbla/spbla/sources/utils/csr_utils.cpp: In static member function ‘static void spbla::CsrUtils::buildFromData(size_t, size_t, const index*, const index*, size_t, std::vector<unsigned int>&, std::vector<unsigned int>&, bool, bool)’:
/home/mlxd/DELME/review_spbla/spbla/spbla/sources/utils/csr_utils.cpp:85:35: error: ‘numeric_limits’ is not a member of ‘std’
   85 |                 index prev = std::numeric_limits<index>::max();
      |                                   ^~~~~~~~~~~~~~
/home/mlxd/DELME/review_spbla/spbla/spbla/sources/utils/csr_utils.cpp:85:55: error: expected primary-expression before ‘>’ token
   85 |                 index prev = std::numeric_limits<index>::max();
      |                                                       ^
/home/mlxd/DELME/review_spbla/spbla/spbla/sources/utils/csr_utils.cpp:85:58: error: ‘::max’ has not been declared; did you mean ‘std::max’?
   85 |                 index prev = std::numeric_limits<index>::max();

This can be fixed by adding #include <limits>.
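
Both errors come from relying on transitive includes that newer standard library versions no longer provide. A small script along the following lines could flag such files before the compiler does. It is a heuristic sketch, not a project tool: the function name and the identifier-to-header table are assumptions, and it only looks at direct includes in a single file.

```python
import re

# Identifier patterns mapped to the standard header that declares them.
REQUIRED_HEADERS = {
    r"\bsize_t\b": "cstddef",
    r"\bstd::numeric_limits\b": "limits",
}

# Matches both #include <x> and #include "x".
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)

def missing_headers(source):
    """Return headers whose identifiers are used but which are not included directly."""
    included = set(INCLUDE_RE.findall(source))
    missing = []
    for pattern, header in REQUIRED_HEADERS.items():
        if re.search(pattern, source) and header not in included:
            missing.append(header)
    return missing

sample = """
#include <core/config.hpp>
#include <vector>

static void buildFromData(size_t nrows, size_t ncols);
index prev = std::numeric_limits<index>::max();
"""

print(missing_headers(sample))  # ['cstddef', 'limits']
```

Because the check ignores headers pulled in indirectly, it can report false positives; it is meant as a prompt for review, not a replacement for building with a recent compiler.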

Add additional examples to docs

To check off item Example usage in the review, some additional documented examples would be best. This should ideally showcase the various available functionalities in the library.

Additional build issues

Hi @egor-bogomolov, I have attempted to continue building the repo but am still hitting some issues. With GCC 11 and CUDA 11.5, I made the following changes to get it to compile, mostly ensuring headers are included where needed. The diff is below:

diff --git a/deps/nsparse/include/nsparse/unified_allocator.h b/deps/nsparse/include/nsparse/unified_allocator.h
index ea6a98b..10f36fc 100644
--- a/deps/nsparse/include/nsparse/unified_allocator.h
+++ b/deps/nsparse/include/nsparse/unified_allocator.h
@@ -3,7 +3,7 @@
 #include <thrust/detail/config.h>
 #include <thrust/device_ptr.h>
 #include <thrust/mr/allocator.h>
-#include <thrust/memory/detail/device_system_resource.h>
+//#include <thrust/memory/detail/device_system_resource.h>

 #include <limits>
 #include <stdexcept>
@@ -80,4 +80,4 @@ using managed = thrust::device_unified_allocator<T>;
 //template <typename T>
 //using managed_vector = thrust::device_vector<T, managed<T>>;

-}  // namespace nsparse
\ No newline at end of file
+}  // namespace nsparse
diff --git a/spbla/sources/cuda/cuda_instance.hpp b/spbla/sources/cuda/cuda_instance.hpp
index dd2b8fe..c5e3510 100644
--- a/spbla/sources/cuda/cuda_instance.hpp
+++ b/spbla/sources/cuda/cuda_instance.hpp
@@ -27,6 +27,8 @@

 #include <core/config.hpp>
 #include <unordered_set>
+#include <cstddef>
+

 namespace spbla {

@@ -69,4 +71,4 @@ namespace spbla {

 }

-#endif //SPBLA_CUDA_INSTANCE_HPP
\ No newline at end of file
+#endif //SPBLA_CUDA_INSTANCE_HPP
diff --git a/spbla/sources/sequential/sq_ewiseadd.hpp b/spbla/sources/sequential/sq_ewiseadd.hpp
index 616274f..0ca6c6a 100644
--- a/spbla/sources/sequential/sq_ewiseadd.hpp
+++ b/spbla/sources/sequential/sq_ewiseadd.hpp
@@ -25,6 +25,7 @@
 #ifndef SPBLA_SQ_EWISEADD_HPP
 #define SPBLA_SQ_EWISEADD_HPP

+#include <cstddef>
 #include <sequential/sq_csr_data.hpp>

 namespace spbla {
diff --git a/spbla/sources/sequential/sq_kronecker.hpp b/spbla/sources/sequential/sq_kronecker.hpp
index a65f0d2..7562942 100644
--- a/spbla/sources/sequential/sq_kronecker.hpp
+++ b/spbla/sources/sequential/sq_kronecker.hpp
@@ -25,6 +25,7 @@
 #ifndef SPBLA_SP_KRONECKER_HPP
 #define SPBLA_SP_KRONECKER_HPP

+#include <cstddef>
 #include <sequential/sq_csr_data.hpp>

 namespace spbla {
diff --git a/spbla/sources/sequential/sq_submatrix.hpp b/spbla/sources/sequential/sq_submatrix.hpp
index 62da1b1..02a0ac3 100644
--- a/spbla/sources/sequential/sq_submatrix.hpp
+++ b/spbla/sources/sequential/sq_submatrix.hpp
@@ -25,6 +25,7 @@
 #ifndef SPBLA_SQ_SUBMATRIX_HPP
 #define SPBLA_SQ_SUBMATRIX_HPP

+#include <cstddef>
 #include <sequential/sq_csr_data.hpp>

 namespace spbla {
diff --git a/spbla/sources/sequential/sq_transpose.hpp b/spbla/sources/sequential/sq_transpose.hpp
index 14b21ab..c85d584 100644
--- a/spbla/sources/sequential/sq_transpose.hpp
+++ b/spbla/sources/sequential/sq_transpose.hpp
@@ -26,6 +26,7 @@
 #define SPBLA_SQ_TRANSPOSE_HPP

 #include <sequential/sq_csr_data.hpp>
+#include <cstddef>

 namespace spbla {

diff --git a/spbla/sources/utils/csr_utils.hpp b/spbla/sources/utils/csr_utils.hpp
index ce509ea..82ff72f 100644
--- a/spbla/sources/utils/csr_utils.hpp
+++ b/spbla/sources/utils/csr_utils.hpp
@@ -27,6 +27,7 @@

 #include <core/config.hpp>
 #include <vector>
+#include <cstddef>

 namespace spbla {

After this, I can confirm 100% compilation, and I was able to run (a sample of) the tests.

Add community guidelines for contribution

To tick off the Community guidelines item in the review, adding contribution guidelines to the repository would help. This can be a paragraph in the readme, as well as templated issue/PR items for user submissions.

Building / tests fail on missing Cmake in gtest

In ref to review openjournals/joss-reviews#3743
Describe the bug
Building/executing gtest fails; perhaps I missed something in the build instructions?

To Reproduce
(after git clone; note that I set CUDA to OFF in CMakeLists.txt)

❯ mkdir build
❯ cd build
❯ ls
❯ cmake ..
-- The CXX compiler identification is GNU 10.3.1
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Build spbla in release mode (default: was not specified)
-- The C compiler identification is GNU 10.3.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Looking for CL_VERSION_2_2
-- Looking for CL_VERSION_2_2 - not found
-- Looking for CL_VERSION_2_1
-- Looking for CL_VERSION_2_1 - not found
-- Looking for CL_VERSION_2_0
-- Looking for CL_VERSION_2_0 - not found
-- Looking for CL_VERSION_1_2
-- Looking for CL_VERSION_1_2 - found
-- Found OpenCL: /usr/local/cuda/lib64/libOpenCL.so (found suitable version "1.2", minimum required is "1.1") 
-- Add googletest as unit-testing library
CMake Error at CMakeLists.txt:58 (add_subdirectory):
  The source directory

    /home/bcardoen/SFUVault/spbla/deps/gtest

  does not contain a CMakeLists.txt file.


-- Add CPU sequential fallback backend
-- Add OpenCL backend for GPGPU computations
-- Add unit tests directory to the project
-- Configuring incomplete, errors occurred!
See also "/home/bcardoen/SFUVault/spbla/build/CMakeFiles/CMakeOutput.log".
See also "/home/bcardoen/SFUVault/spbla/build/CMakeFiles/CMakeError.log".

Expected behavior
Build success

Environment

  • OS name and version: Fedora 33
  • GPU name and vendor: Nvidia GeForce 1050Ti
  • GPU driver version Driver Version: 460.56 CUDA Version: 11.2

Build Configuration

  • Compiler version
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.3.1 20210422 (Red Hat 10.3.1-1) (GCC) 

  • SDK version
  • CMake version cmake version 3.19.7
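
A missing CMakeLists.txt under deps/gtest is what one would see if the dependencies are git submodules that were never fetched; in that case `git submodule update --init --recursive` normally populates them. Whether deps/ is organized as submodules is an assumption here, not something the report confirms. The sketch below, with a hypothetical helper name, lists dependency directories that are still empty, demonstrated on a throwaway directory tree.

```python
import tempfile
from pathlib import Path

def empty_dependency_dirs(deps_dir):
    """Return sorted names of sub-directories of deps_dir that are empty."""
    return sorted(
        d.name for d in Path(deps_dir).iterdir()
        if d.is_dir() and not any(d.iterdir())
    )

# Demo on a temporary tree that mimics deps/ with one unpopulated checkout.
with tempfile.TemporaryDirectory() as tmp:
    deps = Path(tmp) / "deps"
    (deps / "gtest").mkdir(parents=True)   # empty: sources never fetched
    (deps / "cub").mkdir()
    (deps / "cub" / "CMakeLists.txt").write_text("# placeholder\n")
    demo_result = empty_dependency_dirs(deps)

print(demo_result)  # ['gtest']
```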

Generate API docs

The Python API, assuming sufficiently complete docstrings, can be used to autogenerate API documentation. This would assist with usage of the library and be complementary to the example usage request #14.
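
As a minimal sketch of the idea, the standard inspect module can already turn docstrings into a plain-text API summary. The Matrix class below is a hypothetical stand-in, not the pyspbla class, and a real setup would more likely use a documentation generator such as Sphinx autodoc or pdoc.

```python
import inspect

def document_class(cls):
    """Render a plain-text API summary for a class from its docstrings."""
    lines = [cls.__name__, inspect.getdoc(cls) or "(no docstring)", ""]
    # getmembers returns members sorted by name.
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and special methods
        signature = inspect.signature(member)
        doc = inspect.getdoc(member) or "(no docstring)"
        lines.append(f"{name}{signature}")
        lines.append(f"    {doc}")
    return "\n".join(lines)

class Matrix:  # hypothetical stand-in for illustration only
    """Toy Boolean matrix."""

    def transpose(self):
        """Return the transposed matrix."""

    def mxm(self, other):
        """Return the Boolean matrix product with other."""

api_text = document_class(Matrix)
print(api_text)
```

Methods without a docstring show up as "(no docstring)", which also makes gaps in the documentation easy to spot.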

Ensure CUDA architecture is set for targets

As part of JOSS review 3743, I am following the build guide for the library. As of CMake 3.18, the following warning/error will be reported, as per https://cmake.org/cmake/help/latest/policy/CMP0104.html

This may cause issues on newer systems building the library.

CMake Warning (dev) in spbla/CMakeLists.txt:
  Policy CMP0104 is not set: CMAKE_CUDA_ARCHITECTURES now detected for NVCC,
  empty CUDA_ARCHITECTURES not allowed.  Run "cmake --help-policy CMP0104"
  for policy details.  Use the cmake_policy command to set the policy and
  suppress this warning.

  CUDA_ARCHITECTURES is empty for target "spbla".
This warning is for project developers.  Use -Wno-dev to suppress it.

