
accel's Introduction

Accel: GPGPU Framework for Rust


  • accel: CUDA-based GPGPU framework
  • accel-core: Helper for writing device code
  • accel-derive: Procedural macro for generating kernel code

Requirements


  • Minimum Supported Rust Version (MSRV) is 1.42.0
  • Install CUDA on your system
    • accel depends on the CUDA Driver API through rust-cuda/cuda-sys
    • accel does not depend on the CUDA Runtime API, so a compiled binary requires only libcuda.so at runtime, which is far lighter than the entire CUDA development toolkit.
  • Set up the NVPTX target for Rust
curl -sSL https://gitlab.com/termoshtt/accel/raw/master/setup_nvptx_toolchain.sh | bash

Alternatively, you can use a Docker container.

Limitations

This project is still at an early stage and has several limitations.

Contribution

This project is developed on GitLab and mirrored to GitHub.

Sponsors

Links

Projects which accel depends on:

Related Projects:


accel's Issues

`cuda-sys` Fails to Link on Windows

The CUDA_LIBRARY_PATH variable is split by : in the cuda-sys build script. On Windows, this does not correctly handle paths like "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\lib\x64", producing "C" and "\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v9.1\lib\x64".

Error handling of nvptx-compile

Related to #32

The error message for a link error is hard to read (´・ω・`)
A link error should NOT be handled using Result<_>, since linking must succeed if LLVM is installed correctly. Compile errors caused by user input are raised only from xargo.
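The split described here could look roughly like the following sketch. CompileError, check_compile, and check_link are hypothetical names for illustration and are not the nvptx crate's API. Compile failures depend on user input and come back as Err, while a link failure points to a broken installation and panics with the full linker output.

```rust
use std::fmt;

/// Hypothetical error for failures caused by the user's kernel source.
#[derive(Debug, PartialEq)]
pub struct CompileError {
    pub stderr: String,
}

impl fmt::Display for CompileError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "kernel compilation failed:\n{}", self.stderr)
    }
}

impl std::error::Error for CompileError {}

/// Compile step: failure depends on user input, so it is reported as `Err`.
pub fn check_compile(success: bool, stderr: &str) -> Result<(), CompileError> {
    if success {
        Ok(())
    } else {
        Err(CompileError { stderr: stderr.to_owned() })
    }
}

/// Link step: failure indicates a broken LLVM installation, so it panics
/// and prints the complete linker output instead of returning an error.
pub fn check_link(success: bool, stderr: &str) {
    assert!(success, "linking failed; check the LLVM installation:\n{}", stderr);
}
```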

Reduce pre-requirements

  • Use xargo as a library to avoid calling cargo install xargo
  • Rustup is already the standard way to install Rust, so we do not need to list it as an explicit dependency.

accel and nvptx64 linker issues

Hi

I can build the nvptx64 sub-crate, but cargo test fails on it due to a "Link" issue (when compiling ptx-builder v0.1.0). I have LLVM-6.0 and CUDA-8.0 installed, and I tried changing the linker in nvptx64-nvidia-cuda.json to llvm-linker(?!), but it didn't help.

Is it because of my GPU's architecture (Titan Xp), or something else?

Also when I try to cargo build the root accel crate the error is:

error: linking with cc failed: exit code: 1
.... OMITTED ....
note: /usr/bin/ld: cannot find -lcudart
/usr/bin/ld: cannot find -lcublas
collect2: error: ld returned 1 exit status

But I have them in my /usr/local/cuda/lib64 and /usr/local/cuda/include/.

Any idea how to resolve it?
Thanks
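For context, a Cargo build script tells the linker where native libraries live by printing cargo:rustc-link-search and cargo:rustc-link-lib lines. The sketch below only formats those two directives; the function names are hypothetical, and whether accel's own build script already emits them for /usr/local/cuda/lib64 is not stated in this issue.

```rust
use std::path::Path;

/// Build the Cargo directive that adds `dir` to the native library search path.
pub fn link_search_directive(dir: &Path) -> String {
    format!("cargo:rustc-link-search=native={}", dir.display())
}

/// Build the Cargo directive that links the dynamic library `name`.
pub fn link_lib_directive(name: &str) -> String {
    format!("cargo:rustc-link-lib=dylib={}", name)
}
```

A build.rs would pass each returned string to println! so that Cargo forwards the search path and library names to the linker.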

Cannot build test on master branch

$ CUDA_LIBRARY_PATH=/opt/cuda/lib64 cargo test
rustc: CommandLine Error: Option 'print-summary-global-ids' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options
error: Could not compile `accel`.
warning: build failed, waiting for other jobs to finish...
rustc: CommandLine Error: Option 'print-summary-global-ids' registered more than once!

Kernel module

#[kernel] compiles only one function, but we need modules. I'd like to introduce #[kernel_module]:

#[kernel_module]
pub mod my_drivers;

The module would be copied into rust2ptx together with my_drivers.rs, unchanged.

include raw *.cu file

Is there a way to mix rust-accel with raw *.cu files?

I would like to write some device helper functions in *.cu, then call them from #[kernel] functions defined in rust-accel.

Is there an example for how to do this?

Link PTX assemblies

The rust2ptx crate compiles Rust to PTX using the xargo rustc command, which compiles only lib.rs. To avoid undefined references, accel-core uses #[inline(always)], but that is a bad workaround :<

nvptx toolchain

The nvptx crate uses only a small part of xargo, and xargo itself is in maintenance mode.

There are two ways to enable the nvptx64-nvidia-cuda target in rustc
ref: How do I bootstrap Rust to cross-compile for a new platform?

  • out-of-tree: Add nvptx64-nvidia-cuda.json and compile libcore
    • xargo uses this way
  • in-tree: Add nvptx64_nvidia_cuda.rs into librustc_target

TODO

  • Add nvptx64-nvidia-cuda target into rustc
  • Distribute them as nvptx toolchain
  • Replace xargo rustc and llvm-link with cargo +accel-nvptx build
  • Rewrite nvptx crate to manage nvptx-toolchain and execute build.
  • Switch to rust-accel/nvptx

Restart accel

CI on GPU

Does GitHub Actions with a self-hosted runner work? #9

Stable Rust

Stabilize the host-side code. Device-side code is out of scope because a large number of issues remain for the nvptx backend. See the link.

  • proc-macro has been stabilized as #63
  • cargo check runs on stable Rust #66

Update dependencies

  • syn, quote, proc-macro2 1.0 #67
  • rust-cuda/cuda-sys 0.3.0 #66

rust-ptx-linker

A linker flavor using rust-ptx-linker has been merged into rustc (rust-lang/rust#57937)

  • Rewrite accel-derive with rust-ptx-linker #69
  • Archive the nvptx crate

Document

  • Needs a guide book #68

Links

rust-lang/rust

rust-cuda/wg

Unable to compile image crate

For some reason, it seems unable to include an external crate with #[macro_use]:

[clynamen@clynamen_nb][~/software/sources/image]% cargo build --verbose
       Fresh version_check v0.1.4                                                                                                                     
       Fresh cfg-if v0.1.5                                                                                                                            
       Fresh nodrop v0.1.12                                                                                                                           
       Fresh unicode-xid v0.1.0                                                                                                                       
       Fresh memoffset v0.2.1                                                                                                                         
       Fresh scopeguard v0.3.3                                                                                                                        
       Fresh libc v0.2.43                                                                                                                             
       Fresh adler32 v1.0.3                                                                                                                           
       Fresh byteorder v1.2.6                                                                                                                         
       Fresh either v1.5.0                                                                                                                            
       Fresh bitflags v1.0.4                                                                                                                          
       Fresh lzw v0.10.0                                                                                                                              
       Fresh color_quant v1.0.1                                                                                                                       
       Fresh scoped_threadpool v0.1.9                                                                                                                 
       Fresh crossbeam-utils v0.2.2                                                                                                                   
       Fresh arrayvec v0.4.7                                                                                                                          
       Fresh num_cpus v1.8.0                                                                                                                          
       Fresh inflate v0.4.3                                                                                                                           
       Fresh deflate v0.7.18                                                                                                                          
       Fresh gif v0.10.0                                                                                                                              
       Fresh proc-macro2 v0.4.19                                                                                                                      
       Fresh num-traits v0.2.6                                                                                                                        
       Fresh lazy_static v1.1.0                                                                                                                       
       Fresh quote v0.6.8                                                                                                                             
       Fresh num-integer v0.1.39                                                                                                                      
       Fresh crossbeam-epoch v0.3.1                                                                                                                   
       Fresh syn v0.14.9                                                                                                                              
       Fresh num-iter v0.1.37                                                                                                                         
       Fresh num-rational v0.2.1                                                                                                                      
       Fresh crossbeam-deque v0.2.0                                                                                                                   
       Fresh num-derive v0.2.2                                                                                                                        
       Fresh png v0.12.0                                                                                                                              
       Fresh rayon-core v1.4.1                                                                                                                        
   Compiling tiff v0.2.0                                                                                                                              
       Fresh rayon v1.0.2                                                                                                                             
     Running rustc --crate-name tiff /home/clynamen/.cargo/registry/src/github.com-1ecc6299db9ec823/tiff-0.2.0/src/lib.rs --color always --crate-type lib --emit=dep-info,link -C debuginfo=2 -C metadata=ac012bdec4f68178 -C extra-filename=-ac012bdec4f68178 --out-dir /home/clynamen/software/sources/image/target/debug/deps -L dependency=/home/clynamen/software/sources/image/target/debug/deps --extern byteorder=/home/clynamen/software/sources/image/target/debug/deps/libbyteorder-094c5826d881bf93.rlib --extern lzw=/home/clynamen/software/sources/image/target/debug/deps/liblzw-f9345a6711824bc3.rlib --extern num_derive=/home/clynamen/software/sources/image/target/debug/deps/libnum_derive-c09f5b0bcd884668.so --extern num_traits=/home/clynamen/software/sources/image/target/debug/deps/libnum_traits-afd4d27dce686b16.rlib --cap-lints allow
       Fresh jpeg-decoder v0.1.15                                                                                                                     
error: libproc_macro-d6a28a95b106d2cc.so: cannot open shared object file: No such file or directory                                                   
  --> /home/clynamen/.cargo/registry/src/github.com-1ecc6299db9ec823/tiff-0.2.0/src/lib.rs:11:14                                                      
   |                                                                                                                                                  
11 | #[macro_use] extern crate num_derive;                                                                                                            
   |              ^^^^^^^^^^^^^^^^^^^^^^^^                                                                                                            
                                                                                                                                                      
error: aborting due to previous error                                                                                                                 
                                                                                                                                                      
error: Could not compile `tiff`.                                                                                                                      

Caused by:
  process didn't exit successfully: `rustc --crate-name tiff /home/clynamen/.cargo/registry/src/github.com-1ecc6299db9ec823/tiff-0.2.0/src/lib.rs --color always --crate-type lib --emit=dep-info,link -C debuginfo=2 -C metadata=ac012bdec4f68178 -C extra-filename=-ac012bdec4f68178 --out-dir /home/clynamen/software/sources/image/target/debug/deps -L dependency=/home/clynamen/software/sources/image/target/debug/deps --extern byteorder=/home/clynamen/software/sources/image/target/debug/deps/libbyteorder-094c5826d881bf93.rlib --extern lzw=/home/clynamen/software/sources/image/target/debug/deps/liblzw-f9345a6711824bc3.rlib --extern num_derive=/home/clynamen/software/sources/image/target/debug/deps/libnum_derive-c09f5b0bcd884668.so --extern num_traits=/home/clynamen/software/sources/image/target/debug/deps/libnum_traits-afd4d27dce686b16.rlib --cap-lints allow` (exit code: 101)

Is this normal, or is there something that needs to change?

Mismatch in enum error values

Hi there,

I am running a program that contains an error, and the CUDA runtime keeps returning
MissingConfiguration

I can see in the cuda_sys code that this enum value maps to 2. However,
MissingConfiguration in the CUDA runtime maps to 52, while 2 is cudaErrorMemoryAllocation:

    /**
     * The API call failed because it was unable to allocate enough memory to
     * perform the requested operation.
     */
    cudaErrorMemoryAllocation             =      2,
    /**
     * The device function being invoked (usually via ::cudaLaunchKernel()) was not
     * previously configured via the ::cudaConfigureCall() function.
     */
    cudaErrorMissingConfiguration         =      52

The Rust code has neither a 52 nor a 53 value:

    PeerAccessAlreadyEnabled = 50,
    PeerAccessNotEnabled = 51,
    DeviceAlreadyInUse = 54,
    ProfilerDisabled = 55,

I am using CUDA 10.1.

  • Is this version supported?
  • Is this a version mismatch?
  • Should I generate the bindings myself from the CUDA header?

Cheers
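To make the mismatch concrete, here is a minimal sketch of a mapping that follows the two values quoted from the CUDA header above (2 and 52). RuntimeError and from_raw are hypothetical names and are not the cuda-sys bindings. The Unknown variant keeps any code the enum does not list, so a newer runtime cannot silently map to the wrong variant.

```rust
/// Subset of CUDA Runtime error codes, using the values quoted from the
/// CUDA header in this issue. Hypothetical type, not the cuda-sys enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RuntimeError {
    MemoryAllocation,
    MissingConfiguration,
    /// Any code without a dedicated variant is preserved as-is.
    Unknown(u32),
}

impl RuntimeError {
    /// Convert a raw status code into a variant without losing information.
    pub fn from_raw(code: u32) -> Self {
        match code {
            2 => RuntimeError::MemoryAllocation,
            52 => RuntimeError::MissingConfiguration,
            other => RuntimeError::Unknown(other),
        }
    }
}
```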

Panic at Termination on Windows

When I run the example program add.rs on Windows, it adds the numbers correctly and then panics when terminating the program:

thread '<unnamed>' panicked at 'Failed to unload module: cudaError(CUDA_ERROR_DEINITIALIZED)', libcore\result.rs:945:5
stack backtrace:
   0: std::sys::windows::backtrace::unwind_backtrace
             at C:\projects\rust\src\libstd\sys\windows\backtrace\mod.rs:65
   1: std::sys_common::backtrace::_print
             at C:\projects\rust\src\libstd\sys_common\backtrace.rs:71
   2: std::sys_common::backtrace::print
             at C:\projects\rust\src\libstd\sys_common\backtrace.rs:59
   3: std::panicking::default_hook::{{closure}}
             at C:\projects\rust\src\libstd\panicking.rs:211
   4: std::panicking::default_hook
             at C:\projects\rust\src\libstd\panicking.rs:227
   5: std::panicking::rust_panic_with_hook
             at C:\projects\rust\src\libstd\panicking.rs:463
   6: std::panicking::begin_panic_fmt
             at C:\projects\rust\src\libstd\panicking.rs:350
   7: std::panicking::rust_begin_panic
             at C:\projects\rust\src\libstd\panicking.rs:328
   8: core::panicking::panic_fmt
             at C:\projects\rust\src\libcore\panicking.rs:71
   9: core::result::unwrap_failed<accel::error::Error>
             at C:\projects\rust\src\libcore\macros.rs:26
  10: core::result::Result<(), accel::error::Error>::expect<(),accel::error::Error>
             at C:\projects\rust\src\libcore\result.rs:809
  11: accel::module::{{impl}}::drop
             at .\src\module.rs:233
  12: core::ptr::drop_in_place<accel::module::Module>
             at C:\projects\rust\src\libcore\ptr.rs:59
  13: core::ptr::drop_in_place<core::cell::UnsafeCell<accel::module::Module>>
             at C:\projects\rust\src\libcore\ptr.rs:59
  14: core::ptr::drop_in_place<core::cell::RefCell<accel::module::Module>>
             at C:\projects\rust\src\libcore\ptr.rs:59
  15: core::ptr::drop_in_place<core::option::Option<core::cell::RefCell<accel::module::Module>>>
             at C:\projects\rust\src\libcore\ptr.rs:59
  16: std::thread::local::fast::destroy_value<core::cell::RefCell<accel::module::Module>>
             at C:\projects\rust\src\libstd\thread\local.rs:402
  17: std::sys_common::thread_local::register_dtor_fallback::run_dtors
             at C:\projects\rust\src\libstd\sys_common\thread_local.rs:266
  18: std::sys::windows::thread_local::run_dtors
             at C:\projects\rust\src\libstd\sys\windows\thread_local.rs:244
  19: std::sys::windows::thread_local::on_tls_callback
             at C:\projects\rust\src\libstd\sys\windows\thread_local.rs:215
  20: RtlDeactivateActivationContextUnsafeFast
  21: RtlDeactivateActivationContextUnsafeFast
  22: LdrShutdownProcess
  23: RtlExitUserProcess
  24: ExitProcess
  25: exit
  26: exit
  27: __scrt_common_main_seh
             at f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:290
  28: BaseThreadInitThunk
  29: RtlUserThreadStart

It's trying to unload the module and failing with CUDA_ERROR_DEINITIALIZED. I don't know why this is happening, else I would submit a pull request to fix it. I've been able to work around it by commenting out the code in module.rs which checks for errors when the module is dropped, but I don't think that's the right fix.
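One middle ground between panicking and removing the check entirely is sketched below. UnloadStatus, unload_needs_report, and this Module are hypothetical stand-ins, not accel's types, and whether ignoring the deinitialized status is the right policy for accel is an open question. The idea is that the destructor never panics: it stays silent when the driver has already shut down and only logs other failures.

```rust
/// Hypothetical stand-in for the driver status returned when unloading.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum UnloadStatus {
    Success,
    Deinitialized,
    Other(u32),
}

/// Decide whether an unload result should be reported.
/// `Deinitialized` means the driver has already shut down (as in the
/// thread-local destructor path in the backtrace), so it is not reported.
pub fn unload_needs_report(status: UnloadStatus) -> bool {
    !matches!(status, UnloadStatus::Success | UnloadStatus::Deinitialized)
}

pub struct Module {
    /// Real code would hold the module handle; this sketch holds a canned status.
    pub status_on_unload: UnloadStatus,
}

impl Drop for Module {
    fn drop(&mut self) {
        // Report instead of panicking: drop may run while the process is exiting.
        if unload_needs_report(self.status_on_unload) {
            eprintln!("Failed to unload module: {:?}", self.status_on_unload);
        }
    }
}
```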

In documentation

Wrong version in documentation for 0.3.1

[dependencies]
accel = "=0.3.0-alpha.2"

#[kernel] ought to work in the context of a module.

Passing a single function is a huge limitation when it comes to writing helper functions. One should be able to compile an entire module and mark a single function (or multiple functions) as kernels.

My proposal is as follows:

#[kernel_module]
mod add {
    use algebra::ComplexBigInt;
    #[kernel] 
    fn add(a: *mut ComplexBigInt) {..}
}

etc.

Can't build the sample code

I tried your accel as follows:

https://github.com/zacky1972/test_cuda

( the code is from https://qiita.com/termoshtt/items/41b4e23c4ce5e822319c )

My environment is as follows:

Mac Pro (Mid 2010)
Processor 2.8GHz Quad-Core Intel Xeon
Memory 16GB
NVIDIA GeForce GTX 680 2047 MB

But, the following error occurred:

$ cargo build --verbose
       Fresh unicode-xid v0.1.0
       Fresh unicode-xid v0.0.4
       Fresh quote v0.3.15
       Fresh glob v0.2.11
       Fresh proc-macro2 v0.4.3
   Compiling proc-macro2 v0.1.10
       Fresh synom v0.11.3
       Fresh quote v0.6.2
       Fresh serde v1.0.62
     Running `rustc --crate-name proc_macro2 ~/.cargo/registry/src/github.com-1ecc6299db9ec823/proc-macro2-0.1.10/src/lib.rs --crate-type lib --emit=dep-info,link -C debuginfo=2 --cfg 'feature="nightly"' --cfg 'feature="unstable"' -C metadata=94d016008ad7fd85 -C extra-filename=-94d016008ad7fd85 --out-dir ~/github/test_cuda/target/debug/deps -L dependency=~/github/test_cuda/target/debug/deps --extern unicode_xid=~/github/test_cuda/target/debug/deps/libunicode_xid-2d487da95ef31948.rlib --cap-lints allow`
       Fresh syn v0.11.11
       Fresh cuda-sys v0.1.0
       Fresh syn v0.14.0
       Fresh toml v0.4.6
       Fresh procedurals v0.2.3
       Fresh serde_derive v1.0.62
       Fresh accel v0.1.0
error[E0554]: #![feature] may not be used on the stable release channel
  --> ~/.cargo/registry/src/github.com-1ecc6299db9ec823/proc-macro2-0.1.10/src/lib.rs:22:34
   |
22 | #![cfg_attr(feature = "nightly", feature(proc_macro))]
   |                                  ^^^^^^^^^^^^^^^^^^^^

error: aborting due to previous error

For more information about this error, try `rustc --explain E0554`.
error: Could not compile `proc-macro2`.

Caused by:
  process didn't exit successfully: `rustc --crate-name proc_macro2 ~/.cargo/registry/src/github.com-1ecc6299db9ec823/proc-macro2-0.1.10/src/lib.rs --crate-type lib --emit=dep-info,link -C debuginfo=2 --cfg feature="nightly" --cfg feature="unstable" -C metadata=94d016008ad7fd85 -C extra-filename=-94d016008ad7fd85 --out-dir ~/github/test_cuda/target/debug/deps -L dependency=~/github/test_cuda/target/debug/deps --extern unicode_xid=~/github/test_cuda/target/debug/deps/libunicode_xid-2d487da95ef31948.rlib --cap-lints allow` (exit code: 101)
$

Rewrite accel-derive with rust-ptx-linker

Rewrite nvptx and accel-derive (i.e. reject re-merging nvptx rust-accel/nvptx#16)

Use libcore in rustup

libcore, liballoc, and libstd are included in the rustup distribution, even on stable: https://github.com/rust-lang/rust/blob/master/src/librustc_target/spec/nvptx64_nvidia_cuda.rs

Status of rust-ptx-linker

  • rustc-llvm-proxy does not work well on either Linux or Windows.
  • rust-ptx-linker works without the llvm-proxy feature, which is enabled by default and turns on rustc-llvm-proxy.
  • Should we consider distributing rust-ptx-linker with a statically linked LLVM?
  • In any case, accel-derive should be rewritten to use rust-ptx-linker #54

Rust -> PTX without creating crate

As described in #61. Currently, each #[kernel] creates a crate under /tmp and compiles it, which can be slow.

Compile entire crate by nvptx64-nvidia-cuda target

A proposition to resolve #61

Motivation

A #[kernel] function cannot use any variables, functions, or other items defined outside it, because it is compiled as stand-alone device code.

fn add_2(a: &mut f32) {
    *a = *a + 2.0;
}

#[kernel] 
pub fn add_2_all(a: *mut f32, n: usize) {
    let i = ::accel_core::index();
    unsafe { add_2(&mut *a.offset(i)) };  // error: add_2 cannot be found in the device code
}

Resolution

Compile entire crate both as x86_64 and nvptx64 targets.

  • rust-ptx-linker will eliminate non-kernel code that is not called from a PTX kernel

Problems

  • std must be compiled with nvptx
  • Compile flow (How to trigger nvptx build instead of proc-macro?)
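As a rough illustration of compiling one crate for both targets, the sketch below gates target-specific items with cfg(target_arch = "nvptx64") while leaving the helper ungated. It is an assumption about how such a crate could be laid out, not accel's design. add_2_at and add_2_all_host are hypothetical names, and the kernel entry-point ABI is deliberately left out.

```rust
/// Plain helper: compiled for every target, so device code can call it.
pub fn add_2(a: &mut f32) {
    *a += 2.0;
}

/// Device-only function, compiled only when targeting nvptx64.
/// `i` is assumed to be a valid index supplied by the caller.
#[cfg(target_arch = "nvptx64")]
pub unsafe fn add_2_at(a: *mut f32, i: isize) {
    unsafe { add_2(&mut *a.offset(i)) }
}

/// Host-only reference implementation, useful for checking device results.
#[cfg(not(target_arch = "nvptx64"))]
pub fn add_2_all_host(data: &mut [f32]) {
    for x in data.iter_mut() {
        add_2(x);
    }
}
```

Because add_2 is an ordinary function in the same crate, the device-side code can reach it without the stand-alone-crate restriction described under Motivation.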

Generalising to SYCL-style heterogeneous compute?

It does seem like PTX-based intrinsics are the route taken by accel. I am wondering if translating to LLVM-IR would be possible via a similar strategy applied (using proc macro to inline declarations and use statements and other magic). This would allow compilation from LLVM-IR to SPIR-V, PTX and other targets, including maybe back to x64/OpenMP?

This is the premise of meld, a project I have been working on. However, the hardest part seems to be translating the kernel into a valid form.

Maintenance of accel

Unfortunately, I have little time to manage this project. I am looking for anyone interested in maintaining and developing it.
