Comments (6)
It's not that OpenCL is less functional; at this point in time it is simply less suitable.
1. The most important thing is having a CuDNN equivalent. CuDNN is an Nvidia-only library of deep learning primitives. Coding a tuned convolution or recurrent neural network from scratch on a GPU is PhD-level work, see here.
2. The major libraries all use CUDA as a first-class citizen (mostly due to point 1). Neither Google nor Facebook is supporting OpenCL. This is important because for fast development, having reference implementations is key.
3. Both AMD and Intel are doing their own things, and neither is ready for prime time. AMD's HIP uses CUDA syntax and compiles to OpenCL/CUDA/HIP. Intel started work 5 months ago on clDNN, an equivalent to CuDNN; it is a technical preview and therefore a moving target.
4. Every single person working in deep learning has Nvidia hardware, even people from Intel, AMD and Apple, simply because it is the reference implementation to compare against. I would be happy to be proven wrong.
5. Resources: I have neither the hardware (besides the integrated Intel GPU in my laptop) nor the time for it while there are plenty of low-hanging fruits elsewhere. The startup Vertex.ai is doing it; they are heroes and work on it full-time. Besides Google and Facebook, Microsoft is saying no, Theano (which was one of the leaders) says OpenCL is currently unusable, and Mxnet (Apache/Amazon) is waiting to see what happens with AMD's HIP.
Lastly, there is a Khronos-led standard in the works, but I don't think we will see anything come out of it in the next year.
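To make the CuDNN point concrete: the convolution operation itself is not exotic, it is just a few nested loops. Below is a minimal pure-Python sketch of 2D cross-correlation (the deep-learning "convolution"), for illustration only; the hard, PhD-level part CuDNN solves is a heavily tuned GPU implementation of this same primitive.

```python
# Naive 2D cross-correlation (the deep-learning "convolution").
# Illustrative only: writing this is easy; writing a *tuned* GPU
# version of it is what libraries like CuDNN provide.
def conv2d(image, kernel):
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1   # "valid" padding output size
    out = [[0.0] * ow for _ in range(oh)]
    for i in range(oh):
        for j in range(ow):
            s = 0.0
            for di in range(kh):
                for dj in range(kw):
                    s += image[i + di][j + dj] * kernel[di][dj]
            out[i][j] = s
    return out

img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
k = [[1, 0], [0, 1]]                    # toy 2x2 kernel
print(conv2d(img, k))                   # [[6.0, 8.0], [12.0, 14.0]]
```

The four nested loops are also exactly why a naive version is unusable in practice: a tuned implementation restructures this into tiled, vectorized GPU kernels (or matrix multiplications) per architecture.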
from arraymancer.
My Christmas gift 🎁 #184
This is indeed an unfortunate situation.
It came from history: Nvidia invested heavily in GPU compute and then created custom functions (convolutions, recurrent neural network built-ins, ...) specifically for the deep learning community. Between 2012 and 2017 there was no equivalent anywhere else.
However, this year AMD woke up and is investing heavily in compute. They started ROCm (Radeon Open Compute) and HIP, a tool that can translate CUDA code to OpenCL and ROCm.
If you check there, the API is almost the same, so porting should be easy; for example, cudaMalloc becomes hipMalloc and cudaMemcpy becomes hipMemcpy, so much of a port is mechanical renaming.
The last promising thing is that Tesla just recruited one of the top computer vision scientists (Andrej Karpathy) and is working with AMD on computer vision for their cars. Improvements there will probably be ported to consumer cards and scientific libraries.
Now the only tough thing is that I don't have an AMD card (like 90+% of people in the deep learning community).
OpenCL support is planned, but features will probably come slowly due to its lack of maturity.
Why support two technologies at once? For OpenCL you do not need an AMD graphics card; it also works on Nvidia cards without changing the code, doesn't it?
Or is OpenCL less functional and less suitable?
P.S. Thank you for the news about AMD. It pleases me, because I do not like Nvidia's dominance and what they are doing.
@Jipok I think that CUDA is faster than OpenCL when used on Nvidia cards
Another interesting library to check is VexCL, a C++ vector expression template library that abstracts CUDA and OpenCL. Its documentation covers:
- Initialization
- Managing memory
- Vector expressions
- Parallel primitives and algorithms
- Multivectors and multiexpressions
- Converting generic C++ algorithms to OpenCL/CUDA
- Custom kernels
- Interoperability with other libraries
- Building VexCL programs with CMake
- Talks and publications