Comments (12)
If I run ccmake, then I only have ODL_CUDA_COMPUTE. Is that what you mean?
With this changed to 35 (omg is my GPU so old???) the above example works well! Many thanks!
from odlcuda.
Here is now my example with timings. I hope this motivates you and shows that you are doing an incredible job!
domain_cpu = odl.uniform_discr([0, 0, 0], [1, 1, 1], [4000, 250, 350], impl='numpy')
domain_gpu = odl.uniform_discr([0, 0, 0], [1, 1, 1], [4000, 250, 350], impl='cuda')
x_cpu = np.e * domain_cpu.one()
x_gpu = np.e * domain_gpu.one()
%time y = x_cpu.ufuncs.log()
%time y = x_gpu.ufuncs.log()
CPU times: user 7.28 s, sys: 340 ms, total: 7.62 s
Wall time: 7.63 s
CPU times: user 1.36 ms, sys: 338 µs, total: 1.7 ms
Wall time: 1.64 ms
It looks like I can finally run my code on the real PET data :D
The order matters. I first tried ...
Very interesting observation. With that said, as a user you should never explicitly import odlcuda.
domain_cpu = odl.uniform_discr([0], [1], [3e+8], impl='numpy')
This is not supposed to work, and the error message is quite explicit about why: 3e+8 is a floating-point number, and we require shape to be an integer. We're also cautious about casting, since it opens up errors. We've had problems with people using shape = 1.5 or something like that, which is then cast to shape = 1, causing confusion.
The solution is to simply cast it to an integer yourself, domain_cpu = odl.uniform_discr([0], [1], [int(3e+8)], impl='numpy'), or to use integer powers, domain_cpu = odl.uniform_discr([0], [1], [3 * 10 ** 8], impl='numpy').
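To see the distinction concretely, here is a minimal plain-Python sketch (no ODL needed, made-up variable names) of why 3e+8 is rejected while an explicit cast works:

```python
# 3e+8 is a float literal, not an integer, so it cannot serve as a shape.
shape = 3e+8
print(type(shape).__name__)        # float

# An explicit cast (or pure integer arithmetic) yields a valid integer shape.
print(int(3e+8) == 3 * 10 ** 8)    # True
```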
... cudaFuncGetAttributes ...
This is actually covered in the README, and the recommended solution is to change CUDA_COMPUTE to a version supported by your GPU. I need to know what GPU you have to fix that, or you can look it up yourself.
To then use CUDA for everything I do, I need to rewrite some functionals that I wrote in ODL. This causes me some trouble.
My strategy was to replace all numpy functionality with ufuncs, i.e. np.log(x) becomes x.ufuncs.log(). Is this a good strategy? But when it comes to indexing, I am struggling a little:
import odl
import numpy as np
domain_gpu = odl.uniform_discr([0, 0, 0], [1, 1, 1], [4000, 250, 350], dtype='float32', impl='cuda')
x_gpu = np.e * domain_gpu.one()
i_gpu = x_gpu.ufuncs.greater(0)
x_gpu[i_gpu]
results in
Traceback (most recent call last):
File "<ipython-input-2-fd24fbf6952c>", line 4, in <module>
x_gpu[i_gpu]
File "/mhome/damtp/s/me404/store/repositories/git_ODL/odl/discr/discretization.py", line 314, in __getitem__
return self.ntuple.__getitem__(indices)
File "/home/me404/.local/lib/python2.7/site-packages/odlcuda-0.5.0-py2.7.egg/odlcuda/cu_ntuples.py", line 419, in __getitem__
return self.data.__getitem__(indices)
ArgumentError: Python argument types in
CudaVectorFloat32.__getitem__(CudaVectorFloat32, DiscreteLpElement)
did not match C++ signature:
__getitem__(CudaVectorImpl<float> {lvalue}, long)
I kind of see what I am doing wrong, but not how to resolve this. Any ideas?
Basically the problem here is that I've not implemented comparison between vectors and longs in odlcuda. The workaround for now is to compare to the zero vector, but if this is a performance hog for you I could get it fixed.
I thought the problem here is the indexing, since I am indexing with an ODL element and not with a "long". If I apply your proposed fix, nothing changes.
Also in the numpy case I am not sure what kind of indexing is necessary and what is not. Does
i = np.int32(data.ufuncs.greater(0).asarray().ravel())
log_data = data[i].ufuncs.log()
make any sense to you?
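For reference, the kind of positive-entry masking being attempted looks like this in plain NumPy (a sketch with made-up data, independent of ODL):

```python
import numpy as np

data = np.array([2.0, -1.0, np.e, 0.0])

# Boolean masking: select only the strictly positive entries ...
mask = data > 0
# ... then take the log of just those entries (advanced indexing).
log_data = np.log(data[mask])

print(log_data)   # logs of 2.0 and e only
```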
Now I see what you are aiming at here. That would be complicated (mostly because we don't support advanced indexing). I'd probably try some smooth approximation of the log function if I were you. You could also do something like:
pos = data.ufuncs.greater(epsilon * data.space.one())
log_data = (data * pos).ufuncs.log()
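The masking idea above can be sketched in plain NumPy. Note that, as written, the masked-out entries still pass 0 into the log, so this variant additionally clips from below (the epsilon value and data are assumptions for illustration):

```python
import numpy as np

data = np.array([2.0, 0.0, -1.0, 5.0])
eps = 1e-6

# 1.0 where data > eps, 0.0 elsewhere (the "pos" mask from above).
pos = (data > eps).astype(data.dtype)

# Clip before the log so masked-out entries don't produce -inf or nan,
# then zero them out with the mask.
log_data = np.log(np.maximum(data, eps)) * pos
```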
Another option would be for you to manually add whatever function you need as raw CUDA; modifying odlcuda shouldn't be too complicated.
With that said, the primary solution here is frankly to wait for Holger to finalize the tensor branch, and then we'll get a really good backend for this stuff.
@adler-j, am I correct in assuming that @kohr-h's tensor branch isn't in yet and that the above "problem" still exists?
I tried to look into odlcuda a little but could not get my head around it. Is it possible to just get a pointer to the data on the GPU device, so that one could use any GPU code without needing to understand your structures in odlcuda?
I think I found the answer to my question in cu_ntuples.py. Now I start to understand how odl is actually implemented :).
I would like to ask something (please excuse my English). I installed according to the official documentation (https://odlgroup.github.io/odl/getting_started/installing_extensions.html). The CPU version works well, but CUDA does not; please help me analyze it. The installation process was as follows:
git clone https://github.com/odlgroup/odlcuda.git
cd odlcuda
conda install conda-build
git checkout conda-build
conda build ./conda CUDA_ROOT=/usr/lss/cudatoolkit-10.1.243-h74a9793_0 CUDA_COMPUTE=60
conda install --use-local odlcuda
python -c "import odl; odl.rn(3, impl='cuda').element()"
@wangshuaikun did you install it?