
Comments (7)

Rubikplayer commented on July 19, 2024

Nice work on 3D reconstruction! I have some similar issues here.

@mjiUST Could you give us some tips for getting the code running on a newer system?
My system is:

  • Ubuntu 16.04.2 LTS (amd64)
  • CUDA 8.0 / 9.1, cuDNN 7.1 (edit: I installed cuDNN 5.1 instead)

Or:

Do you have suggestions for running/training without cuDNN?

I noticed there are some conditional imports, for example in similarityNet.py:

if lasagne.utils.theano.sandbox.cuda.dnn_available(): # when cuDNN available
    from lasagne.layers.dnn import Conv2DDNNLayer as ConvLayer 
else:
    from lasagne.layers import Conv2DLayer as ConvLayer

But in layers.py and SurfaceNet.py, some cuDNN functions are hardcoded:

  • from lasagne.layers.dnn import Conv3DDNNLayer, Pool3DDNNLayer
  • from theano.sandbox.cuda.dnn import gpu_contiguous, GpuDnnConvDesc, gpu_alloc_empty, GpuDnnConv3dGradW

Following the same logic as the if-branch, for Conv3DDNNLayer and Pool3DDNNLayer I might be able to hack it to:

from lasagne.layers import Conv3DLayer as Conv3DDNNLayer
from lasagne.layers import Pool3DLayer as Pool3DDNNLayer

But for other functions like gpu_contiguous, I haven't found any replacements so far. If you have any suggestions, please let us know! Thanks!
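
For anyone trying the same fallback, the import hack above can be sketched as a small helper that tries each candidate in order. This is only a sketch: import_first_available is a hypothetical helper name, not part of SurfaceNet or Lasagne, and whether lasagne.layers actually exposes Conv3DLayer and Pool3DLayer depends on the installed Lasagne version. It does nothing for gpu_contiguous or the GpuDnn* ops, which would still need cuDNN.

```python
import importlib


def import_first_available(candidates):
    """Return the first importable attribute from (module_name, attr_name) pairs.

    Hypothetical helper: tries each candidate in order and falls through
    to the next one when the module or the attribute is missing.
    """
    errors = []
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr_name)
        except (ImportError, AttributeError) as exc:
            errors.append("%s.%s: %s" % (module_name, attr_name, exc))
    raise ImportError("no candidate could be imported:\n" + "\n".join(errors))


# Intended use (assumes Lasagne is installed; not executed here):
# Conv3DDNNLayer = import_first_available([
#     ("lasagne.layers.dnn", "Conv3DDNNLayer"),  # cuDNN version
#     ("lasagne.layers", "Conv3DLayer"),         # assumed non-cuDNN fallback
# ])
```

The non-cuDNN layers may not accept exactly the same arguments as the cuDNN ones, so the rest of the model code could still need changes.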

@cdb0y511 How are things going with you?

from surfacenet.

mjiUST commented on July 19, 2024

Dear @cdb0y511 @Rubikplayer ,

Thanks for the issue report. I specified the older Theano version:

conda install -c rdonnelly theano -y # 0.9.0 version theano

Since the 3D dilated conv layer was implemented using some APIs in CUDNN, I'm not sure whether we could easily discard CUDNN.

If you are worried that the installation may affect the versions of your existing packages, please feel free to use SurfaceNet/installEnv.sh, which will not change anything in your existing python, theano, or ~/.bashrc. All you need to do is specify the CUDA/CUDNN paths accordingly. Please refer to the updated README.

Hope this may help.

cdb0y511 commented on July 19, 2024

@mjiUST
Thanks a lot, and well done. I am a Ph.D. candidate too. Maybe we can discuss your work one day.
But first, I want to figure out how it works.
I have read installEnv.sh, and I understand how to use conda and install the specified theano 0.9 (even though your scripts install the latest theano).
You don't need to discard CUDNN.
The problem is that theano.sandbox is an old back end. You'd better switch to the new back end, theano.gpuarray. Please see https://github.com/Theano/Theano/wiki/Converting-to-the-new-gpu-back-end(gpuarray)
Otherwise new drivers and new CUDA versions may not be compatible with it. I know you use nvidia driver 375, cuda 8.0, cudnn v5.1. But I need cuda 9.0 and cudnn v7.1.1 for tensorflow 1.6, so the latest nvidia driver has been installed.

Even with theano 0.9, I get:
Exception: ('The following error happened while compiling the node', <theano.sandbox.cuda.DnnVersion object at 0x7f9028151110>(), '\n', 'The nvidia driver version installed with this OS does not give good results for reduction.Installing the nvidia driver available on the same download page as the cuda package will fix the problem: http://developer.nvidia.com/cuda-downloads')

The only way is to switch to the new back end theano.gpuarray, or to give up cuda 9.0 and cudnn v7.1.1 and go back to nvidia driver 375, cuda 8.0, cudnn v5.1. It's hard to choose, and it certainly limits your work.
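
To make the two back ends concrete: in Theano the old theano.sandbox.cuda back end is selected with device=gpu and the new theano.gpuarray back end with device=cuda. A minimal sketch that builds the matching THEANO_FLAGS value follows. The helper name theano_flags and the backend labels "sandbox" and "gpuarray" are made up for illustration. Changing the flag alone does not port code that imports from theano.sandbox.cuda.

```python
def theano_flags(backend, device_index=0, float_type="float32"):
    """Build a THEANO_FLAGS string for the chosen GPU back end.

    Hypothetical helper. "sandbox" is the old theano.sandbox.cuda
    back end (device=gpuN); "gpuarray" is the new one (device=cudaN).
    """
    device_prefix = {"sandbox": "gpu", "gpuarray": "cuda"}
    if backend not in device_prefix:
        raise ValueError("backend must be 'sandbox' or 'gpuarray'")
    return "device=%s%d,floatX=%s" % (
        device_prefix[backend], device_index, float_type)


# Intended use, before theano is imported (not executed here):
# import os
# os.environ["THEANO_FLAGS"] = theano_flags("gpuarray")
# import theano
```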

@Rubikplayer I can't find gpu_contiguous either, even in theano 0.9's documentation. So I guess only the original author can fix it.

mjiUST commented on July 19, 2024

@cdb0y511
Thanks for your interest; I look forward to further discussion.

I don't know whether you have tried this method: say you have both /usr/local/cuda-8.0 and /usr/local/cuda, which is linked to cuda-9.0. Change the 1st line of ~/miniconda2/envs/SurfaceNet/etc/conda/activate.d/activate-cuda.sh to export CUDA_ROOT=/usr/local/cuda-8.0, which will not affect your settings in .bashrc, before you source activate SurfaceNet. In this way, even if you have multiple cuda versions on your PC, a particular one can be specified without ANY influence on your other projects (for example, tensorflow and pytorch).

Similarly, one can also specify a cudnn without influencing other projects by changing the 1st line of ~/miniconda2/envs/SurfaceNet/etc/conda/activate.d/activate-cudnn.sh to any path where the cudnn folder is located, e.g., export CUDNN_ROOT=/home/<user-name>/libs/cudnn-8.0-v5.1.

I highly recommend you install CUDNN outside of CUDA folder, so that you can have any combination of CUDA+CUDNN by defining specific environment variables in different conda_envs.

Please feel free to post any queries.

Rubikplayer commented on July 19, 2024

@mjiUST @cdb0y511
Yes, yesterday I did the following, and main.py now starts running (although some other error occurs):

  • Install CuDNN 5.1 (as you mentioned in "install outside cuda folder")
  • Install theano 0.9, by conda install theano=0.9
  • Specify the CUDA version by exporting environment variables:
export CUDA_ROOT=/usr/local/cuda-8.0
export PATH=$PATH:$CUDA_ROOT/bin
export LD_LIBRARY_PATH=$CUDA_ROOT/lib64:$LD_LIBRARY_PATH
export CPATH=$CUDA_ROOT/include:$CPATH
export LIBRARY_PATH=$CUDA_ROOT/lib64:$LIBRARY_PATH

and setting theano config in ~/.theanorc:

[cuda] 
root=/usr/local/cuda-8.0
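
If you would rather not touch your shell settings at all, the same exports can be applied to a single process from Python. This is a rough sketch only: cuda_environment is a hypothetical helper, and the cuDNN part assumes the cuDNN folder contains lib64 and include subdirectories, which may not match your layout.

```python
import os


def cuda_environment(cuda_root, cudnn_root=None, base_env=None):
    """Return a copy of base_env with the CUDA (and optional cuDNN) paths set.

    Hypothetical helper mirroring the export lines above.
    """
    env = dict(os.environ if base_env is None else base_env)

    def prepend(var, entry):
        current = env.get(var, "")
        env[var] = entry + os.pathsep + current if current else entry

    env["CUDA_ROOT"] = cuda_root
    # As in the exports above, the CUDA bin directory is appended to PATH.
    cuda_bin = os.path.join(cuda_root, "bin")
    path = env.get("PATH", "")
    env["PATH"] = path + os.pathsep + cuda_bin if path else cuda_bin
    prepend("LD_LIBRARY_PATH", os.path.join(cuda_root, "lib64"))
    prepend("CPATH", os.path.join(cuda_root, "include"))
    prepend("LIBRARY_PATH", os.path.join(cuda_root, "lib64"))

    if cudnn_root is not None:
        # Assumed layout: <cudnn_root>/lib64 and <cudnn_root>/include.
        env["CUDNN_ROOT"] = cudnn_root
        prepend("LD_LIBRARY_PATH", os.path.join(cudnn_root, "lib64"))
        prepend("CPATH", os.path.join(cudnn_root, "include"))
        prepend("LIBRARY_PATH", os.path.join(cudnn_root, "lib64"))
    return env


# Intended use (not executed here):
# import subprocess
# subprocess.call(["python", "main.py"],
#                 env=cuda_environment("/usr/local/cuda-8.0"))
```

Because the cuDNN directories are prepended last, they come before the CUDA ones in the search order.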

@cdb0y511 You can give it a try too. I have multiple CUDA versions installed. I also installed two versions of CuDNN (although I might have overwritten 7.1 with 5.1).

For the error I encountered, I will open another issue. Thanks for the feedback!
Edit: new issue opened: (#4)

mjiUST commented on July 19, 2024

@Rubikplayer
Thank you for the feedback. To be precise,

  • before specifying an outside cudnn, the original one should be removed OR unlinked (removed from the env variables LD_LIBRARY_PATH, CPATH, and LIBRARY_PATH)

  • to install the 0.9 version of Theano, please use the command:

    conda install -c rdonnelly theano -y # 0.9.0 version theano
    since the one you mentioned, conda install theano=0.9, will result in a 0.9 version with a different commit hash.
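
One way to check the first point is to scan those three variables for directories that still contain cuDNN files. The sketch below is only an illustration: find_cudnn_dirs and cudnn_report are hypothetical helpers, and the file patterns libcudnn* and cudnn*.h are assumptions about how the cuDNN files are named.

```python
import glob
import os


def find_cudnn_dirs(path_value, patterns=("libcudnn*", "cudnn*.h")):
    """Return the directories in a path list that contain cuDNN-like files."""
    hits = []
    for directory in path_value.split(os.pathsep):
        if not directory or directory in hits:
            continue
        if any(glob.glob(os.path.join(directory, p)) for p in patterns):
            hits.append(directory)
    return hits


def cudnn_report(env=None):
    """Map each relevant environment variable to its cuDNN-bearing directories."""
    env = os.environ if env is None else env
    names = ("LD_LIBRARY_PATH", "CPATH", "LIBRARY_PATH")
    return dict((name, find_cudnn_dirs(env.get(name, ""))) for name in names)


if __name__ == "__main__":
    for name, dirs in sorted(cudnn_report().items()):
        print("%s: %s" % (name, ", ".join(dirs) or "no cuDNN files found"))
```

If a variable lists more than one directory, or one inside the CUDA folder, that is probably the copy to remove or unlink.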

Rubikplayer commented on July 19, 2024

@mjiUST Thanks for the response.

  • Yes, as I found in another thread, indeed different versions of CuDNN can result in errors.
  • Thanks for the info! It seems the conda-installed version is okay for now. If any problem comes up, I will switch to the version you specified.
