Comments (18)

deval281shah commented on July 25, 2024

Hello,
cuDNN has been tested with the following configuration:

Ubuntu 16.04.5 LTS
gcc 5.4.0 g++ 5.4.0
CUDA 8.0
cuDNN 7.1.4
Please try with these settings.

Also, you can use the MNIST example code for GPGPU-Sim: https://github.com/gpgpu-sim/gpgpu-sim_simulations/tree/master/benchmarks/src/cuda/cudnn/mnist (it includes detailed steps as well).
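
To compare a local setup against the tested configuration listed above, a small helper like the following may be handy. This is only a sketch: the dictionary keys and the `mismatches` function are hypothetical names, and the local versions are typed in by hand rather than detected.

```python
# Tested configuration as listed in this comment (key names are illustrative).
TESTED = {
    "ubuntu": "16.04.5",
    "gcc": "5.4.0",
    "g++": "5.4.0",
    "cuda": "8.0",
    "cudnn": "7.1.4",
}


def mismatches(found, tested=TESTED):
    """Return {component: (found_version, tested_version)} for every difference.

    Components missing from `found` are reported with a found version of None.
    """
    return {
        name: (found.get(name), want)
        for name, want in tested.items()
        if found.get(name) != want
    }


if __name__ == "__main__":
    # Hypothetical local setup: CUDA 9.0 instead of the tested CUDA 8.0.
    local = dict(TESTED, cuda="9.0")
    for name, (have, want) in mismatches(local).items():
        print("%s: found %s, tested with %s" % (name, have, want))
```

An empty result means the supplied versions match the tested ones exactly; it says nothing about whether other combinations can work.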

from gpgpu-sim_distribution.

deval281shah commented on July 25, 2024

Hi,
Which config files (gpgpusim.config) are you using?

deval281shah commented on July 25, 2024

Hi,
Thanks for reporting the issue. This might be an issue with the config file.
Have you tried SM7_TITANV? It works with that config.

shengyushen commented on July 25, 2024

CUDA 8.0 doesn't support Tensor Cores, which need CUDA 9.0, so how can this be resolved?

gangmul12 commented on July 25, 2024

I think the libcudart path in the PyTorch installation is not that important, because GPGPU-Sim is based on dynamic library hijacking. Does simply running PyTorch not work?
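
To see which libcudart would be picked up first, the library search path can be inspected. The sketch below assumes the hijacking relies on GPGPU-Sim's own libcudart.so appearing ahead of the CUDA toolkit's copy on LD_LIBRARY_PATH. The function name `first_libcudart` is hypothetical, and the code only mimics the LD_LIBRARY_PATH part of the lookup, so treat its answer as a hint.

```python
import os


def first_libcudart(ld_library_path):
    """Return the first libcudart.so* found on a search path string.

    Only the LD_LIBRARY_PATH-style lookup is mimicked here; the real dynamic
    linker also consults RPATH/RUNPATH and the ld.so cache. Empty entries
    and directories that do not exist are skipped.
    """
    for directory in ld_library_path.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if name.startswith("libcudart.so"):
                return os.path.join(directory, name)
    return None


if __name__ == "__main__":
    hit = first_libcudart(os.environ.get("LD_LIBRARY_PATH", ""))
    print(hit or "no libcudart.so found on LD_LIBRARY_PATH")
```

If the printed path is not inside the GPGPU-Sim build directory, the simulator's library is probably not the one being loaded.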

gangmul12 commented on July 25, 2024

I got stuck at another error message, but I think I am a few steps ahead of you, because GPGPU-Sim is loaded successfully when I execute "python main.py" in examples/mnist.
What is the result when you just launch python and execute import torch?

My environment:
python 2.7.12
gcc 5.4
Ubuntu 16.04.06
CUDA 8.0
cudnn 7.1.4
gpgpu-sim commit 49e95cd
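
The "launch python and import torch" check above can also be scripted so that the full error text is captured. This is a rough sketch, and `try_import` is a hypothetical helper name; it catches any exception, since a failing import can raise something other than ImportError.

```python
import importlib
import traceback


def try_import(module_name):
    """Try to import a module; return (True, version) or (False, traceback text)."""
    try:
        module = importlib.import_module(module_name)
    except Exception:
        return False, traceback.format_exc()
    return True, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    ok, detail = try_import("torch")
    print("import torch: %s" % ("ok" if ok else "failed"))
    print(detail)
```

Running it once with and once without the GPGPU-Sim environment set up should help tell a plain Python problem from a simulator problem.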

gangmul12 commented on July 25, 2024

@ohcurrent
I wrote issue #101.
[image: capture]

I got this message when I tested ResNet, and reproduced it when running examples/mnist.

gangmul12 commented on July 25, 2024

@ohcurrent
Maybe you should set the PYTORCH_BIN environment variable, as GPGPU-Sim indicated.

gangmul12 commented on July 25, 2024

I think we now have the same problem! However, the attribute error looks like a Python error, not a GPGPU-Sim one. I suggest you test whether your program works without GPGPU-Sim.

gangmul12 commented on July 25, 2024

Long time no see XD
No, I'm not currently working on this issue. Also, I've never seen that issue, so your problem may be different from mine. However, again, isn't it a Python error? It doesn't seem like a GPGPU-Sim error...

cng123 commented on July 25, 2024

Hi. This has been a known issue, and it has to do with kernels not being found in libcudnn.so. Please try the instructions in the link below and see if it helps.
https://docs.google.com/document/d/17fSM2vrWodP8rWR7ctpgaggVXEw0uD2VCAh0Gi4Gpb4/edit?usp=sharing

ohcurrent commented on July 25, 2024

@cng123 Did your method get any cudaLaunch to succeed for the PyTorch examples?

cng123 commented on July 25, 2024

cudaLaunch succeeded for some of the PyTorch examples. However, the kernels might still fail (cuda_status_internal_error). Unfortunately, I have not resolved this issue yet.

ohcurrent commented on July 25, 2024

@cng123 Thank you for sharing the link.
However, how do you know that the 'no PTX implementation~' error is related to wrong cuDNN paths?

cng123 commented on July 25, 2024

@ohcurrent It was just based on personal experience from experimenting with different builds and different environment variables. There are probably other problems that lead to the same error, but the most common cause I have seen so far is wrong (or unset) cuDNN paths.

It seems that if the caffe shared library is not statically linked to cuDNN (that is, either dynamically linked or not linked at all), the 'no PTX implementation' error will occur. This is a problem, since it means that PyTorch built with only CUDA and without cuDNN will not work with GPGPU-Sim. I am speculating that it has to do with libcublas and other CUDA libraries being dynamically linked, but I cannot say for sure.

mivenHan commented on July 25, 2024

@ohcurrent Have you solved the cuDNN error with mnistCudnn? I encountered the same error as you.

ohcurrent commented on July 25, 2024

@mivenHan
Yes, you can just edit the Makefile of mnistCudnn as shown below.
The application should be linked against the static libraries:

#LIBRARIES += -LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm

LIBRARIES += -LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) -lcudart -lcublas_static -lcudnn_static_v7 -lculibos -lfreeimage -lstdc++ -lm -ldl -lpthread
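
The same substitution can be expressed as a small script, in case it has to be applied to several Makefiles. This is just a sketch based on the two lines above. `to_static` and the two constants are hypothetical names, and the extra libraries are appended at the end of the line here, whereas the edited line above places -lculibos right after the cuDNN library.

```python
# Dynamic-to-static substitutions taken from the Makefile change above.
STATIC_NAMES = {
    "-lcublas": "-lcublas_static",
    "-lcudnn": "-lcudnn_static_v7",
}
# Extra libraries that the static variant of the line adds.
EXTRA_FLAGS = ["-lculibos", "-ldl", "-lpthread"]


def to_static(flags):
    """Rewrite a whitespace-separated linker flag string to the static variant."""
    out = [STATIC_NAMES.get(flag, flag) for flag in flags.split()]
    # Append each extra flag once, so applying the function twice is harmless.
    out += [flag for flag in EXTRA_FLAGS if flag not in out]
    return " ".join(out)


if __name__ == "__main__":
    print(to_static("-lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm"))
```

Flags that are not in the mapping, such as -lcudart or the -L search paths, pass through unchanged.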

mivenHan commented on July 25, 2024

@ohcurrent Thank you for your reply.

I want to confirm: did you run mnistCUDNN to completion under CUDA 9.1?
I am working with the MNIST example code for GPGPU-Sim: https://github.com/gpgpu-sim/gpgpu-sim_simulations/tree/master/benchmarks/src/cuda/cudnn/mnist

My CUDA version is 9.0, and I tried cuDNN 7.6.5, cuDNN 7.1.4, and cuDNN 7.0.5.

All of them failed with a new error:

GPGPU-Sim PTX: WARNING: Asynchronous memset not supported (cudaError_t cudaMemsetAsync(void*, int, size_t, cudaStream_t))
./mnistCUDNN: relocation error: ./mnistCUDNN: symbol cudaFuncSetAttribute, version libcudart.so.9.0 not defined in file libcudart.so.9.0 with link time reference

Have you encountered that error?
Or does the code run to completion with CUDA 9.1 and cuDNN 7.0.5?

Thank you for your time.
