Running about gpgpu-sim_distribution HOT 18 CLOSED

gpgpu-sim commented on July 25, 2024

Running

from gpgpu-sim_distribution.

Comments (18)

deval281shah commented on July 25, 2024

Hello,
cuDNN has been tested with the following configuration:

Ubuntu 16.04.5 LTS
gcc 5.4.0 g++ 5.4.0
CUDA 8.0
cuDNN 7.1.4
Please try with these settings.

You can also use the MNIST example code for GPGPU-Sim, which includes detailed steps: https://github.com/gpgpu-sim/gpgpu-sim_simulations/tree/master/benchmarks/src/cuda/cudnn/mnist
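
A quick way to compare a local setup against the tested configuration is sketched below. The `TESTED` table is copied from this comment; the function name `find_mismatches` and the `installed` dictionary are illustrative assumptions, and collecting the installed versions (for example from `gcc --version` or `nvcc --version`) is left to the reader.

```python
# Versions listed in this comment as the tested cuDNN configuration.
TESTED = {
    "ubuntu": "16.04.5",
    "gcc": "5.4.0",
    "g++": "5.4.0",
    "cuda": "8.0",
    "cudnn": "7.1.4",
}


def find_mismatches(installed, tested=TESTED):
    """Return {component: (installed, tested)} for every version that differs.

    Components missing from `installed` are reported with None.
    """
    mismatches = {}
    for name, wanted in tested.items():
        found = installed.get(name)
        if found != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


if __name__ == "__main__":
    # Hypothetical machine: CUDA 9.0 instead of the tested 8.0.
    example = {"ubuntu": "16.04.5", "gcc": "5.4.0", "g++": "5.4.0",
               "cuda": "9.0", "cudnn": "7.1.4"}
    for name, (found, wanted) in find_mismatches(example).items():
        print(f"{name}: found {found}, tested with {wanted}")
```

An empty result only means the version strings match; it does not guarantee that the simulation will run.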

deval281shah commented on July 25, 2024

Hi,
Which config files (gpgpusim.config) are you using?

deval281shah commented on July 25, 2024

Hi,
Thanks for reporting the issue. It might be a problem with the config file.
Have you tried SM7_TITANV? It works with that config.

shengyushen commented on July 25, 2024

CUDA 8.0 doesn't support Tensor Cores, which need CUDA 9.0. How can this be resolved?

gangmul12 commented on July 25, 2024

I don't think the libcudart path in the PyTorch installation matters much, because GPGPU-Sim works by hijacking the dynamic library. Does simply running PyTorch not work?
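
One way to see which libcudart the loader actually picks is to inspect `ldd` output for the binary or shared object in question. The sketch below only parses text that you would capture yourself; the helper name `resolved_library` and every path in `SAMPLE` are hypothetical, chosen for illustration.

```python
import re

# One `ldd` line looks like: "\tlibfoo.so.1 => /path/libfoo.so.1 (0x...)".
# Requiring a leading "/" skips "=> not found" entries.
_LDD_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(/\S+)")


def resolved_library(ldd_output, prefix="libcudart.so"):
    """Return the path the dynamic loader picked for `prefix`, or None."""
    for line in ldd_output.splitlines():
        match = _LDD_LINE.match(line)
        if match and match.group(1).startswith(prefix):
            return match.group(2)
    return None


# Hypothetical `ldd ./app` output; the paths are made up for illustration.
SAMPLE = """\
\tlinux-vdso.so.1 (0x00007ffc0b5f2000)
\tlibcudart.so.8.0 => /home/user/gpgpu-sim_distribution/lib/libcudart.so.8.0 (0x00007f1c2a000000)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1c29c00000)
"""

if __name__ == "__main__":
    print(resolved_library(SAMPLE))
```

If the returned path points into the CUDA toolkit rather than the simulator's library directory, the hijacking has probably not taken effect for that process.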

gangmul12 commented on July 25, 2024

I'm stuck at a different error message, but I think I'm a few steps ahead of you, because GPGPU-Sim loads successfully when I run "python main.py" in examples/mnist.
What happens when you just launch python and execute import torch?

My environment:
python 2.7.12
gcc 5.4
Ubuntu 16.04.06
CUDA 8.0
cudnn 7.1.4
gpgpu-sim commit 49e95cd

gangmul12 commented on July 25, 2024

@ohcurrent
I wrote issue #101.

I first got this message when testing ResNet, and reproduced it when running examples/mnist.

gangmul12 commented on July 25, 2024

@ohcurrent
Maybe you should set the PYTORCH_BIN environment variable as GPGPU-Sim indicated.

gangmul12 commented on July 25, 2024

I think we're now hitting the same problem! However, an attribute error looks like a Python error, not a GPGPU-Sim one. I suggest you check that your program works without GPGPU-Sim.

gangmul12 commented on July 25, 2024

Long time no see XD
No, I'm not working on this issue at the moment. I've also never seen that error, so your problem may be different from mine. But again, isn't it a Python error? It doesn't look like a GPGPU-Sim error.

cng123 commented on July 25, 2024

Hi. This is a known issue, and it has to do with kernels not being found in libcudnn.so. Please try the instructions in the link below and see if they help.
https://docs.google.com/document/d/17fSM2vrWodP8rWR7ctpgaggVXEw0uD2VCAh0Gi4Gpb4/edit?usp=sharing

ohcurrent commented on July 25, 2024

@cng123 Did your method succeed in any cudaLaunch for PyTorch examples?

cng123 commented on July 25, 2024

cudaLaunch succeeded for some of the PyTorch examples. However, the kernels might still fail (cuda_status_internal_error). Unfortunately, I have not resolved this issue yet.

ohcurrent commented on July 25, 2024

@cng123 Thank you for sharing the link.
However, how do you know the 'no PTX implementation~' error is related to wrong cuDNN paths?

cng123 commented on July 25, 2024

@ohcurrent It's based only on personal experience from experimenting with different builds and different environment variables. Other problems can probably lead to the same error, but the most common cause I have seen so far is a wrong (or unset) cuDNN path.

It seems that if the caffe shared library is not statically linked to cuDNN (that is, it is either dynamically linked or not linked at all), the 'no PTX implementation' error occurs. That's a problem, because it means PyTorch built with CUDA only, without cuDNN, will not work with GPGPU-Sim. I suspect it has to do with libcublas and other CUDA libraries being dynamically linked, but I can't say for sure.
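
To check how a given shared library is linked, one option is to look for cuDNN and cuBLAS among its dynamic dependencies. The sketch below works on `ldd` output captured beforehand; `dynamic_cuda_deps` and the sample text are hypothetical. Note that a library missing from the list is either statically linked or not linked at all, and `ldd` alone cannot tell those two cases apart.

```python
def dynamic_cuda_deps(ldd_output, stems=("libcudnn", "libcublas")):
    """List shared-object names in `ldd` output that start with any stem."""
    found = []
    for line in ldd_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith(tuple(stems)):
            found.append(fields[0])
    return found


# Hypothetical `ldd` output for a library that links cuBLAS dynamically
# and has no dynamic cuDNN dependency; the paths are made up.
SAMPLE_LDD = """\
\tlibcublas.so.8.0 => /usr/local/cuda/lib64/libcublas.so.8.0 (0x00007f0000000000)
\tlibcudart.so.8.0 => /usr/local/cuda/lib64/libcudart.so.8.0 (0x00007f0001000000)
\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0002000000)
"""

if __name__ == "__main__":
    print(dynamic_cuda_deps(SAMPLE_LDD))
```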

mivenHan commented on July 25, 2024

@ohcurrent Have you solved the cuDNN error with mnistCudnn? I'm hitting the same error as you.

ohcurrent commented on July 25, 2024

@mivenHan
Yes, you can just edit the Makefile of mnistCudnn as shown below.
The application should be compiled with static libraries.

#LIBRARIES += -LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) -lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm

LIBRARIES += -LFreeImage/lib/$(TARGET_OS)/$(TARGET_ARCH) -LFreeImage/lib/$(TARGET_OS) -lcudart -lcublas_static -lcudnn_static_v7 -lculibos -lfreeimage -lstdc++ -lm -ldl -lpthread
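
The difference between the two `LIBRARIES` lines can be summarized as a small flag rewrite. The sketch below is only an illustration of that change: the mapping is read off the two Makefile lines in this comment, and the names `STATIC_EQUIVALENTS`, `EXTRA_FLAGS`, and `to_static_flags` are made up here.

```python
# Dynamic flag -> static replacement(s), taken from the two Makefile lines above.
STATIC_EQUIVALENTS = {
    "-lcublas": ["-lcublas_static"],
    "-lcudnn": ["-lcudnn_static_v7", "-lculibos"],
}
# Extra system libraries the static line adds at the end.
EXTRA_FLAGS = ["-ldl", "-lpthread"]


def to_static_flags(flags):
    """Rewrite a list of linker flags to the static variant shown above."""
    rewritten = []
    for flag in flags:
        rewritten.extend(STATIC_EQUIVALENTS.get(flag, [flag]))
    # Append the extra libraries unless they are already present.
    return rewritten + [f for f in EXTRA_FLAGS if f not in rewritten]


if __name__ == "__main__":
    dynamic = "-lcudart -lcublas -lcudnn -lfreeimage -lstdc++ -lm".split()
    print(" ".join(to_static_flags(dynamic)))
    # prints: -lcudart -lcublas_static -lcudnn_static_v7 -lculibos -lfreeimage -lstdc++ -lm -ldl -lpthread
```

Note that `-lcudart` stays dynamic in both lines, which fits the earlier point that GPGPU-Sim takes over the runtime library at load time.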

mivenHan commented on July 25, 2024

@ohcurrent Thank you for your reply.

Can you confirm that you ran mnistCUDNN to completion under CUDA 9.1?
I'm working with the MNIST example code for GPGPU-Sim: https://github.com/gpgpu-sim/gpgpu-sim_simulations/tree/master/benchmarks/src/cuda/cudnn/mnist

My CUDA version is 9.0,
and I tried cuDNN 7.6.5, 7.1.4, and 7.0.5.

All of them failed with a new error:

GPGPU-Sim PTX: WARNING: Asynchronous memset not supported (cudaError_t cudaMemsetAsync(void*, int, size_t, cudaStream_t))
./mnistCUDNN: relocation error: ./mnistCUDNN: symbol cudaFuncSetAttribute, version libcudart.so.9.0 not defined in file libcudart.so.9.0 with link time reference

Have you encountered that error?
Or does the code run to completion with CUDA 9.1 and cuDNN 7.0.5?

Thank you for your time.
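
For reference, the relocation error above says that the binary was linked against a libcudart.so.9.0 that exports `cudaFuncSetAttribute`, while the libcudart.so.9.0 found at load time does not define it. A small parser for that message format is sketched below; the regular expression and the name `parse_relocation_error` are assumptions made for illustration, not part of any tool.

```python
import re

# Matches the loader message quoted above, for example:
# "./app: relocation error: ./app: symbol foo, version V not defined in file F ..."
_RELOC = re.compile(
    r"relocation error: (?P<object>\S+): symbol (?P<symbol>\S+?), "
    r"version (?P<version>\S+) not defined in file (?P<file>\S+)"
)


def parse_relocation_error(message):
    """Return a dict with object, symbol, version and file, or None."""
    match = _RELOC.search(message)
    return match.groupdict() if match else None


# The error line reported in this comment.
LINE = ("./mnistCUDNN: relocation error: ./mnistCUDNN: symbol "
        "cudaFuncSetAttribute, version libcudart.so.9.0 not defined in file "
        "libcudart.so.9.0 with link time reference")

if __name__ == "__main__":
    info = parse_relocation_error(LINE)
    print(f"{info['file']} does not define {info['symbol']}")
```

Knowing the missing symbol makes it easier to check whether the runtime library that was actually loaded provides it.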
