godweiyang / nn-cuda-example



Several simple examples for popular neural network toolkits calling custom CUDA operators.

License: Apache License 2.0

C++ 13.27% Cuda 3.97% C 0.88% Python 67.33% CMake 14.54%

cpp cuda neural-network python pytorch tensorflow

nn-cuda-example's Introduction

算法码上来 👋

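📙 NLP algorithm engineer at ByteDance AI Lab
🔨 Research interests: model optimization, machine translation, syntactic parsing
🐏 WeChat official account: 算法码上来
❤️ WeChat: godweiyang. Internal referrals are available on an ongoing basis; add me to join the tech and referral group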


nn-cuda-example's Issues

subprocess.CalledProcessError: Command '['where', 'cl']' returned non-zero exit status 1. (Using JIT)

The environment is completely the same, but when I use PyTorch with the JIT method, the following error appears:

subprocess.CalledProcessError: Command '['where', 'cl']' returned non-zero exit status 1.

Any suggestion? Thank you!!!!
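
The error text suggests that the `where cl` lookup found no MSVC compiler on PATH. A minimal sketch for checking this before starting a JIT build is shown below; `find_tools` is a hypothetical helper name, and the list of tools to check is an assumption rather than anything the repository prescribes.

```python
import shutil


def find_tools(names=("cl",)):
    """Map each executable name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # "cl" is the MSVC compiler named in the error; "nvcc" and "ninja" are
    # extra tools a CUDA JIT build commonly needs (an assumption here).
    for name, path in find_tools(("cl", "nvcc", "ninja")).items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
```

If `cl` is reported as missing, launching Python from a Visual Studio developer command prompt is one common way to get it onto PATH.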

About operating system

I have a question: does this code run on macOS? If so, how can I run it on Windows?

tf2.3 cuda10.1 tf.load_op_library("build/libadd2.so") error

Traceback (most recent call last):
File "tensorflow/time.py", line 57, in <module>
cuda_module = tf.load_op_library('build/libadd2.so')
File "/home/guowei/anaconda3/envs/gy_py3.7_tf2.3/lib/python3.7/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: build/libadd2.so: undefined symbol: _ZTIN10tensorflow8OpKernelE
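
The missing symbol is a mangled C++ name, and decoding it shows which TensorFlow class the library could not resolve. The sketch below handles only the simple `_ZTIN...E` form seen in this log (Itanium C++ ABI typeinfo for a nested name); `demangle_typeinfo` is a hypothetical helper, not a general demangler.

```python
import re


def demangle_typeinfo(symbol):
    """Decode a symbol of the form _ZTIN<len><name>...E into readable text."""
    prefix = "_ZTIN"
    if not (symbol.startswith(prefix) and symbol.endswith("E")):
        raise ValueError(f"unsupported symbol form: {symbol}")
    body = symbol[len(prefix):-1]
    parts = []
    i = 0
    while i < len(body):
        # Each component is a decimal length followed by that many characters.
        match = re.match(r"\d+", body[i:])
        if match is None:
            raise ValueError(f"expected a length prefix at offset {i}")
        length = int(match.group())
        i += len(match.group())
        parts.append(body[i:i + length])
        i += length
    return "typeinfo for " + "::".join(parts)


if __name__ == "__main__":
    print(demangle_typeinfo("_ZTIN10tensorflow8OpKernelE"))
    # prints: typeinfo for tensorflow::OpKernel
```

An unresolved `tensorflow::OpKernel` typeinfo may indicate that the library was built without the flags reported by `tf.sysconfig.get_compile_flags()` and `tf.sysconfig.get_link_flags()`, though the log alone does not confirm this.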

RuntimeError: CUDA error: an illegal memory access was encountered

I modified train.py and now get this error: RuntimeError: CUDA error: an illegal memory access was encountered

(My environment is fine; the unmodified train.py works.)

Could you offer any suggestions? Thank you!

torch.ops.load_library("build/libadd2.so") error

Traceback (most recent call last):
File "time.py", line 60, in <module>
torch.ops.load_library("build/libadd2.so")
File "/home/gzy/anaconda3/envs/pytorch1.7/lib/python3.6/site-packages/torch/_ops.py", line 105, in load_library
ctypes.CDLL(path)
File "/home/gzy/anaconda3/envs/pytorch1.7/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/gzy/NN-CUDA-Example/pytorch/build/libadd2.so: undefined symbol: THPVariableClass

ERROR: expected constructor, destructor, or type conversion before ‘(’ token

Thanks for sharing. When I compile the PyTorch project using JIT or Setuptools (e.g., python3 pytorch/setup.py install), I get the following error:

/home/chenxingyu/Documents/NN-CUDA-Example/pytorch/add2_ops.cpp:20:14: error: expected constructor, destructor, or type conversion before ‘(’ token
 TORCH_LIBRARY(add2, m) {

Could you help me solve it?

Incorrect results

Result: cuda_res is the same before and after running. What is going on?

a

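Is there a Docker environment?

Could you provide a Docker environment? When I compiled locally, an environment mismatch caused errors, and I could not get it working even after some effort.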

A compilation problem about setup.py

This problem occurs when I compile:

running install
running bdist_egg
running egg_info
writing add2.egg-info\PKG-INFO
writing dependency_links to add2.egg-info\dependency_links.txt
writing top-level names to add2.egg-info\top_level.txt
reading manifest file 'add2.egg-info\SOURCES.txt'
writing manifest file 'add2.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_ext
D:\miniconda3\envs\torch-gpu\lib\site-packages\torch\utils\cpp_extension.py:304: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'add2' extension
Emitting ninja build file D:\git-bash\daima\NN-CUDA-Example\build\temp.win-amd64-3.8\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.10.2.git.kitware.jobserver-1
C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:D:\miniconda3\envs\torch-gpu\lib\site-packages\torch\lib "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0\lib/x64" /LIBPATH:D:\miniconda3\envs\torch-gpu\libs /LIBPATH:D:\miniconda3\envs\torch-gpu\PCbuild\amd64 "/LIBPATH:C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\lib\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\10\lib\10.0.10240.0\ucrt\x64" "/LIBPATH:C:\Program Files (x86)\Windows Kits\8.1\lib\winv6.3\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib cudart.lib c10_cuda.lib torch_cuda_cu.lib torch_cuda_cpp.lib /EXPORT:PyInit_add2 D:\git-bash\daima\NN-CUDA-Example\build\temp.win-amd64-3.8\Release\pytorch/add2_ops.obj D:\git-bash\daima\NN-CUDA-Example\build\temp.win-amd64-3.8\Release\kernel/add2_kernel.obj /OUT:build\lib.win-amd64-3.8\add2.cp38-win_amd64.pyd /IMPLIB:D:\git-bash\daima\NN-CUDA-Example\build\temp.win-amd64-3.8\Release\pytorch\add2.cp38-win_amd64.lib
LINK : fatal error LNK1181: cannot open input file 'D:\git-bash\daima\NN-CUDA-Example\build\temp.win-amd64-3.8\Release\pytorch\add2_ops.obj'
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.16.27023\bin\HostX86\x64\link.exe' failed with exit status 1181

The environment configuration is as follows:
pytorch1.8.1+cuda11.1
python3.8

How can I solve this problem?

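Could you help me see why compilation fails with RuntimeError: Error building extension 'add2'? I also don't understand the later message, fatal error: add2.h: No such file or directory

Running the following JIT command in a Jupyter notebook produces this compilation error: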
python3 time.py --compiler jit

Using /tmp/torch_extensions as PyTorch extensions root...
Detected CUDA files, patching ldflags
Emitting ninja build file /tmp/torch_extensions/add2/build.ninja...
Building extension module add2...
[1/2] c++ -MMD -MF add2_ops.o.d -DTORCH_EXTENSION_NAME=add2 -DTORCH_API_INCLUDE_EXTENSION_H -I/gby/NN-CUDA-Example-master/pytorch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.6/dist-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /gby/NN-CUDA-Example-master/pytorch/add2_ops.cpp -o add2_ops.o
FAILED: add2_ops.o
c++ -MMD -MF add2_ops.o.d -DTORCH_EXTENSION_NAME=add2 -DTORCH_API_INCLUDE_EXTENSION_H -I/gby/NN-CUDA-Example-master/pytorch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.6/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.6/dist-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.6m -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++11 -c /gby/NN-CUDA-Example-master/pytorch/add2_ops.cpp -o add2_ops.o
/gby/NN-CUDA-Example-master/pytorch/add2_ops.cpp:2:10: fatal error: add2.h: No such file or directory
#include "add2.h"
^~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 960, in _build_extension_module
check=True)
File "/usr/lib/python3.6/subprocess.py", line 438, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "time.py", line 56, in <module>
verbose=True)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 658, in load
is_python_module)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 827, in _jit_compile
with_cuda=with_cuda)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 880, in _write_ninja_file_and_build
_build_extension_module(name, build_directory, verbose)
File "/usr/local/lib/python3.6/dist-packages/torch/utils/cpp_extension.py", line 973, in _build_extension_module
raise RuntimeError(message)
RuntimeError: Error building extension 'add2'
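
In the log above, the only project include path passed to the compiler is `-I/gby/NN-CUDA-Example-master/pytorch/include`, so the build fails if `add2.h` is not in that directory. A minimal sketch for checking where the header actually lives is shown below; `locate_header` is a hypothetical helper, and the candidate directories in the usage example are assumptions about the repository layout.

```python
from pathlib import Path


def locate_header(header, include_dirs):
    """Return the absolute path of the first directory entry containing `header`, or None."""
    for directory in include_dirs:
        candidate = Path(directory) / header
        if candidate.is_file():
            return candidate.resolve()
    return None


if __name__ == "__main__":
    # Candidate directories are guesses; adjust them to the actual checkout.
    candidates = ["include", "pytorch/include", "../include"]
    found = locate_header("add2.h", candidates)
    print(found if found else "add2.h not found in: " + ", ".join(candidates))
```

If the header turns up in a different directory, that directory could be passed through the `extra_include_paths` argument of `torch.utils.cpp_extension.load`, or the script could be run from the directory its relative paths expect.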

Compiling with cmake produces the following error:
Traceback (most recent call last):
File "time.py", line 60, in <module>
torch.ops.load_library("build/libadd2.so")
File "/usr/local/lib/python3.6/dist-packages/torch/_ops.py", line 106, in load_library
ctypes.CDLL(path)
File "/usr/lib/python3.6/ctypes/__init__.py", line 348, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /gby/NN-CUDA-Example-master/pytorch/build/libadd2.so: cannot open shared object file: No such file or directory

CMake error; the other two compile methods succeeded

Checking whether the CUDA compiler is NVIDIA using "" did not match "nvcc: NVIDIA (R) Cuda compiler driver":
Checking whether the CUDA compiler is Clang using "" did not match "(clang version)":
Compiling the CUDA compiler identification source file "CMakeCUDACompilerId.cu" failed.
Compiler: /usr/local/cuda/bin/nvcc
Build flags:
Id flags: -v