Comments (10)
Sure, I'm happy to help. Let me try out the new version and I'll let you know if it works for me.
from tutel.
Your CUDA environment seems not to be installed in the default location (e.g. /usr/local/cuda/include). Can you print the value of CUDA_HOME? BTW, you can also try whether
export USE_NVRTC=0
will help.
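For readers hitting the same problem, here is a minimal sketch of setting these variables from Python before tutel's JIT compilation is triggered. This is not tutel's own mechanism, and the CUDA path is only an example of a non-default install like the one reported later in this thread.

import os

# Point the build at a non-default CUDA install and disable NVRTC before
# anything tries to JIT-compile the kernels (example path, adjust as needed).
os.environ.setdefault("CUDA_HOME", "/public/apps/cuda/11.3")
os.environ["USE_NVRTC"] = "0"

import torch
print(torch.version.cuda, torch.cuda.is_available())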
Thank you for your prompt reply! Yes, my CUDA environment is not installed in the default location because I'm using a shared compute cluster. Is there a parameter I can set to make sure the compiler finds the correct CUDA? I will also try export USE_NVRTC=0.
$ echo $CUDA_HOME
/public/apps/cuda/11.3
We just merged a PR that parses CUDA_HOME from the environment variable. Can you try whether it works for you?
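For context, a rough sketch of what parsing CUDA_HOME from the environment usually looks like; this illustrates the general pattern, not the actual code in the merged PR, and the CUDA_PATH fallback is only a common alternative, not necessarily something tutel checks.

import os

def guess_cuda_home():
    # Prefer an explicitly exported CUDA_HOME (or CUDA_PATH) and fall back
    # to the default location mentioned earlier in the thread.
    cuda_home = os.environ.get("CUDA_HOME") or os.environ.get("CUDA_PATH")
    if cuda_home and os.path.isdir(os.path.join(cuda_home, "include")):
        return cuda_home
    return "/usr/local/cuda"

print(guess_cuda_home())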
Thank you! The fix works and CUDA_HOME can now be found correctly. I can successfully run hello_world.py and hello_world_ddp.py under the examples folder without any error. However, when I tried to use it with fairseq (the use case is here), I got the following two errors:
- Compilation error (it seems the CUDA compiler worked, otherwise there would not be the second error):
[W custom_kernel.cpp:158] nvrtc: error: unrecognized option --includ`��.�U found
Failed to use NVRTC for JIT compilation in this Pytorch version, try another approach using CUDA compiler.. (To always disable NVRTC, please: export USE_NVRTC=0)
- RuntimeError:
  File "/private/home/hyhuang/.conda/envs/newnllb/lib/python3.9/site-packages/fairseq-1.0.0a0+b1b3eda-py3.9-linux-x86_64.egg/fairseq/modules/moe/top2gate.py", line 234, in top2gating
    locations1 = fused_cumsum_sub_one(mask1)
  File "/private/home/hyhuang/.local/lib/python3.9/site-packages/tutel/jit_kernels/gating.py", line 22, in fast_cumsum_sub_one
    return torch.ops.tutel_ops.cumsum(data)
RuntimeError: (0) == (cuModuleLoadDataEx(&hMod, image.c_str(), sizeof(options) / sizeof(*options), options, values)) INTERNAL ASSERT FAILED at "/tmp/pip-req-build-djl73tcc/tutel/custom/custom_kernel.cpp":214, please report a bug to PyTorch. CHECK_EQ fails.
Would you be able to provide any suggestions on these two errors? I am quite confused, because this is the same environment I used to run the hello_world.py scripts.
You need to run unset USE_NVRTC, since you may have explicitly set that variable before.
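If the variable is set by a job script and unsetting it in the shell is inconvenient, removing it from os.environ before the first tutel op is compiled may have the same effect; this is an assumption, not something verified in this thread.

import os

# Drop the stale setting so the JIT path is chosen automatically again
# (equivalent in spirit to running `unset USE_NVRTC` in the shell).
os.environ.pop("USE_NVRTC", None)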
Thank you! That completely resolves this problem. Closing the issue.
@hyhuang00 Can you help us test whether the latest version (#170) still works in your environment? We removed the approach of detecting a manually set CUDA_HOME environment variable, but the new way should be compatible with different environments more robustly.
The new version works on my machine without any error. I installed the package via
$ python3 -m pip install --user --upgrade git+https://github.com/microsoft/tutel@main
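A small sanity check after installing, based on the op that failed earlier in this thread; the import path comes from the traceback above, while the int32 input and the exact "cumsum along dim 0 minus one" semantics are assumptions.

import torch
from tutel.jit_kernels.gating import fast_cumsum_sub_one

# Compare the fused kernel against a plain PyTorch cumsum on a random 0/1 mask.
mask = torch.randint(0, 2, (8, 4), dtype=torch.int32, device="cuda")
out = fast_cumsum_sub_one(mask)
expected = torch.cumsum(mask.to(torch.int64), dim=0) - 1
print(torch.equal(out.to(torch.int64), expected))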
Thanks!