Comments (10)

hyhuang00 commented on May 16, 2024

Sure, I'm happy to help. Let me try out the new version and I'll let you know if it works for me.

from tutel.

ghostplant commented on May 16, 2024

Your CUDA environment does not seem to be installed in the default location (e.g. /usr/local/cuda/include). Can you print the value of CUDA_HOME? BTW, you can also check whether export USE_NVRTC=0 helps.
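
For anyone hitting the same symptom, here is a minimal diagnostic sketch in Python that reports the two variables mentioned above and checks whether an include directory exists under CUDA_HOME. The helper name describe_cuda_env is hypothetical and not part of Tutel; it only inspects the environment and does not reflect how Tutel itself locates CUDA.

```python
import os


def describe_cuda_env(env=None):
    """Summarize the CUDA-related settings discussed in this thread.

    Hypothetical helper for illustration; not part of Tutel.
    """
    env = os.environ if env is None else env
    cuda_home = env.get("CUDA_HOME")
    # The CUDA headers are expected under <CUDA_HOME>/include.
    include_dir = os.path.join(cuda_home, "include") if cuda_home else None
    return {
        "CUDA_HOME": cuda_home,
        "USE_NVRTC": env.get("USE_NVRTC"),
        "include_dir": include_dir,
        "include_dir_exists": bool(include_dir) and os.path.isdir(include_dir),
    }


if __name__ == "__main__":
    for key, value in describe_cuda_env().items():
        print(f"{key}: {value}")
```

If CUDA_HOME is unset, or include_dir_exists is False, the toolkit is probably not where the build expects it.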

hyhuang00 commented on May 16, 2024

Thank you for your prompt reply! Yes, my CUDA environment is not installed in the default location because I'm using a shared computation cluster. Is there a parameter I can set to ensure the compiler can find the correct CUDA? I will also try export USE_NVRTC=0.

$ echo $CUDA_HOME
/public/apps/cuda/11.3

ghostplant commented on May 16, 2024

We just merged a PR that parses CUDA_HOME from the environment. Can you check whether it works for you?
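
As a rough illustration of the idea (not the code from that PR), reading CUDA_HOME with a fallback to the default location could look like the sketch below. The function name resolve_cuda_include and the constant DEFAULT_CUDA_HOME are assumptions made for this example.

```python
import os

# Default install prefix mentioned earlier in this thread.
DEFAULT_CUDA_HOME = "/usr/local/cuda"


def resolve_cuda_include(env=None, default_home=DEFAULT_CUDA_HOME):
    """Return the CUDA include directory, preferring CUDA_HOME when it is set.

    Hypothetical sketch; Tutel's actual lookup may differ.
    """
    env = os.environ if env is None else env
    # An empty CUDA_HOME is treated the same as an unset one.
    cuda_home = env.get("CUDA_HOME") or default_home
    return os.path.join(cuda_home, "include")
```

With CUDA_HOME=/public/apps/cuda/11.3, as on the cluster above, this would point at the include directory under that prefix rather than under /usr/local/cuda.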

hyhuang00 commented on May 16, 2024

Thank you! The fix works, and I think CUDA_HOME is now found correctly. I can successfully run hello_world.py and hello_world_ddp.py under the examples folder without any error. However, when I tried to use it with fairseq (the use case is here), I got the following two errors:

  • Compilation error: (it seems the CUDA compiler worked, otherwise there wouldn't be the second error)
[W custom_kernel.cpp:158] nvrtc: error: unrecognized option --includ[unprintable bytes] found
Failed to use NVRTC for JIT compilation in this Pytorch version, try another approach using CUDA compiler.. (To always disable NVRTC, please: export USE_NVRTC=0)
  • RuntimeError
  File "/private/home/hyhuang/.conda/envs/newnllb/lib/python3.9/site-packages/fairseq-1.0.0a0+b1b3eda-py3.9-linux-x86_64.egg/fairseq/modules/moe/top2gate.py", line 234, in top2gating
    locations1 = fused_cumsum_sub_one(mask1)
  File "/private/home/hyhuang/.local/lib/python3.9/site-packages/tutel/jit_kernels/gating.py", line 22, in fast_cumsum_sub_one
    return torch.ops.tutel_ops.cumsum(data)
RuntimeError: (0) == (cuModuleLoadDataEx(&hMod, image.c_str(), sizeof(options) / sizeof(*options), options, values))INTERNAL ASSERT FAILED at "/tmp/pip-req-build-djl73tcc/tutel/custom/custom_kernel.cpp":214, please report a bug to PyTorch. CHECK_EQ fails.

Would you be able to provide any suggestions for these two errors? I am confused, because this is the same environment I used to run the hello_world.py scripts.

ghostplant commented on May 16, 2024

You need to run unset USE_NVRTC, since you may have explicitly set that variable earlier.
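
If the variable is set by a launcher script you cannot easily edit, a Python-side equivalent of unset is sketched below. The helper name clear_nvrtc_override is hypothetical, and the sketch assumes it is called before the library reads USE_NVRTC (for example, at the top of the training script, before importing fairseq or tutel).

```python
import os


def clear_nvrtc_override(env=None):
    """Remove USE_NVRTC from the environment and return its previous value.

    Hypothetical helper for illustration; returns None if it was not set.
    """
    env = os.environ if env is None else env
    # Popping from os.environ also unsets the variable for this process.
    return env.pop("USE_NVRTC", None)


if __name__ == "__main__":
    previous = clear_nvrtc_override()
    print(f"USE_NVRTC was: {previous!r}")
```

This only affects the current process and any children it starts; the parent shell still needs unset USE_NVRTC to stop exporting the variable.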

hyhuang00 commented on May 16, 2024

Thank you! That completely resolves this problem. Closing the issue.

ghostplant commented on May 16, 2024

@hyhuang00 Can you help us test whether the latest version (#170) still works in your environment? We removed the manual detection of the CUDA_HOME environment variable, but the new approach should be more robustly compatible with different environments.

hyhuang00 commented on May 16, 2024

The new version works on my machine without any error. I installed the package via $ python3 -m pip install --user --upgrade git+https://github.com/microsoft/tutel@main

ghostplant commented on May 16, 2024

Thanks!
