
Comments (7)

NOBLES5E commented on July 30, 2024

It seems that LooseVersion is deprecated: pydata/xarray#6092

We have removed it from the master branch. Install the latest master branch version and see if it works.
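For readers hitting the same deprecation in their own code, here is a minimal sketch of a version comparison that does not rely on distutils. The helper name `release_tuple` is hypothetical and this is not how bagua itself fixed the issue; it only compares the leading numeric part of a version string and ignores pre-release suffixes. The deprecation notice for distutils (PEP 632) points to the third-party `packaging` library for full version parsing.

```python
import re


def release_tuple(version: str) -> tuple:
    """Hypothetical helper: leading numeric release part as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric release segment in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


# Tuples compare element by element, so 1.10.0 sorts after 1.9.1,
# which a plain string comparison would get wrong.
print(release_tuple("1.10.0") >= release_tuple("1.9.1"))

# Suffixes such as ".dev187" are dropped by this simplified helper.
print(release_tuple("0.8.3.dev187"))
```

Because the suffix is discarded, this sketch treats a dev build and the final release of the same number as equal, so it is only suitable for coarse minimum-version checks.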

from bagua.

silverCore97 commented on July 30, 2024

Is it correct to use "pip install bagua-cuda113" to install the latest master branch version? I have tried it, but the problem still exists.


NOBLES5E commented on July 30, 2024

Try: python3 -m pip install --pre bagua-cuda113==0.8.3.dev187 --upgrade


silverCore97 commented on July 30, 2024

Now I get a different error message:

Traceback (most recent call last):
  File "main.py", line 22, in <module>
    import bagua.torch_api as bagua
  File "/cluster/home/kqian/.local/lib/python3.7/site-packages/bagua/torch_api/__init__.py", line 51, in <module>
    from .distributed import BaguaModule  # noqa: F401
  File "/cluster/home/kqian/.local/lib/python3.7/site-packages/bagua/torch_api/distributed.py", line 21, in <module>
    @gorilla.patches(torch.nn.Module, filter=lambda name, obj: "bagua" in name)
  File "/cluster/home/kqian/.local/lib/python3.7/site-packages/bagua/torch_api/distributed.py", line 117, in BaguaModule
    @property
  File "/cluster/home/kqian/.local/lib/python3.7/site-packages/torch/_jit_internal.py", line 373, in unused
    fn._torchscript_modifier = FunctionModifiers.UNUSED
AttributeError: 'property' object has no attribute '_torchscript_modifier'
Killing subprocess 10939
Traceback (most recent call last):
  File "/cluster/apps/nss/python/3.7.4/x86_64/lib64/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/cluster/apps/nss/python/3.7.4/x86_64/lib64/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/cluster/home/kqian/.local/lib/python3.7/site-packages/bagua/distributed/launch.py", line 343, in <module>
    main()
  File "/cluster/home/kqian/.local/lib/python3.7/site-packages/bagua/distributed/launch.py", line 328, in main
    sigkill_handler(signal.SIGTERM, None)  # not coming back
  File "/cluster/home/kqian/.local/lib/python3.7/site-packages/bagua/distributed/launch.py", line 292, in sigkill_handler
    returncode=last_return_code, cmd=cmd
subprocess.CalledProcessError: Command '['/cluster/apps/nss/python/3.7.4/x86_64/bin/python3', '-u', 'main.py', '--algorithm', 'gradient_allreduce']' returned non-zero exit status 1.
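
For anyone reading the first traceback: the AttributeError is raised when a decorator tries to set an attribute on a `property` object, which Python does not allow. The sketch below reproduces that class of error without PyTorch. `tag_unused` is a hypothetical stand-in for a decorator that tags its argument, and the class names are made up. It illustrates the mechanism only and is not a diagnosis of this particular install.

```python
def tag_unused(fn):
    """Hypothetical stand-in for a decorator that tags its argument."""
    fn._modifier = "unused"  # fails if fn is a property object
    return fn


class Works:
    # The tag is applied to the plain function first, then wrapped as a property.
    @property
    @tag_unused
    def value(self):
        return 1


def build_broken_class():
    # Here the tag is applied to the property object itself,
    # which has no __dict__ and rejects new attributes.
    class Broken:
        @tag_unused
        @property
        def value(self):
            return 1

    return Broken


try:
    build_broken_class()
except AttributeError as exc:
    print("AttributeError:", exc)

print(Works().value)
```

Whether a given library version accepts a property in this position depends on how its decorator is written, which is one reason the installed versions matter when comparing against CI.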


NOBLES5E commented on July 30, 2024

That looks weird. The CI runs the examples just fine: https://buildkite.com/bagua/bagua-gpu-test/builds/2220

Which PyTorch version are you using?
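
One way to collect that information in a form that can be pasted into the thread is sketched below. It assumes Python 3.8 or newer, since `importlib.metadata` is not in the standard library on the Python 3.7 shown in the traceback. The function names and the default distribution names are illustrative choices, not part of bagua.

```python
import sys
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names=("torch", "bagua-cuda113")) -> str:
    """Build a short text report of the interpreter and package versions."""
    lines = [f"python: {sys.version.split()[0]}"]
    for name in dist_names:
        version = installed_version(name)
        lines.append(f"{name}: {version if version is not None else 'not installed'}")
    return "\n".join(lines)


print(environment_report())
```

Running it with the same interpreter that launches the training script matters, because a cluster can expose several Python installations with different site-packages.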


silverCore97 commented on July 30, 2024

It was apparently a problem with the cluster I was using.


Godricly commented on July 30, 2024

I got the same problem in gitlab-runner with bagua-cuda113. Could you please also upgrade the cuda113 release to 0.9.1? The release history on PyPI is so weird.

