
Comments (5)

bmartinn commented on June 14, 2024

Yes you are correct, I'll make sure the error message will be corrected in the next RC.

Regarding using PyPI with torch, the problem is that it is unstable: for example, there is no way of knowing whether the torchvision on PyPI is the CPU or the GPU version.
Also, for the GPU version, the CUDA version changes from one torch release to another, so you end up with a driver mismatch for no good reason.

With all that said, if you know the correct version for your setup, you can simply replace torchvision==0.2.1 with a direct https link to the wheel:
https://files.pythonhosted.org/packages/ca/0d/f00b2885711e08bd71242ebe7b96561e6f6d01fdb4b9dcf4d37e2e13c5e1/torchvision-0.2.1-py2.py3-none-any.whl
This will work as long as it matches the CPU/CUDA version you are running.
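
As a rough sketch of that workaround: the helper below swaps the pinned line for one package in a requirements listing with a direct wheel URL (pip accepts a bare URL as a requirement line). The function name `pin_to_wheel_url` and the sample requirements are made up for illustration; this is not something trains-agent provides.

```python
def pin_to_wheel_url(requirements: str, package: str, wheel_url: str) -> str:
    """Replace the `package==version` line with a direct wheel URL.

    Hypothetical helper for illustration; all other lines are kept as-is.
    """
    out = []
    for line in requirements.splitlines():
        # The package name is whatever precedes the "==" pin.
        name = line.split("==")[0].strip().lower()
        out.append(wheel_url if name == package.lower() else line)
    return "\n".join(out)


# Wheel URL quoted in the comment above.
WHEEL = (
    "https://files.pythonhosted.org/packages/ca/0d/"
    "f00b2885711e08bd71242ebe7b96561e6f6d01fdb4b9dcf4d37e2e13c5e1/"
    "torchvision-0.2.1-py2.py3-none-any.whl"
)

# Sample requirements, assumed for the example.
reqs = "torch==1.3.1\ntorchvision==0.2.1"
print(pin_to_wheel_url(reqs, "torchvision", WHEEL))
```

The torch line is left untouched, so the agent can still pick that wheel itself.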

from clearml-agent.

H4dr1en commented on June 14, 2024

Regarding using PyPI with torch, the problem is that it is unstable: for example, there is no way of knowing whether the torchvision on PyPI is the CPU or the GPU version.
Also, for the GPU version, the CUDA version changes from one torch release to another, so you end up with a driver mismatch for no good reason.

Thank you for pointing that out, this definitely makes sense!

With all that said, if you know the correct version for your setup, you can simply replace torchvision==0.2.1 with a direct https link to the wheel:

Thanks for the workaround! I'll close as soon as the error is more explicit 👍

EDIT:
@H4dr1en, what trains-agent version are you using?
Which package manager is trains-agent using? See example here
What pip version limit is configured in trains.conf? See example here

trains-agent==0.14.2rc2
package manager = pip
pip version = 0.21


bmartinn commented on June 14, 2024

Hi @H4dr1en
Torch is a special case for trains-agent: since the good people of PyTorch maintain separate packages for different CUDA versions, trains-agent automatically selects the correct package based on the installed CUDA version.

Specifically, it seems that you are running without a GPU, so the CUDA version is 0.
It seems to find the correct package for torch==1.3.1 but fails on torchvision: it tries to download "torch", not "torchvision". Let me see if I can reproduce this behavior.
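
For illustration only: the URLs in the failing log elsewhere in this thread follow a `<base>/<cuda tag>/<package>-<version>-<python tag>-<abi tag>-<platform>.whl` pattern. The sketch below just rebuilds that pattern to show how a wrong package name produces the request seen in the log. It is not trains-agent's code; `wheel_url` and its default tags are hypothetical, and the second URL is not claimed to exist on the server.

```python
BASE = "http://download.pytorch.org/whl"  # base URL as it appears in the log


def wheel_url(package, version, cuda_tag="cu0",
              py_tag="cp37", abi_tag="cp37m", platform="linux_x86_64"):
    """Build a wheel URL following the naming pattern visible in the log.

    Hypothetical helper; the defaults mirror the tags in the log (no GPU -> cu0).
    """
    return f"{BASE}/{cuda_tag}/{package}-{version}-{py_tag}-{abi_tag}-{platform}.whl"


# What the log shows being requested: the torchvision version under the torch name.
print(wheel_url("torch", "0.2.1"))

# What the same pattern would give if the package name were carried through.
print(wheel_url("torchvision", "0.2.1"))
```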

EDIT:
@H4dr1en, what trains-agent version are you using?
Which package manager is trains-agent using? See example here
What pip version limit is configured in trains.conf? See example here


bmartinn commented on June 14, 2024

Hi @H4dr1en
Could you test with trains-agent 0.14.2rc2?

pip install trains-agent==0.14.2rc2

I think the problem is that there is no package for torchvision==0.2.0.
You can see the full list here: https://download.pytorch.org/whl/cpu/torch_stable.html

Notice that you can just reset the experiment and edit the requirements to the correct torchvision version :)
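
One way to check a version against such a list, as a minimal sketch: pull the versions for a package out of the wheel filenames in the page's HTML. The `sample` snippet is a small made-up excerpt in the style of that page, not its real contents, and `available_versions` is a hypothetical helper name.

```python
import re
from urllib.parse import unquote


def available_versions(index_html: str, package: str) -> set:
    """Collect versions of `package` from wheel filenames in an index page."""
    # Match "<package>-<version>-" where <package> is not part of a longer name.
    pattern = re.compile(
        r"(?<![\w.-])" + re.escape(package) + r'-([0-9][^-\s"<>]*)-'
    )
    # unquote() turns "%2B" in hrefs back into "+", e.g. "1.3.1+cpu".
    return set(pattern.findall(unquote(index_html)))


# Made-up excerpt for the example; paste the real page content to check for real.
sample = """
<a href="cpu/torch-1.3.1%2Bcpu-cp37-cp37m-linux_x86_64.whl">torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl</a>
<a href="cpu/torchvision-0.4.2%2Bcpu-cp37-cp37m-linux_x86_64.whl">torchvision-0.4.2+cpu-cp37-cp37m-linux_x86_64.whl</a>
"""

# Compare on the base version, ignoring the "+cpu" local tag.
base_versions = {v.split("+")[0] for v in available_versions(sample, "torchvision")}
print("0.2.0" in base_versions)
```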


H4dr1en commented on June 14, 2024

With trains-agent==0.14.2rc2 it also fails:

Collecting Cython
  Using cached Cython-0.29.17-cp37-cp37m-manylinux1_x86_64.whl (2.1 MB)
Installing collected packages: Cython
Successfully installed Cython-0.29.17
Collecting torch==1.3.1+cpu
  File was already downloaded /home/H4dr1en/.trains/pip-download-cache/cu0/torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl
Successfully downloaded torch
Collecting torch==0.2.1
  ERROR: HTTP error 403 while getting http://download.pytorch.org/whl/cu0/torch-0.2.1-cp37-cp37m-linux_x86_64.whl
  ERROR: Could not install requirement torch==0.2.1 from http://download.pytorch.org/whl/cu0/torch-0.2.1-cp37-cp37m-linux_x86_64.whl because of error 403 Client Error: Forbidden for url: http://download.pytorch.org/whl/cu0/torch-0.2.1-cp37-cp37m-linux_x86_64.whl
ERROR: Could not install requirement torch==0.2.1 from http://download.pytorch.org/whl/cu0/torch-0.2.1-cp37-cp37m-linux_x86_64.whl because of HTTP error 403 Client Error: Forbidden for url: http://download.pytorch.org/whl/cu0/torch-0.2.1-cp37-cp37m-linux_x86_64.whl for URL http://download.pytorch.org/whl/cu0/torch-0.2.1-cp37-cp37m-linux_x86_64.whl
trains_agent: ERROR: Could not download wheel name of "http://download.pytorch.org/whl/cu0/torch-0.2.1-cp37-cp37m-linux_x86_64.whl"
ERROR: Double requirement given: torch==0.2.1 from http://download.pytorch.org/whl/cu0/torch-0.2.1-cp37-cp37m-linux_x86_64.whl (from -r /tmp/cached-reqsx0eu_ber.txt (line 2)) (already in torch==1.5.0+cpu from file:///home/H4dr1en/.trains/pip-download-cache/cu0/torch-1.5.0%2Bcpu-cp37-cp37m-linux_x86_64.whl (from -r /tmp/cached-reqsx0eu_ber.txt (line 1)), name='torch')
trains_agent: ERROR: Could not install task requirements!
Command '['/home/H4dr1en/.trains/venvs-builds/3.7/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/tmp/cached-reqsx0eu_ber.txt']' returned non-zero exit status 1.
DONE: Running task '63d740ab6fbd4178ad55243df1c4cf07', exit status 1

I think the problem is that there is no package for torchvision==0.2.0

Would it be reasonable to install torchvision (and torch) from the PyPI repo as a fallback when trains-agent cannot infer the package from the CUDA and torch/torchvision versions?

In any case, the error should be more meaningful (it is currently misleading, since it tries to install torch, not torchvision, with the version specified for torchvision).
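
To make the fallback suggestion concrete, here is a minimal sketch, assuming some lookup that returns a wheel URL or `None`. Both `resolve_requirement` and the `known_wheels` table are hypothetical stand-ins, not trains-agent internals: when no matching wheel is found, the original requirement is passed through unchanged so pip resolves it from its configured index (PyPI by default).

```python
from typing import Callable, Optional


def resolve_requirement(req: str,
                        find_wheel: Callable[[str], Optional[str]]) -> str:
    """Return a direct wheel URL if one is known, else the requirement itself."""
    url = find_wheel(req)
    return url if url is not None else req


# Hypothetical lookup table standing in for the CUDA-aware wheel search.
known_wheels = {
    "torch==1.3.1": "file:///path/to/cache/torch-1.3.1+cpu-cp37-cp37m-linux_x86_64.whl",
}

for req in ["torch==1.3.1", "torchvision==0.2.1"]:
    # torchvision has no entry above, so it falls back to the plain requirement.
    print(resolve_requirement(req, known_wheels.get))
```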

