Comments (9)

Lunar13737 commented on June 8, 2024

I encountered the same problem and solved it by downgrading nvidia-ml-py to an earlier version, 11.525.112, using pip install nvidia-ml-py==11.525.112. I hope this helps.

from gpustat.

PyroGenesis commented on June 8, 2024

+1 same error

  • OS: Windows 10 Enterprise (Version: 2004, OS build: 19041.264)
  • NVIDIA Driver version: 536.99
  • The name(s) of GPU card: NVIDIA GeForce RTX 4090 x 2
  • gpustat version: gpustat 1.1.1

Thanks for the workaround, @Lunar13737, it worked for me.

PyroGenesis commented on June 8, 2024

@wookayin I think it was most likely 12.535.77 that caused the error, though I'm not 100% sure because I didn't keep a record of it. I downgraded to 11.525.112 which worked, and now 12.535.108 works too.

wookayin commented on June 8, 2024

Thanks. I can conclude that the root cause of this bug is essentially the same as #161: one should use neither nvidia-ml-py=11.535.77 nor the broken NVIDIA drivers >= 535.43, < 535.98.

gpustat will print a warning when any of these versions of the NVML library or the driver is detected, so we can close this issue without adding an unnecessary compatibility layer.
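For readers who want to check their own setup against the versions named above, here is a minimal sketch in Python. It is not gpustat's actual detection code; the function and constant names are hypothetical. The driver range is taken from the comment above, and because this thread gives the problematic nvidia-ml-py release both as 11.535.77 and as 12.535.77, the sketch flags both.

```python
# Hypothetical helper, not part of gpustat: flag the versions named in this thread.

# This thread mentions both spellings of the suspect nvidia-ml-py release.
SUSPECT_NVML_PY = {"11.535.77", "12.535.77"}

# Driver range quoted in the thread: >= 535.43 and < 535.98.
BROKEN_DRIVER_MIN = (535, 43)  # inclusive lower bound
BROKEN_DRIVER_MAX = (535, 98)  # exclusive upper bound


def parse_version(text):
    """Turn a dotted version such as '535.98' into the tuple (535, 98)."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_in_broken_range(driver_version):
    """Return True if the driver version falls in the range reported as broken."""
    version = parse_version(driver_version)
    return BROKEN_DRIVER_MIN <= version < BROKEN_DRIVER_MAX


def check_versions(nvml_py_version, driver_version):
    """Return a list of human-readable warnings (empty if nothing is flagged)."""
    warnings = []
    if nvml_py_version in SUSPECT_NVML_PY:
        warnings.append(
            f"nvidia-ml-py {nvml_py_version} is reported as buggy in this thread"
        )
    if driver_in_broken_range(driver_version):
        warnings.append(
            f"NVIDIA driver {driver_version} is in the range >= 535.43, < 535.98"
        )
    return warnings


if __name__ == "__main__":
    for message in check_versions("12.535.77", "535.86"):
        print("WARNING:", message)
```

The version strings are passed in as arguments so the check can be run without a GPU; on a real system the driver version is shown by nvidia-smi.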

mjmikulski commented on June 8, 2024

+1 and the workaround with downgrading nvidia-ml-py did not work for me :(

  • OS: Windows 11 Pro N
  • NVIDIA Driver Version: 535.98, CUDA Version: 12.2
  • GPU: NVIDIA GeForce RTX 4070
  • gpustat version: gpustat 1.1.1

Any hints?

wookayin commented on June 8, 2024

I'd like to reproduce this issue to have a correct fix. But I've never seen the issue.

What we know from #161 (comment):

  • nvidia-ml-py=11.535.77 is buggy and only works with drivers 535.43 and 535.86 (the OP's case):
    • Does the problem go away if you install nvidia-ml-py==12.535.108? @JensWendt
  • It looks like nvidia-ml-py 12.535.108 should fix all the bugs related to process information by reverting the breaking changes made in the previous versions. But this is just my guess, and I'm not sure. I would need to know which nvidia-ml-py version is installed on your system.

@Lunar13737, @PyroGenesis, @mjmikulski thanks for the data points. Could you please try upgrading to nvidia-ml-py==12.535.108 and see if the OverflowError is gone?
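To report which nvidia-ml-py version is installed, as requested above, something like the following sketch could be used. It relies only on the standard-library importlib.metadata module; the helper name installed_version is made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as used with pip in this thread.
    for name in ("nvidia-ml-py", "gpustat"):
        print(name, installed_version(name) or "not installed")
```

Running pip show nvidia-ml-py in the same environment should give the same information.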

PyroGenesis commented on June 8, 2024

> Could you please try upgrading nvidia-ml-py==12.535.108 and see if the OverflowError is gone?

@wookayin I can confirm, overflow error does not occur in nvidia-ml-py 12.535.108

wookayin commented on June 8, 2024

@PyroGenesis Thanks. What was the previous version of nvidia-ml-py that resulted in this bug?

Lunar13737 commented on June 8, 2024

@wookayin nvidia-ml-py 12.535.108 works for me, no overflow error
