
gpustat's Introduction

gpustat


Just less than nvidia-smi?

Screenshot: gpustat -cp

NOTE: This works with NVIDIA Graphics Devices only, no AMD support as of now. Contributions are welcome!

Self-Promotion: A web interface of gpustat is available (in alpha)! Check out gpustat-web.

Quick Installation

Install from PyPI:

pip install gpustat

If you don't have root (sudo) privileges, try installing gpustat in user space: pip install --user gpustat.

To install the latest version (master branch) via pip:

pip install git+https://github.com/wookayin/gpustat.git@master

NVIDIA Driver and pynvml Requirements

Important

DO NOT pip install pynvml or include pynvml as a dependency in your Python project. This will not work.

Instead, pip install nvidia-ml-py. nvidia-ml-py is NVIDIA's official Python binding for NVML; a minimal query sketch follows the version notes below.

  • gpustat 1.2+: Requires nvidia-ml-py >= 12.535.108 (#161)
  • gpustat 1.0+: Requires NVIDIA Driver 450.00 or higher and nvidia-ml-py >= 11.450.129.
  • If your NVIDIA driver is too old, you can use an older gpustat version (pip install 'gpustat<1.0'). See #107 for more details.
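
For reference, here is a minimal sketch of querying NVML directly through nvidia-ml-py (the same binding gpustat relies on). It is illustrative only; for example, nvmlDeviceGetName returns bytes on older binding versions and str on newer ones.

# Minimal sketch: list GPUs via NVML using nvidia-ml-py (which installs the `pynvml` module).
import pynvml

pynvml.nvmlInit()
try:
    for index in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        name = pynvml.nvmlDeviceGetName(handle)          # bytes on older bindings, str on newer
        memory = pynvml.nvmlDeviceGetMemoryInfo(handle)  # .used / .total are in bytes
        print(f"[{index}] {name}: {memory.used // 2**20} / {memory.total // 2**20} MB")
finally:
    pynvml.nvmlShutdown()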

Python requirements

  • gpustat<1.0: Compatible with python 2.7 and >=3.4
  • gpustat 1.0: Python >= 3.4
  • gpustat 1.1: Python >= 3.6

Usage

$ gpustat

Options (Please see gpustat --help for more details):

  • --color : Force colored output (even when stdout is not a tty)
  • --no-color : Suppress colored output
  • -u, --show-user : Display username of the process owner
  • -c, --show-cmd : Display the process name
  • -f, --show-full-cmd : Display the full command and CPU stats of running processes
  • -p, --show-pid : Display PID of the process
  • -F, --show-fan : Display GPU fan speed
  • -e, --show-codec : Display encoder and/or decoder utilization
  • -P, --show-power : Display GPU power usage and/or limit (draw or draw,limit)
  • -a, --show-all : Display all gpu properties above
  • --id : Target and query specific GPUs only with the specified indices (e.g. --id 0,1,2)
  • --no-processes : Do not display process information (user, memory) (#133)
  • --watch, -i, --interval : Run in watch mode (equivalent to watch gpustat) if given. Denotes interval between updates.
  • --json : JSON output (#10); a parsing example follows this list
  • --print-completion (bash|zsh|tcsh) : Print a shell completion script. See #131 for usage.
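
The --json output can be consumed from another script. A hedged sketch follows; only the top-level "gpus" key is assumed here (it appears in gpustat's JSON serialization), and each per-GPU entry is printed as-is because exact field names can vary between versions.

# Hedged sketch: read `gpustat --json` from a separate script.
import json
import subprocess

output = subprocess.check_output(["gpustat", "--json"], text=True)
report = json.loads(output)
for gpu in report.get("gpus", []):
    print(gpu)  # each entry is a dict of that GPU's properties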

Tips

  • Try gpustat --debug if something goes wrong.
  • To periodically watch, try gpustat --watch or gpustat -i (#41).
    • For older versions, one may use watch --color -n1.0 gpustat --color.
  • Running nvidia-smi daemon (root privilege required) will make querying GPUs much faster and use less CPU (#54).
  • The GPU ID (index) shown by gpustat (and nvidia-smi) is the PCI BUS ID, while by default CUDA uses a different ordering (it assigns the lowest ID to the fastest GPU). To ensure CUDA and gpustat use the same GPU index, set the CUDA_DEVICE_ORDER environment variable to PCI_BUS_ID (before setting CUDA_VISIBLE_DEVICES for your CUDA program): export CUDA_DEVICE_ORDER=PCI_BUS_ID. A short Python sketch follows this list.
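
The same tip, sketched from inside a Python program. This is a hedged example: the environment variables must be set before any CUDA context is created, and the trailing comment about framework imports is only a placeholder.

# Sketch: align CUDA device numbering with gpustat/nvidia-smi (PCI bus order),
# then restrict the program to GPUs 0 and 1 as reported by gpustat.
import os

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"  # indices as shown by gpustat
# ... import your CUDA framework and create contexts only after this point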

Default display

[0] GeForce GTX Titan X | 77°C,  96 % | 11848 / 12287 MB | python/52046(11821M)
  • [0]: GPU index (starts from 0) as PCI_BUS_ID
  • GeForce GTX Titan X: GPU name
  • 77°C: GPU Temperature (in Celsius)
  • 96 %: GPU Utilization
  • 11848 / 12287 MB: GPU Memory Usage (Used / Total)
  • python/...: Running processes on GPU, owner/cmdline/PID (and their GPU memory usage)

Changelog

See CHANGELOG.md

License

MIT License

gpustat's Issues

Display driver version in the header

Introduce a --header option that specifies whether each of {host,time,driver} is displayed in the header, so the driver version can be shown:

e.g.

$ gpustat        # by default, only {host,time} are shown
$ gpustat --header host,time,driver

which prints:

workstation          Tue Oct 30 20:50:35 2018  396.54
[0] TITAN Xp         | 41'C,   0 % |  4479 / 12196 MB | user(4469M)
[1] TITAN Xp         | 38'C, 100 % |  4479 / 12196 MB | user(4469M)
  • What would be the best format?
  • Should we make it configurable (see #51)?

Problem with pynvml dependency and python 3.6

When I try to run this using Python 3.6.4 (Anaconda) I get:

    import pynvml as N
  File "${PYTHONPATH}/lib/python3.6/site-packages/pynvml.py", line 1831
    print c_count.value

This can be fixed by using the module py3nvml instead and replacing line 24 in gpustat/core.py with

from py3nvml import py3nvml as N

Command not found

I installed gpustat, but when I try to run it I get "command not found".

My system: Kubuntu 18.04

Document output values

Could the documentation describe the output, or maybe add a heading row to the output? In particular, what does the second column (the percentage value) mean?

That's cool

That's cool

But it may be based on nvidia-smi ...

I need one that is independent of nvidia-smi.

Thanks at any rate.

rxvt-unicode support

I have a very strange situation: I currently use urxvt as a terminal emulator, and gpustat does not start from a bare terminal, raising:

Traceback (most recent call last):
  File "/home/rizhiy/anaconda3/bin/gpustat", line 7, in <module>
    from gpustat import main
  File "/home/rizhiy/anaconda3/lib/python3.6/site-packages/gpustat.py", line 32, in <module>
    class GPUStat(object):
  File "/home/rizhiy/anaconda3/lib/python3.6/site-packages/gpustat.py", line 156, in GPUStat
    term=Terminal(),
  File "/home/rizhiy/anaconda3/lib/python3.6/site-packages/blessings/__init__.py", line 105, in __init__
    self._init_descriptor)
_curses.error: setupterm: could not find terminal

So it looks like blessings does not recognise urxvt properly.

But when I start gpustat from within tmux, it starts OK but doesn't use the correct colors (e.g. it uses standard black instead of bold_black).

What version of blessings are you using? Does this problem exist in their current version?

ValueError: invalid literal for int() with base 10: '[Not Supported]'

Running gpustat gives the following output:

Traceback (most recent call last):
  File "/home/~~~/anaconda2/bin/gpustat", line 11, in <module>
    sys.exit(main())
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 283, in main
    print_gpustat(**vars(args))
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 242, in print_gpustat
    gpu_stats.update_process_information()
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 196, in update_process_information
    processes = self.running_processes()
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 166, in running_processes
    pid_map = {int(e['pid']) : None for e in process_entries}
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 166, in <dictcomp>
    pid_map = {int(e['pid']) : None for e in process_entries}
ValueError: invalid literal for int() with base 10: '[Not Supported]'
  • Ubuntu 14.04
  • Python 2.7 with Anaconda 4
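
A minimal sketch of the kind of guard that would avoid this crash, assuming the field names from the traceback above; on some setups nvidia-smi reports the PID as the literal string '[Not Supported]', so non-numeric entries are skipped instead of being passed to int().

# Hedged sketch: build the pid map defensively.
def safe_pid_map(process_entries):
    pid_map = {}
    for entry in process_entries:
        pid = str(entry.get('pid', '')).strip()
        if pid.isdigit():
            pid_map[int(pid)] = None  # same placeholder value as in the original code
    return pid_map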

Fully configurable output format

From @neighthan's suggestion:

It would be nice to have a full-on format string like, e.g., git log does so the user could specify which gpu parameters to show, in which order, with which colors.

To this end, we would also need to introduce a config file (e.g. ~/.config/gpustat/gpustat.cfg), or maybe use an environment variable? A purely hypothetical sketch follows below.
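
A purely hypothetical sketch of picking a format string up from an environment variable: the variable name GPUSTAT_FORMAT and the % placeholders are invented for illustration and are not implemented in gpustat.

# Purely hypothetical: neither GPUSTAT_FORMAT nor these placeholders exist in gpustat.
import os

DEFAULT_FORMAT = "[%index] %name | %temperature, %utilization | %memory"
fmt = os.environ.get("GPUSTAT_FORMAT", DEFAULT_FORMAT)  # hypothetical variable

# A real implementation would substitute live NVML values; these are sample values.
sample = {"%index": "0", "%name": "TITAN Xp", "%temperature": "41'C",
          "%utilization": "0 %", "%memory": "4479 / 12196 MB"}
for placeholder, value in sample.items():
    fmt = fmt.replace(placeholder, value)
print(fmt)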

Prettier output

It would be nicer if we could get the pipes | aligned for multiple gpu systems 👍
Currently

[0]  Tesla K40c | 33'C,   0 % |     0 / 11439 MB |
[1] GeForce GTX 1080 | 35'C,   0 % |    52 /  8113 MB |

Proposed

[0]  Tesla K40c      | 33'C,   0 % |     0 / 11439 MB |
[1] GeForce GTX 1080 | 35'C,   0 % |    52 /  8113 MB |

Error trying to run gpustat: NVMLError_LibraryNotFound

Full message with --debug and Python3:

Error on querying NVIDIA devices. Use --debug flag for details
Traceback (most recent call last):
  File "/usr/local/bin/gpustat", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/gpustat.py", line 502, in main
    print_gpustat(**vars(args))
  File "/usr/local/lib/python3.6/dist-packages/gpustat.py", line 459, in print_gpustat
    traceback.print_exc(file=sys.stderr)
  File "/usr/lib/python3.6/traceback.py", line 159, in print_exc
    print_exception(*sys.exc_info(), limit=limit, file=file, chain=chain)
  File "/usr/lib/python3.6/traceback.py", line 100, in print_exception
    type(value), value, tb, limit=limit).format(chain=chain):
  File "/usr/lib/python3.6/traceback.py", line 462, in __init__
    _seen.add(exc_value)
TypeError: unhashable type: 'NVMLError_LibraryNotFound'

I have also tried with Python2, just in case. Similar error:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gpustat.py", line 454, in print_gpustat
    gpu_stats = GPUStatCollection.new_query()
  File "/usr/local/lib/python2.7/dist-packages/gpustat.py", line 254, in new_query
    N.nvmlInit()
  File "/usr/local/lib/python2.7/dist-packages/pynvml.py", line 747, in nvmlInit
    _LoadNvmlLibrary()
  File "/usr/local/lib/python2.7/dist-packages/pynvml.py", line 785, in _LoadNvmlLibrary
    _nvmlCheckReturn(NVML_ERROR_LIBRARY_NOT_FOUND)
  File "/usr/local/lib/python2.7/dist-packages/pynvml.py", line 405, in _nvmlCheckReturn
    raise NVMLError(ret)
NVMLError_LibraryNotFound: NVML Shared Library Not Found

My guess is that I am missing some dependency, path, or correct version, but I could not figure out what the problem is yet. TBH I do not know how many layers of software are involved when running gpustat. I have the driver, its dev version, cuda, cuda toolkit,...

I hope/guess it is an easy to fix noob mistake. Do you have any idea about the possible cause? Would you like additional logs, version numbers, or anything?

PS: I had a bunch of NVIDIA things at version 387 and now I have moved to 384. I think in that process I have also moved from NVIDIA packages to Ubuntu packages. After everything, it works, but that is all I know.

Not supported: no process information

What I tried

➜ gpustat
v1rtl-CR70-2M-CX70-2OC-CX70-2OD  Fri Oct  4 21:22:05 2019  390.116
[0] GeForce GT 740M  | 64'C,  ?? % |   360 /  2004 MB | (Not Supported)

Problem

gpustat doesn't show anything

Reason

nvidia-smi also shows "Not supported"

Details

OS Ubuntu 18.04
GPU NVIDIA GT 740M

Power usage

Hi,
How can I add power usage and efficiency information?
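
In current gpustat this is covered by the -P/--show-power option listed above. For reference, a hedged sketch of the underlying NVML calls via nvidia-ml-py; both nvmlDeviceGetPowerUsage and nvmlDeviceGetPowerManagementLimit report milliwatts.

# Hedged sketch: query power draw and power limit for GPU 0 via NVML.
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    draw_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0            # mW -> W
    limit_w = pynvml.nvmlDeviceGetPowerManagementLimit(handle) / 1000.0  # mW -> W
    print(f"{draw_w:.0f} / {limit_w:.0f} W")
finally:
    pynvml.nvmlShutdown()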

Error in running_processes()

I get the following error when calling gpustat while an actual GPU process is running. If no process involving GPUs is running, the output is correct as expected.

Error:

$ ./gpustat
Traceback (most recent call last):
  File "./gpustat", line 260, in <module>
    main(args)
  File "./gpustat", line 224, in main
    gpu_stats.update_process_information()
  File "./gpustat", line 184, in update_process_information
    processes = self.running_processes()
  File "./gpustat", line 179, in running_processes
    process_entry.update(pid_map[pid])
TypeError: 'NoneType' object is not iterable

The nvidia-smi output at the time is correct:

|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     11812    C   ./juspic.x                                      96MiB |

Any ideas why the list seems to be empty? (I have not yet debugged it myself.)

Monitoring as a daemon mode

Currently, one has to use watch --color -n0.1 gpustat for periodic watching.
A built-in daemon mode (collecting some statistics as well) is planned.

offline installation does not work

I tried pip install ... with --no-index, but this does not help: the installation process still tries to contact the PyPI server. The machine does not have internet access. I can install any other package in offline mode with conda or pip.

Displaying username on dark terminals

Currently, black is used to display the username. This makes it impossible to see anything on dark terminals. Any chance a different color (other than black or white) could be used?

Maximum Sampling Rate of GPU Power

Hey! This might be slightly unrelated, but I was wondering what the maximum sampling rate of calls to nvmlDeviceGetPowerUsage would be. It must be limited by the sampling rate of the power measurement hardware on the GPU. Do you know about this based on your usage of the NVML API?

Thanks!

gpustat --json gives an error with old GPU (GTX 660)

Error

$ gpustat --json

Traceback (most recent call last):
  File "/home/miczi/miniconda3/envs/th/bin/gpustat", line 11, in <module>
    sys.exit(main())
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 502, in main
    print_gpustat(**vars(args))
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 463, in print_gpustat
    gpu_stats.print_json(sys.stdout)
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 434, in print_json
    o = self.jsonify()
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 424, in jsonify
    "gpus" : [g.jsonify() for g in self]
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 424, in <listcomp>
    "gpus" : [g.jsonify() for g in self]
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 237, in jsonify
    for p in self.entry['processes']]
TypeError: 'NoneType' object is not iterable

How to reproduce (on gpustat 0.4.1):

pip install gpustat
gpustat --json

Demo:

https://asciinema.org/a/4XmXvHBkE7gv658VxIQrYzmie
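
A minimal sketch of the guard that sidesteps this particular crash, assuming (as the traceback suggests) that the 'processes' entry can be None on GPUs that do not report per-process information.

# Hedged sketch: serialize the process list defensively when it may be None.
def jsonify_processes(entry):
    processes = entry.get('processes') or []   # treat None as an empty list
    return [dict(p) for p in processes]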

Let --gpuname-width clip gpu names longer than the specified width

I guess the behaviour should change the output from

$ gpustat
[0] GeForce GTX 1080 Ti | 32'C, 0 % | 10795 / 11172 MB

to

$ gpustat --gpuname-width 10
[0] GeForce GT | 32'C, 0 % | 10795 / 11172 MB

but it doesn't seem to do anything.

version: gpustat 0.5.0.dev1
os: Ubuntu 16.04.5 LTS
python: Python 3.6.4 :: Anaconda, Inc.

D(isk sleep) status when using gpustat -i

When I use gpustat -i to watch my GPU usage over SSH and my local computer shuts down unexpectedly, I find on reconnecting to the server that the gpustat process has entered D (disk sleep) status and I can't kill it. I have no permission to reboot; how can I solve this?

Feature request: Cluster GPU monitoring

Hi,

Thank you for your great tool for monitoring GPU usage, showing much clearer information than nvidia-smi.

Feature

Currently, I am working on a cluster with multiple GPU nodes. Do you think it would be a good idea to support monitoring for multiple nodes on one node? The use case is described as follows.

Use case

For a user with two nodes n1 and n2, it would be great if we can run the following command to show the GPU usage on all nodes.

gpustat -a -i 1 -s n1,n2

Displaying:
n1 time cuda version
GPU information

n2 time cuda version
GPU information

Possible Implementation

One of the possible implementations would be to start a gpustat service on each given node by sending commands with ssh and then on the master node, collect the information from each server periodically.

Expose GPU enc/dec utilization

Looks like the utilization.gpu value right now is the one shown via nvidia-smi (as GPU-Util).

If one runs nvidia-smi dmon instead, it also exposes a percentage representing encoding/decoding utilization (under the enc and dec columns).

Any chance of including those in gpustat? I'm running into a case where enc is at or near 100%, but the general GPU utilization is at a much lower ~25%, so it can't effectively be used as a proxy.

Edit: I see gpustat is using nvmlDeviceGetUtilizationRates, but the API also exposes nvmlDeviceGetEncoderUtilization and nvmlDeviceGetDecoderUtilization, which would probably be what we're looking for.
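
A hedged sketch of those two NVML calls via nvidia-ml-py; each one returns the utilization percentage together with its sampling period in microseconds.

# Hedged sketch: query encoder/decoder utilization for GPU 0 via NVML.
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    enc_util, _period_us = pynvml.nvmlDeviceGetEncoderUtilization(handle)
    dec_util, _period_us = pynvml.nvmlDeviceGetDecoderUtilization(handle)
    print(f"enc {enc_util} %  dec {dec_util} %")
finally:
    pynvml.nvmlShutdown()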

gpustat -cp does not refresh the stats

gpustat -cp should refresh every x seconds, like in the screenshot in the README. The current behaviour is that gpustat -cp runs one iteration and then exits.

gpustat -cp, gpustat -c, gpustat -p and gpustat return the same output:

[0] GeForce MX150    | 39'C,   0 % |     0 /  2002 MB |

Python 3.7.0

Thank you

Show CPU usage and RAM usage

I think it might be nice to have the total CPU and RAM usage being displayed in the header.

Do you think that's something to include?

Installed GPU but got error ImportError: No Module named gpustat

I installed gpustat using the command below. When I run gpustat, I get ImportError: No module named gpustat. Do I need to set the path variable to a specific location?

"ubuntu@lambda-quad:~$ sudo -H pip install git+https://github.com/wookayin/gpustat.git@master
Collecting git+https://github.com/wookayin/gpustat.git@master
Cloning https://github.com/wookayin/gpustat.git (to master) to /tmp/pip-hIbHPs-build
Requirement already satisfied (use --upgrade to upgrade): gpustat==0.5.0.dev1 from git+https://github.com/wookayin/gpustat.git@master in /usr/local/lib/python2.7/dist-packages
Requirement already satisfied: six>=1.7 in /usr/local/lib/python2.7/dist-packages (from gpustat==0.5.0.dev1)
Requirement already satisfied: nvidia-ml-py>=7.352.0 in /usr/local/lib/python2.7/dist-packages (from gpustat==0.5.0.dev1)
Requirement already satisfied: psutil in /usr/local/lib/python2.7/dist-packages (from gpustat==0.5.0.dev1)
Requirement already satisfied: blessings>=1.6 in /usr/local/lib/python2.7/dist-packages (from gpustat==0.5.0.dev1)

"ubuntu@lambda-quad:~$ gpustat
Traceback (most recent call last):
File "/usr/local/bin/gpustat", line 9, in
load_entry_point('gpustat==0.5.0.dev1', 'console_scripts', 'gpustat')()
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 542, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 2569, in load_entry_point
return ep.load()
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 2229, in load
return self.resolve()
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 2235, in resolve
module = import(self.module_name, fromlist=['name'], level=0)
ImportError: No module named gpustat
"

Support Python2.6

Python2.6 doesn't support dict comprehension syntax.

  File "gpustat.py", line 168
    g = GPUStat({col_name: col_value.strip() for \
                                               ^
SyntaxError: invalid syntax

nvidia-smi is not recognized as an internal or external command: with 0.3.x versions on windows

C:\>gpustat -cp
'nvidia-smi' is not recognized as an internal or external command,
operable program or batch file.
Error on calling nvidia-smi

C:\>nvidia-smi --query-gpu=index,uuid,name,temperature.gpu,utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits
0, GPU-9d01c9ef-1d73-7774-8b4f-5bee4b3bf644, GeForce GTX 1080 Ti, 28, 65, 9219, 11264
1, GPU-9da3de3f-cdf2-8ca9-504d-fd9bc414a78e, GeForce GTX 1080 Ti, 22, 0, 140, 11264

Any idea what might be the issue?
Windows 10, Python 3.7.2, latest nvidia drivers, etc, as of the time of this post.

use too much CPU resource

Hi, gpustat -i uses about 80% CPU on my machine; is this expected or a bug?
In contrast, nvidia-smi -l 1 uses less than 10%.

OS: Ubuntu 18.04
Nv Driver: 410.48
CUDA: 10.0.130
CPU: AMD Threadripper 1900x
GPU: 2080Ti + 1080


Add a flag to toggle displaying of the hostname and timestamp.

Due to the reverted commit in #32, would you consider adding a flag that does not show the first line? That way, a simple while loop would work well enough for repeatedly logging GPU usage information.

For example, right now here is what I get using a while loop.

[hak8or@hak8ordesktop ~]$ while sleep 1; do gpustat -ucpP; done;
hak8ordesktop  Sat Jun 23 20:36:39 2018
[0] GeForce GTX 1070 | 63'C,  10 %,   39 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
hak8ordesktop  Sat Jun 23 20:36:40 2018
[0] GeForce GTX 1070 | 62'C,  10 %,   38 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
hak8ordesktop  Sat Jun 23 20:36:41 2018
[0] GeForce GTX 1070 | 63'C,  10 %,   40 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
hak8ordesktop  Sat Jun 23 20:36:42 2018
[0] GeForce GTX 1070 | 62'C,  11 %,   38 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)

Ideally a flag like --notitle could be added which disables the first line (hak8ordesktop Sat Jun 23 20:36:39 2018 in my case). This would create the following output:

[hak8or@hak8ordesktop ~]$ while sleep 1; do gpustat -ucpP; done;
[0] GeForce GTX 1070 | 63'C,  10 %,   39 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
[0] GeForce GTX 1070 | 62'C,  10 %,   38 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
[0] GeForce GTX 1070 | 63'C,  10 %,   40 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
[0] GeForce GTX 1070 | 62'C,  11 %,   38 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)

Error: No module named '_curses' on Windows 10

I tried to use this package on Windows 10, but an error is raised (No module named '_curses'). I wonder whether this package is meant to support Windows. Is there some way to make it available for Windows?

Show full commands running

Would you support an option to show the full commands being run instead of just, e.g., python? These could be rather long, so one option might be to put each of them on a new line below the GPU that they're on (maybe indented a bit). Since gpustat seems to focus on being very concise, I'd understand if not. I have my own modification of nvidia-smi that shows more about the processes running on each GPU, and I was planning to make it more concise when I ran across your project. I just want to check whether you'd support this, so I don't have to duplicate some of your work here if so.

nvidia-ml-py version requirement

I just upgraded to current HEAD from a previous version without nvidia-ml-py (I was planning to add in the power usage as in #13). I now get

$ gpustat --debug
Error on querying NVIDIA devices. Use --debug flag for details
Traceback (most recent call last):
  File "/home/<user>/.local/lib/python2.7/site-packages/gpustat-0.4.0.dev1-
py2.7.egg/gpustat.py", line 398, in print_gpustat
    gpu_stats = GPUStatCollection.new_query()
  File "/home/<user>/.local/lib/python2.7/site-packages/gpustat-0.4.0.dev1-
py2.7.egg/gpustat.py", line 300, in new_query
    gpu_info = get_gpu_info(handle)
  File "/home/<user>/.local/lib/python2.7/site-packages/gpustat-0.4.0.dev1-
py2.7.egg/gpustat.py", line 261, in get_gpu_info
    nv_graphics_processes = N.nvmlDeviceGetGraphicsRunningProcesses(handle)
AttributeError: 'module' object has no attribute 'nvmlDeviceGetGraphicsRunningProcesses'

During installation I had:

Searching for nvidia-ml-py==375.53.1
Best match: nvidia-ml-py 375.53.1
Adding nvidia-ml-py 375.53.1 to easy-install.pth file

So it looks like there are API differences between the versions of nvidia-ml-py. Can I ask which version you've been working with?
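
A hedged sketch of a version-tolerant probe for the call that fails above; falling back to an empty list when the binding lacks the function is an assumption, not gpustat's actual behaviour.

# Hedged sketch: only call nvmlDeviceGetGraphicsRunningProcesses if the installed
# NVML binding exposes it (older nvidia-ml-py releases do not).
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    if hasattr(pynvml, "nvmlDeviceGetGraphicsRunningProcesses"):
        graphics_processes = pynvml.nvmlDeviceGetGraphicsRunningProcesses(handle)
    else:
        graphics_processes = []  # assumption: treat as no graphics processes
    print(graphics_processes)
finally:
    pynvml.nvmlShutdown()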

Show fan speed

Command nvidia-smi shows also fan speed:

$ nvidia-smi
Wed Apr  3 14:09:10 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37                 Driver Version: 396.37                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  On   | 00000000:03:00.0  On |                  N/A |
| 30%   42C    P8    16W / 250W |     53MiB / 11177MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  On   | 00000000:04:00.0 Off |                  N/A |
| 31%   43C    P8    16W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  On   | 00000000:81:00.0 Off |                  N/A |
| 51%   68C    P2    76W / 250W |  10781MiB / 11178MiB |     17%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  On   | 00000000:82:00.0 Off |                  N/A |
| 29%   34C    P8    16W / 250W |      2MiB / 11178MiB |      0%      Default |

Could gpustat show this information too?

No percent displayed?

When I start gpustat (great program, btw) I see two question marks instead of a percentage. It could be my graphics card, which is a GTX 560.

add "watch" option

Please add an option to keep outputting the GPU stats at fixed intervals (similar to how the usual *stat tools work)

Filter processes by memory/GPU usage

I work on Machine Learning and use this utility to monitor how much my GPUs are being used.
I frequently have one or two main processes and dozens of smaller ones. The smaller ones only use about 500 MB of memory in total, but take up most of the space on the screen.

Is it possible to add a filter that prevents processes using small amounts of memory from being displayed?

Error on calling nvidia-smi: Command 'ps ...' returned non-zero exit status 1

I got the above error message when I run gpustat, but nvidia-smi works on my machine.
Here are some details:
OS: Ubuntu 14.04.5 LTS
Python version: Anaconda 3.6

Error on calling nvidia-smi. Use --debug flag for details
Traceback (most recent call last):
  File "/usr/local/bin/gpustat", line 417, in print_gpustat                                                      gpu_stats = GPUStatCollection.new_query()
  File "/usr/local/bin/gpustat", line 245, in new_query
    return GPUStatCollection(gpu_list)
  File "/usr/local/bin/gpustat", line 218, in __init__
    self.update_process_information()
  File "/usr/local/bin/gpustat", line 316, in update_process_information
    processes = self.running_processes()
  File "/usr/local/bin/gpustat", line 275, in running_processes
    ','.join(map(str, pid_map.keys()))
  File "/usr/local/bin/gpustat", line 46, in execute_process
    stdout = check_output(command_shell, shell=True).strip()
  File "/home/xiyun/apps/anaconda3/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/home/xiyun/apps/anaconda3/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'ps -o pid,user:16,comm -p1 -p 14471' returned non-zero exit status 1.

How can I fix this?
