
gpustat's Introduction

gpustat


Just less than nvidia-smi?

Screenshot: gpustat -cp

NOTE: This works with NVIDIA Graphics Devices only, no AMD support as of now. Contributions are welcome!

Self-Promotion: A web interface of gpustat is available (in alpha)! Check out gpustat-web.

Quick Installation

Install from PyPI:

pip install gpustat

If you don't have root (sudo) privileges, try installing gpustat in user space: pip install --user gpustat.

To install the latest version (master branch) via pip:

pip install git+https://github.com/wookayin/gpustat.git@master

NVIDIA Driver and pynvml Requirements

Important

DO NOT pip install pynvml or include pynvml as a dependency in your Python project. This will not work.

Instead, pip install nvidia-ml-py. nvidia-ml-py is NVIDIA's official Python binding for NVML; a minimal query sketch follows the version notes below.

  • gpustat 1.2+: Requires nvidia-ml-py >= 12.535.108 (#161)
  • gpustat 1.0+: Requires NVIDIA Driver 450.00 or higher and nvidia-ml-py >= 11.450.129.
  • If your NVIDIA driver is too old, you can use an older gpustat version (pip install 'gpustat<1.0'). See #107 for more details.
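
For reference, here is a minimal sketch of querying NVML directly through nvidia-ml-py (the same binding gpustat relies on). It is illustrative only; for example, nvmlDeviceGetName returns bytes on older binding versions and str on newer ones.

# Minimal sketch: list GPUs via NVML using nvidia-ml-py (which installs the `pynvml` module).
import pynvml

pynvml.nvmlInit()
try:
    for index in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(index)
        name = pynvml.nvmlDeviceGetName(handle)          # bytes on older bindings, str on newer
        memory = pynvml.nvmlDeviceGetMemoryInfo(handle)  # .used / .total are in bytes
        print(f"[{index}] {name}: {memory.used // 2**20} / {memory.total // 2**20} MB")
finally:
    pynvml.nvmlShutdown()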

Python requirements

  • gpustat<1.0: Compatible with python 2.7 and >=3.4
  • gpustat 1.0: Python >= 3.4
  • gpustat 1.1: Python >= 3.6

Usage

$ gpustat

Options (Please see gpustat --help for more details):

  • --color : Force colored output (even when stdout is not a tty)
  • --no-color : Suppress colored output
  • -u, --show-user : Display username of the process owner
  • -c, --show-cmd : Display the process name
  • -f, --show-full-cmd : Display the full command and CPU stats of running processes
  • -p, --show-pid : Display PID of the process
  • -F, --show-fan : Display GPU fan speed
  • -e, --show-codec : Display encoder and/or decoder utilization
  • -P, --show-power : Display GPU power usage and/or limit (draw or draw,limit)
  • -a, --show-all : Display all gpu properties above
  • --id : Target and query specific GPUs only with the specified indices (e.g. --id 0,1,2)
  • --no-processes : Do not display process information (user, memory) (#133)
  • --watch, -i, --interval : Run in watch mode (equivalent to watch gpustat) if given. Denotes interval between updates.
  • --json : JSON output (#10); a parsing example follows this list
  • --print-completion (bash|zsh|tcsh) : Print a shell completion script. See #131 for usage.
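
The --json output can be consumed from another script. A hedged sketch follows; only the top-level "gpus" key is assumed here (it appears in gpustat's JSON serialization), and each per-GPU entry is printed as-is because exact field names can vary between versions.

# Hedged sketch: read `gpustat --json` from a separate script.
import json
import subprocess

output = subprocess.check_output(["gpustat", "--json"], text=True)
report = json.loads(output)
for gpu in report.get("gpus", []):
    print(gpu)  # each entry is a dict of that GPU's properties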

Tips

  • Try gpustat --debug if something goes wrong.
  • To periodically watch, try gpustat --watch or gpustat -i (#41).
    • For older versions, one may use watch --color -n1.0 gpustat --color.
  • Running nvidia-smi daemon (root privilege required) will make querying GPUs much faster and use less CPU (#54).
  • The GPU ID (index) shown by gpustat (and nvidia-smi) is the PCI BUS ID, while by default CUDA uses a different ordering (it assigns the lowest ID to the fastest GPU). To ensure CUDA and gpustat use the same GPU index, set the CUDA_DEVICE_ORDER environment variable to PCI_BUS_ID (before setting CUDA_VISIBLE_DEVICES for your CUDA program): export CUDA_DEVICE_ORDER=PCI_BUS_ID. A short Python sketch follows this list.
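
The same tip, sketched from inside a Python program. This is a hedged example: the environment variables must be set before any CUDA context is created, and the trailing comment about framework imports is only a placeholder.

# Sketch: align CUDA device numbering with gpustat/nvidia-smi (PCI bus order),
# then restrict the program to GPUs 0 and 1 as reported by gpustat.
import os

os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"  # indices as shown by gpustat
# ... import your CUDA framework and create contexts only after this point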

Default display

[0] GeForce GTX Titan X | 77°C,  96 % | 11848 / 12287 MB | python/52046(11821M)
  • [0]: GPU index (starts from 0) as PCI_BUS_ID
  • GeForce GTX Titan X: GPU name
  • 77°C: GPU Temperature (in Celsius)
  • 96 %: GPU Utilization
  • 11848 / 12287 MB: GPU Memory Usage (Used / Total)
  • python/...: Running processes on GPU, owner/cmdline/PID (and their GPU memory usage)

Changelog

See CHANGELOG.md

License

MIT License

gpustat's Issues

Display driver version in the header

Introduce a --header option that specifies whether each of {host,time,driver} is displayed in the header, so the driver version can be shown:

e.g.

$ gpustat        # by default, only {host,time} are shown
$ gpustat --header host,time,driver

which prints:

workstation          Tue Oct 30 20:50:35 2018  396.54
[0] TITAN Xp         | 41'C,   0 % |  4479 / 12196 MB | user(4469M)
[1] TITAN Xp         | 38'C, 100 % |  4479 / 12196 MB | user(4469M)
  • What would be the best format?
  • Should we make it configurable (see #51)?

Problem with pynvml dependency and python 3.6

When I try to run this using Python 3.6.4 (Anaconda) I get:

    import pynvml as N
  File "${PYTHONPATH}/lib/python3.6/site-packages/pynvml.py", line 1831
    print c_count.value

This can be fixed by using the module py3nvml instead and replacing line 24 in gpustat/core.py with

from py3nvml import py3nvml as N

Command not found

I installed gpustat, but when I try to run it I get "command not found".

My system: Kubuntu 18.04

Document output values

Could the documentation describe the output, or maybe add a heading row to the output? In particular, what does the second column (the percentage value) mean?

That's cool

That's cool

But it may be based on nvidia-smi ...

I need one that is independent of nvidia-smi.

Thanks at any rate.

rxvt-unicode support

I have a very strange situation: I currently use urxvt as a terminal emulator, and gpustat does not start from a bare terminal, raising:

Traceback (most recent call last):
  File "/home/rizhiy/anaconda3/bin/gpustat", line 7, in <module>
    from gpustat import main
  File "/home/rizhiy/anaconda3/lib/python3.6/site-packages/gpustat.py", line 32, in <module>
    class GPUStat(object):
  File "/home/rizhiy/anaconda3/lib/python3.6/site-packages/gpustat.py", line 156, in GPUStat
    term=Terminal(),
  File "/home/rizhiy/anaconda3/lib/python3.6/site-packages/blessings/__init__.py", line 105, in __init__
    self._init_descriptor)
_curses.error: setupterm: could not find terminal

So it looks like blessings does not recognise urxvt properly.

But when I start gpustat from within tmux, it starts OK but doesn't use the correct colors (e.g. it uses standard black instead of bold_black).

What version of blessings are you using? Does this problem exist in their current version?

ValueError: invalid literal for int() with base 10: '[Not Supported]'

Running gpustat gives the following output:

Traceback (most recent call last):
  File "/home/~~~/anaconda2/bin/gpustat", line 11, in <module>
    sys.exit(main())
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 283, in main
    print_gpustat(**vars(args))
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 242, in print_gpustat
    gpu_stats.update_process_information()
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 196, in update_process_information
    processes = self.running_processes()
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 166, in running_processes
    pid_map = {int(e['pid']) : None for e in process_entries}
  File "/home/~~~/anaconda2/lib/python2.7/site-packages/gpustat.py", line 166, in <dictcomp>
    pid_map = {int(e['pid']) : None for e in process_entries}
ValueError: invalid literal for int() with base 10: '[Not Supported]'
  • Ubuntu 14.04
  • Python 2.7 with Anaconda 4
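
A minimal sketch of the kind of guard that would avoid this crash, assuming the field names from the traceback above; on some setups nvidia-smi reports the PID as the literal string '[Not Supported]', so non-numeric entries are skipped instead of being passed to int().

# Hedged sketch: build the pid map defensively.
def safe_pid_map(process_entries):
    pid_map = {}
    for entry in process_entries:
        pid = str(entry.get('pid', '')).strip()
        if pid.isdigit():
            pid_map[int(pid)] = None  # same placeholder value as in the original code
    return pid_map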

Fully configurable output format

From @neighthan's suggestion:

It would be nice to have a full-on format string like, e.g., git log does so the user could specify which gpu parameters to show, in which order, with which colors.

To this end, we would also need to introduce a config file (e.g. ~/.config/gpustat/gpustat.cfg), or maybe use an environment variable? A purely hypothetical sketch follows below.
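
A purely hypothetical sketch of picking a format string up from an environment variable: the variable name GPUSTAT_FORMAT and the % placeholders are invented for illustration and are not implemented in gpustat.

# Purely hypothetical: neither GPUSTAT_FORMAT nor these placeholders exist in gpustat.
import os

DEFAULT_FORMAT = "[%index] %name | %temperature, %utilization | %memory"
fmt = os.environ.get("GPUSTAT_FORMAT", DEFAULT_FORMAT)  # hypothetical variable

# A real implementation would substitute live NVML values; these are sample values.
sample = {"%index": "0", "%name": "TITAN Xp", "%temperature": "41'C",
          "%utilization": "0 %", "%memory": "4479 / 12196 MB"}
for placeholder, value in sample.items():
    fmt = fmt.replace(placeholder, value)
print(fmt)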

Prettier output

It would be nicer if we could get the pipes | aligned for multiple gpu systems 👍
Currently

[0]  Tesla K40c | 33'C,   0 % |     0 / 11439 MB |
[1] GeForce GTX 1080 | 35'C,   0 % |    52 /  8113 MB |

Proposed

[0]  Tesla K40c      | 33'C,   0 % |     0 / 11439 MB |
[1] GeForce GTX 1080 | 35'C,   0 % |    52 /  8113 MB |

Error trying to run gpustat: NVMLError_LibraryNotFound

Full message with --debug and Python3:

Error on querying NVIDIA devices. Use --debug flag for details
Traceback (most recent call last):
  File "/usr/local/bin/gpustat", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/gpustat.py", line 502, in main
    print_gpustat(**vars(args))
  File "/usr/local/lib/python3.6/dist-packages/gpustat.py", line 459, in print_gpustat
    traceback.print_exc(file=sys.stderr)
  File "/usr/lib/python3.6/traceback.py", line 159, in print_exc
    print_exception(*sys.exc_info(), limit=limit, file=file, chain=chain)
  File "/usr/lib/python3.6/traceback.py", line 100, in print_exception
    type(value), value, tb, limit=limit).format(chain=chain):
  File "/usr/lib/python3.6/traceback.py", line 462, in __init__
    _seen.add(exc_value)
TypeError: unhashable type: 'NVMLError_LibraryNotFound'

I have also tried with Python2, just in case. Similar error:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/gpustat.py", line 454, in print_gpustat
    gpu_stats = GPUStatCollection.new_query()
  File "/usr/local/lib/python2.7/dist-packages/gpustat.py", line 254, in new_query
    N.nvmlInit()
  File "/usr/local/lib/python2.7/dist-packages/pynvml.py", line 747, in nvmlInit
    _LoadNvmlLibrary()
  File "/usr/local/lib/python2.7/dist-packages/pynvml.py", line 785, in _LoadNvmlLibrary
    _nvmlCheckReturn(NVML_ERROR_LIBRARY_NOT_FOUND)
  File "/usr/local/lib/python2.7/dist-packages/pynvml.py", line 405, in _nvmlCheckReturn
    raise NVMLError(ret)
NVMLError_LibraryNotFound: NVML Shared Library Not Found

My guess is that I am missing some dependency, path, or correct version, but I could not figure out what the problem is yet. TBH I do not know how many layers of software are involved when running gpustat. I have the driver, its dev version, cuda, cuda toolkit,...

I hope/guess it is an easy to fix noob mistake. Do you have any idea about the possible cause? Would you like additional logs, version numbers, or anything?

PS: I had a bunch of NVIDIA things at version 387 and now I have moved to 384. I think in that process I have also moved from NVIDIA packages to Ubuntu packages. After everything, it works, but that is all I know.

Not supported: no process information

What I tried

➜ gpustat
v1rtl-CR70-2M-CX70-2OC-CX70-2OD  Fri Oct  4 21:22:05 2019  390.116
[0] GeForce GT 740M  | 64'C,  ?? % |   360 /  2004 MB | (Not Supported)

Problem

gpustat doesn't show anything

Reason

nvidia-smi also shows "Not supported"

Details

OS Ubuntu 18.04
GPU NVIDIA GT 740M

Power usage

Hi,
How can I add power usage and efficiency information?
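
In current gpustat this is covered by the -P/--show-power option listed above. For reference, a hedged sketch of the underlying NVML calls via nvidia-ml-py; both nvmlDeviceGetPowerUsage and nvmlDeviceGetPowerManagementLimit report milliwatts.

# Hedged sketch: query power draw and power limit for GPU 0 via NVML.
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    draw_w = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0            # mW -> W
    limit_w = pynvml.nvmlDeviceGetPowerManagementLimit(handle) / 1000.0  # mW -> W
    print(f"{draw_w:.0f} / {limit_w:.0f} W")
finally:
    pynvml.nvmlShutdown()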

Error in running_processes()

I get the following error when calling gpustat while an actual GPU process is running. If no process involving GPUs is running, the output is correct as expected.

Error:

$ ./gpustat
Traceback (most recent call last):
  File "./gpustat", line 260, in <module>
    main(args)
  File "./gpustat", line 224, in main
    gpu_stats.update_process_information()
  File "./gpustat", line 184, in update_process_information
    processes = self.running_processes()
  File "./gpustat", line 179, in running_processes
    process_entry.update(pid_map[pid])
TypeError: 'NoneType' object is not iterable

The nvidia-smi output at the time is correct:

|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0     11812    C   ./juspic.x                                      96MiB |

Any ideas why the list seems to be empty? (I have not yet debugged it myself.)

Monitoring as a daemon mode

Currently, one has to use watch --color -n0.1 gpustat for periodic watching.
A built-in daemon mode (collecting some statistics as well) is planned.

offline installation does not work

I tried pip install ... with --no-index, but this does not help: the installation process still tries to contact the PyPI server. The machine does not have internet access. I can install any other package in offline mode with conda or pip.

Displaying username on dark terminals

Currently, black is used to display the username. This makes it impossible to see anything on dark terminals. Any chance a different color (other than black or white) could be used?

Maximum Sampling Rate of GPU Power

Hey! This might be slightly unrelated, but I was wondering what the maximum sampling rate of calls to nvmlDeviceGetPowerUsage would be. It must be limited by the sampling rate of the power measurement hardware on the GPU. Do you know about this based on your usage of the NVML API?

Thanks!

gpustat --json gives an error with old GPU (GTX 660)

Error

$ gpustat --json

Traceback (most recent call last):
  File "/home/miczi/miniconda3/envs/th/bin/gpustat", line 11, in <module>
    sys.exit(main())
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 502, in main
    print_gpustat(**vars(args))
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 463, in print_gpustat
    gpu_stats.print_json(sys.stdout)
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 434, in print_json
    o = self.jsonify()
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 424, in jsonify
    "gpus" : [g.jsonify() for g in self]
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 424, in <listcomp>
    "gpus" : [g.jsonify() for g in self]
  File "/home/miczi/miniconda3/envs/th/lib/python3.6/site-packages/gpustat.py", line 237, in jsonify
    for p in self.entry['processes']]
TypeError: 'NoneType' object is not iterable

How to reproduce (on gpustat 0.4.1):

pip install gpustat
gpustat --json

Demo:

https://asciinema.org/a/4XmXvHBkE7gv658VxIQrYzmie
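
A minimal sketch of the guard that sidesteps this particular crash, assuming (as the traceback suggests) that the 'processes' entry can be None on GPUs that do not report per-process information.

# Hedged sketch: serialize the process list defensively when it may be None.
def jsonify_processes(entry):
    processes = entry.get('processes') or []   # treat None as an empty list
    return [dict(p) for p in processes]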

Let --gpuname-width clip gpu names longer than the specified width

I guess the behaviour should change the output from

$ gpustat
[0] GeForce GTX 1080 Ti | 32'C, 0 % | 10795 / 11172 MB

to

$ gpustat --gpuname-width 10
[0] GeForce GT | 32'C, 0 % | 10795 / 11172 MB

but it doesn't seem to do anything.

version: gpustat 0.5.0.dev1
os: Ubuntu 16.04.5 LTS
python: Python 3.6.4 :: Anaconda, Inc.

D(isk sleep) status when using gpustat -i

When I use gpustat -i to watch my GPU usage over SSH and my local computer shuts down unexpectedly, I find on reconnecting to the server that the gpustat process has entered D (disk sleep) status and I can't kill it. I have no permission to reboot; how can I solve this?

Feature request: Cluster GPU monitoring

Hi,

Thank you for your great tool for monitoring GPU usage, showing much clearer information than nvidia-smi.

Feature

Currently, I am working on a cluster with multiple GPU nodes. Do you think it would be a good idea to support monitoring for multiple nodes on one node? The use case is described as follows.

Use case

For a user with two nodes n1 and n2, it would be great if we can run the following command to show the GPU usage on all nodes.

gpustat -a -i 1 -s n1,n2

Displaying:
n1 time cuda version
GPU information

n2 time cuda version
GPU information

Possible Implementation

One of the possible implementations would be to start a gpustat service on each given node by sending commands with ssh and then on the master node, collect the information from each server periodically.

Expose GPU enc/dec utilization

Looks like the utilization.gpu value right now is the one shown via nvidia-smi (as GPU-Util).

If one runs nvidia-smi dmon instead, it also exposes a percentage representing encoding/decoding utilization (under the enc and dec columns).

Any chance of including those in gpustat? I'm running into a case where enc is at or near 100%, but the general GPU utilization is at a much lower ~25%, so it can't effectively be used as a proxy.

Edit: I see gpustat is using nvmlDeviceGetUtilizationRates, but the API also exposes nvmlDeviceGetEncoderUtilization and nvmlDeviceGetDecoderUtilization, which would probably be what we're looking for.
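
A hedged sketch of those two NVML calls via nvidia-ml-py; each one returns the utilization percentage together with its sampling period in microseconds.

# Hedged sketch: query encoder/decoder utilization for GPU 0 via NVML.
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    enc_util, _period_us = pynvml.nvmlDeviceGetEncoderUtilization(handle)
    dec_util, _period_us = pynvml.nvmlDeviceGetDecoderUtilization(handle)
    print(f"enc {enc_util} %  dec {dec_util} %")
finally:
    pynvml.nvmlShutdown()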

gpustat -cp does not refresh the stats

gpustat -cp should refresh every x seconds, like in the screenshot in the README. The current behaviour is that gpustat -cp runs one iteration and then exits.

gpustat -cp, gpustat -c, gpustat -p and gpustat return the same output:

[0] GeForce MX150    | 39'C,   0 % |     0 /  2002 MB |

Python 3.7.0

Thank you

Show CPU usage and RAM usage

I think it might be nice to have the total CPU and RAM usage being displayed in the header.

Do you think that's something to include?

Installed GPU but got error ImportError: No Module named gpustat

I installed gpustat using the command below. When I run gpustat, I get ImportError: No module named gpustat. Do I need to set the path variable to a specific location?

"ubuntu@lambda-quad:~$ sudo -H pip install git+https://github.com/wookayin/gpustat.git@master
Collecting git+https://github.com/wookayin/gpustat.git@master
Cloning https://github.com/wookayin/gpustat.git (to master) to /tmp/pip-hIbHPs-build
Requirement already satisfied (use --upgrade to upgrade): gpustat==0.5.0.dev1 from git+https://github.com/wookayin/gpustat.git@master in /usr/local/lib/python2.7/dist-packages
Requirement already satisfied: six>=1.7 in /usr/local/lib/python2.7/dist-packages (from gpustat==0.5.0.dev1)
Requirement already satisfied: nvidia-ml-py>=7.352.0 in /usr/local/lib/python2.7/dist-packages (from gpustat==0.5.0.dev1)
Requirement already satisfied: psutil in /usr/local/lib/python2.7/dist-packages (from gpustat==0.5.0.dev1)
Requirement already satisfied: blessings>=1.6 in /usr/local/lib/python2.7/dist-packages (from gpustat==0.5.0.dev1)

"ubuntu@lambda-quad:~$ gpustat
Traceback (most recent call last):
File "/usr/local/bin/gpustat", line 9, in
load_entry_point('gpustat==0.5.0.dev1', 'console_scripts', 'gpustat')()
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 542, in load_entry_point
return get_distribution(dist).load_entry_point(group, name)
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 2569, in load_entry_point
return ep.load()
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 2229, in load
return self.resolve()
File "/usr/lib/python2.7/dist-packages/pkg_resources/init.py", line 2235, in resolve
module = import(self.module_name, fromlist=['name'], level=0)
ImportError: No module named gpustat
"

Support Python2.6

Python2.6 doesn't support dict comprehension syntax.

  File "gpustat.py", line 168
    g = GPUStat({col_name: col_value.strip() for \
                                               ^
SyntaxError: invalid syntax

nvidia-smi is not recognized as an internal or external command: with 0.3.x versions on windows

C:\>gpustat -cp
'nvidia-smi' is not recognized as an internal or external command,
operable program or batch file.
Error on calling nvidia-smi

C:\>nvidia-smi --query-gpu=index,uuid,name,temperature.gpu,utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits
0, GPU-9d01c9ef-1d73-7774-8b4f-5bee4b3bf644, GeForce GTX 1080 Ti, 28, 65, 9219, 11264
1, GPU-9da3de3f-cdf2-8ca9-504d-fd9bc414a78e, GeForce GTX 1080 Ti, 22, 0, 140, 11264

Any idea what might be the issue?
Windows 10, Python 3.7.2, latest nvidia drivers, etc, as of the time of this post.

use too much CPU resource

Hi, gpustat -i uses about 80% CPU on my machine; is this expected or a bug?
In contrast, nvidia-smi -l 1 uses less than 10%.

OS: Ubuntu 18.04
Nv Driver: 410.48
CUDA: 10.0.130
CPU: AMD Threadripper 1900x
GPU: 2080Ti + 1080


Add a flag to toggle displaying of the hostname and timestamp.

Due to the reverted commit in #32, would you consider adding a flag that does not show the first line? That way, a simple while loop would work well enough for repeatedly logging GPU usage information.

For example, right now here is what I get using a while loop.

[hak8or@hak8ordesktop ~]$ while sleep 1; do gpustat -ucpP; done;
hak8ordesktop  Sat Jun 23 20:36:39 2018
[0] GeForce GTX 1070 | 63'C,  10 %,   39 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
hak8ordesktop  Sat Jun 23 20:36:40 2018
[0] GeForce GTX 1070 | 62'C,  10 %,   38 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
hak8ordesktop  Sat Jun 23 20:36:41 2018
[0] GeForce GTX 1070 | 63'C,  10 %,   40 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
hak8ordesktop  Sat Jun 23 20:36:42 2018
[0] GeForce GTX 1070 | 62'C,  11 %,   38 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)

Ideally a flag like --notitle could be added which disables the first line (hak8ordesktop Sat Jun 23 20:36:39 2018 in my case). This would create the following output:

[hak8or@hak8ordesktop ~]$ while sleep 1; do gpustat -ucpP; done;
[0] GeForce GTX 1070 | 63'C,  10 %,   39 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
[0] GeForce GTX 1070 | 62'C,  10 %,   38 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
[0] GeForce GTX 1070 | 63'C,  10 %,   40 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)
[0] GeForce GTX 1070 | 62'C,  11 %,   38 / 151 W |   552 /  8116 MB | hak8or:mpv/12795(132M) root:Xorg/1305(246M) hak8or:gnome-shell/1611(167M) hak8or:mpv/12795(132M)

Error: No module named '_curses' on Windows 10

I tried to use this package on Windows 10, but an error is raised (No module named '_curses'). I wonder whether this package is meant to support Windows. Is there some way to make it available for Windows?

Show full commands running

Would you support an option to show the full commands being run instead of just, e.g., python? These could be rather long, so one option might be to put each of them on a new line below the GPU that they're on (maybe indented a bit). Since gpustat seems to focus on being very concise, I'd understand if not. I have my own modification of nvidia-smi that shows more about the processes running on each GPU, and I was planning to make it more concise when I ran across your project. I just want to check whether you'd support this, so I don't have to duplicate some of your work here if so.

nvidia-ml-py version requirement

I just upgraded to current HEAD from a previous version without nvidia-ml-py (I was planning to add in the power usage as in #13). I now get

$ gpustat --debug
Error on querying NVIDIA devices. Use --debug flag for details
Traceback (most recent call last):
  File "/home/<user>/.local/lib/python2.7/site-packages/gpustat-0.4.0.dev1-
py2.7.egg/gpustat.py", line 398, in print_gpustat
    gpu_stats = GPUStatCollection.new_query()
  File "/home/<user>/.local/lib/python2.7/site-packages/gpustat-0.4.0.dev1-
py2.7.egg/gpustat.py", line 300, in new_query
    gpu_info = get_gpu_info(handle)
  File "/home/<user>/.local/lib/python2.7/site-packages/gpustat-0.4.0.dev1-
py2.7.egg/gpustat.py", line 261, in get_gpu_info
    nv_graphics_processes = N.nvmlDeviceGetGraphicsRunningProcesses(handle)
AttributeError: 'module' object has no attribute 'nvmlDeviceGetGraphicsRunningProcesses'

During installation I had:

Searching for nvidia-ml-py==375.53.1
Best match: nvidia-ml-py 375.53.1
Adding nvidia-ml-py 375.53.1 to easy-install.pth file

So it looks like there are API differences between the versions of nvidia-ml-py. Can I ask which version you've been working with?
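
A hedged sketch of a version-tolerant probe for the call that fails above; falling back to an empty list when the binding lacks the function is an assumption, not gpustat's actual behaviour.

# Hedged sketch: only call nvmlDeviceGetGraphicsRunningProcesses if the installed
# NVML binding exposes it (older nvidia-ml-py releases do not).
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    if hasattr(pynvml, "nvmlDeviceGetGraphicsRunningProcesses"):
        graphics_processes = pynvml.nvmlDeviceGetGraphicsRunningProcesses(handle)
    else:
        graphics_processes = []  # assumption: treat as no graphics processes
    print(graphics_processes)
finally:
    pynvml.nvmlShutdown()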

Show fan speed

Command nvidia-smi shows also fan speed:

$ nvidia-smi
Wed Apr  3 14:09:10 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 396.37                 Driver Version: 396.37                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  On   | 00000000:03:00.0  On |                  N/A |
| 30%   42C    P8    16W / 250W |     53MiB / 11177MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  On   | 00000000:04:00.0 Off |                  N/A |
| 31%   43C    P8    16W / 250W |      2MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  On   | 00000000:81:00.0 Off |                  N/A |
| 51%   68C    P2    76W / 250W |  10781MiB / 11178MiB |     17%      Default |
+-------------------------------+----------------------+----------------------+
|   3  GeForce GTX 108...  On   | 00000000:82:00.0 Off |                  N/A |
| 29%   34C    P8    16W / 250W |      2MiB / 11178MiB |      0%      Default |

Could gpustat show this information too?

No percent displayed?

When I start gpustat (great program, btw) I see two question marks instead of a percentage. It could be my graphics card, which is a GTX 560.

add "watch" option

Please add an option to keep outputting the GPU stats at fixed intervals (similar to how the usual *stat tools work)

Filter processes by memory/GPU usage

I work on Machine Learning and use this utility to monitor how much my GPUs are being used.
I frequently have one or two main processes and dozens of smaller ones. The smaller ones only use about 500 MB of memory in total, but take up most of the space on the screen.

Is it possible to add a filter that prevents processes using small amounts of memory from being displayed?

Error on calling nvidia-smi: Command 'ps ...' returned non-zero exit status 1

I got the above error message when I run gpustat, but nvidia-smi works on my machine.
Here are some details:
OS: Ubuntu 14.04.5 LTS
Python version: Anaconda 3.6

Error on calling nvidia-smi. Use --debug flag for details
Traceback (most recent call last):
  File "/usr/local/bin/gpustat", line 417, in print_gpustat                                                      gpu_stats = GPUStatCollection.new_query()
  File "/usr/local/bin/gpustat", line 245, in new_query
    return GPUStatCollection(gpu_list)
  File "/usr/local/bin/gpustat", line 218, in __init__
    self.update_process_information()
  File "/usr/local/bin/gpustat", line 316, in update_process_information
    processes = self.running_processes()
  File "/usr/local/bin/gpustat", line 275, in running_processes
    ','.join(map(str, pid_map.keys()))
  File "/usr/local/bin/gpustat", line 46, in execute_process
    stdout = check_output(command_shell, shell=True).strip()
  File "/home/xiyun/apps/anaconda3/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/home/xiyun/apps/anaconda3/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command 'ps -o pid,user:16,comm -p1 -p 14471' returned non-zero exit status 1.

How can I fix this?
