Code Monkey home page Code Monkey logo

gpu_monitor's Introduction

GPU Monitor

This is a simple tool that monitoring the occupation of GPU cards. With this tool, you won't have to login to your servers to check whether there are free cards one by one. It is a simple web page looks like the picture below.
The picture displays GPU memory occupation for several servers, the green rows mean that the memory usage of these cards is less than 1/3, the orange rows mean that the memory usage is between 1/3 and 2/3, the red rows mean that the memory usage is more than 2/3.

screen

How this works

Collect memory occupation info

We use nvidia-smi command to get GPU cards' info.

$ nvidia-smi --query-gpu=index,name,memory.used,memory.total,utilization.gpu --format=csv

index, name, memory.used [MiB], memory.total [MiB], utilization.gpu [%]
0, Tesla V100-SXM2-32GB, 21339 MiB, 32510 MiB, 100 %
1, Tesla V100-SXM2-32GB, 27 MiB, 32510 MiB, 0 %
2, Tesla V100-SXM2-32GB, 20934 MiB, 32510 MiB, 99 %
3, Tesla V100-SXM2-32GB, 3 MiB, 32510 MiB, 0 %

Run nvidia-smi --help-query-gpu for more details.

Get the infos from other servers

To get the infos from other servers, ssh with -t option will be used. The line below could excute commands on the remote server without logining to it.

$ ssh -t user@host commands

To run this command without typing password, you should generate a passphraseless SSH key and push it to your remote server. Just hit Enter for the key and both passphrases:

$ ssh-keygen -t rsa -b 2048
Generating public/private rsa key pair.
Enter file in which to save the key (/home/username/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/username/.ssh/id_rsa.
Your public key has been saved in /home/username/.ssh/id_rsa.pub.

Copy your keys to the target server:

$ ssh-copy-id user@server
user@server's password: 

Set up

  • Install requirements
pip install -r requirements.txt
  • Generate a passphraseless SSH key and push it to each of your target server, and make sure you could login to the target server without typing your password.

  • Edit the config.yaml accordingly, there are only three items, one is the server name (for displaying purpose), one is the user name (could login to the server without password), the last one is the ip of the server you want get its info.

servers:
  -
    name: server-one
    user: pkufool
    ip: 127.0.0.1
  -
    name: server-two
    user: pkufool
    ip: 127.0.0.2
  • Run the flask application (could change the listening port if you like)
bash ./run.sh

Type http://ip:port on your browser, you should see the page similar to the screen shot above.

Acknowledgement

Flask: https://github.com/pallets/flask
Vue: https://github.com/vuejs/vue
Element UI: https://element.eleme.cn

License

gpu_monitor is MIT-licensed, as found in the LICENSE file.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.