Gemini

About

Gemini is an efficient GPU resource sharing system with fine-grained control for Linux platforms.

It shares an NVIDIA GPU among multiple clients under specified resource constraints and works seamlessly with any CUDA-based GPU program. It is also work-conserving and has low overhead, so almost no compute resources are wasted.

System Structure

Gemini consists of two components: a scheduler (back-end) and a hook library (front-end).

  • scheduler (GPU device manager) (gem-schd): A daemon process that manages the token. Based on the information in the resource configuration file (resource.conf), the scheduler decides which client receives the token. Clients can launch CUDA kernels only while holding a valid token.
  • hook library (libgemhook.so.1): A library that intercepts CUDA-related function calls. It relies on the LD_PRELOAD mechanism, which forces the hook library to be loaded before any other dynamically linked library.

The components currently communicate over Unix domain sockets.
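
To make the token idea more concrete, here is a toy sketch of how a scheduler might choose the next token holder from per-client constraints. This is not Gemini's actual policy, which lives in the gem-schd source. The Client type, the usage field, and the selection rule (serve clients below their minimum first, never serve clients at or above their maximum) are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Client(NamedTuple):
    """Hypothetical view of one client as seen by a scheduler."""
    name: str
    min_util: float  # MinUtil from resource.conf
    max_util: float  # MaxUtil from resource.conf
    usage: float     # assumed: fraction of recent GPU time this client used
    waiting: bool = True  # assumed: client has asked for a token


def pick_next(clients: List[Client]) -> Optional[Client]:
    """Pick the client that should receive the token next, or None."""
    # Clients at or above their maximum share are not eligible.
    eligible = [c for c in clients if c.waiting and c.usage < c.max_util]
    if not eligible:
        return None
    # Clients still below their guaranteed minimum are served first.
    below_min = [c for c in eligible if c.usage < c.min_util]
    pool = below_min or eligible
    # Among the remaining candidates, favor the one that has used the least.
    return min(pool, key=lambda c: c.usage)
```

Because any eligible waiting client can still receive the token when nobody is below its minimum, the sketch stays work-conserving in the sense described in the About section.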

Installation

Dependencies

  • libzmq
  • glib-2.0
  • gio-2.0

Compile

All components can be built with the following command:

make [CUDA_PATH=/path/to/cuda/installation] [PREFIX=/place/to/install] [DEBUG=1]

This command will install the built binaries in $(PREFIX)/bin and $(PREFIX)/lib. Default value for PREFIX is $(pwd)/...

Adding DEBUG=1 to the above command makes the hook library and the scheduler output more scheduling details.

Usage

Resource configuration file format

The configuration file follows the syntax of the XDG Desktop Entry Specification.

Settings for a client group are described by a section like the one below:

[client_group_name]
MinUtil = 0.1  # minimum required ratio of GPU usage time (between 0 and 1); default is 0
MaxUtil = 0.5  # maximum allowed ratio of GPU usage time (between 0 and 1); default is 1
MemoryLimit = 2GiB  # maximum allowed GPU memory usage (in bytes); default is 1GiB

Suffixes such as M and GB are interpreted as powers of 10 (1 000 000 and 1 000 000 000 bytes, respectively), and suffixes such as Ki and MiB are interpreted as powers of 2 (1 024 and 1 048 576 bytes, respectively).
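
The suffix rule can be illustrated with a small Python sketch. It is not the parser gem-schd uses. The function name parse_size, the accepted prefixes (K, M, G, T and their "i" variants), and the optional trailing B are assumptions chosen to match the examples above.

```python
import re

# Multipliers for the two suffix families (assumed set of prefixes).
_DECIMAL = {"K": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}

# A number, an optional prefix such as "M" or "Gi", and an optional "B".
_SIZE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMGT]i?)?B?\s*$")


def parse_size(text: str) -> int:
    """Convert a string such as '2GiB' or '500M' to a number of bytes."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, prefix = match.groups()
    if prefix is None:
        factor = 1  # plain byte count
    elif prefix.endswith("i"):
        factor = _BINARY[prefix]  # power of 2
    else:
        factor = _DECIMAL[prefix]  # power of 10
    return int(float(number) * factor)
```

With these assumptions, parse_size("2GiB") yields 2 147 483 648 bytes, while parse_size("2GB") yields 2 000 000 000 bytes.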

Default values will be used for unspecified parameters.
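
As a rough illustration of the format and its defaults, the sketch below reads such a file with Python's standard configparser module. This is only an approximation: gem-schd has its own parser, and configparser is not a full Desktop Entry implementation. The helper name load_groups is hypothetical, and MemoryLimit is left as a string here.

```python
import configparser

# Defaults taken from the comments in the sample section above.
DEFAULTS = {"MinUtil": "0", "MaxUtil": "1", "MemoryLimit": "1GiB"}


def load_groups(text: str) -> dict:
    """Return {group_name: settings} with defaults filled in."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.optionxform = str  # keep key case (MinUtil, not minutil)
    parser.read_string(text)

    groups = {}
    for name in parser.sections():
        raw = {**DEFAULTS, **dict(parser[name])}
        min_util = float(raw["MinUtil"])
        max_util = float(raw["MaxUtil"])
        # Both ratios are documented as lying between 0 and 1.
        for key, value in (("MinUtil", min_util), ("MaxUtil", max_util)):
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name}: {key} must be between 0 and 1")
        groups[name] = {
            "MinUtil": min_util,
            "MaxUtil": max_util,
            "MemoryLimit": raw["MemoryLimit"],
        }
    return groups
```

A section that contains only its header, such as [group_b], ends up with MinUtil 0, MaxUtil 1, and MemoryLimit 1GiB.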

gem-schd monitors this file for changes. After each change, the scheduler rereads the file and updates its settings.

Run

Set the environment variable GEMINI_IPC_DIR to the directory that holds the Unix domain socket files. The default is /tmp/gemini/ipc.

When launching applications, set the environment variable GEMINI_GROUP_NAME to the name of the client group and LD_PRELOAD to the location of libgemhook.so.1.

For convenience, we provide a Python script, tools/launch-command.py, for launching applications. Refer to the scripts and source code for more details.
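
The environment setup described above might look like the following in Python. This is a minimal sketch and not the contents of tools/launch-command.py. The function names build_env and launch are hypothetical, and prepending to an existing LD_PRELOAD value is a design choice made here, not something the project documents.

```python
import os
import subprocess


def build_env(group_name, hook_path, ipc_dir="/tmp/gemini/ipc", base_env=None):
    """Return a copy of the environment with the Gemini variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["GEMINI_IPC_DIR"] = ipc_dir
    env["GEMINI_GROUP_NAME"] = group_name
    # Put the hook library first so it is loaded ahead of any other
    # preloaded library; the dynamic linker accepts ':' as a separator.
    existing = env.get("LD_PRELOAD")
    env["LD_PRELOAD"] = f"{hook_path}:{existing}" if existing else hook_path
    return env


def launch(command, group_name, hook_path, ipc_dir="/tmp/gemini/ipc"):
    """Run a command (list of arguments) as a member of the given group."""
    env = build_env(group_name, hook_path, ipc_dir)
    return subprocess.run(command, env=env, check=False).returncode
```

For example, launch(["./my_cuda_app"], "client_group_name", "/path/to/lib/libgemhook.so.1") would run a CUDA program under the sample group, where both the program name and the library path are placeholders.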

Contributors

jim90247 eee4017 ncy9371
