
RLtools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control

Paper on arXiv | Live demo (browser) | Discord

Documentation | Run tutorials on Binder | Run Example on Colab

[Animation] Trained on a 2020 MacBook Pro (M1) using RLtools TD3

[Animation] Trained on a 2020 MacBook Pro (M1) using RLtools PPO

Benchmarks

Benchmarks of training the Pendulum swing-up using different RL libraries (PPO and SAC respectively)

Benchmarks of training the Pendulum swing-up on different devices (SAC, RLtools)

Benchmarks of the inference frequency for a two-layer [64, 64] fully-connected neural network across different microcontrollers (types and architectures).

Algorithms

Algorithm  Example
TD3        Pendulum, Racing Car, MuJoCo Ant-v4, Acrobot
PPO        Pendulum, Racing Car, MuJoCo Ant-v4 (CPU), MuJoCo Ant-v4 (CUDA)
SAC        Pendulum (CPU), Pendulum (CUDA), Acrobot

Projects Based on RLtools

Getting Started

A simple example of how to implement your own environment and train a policy using PPO:

Clone the example repository and initialize the RLtools submodule:

git clone https://github.com/rl-tools/example
cd example
git submodule update --init external/rl_tools

Build and run:

mkdir build
cd build
cmake .. -DCMAKE_BUILD_TYPE=Release
cmake --build .
./my_pendulum

Note that this example has no external dependencies and should work on any system with CMake and a C++17 compiler.
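As an aside (standard CMake behavior, nothing RLtools-specific), the build step can be parallelized across cores:

cmake --build . -j 4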

Documentation

The documentation is available at docs.rl.tools and consists of C++ notebooks. You can also run them locally to tinker around:

docker run -p 8888:8888 rltools/documentation

After running the Docker container, open the link that is displayed in the CLI (http://127.0.0.1:8888/...) in your browser and enjoy tinkering!
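If port 8888 is already in use on your machine, you can map a different host port (standard Docker port forwarding; the container still listens on 8888 internally) and open the adjusted link (e.g. http://127.0.0.1:8889/...):

docker run -p 8889:8888 rltools/documentation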

Chapter  Documentation                            Interactive Notebook
0        Overview                                 -
1        Containers                               Binder
2        Multiple Dispatch                        Binder
3        Deep Learning                            Binder
4        CPU Acceleration                         Binder
5        MNIST Classification                     Binder
6        Deep Reinforcement Learning              Binder
7        The Loop Interface                       Binder
8        Custom Environment                       Binder
9        TinyRL: A Python Interface for RLtools   Run Example on Colab

Repository Structure

To build the examples from source (either in Docker or natively), the repository must be cloned first. Instead of cloning all submodules with git clone --recursive, which takes a lot of space and bandwidth, we recommend cloning the main repository (which contains all the standalone code for RLtools) and then cloning the required sets of submodules later:

git clone https://github.com/rl-tools/rl-tools.git rl_tools

Cloning Submodules

There are five classes of submodules:

  1. External dependencies (in external/)
    • E.g. HDF5 for checkpointing, Tensorboard for logging, or MuJoCo for the simulation of contact dynamics
  2. Examples/Code for embedded platforms (in embedded_platforms/)
  3. Redistributable dependencies (in redistributable/)
  4. Test dependencies (in tests/lib)
  5. Test data (in tests/data)

These sets of submodules can be cloned incrementally and independently of each other. For most use cases (e.g., most of the Docker examples) you should clone the submodules for the external dependencies:

cd rl_tools
git submodule update --init --recursive -- external

The submodules for the embedded platforms, the redistributable binaries, and the test dependencies/data can be cloned in the same fashion (by replacing external with the appropriate folder from the enumeration above). Note: for the redistributable dependencies and the test data, make sure that git-lfs is installed (e.g. sudo apt install git-lfs on Ubuntu) and activated (git lfs install); otherwise only the metadata of the blobs is downloaded.
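For example, to fetch the embedded-platform examples and the test data (the folder names are the submodule paths listed above; git-lfs is set up first for the test data):

git lfs install
git submodule update --init --recursive -- embedded_platforms
git submodule update --init --recursive -- tests/data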

Docker

If you would like to take advantage of the features that require additional dependencies, but don't want to install them on your machine yet, you can use Docker. In our experiments on Linux using the NVIDIA container runtime, we were able to achieve close to native performance (see the Docker instructions & examples). While it depends on personal preference, we believe that there are good reasons (ease of debugging, usage of IDEs, etc.) to run everything natively when developing. We make sure that the additional dependencies required for the full feature set are not invasive and are usually available through your system's package manager. We believe sudo ./setup.sh is harmful and should not exist; instead, we make the setup explicit so that users maintain agency over their systems.

Native

For maximum performance and malleability for research and development, we recommend running RLtools natively. Since RLtools itself is dependency-free, the most basic examples don't need any platform setup. However, for an improved experience, we support HDF5 checkpointing and Tensorboard logging as well as optimized BLAS libraries, which come with some system-dependent requirements.
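As a rough sketch (the package names below are Ubuntu/Debian-specific assumptions, and the exact set depends on which features you want), these optional dependencies are typically available from the system package manager:

# Assumption: Ubuntu/Debian package names; other distributions differ
sudo apt install cmake build-essential  # build toolchain
sudo apt install libhdf5-dev            # HDF5 checkpointing
sudo apt install libopenblas-dev        # optimized BLAS backend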

Python Bindings: TinyRL

We provide experimental Python bindings, available as tinyrl. Note that the Python bindings are still a work in progress and that using Python Gym environments slows down training significantly compared to native RLtools environments.

pip install tinyrl gymnasium

Usage:

from tinyrl import SAC
import gymnasium as gym

seed = 0xf00d
def env_factory():
    # The factory lets the library construct (and re-construct) environment instances
    env = gym.make("Pendulum-v1")
    env.reset(seed=seed)
    return env

sac = SAC(env_factory)   # configure SAC for the given environment
state = sac.State(seed)  # training state, seeded for reproducibility

# step() advances training by one step and returns True once training is finished
finished = False
while not finished:
    finished = state.step()

Embedded Platforms

Inference & Training

Inference

Naming Convention

We use snake_case for variables/instances, functions, and namespaces, and PascalCase for structs/classes. Furthermore, we use upper-case SNAKE_CASE for compile-time constants.
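A minimal sketch illustrating the convention (all identifiers here are made up for the example; they are not RLtools APIs):

namespace my_namespace{                          // namespaces: snake_case
    constexpr int MAX_NUM_STEPS = 1000;          // compile-time constants: SNAKE_CASE
    struct TrainingConfig{};                     // structs/classes: PascalCase
    template <typename CONFIG>
    void train_policy(CONFIG& training_config){} // functions and variables/instances: snake_case
}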

Citing

When using RLtools in academic work, please cite our publication using the following BibTeX entry:

@misc{eschmann2023rltools,
      title={RLtools: A Fast, Portable Deep Reinforcement Learning Library for Continuous Control}, 
      author={Jonas Eschmann and Dario Albani and Giuseppe Loianno},
      year={2023},
      eprint={2306.03530},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
