Contrastive Initial State Buffer for Reinforcement Learning

This is the code for the ICRA24 paper Contrastive Initial State Buffer for Reinforcement Learning (PDF) by Nico Messikommer, Yunlong Song, and Davide Scaramuzza. For an overview of our method, check out our video.

If you use any of this code, please cite the following publication:

@Article{Messikommer24icra,
  author  = {Nico Messikommer and Yunlong Song and Davide Scaramuzza},
  title   = {Contrastive Initial State Buffer for Reinforcement Learning},
  journal = {2024 IEEE International Conference on Robotics and Automation (ICRA)},
  year    = {2024},
}

Abstract

In Reinforcement Learning, the trade-off between exploration and exploitation poses a complex challenge for achieving efficient learning from limited samples. While recent works have been effective in leveraging past experiences for policy updates, they often overlook the potential of reusing past experiences for data collection. Independent of the underlying RL algorithm, we introduce the concept of a Contrastive Initial State Buffer, which strategically selects states from past experiences and uses them to initialize the agent in the environment in order to guide it toward more informative states. We validate our approach on two complex robotic tasks without relying on any prior information about the environment: (i) locomotion of a quadruped robot traversing challenging terrains and (ii) a quadcopter drone racing through a track. The experimental results show that our initial state buffer achieves higher task performance than the nominal baseline while also speeding up training convergence.

Content

This repository contains the code for the Contrastive Initial State Buffer (CL-Buffer), which can be installed as a Python library. The repository does not contain the code for training an RL agent in an environment (Drone Racing or Legged Locomotion). However, the provided toy example (toy_example.py) shows how the CL-Buffer can be integrated into an existing RL framework.

Installation

  1. If desired, create a conda environment with the following command:
conda create -n <env_name>
  2. If needed, install the dependencies for the toy_example.py script via the requirements.txt file:
pip install -r requirements.txt

Dependencies:
  • PyTorch
  • NumPy
  • Fast Pytorch Kmeans

  3. Install the CL-Buffer library by running the following command inside the directory that contains the setup.py file:
pip install .
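
After installation, a quick check like the one below can confirm that the package is visible to the active Python environment. This is a minimal sketch using only the standard library: the package name initial_buffer is taken from the import statement in the Usage section, while the helper name cl_buffer_installed is hypothetical and not part of the library.

```python
import importlib.util


def cl_buffer_installed(package: str = "initial_buffer") -> bool:
    """Return True if `package` can be located without importing it.

    Hypothetical helper for illustration; it is not part of the CL-Buffer library.
    """
    try:
        # find_spec returns None when a top-level package cannot be found.
        return importlib.util.find_spec(package) is not None
    except (ImportError, ValueError):
        # Raised in unusual cases, e.g. a broken or partially initialized module.
        return False


if __name__ == "__main__":
    status = "found" if cl_buffer_installed() else "not found"
    print(f"initial_buffer package: {status}")
```

If the package is reported as not found, the most likely causes are running pip install . in a different environment or outside the directory that contains setup.py.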

Usage

Once the initial state buffer is installed as a library, it can be imported directly:

from initial_buffer.algorithms.projection_buffer import ProjectionBuffer

The ProjectionBuffer class supports three sampling methods: ['network', 'observations', 'random']. The CL-Buffer corresponds to the 'network' sampling strategy. For an explanation of the other sampling strategies, please refer to the paper.

Several hyperparameters can be set for training the buffer; see the arguments of the __init__ function of the ProjectionBuffer class in initial_buffer/algorithms/projection_buffer.py. In our experience, the initial state clustering is not very sensitive to these parameters as long as they stay in a range similar to the default values.

For a toy example, please have a look at the toy_example.py script. It includes template functions for adding visited experiences to the buffer, training the buffer, and using the buffer to select states. The visited state buffer is not included in toy_example.py because it depends heavily on the underlying environment. However, it can be implemented relatively easily with a simple array/dict/list that stores the states, observations, dones, and rewards of the collected experiences.
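
A visited state buffer along these lines could be sketched as follows. This is only an illustration under stated assumptions, not code from the repository: the class name VisitedStateBuffer, its methods, the fixed capacity with first-in-first-out eviction, and the choice to skip terminal states as reset candidates are all hypothetical. The uniform sampling is a placeholder for the selection step that toy_example.py performs with the buffer.

```python
import random
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class VisitedStateBuffer:
    """Hypothetical FIFO store of visited experiences (not part of the library)."""

    capacity: int = 10_000  # assumed limit; choose to fit the environment
    states: List[Any] = field(default_factory=list)
    observations: List[Any] = field(default_factory=list)
    dones: List[bool] = field(default_factory=list)
    rewards: List[float] = field(default_factory=list)

    def add(self, state: Any, observation: Any, done: bool, reward: float) -> None:
        """Append one transition and evict the oldest entry when over capacity."""
        self.states.append(state)
        self.observations.append(observation)
        self.dones.append(bool(done))
        self.rewards.append(float(reward))
        if len(self.states) > self.capacity:
            for column in (self.states, self.observations, self.dones, self.rewards):
                column.pop(0)

    def __len__(self) -> int:
        return len(self.states)

    def non_terminal_indices(self) -> List[int]:
        """Indices of stored states that did not end an episode.

        Assumption: resetting the agent into a terminal state is not useful.
        """
        return [i for i, done in enumerate(self.dones) if not done]

    def sample_states(self, n: int, rng: Optional[random.Random] = None) -> List[Any]:
        """Placeholder selection: draw n candidate states uniformly with replacement."""
        rng = rng or random.Random()
        candidates = self.non_terminal_indices()
        if not candidates:
            return []
        return [self.states[i] for i in rng.choices(candidates, k=n)]
```

In an actual training loop, add would be called once per environment step, and the stored states and observations would then be passed to the template functions in toy_example.py rather than sampled uniformly. What counts as a "state" (for example, a full simulator state that the environment can be reset to) depends on the environment.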
