The dsaa5009rl from fragile-azalea

dsaa5009rl's Introduction

DSAA5009: Offline Reinforcement Learining project.

In this project, you need to train an agent to learn to do continuous control on the D4RL dataset, which is a standard dataset for offline reinforcement learning. There are many datasets in D4RL. Here we focus on the Mujoco locomotion task hopper, in which you are required to use the offline dataset hopper-medium-v2 (1M samples from a policy trained to approximately 1/3 the performance of the expert) to train the agent to control the one-leg entity to hop forward:

Compared to online reinforcement learning tasks, offline reinforcement learning tasks present a greater challenge due to the agent's reliance solely on available data, precluding its interaction with the environment to obtain performance evaluations and modify its actions. While online reinforcement learning algorithms can be employed to train the agent, caution must be exercised when the agent takes actions that lack feedback from the offline dataset. This is because in the context of offline reinforcement learning, emphasizing actions with spurious high-yield actions results in the accumulation of errors and, ultimately, the failure of the performance. In the hopper task, such a failure would cause the one-leg to fall down.

Prior to commencing the project, you need to prepare the Mujoco environment and install the D4RL datasets. To establish the Mujoco task environment, you may refer to the guidelines or just simply run:

pip install -r requirements.txt

To use the D4RL dataset, you need to setup the D4RL environment by

git clone https://github.com/rail-berkeley/d4rl.git
cd d4rl
pip install -e .

Pls use train_offline.py

Have fun!

Recommend Projects

fragile-azalea / dsaa5009rl Goto Github PK

dsaa5009rl's Introduction

DSAA5009: Offline Reinforcement Learining project.

dsaa5009rl's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent