Chapter 16, Robot Learning in Simulation, of the book Deep Reinforcement Learning: an example of a Sawyer robot learning to reach a target with a parallelized Soft Actor-Critic (SAC) algorithm, using PyRep for Sawyer robot simulation and game building. The environment is wrapped in the OpenAI Gym format.

Home Page: https://deep-reinforcement-learning-book.github.io

Chapter 16: Robot Learning in Simulation (Project 4)

Description:

An example of a Sawyer robot learning to reach a target with a parallelized Soft Actor-Critic (SAC) algorithm, using PyRep for Sawyer robot simulation and game building. The environment is wrapped in the OpenAI Gym format.
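
To make "wrapped in the OpenAI Gym format" concrete, here is a minimal sketch of the classic Gym-style interface, where reset() returns an observation and step(action) returns (observation, reward, done, info). ToyReachEnv, its observation layout, and its negative-distance reward are hypothetical stand-ins for illustration only. They are not the repository's environment, which depends on PyRep and V-REP.

```python
import numpy as np


class ToyReachEnv:
    """Hypothetical Gym-style reaching task (NOT the repository's environment)."""

    def __init__(self, max_steps=50, tolerance=0.05, seed=0):
        self.max_steps = max_steps      # episode length limit
        self.tolerance = tolerance      # distance that counts as "reached"
        self.rng = np.random.default_rng(seed)
        self.tip = np.zeros(3)          # assumed end-effector position
        self.target = np.zeros(3)       # assumed target position
        self.steps = 0

    def _obs(self):
        # Assumed observation layout: tip position followed by target position.
        return np.concatenate([self.tip, self.target])

    def reset(self):
        """Start a new episode and return the first observation."""
        self.tip = np.zeros(3)
        self.target = self.rng.uniform(-0.5, 0.5, size=3)
        self.steps = 0
        return self._obs()

    def step(self, action):
        """Apply a clipped displacement and return (obs, reward, done, info)."""
        action = np.clip(np.asarray(action, dtype=float), -0.1, 0.1)
        self.tip = self.tip + action
        self.steps += 1
        distance = float(np.linalg.norm(self.tip - self.target))
        reward = -distance              # assumed dense reward, for illustration
        done = distance < self.tolerance or self.steps >= self.max_steps
        return self._obs(), reward, done, {"distance": distance}


def random_rollout(env, seed=0):
    """Run one episode with uniformly random actions and return its total reward."""
    rng = np.random.default_rng(seed)
    env.reset()
    total, done = 0.0, False
    while not done:
        _, reward, done, _ = env.step(rng.uniform(-0.1, 0.1, size=3))
        total += reward
    return total
```

Calling random_rollout(ToyReachEnv()) runs one random episode, which is the same kind of smoke test as watching the simulated arm move randomly before starting training.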

Dependencies:

Note:

  • V-REP was renamed CoppeliaSim from version 4.0.0 onward, and the newer versions may be incompatible with PyRep as used in this project, so we suggest using V-REP 3.6.2 together with the PyRep version maintained in our repository.
  • PyRep has an official repository, but we maintain a stable version in our repository that supports V-REP 3.6.2; please use the version we provide to avoid unnecessary incompatibilities.

Contents:

  • arms/: object models of arms;
  • hands/: object models of grippers;
  • objects/: models of other objects in the scene;
  • scenes/: built scenes for Sawyer robot grasping;
  • figures/: figures for displaying;
  • model/: the model after training, and two pre-trained models with different reward functions;
  • data/: reward logs for the different reward functions;
  • sawyer_grasp_env_boundingbox.py: script of Sawyer robot grasping environment;
  • sac_learn.py: parallelized Soft Actor-Critic algorithm for solving the Sawyer robot grasping task;
  • reward_log.npy: log of episode reward during training;
  • plot.ipynb: displaying the learning curves.

Usage:

  1. First, check that the environment runs successfully:

    $ python sawyer_grasp_env_boundingbox.py

    If it works properly, V-REP is launched and runs a scene in which the Sawyer robot arm moves randomly; go on to the next step. Otherwise, check the dependencies for the required packages and versions.

  2. Run $ python sac_learn.py --train to train the policy.

  3. Run $ python sac_learn.py --test to test the trained policy. Remember to change trained_model_path, which defaults to the trained model we provide.

  4. Training writes a reward_log.npy file that records the episode reward during training. To display it, run $ jupyter notebook in a new terminal, open plot.ipynb, and press Shift+Enter to run the first cell.
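
Outside the notebook, the reward log can also be inspected with a few lines of NumPy. The sketch below assumes reward_log.npy holds a plain numeric array with one reward per episode; that layout is inferred from the file description above and has not been checked against the file itself. The moving_average helper and the window size of 10 are illustrative choices, not part of the repository.

```python
import os

import numpy as np


def moving_average(values, window=10):
    """Smooth a 1-D sequence with a simple moving average (illustrative helper)."""
    values = np.asarray(values, dtype=float).ravel()
    if values.size < window:
        return values                   # too short to smooth; return unchanged
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")


# Only attempt to read the log if a training run has already produced it.
if os.path.exists("reward_log.npy"):
    rewards = np.load("reward_log.npy")     # assumed: numeric array of episode rewards
    smoothed = moving_average(rewards, window=10)
    if smoothed.size:
        print(f"{rewards.size} entries logged; last smoothed reward: {smoothed[-1]:.3f}")
```

A smoothed curve makes the trend easier to read than raw per-episode rewards, which tend to be noisy in reinforcement learning.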

Authors:

Zihan Ding, Yanhua Huang

Citing:

@misc{DeepReinforcementLearning-Chapter16-RobotLearninginSimulation,
  author = {Zihan Ding and Yanhua Huang},
  title = {Chapter16-RobotLearninginSimulation},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/deep-reinforcement-learning-book/Chapter16-Robot-Learning-in-Simulation}},
}

or

@book{deepRL-2020,
 title={Deep Reinforcement Learning: Fundamentals, Research, and Applications},
 editor={Hao Dong and Zihan Ding and Shanghang Zhang},
 author={Hao Dong and Zihan Ding and Shanghang Zhang and Hang Yuan and Hongming Zhang and Jingqing Zhang and Yanhua Huang and Tianyang Yu and Huaqing Zhang and Ruitong Huang},
 publisher={Springer Nature},
 note={\url{http://www.deepreinforcementlearningbook.org}},
 year={2020}
}

chapter16-robot-learning-in-simulation's Issues

About comparison with different reward functions

I noticed that the picture "Comparison with different reward functions" didn't change.
So does the code only provide the augmented reward?
What is the difference between the sparse reward, the dense reward and the augmented reward?
Looking forward to your reply. 😄

ImportError: libcoppeliaSim.so.1: cannot open shared object file: No such file or directory

Encounter 'Building wheel for cffi (setup.py) ... error' when trying to install the python library using pip3 install -r requirements.txt on Ubuntu 18.04

    Requirement already satisfied: numpy in /FILEPATH/anaconda3/lib/python3.9/site-packages (from -r requirements.txt (line 1)) (1.20.3)
    Collecting cffi==1.11.5
    Using cached cffi-1.11.5.tar.gz (438 kB)
    Requirement already satisfied: pycparser in /FILEPATH/anaconda3/lib/python3.9/site-packages (from cffi==1.11.5->-r requirements.txt (line 2)) (2.20)
    Building wheels for collected packages: cffi
    Building wheel for cffi (setup.py) ... error
    ...............
    Running setup.py clean for cffi
    Failed to build cffi
    Installing collected packages: cffi
    Attempting uninstall: cffi
    Found existing installation: cffi 1.14.2
    Uninstalling cffi-1.14.2:
    Successfully uninstalled cffi-1.14.2
    Running setup.py install for cffi ... error

I tried to install cffi separately, and it turns out that both cffi 1.11.5 and 1.14.2 can be installed successfully.

Error: "Gripper position is nan"

Hello,
I downloaded the demo for SAC and I'm trying to train from scratch.

When I set max_episode to 20, the demo works. But when I set max_episode to 1000 or more, I always get the error "Gripper position is nan". I don't know why this error always appears.

Any advice?

Thanks in advance.
Jian
