drlnd-multiagent-project's Introduction

Deep Reinforcement Learning - Collaboration and Competition Project

This project implements the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) reinforcement learning algorithm in a Jupyter notebook, as a solution to the "Collaboration and Competition" project of the Udacity Deep Reinforcement Learning Nanodegree program.

By Sebastian Castro, 2020


Project Introduction

This project uses a version of the Tennis environment in Unity ML-Agents.

This environment consists of two tennis players, or agents, each of which has its own local set of observations, actions, and rewards. The specifics are discussed below, but the environment is structured such that a "good" game is an indefinitely long rally in which both players keep hitting the ball back to each other and neither one scores.

(Figure: trained agents playing Tennis)

The reinforcement learning specifics for each agent are:

  • State: 24 variables (8 observations stacked over 3 consecutive time steps) corresponding to the position and velocity of the ball and racket.
  • Actions: A vector with 2 elements -- one for moving towards/away from the net and another for jumping. Both are continuous variables between -1.0 and 1.0.
  • Reward: The agent receives a reward of +0.1 each time it hits the ball over the net, and -0.01 if it lets the ball hit the ground or go out of bounds. This is what incentivizes the agents to keep the rally going rather than score, unlike a typical game of tennis.
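The layout of the state and action vectors described above can be illustrated with a minimal sketch. It is not taken from the notebook: the Unity environment produces the stacked observation itself, so the helper names (`make_stacker`, `stacked_state`, `clip_action`), the zero-filled initial frames, and the oldest-first ordering are all assumptions made purely for illustration.

```python
from collections import deque

OBS_SIZE = 8      # observations per time step (from the state description)
STACK_SIZE = 3    # number of consecutive time steps stacked together
ACTION_SIZE = 2   # [move towards/away from the net, jump]


def make_stacker(obs_size=OBS_SIZE, stack_size=STACK_SIZE):
    """Hypothetical helper: a fixed-length buffer pre-filled with zero frames."""
    return deque([[0.0] * obs_size for _ in range(stack_size)], maxlen=stack_size)


def stacked_state(frames):
    """Flatten the buffered frames (oldest first, an assumption) into one vector."""
    return [value for frame in frames for value in frame]


def clip_action(action, low=-1.0, high=1.0):
    """Clamp each action component to the valid continuous range."""
    return [max(low, min(high, a)) for a in action]


frames = make_stacker()
frames.append([0.1] * OBS_SIZE)      # made-up newest observation
state = stacked_state(frames)        # 3 frames x 8 values = 24 values
action = clip_action([1.7, -0.3])    # out-of-range value is clamped to 1.0
print(len(state), action)            # prints: 24 [1.0, -0.3]
```

Clamping matters in practice because exploration noise added to a deterministic policy output can push an action outside the [-1.0, 1.0] range.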

As per the project specification, each episode is scored by the larger of the two agents' returns, and the problem is considered "solved" once that score averages more than 0.5 over 100 consecutive episodes.
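As a rough sketch of how this criterion could be checked, the snippet below takes the maximum of the two agents' returns for each episode and tracks a 100-episode moving average. The function names and the sample returns are hypothetical, and each return is assumed to be the undiscounted sum of an agent's rewards over the episode; the notebook may compute this differently.

```python
from collections import deque


def episode_score(agent_returns):
    """Score for one episode: the larger of the two agents' returns."""
    return max(agent_returns)


def first_solved_episode(returns_per_episode, target=0.5, window=100):
    """Return the 1-based episode at which the moving average of episode
    scores over the last `window` episodes first exceeds `target`,
    or None if that never happens."""
    recent = deque(maxlen=window)
    for episode, agent_returns in enumerate(returns_per_episode, start=1):
        recent.append(episode_score(agent_returns))
        if len(recent) == window and sum(recent) / window > target:
            return episode
    return None


# Made-up data: 50 episodes with no hits, then 100 episodes where one agent
# reaches a return of 1.0.
history = [(0.0, 0.0)] * 50 + [(1.0, 0.2)] * 100
print(first_solved_episode(history))  # prints: 101
```

In this made-up history the average over episodes 1 to 100 is exactly 0.5, which does not exceed the target, so the criterion is first met at episode 101.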

For more details about the MADDPG agent implementation, the network and training hyperparameters, and the results, refer to the Report included in this repository.


Getting Started

To get started with this project, first perform the setup steps in the Udacity Deep Reinforcement Learning Nanodegree Program GitHub repository. Namely, you should:

  1. Install Conda and create a Python 3.6 virtual environment
  2. Install OpenAI Gym
  3. Clone the Udacity repo and install the Python requirements included
  4. Download the Tennis Unity files appropriate for your operating system and architecture (Linux, Mac OSX, Win32, Win64)

Once you have performed this setup, you should be ready to run the tennis_maddpg.ipynb Jupyter Notebook in this repo. This notebook contains all the steps needed to define and train MADDPG agents to solve this environment.
