collab-compete_ddpg's Introduction

Collaboration and Competition

Introduction

This project works with the Tennis environment.

In this environment, two agents control rackets to bounce a ball over a net. If an agent hits the ball over the net, it receives a reward of +0.1. If an agent lets a ball hit the ground or hits the ball out of bounds, it receives a reward of -0.01. Thus, the goal of each agent is to keep the ball in play.

The observation space consists of 8 variables corresponding to the position and velocity of the ball and racket. Each agent receives its own, local observation. Two continuous actions are available, corresponding to movement toward (or away from) the net, and jumping.

The task is episodic, and in order to solve the environment, the two agents must get an average score of +0.5 (over 100 consecutive episodes, after taking the maximum over both agents). Specifically,

  • After each episode, we add up the rewards that each agent received (without discounting), to get a score for each agent. This yields 2 (potentially different) scores. We then take the maximum of these 2 scores.
  • This yields a single score for each episode.

The environment is considered solved when the average of those scores over 100 consecutive episodes is at least +0.5.
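The scoring rule above can be sketched in a few lines of plain Python. This is only an illustration of the rule as described, not code taken from Report.ipynb; the names `episode_score` and `SolveTracker` are hypothetical.

```python
from collections import deque


def episode_score(agent_rewards):
    """Return the episode score: the larger of the agents' undiscounted reward sums.

    agent_rewards holds one list of per-step rewards for each agent.
    """
    return max(sum(rewards) for rewards in agent_rewards)


class SolveTracker:
    """Keeps the most recent episode scores and checks the solve criterion."""

    def __init__(self, window=100, target=0.5):
        self.scores = deque(maxlen=window)  # old scores drop out automatically
        self.target = target

    def add(self, score):
        self.scores.append(score)

    @property
    def average(self):
        return sum(self.scores) / len(self.scores) if self.scores else 0.0

    def solved(self):
        # Only a full window counts, matching "over 100 consecutive episodes".
        return len(self.scores) == self.scores.maxlen and self.average >= self.target
```

Calling `tracker.add(episode_score(rewards))` after each episode and checking `tracker.solved()` is enough to decide when training can stop.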

INFO:unityagents:
'Academy' started successfully!
Unity Academy name: Academy
        Number of Brains: 1
        Number of External Brains : 1
        Lesson number : 0
        Reset Parameters :
		
Unity brain name: TennisBrain
        Number of Visual Observations (per agent): 0
        Vector Observation space type: continuous
        Vector Observation space size (per agent): 8
        Number of stacked Vector Observation: 3
        Vector Action space type: continuous
        Vector Action space size (per agent): 2
        Vector Action descriptions: , 
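The log reports 8 observation variables per agent and 3 stacked observations. Assuming the stacked frames are concatenated into one flat vector per agent, the sizes a network has to handle work out as in the sketch below; the constant and function names are illustrative and do not come from the project code.

```python
# Values copied from the Unity log above.
OBS_SIZE_PER_FRAME = 8  # "Vector Observation space size (per agent)"
STACKED_FRAMES = 3      # "Number of stacked Vector Observation"
ACTION_SIZE = 2         # "Vector Action space size (per agent)"
NUM_AGENTS = 2          # two rackets, one agent each

# Assumption: stacked frames are concatenated into a single vector per agent.
STATE_SIZE = OBS_SIZE_PER_FRAME * STACKED_FRAMES


def expected_shapes(num_agents=NUM_AGENTS):
    """Return the (rows, columns) shapes expected for one environment step."""
    return {
        "states": (num_agents, STATE_SIZE),
        "actions": (num_agents, ACTION_SIZE),
    }


print(expected_shapes())
```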

Installation Instruction

This section describes how to install the dependencies and download the files the project needs.

Python 3.6 is required, along with PyTorch, the ML-Agents toolkit, and a few other Python packages. The following commands install these packages:

git clone https://github.com/udacity/deep-reinforcement-learning.git  
cd deep-reinforcement-learning/python  
pip install .

Run the following command to create a drlnd kernel for IPython, so that the notebook runs in the environment where the Unity packages are installed:

python -m ipykernel install --user --name drlnd --display-name "drlnd"

PyTorch can be installed with the command recommended at https://pytorch.org/ for your OS. For example, to install the Conda package in a Windows environment with Python 3.6:

conda install pytorch -c pytorch

Getting Started

Place Report.ipynb in the folder p3_collab-compet/ together with the following two files:

  1. ddpg_agent.py - contains the DDPG agent code.
  2. model.py - contains the Actor and Critic neural network classes.

The Unity Tennis environment can be downloaded from here:

(For Windows users) Check out this link if you need help with determining if your computer is running a 32-bit version or 64-bit version of the Windows operating system.

Choose the environment that matches your machine. Unzipping the download creates a Tennis_XXX folder. For example, the 64-bit Windows environment unzips to Tennis_Windows_x86_64.

Run p3_collab-compet/Report.ipynb

Enter the path to the Unity Tennis environment in Report.ipynb. For example, for a folder containing the 64-bit Windows environment:

env = UnityEnvironment(file_name="./Tennis_Windows_x86_64/Tennis.exe")
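A wrong path is an easy mistake at this step, so it can help to check the path before handing it to Unity. The sketch below is an optional convenience, not part of the project files; `resolve_env_path` and `load_tennis_env` are hypothetical helper names, and the check assumes the path is given with its full file name, as in the example above.

```python
from pathlib import Path


def resolve_env_path(file_name):
    """Return file_name as a string if it exists, else raise a descriptive error."""
    path = Path(file_name).expanduser()
    if not path.exists():
        raise FileNotFoundError(
            f"Unity environment not found at '{path}'. "
            "Check that the Tennis_XXX folder was unzipped next to Report.ipynb."
        )
    return str(path)


def load_tennis_env(file_name):
    """Validate the path, then start the Unity environment."""
    # Imported here so the path check can be used without unityagents installed.
    from unityagents import UnityEnvironment

    return UnityEnvironment(file_name=resolve_env_path(file_name))
```

For example, `env = load_tennis_env("./Tennis_Windows_x86_64/Tennis.exe")` fails with a readable message when the folder is missing.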

Run the remaining cells of Report.ipynb in order to train the Actor-Critic DDPG agent.
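For orientation, one episode of interaction with the environment looks roughly like the sketch below. It uses the `reset`, `step`, `vector_observations`, `rewards`, and `local_done` members of the unityagents API. The `agent.act` and `agent.step` methods are assumptions about the interface in ddpg_agent.py, so the notebook's actual loop may differ.

```python
def run_episode(env, agent, brain_name, train_mode=True):
    """Play one episode and return each agent's undiscounted score."""
    env_info = env.reset(train_mode=train_mode)[brain_name]
    states = env_info.vector_observations
    scores = [0.0] * len(states)  # one running total per agent

    while True:
        actions = agent.act(states)  # assumed agent interface
        env_info = env.step(actions)[brain_name]
        next_states = env_info.vector_observations
        rewards = env_info.rewards
        dones = env_info.local_done

        # Assumed agent interface: store the transition and possibly learn.
        agent.step(states, actions, rewards, next_states, dones)

        scores = [s + r for s, r in zip(scores, rewards)]
        states = next_states
        if any(dones):
            break

    return scores
```

The episode score used for the solve criterion is then `max(run_episode(env, agent, brain_name))`.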
