Code Monkey home page Code Monkey logo

maddpg-pytorch's Introduction

MADDPG-PyTorch

PyTorch Implementation of MADDPG from Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments (Lowe et. al. 2017)

Requirements

The versions are just what I used and not necessarily strict requirements.

How to Run

All training code is contained within main.py. To view options simply run:

python main.py --help

Results

Physical Deception

In this task, the two blue agents are rewarded by minimizing the closest of their distances to the green landmark (only one needs to be close to get optimal reward), while maximizing the distance of the red adversary from the green landmark. The red adversary is rewarded by minimizing it's distance to the green landmark; however, on any given trial, it does not know which landmark is green, so it must follow the blue agents. As such, the blue agents should learn to deceive the red agent by covering both landmarks.

Cooperative Communication

This task involves two agents, one that is stationary and one that can move. The stationary agent sees the color of the other agent as its observation, and outputs a one-hot communication vector as its action. The moving agent receives the communication vector, as well as its relative distance to all landmarks on the screen; however, it does not know its own color. The goal of both agents is for the moving agent to reach the landmark that matches its own color. Thus, the agents must learn to communicate such that the moving agent knows where to go on each randomized trial.

Predator-Prey

This task involves a single prey agent (in green) and a team of three predators (in red). The prey agent is 30% faster than the predators, so the predators must learn how to team up in order to catch the prey.

In the trials below, the prey agent uses DDPG as its learning algorithm.

Not Implemented

There are a few items from the paper that have not been implemented in this repo

  • Ensemble Training
  • Inferring other agents' policies
  • Mixed continuous/discrete action spaces

Acknowledgements

The OpenAI baselines Tensorflow implementation and Ilya Kostrikov's Pytorch implementation of DDPG were used as references. After the majority of this codebase was complete, OpenAI released their code for MADDPG, and I made some tweaks to this repo to reflect some of the details in their implementation (e.g. gradient norm clipping and policy regularization).

maddpg-pytorch's People

Contributors

shariqiqbal2810 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.