ic3net / ic3net

Code for ICLR 2019 paper: Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks

Home Page: https://arxiv.org/abs/1812.09755

License: MIT License

Python 100.00%

communication multiagent multiagent-systems pytorch reinforcement-learning

ic3net's Introduction

IC3Net

This repository contains the reference implementation for the IC3Net paper (accepted to ICLR 2019), Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks, available at https://arxiv.org/abs/1812.09755

Cite

If you use this code or IC3Net in your work, please cite the following:

@article{singh2018learning,
  title={Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks},
  author={Singh, Amanpreet and Jain, Tushar and Sukhbaatar, Sainbayar},
  journal={arXiv preprint arXiv:1812.09755},
  year={2018}
}

Standalone environment version

Find gym-starcraft at this repository: apsdehal/gym-starcraft
Find ic3net-envs at this repository: apsdehal/ic3net-envs

Installation

First, clone the repo and install ic3net-envs, which contains the implementations of Predator-Prey and Traffic-Junction:

git clone https://github.com/IC3Net/IC3Net
cd IC3Net/ic3net-envs
python setup.py develop

Optional: If you want to run experiments on StarCraft, install the gym-starcraft package included in this repository. Follow the instructions provided in the README inside that package.

Next, install the dependencies for IC3Net, including PyTorch, by running:

pip install -r requirements.txt

Running

Once everything is installed, we can run the experiments using these example commands.

Note: We performed our experiments with nprocesses set to 16. You can change it according to your machine, but the plots may vary.

Note: Use OMP_NUM_THREADS=1 to limit the number of threads spawned
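
As a minimal sketch, the two notes above could be applied from a small Python launcher instead of the shell. `build_launch` is a hypothetical helper and not part of the repository; it only assumes the `main.py` entry point and the `--nprocesses` flag shown in the commands in this README.

```python
import os
import sys


def build_launch(nprocesses=16, extra_args=()):
    """Return (command, environment) for launching main.py.

    Hypothetical helper: it pins OMP_NUM_THREADS to 1 in a copy of the
    current environment and forwards any extra command-line arguments.
    """
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = "1"  # limit the threads spawned per process
    command = [sys.executable, "main.py", "--nprocesses", str(nprocesses)]
    command.extend(extra_args)
    return command, env


command, env = build_launch(nprocesses=4, extra_args=["--env_name", "predator_prey"])
print(command[1:], env["OMP_NUM_THREADS"])
```

The returned pair can be handed to `subprocess.run(command, env=env, check=True)` from the repository root.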

Predator-Prey

IC3Net on easy version

python main.py --env_name predator_prey --nagents 3 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 5 --max_steps 20 --ic3net --vision 0 --recurrent

CommNet on easy version

python main.py --env_name predator_prey --nagents 3 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 5 --max_steps 20 --commnet --vision 0 --recurrent

IC on easy version

python main.py --env_name predator_prey --nagents 3 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 5 --max_steps 20 --vision 0 --recurrent

IRIC on easy version

python main.py --env_name predator_prey --nagents 3 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 5 --max_steps 20 --mean_ratio 0 --vision 0 --recurrent

For medium version, change the following arguments:

nagents to 5
max_steps to 40
vision to 1
dim to 10

For hard version, change the following arguments:

nagents to 10
max_steps to 80
vision to 1
dim to 20
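
The difficulty changes above can be kept in one place with a small command builder. This is only a sketch: `PREDATOR_PREY_PRESETS`, `MODEL_FLAGS` and `predator_prey_command` are hypothetical names, while the flag values are copied from the easy commands and the medium and hard lists in this section.

```python
# Per-difficulty values taken from the easy commands and the lists above.
PREDATOR_PREY_PRESETS = {
    "easy": {"nagents": 3, "max_steps": 20, "vision": 0, "dim": 5},
    "medium": {"nagents": 5, "max_steps": 40, "vision": 1, "dim": 10},
    "hard": {"nagents": 10, "max_steps": 80, "vision": 1, "dim": 20},
}

# Flags that distinguish the four models in the example commands.
MODEL_FLAGS = {
    "ic3net": ["--ic3net"],
    "commnet": ["--commnet"],
    "ic": [],
    "iric": ["--mean_ratio", "0"],
}


def predator_prey_command(model="ic3net", difficulty="easy", nprocesses=16):
    """Build the command line for one model and one difficulty level."""
    preset = PREDATOR_PREY_PRESETS[difficulty]
    args = [
        "python", "main.py", "--env_name", "predator_prey",
        "--nagents", str(preset["nagents"]),
        "--nprocesses", str(nprocesses),
        "--num_epochs", "2000", "--hid_size", "128",
        "--detach_gap", "10", "--lrate", "0.001",
        "--dim", str(preset["dim"]),
        "--max_steps", str(preset["max_steps"]),
        *MODEL_FLAGS[model],
        "--vision", str(preset["vision"]),
        "--recurrent",
    ]
    return " ".join(args)


print(predator_prey_command("ic3net", "hard"))
```

With the default arguments the function reproduces the easy IC3Net command shown earlier in this section.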

Traffic Junction

IC3Net on easy version

python main.py --env_name traffic_junction --nagents 5 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 6 --max_steps 20 --ic3net --vision 0 --recurrent --add_rate_min 0.1 --add_rate_max 0.3 --curr_start 250 --curr_end 1250 --difficulty easy

CommNet on easy version

python main.py --env_name traffic_junction --nagents 5 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 6 --max_steps 20 --commnet --vision 0 --recurrent --add_rate_min 0.1 --add_rate_max 0.3 --curr_start 250 --curr_end 1250 --difficulty easy

IC on easy version

python main.py --env_name traffic_junction --nagents 5 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 6 --max_steps 20 --vision 0 --recurrent --add_rate_min 0.1 --add_rate_max 0.3 --curr_start 250 --curr_end 1250 --difficulty easy

IRIC on easy version

python main.py --env_name traffic_junction --nagents 5 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 6 --max_steps 20 --mean_ratio 0 --vision 0 --recurrent --add_rate_min 0.1 --add_rate_max 0.3 --curr_start 250 --curr_end 1250 --difficulty easy

For medium version, change the following arguments:

nagents to 10
max_steps to 40
dim to 14
add_rate_min to 0.05
add_rate_max to 0.2
difficulty to medium

For hard version, change the following arguments:

nagents to 20
max_steps to 80
dim to 18
add_rate_min to 0.02
add_rate_max to 0.05
difficulty to hard
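
One way to apply these overrides without editing commands by hand is to merge them into the easy settings and check that the arrival-rate range is valid. The names `EASY`, `OVERRIDES`, `traffic_junction_settings` and `as_flags` are hypothetical. The medium `add_rate_max` is taken as 0.2 here, the value used in the medium-difficulty IC3Net commands quoted in the issues further down; this is an assumption, chosen so that the minimum rate does not exceed the maximum.

```python
# Settings of the easy commands above that change with difficulty.
EASY = {
    "nagents": 5,
    "max_steps": 20,
    "dim": 6,
    "add_rate_min": 0.1,
    "add_rate_max": 0.3,
    "difficulty": "easy",
}

OVERRIDES = {
    "easy": {},
    "medium": {
        "nagents": 10,
        "max_steps": 40,
        "dim": 14,
        "add_rate_min": 0.05,
        "add_rate_max": 0.2,  # assumption, see the note above
        "difficulty": "medium",
    },
    "hard": {
        "nagents": 20,
        "max_steps": 80,
        "dim": 18,
        "add_rate_min": 0.02,
        "add_rate_max": 0.05,
        "difficulty": "hard",
    },
}


def traffic_junction_settings(difficulty):
    """Merge the overrides for one difficulty into the easy settings."""
    settings = {**EASY, **OVERRIDES[difficulty]}
    if settings["add_rate_min"] > settings["add_rate_max"]:
        raise ValueError("add_rate_min must not exceed add_rate_max")
    return settings


def as_flags(settings):
    """Turn a settings dict into a flat list of command-line flags."""
    flags = []
    for name, value in settings.items():
        flags.extend([f"--{name}", str(value)])
    return flags


print(" ".join(as_flags(traffic_junction_settings("hard"))))
```

The remaining flags (`--nprocesses`, `--num_epochs`, `--curr_start`, and so on) stay as in the easy commands.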

StarCraft

Make sure gym-starcraft is properly installed and configured.

For the explore task (50x50, 10 Medic), see the examples below. Replace the torchcraft_dir argument with the location of your TorchCraft directory.

IC3Net

python -u main.py --env_name starcraft --task_type explore --nagents 10 --num_epochs 1000 --hid_size 128 --lrate 0.002 --max_steps 60 --nprocesses 16 --torchcraft_dir=~/Public/TorchCraft --frame_skip 8 --nenemies 1 --our_unit_type 34 --enemy_unit_type 34 --init_range_end 150 --ic3net --recurrent --rnn_type LSTM --detach_gap 10 --stay_near_enemy --explore_vision 10 --step_size 16

CommNet

python -u main.py --env_name starcraft --task_type explore --nagents 10 --num_epochs 1000 --hid_size 128 --lrate 0.002 --max_steps 60 --nprocesses 16 --torchcraft_dir=~/Public/TorchCraft --frame_skip 8 --nenemies 1 --our_unit_type 34 --enemy_unit_type 34 --init_range_end 150 --commnet --recurrent --rnn_type LSTM --detach_gap 10 --stay_near_enemy --explore_vision 10 --step_size 16

IRIC

python -u main.py --env_name starcraft --task_type explore --nagents 10 --num_epochs 1000 --hid_size 128 --lrate 0.002 --max_steps 60 --nprocesses 16 --torchcraft_dir=~/Public/TorchCraft --frame_skip 8 --nenemies 1 --our_unit_type 34 --enemy_unit_type 34 --init_range_end 150 --mean_ratio 0 --recurrent --rnn_type LSTM --detach_gap 10 --stay_near_enemy --explore_vision 10 --step_size 16

IC

python -u main.py --env_name starcraft --task_type explore --nagents 10 --num_epochs 1000 --hid_size 128 --lrate 0.002 --max_steps 60 --nprocesses 16 --torchcraft_dir=~/Public/TorchCraft --frame_skip 8 --nenemies 1 --our_unit_type 34 --enemy_unit_type 34 --init_range_end 150 --recurrent --rnn_type LSTM --detach_gap 10 --stay_near_enemy --explore_vision 10 --step_size 16

For 75x75, set --init_range_end to 175.

For Combat version:

IC3Net

python -u main.py --env_name starcraft --task_type combat --nagents 10 --num_epochs 1000 --hid_size 128 --lrate 0.002 --max_steps 60 --nprocesses 16 --torchcraft_dir=~/Public/TorchCraft --frame_skip 8 --nenemies 3 --our_unit_type 0 --enemy_unit_type 65 --init_range_end 150 --ic3net --recurrent --rnn_type LSTM --detach_gap 10 --explore_vision 10 --step_size 16

CommNet

python -u main.py --env_name starcraft --task_type combat --nagents 10 --num_epochs 1000 --hid_size 128 --lrate 0.002 --max_steps 60 --nprocesses 16 --torchcraft_dir=~/Public/TorchCraft --frame_skip 8 --nenemies 3 --our_unit_type 0 --enemy_unit_type 65 --init_range_end 150 --commnet --recurrent --rnn_type LSTM --detach_gap 10 --explore_vision 10 --step_size 16

IRIC

python -u main.py --env_name starcraft --task_type combat --nagents 10 --num_epochs 1000 --hid_size 128 --lrate 0.002 --max_steps 60 --nprocesses 16 --torchcraft_dir=~/Public/TorchCraft --frame_skip 8 --nenemies 3 --our_unit_type 0 --enemy_unit_type 65 --init_range_end 150 --mean_ratio 0 --recurrent --rnn_type LSTM --detach_gap 10 --explore_vision 10 --step_size 16

IC

python -u main.py --env_name starcraft --task_type combat --nagents 10 --num_epochs 1000 --hid_size 128 --lrate 0.002 --max_steps 60 --nprocesses 16 --torchcraft_dir=~/Public/TorchCraft --frame_skip 8 --nenemies 3 --our_unit_type 0 --enemy_unit_type 65 --init_range_end 150 --recurrent --rnn_type LSTM --detach_gap 10 --explore_vision 10 --step_size 16

Contributors

Amanpreet Singh (@apsdehal)
Tushar Jain (@tshrjn)
Sainbayar Sukhbaatar (@tesatory)

License

The code is available under the MIT license.

ic3net's Issues

Starcraft: "Error: file_reader: failed to open ./Patch_rt.mpq for reading"

Hi, I hope you can help with this error. I'm consistently getting a message
Error: file_reader: failed to open ./Patch_rt.mpq for reading
when trying to run with Starcraft. I have successfully installed it and everything. The error appears to happen within StarCraftBaseEnv during init, and specifically when it returns when final_init=False.

It continues running but stays stuck in a loop reading lines from TorchCraft.

I have grep'd and find'd for this file in the top-level IC3Net directory and found nothing. Do you know where this file is supposed to be?

Thanks in advance! I'd like to extend your method and need to run this code.

A small question of implementation

Thank you for sharing this code. I have a small question: why are prev_hid[0] and prev_hid[1] detached from the computation graph in trainer.py?

if (t + 1) % self.args.detach_gap == 0:
    if self.args.rnn_type == 'LSTM':
        prev_hid = (prev_hid[0].detach(), prev_hid[1].detach())
    else:
        prev_hid = prev_hid.detach()

Looking forward to your reply.
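
For context, detaching the hidden state every few steps is the usual way to implement truncated backpropagation through time: gradients stop at the detach point, so each backward pass spans at most `detach_gap` steps. Whether that is the authors' motivation is not stated here. The pure-Python sketch below only shows at which time steps the condition in the snippet above fires; `detach_steps` is a hypothetical helper, not part of the repository.

```python
def detach_steps(max_steps, detach_gap):
    """Return the time steps t for which (t + 1) % detach_gap == 0."""
    return [t for t in range(max_steps) if (t + 1) % detach_gap == 0]


# With the easy Predator-Prey settings (--max_steps 20 --detach_gap 10)
# the hidden state would be detached after steps 9 and 19.
print(detach_steps(20, 10))  # [9, 19]
```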

Display trained models

Is it possible to display, for example, predator-prey after training and record a video? I saw information about the display flag (during training), but I am interested in the results after training.

Replicate paper results on Traffic Junction env

Hi,

I am trying to replicate one of the experiments shown in the CommNet article:

"Impressively, with zero visibility (the cars are driving blind) the CommNet model is still able to succeed 90% of the time."

I am trying to run the following:
python3 main.py --env_name traffic_junction --nagents 10 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 6 --max_steps 40 --commnet --vision 0 --add_rate_min 0.05 --add_rate_max 0.02 --curr_start 250 --curr_end 1250 --difficulty medium

I can't run the code with 16 CPUs (only with 1 CPU). You mentioned in another issue that the results of the paper can only be replicated with at least 16 CPUs, but to what extent?

So I then tried running the following:
python3 main.py --env_name traffic_junction --nagents 10 --nprocesses 1 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 6 --max_steps 40 --commnet --vision 0 --add_rate_min 0.05 --add_rate_max 0.02 --curr_start 250 --curr_end 1250 --difficulty medium

I get this model summary which seems to make sense:

(heads): ModuleList((0): Linear(in_features=128, out_features=2, bias=True))
(encoder): Linear(in_features=29, out_features=128, bias=True)
(f_modules): ModuleList((0): Linear(in_features=128, out_features=128, bias=True))
(C_modules): ModuleList((0): Linear(in_features=128, out_features=128, bias=True))
(tanh): Tanh()
(value_head): Linear(in_features=128, out_features=1, bias=True))

But the model performs very poorly (success rate varies between 0.6 and 0.7)

Also, I used the environment with RLlib and I get roughly the same score.

Thanks for any help figuring this out.

Can I only run on one CPU?

I'm very sorry to trouble you. When I set --nprocesses to more than one, the code does not run, but --nprocesses 1 works. Are the results similar for different values of nprocesses?

One issue when running your code

Hi,

When I run your code, an error occurs (attached below).

Cannot run the code

Hi, I'm sorry to trouble you.
I am trying to run the code on Traffic Junction with the default settings, with IC3Net and the other models.
The other models run, but IC3Net, your proposed model, fails with the following error:
Traceback (most recent call last):
  File "/home/anaconda3/envs/ic3net/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/multi_processing.py", line 27, in run
    batch, stat = self.trainer.run_batch(epoch)
  File "/home/trainer.py", line 234, in run_batch
    episode, episode_stat = self.get_episode(epoch)
  File "/home//trainer.py", line 54, in get_episode
    action_out, value, prev_hid = self.policy_net(x, info)
  File "/home/envs/ic3net/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/IC3Net/comm.py", line 175, in forward
    agent_mask *= comm_action_mask.double()
RuntimeError: unsupported operation: more than one element of the written-to tensor refers to a single memory location. Please clone() the tensor before performing the operation

I guess hard_attention causes the problem, but I am not sure.
Can you fix it, or explain how to run IC3Net with the default settings? Thanks.
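
A possible workaround, following the hint in the error message, is to avoid writing in place to a tensor whose elements share memory. The sketch below assumes that `agent_mask` is an expanded view (for example, produced by `expand()`); the traceback does not confirm this. `apply_comm_gate` is a hypothetical helper, and the change has not been tested against the repository.

```python
import torch


def apply_comm_gate(agent_mask, comm_action_mask):
    """Multiply out of place so an expanded (memory-sharing) mask is never written to."""
    return agent_mask * comm_action_mask.double()


n = 3
# expand() returns a view whose rows all alias the same memory,
# which is what makes an in-place `*=` on it fail in recent PyTorch.
agent_mask = torch.ones(1, n, dtype=torch.double).expand(n, n)
comm_action_mask = torch.tensor([1.0, 0.0, 1.0]).expand(n, n)

gated = apply_comm_gate(agent_mask, comm_action_mask)
print(gated)
```

Under that assumption, the analogous change at the failing line would be `agent_mask = agent_mask * comm_action_mask.double()`, or calling `agent_mask.clone()` before the in-place multiplication.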

About the result in Traffic Junction

Hi,

I'm very sorry to trouble you. I am trying to run the code on Traffic Junction (easy version) with the arguments given in the README, but the result does not look right: the final success rate is about 23%. I don't know where the error is, so I would like to ask for your help. Thanks!

'PredatorPreyEnv' object has no attribute 'stdscr'

When I try to render PredatorPreyEnv, I see the error message

'PredatorPreyEnv' object has no attribute 'stdscr'

I have gym==0.9.6
Do you have any idea how to fix this error?

Cannot reproduce your experiment results?

Hi professor :) Thank you for your great contribution to the MARL area. Your gating mechanism plays an important role in communication and also improves communication efficiency.
I have run into a small problem. When I run
python main.py --env_name predator_prey --nagents 3 --nprocesses 16 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.001 --dim 5 --max_steps 20 --ic3net --vision 0 --recurrent
in the PP env in easy and hard mode, the average steps taken are 15 and 75 respectively, which differ from your results of 8.9 and 52.4.
I can confirm that I have not changed any code. Could you kindly tell me why this happens?

After running your code, I got poor results. I want to ask what I did wrong?

I set the same seed for numpy and pytorch.
torch.manual_seed(args.seed)
np.random.seed(args.seed)

The results I got are as follows:

python main.py --env_name traffic_junction --nagents 10 --nprocesses 1 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.003 --dim 14 --max_steps 40 --ic3net --recurrent --vision 0 --add_rate_min 0.05 --add_rate_max 0.2 --curr_start 250 --curr_end 1250 --difficulty medium --seed 7715

python main.py --env_name traffic_junction --nagents 10 --nprocesses 1 --num_epochs 2000 --hid_size 128 --detach_gap 10 --lrate 0.003 --dim 14 --max_steps 40 --ic3net --recurrent --vision 3 --add_rate_min 0.05 --add_rate_max 0.2 --curr_start 250 --curr_end 1250 --difficulty medium --seed 1314
