ELF: An Extensive, Lightweight and Flexible Platform for Game Research

Overview

ELF is an Extensive, Lightweight and Flexible platform for game research, in particular for real-time strategy (RTS) games. On the C++ side, ELF hosts multiple games in parallel using C++ threading. On the Python side, ELF returns one batch of game states at a time, making it very friendly for modern RL. In comparison, other platforms (e.g., OpenAI Gym) wrap a single game instance in a single Python interface, which makes the concurrent game execution required by many modern reinforcement learning algorithms cumbersome.

In addition, ELF provides a pure Python version for running concurrent game environments, using Python multiprocessing with ZeroMQ inter-process communication. See ./ex_elfpy.py for a simple example.
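As a rough illustration of that pattern (a sketch only, not the actual ex_elfpy.py API), each worker process can step its own game and push states over ZeroMQ, while the main process pulls them into a batch:

# Sketch of the multiprocessing + ZeroMQ pattern; names and message format are illustrative.
import multiprocessing as mp
import zmq

def worker(game_id, endpoint):
    # Each worker runs one game instance and pushes its states to the collector.
    ctx = zmq.Context()
    sock = ctx.socket(zmq.PUSH)
    sock.connect(endpoint)
    for step in range(100):                      # stand-in for a real game loop
        state = {"game_id": game_id, "step": step, "s": [0.0] * 4}
        sock.send_pyobj(state)

if __name__ == "__main__":
    endpoint = "tcp://127.0.0.1:5555"
    ctx = zmq.Context()
    collector = ctx.socket(zmq.PULL)
    collector.bind(endpoint)
    procs = [mp.Process(target=worker, args=(i, endpoint)) for i in range(4)]
    for p in procs:
        p.start()
    batch = [collector.recv_pyobj() for _ in range(32)]   # wait for one batch of states
    print("collected a batch of", len(batch), "states")
    for p in procs:
        p.join()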

For research on RTS games, ELF comes with a fast RTS engine and three concrete environments: MiniRTS, Capture the Flag and Tower Defense. MiniRTS has all the key dynamics of a real-time strategy game, including gathering resources, building facilities and troops, scouting unknown territory outside the perceivable region, and defending against or attacking the enemy. Users can access its internal representation and freely change the game settings.

ELF has the following characteristics:

  • End-to-End: ELF offers an end-to-end solution to game research. It provides miniature real-time strategy game environments, concurrent simulation, intuitive APIs, web-based visualization, and a reinforcement learning backend powered by PyTorch with minimal resource requirements.

  • Extensive: Any game with a C/C++ interface can be plugged into this framework by writing a simple wrapper. As an example, we have already incorporated Atari games into the framework and shown that the simulation speed per core is comparable with the single-core version, and thus much faster than implementations using either multiprocessing or Python multithreading. In the future, we plan to incorporate more environments, e.g., the DarkForest Go engine.

  • Lightweight: ELF runs very fast with minimal overhead. With a simple game (MiniRTS) built on the RTS engine, ELF runs 40K frames per second per core on a MacBook Pro. Training a model from scratch to play MiniRTS takes a day on 6 CPUs and 1 GPU.

  • Flexible: Pairing between environments and actors is very flexible, e.g., one environment with one agent (e.g., vanilla A3C), one environment with multiple agents (e.g., self-play/MCTS), or multiple environments with one actor (e.g., BatchA3C, GA3C). Also, any game built on top of the RTS engine offers full access to its internal representation and dynamics. Besides efficient simulators, we also provide a lightweight yet powerful reinforcement learning framework that can host most existing RL algorithms. In this open-source release, we provide state-of-the-art actor-critic algorithms written in PyTorch.

Tutorials

See here.

Install scripts

You need cmake >= 3.8, gcc >= 4.9 and TBB (on Linux, libtbb-dev) for the install script below to run successfully.

# Download miniconda and install. 
wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O $HOME/miniconda.sh
/bin/bash $HOME/miniconda.sh -b
$HOME/miniconda3/bin/conda update -y --all python=3

# Add the following to ~/.bash_profile (if you haven't already) and source it:
export PATH=$HOME/miniconda3/bin:$PATH

# Create a new conda environment and install the necessary packages:
conda create -n elf python=3
source activate elf
# If you use cuda 8.0
# conda install pytorch cuda80 -c soumith
conda install pytorch -c soumith 

pip install --upgrade pip
pip install msgpack_numpy
conda install tqdm
conda install libgcc

# Install cmake >= 3.8, gcc >= 4.9 and libtbb-dev
# This is platform-dependent.
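# For example, on a recent Ubuntu the following is usually enough (package names
# are illustrative; the distro-provided cmake may be older than 3.8, in which case
# install a newer one from cmake.org or via conda):
# sudo apt-get install build-essential libtbb-dev
# conda install cmake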

# Clone and build the repository:
cd ~
git clone https://github.com/facebookresearch/ELF
cd ELF/rts/
mkdir build && cd build
cmake .. -DPYTHON_EXECUTABLE=$HOME/miniconda3/bin/python
make

# Train the model
cd ../..
sh ./train_minirts.sh --gpu 0

Supported Environments

Any game with a C/C++ interface can be plugged into this framework by writing a simple wrapper. Currently we have the following environments:

  1. MiniRTS and its extensions (./rts)
    A miniature real-time strategy game that captures the key dynamics of its genre, including building workers, collecting resources, exploring unseen territory, and defending against and attacking the enemy. The game runs extremely fast (40K FPS per core on a laptop) to facilitate the use of many existing on-policy reinforcement learning approaches.

  2. Atari games (./atari)
    We incorporate the Arcade Learning Environment (ALE) into ELF so that you can load any ROM and easily run 1000 concurrent game instances.

  3. Go engine (./go)
    We reimplement our DarkForest Go engine on the ELF platform. You can now easily load a set of .sgf files and train your own Go AI with minimal resource requirements (i.e., a single GPU plus a week).

Reference

When you use ELF, please cite the paper with the following BibTeX entry:

ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games
Yuandong Tian, Qucheng Gong, Wenling Shang, Yuxin Wu, C. Lawrence Zitnick
NIPS 2017

@article{tian2017elf, 
  title={ELF: An Extensive, Lightweight and Flexible Research Platform for Real-time Strategy Games},
  author={Yuandong Tian and Qucheng Gong and Wenling Shang and Yuxin Wu and C. Lawrence Zitnick},
  journal={Advances in Neural Information Processing Systems (NIPS)},
  year={2017}
}

Relevant Materials

Slides from the ICML Video Games and Machine Learning (VGML) workshop.

Demo. The top-left player is the trained bot, while the bottom-right is the rule-based bot.

Documentation

Check here for detailed documentation. You can also build your own copy of the documentation in ./doc using Sphinx.
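
Assuming the standard Sphinx layout in ./doc (a sketch; the exact build target may differ):

pip install sphinx
cd doc
make html    # built pages end up in _build/html by default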

Basic Usage

ELF is very easy to use. The initialization looks like the following:

# We run 1024 games concurrently.
num_games = 1024

# Wait for a batch of 256 games.
batchsize = 256  

# The return states contain key 's', 'r' and 'terminal'
# The reply contains key 'a' to be filled from the Python side.
# The definitions of the keys are in the wrapper of the game.  
input_spec = dict(s='', r='', terminal='')
reply_spec = dict(a='')

context = Init(num_games, batchsize, input_spec, reply_spec)

The main loop is also very simple:

# Start all game threads and enter main loop.
context.Start()  
while True:
    # Wait for a batch of game states to be ready
    # These games will be blocked, waiting for replies.
    batch = context.Wait()

    # Apply a model to the game state. The output has key 'pi'
    # You can do whatever you want here. E.g., applying your favorite RL algorithms.
    output = model(batch)

    # Sample from the output to get the actions of this batch.
    reply['a'][:] = SampleFromDistribution(output)

    # Resume games.
    context.Steps()   

# Stop all game threads.
context.Stop()  

Please check train.py and eval.py for actual runnable code.
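
In the snippet above, reply and SampleFromDistribution are placeholders (see train.py and eval.py for how replies are actually filled). A minimal sketch of the sampling step in PyTorch, assuming output['pi'] holds per-action probabilities, could look like:

import torch

def SampleFromDistribution(output):
    # output['pi']: [batchsize, num_actions] action probabilities.
    # Draw one action index per row of the batch.
    return torch.multinomial(output["pi"], num_samples=1).squeeze(1)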

Dependency

A C++ compiler with C++11 support (e.g., gcc >= 4.9) is required, along with the TBB library. CMake >= 3.8 is also required.

Python 3.x is required. In addition, you need to install the following packages: PyTorch 0.2.0+, tqdm, zmq, msgpack, msgpack_numpy.
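
For example, the Python dependencies can typically be installed with pip (the zmq module is provided by the pyzmq package; PyTorch itself is installed as in the install script above):

pip install tqdm pyzmq msgpack msgpack-numpy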

How to train

To train a model for MiniRTS, first compile ./rts/game_MC (see the instructions in ./rts/ for building with cmake). Note that compiling ./rts/backend is not necessary for training, unless you want visualization.

Then run the following commands in the current directory (you can also refer to train_minirts.sh):

game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model \ 
python3 train.py 
    --num_games 1024 --batchsize 128                                                                  # Set number of games to be 1024 and batchsize to be 128.  
    --freq_update 50                                                                                  # Update behavior policy after 50 updates of the model.
    --players "fs=50,type=AI_NN,args=backup/AI_SIMPLE|delay/0.99|start/500;fs=20,type=AI_SIMPLE"      # Specify the AI and its opponent, separated by a semicolon. `fs` is the frameskip, i.e., how often that player makes a decision (e.g., fs=20 means it acts every 20 ticks).
                                                                                                      # If `backup` is specified in `args`, a rule-based AI plays for the first `start` ticks, then the trained AI takes over. `start` decays at the given rate (0.99 here).
    --tqdm                                                                  # Show progress bar.
    --gpu 0                                                                 # Use first gpu. If you don't specify gpu, it will run on CPUs. 
    --T 20                                                                  # 20 step actor-critic
    --additional_labels id,last_terminal         
    --trainer_stats winrate                                                 # If you want to see the winrate over iterations. 
                                                                            # Note that the winrate is computed when the action is sampled from the multinomial distribution (not greedy policy). 
                                                                            # To evaluate your model more accurately, please use eval.py.

Note that a long horizon (e.g., --T 20) can make training much faster and, at the same time, more stable. With a long horizon, you should be able to train to a 70% win rate within 12 hours on 16 CPUs and 1 GPU. You can control the number of CPUs used for training with taskset -c.
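
For example, to pin training to the first 16 cores (the core range is just illustrative):

taskset -c 0-15 sh ./train_minirts.sh --gpu 0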

Here is one trained model with an 80% win rate against AI_SIMPLE at frameskip=50. Here is one game replay.

The following is a sample output during training:

Version:  bf1304010f9609b2114a1adff4aa2eb338695b9d_staged
Num Actions:  9
Num unittype:  6
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5000/5000 [01:35<00:00, 52.37it/s]
[2017-07-12 09:04:13.212017][128] Iter[0]:
Train count: 820/5000, actor count: 4180/5000
Save to ./
Filename = ./save-820.bin
Command arguments run.py --batchsize 128 --freq_update 50 --fs_opponent 20 --latest_start 500 --latest_start_decay 0.99 --num_games 1024 --opponent_type AI_SIMPLE --tqdm
0:acc_reward[4100]: avg: -0.34079, min: -0.58232[1580], max: 0.25949[185]
0:cost[4100]: avg: 2.15912, min: 1.97886[2140], max: 2.31487[1173]
0:entropy_err[4100]: avg: -2.13493, min: -2.17945[438], max: -2.04809[1467]
0:init_reward[820]: avg: -0.34093, min: -0.56980[315], max: 0.26211[37]
0:policy_err[4100]: avg: 2.16714, min: 1.98384[1520], max: 2.31068[1176]
0:predict_reward[4100]: avg: -0.33676, min: -1.36083[1588], max: 0.39551[195]
0:reward[4100]: avg: -0.01153, min: -0.13281[1109], max: 0.04688[124]
0:rms_advantage[4100]: avg: 0.15646, min: 0.02189[800], max: 0.79827[564]
0:value_err[4100]: avg: 0.01333, min: 0.00024[800], max: 0.06569[1549]

 86%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉                    | 4287/5000 [01:23<00:15, 46.97it/s]

To evaluate a model for MiniRTS, try the following command (you can also refer to eval_minirts.sh):

game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model \ 
python3 eval.py 
    --load [your model]
    --batchsize 128 
    --players "fs=50,type=AI_NN;fs=20,type=AI_SIMPLE"  
    --num_games 1024 
    --num_eval 10000
    --tqdm                          # Nice progress bar
    --gpu 0                         # Use GPU 0 as the evaluation gpu.
    --additional_labels id          # Tell the game environment to output additional dict entries.
    --greedy                        # Use greedy policy to evaluate your model. If not specified, then it will sample from the action distributions. 

Here is an example output (it takes 1 min 40 seconds to evaluate 10k games with 12 CPUs):

Version:  dc895b8ea7df8ef7f98a1a031c3224ce878d52f0_
Num Actions:  9
Num unittype:  6
Load from ./save-212808.bin
Version:  dc895b8ea7df8ef7f98a1a031c3224ce878d52f0_
Num Actions:  9
Num unittype:  6
100%|████████████████████████████████████████████████████████████████████████████████████████████| 10000/10000 [01:40<00:00, 99.94it/s]
str_acc_win_rate: Accumulated win rate: 0.735 [7295/2628/9923]
best_win_rate: 0.7351607376801297
new_record: True
count: 0
str_win_rate: [0] Win rate: 0.735 [7295/2628/9923], Best win rate: 0.735 [0]
Stop all game threads ...

SelfPlay

Try the following script if you want to do self-play in MiniRTS. It starts two bots, both initialized with the pre-trained model; one bot is trained over time while the other is held fixed. If you just want to check their win rate without training, use --actor_only.

sh ./selfplay_minirts.sh [your pre-trained model] 

Visualization

To visualize a trained bot, you can specify --save_replay_prefix [replay_file_prefix] when running eval.py to save (lots of) replays. Note that the same flag can also be applied to training/selfplay.
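
For example, extending the eval.py invocation shown above (the replay prefix path is illustrative):

game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model \
python3 eval.py --load [your model] --batchsize 128 \
    --players "fs=50,type=AI_NN;fs=20,type=AI_SIMPLE" \
    --num_games 1024 --num_eval 10000 --greedy --gpu 0 \
    --save_replay_prefix ./replays/eval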

All replay files contain action sequences, use the .rep extension, and should reproduce exactly the same game when loaded. To load a replay from the command line, use the following:

./minirts-backend replay --load_replay [your replay] --vis_after 0

and open the webpage ./rts/frontend/minirts.html to watch the game. To load and run the replay in the command line only (e.g., if you just want to quickly see who wins), try:

./minirts-backend replay_cmd --load_replay [your replay]


elf's Issues

[Question:] Change units behaviour

Sorry to bother you.
In the MiniRTS game, players can only gather the resources that belong to them, but in my research project I really need them to gather common resources.

So where should I change the code so that a resource can be gathered by both players?

Also, how can I change the properties of units, such as the capacity of a worker or the gathering speed?

Thanks!

how to run ELF on mac : Torch not compiled with CUDA enabled

The only way is to change the graphics card??

I use a MacBook to train MiniRTS.

game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model \
python3 train.py
    --num_games 1024 --batchsize 128                                                                  # Set number of games to be 1024 and batchsize to be 128.
    --freq_update 50                                                                                  # Update behavior policy after 50 updates of the model.
    --players "fs=50,type=AI_NN,args=backup/AI_SIMPLE|delay/0.99|start/500;fs=20,type=AI_SIMPLE"      # Specify AI and its opponent, separated by semicolon. `fs` is frameskip that specifies How often your opponent makes a decision (e.g., fs=20 means it acts every 20 ticks)
                                                                                                      # If `backup` is specified in `args`, then we use rule-based AI for the first `start` ticks, then trained AI takes over. `start` decays with rate `decay`.
    --tqdm                                                                  # Show progress bar.
    --gpu 0                                                                 # Use first gpu. If you don't specify gpu, it will run on CPUs.
    --T 20                                                                  # 20 step actor-critic
    --additional_labels id,last_terminal
    --trainer_stats winrate                                                 # If you want to see the winrate over iterations.
                                                                            # Note that the winrate is computed when the action is sampled from the multinomial distribution (not greedy policy).

and get the following error message:

Namespace(T=20, actor_only=False, additional_labels='id,last_terminal', arch='ccpccp;-,64,64,64,-', batchsize=128, cmd_dumper_prefix=None, discount=0.99, entropy_ratio=0.01, epsilon=0.0, eval=False, freq_update=50, game_multi=None, gpu=0, grad_clip_norm=None, greedy=False, handicap_level=0, load=None, max_tick=30000, mcts_threads=64, min_prob=1e-06, model_no_spatial=False, num_episode=10000, num_games=1024, num_minibatch=5000, output_file=None, players='fs=50,type=AI_NN,args=backup/AI_SIMPLE|delay/0.99|start/500;fs=20,type=AI_SIMPLE', record_dir='./record', sample_node='pi', sample_policy='epsilon-greedy', save_dir=None, save_prefix='save', save_replay_prefix=None, seed=0, shuffle_player=False, tqdm=True, trainer_stats='winrate', verbose_collector=False, verbose_comm=False, wait_per_group=False)
Handicap: 0
Max tick: 30000
Seed: 0
Shuffled: False
[name=][fs=50][type=AI_NN][FoW=True][args=backup/AI_SIMPLE|delay/0.99|start/500]
[name=][fs=20][type=AI_SIMPLE][FoW=True]
MCTS #threads: 64 #rollout/thread: 50
Output_prompt_filename: ""
Cmd_dumper_prefix: ""
Save_replay_prefix: ""
Version:  cd4caf696ece372eee2d78cd8806546c9c64cba1_staged
Num Actions:  9
Num unittype:  6
#recv_thread = 4
Deal with connector. key = train, hist_len = 20, player_name =
Traceback (most recent call last):
  File "train.py", line 36, in <module>
    GC = game.initialize()
  File "/Users/xxx/MyProjects/AI/ELF2/ELF/rts/engine/common_loader.py", line 128, in initialize
    return GCWrapper(GC, co, desc, gpu=args.gpu, use_numpy=False, params=params)
  File "/Users/xxx/MyProjects/AI/ELF2/ELF/elf/utils_elf.py", line 149, in __init__
    self._init_collectors(GC, co, descriptions, use_gpu=gpu is not None, use_numpy=use_numpy)
  File "/Users/xxx/MyProjects/AI/ELF2/ELF/elf/utils_elf.py", line 194, in _init_collectors
    inputs.append(Batch.load(GC, "input", input, group_id, use_gpu=use_gpu, use_numpy=use_numpy))
  File "/Users/xxx/MyProjects/AI/ELF2/ELF/elf/utils_elf.py", line 69, in load
    v, info = Batch._alloc(info, use_gpu=use_gpu, use_numpy=use_numpy)
  File "/Users/xxx/MyProjects/AI/ELF2/ELF/elf/utils_elf.py", line 48, in _alloc
    v = v.pin_memory()
  File "/usr/local/lib/python3.5/site-packages/torch/tensor.py", line 82, in pin_memory
  File "/usr/local/lib/python3.5/site-packages/torch/storage.py", line 83, in pin_memory
    allocator = torch.cuda._host_allocator()
  File "/usr/local/lib/python3.5/site-packages/torch/cuda/__init__.py", line 220, in _host_allocator
    _lazy_init()
  File "/usr/local/lib/python3.5/site-packages/torch/cuda/__init__.py", line 84, in _lazy_init
    _check_driver()
  File "/usr/local/lib/python3.5/site-packages/torch/cuda/__init__.py", line 51, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

my device info:

  • Intel Iris Pro & AMD Radeon R9 M370X
  • macOS Sierra 10.12.6

Cannot compile game_TD and game_CF

Hi

I can compile game_MC successfully, but when I try compiling the other two games, I get this error message:

[cc] wrapper_callback.cc ...
In file included from wrapper_callback.cc:15:0:
ai.h:25:69: error: 'AIComm has not been declared'

No module named 'rlmethod_forward'

Hi,
When running run.py in the rlpytorch package, an error arose saying there is no module named 'rlmethod_forward', and it seems that this module does not exist in the project. How can I solve this problem? Thank you very much! :)

Compiled with g++ 7.1: what() regex_error

After compiling game_MC and the backend with g++ 7.1 on Ubuntu 14.04, I get this error message:

what() regex_error
core dump

Is this because g++ does not fully support regex?

[Question:] How can we visualize trained agent?

In ELF/rts/backend/, I run ./minirts selfplay --vis_after 0 or ./minirts humanplay --vis_after 0 to visualize the built-in AI's self-play or human play. But how do I load trained A3C agents and visualize the agent's play?
It seems that the action spaces in game_MC/ and backend/ are not the same.

[Question:] Multiplayer

Hi, my question: is it possible for two bots to act as teammates against a third party, such that the two bots cannot attack each other?

Thanks!

atari build error : No package 'ale' found -- help

When I try to build atari with ALE already built and installed in Python, this is what I get:
atari/build$cmake ..
-- pybind11 v2.3.dev0
-- Checking for module 'ale'
-- No package 'ale' found
CMake Error at /home/tshh/anaconda3/share/cmake-3.9/Modules/FindPkgConfig.cmake:412 (message):
A required package was not found
Call Stack (most recent call first):
/home/tshh/anaconda3/share/cmake-3.9/Modules/FindPkgConfig.cmake:588 (_pkg_check_modules_internal)
CMakeLists.txt:18 (pkg_check_modules)
-- Configuring incomplete, errors occurred!
########################################################################

pip list shows that ALE is installed as ale-python-interface:
/atari/build$ pip list
DEPRECATION: The default format will switch to columns in the future. You can use --format=(legacy|columns) (or define a format=(legacy|columns) in your pip.conf under the [list] section) to disable this warning.
alabaster (0.7.9)
ale-python-interface (0.0.1)
anaconda-clean (1.0)
anaconda-client (1.5.1)
anaconda-navigator (1.3.1)

What should I do? Help!!!

backend compilation problem

Hi,

When I try to compile the backend, I received the following message:

/usr/bin/ld: warning: libzmq.so.5.1.3, needed by /usr/local/lib/libczmq.so, may conflict with libzmq.so.3

I did the following:

  1. downloaded and installed ZeroMQ 4.0.4 from here
  2. installed cppzmq (the latest version)
  3. downloaded czmq-3.0.2 and built it

Am I doing something wrong?

make rts/backend failed

Hi, sorry to bother you. I failed to make rts/backend, and the output was:

In file included from comm_ai.cc:10:
In file included from ./comm_ai.h:13:
./../../vendor/CZMQ-ZWSSock/zwssock.h:13:43: error: unknown type name 'zctx_t'
CZMQ_EXPORT zwssock_t* zwssock_new_router(zctx_t *ctx);
....
....

I'm using a Mac, and I have installed zmq and czmq with brew. No idea why this error occurred; I'd like to know. Thanks!

Error in evaluating a model

When I run

eval_only=1 game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model python3 run.py --batchsize 128 --fs_opponent 20 --latest_start 500 --latest_start_decay 0.99 --num_games 1024 --opponent_type AI_SIMPLE --stats winrate --num_eval 10000 --tqdm

Two errors occur:

Traceback (most recent call last):
  File "run.py", line 154, in <module>
    evaluator = Eval()
  File "run.py", line 65, in __init__
    ("num_eval", 500)
TypeError: __init__() got an unexpected keyword argument 'define_params'
Exception ignored in: <bound method Eval.__del__ of <__main__.Eval object at 0x7fde053ee160>>
Traceback (most recent call last):
  File "run.py", line 132, in __del__
    self.GC.Stop()
AttributeError: 'Eval' object has no attribute 'GC'

For the first error, TypeError: __init__() got an unexpected keyword argument 'define_params', I changed

class Eval:
    def __init__(self):
        self.args = ArgsProvider(
            call_from = self,
            define_params = [
                ("stats", dict(type=str, choices=["rewards", "winrate"], default="rewards")),
                ("num_eval", 500)
            ]
        )

at line 59 in run.py to

class Eval:
    def __init__(self):
        self.args = ArgsProvider(
            call_from = self,
            define_args = [   # Change `define_params` as `define_args`
                ("stats", dict(type=str, choices=["rewards", "winrate"], default="rewards")),
                ("num_eval", 500)
            ]
        )

and it works. But I don't know how to debug the second error, AttributeError: 'Eval' object has no attribute 'GC'.
I do have some .bin files (e.g. save-9059.bin) in the folder; is there anything that I missed?

BTW, how can I test and visualize my model from a .bin file in HTML?
Thanks!

How to edit map?

It seems that we can edit the map in ELF/rts/engine/map.h; do you have any examples of this?

Thank you!

How to evaluate a trained ai like your replay?

The replay of the battle between the trained AI and the built-in AI really interests me!
How can we evaluate a trained model using a .bin file in HTML, like in the video?
In the file ELF/rts/backend/main_loop.cc, I did not find any function supporting trainedAI.

By the way, what is the use of the commented-out function?
RTSGameOptions ai_vs_mcts(const Parser &parser, string *players)

How to initialize miniRTS environment in wrapper

Sorry to bother you 😃

  • I read the docs about the wrapper; what should I write for [game_env] and [your game params] in GC = [game_env].GameContext([your game params]) if I want to initialize a MiniRTS game?
  • It seems that there is no introduction to the actions yet (which actions I can choose when setting reply_batch) in MiniRTS training :D. I see there are 9 commands at the end of the paper; should I supply them as a one-hot list or a specific dict?

Error when using GPU

If I set --gpu 0 to run run.py, everything is fine.
But if I choose another GPU, for example --gpu 1, I get the following error:

chenyukang@node02:~/ELF$ game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model python3 run.py --batchsize 128 --freq_update 50 --fs_opponent 20 --latest_start 500 --latest_start_decay 0.99 --num_games 1024 --opponent_type AI_HIT_AND_RUN --tqdm --num_episode 1000 --gpu 7 --save_dir ./save
Namespace(T=6, actor_only=False, additional_labels=None, ai_type='AI_NN', batchsize=128, discount=0.99, entropy_ratio=0.01, epsilon=0.0, eval=False, freq_update=50, fs_ai=50, fs_opponent=20, game_multi=None, gpu=7, grad_clip_norm=None, greedy=False, handicap_level=0, latest_start=500, latest_start_decay=0.99, load=None, max_tick=30000, mcts_threads=64, min_prob=1e-06, num_episode=1000, num_games=1024, num_minibatch=5000, opponent_type='AI_HIT_AND_RUN', ratio_change=0, record_dir='./record', sample_node='pi', sample_policy='epsilon-greedy', save_dir='./save', save_prefix='save', seed=0, simple_ratio=-1, tqdm=True, verbose_collector=False, verbose_comm=False, wait_per_group=False)
Version: cc84f9b70e52759a90ffe5a92278e8d54ba6b136_staged
Num Actions: 9
Num unittype: 6
0%| | 0/5000 [00:00<?, ?it/s]Traceback (most recent call last):
File "run.py", line 205, in
runner.run()
File "/home/chenyukang/ELF/rlpytorch/trainer.py", line 208, in run
self.GC.Run()
File "/home/chenyukang/ELF/elf/utils_elf.py", line 224, in Run
res = self._call(self.infos)
File "/home/chenyukang/ELF/elf/utils_elf.py", line 211, in _call
reply = self._cb[infos.gid](sel, sel_gpu)
File "/home/chenyukang/ELF/rlpytorch/trainer.py", line 139, in actor
state_curr = self.mi.forward("actor", sel_gpu[0])
File "/home/chenyukang/ELF/rlpytorch/model_interface.py", line 102, in forward
_, res = self.modelskey
File "/home/chenyukang/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "./rts/game_MC/model.py", line 79, in forward
output = self.net(self._var(s), self._var(r0), self._var(r1))
File "/home/chenyukang/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "./rts/game_MC/model.py", line 42, in forward
h1 = self.conv1(input.view(input.size(0), self.m, 20, 20))
File "/home/chenyukang/anaconda3/lib/python3.5/site-packages/torch/nn/modules/module.py", line 206, in call
result = self.forward(*input, **kwargs)
File "/home/chenyukang/anaconda3/lib/python3.5/site-packages/torch/nn/modules/conv.py", line 237, in forward
self.padding, self.dilation, self.groups)
File "/home/chenyukang/anaconda3/lib/python3.5/site-packages/torch/nn/functional.py", line 40, in conv2d
return f(input, weight, bias)
RuntimeError: tensors are on different GPUs

It seems that the parameter 'all_args.gpu' is not passed to the inner model.

Visualization for CF and TD

Hi,

Is visualization supported for CF and TD games?
It seems that no HTML file is generated for these games.

Several errors during compiling

Every time I type make gen as follows, I get the errors below.
Do you know why? Thanks in advance.

chenyukang@node02:~/ELF/rts/game_MC$ make gen
[inl] ...
python ../engine/compile_cmds.py --def_file ../engine/cmd --name engine
python ../engine/compile_cmds.py --def_file ../engine/cmd_specific --name engine_specific
python ../engine/compile_cmds.py --def_file cmd_specific --name minirts_specific
chenyukang@node02:~/ELF/rts/game_MC$ make
[dep] ../engine/rule_actor.cc ...
[dep] ../engine/omni_ai.cc ...
[dep] ../engine/game.cc ...
[dep] ../engine/cmd_specific.cc ...
[dep] ../engine/cmd.cc ...
[dep] ../engine/bullet.cc ...
[dep] mc_rule_actor.cc ...
[dep] gamedef.cc ...
[dep] cmd_specific.cc ...
[dep] ai.cc ...
[cc] ai.cc ...
In file included from ../engine/omni_ai.h:13:0,
from ai.h:12,
from ai.cc:10:
../../vendor/../elf/comm_template.h: In member function ‘void CommStats::Feed(const int64_t&, Generator&)’:
../../vendor/../elf/comm_template.h:99:20: error: ‘sleep_for’ is not a member of ‘std::this_thread’
../../vendor/../elf/comm_template.h: In member function ‘bool CommT<DataAddr, Value>::SendDataWaitReply(const Key&, Value&)’:
../../vendor/../elf/comm_template.h:206:13: error: ‘sleep_for’ is not a member of ‘std::this_thread’
In file included from ../engine/omni_ai.h:13:0,
from ai.h:12,
from ai.cc:10:
../../vendor/../elf/comm_template.h: In member function ‘void ContextT<_Options, _Data, _Reply>::Start(ContextT<_Options, _Data, _Reply>::GameStartFunc)’:
../../vendor/../elf/comm_template.h:446:13: error: ‘sleep_for’ is not a member of ‘std::this_thread’
In file included from ../engine/common.h:24:0,
from ../engine/cmd.h:13,
from ../engine/cmd_receiver.h:13,
from ../engine/omni_ai.h:15,
from ai.h:12,
from ai.cc:10:
../engine/serializer.h: In instantiation of ‘serializer::loader& serializer::operator>>(serializer::loader&, std::map<Key, T>&) [with Key = std::pair<int, int>; T = float; serializer::loader = serializer::loader]’:
../engine/serializer.h:344:27: recursively required from ‘serializer::loader& serializer::Load(serializer::loader&, T&, Args& ...) [with T = PlayerPrivilege; Args = {int, std::vector<Fog, std::allocator >, std::map<std::pair<int, int>, float, std::less<std::pair<int, int> >, std::allocator<std::pair<const std::pair<int, int>, float> > >, std::map<std::pair<int, int>, std::pair<int, int>, std::less<std::pair<int, int> >, std::allocator<std::pair<const std::pair<int, int>, std::pair<int, int> > > >}]’
../engine/serializer.h:344:27: required from ‘serializer::loader& serializer::Load(serializer::loader&, T&, Args& ...) [with T = int; Args = {PlayerPrivilege, int, std::vector<Fog, std::allocator >, std::map<std::pair<int, int>, float, std::less<std::pair<int, int> >, std::allocator<std::pair<const std::pair<int, int>, float> > >, std::map<std::pair<int, int>, std::pair<int, int>, std::less<std::pair<int, int> >, std::allocator<std::pair<const std::pair<int, int>, std::pair<int, int> > > >}]’
../engine/player.h:148:5: required from here
../engine/serializer.h:304:13: error: ‘class std::map<std::pair<int, int>, float>’ has no member named ‘emplace’
../engine/serializer.h: In instantiation of ‘serializer::loader& serializer::operator>>(serializer::loader&, std::map<Key, T>&) [with Key = std::pair<int, int>; T = std::pair<int, int>; serializer::loader = serializer::loader]’:
../engine/serializer.h:344:27: recursively required from ‘serializer::loader& serializer::Load(serializer::loader&, T&, Args& ...) [with T = PlayerPrivilege; Args = {int, std::vector<Fog, std::allocator >, std::map<std::pair<int, int>, float, std::less<std::pair<int, int> >, std::allocator<std::pair<const std::pair<int, int>, float> > >, std::map<std::pair<int, int>, std::pair<int, int>, std::less<std::pair<int, int> >, std::allocator<std::pair<const std::pair<int, int>, std::pair<int, int> > > >}]’
../engine/serializer.h:344:27: required from ‘serializer::loader& serializer::Load(serializer::loader&, T&, Args& ...) [with T = int; Args = {PlayerPrivilege, int, std::vector<Fog, std::allocator >, std::map<std::pair<int, int>, float, std::less<std::pair<int, int> >, std::allocator<std::pair<const std::pair<int, int>, float> > >, std::map<std::pair<int, int>, std::pair<int, int>, std::less<std::pair<int, int> >, std::allocator<std::pair<const std::pair<int, int>, std::pair<int, int> > > >}]’
../engine/player.h:148:5: required from here
../engine/serializer.h:304:13: error: ‘class std::map<std::pair<int, int>, std::pair<int, int> >’ has no member named ‘emplace’
make: *** [obj/ai.o] Error 1

How to modify Reward Function For Mini-RTS ?

"For Mini-RTS, the agent only receives a reward when the game ends (±1 for win/loss)."

But I want to modify the reward function so that the agent receives rewards when it kills the opponent's units or obtains more resources.
Would you please tell me where to modify it? Thank you in advance!

KeyError: 'Batch(): specified key: id or last_id not found!'

Hi,
When I ran the demo, I ran into a problem.
The log is as follows:

Traceback (most recent call last):
  File "run.py", line 194, in <module>
    runner.run()
  File "/disk1/benjen/work-space/ELF/rlpytorch/trainer.py", line 179, in run
    self.GC.Run()
  File "/disk1/benjen/work-space/ELF/elf/utils_elf.py", line 258, in Run
    res = self._call(self.infos)
  File "/disk1/benjen/work-space/ELF/elf/utils_elf.py", line 249, in _call
    reply = self._cb[infos.gid](sel, sel_gpu)
  File "/disk1/benjen/work-space/ELF/rlpytorch/trainer.py", line 109, in actor
    self.stats.feed_batch(sel)
  File "/disk1/benjen/work-space/ELF/rlpytorch/stats.py", line 188, in feed_batch
    return self.collector.feed_batch(batch, hist_idx=hist_idx)
  File "/disk1/benjen/work-space/ELF/rlpytorch/stats.py", line 68, in feed_batch
    ids = batch["id"][hist_idx]
  File "/disk1/benjen/work-space/ELF/elf/utils_elf.py", line 88, in __getitem__
    raise KeyError("Batch(): specified key: %s or %s not found!" % (key, key_with_last))
KeyError: 'Batch(): specified key: id or last_id not found!'

my command is :

game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model python3 run.py --num_games 1024 --batchsize 128 --freq_update 50 --fs_opponent 20 --latest_start 500  --latest_start_decay 0.99 --opponent_type AI_SIMPLE --tqdm  --T 20   

the following is my environment:
os: ubuntu 16.04
nvidia driver version : 375.66
gcc version: 5.4.0

Cmake of rts build fails if Thread Building Blocks isn't installed

If you run the install without Intel TBB installed, and then run make on the cmake build, you get:

In file included from .../ELF/elf/../elf/state_collector.h:27:0,
from .../ELF/elf/../elf/comm_template.h:27,
from .../ELF/rts/game_MC/wrapper_callback.h:12,
from .../ELF/rts/game_MC/python_wrapper.cc:18:
.../ELF/elf/../elf/primitive.h:11:36: fatal error: tbb/concurrent_queue.h: No such file or directory
compilation terminated.

Looking in the file, there's a USE_TBB option that's likely not being properly detected or set. Alternatively, the error message could be improved to specify TBB as a dependency.

ImportError: dynamic module does not define module export function (PyInit_minirts)

Hi,

I was attempting to train with the command line arguments specified in the repo's README,
however I was getting this error.

Traceback (most recent call last):
File "run.py", line 140, in
game = load_module(os.environ["game"]).Loader()
File "/Users/Ethan/dev/github/ELF/rlpytorch/utils.py", line 510, in load_module
module = import(os.path.basename(mod))
File "./rts/game_MC/game.py", line 8, in
import minirts
ImportError: dynamic module does not define module export function (PyInit_minirts)

Any idea why this is happening?

Unable to compile miniRTS for TD and CF

Hi,

I'm unable to compile the backend for TD and CF. The instructions work fine for MC.
The command
make minirts GAME_DIR=../game_TD

results in

Linking ...
obj/main_loop.o: In function `PlayerSelector::GetPlayer(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int)':
main_loop.cc:(.text._ZN14PlayerSelector9GetPlayerENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi[_ZN14PlayerSelector9GetPlayerENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi]+0x43): undefined reference to `vtable for HitAndRunAI'
main_loop.cc:(.text._ZN14PlayerSelector9GetPlayerENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi[_ZN14PlayerSelector9GetPlayerENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi]+0x21d): undefined reference to `vtable for SimpleAI'
collect2: error: ld returned 1 exit status
Makefile:73: recipe for target 'minirts' failed
make: *** [minirts] Error 1

while
make minirts GAME_DIR=../game_CF

yields:

Linking ...
obj/main_loop.o: In function `PlayerSelector::GetPlayer(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int)':
main_loop.cc:(.text._ZN14PlayerSelector9GetPlayerENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi[_ZN14PlayerSelector9GetPlayerENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi]+0x43): undefined reference to `vtable for HitAndRunAI'
main_loop.cc:(.text._ZN14PlayerSelector9GetPlayerENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi[_ZN14PlayerSelector9GetPlayerENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEi]+0x21d): undefined reference to `vtable for SimpleAI'
collect2: error: ld returned 1 exit status
Makefile:73: recipe for target 'minirts' failed
make: *** [minirts] Error 1

Introduction about AI?

"AI_SIMPLE", "AI_HIT_AND_RUN", "AI_NN", "AI_FLAG_NN", "AI_TD_NN"
Is there any introduction to these five different AIs?

Compilation error in atari

After following the instructions to build ALE (in ~/Arcade-Learning-Environment/), I attempted to build atari/. Here's the error with the default make instruction:

# PYTHON_CONFIG=/opt/conda/bin/python3-config make
Package ale was not found in the pkg-config search path.
Perhaps you should add the directory containing `ale.pc'
to the PKG_CONFIG_PATH environment variable
No package 'ale' found
[cc] atari_game.cc ...
In file included from atari_game.cc:13:0:
atari_game.h:15:33: fatal error: ale/ale_interface.hpp: No such file or directory
compilation terminated.
Makefile:64: recipe for target 'obj/atari_game.o' failed
make: *** [obj/atari_game.o] Error 1

Here's the error with PKG_CONFIG_PATH set to the ale build/ directory (which contains ale.pc):

# PYTHON_CONFIG=/opt/conda/bin/python3-config PKG_CONFIG_PATH=~/Arcade-Learning-Environment/build/ make
[cc] atari_game.cc ...
In file included from atari_game.cc:13:0:
atari_game.h:15:33: fatal error: ale/ale_interface.hpp: No such file or directory
compilation terminated.
Makefile:64: recipe for target 'obj/atari_game.o' failed
make: *** [obj/atari_game.o] Error 1

Any idea how to fix this? Thanks!

[Question:] Different opponents

Is it possible for the agent to play against different opponents and get their batch data separately?
I also wonder where I can find the code for how AI_SIMPLE acts. Thanks!

[Question]: Create new environment for RTS

Hi there,

I am writing this issue in order to understand whether it is possible to create a new environment in ELF using a game level editor, or whether it is necessary to implement it using the C++ API.

In both cases, could you please suggest the best practices about how to create a new environment that can be easily integrated in your platform?

Thank you so much for your help.

Alessandro

No module named 'minirts'

  File "../rts/game_MC/game.py", line 8, in <module>
    import minirts
ImportError: No module named 'minirts'

coredump when train models

When I run the train script run.sh, it core dumps.
I tried to locate the reason and found that all core dumps happened when it initializes the game context, at the line
GC = game.initialize() in run.py.
Tracing back into the C++ code in elf, I found that it was caused by initializing the random generator _g(_rd()).
The random generators

std::random_device _rd;
std::mt19937 _g;

are defined in
comm_template.h
state_collector.h
and initialized in code snippets such as:

CommT(const ContextOptions &context_options, CustomFieldFunc field_func)
: _context_options(context_options), _total_collectors(0),
//: _context_options(context_options), _g(_rd()), _total_collectors(0),
_verbose(context_options.verbose_comm), _field_func(field_func) {

in comm_template.h and

CustomFieldFunc field_func, SyncSignal *signal, bool verbose)
: _gid(gid), _hist_len(hist_len), _last_seq(signal->num_games(), -1), _game_counter(signal->num_games(), 0),
//_g(_rd()), _pool(num_collectors), _verbose(verbose) {
_pool(num_collectors), _verbose(verbose) {

in state_collector.h

If I comment out the code that initializes the random generator, as shown above, everything goes well and the result in the README is reproduced.

I'm still trying to find out why.

Install libczmq 2.0.2 from source?

Tried to compile the rts backend.

error:
../../vendor/CZMQ-ZWSSock/zwshandshake.c:345:2: error: unknown type name ‘zdigest_t’

Solved by installing 3.0.2 instead.

typo in ./atari/game.py, line 81

Hi.

It seems that there is a typo in ./atari/game.py, line 81:

I think the second "]" after "[input]" should not be there,
params["action_batchsize"] = int(desc["actor"]["input"]]["_batchsize"])

Best,

Pedro N.

[Question:] Is the Advantage Actor-Critic Implementation LSTM-safe?

I'm reading the code, and I'm not entirely clear on whether the implementation of ActorCritic (or LearningMethod as an abstract interface in general) is LSTM safe. Correct me if I'm wrong, but it appears to me that the environment is meant to call update(batch) as a sort of thread-safe callback at each timestep. But since each game's update doesn't have any ID (that I know of), it's impossible to fetch the relevant LSTM state for forward/backward passes.

The model I want to use requires using heuristics to set a Mask for MaskedSelect anyway, so I know I'll need to edit the code a little, but I'm mostly curious if an LSTM model (e.g. a modified LSTM-A3C from the A3C paper) would work with the current implementation, and if not what I need to change.

Problem compiling CF and TD games

Hi,

When trying to compile both games, I have received the following error message:

ai.cc: In member function ‘virtual bool FlagTrainedAI::on_act(const GameEnv&)’:
ai.cc:102:36: error: ‘class AICommT<ContextT<PythonOptions, ExtGame, Reply> >’ has no member named ‘newest’
     const Reply& reply = _ai_comm->newest().reply;
                                    ^
Makefile:75: recipe for target 'obj/ai.o' failed
make: *** [obj/ai.o] Error 1

How to enrich Action Space?

As described in the paper, there are only 9 commands in the action space, and the attack commands apply to all units.
How can an agent give commands to a single unit? And how can we add more flexible commands to the action space? To be honest, the limited action space is really a drawback of this platform.

No module named 'go_game'

Hi! I was trying to run DarkForest Go, but after I typed
sh ./train_df.sh --gpu 1 --no_leaky_relu --list_file ./go/sample1.sgf,
I got the following error
Traceback (most recent call last):
  File "train.py", line 20, in <module>
    env, all_args = load_env(os.environ, trainer=trainer, runner=runner)
  File "/home/Phil/ELF/rlpytorch/model_loader.py", line 109, in load_env
    game = load_module(envs["game"]).Loader()
  File "/home/Phil/ELF/rlpytorch/model_loader.py", line 18, in load_module
    module = __import__(os.path.basename(mod))
  File "./go/game.py", line 10, in <module>
    import go_game as go
ModuleNotFoundError: No module named 'go_game'

Could you @yuandong-tian please help me? Thanks very much!

RuntimeError: input and target have different number of elements

Hi,
I ran into another problem.
The log is as follows:

Traceback (most recent call last):
  File "run.py", line 194, in <module>
    runner.run()
  File "/disk1/benjen/work-space/ELF/rlpytorch/trainer.py", line 179, in run
    self.GC.Run()
  File "/disk1/benjen/work-space/ELF/elf/utils_elf.py", line 254, in Run
    res = self._call(self.infos)
  File "/disk1/benjen/work-space/ELF/elf/utils_elf.py", line 245, in _call
    reply = self._cb[infos.gid](sel, sel_gpu)
  File "run.py", line 178, in train_and_update
    reply = trainer.train(sel, sel_gpu)
  File "/disk1/benjen/work-space/ELF/rlpytorch/trainer.py", line 85, in train
    self.rl_method.run(sel_gpu)
  File "/disk1/benjen/work-space/ELF/rlpytorch/rlmethod_base.py", line 93, in run
    self.update(batch)
  File "/disk1/benjen/work-space/ELF/rlpytorch/rlmethod_common.py", line 124, in update
    value_err = self.value_loss(V, Variable(R))
  File "/usr/lib/python3.5/site-packages/torch/nn/modules/module.py", line 224, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/lib/python3.5/site-packages/torch/nn/modules/loss.py", line 406, in forward
    return F.smooth_l1_loss(input, target, size_average=self.size_average)
  File "/usr/lib/python3.5/site-packages/torch/nn/functional.py", line 811, in smooth_l1_loss
    return _functions.thnn.SmoothL1Loss.apply(input, target, size_average)
  File "/usr/lib/python3.5/site-packages/torch/nn/_functions/thnn/auto.py", line 47, in forward
    output, *ctx.additional_args)
RuntimeError: input and target have different number of elements: input[128 x 1] has 128 elements, while target[128 x 128] has 16384 elements at /pytorch/torch/lib/THCUNN/generic/SmoothL1Criterion.cu:12

my command is :

game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model python3 run.py --additional_labels id,last_terminal --num_games 1024 --batchsize 128 --freq_update 50 --fs_opponent 20 --latest_start 500  --latest_start_decay 0.99 --opponent_type AI_SIMPLE --tqdm  --T 20 --gpu 0  

I suppose maybe '--batchsize' is the problem.

the following is my environment:
os: ubuntu 16.04
nvidia driver version : 375.66
gcc version: 5.4.0

Failed to load the 80% winrate model

Hi,

I installed ELF according to the install script and downloaded the trained model. After that, I tried to run ./eval_minirts.sh ./rts/game_MC/model/model-winrate-80.0-357800.bin 50, where the 1st argument is the path to the model and the 2nd argument is the number of skipped frames per action (50, as the README suggests). It throws the following error:

Load from ./rts/game_MC/model/model-winrate-80.0-357800.bin
/home/ubuntu/miniconda3/envs/elf/lib/python3.6/site-packages/torch/serialization.py:286: SourceChangeWarning: source code of class 'model.Model_ActorCritic' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
Traceback (most recent call last):
  File "eval.py", line 25, in <module>
    model = env["model_loaders"][0].load_model(GC.params)
  File "/home/ubuntu/ELF/rlpytorch/model_loader.py", line 82, in load_model
    model.load(self.load, omit_keys=omit_keys)
  File "/home/ubuntu/ELF/rlpytorch/model_base.py", line 107, in load
    self.load_state_dict(data["stats_dict"])
  File "/home/ubuntu/miniconda3/envs/elf/lib/python3.6/site-packages/torch/nn/modules/module.py", line 369, in load_state_dict
    raise KeyError('missing keys in state_dict: "{}"'.format(missing))
KeyError: 'missing keys in state_dict: "{\'Wt3.bias\', \'Wt.weight\', \'Wt2.weight\', \'Wt3.weight\', \'Wt.bias\', \'Wt2.bias\'}"'

I am guessing that the trained model is a little different from actor-critic, right?

unable to start train

Hi,
I can run the standalone backend game_MC successfully, but when I try to run the code below

game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model \ 
python3 run.py 
    --num_games 1024 --batchsize 128              # Set number of games to be 1024 and batchsize to be 128.  
    --freq_update 50                              # Update behavior policy after 50 updates of the model.
    --fs_opponent 20                              # How often your opponent makes a decision (every 20 ticks)
    --latest_start 500  --latest_start_decay 0.99 # Use rule-based AI for the first 500 ticks, then trained AI takes over. latest_start decays with rate latest_start_decay. 
    --opponent_type AI_SIMPLE                     # Use AI_SIMPLE as rule-based AI
    --tqdm                                        # Show progress bar.
    --gpu 0                                       # Use first gpu. 
    --T 20                                        # 20 step actor-critic

I get this error message:

Namespace(T=20, actor_only=False, additional_labels=None, ai_type='AI_NN', batchsize=128, discount=0.99, entropy_ratio=0.01, epsilon=0.0, eval=False, freq_update=50, fs_ai=50, fs_opponent=20, game_multi=None, gpu=0, grad_clip_norm=None, greedy=False, handicap_level=0, latest_start=500, latest_start_decay=0.99, load=None, max_tick=30000, mcts_threads=64, min_prob=1e-06, num_episode=10000, num_games=1024, num_minibatch=5000, opponent_type='AI_SIMPLE', ratio_change=0, record_dir='./record', sample_node='pi', sample_policy='epsilon-greedy', save_dir=None, save_prefix='save', seed=0, simple_ratio=-1, tqdm=True, verbose_collector=False, verbose_comm=False, wait_per_group=False)
段错误 (核心已转储)   # Chinese for "Segmentation fault (core dumped)"

The program just terminates with a segmentation fault.

[Question:] How to replace opponent AI with my own agent to train in game_MC?

In game_MC, there are only two built-in AIs, which may not be enough for training. So I want to train an agent against another agent designed by myself as the opponent. In this case, the game environment would need to interact with two agents.
Is there any way to support this? If possible, please give me some instructions on how to do it. It is key to my research. Thank you in advance.

./train_atari.sh is running but need two changes in code.

I had to modify the code in two places:
in model_loader.py :
#model.cuda(device_id=args.gpu)
model.cuda(args.gpu)
and model_interface.py
#self.models[key].cuda(device_id=gpu_id)
self.models[key].cuda(gpu_id)
because I got:
TypeError: cuda() got an unexpected keyword argument 'device_id'

./train_atari.sh is running and the GPU is busy.

Segmentation fault error

root@b6252d26dc95:/workspace# game=./rts/game_MC/game model=actor_critic model_file=./rts/game_MC/model taskset -c 0-9 python3 run.py --batchsize 128 --freq_update 50 --fs_opponent 20 --latest_start 500 --latest_start_decay 0.99 --num_games 1024 --opponent_type AI_SIMPLE --tqdm
Traceback (most recent call last):
  File "run.py", line 142, in <module>
    game = load_module(os.environ["game"]).Loader()
  File "/workspace/rlpytorch/utils.py", line 510, in load_module
    module = __import__(os.path.basename(mod))
  File "./rts/game_MC/game.py", line 8, in <module>
    import minirts
ImportError: dynamic module does not define module export function (PyInit_minirts)

I compiled game_MC successfully.

Problems in humanplay

After running ./minirts humanplay --vis_after 0, I can compete with the AI in the webpage. But there seem to be some problems:

  • Cannot gather resources. I click on a worker and then click on a resource, but the worker just moves around the resource and cannot gather it.
  • Cannot build tanks or workers. I click on the base and the barracks, and have no idea how to build workers or tanks.
  • After clicking the Pause button, I cannot resume the game by clicking Pause again or clicking somewhere else.
