sungka-ai's Introduction

Mastering the Game of Sungka from Random Play

CS 295 Project for AY 2018-2019 by Darwin Bautista and Raimarc Dionido

Documentation

Project Structure

  • pretrained - pretrained model weights
  • environment.py - Sungka environment implementation in OpenAI's Gym
  • model.py - PyTorch model (and derived policy) and DQN-based trainer
  • policy.py - handcrafted policies
  • train.py - training loop. Running it with the default parameters recreates the results shown in the paper.
  • test.py - test code for evaluating the trained DQN agent or playing against it (i.e. opponent == 'human').
  • options.py - common options and hyperparameters
  • benchmark.sh - benchmark code for generating the data in the paper

Training

To get the exact results shown in the paper, train the model using the default parameters:

$ python train.py --save_path results/

Model weights are saved in results/ every 100 training episodes. The final weights are saved as results/p1-09999.pth for the agent trained as player 1, and as results/p2-09999.pth for the agent trained as player 2. The score and win rate plots are saved as:

results/p1-test-rewards.png  (Figure 3 in the paper)
results/p1-train-rewards.png
results/p1-win-rates.png     (Figure 5 in the paper)
results/p2-test-rewards.png
results/p2-train-rewards.png
results/p2-win-rates.png
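
The checkpoint naming above can be illustrated with a short sketch. It assumes zero-based episode indices and a save on the last episode of every block of 100, which is consistent with the final file name p1-09999.pth but is not confirmed by the source. The helper names checkpoint_name and checkpoint_episodes are hypothetical and do not come from train.py.

```python
from typing import List


def checkpoint_name(player: int, episode: int) -> str:
    """Build a file name such as 'p1-09999.pth' (five-digit, zero-padded episode)."""
    return f"p{player}-{episode:05d}.pth"


def checkpoint_episodes(num_episodes: int = 10000, save_every: int = 100) -> List[int]:
    """Assumed schedule: save on the last (zero-based) episode of each block."""
    return [ep for ep in range(num_episodes) if (ep + 1) % save_every == 0]


if __name__ == "__main__":
    episodes = checkpoint_episodes()
    print(len(episodes))                          # number of checkpoints per player
    print(checkpoint_name(1, episodes[-1]))       # final player-1 checkpoint
    print(checkpoint_name(2, episodes[-1]))       # final player-2 checkpoint
```

Under these assumptions, a default run would leave 100 checkpoints per player in results/, the last of which matches the file names quoted above.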

Training hyperparameters are as follows:

  --batch_size BATCH_SIZE
                        batch size; default=128
  --lr LR               learning rate; default=1e-5
  --gamma GAMMA         gamma/discount factor; default=0.9
  --mem_cap MEM_CAP     memory capacity; default=2000
  --num_episodes NUM_EPISODES
                        number of episodes; default=10000
  --num_test NUM_TEST   number of test episodes; default=100
  --opp_policy OPP_POLICY
                        opponent policy during training; default=random
  --q_net_iter Q_NET_ITER
                        number of iterations before updating target; default=100
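
To show how these hyperparameters typically interact in a DQN-style trainer, here is a minimal, framework-free sketch. It is not the code in model.py: the class ReplayMemory and the functions td_target and should_sync_target are hypothetical names, and the sketch assumes a standard DQN setup (a bounded replay memory, a one-step TD target, and a periodic target-network sync).

```python
import random
from collections import deque

# Defaults copied from the option list above.
GAMMA = 0.9        # --gamma
MEM_CAP = 2000     # --mem_cap
BATCH_SIZE = 128   # --batch_size
Q_NET_ITER = 100   # --q_net_iter


class ReplayMemory:
    """Bounded buffer: once full, the oldest transition is dropped."""

    def __init__(self, capacity=MEM_CAP):
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size=BATCH_SIZE):
        # Sample without replacement; never ask for more than is stored.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """One-step target: r if the episode ended, else r + gamma * max_a Q_target(s', a)."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def should_sync_target(step, q_net_iter=Q_NET_ITER):
    """Assumed rule: copy online weights to the target network every q_net_iter steps."""
    return step > 0 and step % q_net_iter == 0
```

In a full trainer, next_q_values would come from the target network, and the learning rate (--lr) would be passed to the optimizer that minimizes the gap between the online network's Q-value and td_target.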

Testing

To play against the pretrained agent:

$ python test.py --load_path pretrained/p1-09999.pth --opp_policy human --render

Testing options are:

  --num_test NUM_TEST   number of test episodes; default=100
  --opp_policy OPP_POLICY
                        opponent policy during training; default=random
  --player PLAYER       player turn; default=1
  --render              render; default=False
  --pvp                 P1 weights vs P2 weights; default=False
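
As a rough illustration of what a win rate over --num_test episodes means, the sketch below computes it from a list of final scores. This is an assumption about the metric, not the code in test.py: the function win_rate is hypothetical, and counting ties as non-wins is a choice made here for simplicity.

```python
from typing import List, Tuple


def win_rate(final_scores: List[Tuple[int, int]]) -> float:
    """Fraction of episodes where the agent's final score beats the opponent's.

    Each entry is (agent_score, opponent_score); ties count as non-wins here.
    """
    if not final_scores:
        raise ValueError("need at least one test episode")
    wins = sum(1 for agent, opponent in final_scores if agent > opponent)
    return wins / len(final_scores)


if __name__ == "__main__":
    # Made-up scores for four test episodes, for illustration only.
    print(win_rate([(60, 38), (49, 49), (40, 58), (55, 43)]))
```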

Benchmark

The complete benchmark code is provided for recreating the data in Tables I, II, and III:

$ ./benchmark.sh
