Train-Google-s-dinosaur-with-DQN

Train the Google Chrome offline dinosaur game with Deep Q Learning.

References:

game play

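Contains the scripts used to collect human gameplay data.

  • count.py
    • Counts the total number of transitions
  • capture.py
    • Captures the screen with PIL (Windows only)
    • Monitors key presses with pyHook
    • Serializes the data with pickle
    • Data is stored in ./game play/human/
  • gameoverCLF.py
    • Classifies whether a screenshot shows the game-over screen
    • Uses the Keras 2.0 API
    • The h5 files are located in ./game play/clf/
  • analyze.ipynb
    • Provides a simple analysis to check whether the captured data meets the DQN's requirements
  • preprocess.ipynb
    • Preprocesses the screenshots into transitions that the DQN can learn from directly
    • Reads files from ./game play/human/ and, once processed, stores them in ./game play/transitions/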
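
The pickle storage and the transition count described above can be sketched roughly as follows. This is only an illustration, not the repository's actual code: the helper names, the .pkl extension, and the (state, action, reward, next_state, terminal) tuple layout are all assumptions, and short strings stand in for the preprocessed frames.

```python
import pickle
import tempfile
from pathlib import Path


def save_transitions(transitions, path):
    """Serialize a list of transitions to one pickle file."""
    with open(path, "wb") as f:
        pickle.dump(transitions, f)


def count_transitions(directory):
    """Sum the number of transitions over every .pkl file in a directory."""
    total = 0
    for file in sorted(Path(directory).glob("*.pkl")):
        with open(file, "rb") as f:
            total += len(pickle.load(f))
    return total


# Hypothetical transition layout: (state, action, reward, next_state, terminal).
# Strings stand in for the preprocessed frames.
episode_a = [("s0", 0, 0.1, "s1", False), ("s1", 1, -1.0, "s2", True)]
episode_b = [("s0", 1, 0.1, "s1", False)]

with tempfile.TemporaryDirectory() as tmp:
    save_transitions(episode_a, Path(tmp) / "episode_a.pkl")
    save_transitions(episode_b, Path(tmp) / "episode_b.pkl")
    total = count_transitions(tmp)

print(total)
```

Keeping one file per episode means the count can be recomputed at any time without loading everything into memory at once.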

deepQnetwork.py

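The Deep Q Learning architecture, not including the neural network definition.

The neural network definition, along with some other parameters, must be passed in from outside.

Object methods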

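  • learn(self,transGen)
    • Learns from existing transitions; a generator must be passed in
    • Stores the transitions drawn from the generator in the experience pool experiences, draws random minibatches, and repeatedly calls backward to learn
  • backward(self)
    • The objective is to minimize the difference between predict_Q and reward + future_maxQ
    • future_maxQ is not computed for terminal states
  • forward(self,state)
    • state must be preprocessed and have shape (30,150,4)
    • Two exploration modes: ε-greedy method and softmax action selection
    • The softmax parameter is hard to tune, so this mode is not recommended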
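
The learn and backward steps above can be illustrated with a small, network-free sketch. It is not the code in deepQnetwork.py: the class name ReplayLearner, the batch_size argument, the q_fn callable standing in for the neural network, and the transition tuple layout are all hypothetical. Scaling future_maxQ by the discount is also an assumption, based on the discount parameter this module exposes.

```python
import random
from collections import deque


def td_target(reward, next_q_values, terminal, discount):
    """Target for Q(state, action): reward + discount * max Q(next_state).

    The bootstrap term is dropped when the transition ends the episode.
    """
    if terminal:
        return reward
    return reward + discount * max(next_q_values)


class ReplayLearner:
    """Toy stand-in for the learn/backward loop; q_fn plays the role of the network."""

    def __init__(self, q_fn, discount=0.99, experience_size=1000,
                 start_learning_threshold=4, batch_size=2, seed=0):
        self.q_fn = q_fn
        self.discount = discount
        self.experiences = deque(maxlen=experience_size)  # oldest entries are evicted
        self.start_learning_threshold = start_learning_threshold
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def learn(self, trans_gen):
        """Store each transition, then process a random minibatch once warm."""
        targets = []
        for transition in trans_gen:
            self.experiences.append(transition)
            if len(self.experiences) >= self.start_learning_threshold:
                targets.append(self.backward())
        return targets

    def backward(self):
        """Compute targets for one minibatch (a real version would fit the network)."""
        batch = self.rng.sample(list(self.experiences), self.batch_size)
        return [
            td_target(reward, self.q_fn(next_state), terminal, self.discount)
            for _state, _action, reward, next_state, terminal in batch
        ]


def constant_q(state):
    """Dummy Q function: the same two action values for every state."""
    return [1.0, 2.0]


learner = ReplayLearner(constant_q, discount=0.5)
# Five transitions; only the last one is terminal.
transitions = [(i, 0, 1.0, i + 1, i == 4) for i in range(5)]
targets = learner.learn(iter(transitions))
print(targets)
```

In this toy version backward only returns the targets. In an actual DQN those targets would be used to fit the network's prediction for the action taken in each sampled transition.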

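Parameters

  • discount
    • gamma, the discount rate
  • experienceSize
    • Maximum capacity of the experience pool
  • startLearningThreshold
    • Number of samples the experience pool must hold before learning starts
  • explorationMode
    • ε-greedy method
    • softmax action selection
  • temperature
  • minEpsilon
    • ε-greedy method parameter
    • Minimum exploration rate
  • randomStartup
    • ε-greedy method parameter
    • Number of initial steps taken fully at random
  • stepsUntilReachMinEpsilon
    • ε-greedy method parameter
    • Number of steps needed to decay to the minimum exploration rate
  • distribution
    • ε-greedy method parameter
    • Probability distribution used to choose an action; uniform by default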
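
To make the exploration parameters above more concrete, here is a minimal sketch of both exploration modes. The function names are hypothetical and the details are assumptions rather than the repository's implementation. In particular, the decay is assumed to be linear and to start once the random startup steps are over, and temperature is assumed to be the usual divisor in a Boltzmann softmax.

```python
import math
import random


def epsilon_at(step, min_epsilon, random_startup, steps_until_min):
    """Exploration rate: 1.0 during startup, then a linear decay to min_epsilon."""
    if step < random_startup:
        return 1.0
    progress = min(1.0, (step - random_startup) / steps_until_min)
    return 1.0 - progress * (1.0 - min_epsilon)


def epsilon_greedy(q_values, epsilon, rng, distribution=None):
    """Pick a random action with probability epsilon, else the greedy one."""
    actions = list(range(len(q_values)))
    if rng.random() < epsilon:
        # distribution=None means every action is equally likely
        return rng.choices(actions, weights=distribution)[0]
    return max(actions, key=lambda a: q_values[a])


def softmax_probabilities(q_values, temperature):
    """Boltzmann probabilities; a lower temperature is closer to greedy."""
    top = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
q_values = [0.1, 0.9]  # e.g. Q(do nothing), Q(jump)

for step in (0, 600, 5000):
    eps = epsilon_at(step, min_epsilon=0.1, random_startup=100, steps_until_min=1000)
    action = epsilon_greedy(q_values, eps, rng)
    print(step, round(eps, 3), action)

print(softmax_probabilities(q_values, temperature=0.5))
```

A non-uniform distribution can be useful in a game like this one, where jumping at random on every frame would end most episodes early. Passing weights such as [0.9, 0.1] makes random exploration favor doing nothing.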

interaction.py

Implements interaction with the Windows system: window focusing and simulated key presses.

Contributors

filwaline
