
tensorflow-deepq's Introduction

Reinforcement Learning using Tensor Flow

For all of you smarty pants who discovered the "continuous" branch of my repo: it is not yet functional. I never got it to converge, and it probably has a software bug. If you want to play around with it you are welcome, but be prepared to do some serious debugging. If you do get it to work though, definitely let me know! ;-) I'll probably get back to fixing it some time in February / March.

Quick start

Check out the Karpathy game in the notebooks folder.

The image above depicts a strategy learned by the DeepQ controller. Available actions are accelerating top, bottom, left or right. The reward signal is +1 for the green fellas, -1 for red and -5 for orange.

Requirements

  • future==0.15.2
  • euclid==0.1

How does this all fit together?

tf_rl has controllers and simulators, which can be pieced together using the simulate function.
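As a rough illustration of how the two pieces interact, here is a minimal sketch of a driver loop. It is not tf_rl's own simulate function, whose signature is not shown here: run_loop, CountingSimulation and ConstantController are hypothetical names made up for this example, and the order of calls is an assumption based on the controller and simulation interfaces described in the sections below.

```python
def run_loop(simulation, controller, num_steps, dt=0.1):
    """Hypothetical stand-in for a simulate-style loop (not tf_rl's code)."""
    observation = simulation.observe()
    for _ in range(num_steps):
        # Ask the controller what to do, then advance the simulation.
        action = controller.action(observation)
        simulation.perform_action(action)
        simulation.step(dt)
        # Report the resulting transition back to the controller.
        reward = simulation.collect_reward()
        new_observation = simulation.observe()
        controller.store(observation, action, reward, new_observation)
        controller.training_step()
        observation = new_observation


class CountingSimulation(object):
    """Toy simulation: the state is a step counter, the reward equals the action."""

    def __init__(self):
        self.state = 0
        self.pending_reward = 0.0

    def observe(self):
        return self.state

    def collect_reward(self):
        # Return the reward accumulated since the last call and reset it.
        reward, self.pending_reward = self.pending_reward, 0.0
        return reward

    def perform_action(self, action):
        self.pending_reward += action

    def step(self, dt):
        self.state += 1


class ConstantController(object):
    """Toy controller: always picks action 1 and remembers every transition."""

    def __init__(self):
        self.transitions = []

    def action(self, observation):
        return 1

    def store(self, observation, action, reward, newobservation):
        self.transitions.append((observation, action, reward, newobservation))

    def training_step(self):
        pass  # nothing to train in this toy example


controller = ConstantController()
run_loop(CountingSimulation(), controller, num_steps=3)
```

After the loop finishes, controller.transitions holds one tuple per executed action, which is the kind of data a learning controller would train on.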

Using human controller.

Want to have some fun controlling the simulation by yourself? You got it! Use tf_rl.controller.HumanController in your simulation.

To issue commands, run in a terminal:

python2 tf_rl/controller/human_controller.py

For it to work you also need to have a redis server running locally.

Writing your own controller

To write your own controller define a controller class with 3 functions:

  • action(self, observation) given an observation (usually a tensor of numbers), returns the action to perform.
  • store(self, observation, action, reward, newobservation) called each time a transition is observed from observation to newobservation. The transition is a consequence of action and has an associated reward.
  • training_step(self) if your controller requires training, this is the place to do it. It should not take too long, because it will be called roughly once per action execution.
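The three functions above can be sketched with a controller that ignores what it sees and acts at random. RandomController, num_actions and seed are hypothetical names chosen for this example, and the assumption that actions are integers in range(num_actions) is mine; only the three method signatures come from the list above.

```python
import random


class RandomController(object):
    """Hypothetical controller that picks actions uniformly at random."""

    def __init__(self, num_actions, seed=None):
        self.num_actions = num_actions
        self.rng = random.Random(seed)
        self.transitions = []      # every transition passed to store()
        self.training_steps = 0    # how many times training_step() ran

    def action(self, observation):
        # A learning controller would inspect the observation here.
        return self.rng.randrange(self.num_actions)

    def store(self, observation, action, reward, newobservation):
        # Keep the transition so that training_step could learn from it.
        self.transitions.append((observation, action, reward, newobservation))

    def training_step(self):
        # No learning happens; a real controller would do a cheap update here.
        self.training_steps += 1


controller = RandomController(num_actions=4, seed=0)
chosen = controller.action([0.0, 1.0])
controller.store([0.0, 1.0], chosen, 1.0, [0.5, 1.0])
controller.training_step()
```

A controller like this is handy as a baseline: a learned strategy should collect clearly more reward than random actions do.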

Writing your own simulation

To write your own simulation define a simulation class with 5 functions:

  • observe(self) returns the current observation.
  • collect_reward(self) returns the reward accumulated since the last time the function was called.
  • perform_action(self, action) updates internal state to reflect the fact that action was executed.
  • step(self, dt) updates internal state as if dt of simulation time has passed.
  • to_html(self, info=[]) generates an HTML visualization of the game. info can optionally be passed and is a list of strings that should be displayed along with the visualization.
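The interface above can be sketched with a toy one-dimensional world. LineWorld, its physics, its reward (negative distance to a target) and its HTML output are all made up for illustration; only the five method names come from the list above. The sketch uses info=None instead of info=[] to avoid sharing one mutable default list between calls.

```python
class LineWorld(object):
    """Hypothetical simulation: a point moving along a line towards a target."""

    def __init__(self, target=1.0):
        self.position = 0.0
        self.velocity = 0.0
        self.target = target
        self.pending_reward = 0.0

    def observe(self):
        return [self.position, self.velocity]

    def collect_reward(self):
        # Hand out the reward accumulated since the last call, then reset it.
        reward, self.pending_reward = self.pending_reward, 0.0
        return reward

    def perform_action(self, action):
        # Assumed action encoding: 0 pushes left, anything else pushes right.
        self.velocity += -1.0 if action == 0 else 1.0

    def step(self, dt):
        self.position += self.velocity * dt
        # Penalize distance from the target, scaled by the elapsed time.
        self.pending_reward -= abs(self.target - self.position) * dt

    def to_html(self, info=None):
        lines = ["position: %.2f" % self.position] + list(info or [])
        return "<div>" + "<br>".join(lines) + "</div>"


sim = LineWorld()
sim.perform_action(1)
sim.step(0.5)
reward = sim.collect_reward()
html = sim.to_html(info=["reward: %.2f" % reward])
```

Accumulating reward inside step and resetting it in collect_reward means the caller can poll for reward at any rate without losing or double-counting any of it.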

Creating GIFs based on simulation

The simulate method accepts a save_path argument, which is a folder where all the consecutive images will be stored. To make them into a GIF, use scripts/make_gif.sh PATH, where PATH is the same path you passed as the save_path argument.
