Code Monkey home page Code Monkey logo

physics_learning_rl's Introduction

physics_learning_rl

Reinforcement learning agent that actively learns physical motion mechanisms in a 2D simulated domain. Sample videos of trained agent episodes are accessable in ./videos/.

Previous research paper: https://psyarxiv.com/6vr4g/

Training

To train an active learning agent with q-value function approximator in separate framework, simply execute the following command with dependencies ready:

$ python learning_system.py --save_model True > exp_log/active_training_log.txt

or use active_learning_bash.sh to do the same thing.

Branches

  • master & loss_reward: Separate training framework. Train active agent & predictor separately, reward agent for predictor's mean evaluation loss over 5 frames during each action (5 frames). Then generate new training data for model predicor with trained active agent.

  • loss_reduction: Concurrent training framework. Train active agent & predictor together, reward agent for predictor's mean of training loss reduction over 5 frames during each action (5 frames).

Model predictor was pretrained on human experiment data with weighted average loss and learning rate = 1e-05 for 20 epochs.

Active agent was pretrained with reward only for catching & approaching pucks for 10000 episodes.

Package dependency

python 3.7+

tensorflow-gpu 1.13.1+

pyduktape 0.0.6

File structures (master branch)

  • learning_system.py An integreated launch script that runs separate training framework in loop. active_learning_bash.sh does the same thing but run scripts one by one.

  • RQN_agent.py The active agent based on Recurrent Q-Network, includes training and testing methods.

  • random_agent.py Baseline agent, samples random actions.

  • interaction_env.py The interaction enviroment, defines action space and reward signals, communicate with JavaScript simulator.

  • config.py Constant parameters define enviroment and model settings.

  • js_simulator/ JavaScript simualtor, contains data generation and environment settings.

  • model_predictor/ The world predictor based on LSTM, includes training and testing methods.

  • model_predictor/video_generation.py Use moviepy.editor to generate videos with recorded trials data.

  • exp_log/ & model_predictor/exp_log/ Training and testing logs of the active agent and world predictor

  • checkpoints/ & model_predictor/checkpoints/ Checkpoints of pre-trained active agent and world predictor

physics_learning_rl's People

Contributors

mingxuanm avatar

Stargazers

 avatar  avatar  avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.