
discrete_mean_field_game's Introduction

Deep Mean Field Games

This is the implementation of all experiments conducted for the ICLR 2018 paper "Learning Deep Mean Field Games for Modeling Large Population Behavior".

ac_irl.py is the main code for maximum entropy inverse reinforcement learning and a standard actor-critic RL solver.

mfg_ac2.py is an alternative version that implements the same forward RL solver for a pre-specified reward function.

rlbot_twitter (not currently maintained) was used to collect population data for these experiments.

discrete_mean_field_game's People

Contributors

011235813

discrete_mean_field_game's Issues

The math definition of Value function

Hi Jiachen,

Sorry to bother you. In Algorithm 2 of your paper (Appendix B, Algorithms) there is the term V(pi_n+1; w) - V(pi_n; w), which corresponds to the code below:

    # TD error = r + gamma * v(s'; w) - v(s; w)
    delta = reward + discount*(vec_features_next.dot(ac.w)) - (vec_features.dot(ac.w))

What is the mathematical definition of the value function? I could not find a specific definition in the paper; I only see it in your code as discount*(vec_features_next.dot(ac.w)). Is there a reference or formula that would help me understand V(s, w) = discount*(vec_features_next.dot(ac.w))?
Thank you very much in advance!
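For readers following this question, here is a minimal sketch of one way to read the quoted line. It assumes, based only on the code comment above, that the critic is linear in a feature vector, so that v(s; w) is vec_features.dot(w) and discount*(vec_features_next.dot(w)) is the discounted value of the next state rather than the value function itself. The function names, the one-hot features, and the semi-gradient weight update are illustrative assumptions, not taken from the repository.

```python
import numpy as np


def linear_value(features, w):
    """Assumed linear value function: V(s; w) = phi(s) . w"""
    return float(np.dot(features, w))


def td_error(reward, discount, features, features_next, w):
    """One-step TD error: delta = r + gamma * V(s'; w) - V(s; w)."""
    return (reward
            + discount * linear_value(features_next, w)
            - linear_value(features, w))


def critic_update(w, delta, features, lr):
    """Semi-gradient TD(0) step (an assumption): w <- w + lr * delta * phi(s)."""
    return w + lr * delta * features


# Toy example with hypothetical one-hot state features.
w = np.array([0.5, -1.0, 2.0])
phi_s = np.array([1.0, 0.0, 0.0])      # current state, V(s; w) = 0.5
phi_next = np.array([0.0, 0.0, 1.0])   # next state,    V(s'; w) = 2.0

delta = td_error(reward=1.0, discount=0.9,
                 features=phi_s, features_next=phi_next, w=w)
# delta is approximately 1.0 + 0.9 * 2.0 - 0.5 = 2.3

w_new = critic_update(w, delta, phi_s, lr=0.1)
# Only the weight of the active feature moves: 0.5 + 0.1 * 2.3 = 0.73
```

Under this reading, the term with the discount factor is the bootstrapped target for the next state, and the subtracted term is the current estimate V(s; w).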

Convergence of theta

Hi Jiachen, sorry to bother you. It seems that the log likelihood log(F) has multiple local maxima in terms of theta.
The test results below show one local maximum between theta=7 and 8, and others between theta=9 and 10, 12 and 13, 15 and 16, and 17 and 18. Is there a way to determine the global maximum of log(F)(theta)? Thank you very much in advance!

Initial theta    Theta on exiting train at episode 2000
5                5.002481
6                6.003725
7                7.000529
8                7.997179
9                9.15272365
10               9.8571062
11               11.402721
12               12.150159
13               12.87025287
14               13.23533
15               16.102402
16               15.818937
17               17.414376
18               17.507292
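One generic way to look for a global maximum when an objective has several local maxima is multi-start gradient ascent: run the optimizer from many initial values and keep the best final point. The sketch below illustrates this on a hypothetical multimodal function that stands in for log(F)(theta). The function, step size, and starting grid are assumptions for illustration only; this is not the repository's training loop, and it gives no guarantee of finding the global maximum.

```python
import numpy as np


def objective(theta):
    """Hypothetical multimodal stand-in for log(F)(theta)."""
    return np.sin(3.0 * theta) - 0.05 * (theta - 10.0) ** 2


def grad(theta, eps=1e-5):
    """Central finite-difference estimate of d(objective)/d(theta)."""
    return (objective(theta + eps) - objective(theta - eps)) / (2.0 * eps)


def ascend(theta0, lr=0.01, steps=2000):
    """Plain gradient ascent from a single starting point."""
    theta = float(theta0)
    for _ in range(steps):
        theta += lr * grad(theta)
    return theta


# Same starting grid as the table above: theta = 5, 6, ..., 18.
starts = np.arange(5.0, 19.0)
finals = [ascend(t) for t in starts]

# Different starts settle in different local maxima; keep the best one.
best = max(finals, key=objective)

for t0, tf in zip(starts, finals):
    print(f"start {t0:5.1f} -> final {tf:9.5f}  objective {objective(tf):8.5f}")
print("best theta found:", best)
```

Comparing the objective value at each final theta, rather than the final theta alone, is what distinguishes a better local maximum from a worse one.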

Input Data Formatting

Hi,

Could you provide some sample data (train_round2, etc.)? I am not sure what format the input data is expected to have, and an example input would be very helpful for adapting my data to the existing infrastructure.
