hongzimao / a3c
Tensorflow implementation of Asynchronous Advantage Actor Critic (A3C) from "Asynchronous Methods for Deep Reinforcement Learning".
Home Page: https://arxiv.org/abs/1602.01783
I am trying to implement A3C for a particular problem.
With some tweaking, I have got it working for my specific problem.
I do have a few queries regarding the optimisation though.
1. The actor and critic are meant to be optimised simultaneously, right? They do seem to be updated at the same time, but they are two totally separate models. Could you shed some light on that specific implementation?
2. In some harder instances of my problem, the TD loss seems to have converged to 0, but the expected reward is still some way off the actual expected reward and doesn't seem to converge completely. Do you know why that could be happening?
3. One thing I noticed in the implementation is that, during training, if the episode doesn't reach the done state within train_Batch_length steps, the rollout stops there and the model is trained on the batch collected so far, with the done state set to false. This works really well when my state space is quite small, but for larger state spaces it becomes really tricky to tune this training batch size.
Thanks!
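To make question 3 concrete: one common way to handle a rollout that is cut off before the episode ends is to bootstrap the return from the critic's value estimate of the last state, and to use zero only when the episode really terminated. The sketch below is plain Python under that assumption. The names `discounted_returns`, `bootstrap_value` and `gamma` are hypothetical and are not taken from a3c.py.

```python
def discounted_returns(rewards, gamma, bootstrap_value, done):
    """Discounted returns for one (possibly truncated) rollout.

    rewards         -- list of per-step rewards collected in the rollout
    gamma           -- discount factor in [0, 1]
    bootstrap_value -- critic's value estimate for the state after the
                       last step (hypothetical input, ignored if done)
    done            -- True if the episode terminated inside the rollout
    """
    # A terminated episode has no future reward; a truncated one is
    # assumed to continue, so its tail is approximated by the critic.
    running = 0.0 if done else bootstrap_value
    returns = [0.0] * len(rewards)
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Same rewards, different tails: truncated vs. terminated rollout.
truncated = discounted_returns([1.0, 1.0, 1.0], 0.5, 4.0, done=False)
terminated = discounted_returns([1.0, 1.0, 1.0], 0.5, 4.0, done=True)
```

With this kind of bootstrapping, the batch length mainly trades bias (short rollouts lean on the critic) against variance (long rollouts lean on sampled rewards), which may be part of why the setting is sensitive for larger state spaces.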
In a3c.py, there is a function compute_entropy(x) that is not called anywhere.
Is it something that needs to be used somewhere or am I missing something?
Thanks
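For context, the entropy of a discrete action distribution is often used in actor-critic methods either as an exploration bonus in the loss or simply as a diagnostic to log. The sketch below shows the standard Shannon entropy in plain Python. The function name `policy_entropy` is hypothetical, and this is not claimed to be what compute_entropy(x) in a3c.py does.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution.

    probs -- list of action probabilities, assumed to sum to 1
    """
    entropy = 0.0
    for p in probs:
        # Terms with p == 0 contribute nothing (limit of p * log p is 0)
        # and must be skipped to avoid log(0).
        if p > 0.0:
            entropy -= p * math.log(p)
    return entropy


uniform = policy_entropy([0.25, 0.25, 0.25, 0.25])  # maximal for 4 actions
greedy = policy_entropy([1.0, 0.0, 0.0, 0.0])       # deterministic policy
```

A value near log(number of actions) suggests the policy is still close to uniform, while a value near 0 suggests it has become almost deterministic.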