Comments (5)
@anushmanukyan Could you tell me which environment and algorithm you are training with?
from reinforcement-learning-algorithms.
@TianhongDai I am using PPO.
@anushmanukyan I guess you are only loading the weights of the model. If you check this line: https://github.com/TianhongDai/reinforcement-learning-algorithms/blob/master/07-proximal-policy-optimization/ppo_agent.py#L113 you will see that when I test the network, I also load the running mean filter object. During training, I use the running mean filter to normalize the input, so if you want to retrain your model, you should also load the "trained" running mean filter. Otherwise you will get different results.
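To illustrate the point about keeping the filter together with the weights, here is a minimal sketch. It is not the repository's code: `RunningMeanFilter`, `save_checkpoint`, and `load_checkpoint` are hypothetical names, the weights are a plain list standing in for a real model's parameters, and JSON is used only to keep the example dependency-free.

```python
import json
import math


class RunningMeanFilter:
    """Hypothetical stand-in for a running mean filter (not the repo's class)."""

    def __init__(self, size, eps=1e-8):
        self.count = 0
        self.mean = [0.0] * size
        self.m2 = [0.0] * size  # running sum of squared deviations (Welford)
        self.eps = eps

    def update(self, obs):
        # Welford's online update, applied independently to each dimension.
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def normalize(self, obs):
        # Subtract the running mean and divide by the running std deviation.
        out = []
        for i, x in enumerate(obs):
            var = self.m2[i] / self.count if self.count > 0 else 1.0
            out.append((x - self.mean[i]) / (math.sqrt(var) + self.eps))
        return out

    def state_dict(self):
        return {"count": self.count, "mean": self.mean, "m2": self.m2}

    def load_state_dict(self, state):
        self.count = state["count"]
        self.mean = list(state["mean"])
        self.m2 = list(state["m2"])


def save_checkpoint(path, weights, obs_filter):
    # Store weights and filter statistics in one file so they stay in sync.
    with open(path, "w") as f:
        json.dump({"weights": weights, "filter": obs_filter.state_dict()}, f)


def load_checkpoint(path, obs_filter):
    # Restore the filter in place and return the saved weights.
    with open(path) as f:
        ckpt = json.load(f)
    obs_filter.load_state_dict(ckpt["filter"])
    return ckpt["weights"]
```

If only the weights were restored, a fresh filter would normalize the same observation differently, so the network would see inputs on a different scale than it was trained on.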
I added the running mean filter and retraining seems to work better now.
However, I have another question: how does demo.py work? I cannot figure out how testing works. I save the best model, but when I test it, it gets a different reward than it had when it was saved. How is that possible? Also, if I run the same model several times, I get different performance each time.
Thank you so much for your help.
@anushmanukyan Hi, I think demo.py should work fine. You can download my pre-trained model from: https://drive.google.com/drive/u/2/folders/1cZjjCA5WHs-Lfw63ntzeUjMo_wZoIgXw Then just run python demo.py. It should still get the same high scores as it got during training. You can check https://github.com/TianhongDai/reinforcement-learning-algorithms/blob/master/07-proximal-policy-optimization/ppo_agent.py#L111 to see how I test the network.
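On the question of the same model scoring differently from run to run: one common cause, offered here as an assumption rather than a diagnosis of this repository, is that a single episode is a noisy measurement when actions are sampled or the environment starts from random states. The sketch below uses a toy environment and hypothetical helper names (`run_episode`, `stochastic_policy`, `deterministic_policy`, `evaluate`) to show seeded evaluation averaged over several episodes.

```python
import random
import statistics


def run_episode(policy, rng, steps=50):
    """Toy episode: one point per step when the action matches the target."""
    total = 0.0
    for _ in range(steps):
        target = rng.choice([0, 1])
        action = policy(target, rng)
        total += 1.0 if action == target else 0.0
    return total


def stochastic_policy(obs, rng, p_correct=0.8):
    # Samples an action, so the return varies from episode to episode.
    return obs if rng.random() < p_correct else 1 - obs


def deterministic_policy(obs, rng):
    # Always picks the most likely action, so the return does not vary.
    return obs


def evaluate(policy, episodes=20, seed=0):
    # A fixed seed makes the evaluation repeatable; averaging reduces noise.
    rng = random.Random(seed)
    returns = [run_episode(policy, rng) for _ in range(episodes)]
    return statistics.mean(returns), statistics.pstdev(returns)


if __name__ == "__main__":
    print("stochastic:", evaluate(stochastic_policy))
    print("deterministic:", evaluate(deterministic_policy))
```

Reporting the mean and spread over many episodes, with the seed fixed, makes it easier to compare a saved model against the score logged during training.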