Comments (3)
I think you could do something like this (pseudo code because I haven't touched Rust in a while 😁):
    // (assumes `use rurel::mdp::State;` and `use rurel::strategy::terminate::TerminationStrategy;`)
    struct StopAtTerminalState;

    impl<S: State> TerminationStrategy<S> for StopAtTerminalState {
        fn should_stop(&mut self, state: &S) -> bool {
            // A state with no available actions is terminal
            state.actions().is_empty()
        }
    }

so the algorithm stops once it reaches a terminal state.
The problem then is that it may not have discovered much yet, so you'll have to run multiple episodes of the algorithm:
    let mut trainer = AgentTrainer::new();
    for _ in 0..1000 {
        // Each episode starts from a fresh agent; the trainer keeps the values learned so far
        let mut agent = MyAgent { state: MyState { x: 0, y: 0 } };
        trainer.train(
            &mut agent,
            &QLearning::new(0.2, 0.01, 2.),
            &mut StopAtTerminalState,
            &RandomExploration::new(),
        );
    }
Let me know if that helps. I think it's a common scenario, so I could put something in the project to make it easier.
> in many cases the board state cannot be improved any further, and it should backtrack to try a different action

Note that you are supposed to return all possible actions, not just actions that would lead to an improved state.
Thanks @milanboers!
One issue is that calling state.actions() inside should_stop() would repeat the generation of the actions list, which in my case is a fairly expensive process. Caching it is tricky because should_stop(&mut self, state: &S) receives the state by immutable reference, and fn actions(&self) is not mutable either, so cloning would be required (which might get expensive).
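For reference, the relevant signatures look roughly like this (a simplified sketch of the rurel traits, not the exact source):

    use std::hash::Hash;

    pub trait State: Eq + Hash + Clone {
        type A: Eq + Hash + Clone;
        fn reward(&self) -> f64;
        // Takes &self, so the generated list can't be stashed back into the state
        fn actions(&self) -> Vec<Self::A>;
    }

    pub trait TerminationStrategy<S: State> {
        // The state comes in as an immutable borrow
        fn should_stop(&mut self, state: &S) -> bool;
    }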
> in many cases the board state cannot be improved any further, and it should backtrack to try a different action
>
> Note that you are supposed to return all possible actions, not just actions that would lead to an improved state.

Of course. I always generate all available actions; it's just that all my actions are "additive" (as in Go): I keep adding to the board while it's possible, but at some point there are no more valid moves.
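To make that concrete, an "additive" state could look something like this (Board, Place and the scoring are made-up placeholders rather than the real game, just to show actions() returning every legal move and an empty list once the board is full):

    use rurel::mdp::State;

    #[derive(PartialEq, Eq, Hash, Clone)]
    struct Board {
        cells: Vec<Option<u8>>,
    }

    #[derive(PartialEq, Eq, Hash, Clone)]
    struct Place {
        cell: usize,
        value: u8,
    }

    impl State for Board {
        type A = Place;

        fn reward(&self) -> f64 {
            // Domain-specific scoring would go here
            self.cells.iter().filter(|c| c.is_some()).count() as f64
        }

        // Every legal move is returned, not only the "improving" ones.
        // Once the board is full this is empty, which is exactly what
        // StopAtTerminalState checks for.
        fn actions(&self) -> Vec<Place> {
            self.cells
                .iter()
                .enumerate()
                .filter(|(_, c)| c.is_none())
                .map(|(cell, _)| Place { cell, value: 1 })
                .collect()
        }
    }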
If generating the actions is expensive, you can try to generate only one possible action (at random). You're then essentially combining the action generation with the exploration strategy.
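As a sketch of that idea (reusing the hypothetical Board/Place types from above and the rand crate, both assumptions), actions() can produce a single randomly chosen legal move, so generation and exploration collapse into one step:

    use rand::seq::IteratorRandom;

    // Drop-in replacement for the actions() above: return one random legal
    // move instead of enumerating all of them (an empty Vec still means "terminal")
    fn actions(&self) -> Vec<Place> {
        let mut rng = rand::thread_rng();
        self.cells
            .iter()
            .enumerate()
            .filter(|(_, c)| c.is_none())
            .choose(&mut rng)
            .map(|(cell, _)| vec![Place { cell, value: 1 }])
            .unwrap_or_default()
    }

In this toy board the filter still scans every cell, but when the real move generation is the expensive part you would construct only the one move you picked.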
It's a bit of a shortcoming of putting things into a framework like this. You may just want to implement Q-learning yourself; the algorithm is relatively straightforward.
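The core of tabular Q-learning is a single update rule; a minimal framework-free sketch (the table layout, alpha and gamma are whatever you choose):

    use std::collections::HashMap;
    use std::hash::Hash;

    // Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    fn q_update<S, A>(
        q: &mut HashMap<(S, A), f64>,
        state: S,
        action: A,
        reward: f64,
        next_state_actions: &[(S, A)], // (s', a') for every action available in the next state
        alpha: f64,
        gamma: f64,
    ) where
        S: Hash + Eq + Clone,
        A: Hash + Eq + Clone,
    {
        // Best value obtainable from the next state; unvisited pairs default to 0
        let best_next = next_state_actions
            .iter()
            .map(|key| *q.get(key).unwrap_or(&0.0))
            .fold(f64::NEG_INFINITY, f64::max);
        // A terminal next state (no actions) contributes no future value
        let best_next = if best_next.is_finite() { best_next } else { 0.0 };

        let old = q.get(&(state.clone(), action.clone())).copied().unwrap_or(0.0);
        q.insert((state, action), old + alpha * (reward + gamma * best_next - old));
    }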