
Comments (3)

milanboers commented on July 28, 2024

I think you could do something like this (pseudo code because I haven't touched Rust in a while 😁):

struct StopAtTerminalState;

impl<S: State> TerminationStrategy<S> for StopAtTerminalState {
    fn should_stop(&mut self, state: &S) -> bool {
        // A state with no possible actions left is terminal.
        state.actions().is_empty()
    }
}

so the algorithm stops on a terminal state.

The problem then is that it may not have discovered much yet, so you'll have to run multiple episodes of the algorithm:

let mut trainer = AgentTrainer::new();

for _ in 0..1000 {
    let mut agent = MyAgent { state: MyState { x: 0, y: 0 }};
    trainer.train(&mut agent,
                  &QLearning::new(0.2, 0.01, 2.),
                  &mut StopAtTerminalState,
                  &RandomExploration::new());
}

Let me know if it helps. I think it's a common scenario, so I could put something in the project to make it easier.


> in many cases the board state cannot be improved any further, and it should backtrack to try a different action

Note that you are supposed to return all possible actions, not just actions that would lead to an improved state.


nyurik commented on July 28, 2024

Thanks @milanboers!

One issue is that calling state.actions() inside should_stop() would repeat the actions-list generation, which in my case is a fairly expensive process. Caching it is tricky because should_stop(&mut self, state: &S) gets a non-mutable state, and fn actions(&self) takes a non-mutable receiver too, so cloning would be required (which might get expensive).
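
One possible workaround (just a sketch, not anything rurel provides): memoize the action list inside the state with interior mutability, so repeated calls to actions() regenerate nothing. MyState, MyAction, and generate_actions here are hypothetical placeholders, and the manual Eq/Hash impls exist so the cache field doesn't affect state identity:

use std::cell::RefCell;
use std::hash::{Hash, Hasher};

#[derive(Clone, PartialEq, Eq, Hash)]
enum MyAction {
    Place(usize),
}

#[derive(Clone)]
struct MyState {
    board: Vec<u8>,
    // Memoized action list; deliberately excluded from Eq/Hash below.
    cached_actions: RefCell<Option<Vec<MyAction>>>,
}

impl MyState {
    // Stand-in for the real, expensive move generation.
    fn generate_actions(&self) -> Vec<MyAction> {
        (0..self.board.len())
            .filter(|&i| self.board[i] == 0)
            .map(MyAction::Place)
            .collect()
    }

    // Generates at most once; later calls only pay for the clone
    // (rurel's actions() returns by value, so one clone is unavoidable).
    fn actions_cached(&self) -> Vec<MyAction> {
        self.cached_actions
            .borrow_mut()
            .get_or_insert_with(|| self.generate_actions())
            .clone()
    }
}

// Identity is the board alone, so the cache never perturbs hashing.
impl PartialEq for MyState {
    fn eq(&self, other: &Self) -> bool {
        self.board == other.board
    }
}
impl Eq for MyState {}
impl Hash for MyState {
    fn hash<H: Hasher>(&self, h: &mut H) {
        self.board.hash(h);
    }
}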

> in many cases the board state cannot be improved any further, and it should backtrack to try a different action

> Note that you are supposed to return all possible actions, not just actions that would lead to an improved state.

Of course. I always generate all available actions; it's just that all my actions are "additive" (like in Go): I keep adding to the board while it's possible, but at some point there are no more valid moves.
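
For a state like that, the State impl might look roughly like this (a sketch using the hypothetical MyState/MyAction above; score() is a placeholder too). The action list shrinks as the board fills up and eventually becomes empty, which is exactly the condition StopAtTerminalState checks:

impl State for MyState {
    type A = MyAction;

    fn reward(&self) -> f64 {
        self.score() // hypothetical scoring function
    }

    // All legal "additive" moves: one per empty cell. Once the
    // board is full this is empty, i.e. the state is terminal.
    fn actions(&self) -> Vec<MyAction> {
        self.actions_cached()
    }
}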

from rurel.

milanboers commented on July 28, 2024

If generating the actions is expensive, you can try to generate only one possible action (at random); you're then essentially combining the action generation with the exploration strategy.
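
That could look roughly like this (a sketch; random_empty_cell is a hypothetical helper that samples one legal placement without enumerating them all):

// Return a single randomly sampled move instead of the full list.
// RandomExploration then has no real choice to make, so move
// generation and exploration collapse into one cheap step.
fn actions(&self) -> Vec<MyAction> {
    match self.random_empty_cell() {
        Some(cell) => vec![MyAction::Place(cell)],
        None => vec![], // no legal moves left: terminal
    }
}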

It's a bit of a shortcoming of putting things into a framework like this. You may just want to implement Q-learning yourself; the algorithm is relatively straightforward.
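
The core of tabular Q-learning is a single update rule, Q(s,a) ← Q(s,a) + α·(r + γ·max_a′ Q(s′,a′) − Q(s,a)). A minimal hand-rolled version (a sketch, assuming hashable state and action types) could look like:

use std::collections::HashMap;
use std::hash::Hash;

// One Q-learning update. alpha: learning rate, gamma: discount factor.
// next_actions holds the actions available in next_state.
fn q_update<S, A>(
    q: &mut HashMap<(S, A), f64>,
    state: S,
    action: A,
    reward: f64,
    next_state: &S,
    next_actions: &[A],
    alpha: f64,
    gamma: f64,
) where
    S: Hash + Eq + Clone,
    A: Hash + Eq + Clone,
{
    // max_a' Q(s', a'); a terminal next state contributes no future value.
    let max_next = if next_actions.is_empty() {
        0.0
    } else {
        next_actions
            .iter()
            .map(|a| *q.get(&(next_state.clone(), a.clone())).unwrap_or(&0.0))
            .fold(f64::NEG_INFINITY, f64::max)
    };

    let entry = q.entry((state, action)).or_insert(0.0);
    *entry += alpha * (reward + gamma * max_next - *entry);
}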

