
Comments (2)

jmacglashan commented on July 18, 2024

Hi,

To be clear, the code has two nested ifs: the outer one checks whether the states are the same, the inner one checks whether the actions are the same. So when the states and actions are the same, the trace is set to 1 (replacing traces). When the states are the same but the actions are different, the trace is reset to 0. Although there are different choices for how to apply replacing traces in control, this resetting of the non-selected actions' traces to zero is the approach advocated in the book:

https://webdocs.cs.ualberta.ca/~sutton/book/ebook/node80.html

There are several possible ways to generalize replacing eligibility traces for use in control methods. Obviously, when a state is revisited and a new action is selected, the trace for that action should be reset to 1. But what of the traces for the other actions for that state? The approach recommended by Singh and Sutton (1996) is to set the traces of all the other actions from the revisited state to 0.
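
To make the rule in that passage concrete, here is a minimal sketch in Python. It is not the BURLAP code itself: the function name `update_traces`, the dictionary layout keyed by `(state, action)`, and the `gamma`/`lam` parameters are all assumptions made for illustration, and the decay of traces for other states by `gamma * lam` is the usual convention rather than something stated in this thread.

```python
def update_traces(traces, state, action, gamma, lam):
    """Apply one replacing-trace update for the visited (state, action) pair.

    `traces` maps (state, action) tuples to floats and is modified in place.
    All names here are hypothetical; this is a sketch, not BURLAP's API.
    """
    found = False
    for (s, a) in list(traces):
        if s == state:
            if a == action:
                # Same state, same action: replace the trace with 1.
                traces[(s, a)] = 1.0
                found = True
            else:
                # Same state, different action: reset the trace to 0.
                traces[(s, a)] = 0.0
        else:
            # Different state: ordinary decay (assumed convention).
            traces[(s, a)] *= gamma * lam
    if not found:
        # First visit to this pair: start its trace at 1.
        traces[(state, action)] = 1.0
    return traces


if __name__ == "__main__":
    example = {("s0", "left"): 0.5, ("s0", "right"): 0.8, ("s1", "left"): 1.0}
    print(update_traces(example, "s0", "right", gamma=0.9, lam=0.5))
```

The two nested `if` statements mirror the structure described above: the outer one compares states, the inner one compares actions.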

Interestingly, the math they provide in the book is actually wrong: right after that text they give the accumulating-trace update for the same state-action pair, combined with the reset to zero for the same state with a different action :p However, the paper they're citing is clear and much more detailed on the reasoning. That paper is here:

http://www-all.cs.umass.edu/pubs/1995_96/singh_s_ML96.pdf

See the pseudocode on page 142
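
For reference, the difference described above can be written out as follows. This is my reading of the two rules as they are described in this comment, not a copy of either source; the symbols $\gamma$ (discount factor), $\lambda$ (trace decay), and $e_t(s,a)$ (trace at time $t$) are the usual notation and are assumed here.

```latex
% Update as the book's math is described above (accumulating for the selected pair):
e_t(s,a) =
\begin{cases}
1 + \gamma\lambda\, e_{t-1}(s,a) & \text{if } s = s_t \text{ and } a = a_t \\
0 & \text{if } s = s_t \text{ and } a \neq a_t \\
\gamma\lambda\, e_{t-1}(s,a) & \text{if } s \neq s_t
\end{cases}

% Replacing-trace update matching the text (and the behavior of the code discussed here):
e_t(s,a) =
\begin{cases}
1 & \text{if } s = s_t \text{ and } a = a_t \\
0 & \text{if } s = s_t \text{ and } a \neq a_t \\
\gamma\lambda\, e_{t-1}(s,a) & \text{if } s \neq s_t
\end{cases}
```

The only difference is the first case: the accumulating form adds 1 to the decayed trace, while the replacing form overwrites it with 1.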

Unless you think there is something else I'm missing, I'll close this.


mysl commented on July 18, 2024

Huh, interesting. It looks like the second version of Sutton's book is inconsistent with the first version (maybe because it isn't complete yet).
http://webdocs.cs.ualberta.ca/~sutton/book/the-book.html (page 161 definition, page 162 pseudocode)

Anyway, thanks very much for the confirmation and the references. I'm fine with closing it.

