
Comments (8)

timoklein avatar timoklein commented on August 22, 2024

> Is it a tiny issue like an edge case or something very big?

I'm pretty sure the dead neuron tracking is correct. But I think the reset procedure isn't doing what it's supposed to in terms of reducing the number of dead neurons.

> I am writing some code based on your implementation and will be analyzing it thoroughly, so maybe I will be able to help.

Sure, sounds great. For me, this is just a side project, and I haven't been able to find the time to debug it alongside other commitments. I've seen other people forking it, which is why I added the disclaimer. Hopefully, I'll make some progress next week :)

from redo.

JankowskiChristopher avatar JankowskiChristopher commented on August 22, 2024

Thanks a lot


timoklein avatar timoklein commented on August 22, 2024

@JankowskiChristopher

I've had time to get back to this over the last week (sorry it's so late; I had some teaching duties) and ran a bunch of experiments with different fixes. Here's some info on the current status:

  • The dead neuron counting is correct as far as I can see.
  • The way I was iterating over the layers for the reset is overly complicated. It will be simplified.
  • Currently, the code doesn't reduce the $\tau=0.0$ dead neuron fraction by much. I believe the reason is that I was resetting both the weights AND the bias when resetting the outgoing weights. Setting both weight and bias to zero for a neuron obviously introduces a new dead neuron :)
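To make the last point concrete, here is a minimal pure-Python sketch of how a $\tau$-dormant neuron could be detected, and why zeroing a neuron's weights and bias creates one. It assumes the normalized activation score described in the ReDo paper (a neuron's mean absolute activation divided by the layer average) and a ReLU layer. `layer_forward` and `dormant_mask` are hypothetical helper names for illustration, not functions from this repo.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def layer_forward(weights, biases, inputs):
    """Forward pass of one ReLU layer; weights[i] holds the ingoing weights of neuron i."""
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def dormant_mask(activations, tau=0.0):
    """Flag neurons whose normalized mean |activation| over the batch is <= tau.

    activations: one list of neuron activations per input sample.
    Assumption: score_i = mean|h_i| / (average of mean|h_j| over the layer).
    """
    n_neurons = len(activations[0])
    mean_abs = [
        sum(abs(sample[i]) for sample in activations) / len(activations)
        for i in range(n_neurons)
    ]
    layer_mean = sum(mean_abs) / n_neurons
    if layer_mean == 0.0:
        # Every neuron is silent, so every neuron counts as dormant.
        return [True] * n_neurons
    return [m / layer_mean <= tau for m in mean_abs]


# Neuron 1 has zero weights and zero bias, so its ReLU output is 0 for any input.
weights = [[0.5, -0.2], [0.0, 0.0], [0.3, 0.8]]
biases = [0.1, 0.0, -0.1]
batch = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.5]]

activations = [layer_forward(weights, biases, x) for x in batch]
mask = dormant_mask(activations, tau=0.0)
print("dormant fraction:", sum(mask) / len(mask))
```

With $\tau=0.0$, only neurons that are silent on the whole batch are flagged, which is the case a zeroed weight and bias pair produces.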

I'm running experiments right now, and I'm tentatively optimistic that I've fixed the weight resets. Whether it outperforms the baseline remains to be seen: in my runs, the baseline doesn't plateau on DemonAttack as it does in the paper, so ReDo might not improve on it as much as the paper claims.
I'm going to check whether the momentum resets are actually correct; once that is confirmed, I'll push the changes and update the README with some results.


JankowskiChristopher avatar JankowskiChristopher commented on August 22, 2024

@timoklein
Thanks a lot!
I have a small question regarding resetting the biases: do we need to reset them?
The paper mostly mentions weights (the words "bias"/"biases" do not occur at all). That was clear to me, since a dead neuron corresponds to one ingoing/outgoing set of weights. It was less clear for biases: my understanding was that a bias corresponds to connections from many neurons, so maybe we should not reset it?
I have not analyzed the official source code in much detail, though. The word "bias" does appear in the hard reset part, which is understandable, as everything is reset there.


timoklein avatar timoklein commented on August 22, 2024

As far as I understand, the official source code only resets the bias that belongs to the ingoing weights, and it does so here:
https://github.com/google/dopamine/blob/485ea995655ebdf58a725dff5ec954b8847cae5f/dopamine/labs/redo/weight_recyclers.py#L600-L603

I don't think bias resets for the outgoing layers are needed. The goal of resetting the outgoing weights to 0 is to keep the random features obtained from resetting the ingoing weights from affecting the computations within the neural net too much. For that, it's enough to reset the weights.
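The reasoning above can be sketched in pure Python for a tiny two-layer network. This is an illustration under stated assumptions, not the code from this repo or from dopamine: `forward` and `recycle_neuron` are hypothetical names, the ingoing weights are redrawn uniformly at random, and the ingoing bias is set to zero. The outgoing weights of the recycled neuron are zeroed while the next layer's bias is left alone, so the network output does not change at the moment of the reset.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def forward(params, x):
    """Two-layer MLP: ReLU hidden layer followed by a linear output layer."""
    hidden = [
        relu(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(params["w1"], params["b1"])
    ]
    out = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(params["w2"], params["b2"])
    ]
    return hidden, out


def recycle_neuron(params, idx, rng, scale=0.5):
    """Reset hidden neuron `idx` in place (sketch, assumptions noted below)."""
    n_in = len(params["w1"][idx])
    # Ingoing weights: fresh random values (assumed uniform in [-scale, scale]).
    params["w1"][idx] = [rng.uniform(-scale, scale) for _ in range(n_in)]
    # Ingoing bias: reset together with the ingoing weights (assumed zero).
    params["b1"][idx] = 0.0
    # Outgoing weights: zeroed so the new random feature has no immediate effect.
    for row in params["w2"]:
        row[idx] = 0.0
    # params["b2"] is deliberately untouched: it belongs to the next layer's
    # neurons, not to the recycled one.


params = {
    "w1": [[0.5, -0.2], [-1.0, -1.0]],  # hidden neuron 1 is dead for x >= 0
    "b1": [0.1, -0.5],
    "w2": [[0.7, 0.4], [-0.3, 0.9]],
    "b2": [0.2, -0.1],
}
x = [1.0, 2.0]

_, out_before = forward(params, x)
recycle_neuron(params, idx=1, rng=random.Random(0))
_, out_after = forward(params, x)
print(out_before, out_after)
```

Because the dead neuron contributed nothing before the reset and its outgoing weights are zero afterwards, the output is preserved without touching the outgoing layer's bias.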


timoklein avatar timoklein commented on August 22, 2024

The implementation on the main branch should now be correct btw.


JankowskiChristopher avatar JankowskiChristopher commented on August 22, 2024

Awesome, thanks a lot.


timoklein avatar timoklein commented on August 22, 2024

Closing this with #2. Feel free to open a new issue if you have more questions.
