Comments (3)

hongzimao commented on August 25, 2024

Which objective metric did you use when you observed the agent skipping bitrates 2 and 4? Note that, per Figure 3(b) and Example 2 in Section 3, this behavior might actually be desirable.

As for carrying a model forward: we didn't see a 2k-iteration model significantly outperform a 20k-iteration one in validation. Usually, the longer you train, the better the performance, unless there is significant overfitting.

karanrak commented on August 25, 2024

QoE linear was the metric used, right? I am referring to the pretrained model that you provided with the code. Just as shown in Figures 3a/3b, the model only chooses among the qualities 4.3 Mbps / 1.8 Mbps / 750 kbps / 300 kbps, skipping qualities 2 and 4 (1.2 Mbps and 2.85 Mbps).
It is indeed performing slightly better than other models that I've tried training (which don't ignore those bitrates), but is there some intuition/methodology behind achieving such behaviour?
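For reference, here is roughly how I understand the per-chunk linear QoE reward from the paper and the released training scripts (bitrate utility minus a rebuffering penalty minus a smoothness penalty); the constant values below are my reading of the code, so verify them against your copy:

```python
# Sketch of the per-chunk QoE_linear reward:
# bitrate utility - rebuffer penalty - smoothness penalty.
VIDEO_BIT_RATE = [300, 750, 1200, 1850, 2850, 4300]  # kbps; 6-level ladder
M_IN_K = 1000.0
REBUF_PENALTY = 4.3   # linear QoE: penalty per second of rebuffering
SMOOTH_PENALTY = 1.0  # penalty per Mbps of quality change between chunks

def qoe_linear(bit_rate, last_bit_rate, rebuf_sec):
    """Reward for one downloaded chunk; bit_rate indexes into VIDEO_BIT_RATE."""
    utility = VIDEO_BIT_RATE[bit_rate] / M_IN_K
    rebuffer = REBUF_PENALTY * rebuf_sec
    smoothness = SMOOTH_PENALTY * abs(
        VIDEO_BIT_RATE[bit_rate] - VIDEO_BIT_RATE[last_bit_rate]) / M_IN_K
    return utility - rebuffer - smoothness
```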

This is the process I am using for training/validation (a rough code sketch follows the list):

  1. Start training with entropy weight X (e.g., 1). Print the test results every 100 epochs (during model saving).
  2. If the test result is >= the current max, store that model separately as the new best model.
  3. Continue for 30k iterations, then use the best model as the base for the next 30k iterations with a lower entropy weight.

I am using the provided training and testing sets for the above. Is there something I am missing? I sometimes get my "best model" as early as 8k iterations out of 30k, and at other times it might be at 28k iterations. That seems unlikely to be overfitting, since the results are oscillating near the max values, which shouldn't occur under overfitting, right?
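In code, the loop I described looks roughly like this (`train_iteration`, `evaluate_on_test_set`, `save_model`, and `load_model` are placeholder names for illustration, not functions from the pensieve codebase, and the annealing schedule is just an example):

```python
# Rough sketch of the train/validate/anneal loop described above.
# All helper functions are placeholders, not pensieve APIs.
ITERS_PER_PHASE = 30_000
TEST_EVERY = 100
ENTROPY_SCHEDULE = [1.0, 0.5, 0.1]  # example weights, one per 30k-iteration phase

best_reward = float("-inf")
for entropy_weight in ENTROPY_SCHEDULE:
    for it in range(ITERS_PER_PHASE):
        train_iteration(entropy_weight)
        if it % TEST_EVERY == 0:
            reward = evaluate_on_test_set()    # e.g., mean QoE over test traces
            if reward >= best_reward:          # step 2: keep the running best
                best_reward = reward
                save_model("best_model.ckpt")
    load_model("best_model.ckpt")  # step 3: next phase starts from the best model
```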

hongzimao commented on August 25, 2024

Thanks for pointing this out; we didn't notice this behavior before. One intuition I can think of: it might reduce the variance of the policy (outputting just a subset of the actions). So as long as performance is improving, the agent has every incentive to reduce its entropy down to a subset of the actions. This might not be preferable in reality, and I think you might want to enlarge the training dataset to get rid of this issue.
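One quick way to check for this kind of collapse is to log the policy's entropy and per-action probabilities on validation states. A small sketch, where `policy_probs` stands for the actor's softmax output for one state (how you extract it depends on your setup):

```python
import numpy as np

def policy_entropy(policy_probs, eps=1e-12):
    """Shannon entropy (in nats) of one action distribution."""
    p = np.asarray(policy_probs, dtype=np.float64)
    return float(-np.sum(p * np.log(p + eps)))

# Example: a policy that has collapsed onto 4 of the 6 bitrates.
p = np.array([0.30, 0.25, 0.0, 0.25, 0.0, 0.20])
print(policy_entropy(p))          # ~1.38, well below log(6) ~ 1.79 for uniform
print(np.flatnonzero(p < 1e-3))   # actions the policy never takes -> [2 4]
```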

As for the overfitting: based on what you described, it doesn't sound like an overfitting issue. But you might want to checkpoint the model and test it on a validation set at each step to make sure.
