
mids-w251-fa2020-hw11

This repository is my homework #11 for MIDS W251 (FA2020).

The links to videos are here:

Note: I cannot play these directly in my browser, but if I download and open them, they play fine. Must be some unusual codec.

Questions:

What parameters did you change, and what values did you use?

I changed the number of neurons in the layers. My best run was 150/120. I also changed the learning rate to 0.0005 (cut it in half, basically). Another thing I tried was increasing the epsilon decay to 0.997. Finally, I increased the number of epochs to 10.
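
For reference, here is a minimal sketch of roughly where those values could plug into a Keras-style DQN. The layer sizes and hyperparameters come from the answer above, but the environment dimensions and names like `build_model` are my illustrative assumptions, not the exact homework code.

```python
# A hypothetical sketch, not the homework's actual code.
import tensorflow as tf

STATE_SIZE = 8           # LunarLander-v2 observation size (assumed environment)
ACTION_SIZE = 4          # LunarLander-v2 action count (assumed)
LEARNING_RATE = 0.0005   # cut roughly in half from a typical default of 0.001
EPSILON_DECAY = 0.997    # slower decay -> the agent explores for longer
NUM_EPOCHS = 10          # epochs per training/replay call

def build_model():
    # 150/120 neurons in the two hidden layers, per the best run described above
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(150, activation="relu", input_shape=(STATE_SIZE,)),
        tf.keras.layers.Dense(120, activation="relu"),
        tf.keras.layers.Dense(ACTION_SIZE, activation="linear"),
    ])
    model.compile(loss="mse",
                  optimizer=tf.keras.optimizers.Adam(learning_rate=LEARNING_RATE))
    return model
```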

Why did you change these parameters?

I changed the number of neurons because the original network seemed very small and not capable of learning very much. I reduced the learning rate because, when I watched the training process, I got the sense that it would over-correct; that could have been my imagination, but it seemed to help. I hoped that increasing the epochs would mean it learned more from each episode, but I'm not sure it did anything. Increasing the decay rate meant that it would take more random actions, which I guessed would mean it learned more at the expense of taking longer to converge.

Did you try any other changes (like adding layers or changing the epsilon value) that made things better or worse?

I tried a few other things too: I added a third layer to the network, I added a new feature (sqrt(x-pos^2 + y-pos^2), the total distance to the goal), and I also modified gamma. The new feature didn't hurt, but it didn't help as much as I had hoped. The other changes moved me further from the goal and sometimes didn't converge. Probably the most important change didn't really have anything to do with the network: I moved the generation of the video to every 20th episode. This dramatically sped up the execution of each episode and allowed me to iterate faster and experiment more with the parameters.
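
As a rough illustration of those two tweaks, here is a sketch that assumes a LunarLander-style state where the first two elements are the x and y positions; the function and variable names are hypothetical.

```python
# Hypothetical sketch; assumes the first two state elements are the x/y position.
import math
import numpy as np

def add_distance_feature(state):
    """Append sqrt(x^2 + y^2), the straight-line distance to the goal."""
    distance = math.sqrt(state[0] ** 2 + state[1] ** 2)
    return np.append(state, distance)

NUM_EPISODES = 1000   # illustrative
RECORD_EVERY = 20     # only generate video every 20th episode

for episode in range(NUM_EPISODES):
    record_video = (episode % RECORD_EVERY == 0)
    # ... run the episode here, calling the renderer / video recorder only
    # when record_video is True; skipping rendering is what speeds things up
```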

Did your changes improve or degrade the model? How close did you get to a test run with 100% of the scores above 200?

Scores for my best model

I didn't make it to 100%; my best run reached 93%. I tried a bunch of different changes (as described for the previous questions), but I was not able to improve on my best run.

Based on what you observed, what conclusions can you draw about the different parameters and their values?

  • Epsilon has a narrow useful range: make it too small and training will take ages; make it too big and the agent doesn't learn as much.
  • The initial settings for the number of neurons in the layers were too small by a good bit. On the other hand, making them 100x bigger doesn't actually help that much, all other things being equal.
  • Adding a bunch of epochs doesn't really seem to do much.

What is the purpose of the epsilon value?

The epsilon value dictates how much the learning process will explore new (random) approaches to solving the problem. More specifically, epsilon represents the probability that the agent will try something random, so 1-epsilon is the probability that the agent will use its best available information to decide what the next action should be. We start out relatively high, because the agent hasn't learned much and needs to explore to figure things out. We decay it over time because, as the agent learns, we want it to rely more and more on the best available information, reinforcing what it has learned.
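
A minimal epsilon-greedy sketch of that idea, with illustrative variable names (the actual homework code may look different):

```python
# Illustrative epsilon-greedy action selection; variable names are assumptions.
import random
import numpy as np

epsilon = 1.0          # start almost fully exploratory
EPSILON_MIN = 0.01     # never stop exploring entirely
EPSILON_DECAY = 0.997  # the decay rate discussed above

def choose_action(model, state, action_size):
    """With probability epsilon, act randomly; otherwise act on the network's
    best current estimate of the action values."""
    if random.random() < epsilon:
        return random.randrange(action_size)
    q_values = model.predict(state[np.newaxis, :], verbose=0)
    return int(np.argmax(q_values[0]))

# After each episode, shrink epsilon so the agent relies more on what it knows.
epsilon = max(EPSILON_MIN, epsilon * EPSILON_DECAY)
```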

Describe "Q-Learning".

Q-learning is the process we use to train a neural network to figure out what the 'best' action is given the current state of a problem. The network learns a value Q(state, action), an estimate of the total future reward for taking that action, and during training each estimate is nudged toward the reward actually observed plus the discounted value of the best action in the next state. Combined with an epsilon-greedy policy, this governs how the agent explores possible options versus reinforces existing learning over time.
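
As a rough sketch of that core update, assuming a discount factor gamma and a network that predicts one Q-value per action (the names here are illustrative, not the homework's actual code):

```python
# Illustrative Bellman target for Q-learning; GAMMA and names are assumptions.
import numpy as np

GAMMA = 0.99  # discount factor for future rewards

def q_learning_target(model, reward, next_state, done):
    """Target value: the observed reward plus the discounted value of the best
    action the network currently predicts for the next state."""
    if done:
        return reward
    next_q = model.predict(next_state[np.newaxis, :], verbose=0)[0]
    return reward + GAMMA * np.max(next_q)

# Training then fits the network so Q(state, action) moves toward this target,
# which is how the estimate of the 'best' action improves over time.
```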
