
iyanuoluwa-vic / learning-paths-using-reinforcement-learning-alphazero


An agent-based system that, over several iterations, explores and learns an initially unknown environment: a five-by-five grid made up of three types of spaces (normal locations, pickup locations, and drop-off locations).

Python 100.00%
alpha-zero artficial-intelligence experiment machine-learning python q-learning-algorithm

learning-paths-using-reinforcement-learning-alphazero's Introduction

AlphaZero

Steps to Run

-Download Python 3

-Run main.py

-Run visualization.py

For this assignment, our group designed an agent-based system that, over several iterations, explores and learns an initially unknown environment: a five-by-five grid made up of three types of spaces (normal locations, pickup locations, and drop-off locations). The agent was tasked with moving items from the pickup locations to the drop-off locations, with the aim that, through reinforcement learning, it would move more efficiently over time and learn to complete its assigned task in fewer iterations.

Our group conducted five experiments, each using a different combination of policies, to better understand their effects on the agent and its ability to solve the task. Each time an experiment is run, the data we deemed interesting (the Q-tables, the number of times each cell is visited, the percentage of cells visited, etc.) is written to a file.

Initially, we treated our Q-tables slightly differently from what we had seen so far. Instead of storing only the Q-values of each action in each cell, we stored the Q-values of each action for every combination of the agent's location, whether or not the agent was holding a block, and the state of each pickup and drop-off location (that is, whether blocks were still available at each pickup location and whether each drop-off location was full). Although interesting, this produced an overwhelming amount of data to comb through. In addition to these Q-tables, we therefore also produced reduced Q-tables that record only the Q-values of each action in each cell, together with whether or not the agent was holding a block.

This paper communicates the findings of each run of each experiment, any interesting trends we noticed in these experiments, and which combination of learning algorithm and policy type, in our opinion and based on our results, proved the most useful.
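
As a rough illustration of the reduced Q-table described above (Q-values per action, per cell, per holding-a-block flag), here is a minimal sketch. It is not the code from main.py: the action names, the learning rate and discount factor, the epsilon-greedy policy, and the helper names `q_update`, `epsilon_greedy`, and `percent_visited` are all assumptions chosen for illustration.

```python
import random
from collections import defaultdict

# Every name and numeric value here is an illustrative assumption,
# not something taken from the repository.
ACTIONS = ["north", "south", "east", "west", "pickup", "dropoff"]
GRID_SIZE = 5   # the five-by-five grid from the text
ALPHA = 0.3     # learning rate (assumed)
GAMMA = 0.5     # discount factor (assumed)

# Reduced Q-table: key is (row, col, holding_block),
# value maps each action name to its Q-value (initially 0.0).
q_table = defaultdict(lambda: {a: 0.0 for a in ACTIONS})


def q_update(state, action, reward, next_state, applicable_next):
    """One Q-learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(
        (q_table[next_state][a] for a in applicable_next), default=0.0
    )
    td_target = reward + GAMMA * best_next
    q_table[state][action] += ALPHA * (td_target - q_table[state][action])
    return q_table[state][action]


def epsilon_greedy(state, applicable, epsilon=0.1, rng=random):
    """Explore at random with probability epsilon, else take the best-known action."""
    if rng.random() < epsilon:
        return rng.choice(applicable)
    return max(applicable, key=lambda a: q_table[state][a])


def percent_visited(visit_counts):
    """Percentage of grid cells visited at least once."""
    visited = sum(1 for row in visit_counts for count in row if count > 0)
    return 100.0 * visited / (GRID_SIZE * GRID_SIZE)


if __name__ == "__main__":
    # The agent at (0, 0) without a block moves east and receives a step penalty.
    state, next_state = (0, 0, False), (0, 1, False)
    new_q = q_update(state, "east", -1.0, next_state, ["south", "east", "west"])
    print("Q((0, 0, False), east) =", new_q)

    visits = [[0] * GRID_SIZE for _ in range(GRID_SIZE)]
    visits[0][0] += 1
    visits[0][1] += 1
    print("percent of cells visited:", percent_visited(visits))
```

Keying the table on `(row, col, holding_block)` gives at most 5 × 5 × 2 = 50 states, which is what makes this version easier to read than a table that also tracks the state of every pickup and drop-off location.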

Loom.Message.-.21.April.2022.1.mp4
