putoze / rl_lecture

rl_lecture's Introduction

2023 NCHU RL Lecture

Use Python 3.9.16 and install the following libraries:

PyYAML (YAML parser)
NumPy (high-performance matrix library)
PyTorch, aka “torch” (tensor & neural network library)
OpenAI Gym version 0.25.2, aka “gym” (RL environment simulation library)
Pygame & imageio-ffmpeg (for OpenAI Gym video rendering)
scikit-learn (sklearn)
Matplotlib (data visualization)
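
To confirm that the environment matches the list above, a small check like the following may help. It is only a sketch: the PyPI distribution names in `REQUIRED` are assumptions (the list above names the libraries, not the pip packages), and `installed_versions` is a hypothetical helper, not part of the course code.

```python
from importlib import metadata

# Assumed PyPI distribution names for the libraries listed above.
REQUIRED = [
    "PyYAML", "numpy", "torch", "gym",
    "pygame", "imageio-ffmpeg", "scikit-learn", "matplotlib",
]

def installed_versions(names=REQUIRED):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

For gym, the reported version should read 0.25.2 to match the requirement above.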

HW Note

Hw1: Rabbit farm
To get you familiar with Python and its libraries.
Hw2: Value Function using matrix approach
NumPy is a good tool for solving matrix computation problems.
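
As a rough illustration of the matrix approach, the Bellman equation v = r + γPv for a fixed policy can be rearranged to (I − γP)v = r and solved directly. The two-state transition matrix `P`, reward vector `r`, and discount `gamma` below are made-up values for the sketch, not the homework's data.

```python
import numpy as np

# Hypothetical 2-state Markov reward process (not from the course material).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])   # P[s, s2] = probability of moving from s to s2
r = np.array([1.0, 0.0])     # expected immediate reward in each state
gamma = 0.9                  # discount factor

def state_values(P, r, gamma):
    """Solve v = r + gamma * P @ v as the linear system (I - gamma * P) v = r."""
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, r)

v = state_values(P, r, gamma)
print(v)
```

Using `np.linalg.solve` instead of explicitly inverting the matrix is usually the more numerically stable choice.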
Hw3: Value Function & optimal policy using iterative DP method
Getting to know more about RL through "Values" and "Policy".
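
One common iterative DP method is value iteration, sketched below on a tiny hand-made MDP. The `transitions` table, the action names, and `gamma` are all hypothetical, and the homework may use a different variant (for example policy iteration).

```python
# Hypothetical MDP: transitions[state][action] = [(probability, next_state, reward), ...]
transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}
gamma = 0.5

def q_value(values, state, action):
    """Expected one-step return of taking `action` in `state` under `values`."""
    return sum(p * (reward + gamma * values[nxt])
               for p, nxt, reward in transitions[state][action])

def value_iteration(tol=1e-10):
    """Sweep the Bellman optimality backup until the largest change is below tol."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s in transitions:
            best = max(q_value(values, s, a) for a in transitions[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    # The optimal policy is greedy with respect to the converged values.
    policy = {s: max(transitions[s], key=lambda a: q_value(values, s, a))
              for s in transitions}
    return values, policy

values, policy = value_iteration()
print(values, policy)
```

For this toy MDP the values converge to about 3 for state 0 and 4 for state 1, with the greedy policy choosing "go" in state 0 and "stay" in state 1.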
Hw4: Monte Carlo with Cat-Mouse environment
Trying out the Cat-Mouse environment with our first real RL method, "Monte Carlo".
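
The core of Monte Carlo prediction is averaging observed returns. The sketch below shows first-visit Monte Carlo on hand-written episodes; the episode format (a list of `(state, reward)` pairs) and the state names are assumptions for illustration and do not reflect the Cat-Mouse environment's actual interface.

```python
from collections import defaultdict

def first_visit_mc(episodes, gamma=1.0):
    """Estimate V(s) as the average first-visit return over all episodes.

    Each episode is a list of (state, reward) pairs, where reward is the
    reward received after leaving that state.
    """
    returns = defaultdict(list)
    for episode in episodes:
        g = 0.0
        first_return = {}
        # Walk backwards so g accumulates the discounted return from each step.
        for state, reward in reversed(episode):
            g = reward + gamma * g
            # Overwritten on earlier visits, so the first visit's return remains.
            first_return[state] = g
        for state, g_first in first_return.items():
            returns[state].append(g_first)
    return {state: sum(gs) / len(gs) for state, gs in returns.items()}

# Two hypothetical episodes over states "A" and "B".
episodes = [
    [("A", 0.0), ("B", 1.0)],
    [("A", 1.0), ("A", 0.0), ("B", 1.0)],
]
estimates = first_visit_mc(episodes)
print(estimates)
```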
Hw5: Q-Learning with Cat-Mouse env
A "model-free" method that computes "Q(S,A)" directly, without building the states as in the last homework.
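
A single tabular Q-learning step might look like the sketch below. The function name `q_learning_update`, the state and action labels, and the step sizes are hypothetical; only the update rule itself, Q(S,A) ← Q(S,A) + α[R + γ max Q(S',a) − Q(S,A)], is the standard one.

```python
from collections import defaultdict

def q_learning_update(q, state, action, reward, next_state, next_actions,
                      alpha=0.1, gamma=0.9, done=False):
    """Apply one tabular Q-learning update in place and return the TD error."""
    # Off-policy target: bootstrap from the best next action, whatever is taken next.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in next_actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return td_error

# Hypothetical Q-table keyed by (state, action); unseen pairs default to 0.
q = defaultdict(float)
q[("s1", "left")] = 1.0
q[("s1", "right")] = 2.0

td = q_learning_update(q, "s0", "right", reward=1.0, next_state="s1",
                       next_actions=["left", "right"], alpha=0.5, gamma=0.9)
print(td, q[("s0", "right")])
```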
Hw6: Sarsa & Expected Sarsa with epsilon greedy policy
Sarsa is a slightly more conservative strategy that explores more states, implemented here with "Epsilon Decay".
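
The difference between the two methods lies in the update target: Sarsa bootstraps from the action actually sampled next, while Expected Sarsa averages over the epsilon-greedy action probabilities. The helper names and the exponential decay schedule below are assumptions made for this sketch; the homework may decay epsilon differently.

```python
import random

def epsilon_greedy_probs(q_values, epsilon):
    """Action probabilities of an epsilon-greedy policy over a list of Q-values."""
    n = len(q_values)
    greedy = max(range(n), key=lambda a: q_values[a])
    probs = [epsilon / n] * n
    probs[greedy] += 1.0 - epsilon
    return probs

def sample_action(q_values, epsilon, rng=random):
    """Draw an action index from the epsilon-greedy distribution."""
    weights = epsilon_greedy_probs(q_values, epsilon)
    return rng.choices(range(len(q_values)), weights=weights)[0]

def sarsa_target(reward, gamma, next_q_values, next_action):
    """On-policy target: uses the next action that was actually sampled."""
    return reward + gamma * next_q_values[next_action]

def expected_sarsa_target(reward, gamma, next_q_values, epsilon):
    """Target that averages next Q-values under the epsilon-greedy policy."""
    probs = epsilon_greedy_probs(next_q_values, epsilon)
    return reward + gamma * sum(p * q for p, q in zip(probs, next_q_values))

def decayed_epsilon(episode, start=1.0, end=0.05, decay=0.99):
    """Assumed schedule: exponential decay per episode with a lower bound."""
    return max(end, start * decay ** episode)
```

Because Expected Sarsa removes the randomness of the sampled next action from the target, its updates tend to have lower variance than plain Sarsa at the cost of a little more computation per step.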
Hw7: PyTorch & CNN
An overview of CNN models and the PyTorch API.
Hw8: Semi-gradient TD(0) with NN function approximation
The Cat-Mouse env again! Learning our "Values" and "Policy" with an NN model. Interesting.
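
To show the shape of the semi-gradient TD(0) update without pulling in PyTorch, the sketch below swaps the neural network for a linear value function over hand-made feature vectors. That substitution, the function names, and the feature vectors are assumptions for illustration only. With a network, the same idea usually means computing the bootstrapped target without gradient tracking (for example under `torch.no_grad()`).

```python
def value(weights, features):
    """Linear value estimate v(s) = w . x(s)."""
    return sum(w * x for w, x in zip(weights, features))

def semi_gradient_td0_update(weights, features, reward, next_features,
                             alpha=0.1, gamma=0.9, done=False):
    """One semi-gradient TD(0) step for a linear value function; returns new weights."""
    target = reward + (0.0 if done else gamma * value(weights, next_features))
    td_error = target - value(weights, features)
    # "Semi-gradient": the target is treated as a constant, so the gradient of
    # v(s) with respect to the weights is just the feature vector x(s).
    return [w + alpha * td_error * x for w, x in zip(weights, features)]

# Hypothetical one-hot features for two states.
weights = [0.0, 0.0]
weights = semi_gradient_td0_update(weights, [1.0, 0.0], reward=1.0,
                                   next_features=[0.0, 1.0], alpha=0.5)
print(weights)
```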
Hw9: Q-Learning with NN & Acrobot environment
Off-policy Q-Learning sometimes causes the "Deadly Triad". How do we solve it?
Hw10: Using Policy-gradient with Monte Carlo to implement the missing code in pg_mc.py
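
For orientation, Monte Carlo policy gradient (often called REINFORCE) weights the log-probability of each chosen action by the return that followed it. The sketch below is not the missing code from pg_mc.py, whose contents are not shown here. The function names, the softmax-over-logits policy, and the inputs are all hypothetical, and the code only computes the loss value rather than training a network.

```python
import math

def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step of one episode."""
    returns = []
    g = 0.0
    for reward in reversed(rewards):
        g = reward + gamma * g
        returns.append(g)
    returns.reverse()
    return returns

def softmax(logits):
    """Numerically stable softmax over a list of action preferences."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_loss(step_logits, actions, rewards, gamma=0.99):
    """Monte Carlo policy-gradient loss: -sum_t G_t * log pi(a_t | s_t)."""
    returns = discounted_returns(rewards, gamma)
    loss = 0.0
    for logits, action, g in zip(step_logits, actions, returns):
        loss -= g * math.log(softmax(logits)[action])
    return loss
```

In a PyTorch implementation the same loss would typically be built from the policy network's log-probabilities and minimized with an optimizer, so that actions followed by high returns become more likely.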

