The tensorflow2.0-for-deep-reinforcement-learning from huixxi

tensorflow2.0-for-deep-reinforcement-learning's Introduction

TensorFlow 2.0 for Deep Reinforcement Learning

This is a simple tutorial on deep reinforcement learning with TensorFlow 2.0, with simple demos and detailed model implementations to help beginners get started in this research area.

How to install TensorFlow 2.0

$ conda create --name tensorflow_2_0
$ conda activate tensorflow_2_0
$ pip install tensorflow==2.0.0-b1 # pip install tensorflow-gpu==2.0.0-b1 for GPU version

Test:

>>> import tensorflow as tf
>>> tf.__version__
'2.0.0-beta1'

TensorFlow 2.0 Tutorial

Python Tutorial

See my Fast Py3 repo for a quick Python 3 tutorial.

Gym Tutorial

Basic Gym

Reinforcement Learning

Book notes ...

Deep Reinforcement Learning

Rainbow (Building Rainbow Step by Step with TensorFlow2.0)
- Deep Q-Network
- +Double DQN
- +Prioritized Experience Replay
- +Dueling Network
- +Multi-Step Q-Learning
- +Distributional RL (not working yet, but I will try my best to make it work soon!)
- +Noisy Network (not working yet, but I will try my best to make it work soon!)

Paper Reading

Visit my personal blog, HU's Blog, for a list of RL paper overviews.

tensorflow2.0-for-deep-reinforcement-learning's Issues

Where do you use "next_state" in def train(self)?

You get next_state from self.get_n_step_info(self.n_step_buffer, self.gamma), but next_state is never used.
Maybe self.store_transition(p, obs, action, reward, next_obs, done) should be self.store_transition(p, obs, action, reward, next_state, done).
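
To illustrate why the distinction matters, here is a minimal sketch of an n-step fold. It is not the repository's code: the body of get_n_step_info and the (obs, action, reward, next_obs, done) tuple layout are assumptions made for illustration; only the function name and its two arguments come from the issue above.

```python
from collections import deque


def get_n_step_info(n_step_buffer, gamma):
    """Fold buffered 1-step transitions into one n-step transition.

    Assumed layout of each entry: (obs, action, reward, next_obs, done).
    Returns (n_step_reward, next_state, done), where next_state is the
    state reached after the last step that actually counts.
    """
    reward, next_state, done = n_step_buffer[-1][-3:]
    # Walk backwards from the second-to-last transition to the first.
    for _, _, r, n_o, d in reversed(list(n_step_buffer)[:-1]):
        # A terminal step cuts off every reward that came after it.
        reward = r + gamma * reward * (1 - d)
        if d:
            next_state, done = n_o, d
    return reward, next_state, done


# Hypothetical 3-step buffer with string placeholders for states.
buffer = deque(maxlen=3)
buffer.append(("s0", 0, 1.0, "s1", False))
buffer.append(("s1", 1, 2.0, "s2", False))
buffer.append(("s2", 0, 3.0, "s3", False))

reward, next_state, done = get_n_step_info(buffer, gamma=0.5)
obs, action = buffer[0][:2]
# The stored transition pairs the oldest obs/action with the n-step target.
n_step_transition = (obs, action, reward, next_state, done)
print(n_step_transition)
```

In this sketch the discounted reward spans three steps, so the matching bootstrap state is "s3", not the 1-step next_obs "s1". Storing next_obs alongside an n-step reward would mix the two horizons, which is the mismatch the issue points at.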

Agent doesn't learn anything

Thank you so much for sharing this code, it is truly helpful. However, the agent couldn't learn anything when I trained it on "Breakout-ram-v0" and "Pong-ram-v0". I tried different settings, such as:

buffer_size=100000, learning_rate=.0015, epsilon=.99, epsilon_dacay=0.9999,
min_epsilon=.1, gamma=.95, batch_size=64, target_update_iter=400, 
train_nums=10000, start_learning=200

The agent network is:

self.input_layer = tf.keras.layers.InputLayer(input_shape=(num_states,))
self.fc1 = tf.keras.layers.Dense(hidden_units, activation='relu', kernel_initializer='he_uniform')
self.fc2 = tf.keras.layers.Dense(hidden_units, activation='relu', kernel_initializer='he_uniform')
self.output_layer = tf.keras.layers.Dense(num_actions, name='q_values')

The loss function is "mse" and the optimizer is Adam. Could anyone help? I'd really appreciate it!
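
One thing that may be worth checking with the settings quoted above is how far exploration has decayed by the end of training. The sketch below assumes epsilon is multiplied by epsilon_dacay once per training step and clipped at min_epsilon; the issue does not say how the schedule is applied, so treat this as an assumption, and the helper names are hypothetical.

```python
import math


def epsilon_after(steps, epsilon=0.99, decay=0.9999, min_epsilon=0.1):
    """Epsilon after `steps` multiplicative decays, clipped at min_epsilon.

    Assumed schedule: epsilon <- max(min_epsilon, epsilon * decay) per step.
    """
    return max(min_epsilon, epsilon * decay ** steps)


def steps_to_min(epsilon=0.99, decay=0.9999, min_epsilon=0.1):
    """Smallest number of decays needed for epsilon to reach min_epsilon."""
    return math.ceil(math.log(min_epsilon / epsilon) / math.log(decay))


final_epsilon = epsilon_after(10_000)  # train_nums=10000 from the issue
needed_steps = steps_to_min()
print(f"epsilon after 10000 steps: {final_epsilon:.3f}")
print(f"steps needed to reach min_epsilon: {needed_steps}")
```

Under this assumed schedule, epsilon is still about 0.36 when the 10000 training steps end and would need roughly 23000 steps to reach 0.1, so more than a third of the actions would remain random for the whole run. This is only one possible factor and does not rule out other causes.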

NoisyDense incorrect sigma init?

You have this code when initializing sigma:

sigma_initializer = tf.constant_initializer(self.std_init / np.sqrt(self.units))

Reading the original paper (section 3.2), I would expect the init to be like this:

sigma_initializer = tf.constant_initializer(self.std_init / np.sqrt(input_dim))

Was this change intended? Or maybe I'm misunderstanding the paper?
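
To see how much the two variants can differ, here is a small numeric sketch. The layer sizes and the std_init value are hypothetical, chosen only for illustration, and sigma_init is a helper name introduced here, not part of the repository.

```python
import math


def sigma_init(std_init, fan):
    """Constant initial sigma: std_init divided by the square root of `fan`."""
    return std_init / math.sqrt(fan)


std_init = 0.5   # assumed value for illustration
input_dim = 128  # hypothetical number of inputs to the layer
units = 4        # hypothetical number of outputs (e.g. one per action)

by_units = sigma_init(std_init, units)       # variant currently in the code
by_inputs = sigma_init(std_init, input_dim)  # variant proposed in this issue

print(f"std_init / sqrt(units)     = {by_units:.4f}")
print(f"std_init / sqrt(input_dim) = {by_inputs:.4f}")
```

With these example sizes the units-based value is 0.25 and the input-based value is about 0.044, so a narrow output layer would start with several times more noise than the input-based scaling gives. The two variants coincide only when the layer has as many inputs as outputs.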
