andyk / vanilla_policy_gradient
I coded up vanilla policy gradient in Google Colab from memory after carefully studying
# The following is copied & pasted from Aurelien Geron's O'Reilly book example code notebook called 18_reinforcement_learning.ipyn