Comments (1)
# The code says it uses a td_error-based actor-critic algorithm, but when the actor's gradient is actually computed, q is used instead of td_error. Modified as follows:
def learn(self, s, a, r, s_):
    s, s_ = s[np.newaxis, :], s_[np.newaxis, :]
    next_a = [[i] for i in range(N_A)]
    s_ = np.tile(s_, [N_A, 1])
    q_ = self.sess.run(self.td_error, {self.s: s_, self.a: next_a})
    q_ = np.max(q_, axis=0, keepdims=True)
    q, _ = self.sess.run([self.q, self.train_op],
                         {self.s: s, self.q_: q_, self.r: r, self.a: [[a]]})
    return q
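Just read the original source; it uses td_error. This author kept changing things and ended up introducing errors, and also removed the attribution to the original source.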
https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow/blob/967c829335fa34a329b7976b29fc1f579776d67f/contents/8_Actor_Critic_Advantage/AC_CartPole.py#L74
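The point under discussion, weighting the actor's log-probability by the TD error rather than by q, can be sketched in plain NumPy. This is only an illustration under the standard one-step actor-critic formulation, not the code of either repository; the function names, the discount factor 0.9, and the sample numbers are assumptions made for the example.

```python
import numpy as np

GAMMA = 0.9  # discount factor; assumed value for this illustration


def td_error(r, v_s, v_next, gamma=GAMMA):
    """One-step TD error: delta = r + gamma * V(s') - V(s)."""
    return r + gamma * v_next - v_s


def actor_objective(action_probs, a, delta):
    """Quantity the actor maximizes: log pi(a|s) * delta.

    delta is treated as a constant here, i.e. no gradient would
    flow back into the critic through this term.
    """
    return np.log(action_probs[a]) * delta


def critic_loss(delta):
    """Squared TD error, which the critic minimizes."""
    return delta ** 2


# Hypothetical single transition: reward 1.0, V(s) = 0.5, V(s') = 0.6.
delta = td_error(r=1.0, v_s=0.5, v_next=0.6)
probs = np.array([0.25, 0.75])  # hypothetical policy output for state s
objective = actor_objective(probs, a=1, delta=delta)
loss = critic_loss(delta)
```

With this weighting, a positive TD error raises the probability of the action that was taken and a negative one lowers it, which is the behavior the reply attributes to the original source.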
from tensorflow_practice.
Related Issues (20)
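- Can anyone run DKN? What compute resources does it need?
- Bug in RL/Policy_Gradient/RL_brain.py
- Question about gbdt+lr
- DKN data preprocessing: TransE.cpp fails to run HOT 2
- Problem with DCG
- This variable should be set to trainable=False
- This should be reset to 0 on every entry, shouldn't it?
- The FFM implementation seems to have a problem: this value gets updated every time, doesn't it?
- tensorflow_practice/basic/Basic-Transformer-Demo/ training problem
- The A2C dataset has been blocked on Baidu Netdisk
- Dataset link is dead HOT 15
- Which dataset did you use for the deepfm article? Could you share it?
- Could anyone who has the dataset share it? The link is dead HOT 1
- Dataset link no longer works HOT 2
- Question about the environment in Rainbow
- Some problems with the maddpg algorithm
- For everyone who needs the data HOT 1
- In the FFM code, why does tf.squeeze remove the size-1 dimensions? I don't quite understand
- Isn't the critic network in the AC algorithm wrong? td_error should be computed from v; what is r + gamma * q?
- Could you share the dataset link?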