white127 / qa-deep-learning
TensorFlow and Theano CNN code for insurance QA (question-answer matching)
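Are the network weights still initialized with tf.random_uniform(), so that the initial weights follow a uniform distribution? I'm not sure whether I've understood this correctly and would appreciate an answer.
What results is everyone getting? Why can't I reach results as good as the reported ones?
How can I solve this? Thanks!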
python insqa_lstm.py
insqa_lstm.py:279: UserWarning: DEPRECATION: the 'ds' parameter is not going to exist anymore as it is going to be replaced by the parameter 'ws'.
pooled_out = pool.pool_2d(input=conv_out, ds=(sequence_len - filter_size + 1, 1), ignore_border=True, mode='max')
Traceback (most recent call last):
  File "insqa_lstm.py", line 428, in <module>
    train()
  File "insqa_lstm.py", line 399, in train
    x1: p1, x2: p2, x3: p3, m1: q1, m2: q2, m3: q3
  File "/Users/timothy/anaconda/envs/tensorflow/lib/python2.7/site-packages/theano/compile/function.py", line 317, in function
    output_keys=output_keys)
  File "/Users/timothy/anaconda/envs/tensorflow/lib/python2.7/site-packages/theano/compile/pfunc.py", line 449, in pfunc
    no_default_updates=no_default_updates)
  File "/Users/timothy/anaconda/envs/tensorflow/lib/python2.7/site-packages/theano/compile/pfunc.py", line 208, in rebuild_collect_shared
    raise TypeError(err_msg, err_sug)
TypeError: ('An update must have the same type as the original shared variable (shared_var=<TensorType(float32, matrix)>, shared_var.type=TensorType(float32, matrix), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, matrix)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')
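The message above says the update expression is float64 while the shared variable is float32. The NumPy sketch below only illustrates how such an upcast arises and how a cast back restores the type. The variable names are hypothetical and this is not the script's code. In Theano the analogous options would be running with floatX=float32 or wrapping the update in theano.tensor.cast, which has not been verified against insqa_lstm.py.

```python
import numpy as np

# Hypothetical stand-ins for a float32 shared parameter and a float64 gradient.
param = np.zeros((2, 3), dtype=np.float32)
grad = np.ones((2, 3), dtype=np.float64)
learning_rate = 0.1

# Mixing a float32 array with a float64 array upcasts the result to float64,
# which is the kind of mismatch the TypeError reports.
update_bad = param - learning_rate * grad

# Casting the update back to the parameter's dtype keeps the types consistent.
update_ok = (param - learning_rate * grad).astype(param.dtype)

print(update_bad.dtype, update_ok.dtype)
```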
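How is this used for prediction in practice? If the candidate answer set is large, the computation will be slow, won't it?
gen.py only generates the w2v.train file, doesn't it? Where are test2.prepro and predict1 generated?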
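One common way to keep prediction cheap with many candidates is to encode every answer once offline and score a new question with a single matrix product. The sketch below assumes the trained model can produce a fixed-length vector per question and per answer. The function names and the toy vectors are hypothetical and are not taken from the repository.

```python
import numpy as np

def l2_normalize(vectors):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, 1e-12)

def rank_answers(question_vec, answers_normed, top_k=3):
    """Return the indices and scores of the top_k candidates by cosine similarity."""
    q = question_vec / max(np.linalg.norm(question_vec), 1e-12)
    scores = answers_normed @ q            # one matrix-vector product for all candidates
    order = np.argsort(-scores)[:top_k]
    return order, scores[order]

# Hypothetical pre-computed answer embeddings, one row per candidate answer.
answers = l2_normalize(np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
top, top_scores = rank_answers(np.array([2.0, 0.1]), answers, top_k=2)
print(top, top_scores)
```

With this layout only the question has to pass through the network at query time; the answer matrix is reused across queries.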
Where do I get this file?
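Hi, I also work on NLP.
I tried to reproduce this myself and the results were poor. My architecture passes both q and a through an LSTM (with shared parameters), max-pools the outputs into a vector, and applies a triplet loss on the cosine similarities. I only reached 0.5, and it runs very slowly: I sample 100 negative answers per question. Is your model fast? Mine takes about a day to converge, and all my hyperparameters come from other people's papers...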
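For reference, the triplet loss on cosine similarity described above can be written as max(0, margin - cos(q, a+) + cos(q, a-)). The NumPy sketch below is a minimal illustration of that formula; the margin value of 0.05 and the toy vectors are assumptions, not values confirmed by the poster or the repository.

```python
import numpy as np

def cosine(a, b):
    """Row-wise cosine similarity between two (batch, dim) arrays."""
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return num / np.maximum(den, 1e-12)

def triplet_hinge_loss(q, a_pos, a_neg, margin=0.05):
    """Mean of max(0, margin - cos(q, a+) + cos(q, a-)) over the batch."""
    losses = np.maximum(0.0, margin - cosine(q, a_pos) + cosine(q, a_neg))
    return losses.mean()

# Toy batch: the first triple is already separated, the second is not.
q = np.array([[1.0, 0.0], [0.0, 1.0]])
a_pos = np.array([[1.0, 0.0], [1.0, 0.0]])
a_neg = np.array([[0.0, 1.0], [0.0, 1.0]])
loss = triplet_hinge_loss(q, a_pos, a_neg)
print(loss)
```

Triples whose positive already beats the negative by the margin contribute zero, so only the violating triples produce a gradient.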
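I used the author's code unchanged, with data taken from https://github.com/codekansas/insurance_qa_python and converted to the author's format. Once it was running, I found that train is used for training and test1 for validation.
With both the author's TensorFlow and Theano code, top-1 accuracy easily reaches 0.86, with a learning rate of 0.1 and roughly 6000 epochs; training takes under half an hour on a K80.
As far as I know, the state of the art on this task does not exceed 0.7, so the author's code beats it with ease. I checked the code carefully and found no obvious errors. Has anyone else obtained the same result?
The code I ran is here (I only cleaned it up and added the train and test data; the parameters and architecture are identical to the original author's): https://github.com/pcgreat/insuranceQA-cnn-lstm/tree/master/cnn/tensorflow
This should make it easy for everyone to reproduce the result.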
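At the moment, both the CNN and the RNN take their input one batch_size at a time. The problem is that the last batch of the data may contain fewer than batch_size examples. What should be done then? In MatConvNet the last batch is allowed to be smaller than batch_size. The tutorial I looked at simply skips the last batch. But what about test time, is it skipped there too? The tutorial's example is not very good. I may not have read enough TensorFlow code, especially for LSTMs, where the batch size has to be fixed in advance: state_init_R = tf.tile(init_R, [batch_size, 1]) requires batch_size to be specified. I asked around, and Theano is apparently more flexible in this respect. Since you have used both Theano and TensorFlow, you probably understand this in depth. This problem has bothered me for a while and I have not found a good solution. What do you think? Thanks!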
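One workaround that is independent of the framework is to pad the short final batch up to batch_size and discard the outputs of the filler rows afterwards. The sketch below assumes the model needs exactly batch_size rows per call; padded_batches is a hypothetical helper, and the row sum stands in for the real model call.

```python
import numpy as np

def padded_batches(data, batch_size):
    """Yield (batch, n_valid); every batch has exactly batch_size rows."""
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        n_valid = len(batch)
        if n_valid < batch_size:
            # Repeat the last row as filler; its outputs are dropped by the caller.
            pad = np.repeat(batch[-1:], batch_size - n_valid, axis=0)
            batch = np.concatenate([batch, pad], axis=0)
        yield batch, n_valid

data = np.arange(10).reshape(5, 2)      # 5 examples, so the last batch of 2 is short
outputs = []
for batch, n_valid in padded_batches(data, batch_size=2):
    scores = batch.sum(axis=1)          # stand-in for a fixed-batch-size model call
    outputs.extend(scores[:n_valid].tolist())
print(outputs)
```

At test time this evaluates every example exactly once, instead of skipping the remainder.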
rzai@rzai00:/prj/insuranceQA-cnn-lstm/lstm_cnn/theano$ python insqa_lstm.py
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled)
Traceback (most recent call last):
  File "insqa_lstm.py", line 433, in <module>
    train()
  File "insqa_lstm.py", line 387, in train
    num_filters=num_filters)
  File "insqa_lstm.py", line 259, in __init__
    cnn1 = self._cnn_net(tparams, cnn_input1, batch_size, sequence_len, num_filters, filter_sizes, proj_size)
  File "insqa_lstm.py", line 283, in _cnn_net
    conv_out = conv2d(input=cnn_input, filters=W, filter_shape=filter_shape, input_shape=image_shape)
  File "/usr/local/lib/python2.7/dist-packages/theano/tensor/nnet/conv.py", line 149, in conv2d
    imshp=imshp, kshp=kshp, nkern=nkern, bsize=bsize, **kargs)
TypeError: __init__() got an unexpected keyword argument 'input_shape'
What do the first and second columns mean?
Hello, which paper is this based on? Thanks.
Dear Authors,
I am happy to see your results on the insurance data set and was tempted to reproduce them on my side, but I could not replicate them on my test data. The reason is that you applied stop-word removal, text normalization and lemmatization to the text, so my test data and your test data do not match. If possible, could you please share the full test data, or 500 or 1000 test queries in the same form as the 20 test queries you have provided?
--Veera.
Thanks! If possible, could you send it to my email? [email protected]
The code fetches training data through: utils.gen_train_batch_qpn(train_data, FLAGS.batch_size)
But inside that function:
def gen_train_batch_qpn(_data, batch_size):
    psample = random.sample(_data, batch_size)
    nsample = random.sample(_data, batch_size)
    q = [s1 for s1, s2 in psample]
    qp = [s2 for s1, s2 in psample]
    qn = [s2 for s1, s2 in nsample]
    return np.array(q), np.array(qp), np.array(qn)
psample and nsample are drawn in exactly the same way??
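In the quoted function both samples come from the same list of (question, answer) pairs, so each negative is simply the answer of another randomly drawn pair and can occasionally coincide with the positive. The sketch below shows one way to rule that out. gen_train_triples is a hypothetical variant, not the repository's function, and it assumes the data contains at least two distinct answers.

```python
import random

def gen_train_triples(pairs, batch_size, rng=random):
    """Sample (question, positive, negative) triples.

    Hypothetical variant: the negative is redrawn until it differs from the
    positive answer of the same question (needs >= 2 distinct answers).
    """
    answers = [a for _, a in pairs]
    batch = []
    for q, a_pos in rng.sample(pairs, batch_size):
        a_neg = rng.choice(answers)
        while a_neg == a_pos:
            a_neg = rng.choice(answers)
        batch.append((q, a_pos, a_neg))
    return batch

pairs = [("q1", "a1"), ("q2", "a2"), ("q3", "a3")]
triples = gen_train_triples(pairs, 2, random.Random(0))
print(triples)
```

This does not handle a question that has several correct answers; excluding all of them would need a per-question set of positives.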
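The model uses only a single convolution layer, with filters of size (1, 2, 3, 5) * embedding_size. What if it were built like a typical image-processing model instead, with small sliding-window filters (e.g. 5*5) and several conv + max_pool stages? Would such a model have any problems?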
I have a question about
line 14
def build_vocab():
    code = int(0)
    vocab = {}
    vocab['UNKNOWN'] = code
    code += 1
    for line in open('/export/jw/cnn/insuranceQA/train'):
        items = line.strip().split(' ')
        for i in range(2, 3):  # this is the line in question: why isn't it range(2, 4)?
            words = items[i].split('_')
            for word in words:
                if not word in vocab:
                    vocab[word] = code
                    code += 1
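To make the question concrete: range(2, 3) yields only index 2, so only one field per line contributes words to the vocabulary, while range(2, 4) would cover fields 2 and 3. The snippet below demonstrates this on a made-up line; the "label qid question answer" layout is an assumption about the train file, not something the post confirms.

```python
# Hypothetical line in the assumed "label qid question answer" layout.
line = "1 qid:0 what_is_insurance it_covers_risk"
items = line.strip().split(' ')

cols_2_3 = [items[i] for i in range(2, 3)]   # visits index 2 only
cols_2_4 = [items[i] for i in range(2, 4)]   # visits indices 2 and 3

print(cols_2_3)
print(cols_2_4)
```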
Hello author. After training the model, I load the saved model and try to print a parameter, but I am told it does not exist:
ValueError: Fetch argument 'W:0' cannot be interpreted as a Tensor. ("The name 'W:0' refers to a Tensor which does not exist. The operation, 'W', does not exist in the graph.")
Visualizing with TensorBoard shows:
No dashboards are active for the current data set.
Probable causes:
You haven’t written any data to your event files.
TensorBoard can’t find your event files.
Could you help take a look?
Could the author release the LSTM+CNN TensorFlow code?
I converted original idx_xx format to real-word format (see ./insuranceQA/train ./insuranceQA/test1.sample)
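Where can I see this sample data, or what does its format look like?
During training, the answers in the train data are all correct answers, right? They are all qp, with no qn. Is my understanding wrong?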