limteng-rpi / neural_name_tagging
Code for "Reliability-aware Dynamic Feature Composition for Name Tagging" (ACL2019)
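I think the method in this paper does not work:
I ran experiments on Chinese datasets, using jieba for word segmentation. The TensorFlow code is as follows: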
with tf.variable_scope('word_char_embedding_combine'):
    with tf.variable_scope('word_freq_conversion'):
        # Scale the word-frequency input by 0.005, then squash it with tanh
        word_freq_compress = self.word_freq_placeholder * 0.005
        word_freq_conversion = tf.tanh(word_freq_compress, name='tanh_word_freq')
        self.word_freq_conversion = word_freq_conversion
    # with tf.variable_scope("word_embedding_gate"):
    #     # word-level parameters
    #     word_w = tf.layers.dense(inputs=self.x_word_embedding, units=1, activation=None)
    #     char_w = tf.layers.dense(inputs=x_char_cnn_embedding, units=1, activation=None)
    #     freq_w = tf.layers.dense(inputs=word_freq_conversion, units=1, activation=None)
    #     b_w = tf.get_variable(shape=[1], name="b_w")
    #     word_gate = tf.sigmoid(freq_w + b_w + word_w + char_w, name='word_embedding_gate')
    self.word_gate = word_freq_conversion
Please check whether the code above contains any mistakes.
The hidden-layer part is handled the same way.
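For readers who want to check the numbers by hand, here is a minimal plain-Python sketch of the two formulas in the snippet: the frequency squashing `tanh(word_freq * 0.005)` and the commented-out gate `sigmoid(freq_w + b_w + word_w + char_w)`. It assumes `word_freq_placeholder` holds raw counts (the snippet does not say so), and the function names `squash_frequency` and `word_gate` are made up for illustration.

```python
import math


def squash_frequency(count, scale=0.005):
    """Scalar version of tanh(word_freq * 0.005) from the snippet."""
    return math.tanh(count * scale)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def word_gate(word_w, char_w, freq_w, b_w):
    """Scalar version of the commented-out gate.

    Each argument stands for the single output unit of one dense layer
    (units=1), so the gate is a sigmoid over their sum.
    """
    return sigmoid(freq_w + b_w + word_w + char_w)


# Assumption: the inputs are raw counts. The counts below are illustrative.
for count in (1, 10, 100, 1000):
    print(count, squash_frequency(count))
```

Under this assumption a count of 1 maps to about 0.005, and the output approaches 1 once the count reaches the high hundreds.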
Experimental results:
['同时', ',']
['同时<pad_char><pad_char>', ',<pad_char><pad_char><pad_char>']
word_freq: [1. 1. ]
word_gate: [0.98816895 0.9967069 ]
char_gate: [0.89430475 0.8514535 ]
['至此', ',']
['至此<pad_char><pad_char>', ',<pad_char><pad_char><pad_char>']
word_freq: [0.9715941 1. ]
word_gate: [0.99966156 0.9967069 ]
char_gate: [0.7587908 0.8514535 ]
['<unk_word>', '…']
['多大仇<pad_char>', '…<pad_char><pad_char><pad_char>']
word_freq: [0.00499996 1. ]
word_gate: [0.9449244 0.9953596 ]
char_gate: [0.7027143 0.87556285 ]
['<unk_word>', ',']
['鳞次栉比', ',<pad_char><pad_char><pad_char>']
word_freq: [0.00499996 1. ]
word_gate: [0.96310204 0.9967069 ]
char_gate: [0.68618333 0.8514535 ]
['<unk_word>', ',']
['正巧<pad_char><pad_char>', ',<pad_char><pad_char><pad_char>']
word_freq: [0.00499996 1. ]
word_gate: [0.7656248 0.9967069 ]
char_gate: [0.7440602 0.8514535 ]
As you can see, the gate values here have no linear relationship with the frequency of the words themselves; they look completely random.
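One rough way to put a number on this is to compute the Pearson correlation between the printed `word_freq` and `word_gate` values. The sketch below uses only the ten token pairs shown in the output above, so it is a sanity check on a very small sample with repeated tokens, not a result. The helper `pearson` is written here for illustration.

```python
import math

# (word_freq, word_gate) pairs copied from the printed output above
pairs = [
    (1.0, 0.98816895), (1.0, 0.9967069),
    (0.9715941, 0.99966156), (1.0, 0.9967069),
    (0.00499996, 0.9449244), (1.0, 0.9953596),
    (0.00499996, 0.96310204), (1.0, 0.9967069),
    (0.00499996, 0.7656248), (1.0, 0.9967069),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


freqs = [f for f, _ in pairs]
gates = [g for _, g in pairs]
r = pearson(freqs, gates)
print("Pearson r between word_freq and word_gate:", r)
```

Running the same check over a full evaluation set would be more informative than these few tokens.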
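My question is this: if the premise is that the embeddings of low-frequency words are not reliable enough, then the deciding factor is word frequency. What does the inaccurate word embedding itself have to do with it?
I then removed the word embeddings entirely, kept only the word frequency as the input, and added a linear transformation. The conclusion was:
The gate has very little effect; its range gets compressed during training.
In the hidden layer the range is even narrower, between 0.72 and 0.77.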
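A frequency-only gate of this kind can be sketched as follows, assuming the "linear transformation" is a scalar `w * f + b` followed by a sigmoid, where `f = tanh(0.005 * count)`. That form, the names `freq_only_gate` and `gate_range`, and the parameter values are all assumptions for illustration; none are taken from the actual run. Because `f` lies in `[0, 1)` for non-negative counts, the gate can only move between `sigmoid(b)` and `sigmoid(w + b)`. A small learned `w` therefore pins the gate to a narrow band whatever the frequency is.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def freq_only_gate(count, w, b, scale=0.005):
    """Gate that depends on word frequency alone (assumed form)."""
    return sigmoid(w * math.tanh(count * scale) + b)


def gate_range(w, b):
    """Bounds of the gate over all counts >= 0.

    tanh(scale * count) lies in [0, 1), so the pre-activation lies
    between b and w + b.
    """
    low, high = sorted((sigmoid(b), sigmoid(w + b)))
    return low, high


# Hypothetical parameters, chosen only to illustrate a narrow band.
low, high = gate_range(w=0.25, b=1.0)
print("gate stays within", low, "and", high)
```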
The final test results on the People's Daily, MSRA, boson and other datasets were all unsatisfactory, and worse than simply concatenating the char and word vectors.
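The two setups being compared can be sketched in plain Python. How the gates are applied is an assumption here: each embedding is multiplied by its own scalar gate before concatenation, which is only a guess based on the separate `word_gate` and `char_gate` values in the output. The function names and toy vectors are hypothetical.

```python
def concat(word_vec, char_vec):
    """Baseline: concatenate the word and char vectors unchanged."""
    return list(word_vec) + list(char_vec)


def gated_concat(word_vec, char_vec, word_gate, char_gate):
    """Assumed gated variant: scale each vector by its gate, then concatenate."""
    scaled_word = [word_gate * w for w in word_vec]
    scaled_char = [char_gate * c for c in char_vec]
    return scaled_word + scaled_char


# Toy vectors; the gate values are taken from the first token in the output above.
word_vec = [0.2, -0.1, 0.4]
char_vec = [0.3, 0.0]
print(concat(word_vec, char_vec))
print(gated_concat(word_vec, char_vec, word_gate=0.98816895, char_gate=0.89430475))
```

When both gates sit close to 1, the gated representation is nearly identical to the plain concatenation.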
So the method in the paper does not work.