
neural_name_tagging's Issues

Verifying the effectiveness of this paper's method

I don't think this paper's method works.
In an experiment on Chinese datasets, using jieba for word segmentation, my TensorFlow code is as follows:

        with tf.variable_scope('word_char_embedding_combine'):
            with tf.variable_scope('word_freq_conversion'):
                # Scale the raw word frequency, then squash it into (0, 1).
                word_freq_compress = self.word_freq_placeholder * 0.005
                word_freq_conversion = tf.tanh(word_freq_compress, name='tanh_word_freq')
                self.word_freq_conversion = word_freq_conversion

            # Full word-level gate (currently disabled): a sigmoid over the
            # word-level terms (word embedding, char-CNN embedding, frequency, bias).
            # with tf.variable_scope("word_embedding_gate"):
            #     word_w = tf.layers.dense(inputs=self.x_word_embedding, units=1, activation=None)
            #     char_w = tf.layers.dense(inputs=x_char_cnn_embedding, units=1, activation=None)
            #     freq_w = tf.layers.dense(inputs=word_freq_conversion, units=1, activation=None)
            #     b_w = tf.get_variable(shape=[1], name="b_w")
            #     word_gate = tf.sigmoid(freq_w + b_w + word_w + char_w, name='word_embedding_gate')
            self.word_gate = word_freq_conversion

Please check whether the code above is wrong.

The hidden-layer gate is built the same way.
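For clarity, here is a minimal self-contained sketch of the gated combination I am testing. This is my reading of the method, not verbatim from the paper; the placeholder names and dimensions are my own:

    import tensorflow as tf

    # Per-token scalar gates scale the word and char representations before
    # concatenation; the hidden-layer gate is applied to the encoder outputs
    # in the same way. (Sketch under my assumptions, TF 1.x style as above.)
    word_emb = tf.placeholder(tf.float32, [None, None, 100])  # [batch, len, d_word]
    char_emb = tf.placeholder(tf.float32, [None, None, 50])   # [batch, len, d_char]
    word_gate = tf.placeholder(tf.float32, [None, None, 1])   # per-token scalar
    char_gate = tf.placeholder(tf.float32, [None, None, 1])

    combined = tf.concat([word_gate * word_emb, char_gate * char_emb], axis=-1)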

Experimental results

['同时', ',']
['同时<pad_char><pad_char>', ',<pad_char><pad_char><pad_char>']
word_freq: [1. 1. ]
word_gate: [0.98816895 0.9967069 ]
char_gate: [0.89430475 0.8514535 ]

['至此', ',']
['至此<pad_char><pad_char>', ',<pad_char><pad_char><pad_char>']
word_freq: [0.9715941 1. ]
word_gate: [0.99966156 0.9967069 ]
char_gate: [0.7587908 0.8514535 ]

['<unk_word>', '…']
['多大仇<pad_char>', '…<pad_char><pad_char><pad_char>']
word_freq: [0.00499996 1. ]
word_gate: [0.9449244 0.9953596 ]
char_gate: [0.7027143 0.87556285 ]

['<unk_word>', ',']
['鳞次栉比', ',<pad_char><pad_char><pad_char>']
word_freq: [0.00499996 1. ]
word_gate: [0.96310204 0.9967069 ]
char_gate: [0.68618333 0.8514535 ]

['<unk_word>', ',']
['正巧<pad_char><pad_char>', ',<pad_char><pad_char><pad_char>']
word_freq: [0.00499996 1. ]
word_gate: [0.7656248 0.9967069 ]
char_gate: [0.7440602 0.8514535 ]

As you can see, the gate values here bear no linear relationship to the word's own frequency; they look essentially random.
My question: if the premise is that the embeddings of low-frequency words are unreliable, then the deciding factor is the word frequency itself, so what role does the (unreliable) word embedding play in the gate at all?

I then removed the word embeddings entirely, kept only the word frequency as input, and added a linear transformation. The result:

  • the word-gate values fall in the range 0.77~0.93;
  • if tanh is 1, i.e. the word embedding is very reliable, the gate value is 0.93;
  • if tanh is 0.004, i.e. the word embedding is very unreliable, the gate value is 0.77.

In other words, the gate's influence is tiny; it gets compressed during training.
The hidden-layer gate's range is even narrower: 0.72~0.77.
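To make the compression concrete, here is a small numeric check. The weight and bias are hypothetical values I picked to reproduce the 0.77~0.93 endpoints, not the actual learned parameters:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Hypothetical parameters of the frequency-only gate sigmoid(w * t + b),
    # where t = tanh(freq * 0.005); chosen to roughly match the range above.
    w, b = 1.39, 1.20

    for t in (0.004, 1.0):  # very unreliable vs. very reliable word
        print(t, sigmoid(w * t + b))
    # ~0.77 for t = 0.004, ~0.93 for t = 1.0: the gate barely moves.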

The final results on the People's Daily, MSRA, and Boson datasets were all unsatisfactory: no better than directly concatenating the char and word vectors.
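(For reference, the concat baseline I mean is just the ungated combination, with the same placeholder shapes as in the sketch above:)

    import tensorflow as tf

    word_emb = tf.placeholder(tf.float32, [None, None, 100])  # [batch, len, d_word]
    char_emb = tf.placeholder(tf.float32, [None, None, 50])   # [batch, len, d_char]
    # No gates: concatenate the two representations directly per token.
    baseline = tf.concat([word_emb, char_emb], axis=-1)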

So I conclude that the paper's method is ineffective.
