
neural_name_tagging's Issues

Verifying the effectiveness of this paper's method

I don't think this paper's method works.
In an experiment on Chinese datasets, using jieba for word segmentation, my TensorFlow code is as follows:

        with tf.variable_scope('word_char_embedding_combine'):
            with tf.variable_scope('word_freq_conversion'):
                # Scale the raw word frequency, then squash it into (0, 1).
                word_freq_compress = self.word_freq_placeholder * 0.005
                word_freq_conversion = tf.tanh(word_freq_compress, name='tanh_word_freq')
                self.word_freq_conversion = word_freq_conversion

            # Full word-level gate (currently disabled): a sigmoid over the
            # word-level terms (word embedding, char-CNN embedding, frequency, bias).
            # with tf.variable_scope("word_embedding_gate"):
            #     word_w = tf.layers.dense(inputs=self.x_word_embedding, units=1, activation=None)
            #     char_w = tf.layers.dense(inputs=x_char_cnn_embedding, units=1, activation=None)
            #     freq_w = tf.layers.dense(inputs=word_freq_conversion, units=1, activation=None)
            #     b_w = tf.get_variable(shape=[1], name="b_w")
            #     word_gate = tf.sigmoid(freq_w + b_w + word_w + char_w, name='word_embedding_gate')
            self.word_gate = word_freq_conversion

Please check whether the code above is wrong.

The hidden-layer gate is built the same way.
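For clarity, here is a minimal self-contained sketch of the gated combination I am testing. This is my reading of the method, not verbatim from the paper; the placeholder names and dimensions are my own:

    import tensorflow as tf

    # Per-token scalar gates scale the word and char representations before
    # concatenation; the hidden-layer gate is applied to the encoder outputs
    # in the same way. (Sketch under my assumptions, TF 1.x style as above.)
    word_emb = tf.placeholder(tf.float32, [None, None, 100])  # [batch, len, d_word]
    char_emb = tf.placeholder(tf.float32, [None, None, 50])   # [batch, len, d_char]
    word_gate = tf.placeholder(tf.float32, [None, None, 1])   # per-token scalar
    char_gate = tf.placeholder(tf.float32, [None, None, 1])

    combined = tf.concat([word_gate * word_emb, char_gate * char_emb], axis=-1)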

Experimental results

['同时', ',']
['同时<pad_char><pad_char>', ',<pad_char><pad_char><pad_char>']
word_freq: [1. 1. ]
word_gate: [0.98816895 0.9967069 ]
char_gate: [0.89430475 0.8514535 ]

['至此', ',']
['至此<pad_char><pad_char>', ',<pad_char><pad_char><pad_char>']
word_freq: [0.9715941 1. ]
word_gate: [0.99966156 0.9967069 ]
char_gate: [0.7587908 0.8514535 ]

['<unk_word>', '…']
['多大仇<pad_char>', '…<pad_char><pad_char><pad_char>']
word_freq: [0.00499996 1. ]
word_gate: [0.9449244 0.9953596 ]
char_gate: [0.7027143 0.87556285 ]

['<unk_word>', ',']
['鳞次栉比', ',<pad_char><pad_char><pad_char>']
word_freq: [0.00499996 1. ]
word_gate: [0.96310204 0.9967069 ]
char_gate: [0.68618333 0.8514535 ]

['<unk_word>', ',']
['正巧<pad_char><pad_char>', ',<pad_char><pad_char><pad_char>']
word_freq: [0.00499996 1. ]
word_gate: [0.7656248 0.9967069 ]
char_gate: [0.7440602 0.8514535 ]

As you can see, the gate values here bear no linear relationship to the word's own frequency; they look essentially random.
My question: if the premise is that the embeddings of low-frequency words are unreliable, then the deciding factor is the word frequency itself, so what role does the (unreliable) word embedding play in the gate at all?

I then removed the word embeddings entirely, kept only the word frequency as input, and added a linear transformation. The result:

  • the word-gate values fall in the range 0.77~0.93;
  • if tanh is 1, i.e. the word embedding is very reliable, the gate value is 0.93;
  • if tanh is 0.004, i.e. the word embedding is very unreliable, the gate value is 0.77.

In other words, the gate's influence is tiny; it gets compressed during training.
The hidden-layer gate's range is even narrower: 0.72~0.77.
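To make the compression concrete, here is a small numeric check. The weight and bias are hypothetical values I picked to reproduce the 0.77~0.93 endpoints, not the actual learned parameters:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Hypothetical parameters of the frequency-only gate sigmoid(w * t + b),
    # where t = tanh(freq * 0.005); chosen to roughly match the range above.
    w, b = 1.39, 1.20

    for t in (0.004, 1.0):  # very unreliable vs. very reliable word
        print(t, sigmoid(w * t + b))
    # ~0.77 for t = 0.004, ~0.93 for t = 1.0: the gate barely moves.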

The final results on the People's Daily, MSRA, and Boson datasets were all unsatisfactory: no better than directly concatenating the char and word vectors.
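(For reference, the concat baseline I mean is just the ungated combination, with the same placeholder shapes as in the sketch above:)

    import tensorflow as tf

    word_emb = tf.placeholder(tf.float32, [None, None, 100])  # [batch, len, d_word]
    char_emb = tf.placeholder(tf.float32, [None, None, 50])   # [batch, len, d_char]
    # No gates: concatenate the two representations directly per token.
    baseline = tf.concat([word_emb, char_emb], axis=-1)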

So I conclude that the paper's method is ineffective.
