
Comments (4)

limteng-rpi commented on May 30, 2024

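We tested our model on OntoNotes and several LORELEI datasets, but not on any Chinese dataset. Chinese has its own peculiarities (for example, its character inventory is far larger than that of English, individual characters carry rich meaning, and most words are one or two characters long), so conclusions drawn on English or similar languages do not necessarily hold for Chinese.
On the design of the gating mechanism: OOV and rare words usually make up only a small share of a text. For the remaining, more frequent words, we also want the model to decide dynamically, word by word, how to mix the character-level representation and the word-level representation; see [1] and [2] for similar architectures. Also, because each gate takes three inputs (the character-level representation, the word-level representation, and the reliability signal), the mean of its output does not necessarily have a linear relationship with word frequency.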

[1] Miyamoto, Yasumasa, and Kyunghyun Cho. "Gated word-character recurrent language model." In 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016, pp. 1992-1997. Association for Computational Linguistics (ACL), 2016.
[2] Rei, Marek, Gamal Crichton, and Sampo Pyysalo. "Attending to Characters in Neural Sequence Labeling Models." In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pp. 309-318. 2016.
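
To make the three-input gate described above concrete, here is a minimal sketch in plain Python. It is not the code from neural_name_tagging: the gate formula, the function names (`gated_mix`, `matvec`, `random_matrix`), the dimensions, and the two-element reliability vector are all assumptions chosen for illustration. It also builds the plain concatenation baseline discussed in this thread for comparison.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gated_mix(char_repr, word_repr, reliability, w_char, w_word, w_rel, bias):
    """Mix two same-sized representations with an element-wise gate.

    Assumed form (an illustration, not taken from the paper):
        gate  = sigmoid(W_c c + W_w w + W_r r + b)
        mixed = gate * c + (1 - gate) * w
    """
    terms = zip(matvec(w_char, char_repr), matvec(w_word, word_repr),
                matvec(w_rel, reliability), bias)
    gate = [sigmoid(a + b + c + d) for a, b, c, d in terms]
    mixed = [g * c + (1.0 - g) * w
             for g, c, w in zip(gate, char_repr, word_repr)]
    return mixed, gate


def random_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, rel_dim = 4, 2
char_repr = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
word_repr = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
reliability = [0.1, 0.9]  # hypothetical frequency-derived features

mixed, gate = gated_mix(
    char_repr, word_repr, reliability,
    random_matrix(dim, dim, rng), random_matrix(dim, dim, rng),
    random_matrix(dim, rel_dim, rng), [0.0] * dim,
)

# Concatenation baseline mentioned in the thread: no gate, both vectors kept.
concat = char_repr + word_repr

print("gate  :", [round(g, 3) for g in gate])
print("mixed :", [round(m, 3) for m in mixed])
print("concat:", len(concat), "dims")
```

Because the gate depends on all three inputs at once, two words with the same frequency features can still receive different gate values, which is one way to read the remark that the mean gate output need not vary linearly with word frequency.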

from neural_name_tagging.

dongrixinyu commented on May 30, 2024

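@limteng-rpi
1. If word frequency matters little to the gate design, then adding frequency information is wasted effort; its effect is too small.
2. Suppose instead that this design also helps high-frequency words. High-frequency words make up a very large share of the data, so the results should improve. In practice, though, the final metrics on several datasets are worse than simply concatenating the character and word representations, and even worse than using the word representation alone.
3. As for the point about differences between Chinese and English, it only makes the paper's results look less reliable.

4. Finally, and most importantly, I ran two experiments with the same dataset and the same hyperparameter recipe. After the models had fit, I looked at the gating ratios for the same words and found that they differed considerably between the two runs. In other words, even for the same word it is unclear what the model has learned, which makes it hard to say that it learns any particular ratio from word frequency and the character and word embeddings.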
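
The comparison in point 4 can be made quantitative. The sketch below assumes each run has been reduced to one scalar per word (for example the mean of that word's gate vector) and reports the mean absolute difference and the Pearson correlation over the words both runs share. The function name `compare_gate_runs` and the word keys are hypothetical, and the numbers are placeholders, not measurements from any experiment in this thread.

```python
import math


def compare_gate_runs(run_a, run_b):
    """Return (mean absolute difference, Pearson correlation) over shared words."""
    shared = sorted(set(run_a) & set(run_b))
    if not shared:
        raise ValueError("the two runs share no words")
    a = [run_a[w] for w in shared]
    b = [run_b[w] for w in shared]
    n = len(shared)
    mean_abs_diff = sum(abs(x - y) for x, y in zip(a, b)) / n
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        # Correlation is undefined when one run is constant.
        return mean_abs_diff, float("nan")
    return mean_abs_diff, cov / math.sqrt(var_a * var_b)


# Placeholder numbers for illustration only, not experimental results.
run_a = {"word_1": 0.82, "word_2": 0.35, "word_3": 0.60, "word_4": 0.15}
run_b = {"word_1": 0.40, "word_2": 0.71, "word_3": 0.55, "word_4": 0.48}

diff, corr = compare_gate_runs(run_a, run_b)
print(f"mean |gate_a - gate_b| = {diff:.3f}, Pearson r = {corr:.3f}")
```

A large mean difference together with a low correlation would be consistent with the instability described above. Repeating the comparison over several random seeds would help separate seed-to-seed noise from a systematic effect.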


dongrixinyu commented on May 30, 2024

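One more thing: you might suspect that overfitting is what makes the gate ratios differ from one training run to the next. Leaving aside the People's Daily and Microsoft corpora, my training corpus totals 38 million characters, on the order of a million samples, and I kept the model size under control. The word and character embeddings were trained on 1 billion characters. So overfitting is not what makes the ratios inconsistent.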


limteng-rpi commented on May 30, 2024

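As far as our OntoNotes results are concerned (Table 2), adding reliability signals does bring improvements on the subsets other than wb, so I personally do not agree that "word frequency matters little to the gate design". This design may not suit Chinese, but that is not the same as the paper's results being unreliable. Our conclusions are drawn from the OntoNotes results; whether they generalize to other languages or to substantially different domains needs separate experiments to verify. I think the differences between Chinese and English may be one of the reasons the concat approach works better. Based on your experiments, you can conclude that this gating mechanism is ineffective on the datasets in question.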

