chinese_ner's Introduction

Environment

  • Python 3.x
  • pip install -r requirements.txt

Preprocess the data to generate .npy files

python datamodels/DataModel.py

Train the model and test the results

Download chinese-roberta-wwm-ext and place it in the project root directory.

bash bash/run_bert.sh
bash bash/run_bert_crf.sh
bash bash/run_bert_crf_crl3e-3.sh
...

train.log contains the training log for BertCrf+C+CrfLr100 (3e-5 vs 3e-3).

Testing

python test --path=checkpoints/xxx

Here xxx is the checkpoint of a specific model.

For bad cases from testing and the related records, see the files in the badcase folder.

Results

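The F1 values reported during training come from my own, stricter evaluation script. It requires that if an entity appears multiple times, every occurrence counts toward the precision calculation (see the f1_score function in evaluate.py). As a result, the F1 values recorded in train.log are about 1 point lower than the final reported F1 values (the data distribution also plays a role).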
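The difference between the two scoring styles can be illustrated with a minimal sketch. This is not the code in evaluate.py or evaluate_reference.py; the function names, the (text, label) representation of an entity mention, and the example data are all assumptions made for illustration. The strict variant treats the mentions as a multiset, so a repeated entity is counted once per occurrence, while the lenient variant deduplicates first.

```python
from collections import Counter


def f1_from_counts(tp, n_pred, n_gold):
    """Compute F1 from true positives, predicted count, and gold count."""
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def strict_f1(gold, pred):
    """Every occurrence counts: match mentions as a multiset."""
    gold_c, pred_c = Counter(gold), Counter(pred)
    tp = sum((gold_c & pred_c).values())  # multiset intersection
    return f1_from_counts(tp, sum(pred_c.values()), sum(gold_c.values()))


def dedup_f1(gold, pred):
    """Repeated mentions collapse into one: match mentions as a set."""
    gold_s, pred_s = set(gold), set(pred)
    return f1_from_counts(len(gold_s & pred_s), len(pred_s), len(gold_s))


# Hypothetical example: "Beijing" occurs twice in the gold annotations,
# but the model only predicts it once.
gold = [("Beijing", "address"), ("Beijing", "address"), ("Tencent", "company")]
pred = [("Beijing", "address"), ("Tencent", "company")]

strict_score = strict_f1(gold, pred)  # missed occurrence lowers recall
lenient_score = dedup_f1(gold, pred)  # duplicates are ignored
```

In this toy case the strict score is lower than the deduplicated one because the missed second occurrence counts as a false negative, which is the direction of the gap described above.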

The final F1 values, shown below, come from the evaluation script provided by the teaching assistant; see the get_f1_score function in evaluate_reference.py.

  • Bert:0.76748/0.76192/0.76181/0.76673/0.76376

    • F1 detail:{'address': 0.638676844783715, 'book': 0.7960526315789473, 'company': 0.7756160830090791, 'game': 0.8006535947712419, 'government': 0.8076923076923077, 'movie': 0.750809061488673, 'name': 0.8804347826086956, 'organization': 0.7510316368638239, 'position': 0.7832009080590239, 'scene': 0.6907216494845362}
    • F1 detail:{'address': 0.5976714100905561, 'book': 0.7859424920127794, 'company': 0.7777777777777778, 'game': 0.8180300500834725, 'government': 0.769811320754717, 'movie': 0.7796610169491527, 'name': 0.8755364806866952, 'organization': 0.7523939808481532, 'position': 0.7874015748031497, 'scene': 0.675}
    • F1 detail:{'address': 0.6118836915297091, 'book': 0.7741935483870968, 'company': 0.7694267515923567, 'game': 0.8200000000000001, 'government': 0.803088803088803, 'movie': 0.7574750830564784, 'name': 0.8759439050701187, 'organization': 0.7409395973154362, 'position': 0.783573806881243, 'scene': 0.681592039800995}
    • F1 detail:{'address': 0.6096938775510203, 'book': 0.7962382445141065, 'company': 0.785900783289817, 'game': 0.8282828282828283, 'government': 0.8007590132827325, 'movie': 0.7999999999999999, 'name': 0.8805166846071043, 'organization': 0.7177419354838709, 'position': 0.7655719139297849, 'scene': 0.6826923076923077}
    • F1 detail:{'address': 0.5970149253731343, 'book': 0.7828947368421053, 'company': 0.7391874180865006, 'game': 0.8184818481848184, 'government': 0.793233082706767, 'movie': 0.7834394904458599, 'name': 0.8741865509761388, 'organization': 0.7744680851063829, 'position': 0.7713950762016413, 'scene': 0.7033492822966507}
  • BertCrf:0.7569645/0.767833/0.76863/0.769568/0.76974

    • F1 detail:{'address': 0.6004901960784313, 'book': 0.794701986754967, 'company': 0.803129074315515, 'game': 0.8079470198675496, 'government': 0.766859344894027, 'movie': 0.7523510971786834, 'name': 0.8760869565217391, 'organization': 0.7461430575035063, 'position': 0.7569367369589345, 'scene': 0.665}
    • F1 detail:{'address': 0.6173800259403371, 'book': 0.7777777777777778, 'company': 0.7858081471747701, 'game': 0.8283828382838284, 'government': 0.7842401500938087, 'movie': 0.7972972972972971, 'name': 0.8669527896995709, 'organization': 0.7658959537572255, 'position': 0.7682119205298014, 'scene': 0.6943765281173594}
    • F1 detail:{'address': 0.6004901960784313, 'book': 0.794701986754967, 'company': 0.803129074315515, 'game': 0.8079470198675496, 'government': 0.766859344894027, 'movie': 0.7523510971786834, 'name': 0.8760869565217391, 'organization': 0.7461430575035063, 'position': 0.7569367369589345, 'scene': 0.665}
    • F1 detail:{'address': 0.6165803108808291, 'book': 0.8196721311475409, 'company': 0.7962962962962963, 'game': 0.8281505728314239, 'government': 0.7923809523809524, 'movie': 0.7609427609427609, 'name': 0.877005347593583, 'organization': 0.7698744769874476, 'position': 0.7813211845102506, 'scene': 0.6534653465346535}
    • F1 detail:{'address': 0.6125166444740346, 'book': 0.8155339805825242, 'company': 0.768041237113402, 'game': 0.8295081967213115, 'government': 0.8155339805825242, 'movie': 0.773972602739726, 'name': 0.8669527896995709, 'organization': 0.7478753541076487, 'position': 0.7718501702610668, 'scene': 0.6956521739130435}
  • BertCrf+C:0.7975/0.7916/0.79446/0.78681/0.78913

    • F1 detail:{'address': 0.6532374100719425, 'book': 0.8327645051194539, 'company': 0.8098495212038305, 'game': 0.8305647840531561, 'government': 0.8514851485148515, 'movie': 0.7887323943661971, 'name': 0.8932461873638344, 'organization': 0.790014684287812, 'position': 0.814903846153846, 'scene': 0.7109974424552429}
    • F1 detail:{'address': 0.6490066225165563, 'book': 0.8131147540983608, 'company': 0.7958921694480102, 'game': 0.820097244732577, 'government': 0.8266129032258065, 'movie': 0.7801418439716311, 'name': 0.8913043478260869, 'organization': 0.7932960893854749, 'position': 0.7986425339366516, 'scene': 0.7481662591687042}
    • F1 detail:{'address': 0.6657824933687002, 'book': 0.8247422680412372, 'company': 0.8059701492537313, 'game': 0.8314238952536825, 'government': 0.8284023668639053, 'movie': 0.802721088435374, 'name': 0.8900862068965517, 'organization': 0.7696709585121603, 'position': 0.799081515499426, 'scene': 0.7268041237113402}
    • F1 detail:{'address': 0.6398891966759003, 'book': 0.785234899328859, 'company': 0.806970509383378, 'game': 0.8388214904679376, 'government': 0.8202020202020202, 'movie': 0.7713310580204777, 'name': 0.8996692392502755, 'organization': 0.7885714285714286, 'position': 0.810126582278481, 'scene': 0.7073791348600509}
    • F1 detail:{'address': 0.6420079260237781, 'book': 0.8203389830508474, 'company': 0.7967032967032966, 'game': 0.8471760797342194, 'government': 0.8230769230769229, 'movie': 0.7801418439716311, 'name': 0.8804347826086956, 'organization': 0.7866108786610879, 'position': 0.7995418098510881, 'scene': 0.7153284671532847}
  • BertCrf+CrfLr100(3e-5 vs 3e-3): 0.803778/0.80271/0.8023150/0.798411/0.8016774

    • F1 detail:{'address': 0.6612466124661247, 'book': 0.8092105263157895, 'company': 0.8111888111888113, 'game': 0.8401360544217686, 'government': 0.8339622641509433, 'movie': 0.8235294117647058, 'name': 0.8947951273532668, 'organization': 0.8145985401459854, 'position': 0.8023121387283237, 'scene': 0.7468030690537084}
    • F1 detail:{'address': 0.660427807486631, 'book': 0.8025078369905956, 'company': 0.8075880758807588, 'game': 0.8411867364746947, 'government': 0.8402366863905325, 'movie': 0.8201438848920863, 'name': 0.891514500537057, 'organization': 0.8172362555720654, 'position': 0.8170731707317074, 'scene': 0.7292817679558011}
    • F1 detail:{'address': 0.6373937677053825, 'book': 0.803921568627451, 'company': 0.8102288021534321, 'game': 0.8595890410958903, 'government': 0.8416833667334669, 'movie': 0.8215488215488216, 'name': 0.8876772082878953, 'organization': 0.8058823529411765, 'position': 0.8116959064327486, 'scene': 0.7435294117647058}
    • F1 detail:{'address': 0.6622340425531915, 'book': 0.808080808080808, 'company': 0.8257372654155496, 'game': 0.8469387755102041, 'government': 0.8385826771653542, 'movie': 0.8215488215488216, 'name': 0.8881578947368421, 'organization': 0.7919075144508672, 'position': 0.7972190034762457, 'scene': 0.7037037037037037}
    • F1 detail:{'address': 0.6703755215577192, 'book': 0.8378378378378377, 'company': 0.8161559888579387, 'game': 0.8330522765598651, 'government': 0.8266129032258065, 'movie': 0.8292682926829269, 'name': 0.8980477223427331, 'organization': 0.7754491017964072, 'position': 0.7863849765258215, 'scene': 0.7435897435897436}
  • BertCrf+C+CrfLr100(3e-5 vs 3e-3): 0.798169/0.79619/0.803578/0.80687/0.803076

    • F1 detail:{'address': 0.6371191135734072, 'book': 0.8289473684210527, 'company': 0.8121546961325966, 'game': 0.845360824742268, 'government': 0.8346774193548387, 'movie': 0.8122866894197952, 'name': 0.8784227820372399, 'organization': 0.787878787878788, 'position': 0.803337306317044, 'scene': 0.741514360313316}
    • F1 detail:{'address': 0.6504065040650406, 'book': 0.8231292517006803, 'company': 0.8263305322128851, 'game': 0.8291873963515755, 'government': 0.8267716535433071, 'movie': 0.8214285714285715, 'name': 0.8932461873638344, 'organization': 0.7925608011444921, 'position': 0.7952662721893491, 'scene': 0.7036011080332409}
    • F1 detail:{'address': 0.6569767441860466, 'book': 0.8124999999999999, 'company': 0.8090787716955942, 'game': 0.8386023294509152, 'government': 0.8483606557377049, 'movie': 0.8054607508532423, 'name': 0.8964013086150491, 'organization': 0.7994269340974212, 'position': 0.8227114716106605, 'scene': 0.7462686567164178}
    • F1 detail:{'address': 0.6711590296495956, 'book': 0.825938566552901, 'company': 0.8168642951251647, 'game': 0.8513513513513514, 'government': 0.8293650793650794, 'movie': 0.8247422680412372, 'name': 0.8896174863387979, 'organization': 0.7970802919708029, 'position': 0.8107476635514019, 'scene': 0.7518427518427518}
    • F1 detail:{'address': 0.65, 'book': 0.823529411764706, 'company': 0.8168642951251647, 'game': 0.8438538205980066, 'government': 0.845691382765531, 'movie': 0.8013468013468015, 'name': 0.8879216539717083, 'organization': 0.7926657263751763, 'position': 0.8182857142857144, 'scene': 0.7506053268765134}
  • C: CRF constraint

  • CrfLrX: the CRF learning rate is X times the BERT learning rate.
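The CrfLrX setting can be sketched as optimizer parameter groups. This is only an illustration under stated assumptions, not the repository's training code: the helper name build_param_groups, the rule that CRF parameters have "crf" in their name, and the placeholder parameter names are all hypothetical. The returned list of dicts with "params" and "lr" keys follows the per-group format that PyTorch optimizers such as torch.optim.AdamW accept, assuming the project uses PyTorch.

```python
def build_param_groups(named_params, bert_lr=3e-5, crf_lr_multiplier=100):
    """Split parameters into a BERT group and a CRF group with separate learning rates."""
    bert_params, crf_params = [], []
    for name, param in named_params:
        # Assumption: CRF parameters can be recognized by "crf" in their name.
        if "crf" in name.lower():
            crf_params.append(param)
        else:
            bert_params.append(param)
    return [
        {"params": bert_params, "lr": bert_lr},
        {"params": crf_params, "lr": bert_lr * crf_lr_multiplier},
    ]


# Placeholder (name, parameter) pairs standing in for model.named_parameters().
example_params = [
    ("bert.encoder.layer.0.weight", "w0"),
    ("crf.transitions", "t"),
]
groups = build_param_groups(example_params)
```

With the defaults, the BERT group trains at 3e-5 and the CRF group at roughly 3e-3, matching the CrfLr100 (3e-5 vs 3e-3) configuration. In a real run the list would be passed to the optimizer in place of a flat parameter list.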

chinese_ner's People

Contributors

wangpeiyi9979
