howl-anderson / seq2annotation

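A general-purpose sequence labeling library based on TensorFlow & PaddlePaddle (currently includes BiLSTM+CRF, Stacked-BiLSTM+CRF, and IDCNN+CRF, with more algorithms being added) for sequence labeling tasks such as Chinese word segmentation (Tokenizer / segmentation), part-of-speech tagging (Part Of Speech, POS), and named entity recognition (Named Entity Recognition, NER).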

License: Apache License 2.0

Python 94.25% Shell 0.43% Dockerfile 2.35% Makefile 2.97%
tensorflow-models sequence-annotation tensorflow bilstm-crf-model bilstm-crf idcnn-crf idcnn part-of-speech-tagger part-of-speech named-entity-recognition

seq2annotation's Introduction

seq2annotation

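A general-purpose sequence labeling library based on TensorFlow & PaddlePaddle (currently includes BiLSTM+CRF and IDCNN+CRF, with more algorithms being added) for sequence labeling tasks such as Chinese word segmentation (Tokenizer / segmentation), part-of-speech tagging (Part Of Speech, POS), and named entity recognition (Named Entity Recognition, NER).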

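Features

  • General sequence labeling: solves general sequence labeling problems; word segmentation, part-of-speech tagging, and entity recognition are just special cases.
  • Tag schema free: you can use any tagset you like. This relies on the encoding and decoding functions provided by tokenizer_tools.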
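
To illustrate what encoding and decoding a tagset means here, the following is a minimal sketch of the widely used BIO scheme. It does not use or reproduce the tokenizer_tools API; the names `encode_bio`, `decode_bio`, and `Span` are hypothetical and exist only for this example.

```python
from typing import List, Tuple

# Hypothetical span type for this sketch: (start, end_exclusive, label)
Span = Tuple[int, int, str]


def encode_bio(length: int, spans: List[Span]) -> List[str]:
    """Turn labeled spans into one BIO tag per token."""
    tags = ["O"] * length
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def decode_bio(tags: List[str]) -> List[Span]:
    """Recover labeled spans from a BIO tag sequence."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(tags + ["O"]):
        # Close the open span when the current tag does not continue it.
        if start is not None and tag != f"I-{label}":
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# NER-style spans and word-segmentation-style spans use the same machinery.
ner_tags = encode_bio(4, [(0, 1, "PER"), (2, 4, "LOC")])
word_tags = encode_bio(3, [(0, 1, "W"), (1, 3, "W")])
```

Word segmentation fits the same pattern by treating every word as a span with a single label, which is why the tasks listed above can be viewed as special cases of one labeling problem.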

TODO

  • TF Metrics is not currently published on PyPI, but seq2annotation depends on it, so seq2annotation cannot yet be packaged as a Python package on PyPI

More Algorithms To Do

Credits

Add an NER evaluation scheme

From http://www.davidsbatista.net/blog/2018/05/09/Named_Entity_Evaluation/
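
As a rough illustration of entity-level NER evaluation, here is a minimal sketch of strict matching, where a predicted entity counts as correct only if its boundaries and type both match a gold entity exactly. This is one simple scheme and not necessarily the one the project adopted; the function name `strict_entity_scores` and the `(start, end, label)` span format are assumptions made for this example.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical span type for this sketch: (start, end_exclusive, label)
Span = Tuple[int, int, str]


def strict_entity_scores(gold: Iterable[Span], pred: Iterable[Span]) -> Dict[str, float]:
    """Entity-level precision, recall and F1 under strict matching."""
    gold_set, pred_set = set(gold), set(pred)
    # Only predictions identical to a gold span (boundaries and type) count.
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


gold = [(0, 2, "PER"), (3, 5, "LOC")]
pred = [(0, 2, "PER"), (3, 5, "ORG")]  # second entity has the wrong type
scores = strict_entity_scores(gold, pred)
```

With the sample data, one of two predictions is correct and one of two gold entities is found, so precision, recall, and F1 are all 0.5.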

seq2annotation's People

Contributors

howl-anderson, shfshf

seq2annotation's Issues

Remove useless .py files

Useless files in the repository, such as configure.json, should all be cleaned up.
Otherwise, come sell barbecue with me.

Keras model-saving bug

On the develop branch:
Removing serving_only=True from the code below stops the error when running on the server.

# Save the model

model.save(create_file_dir_if_needed(config["h5_model_file"]))

tf.keras.experimental.export_saved_model(
    model, create_dir_if_needed(config["saved_model_dir"]), serving_only=True
)

time++

Add a date and time to the metadata.json generated by deliverable_model, to make management and traceability easier.
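
A minimal sketch of what this request could look like, using only the Python standard library. It is not deliverable_model's actual code: the function `write_metadata` and the field name `created_at` are hypothetical choices for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def write_metadata(path: str, metadata: dict) -> dict:
    """Write metadata as JSON, adding a UTC timestamp for traceability."""
    stamped = dict(metadata)  # copy so the caller's dict is not mutated
    # "created_at" is an assumed field name, stored in ISO 8601 format.
    stamped["created_at"] = datetime.now(timezone.utc).isoformat()
    Path(path).write_text(
        json.dumps(stamped, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return stamped
```

An ISO 8601 timestamp in UTC sorts lexicographically and avoids ambiguity between machines in different time zones.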

Add a checkpoint warm_start_from option

In seq2annotation.trainer.train_model.py, add warm_start_from=config.get("warm_start_dir", None):
estimator = tf.estimator.Estimator(
    model_fn, instance_model_dir, cfg, estimator_params,
    warm_start_from=config.get("warm_start_dir", None)
)
Then add to the configure.yaml configuration: warm_start_dir:

After informal testing, this works without problems.

results directory

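When running the program in seq2annotation.trainer.cli_keras, the model_dir and h5_model directories cannot be created under the generated results directory, which causes an error.
Come on, let's go open a barbecue stall together.....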
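
For context, here is a minimal sketch of helpers that make sure output directories exist before a model is saved. The names `create_dir_if_needed` and `create_file_dir_if_needed` appear in the project's saving code quoted in another issue, but the implementations below are assumptions written for illustration, not the project's actual code.

```python
import os


def create_dir_if_needed(directory: str) -> str:
    """Create the directory (and any parents) if missing; return the path."""
    os.makedirs(directory, exist_ok=True)
    return directory


def create_file_dir_if_needed(file_path: str) -> str:
    """Create the parent directory of a file path if missing; return the path."""
    parent = os.path.dirname(file_path)
    if parent:  # a bare filename has no parent directory to create
        os.makedirs(parent, exist_ok=True)
    return file_path
```

Because `exist_ok=True` makes the calls idempotent, they can be run unconditionally right before every save.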

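Hello, developer

Hello, I work in operations for Baidu PaddlePaddle. I looked at your project and think it is excellent, and I would like to get in touch with you. Could you add me on WeChat (paddlehelp) with the note "PaddlePaddle developer"?
Looking forward to your reply~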

Results tracking

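By default, the estimator reads the checkpoint files in results when the program runs, which helps when restarting after the program stops unexpectedly.
During hyperparameter tuning, however, each run should start from scratch, which makes it easier to observe and compare results.
If even that can't be done, stop hanging around at Google and come open a barbecue stall with me.....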
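
One possible workaround, sketched under assumptions: since a tf.estimator.Estimator resumes from the latest checkpoint found in its model_dir, giving each run a new, empty directory makes training start from scratch while keeping earlier results for comparison. The helper name `fresh_model_dir` and the `run-YYYYmmdd-HHMMSS` naming pattern are hypothetical.

```python
import os
from datetime import datetime
from typing import Optional


def fresh_model_dir(base_dir: str, now: Optional[datetime] = None) -> str:
    """Create and return a timestamped, per-run directory under base_dir."""
    now = now or datetime.now()
    # An empty directory holds no checkpoint, so nothing is restored from it.
    run_dir = os.path.join(base_dir, now.strftime("run-%Y%m%d-%H%M%S"))
    os.makedirs(run_dir, exist_ok=True)
    return run_dir
```

The returned path would then be passed as the estimator's model directory for that run; resuming an interrupted run remains possible by reusing the path of the earlier run instead of creating a new one.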

Close this useless function

In this module:
seq2annotation.trainer.cli_keras
def classification_report(y_true, y_pred, labels):
......
