
slovakbert-conll2003-sk-ner's Introduction

Description

Training procedure and evaluation for ju-bezdek/slovakbert-conll2003-sk-ner

Training

For local training, run:

python src/train.py

For training on Azure, first create the workspace:

python train_on_azure.py create_ws -sub_id -ws -rg

then

python train_on_azure.py run_remote

Trained model usage

from transformers import pipeline
from spacy import displacy

model_path = "ju-bezdek/slovakbert-conll2003-sk-ner"

# "max" aggregates sub-word tokens into words; each word takes the label of its highest-scoring token
aggregation_strategy = "max"
ner_pipeline = pipeline(task="ner", model=model_path, aggregation_strategy=aggregation_strategy)

input_sentence = "Ruský premiér Viktor Černomyrdin v piatok povedal, že prezident Boris Jeľcin , ktorý je na dovolenke mimo Moskvy , podporil mierový plán šéfa bezpečnosti Alexandra Lebedu pre Čečensko, uviedla tlačová agentúra Interfax"
ner_ents = ner_pipeline(input_sentence)
print(ner_ents)

# Entity group names: skip label id 0 and strip the two-character "B-"/"I-" prefix.
# The set removes the duplicates that arise because each group has both a B- and an I- label.
id2label = ner_pipeline.model.config.id2label
ent_group_labels = sorted({label[2:] for i, label in id2label.items() if i > 0})

options = {"ents": ent_group_labels}

# Convert the pipeline output to the span dictionaries that displaCy expects in manual mode
displacy_ents = [{"start": ent["start"], "end": ent["end"], "label": ent["entity_group"]} for ent in ner_ents]
displacy.render({"text": input_sentence, "ents": displacy_ents}, style="ent", options=options, jupyter=True, manual=True)

Result (displaCy rendering, with the predicted label in brackets after each entity):

Ruský [MISC] premiér Viktor Černomyrdin [PER] v piatok povedal, že prezident Boris Jeľcin [PER] , ktorý je na dovolenke mimo Moskvy [LOC] , podporil mierový plán šéfa bezpečnosti Alexandra Lebedu [PER] pre Čečensko, [LOC] uviedla tlačová agentúra Interfax [ORG]

Output of print(ner_ents):

[{
    'entity_group': 'MISC',
    'score': 0.82277083,
    'word': ' Ruský',
    'start': 0,
    'end': 5
}, {
    'entity_group': 'PER',
    'score': 0.9821574,
    'word': ' Viktor Černomyrdin',
    'start': 14,
    'end': 32
}, {
    'entity_group': 'PER',
    'score': 0.9796225,
    'word': ' Boris Jeľcin',
    'start': 64,
    'end': 76
}, {
    'entity_group': 'LOC',
    'score': 0.94837284,
    'word': ' Moskvy',
    'start': 106,
    'end': 112
}, {
    'entity_group': 'PER',
    'score': 0.94473803,
    'word': ' Alexandra Lebedu',
    'start': 154,
    'end': 170
}, {
    'entity_group': 'LOC',
    'score': 0.81060684,
    'word': ' Čečensko,',
    'start': 175,
    'end': 184
}, {
    'entity_group': 'ORG',
    'score': 0.9785074,
    'word': ' Interfax',
    'start': 210,
    'end': 218
}]
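
The 'word' values above carry a leading space from the tokenizer, so the 'start' and 'end' character offsets are usually the more convenient way to recover the exact entity text. The following is a minimal sketch of that idea using the sentence and output shown above; it does not load the model. The helper name group_entities, its min_score parameter, and the 0.9 threshold are illustrative assumptions, not part of the repository.

```python
from collections import defaultdict

input_sentence = "Ruský premiér Viktor Černomyrdin v piatok povedal, že prezident Boris Jeľcin , ktorý je na dovolenke mimo Moskvy , podporil mierový plán šéfa bezpečnosti Alexandra Lebedu pre Čečensko, uviedla tlačová agentúra Interfax"

# Pipeline output copied from above, trimmed to the fields used here.
ner_ents = [
    {"entity_group": "MISC", "score": 0.82277083, "start": 0, "end": 5},
    {"entity_group": "PER", "score": 0.9821574, "start": 14, "end": 32},
    {"entity_group": "PER", "score": 0.9796225, "start": 64, "end": 76},
    {"entity_group": "LOC", "score": 0.94837284, "start": 106, "end": 112},
    {"entity_group": "PER", "score": 0.94473803, "start": 154, "end": 170},
    {"entity_group": "LOC", "score": 0.81060684, "start": 175, "end": 184},
    {"entity_group": "ORG", "score": 0.9785074, "start": 210, "end": 218},
]


def group_entities(text, ents, min_score=0.0):
    """Hypothetical helper: map each entity group to the spans sliced from `text`.

    Entities scoring below `min_score` are skipped.
    """
    grouped = defaultdict(list)
    for ent in ents:
        if ent["score"] >= min_score:
            grouped[ent["entity_group"]].append(text[ent["start"]:ent["end"]])
    return dict(grouped)


# All entities, grouped by label and sliced directly from the input sentence.
by_label = group_entities(input_sentence, ner_ents)
print(by_label["PER"])  # ['Viktor Černomyrdin', 'Boris Jeľcin', 'Alexandra Lebedu']

# Only entities with a score of at least 0.9 (an arbitrary threshold for illustration).
confident = group_entities(input_sentence, ner_ents, min_score=0.9)
print(sorted(confident))  # ['LOC', 'ORG', 'PER']
```

Note that the second LOC span includes the trailing comma ("Čečensko,") because the offsets reported by the pipeline cover it; strip punctuation afterwards if that matters for your use case.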
