italian-nlp-library's Introduction

Deprecation

This library has not been maintained for years, so it is now archived. I recommend spaCy for NLP tasks and Wiktextract for a lexical database (for Italian or pretty much any other language).

Italian NLP library

A Java 8 library and REST server for NLP tasks on the Italian language. Specifically, it can:

  • detect the conjugation (person, number, tense and mood) of a given verb
  • conjugate verbs
  • detect stopwords
  • detect numbers
  • perform PoS tagging, sentence splitting and tokenization (based on OpenNLP)

Verb detection and conjugation are based on an analysis of en.wiktionary, containing about 9000 verb lemmas. When a root is not found, suffixes are used instead.

Use as a REST server

The easiest way is to launch it with Docker:

docker run -p 5678:5678 jacopofar/italian-nlp-library

POS tagger

curl -X POST -H "Content-Type: application/json"  -d '{"text":"Mi piace correre e scherzare ma anche bere una tazza di tè"}' "http://localhost:5678/postagger"

{
  "annotations": [
    {
      "span_start": 0,
      "span_end": 2,
      "annotation": {
        "POS": "PC"
      }
    },
    {
      "span_start": 3,
      ...
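
Each annotation points back into the submitted text through span_start and span_end. The following is a minimal Java 8 sketch of how a client might recover the annotated token from those offsets. It assumes the offsets are character indices with an exclusive end, which is consistent with the first span (0 to 2 covering "Mi") but is not stated by the project. The SpanSlicer and Span names are hypothetical and not part of the library.

```java
class SpanSlicer {

    /** One entry of the "annotations" array; fields mirror the JSON keys. */
    static final class Span {
        final int spanStart;
        final int spanEnd;
        final String pos;

        Span(int spanStart, int spanEnd, String pos) {
            this.spanStart = spanStart;
            this.spanEnd = spanEnd;
            this.pos = pos;
        }
    }

    /** Returns the piece of text a span covers (assumes an exclusive end offset). */
    static String slice(String text, Span span) {
        return text.substring(span.spanStart, span.spanEnd);
    }

    public static void main(String[] args) {
        // Same sentence as in the curl request; \u00e8 is the letter "è".
        String text = "Mi piace correre e scherzare ma anche bere una tazza di t\u00e8";
        Span first = new Span(0, 2, "PC"); // values taken from the first annotation
        System.out.println(slice(text, first) + "/" + first.pos); // prints "Mi/PC"
    }
}
```

In a real client the Span objects would be filled by a JSON parser of your choice; they are built by hand here to keep the sketch dependency-free.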

Verb conjugations

curl "http://localhost:5678/conjugations/mangiare"

{
  "indicative past historic 2s": "mangiasti",
  "indicative future 1s": "mangerò",
  "indicative future 1p": "mangeremo",
  ...
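
The keys of this response pack mood, tense, person and number into one string. Here is a hedged Java 8 sketch of how a client could split them apart. It assumes every finite form follows the pattern of the three keys shown (first word is the mood, the last token is a person digit followed by s or p, everything in between is the tense). Keys that do not fit, which may exist for forms not shown here, yield null. The ConjugationKey class is hypothetical and not part of the library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ConjugationKey {
    // mood = first word, tense = middle words, then person (1-3) and s/p for number
    private static final Pattern KEY = Pattern.compile("^(\\S+) (.+) ([123])([sp])$");

    final String mood;
    final String tense;
    final int person;
    final boolean plural;

    private ConjugationKey(String mood, String tense, int person, boolean plural) {
        this.mood = mood;
        this.tense = tense;
        this.person = person;
        this.plural = plural;
    }

    /** Parses a key such as "indicative future 1p"; returns null if it does not fit. */
    static ConjugationKey parse(String key) {
        Matcher m = KEY.matcher(key);
        if (!m.matches()) {
            return null;
        }
        return new ConjugationKey(m.group(1), m.group(2),
                Integer.parseInt(m.group(3)), m.group(4).equals("p"));
    }

    public static void main(String[] args) {
        ConjugationKey k = parse("indicative past historic 2s");
        System.out.println(k.mood + " | " + k.tense + " | " + k.person
                + " | " + (k.plural ? "plural" : "singular"));
        // prints "indicative | past historic | 2 | singular"
    }
}
```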

Match POS tags

curl -X POST -H "Content-Type: application/json"  -d '{"parameter":"S.+","text":"Mi piace correre e scherzare ma anche bere una tazza di tè"}' "http://localhost:5678/posmatch"
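
The same request can be sent from Java. The sketch below uses only the Java 8 standard library (HttpURLConnection) and mirrors the curl call: a JSON body with a "parameter" field and a "text" field, posted to /posmatch. The PosMatchClient class and its method names are hypothetical. The response format is not shown in this document, so the sketch simply returns the raw response body as a string.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class PosMatchClient {

    /** Quotes a value as a JSON string, escaping quotes, backslashes and control characters. */
    static String jsonString(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    /** Builds the request body with the two fields used in the curl example. */
    static String buildBody(String parameter, String text) {
        return "{\"parameter\":" + jsonString(parameter) + ",\"text\":" + jsonString(text) + "}";
    }

    /** POSTs a JSON body and returns the raw response body. */
    static String post(String url, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream();
             Scanner sc = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return sc.hasNext() ? sc.next() : "";
        }
    }

    public static void main(String[] args) throws IOException {
        // \u00e8 is the letter "è" from the example sentence.
        String body = buildBody("S.+",
                "Mi piace correre e scherzare ma anche bere una tazza di t\u00e8");
        System.out.println(body);
        // Only contact the server when a URL is passed, e.g. http://localhost:5678/posmatch
        if (args.length > 0) {
            System.out.println(post(args[0], body));
        }
    }
}
```

Run it with the endpoint URL as the first argument once the Docker container is listening on port 5678; without an argument it only prints the request body.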

Use as a library

Use Maven to build and install it; run mvn package to build a JAR. To use and test the library, you need a set of resource files, which can be downloaded from the releases page.

italian-nlp-library's Issues

FileNotFoundException running the docker container

I get this FileNotFoundException when running the Docker container:

java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.FileNotFoundException: database /opt/italian-nlp-library/target/classes//it_verb_model.db not found or not a file
        at com.github.jacopofar.italib.ItalianModel.<init>(ItalianModel.java:222)
        at com.github.jacopofar.italib.ItalianModel.<init>(ItalianModel.java:300)
        at com.github.jacopofar.italib.restserver.Server.main(Server.java:31)
        ... 6 more
