Project Status: Active. The project has reached a stable, usable state and is being actively developed. Technology Readiness Level 7/9: Release Candidate. Technology ready enough and in initial use by end-users in intended scholarly environments. Further validation in progress.

Lingua-cli

This is a small command-line tool for language detection. It is a simple wrapper around the lingua-rs library for Rust; see that project for extensive documentation. A distinguishing feature is that this library works better for short texts than many other libraries.

Installation

Ensure you have Rust's package manager cargo, then download, compile and install lingua-cli in one go as follows:

$ cargo install lingua-cli

Usage

Pass text as a parameter:

$ lingua-cli bonjour à tous

Pass text via standard input:

$ echo "bonjour à tous" | lingua-cli

Constrain the languages you want to detect using -l with ISO 639-1 language codes. Constraining the list improves accuracy. Run with -L to see a list of supported languages.

$ echo "bonjour à tous" | lingua-cli -l "fr,de,es,nl,en"
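The same call can be driven from a script. The following is a minimal Python sketch, assuming lingua-cli is installed and on the PATH; the helper names build_command and detect are hypothetical and not part of lingua-cli. It only uses the -l option and standard input as described above.

```python
import subprocess


def build_command(languages=None):
    """Build the lingua-cli argument list, optionally constrained via -l."""
    cmd = ["lingua-cli"]
    if languages:
        # -l takes a comma-separated list of ISO 639-1 codes, e.g. "fr,de,es"
        cmd += ["-l", ",".join(languages)]
    return cmd


def detect(text, languages=None):
    """Send text on standard input and return lingua-cli's raw output.

    Assumes the lingua-cli binary is available on the PATH.
    """
    result = subprocess.run(
        build_command(languages),
        input=text,
        capture_output=True,
        encoding="utf-8",  # pass and read text as UTF-8
        check=True,        # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Calling detect("bonjour à tous\n", ["fr", "de", "es", "nl", "en"]) would then be equivalent to the shell pipeline above.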

To classify input line-by-line, pass -n.

$ echo -e "bonjour à tous\nhola a todos\nhallo allemaal" | lingua-cli -n -l "fr,de,es,nl,en"

fr      0.9069164472389637      bonjour à tous
es      0.918273871035807       hola a todos
nl      0.988293648761749       hallo allemaal

Output is TSV and consists of an ISO 639-1 language code, a confidence score and, in line-by-line mode, a copy of the line.
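Because the output is TSV, it is straightforward to post-process. The sketch below parses line-by-line output in Python, assuming the three fields are separated by single tab characters (the sample above is shown with the tabs rendered as spaces). The function name parse_line_mode and the 0.91 threshold are illustrative choices only.

```python
def parse_line_mode(output):
    """Parse line-by-line (-n) output into (language, confidence, text) tuples."""
    rows = []
    for line in output.split("\n"):
        if not line:
            continue  # skip the empty string after the final newline
        # Split on the first two tabs only, so tabs inside the text survive.
        lang, confidence, text = line.split("\t", 2)
        rows.append((lang, float(confidence), text))
    return rows


# Sample taken from the output shown above, with explicit tab separators.
sample = (
    "fr\t0.9069164472389637\tbonjour à tous\n"
    "es\t0.918273871035807\thola a todos\n"
    "nl\t0.988293648761749\thallo allemaal\n"
)

rows = parse_line_mode(sample)

# Keep only lines classified above an arbitrary confidence threshold.
confident = [lang for lang, confidence, _ in rows if confidence >= 0.91]
print(confident)  # ['es', 'nl']
```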

You can also classify mixed-language text using the --multi option. The output then includes UTF-8 byte offsets for each detected segment:

$ lingua-cli --multi -l fr,de,en < /tmp/test.txt
0       23      fr      Parlez-vous français? 
23      73      de      Ich spreche ein bisschen spreche Französisch ja. 
73      110     en      A little bit is better than nothing.
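Note that these offsets count bytes, not characters, so they cannot be used directly as string indices whenever the text contains non-ASCII characters such as ç or ö. A minimal Python sketch of slicing by byte offset follows. The contents of /tmp/test.txt are not shown above, so the text here is an assumption reconstructed from the output, with a trailing newline to match the final offset of 110; slice_bytes is a hypothetical helper name.

```python
# Assumed file contents, reconstructed from the sample output above.
text = (
    "Parlez-vous français? "
    "Ich spreche ein bisschen spreche Französisch ja. "
    "A little bit is better than nothing.\n"
)

# (start, end, language) triples as reported by --multi in the sample.
segments = [(0, 23, "fr"), (23, 73, "de"), (73, 110, "en")]


def slice_bytes(text, start, end):
    """Return the part of text between two UTF-8 byte offsets."""
    data = text.encode("utf-8")
    return data[start:end].decode("utf-8")


for start, end, lang in segments:
    print(lang, repr(slice_bytes(text, start, end)))

# Character-based slicing drifts once a multi-byte character has been passed:
# text[0:23] already includes the first character of the German sentence.
print(repr(text[0:23]))
```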

Contributors

proycon
