Code Monkey home page Code Monkey logo

chrentranslate's Introduction

Cherokee-English Translate Demo

Online Website

Our demo is at https://chren.cs.unc.edu/. However, to avoid misusing, we have not opened it to the public yet.

If you are an Cherokee-English expert and want to participate in our human-in-the-loop development, do not hesitate to contact us (Shiyue Zhang, shiyue At cs.unc.edu).

Local Setup

Models

Download NMT translation models, vocabulary files, SMT translation models, and QE models from here. Decompress all of them and put them under this code directory, e.g., ./ChrEnTranslate/models.

Note that, the data used to train our translation models is provided here. This data is not necessary for setting up this demo.

Python Environment

python3.8 -m venv test
source test/bin/activate
pip install -r requirements.txt

Download our modified OpenNMT-py from here.

cd OpenNMT-py; python setup.py install

Moses

Download mosesdecoder and compile it following the instruction. Be sure to use --with-xmlrpc-c so that it can work as a Service. Then, start the two SMT models in the background:

nohup mosesdecoder/bin/moses -f smt/enchr_demo_feedback_round2_ced/mert-work/moses.ini --server --server-port 5001 --server-log enchr.log &
nohup mosesdecoder/bin/moses -f smt/chren_demo_feedback_round2_ced/mert-work/moses.ini --server --server-port 5002 --server-log chren.log &

Here we use nohup, but any method that runs two models in the background should work.

MongoDB

Our demo requires MongoDB to store feedback and monolingual data.

After you setup your MongoDB, put your MONGO_URI in a file named confidential.py.

By default, we use the following collections:

  • expert: for storing experts' feedback, e.g., { "_id" : ObjectId("xxx"), "type" : "tochr", "model" : "nmt", "en" : "I'm going to get some coffee.", "chr" : "ᎧᏫ ᏫᏓᏥᏁᎩᏏ.", "qe" : "1.9693317866337503", "rate" : "5", "text" : "ᎧᏫ ᏫᏓᏥᏁᎩᏏ.", "timestamp" : 1618012380.4715204, "comment" : "" }
  • user: for storing common users' feedback, e.g., { "_id" : ObjectId("xxx"), "type" : "tochr", "model" : "nmt", "en" : "Hello", "chr" : "ᏏᏲ", "qe" : "2.840020728156041", "rate" : "5", "timestamp" : 1616613202.3531222, "comment" : "" }
  • chr: for storing unlabeled Cherokee data, e.g., { "_id" : ObjectId("xxx"), "uid" : "040921_19", "text" : "ᏍᎩᏅ ᎠᏯᏨᏗ ᏍᎪᎢᏯᎦᏴᎵᎠᏕᎳ ᏚᎵᎬᏩᎳ?"", "status" : "unlabeled" }
  • en: for storing unlabeled English data, e.g., { "_id" : ObjectId("xxx"), "uid" : "3980085662.jpg#0r1e_2", "text" : "boys play football", "status" : "unlabeled" }

See ChrEnTranslator.py for how we use MongoDB.

Start

After the preparation above, simply start the demo by running:

python ChrEnTranslator.py 

Then you can access to the website via http://127.0.0.1:5000/ or http://127.0.0.1:5000/expert.

Here is a screen shot: Demo

If you want to run it as a online service. Please refer to this tutorial.

Disclaimer

  • Whenever you use our demo, you agree to be aware that the translations provided by our demo are translated by machines, are highly likely to be poor-quality, and are not supposed to be used anywhere else without confirmation from professional Cherokee-English translators.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.