Translate JP Bot

Chat bot that translates Japanese words into English using JMdict.

You can talk to the bot right now on Telegram. The bot's handle is @TranslateJPBot. It will try to translate any text that you send to it.

How it works

The bot parses the JMdict XML file and inserts each word, in both hiragana and kanji forms, into a red-black tree. Because of this, finding a word is an O(log n) operation. Once the tree is ready, the bot spins up a Scotty server exposing a single endpoint for the Telegram webhook. When Telegram posts an update, the bot tries to reply with a translation.
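
The lookup idea can be sketched in a few lines of Haskell. This is only an illustration under stated assumptions, not the bot's actual code: the `Entry` type, `buildDictionary`, `translate`, and the sample data are hypothetical, and `Data.Map.Strict` from the containers package stands in for the red-black tree (it is a size-balanced binary tree, but lookups are likewise O(log n)).

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, heavily simplified entry; real JMdict entries carry many more fields.
data Entry = Entry
  { kanji    :: [String]  -- kanji spellings (may be empty)
  , readings :: [String]  -- kana readings
  , glosses  :: [String]  -- English translations
  } deriving (Eq, Show)

-- Each written form maps to every entry that uses it.
type Dictionary = Map.Map String [Entry]

-- Index every entry under each of its kanji and kana forms.
-- Data.Map is an assumption standing in for the red-black tree.
buildDictionary :: [Entry] -> Dictionary
buildDictionary entries =
  Map.fromListWith (flip (++))
    [ (form, [entry])
    | entry <- entries
    , form  <- kanji entry ++ readings entry
    ]

-- Exact-match lookup in O(log n); an unknown word yields an empty list.
translate :: Dictionary -> String -> [String]
translate dict word = concatMap glosses (Map.findWithDefault [] word dict)

-- Made-up sample data for the sketch.
sampleEntries :: [Entry]
sampleEntries =
  [ Entry ["猫"] ["ねこ"] ["cat"]
  , Entry ["犬"] ["いぬ"] ["dog"]
  ]

main :: IO ()
main = do
  let dict = buildDictionary sampleEntries
  print (translate dict "猫")        -- kanji form
  print (translate dict "ねこ")      -- hiragana form
  print (translate dict "ねこたち")  -- not a dictionary form, so no match
```

Because every form is its own key, the kanji and hiragana spellings of a word resolve to the same entry, while anything that is not stored exactly as written finds nothing.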

Limitations

There's still a lot of room for improvement. Here's a list of current limitations that I want to change in the future.

  • Telegram is the only platform supported. Eventually it should support other popular chat apps like Facebook or Discord.
  • The bot can't translate sentences, or even conjugated verbs and adjectives. It lacks an algorithm to map conjugated forms to their dictionary forms, as well as an algorithm to identify each word in a sentence. Right now it only works if you send the word exactly as it appears in the dictionary.
  • The bot is memory intensive. Since it loads the whole Japanese dictionary into a binary tree in memory, it uses about 2 GB of RAM when running.

Build

Clone with git and build with make and Stack:

git clone https://github.com/GAumala/TranslateJPBot
make
stack setup
stack build

Run the server with:

TELEGRAM_TOKEN=<MY_SECRET_TOKEN> stack exec bot

Deploy

If you already have an nginx server set up with SSL, you can easily deploy the bot by adding a new location to your existing server block.

# /etc/nginx/nginx.conf

server {
        # Existing configuration...

+       location /telegram/ {
+               proxy_pass http://localhost:4000;
+       }
}

After that, you need to register a webhook with the Telegram API. The secret token is part of the webhook URL so that attackers who don't know it can't talk to the bot. To register the URL, use curl:

curl -F "url=https://<YOURDOMAIN.EXAMPLE>/telegram/<MY_SECRET_TOKEN>" https://api.telegram.org/bot<MY_SECRET_TOKEN>/setWebhook

That's it! You're done! If you want to test that the server is running correctly, you can modify test/serverTests.sh to point to your server and run the tests:

token=$(printenv TELEGRAM_TOKEN)
-host=http://localhost:4000
+host=https://<YOURDOMAIN.EXAMPLE>
url=$host/telegram/$token

If you see status 200 on each request, then the bot is running correctly.
