
Machine Translation

Translating text from one language to another.



Methods Used

  1. Encoder - Decoder Architecture

  2. Attention Mechanism

  3. Training and Inference Mode of Decoder

Encoder-Decoder Architecture

  1. The encoder LSTM processes the entire input sentence and encodes it into a context vector, which is the last hidden state of the LSTM/RNN. This vector is expected to be a good summary of the input sentence. All the intermediate states of the encoder are ignored, and the final state is used as the initial hidden state of the decoder.

  2. The decoder LSTM or RNN units produce the words of the output sentence one after another. The model is trained with the teacher-forcing technique and tested in inference mode.
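The two steps above can be sketched in NumPy with a plain tanh RNN cell standing in for the LSTM. All parameters here are random toy values, not a trained model; the point is only the data flow: the encoder's final state becomes the context vector, and teacher forcing feeds the ground-truth previous target token to the decoder at each step.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_step(x, h, Wx, Wh, b):
    """One step of a simple tanh RNN cell (a stand-in for an LSTM/GRU cell)."""
    return np.tanh(x @ Wx + h @ Wh + b)

hidden, emb = 8, 4
# Hypothetical toy parameters; a real model would learn these by backpropagation.
enc_Wx = rng.normal(size=(emb, hidden))
enc_Wh = rng.normal(size=(hidden, hidden))
dec_Wx = rng.normal(size=(emb, hidden))
dec_Wh = rng.normal(size=(hidden, hidden))
b = np.zeros(hidden)

src = rng.normal(size=(5, emb))  # source sentence: 5 already-embedded tokens
tgt = rng.normal(size=(3, emb))  # target sentence: 3 already-embedded tokens

# --- Encoder: read the whole source; keep only the final hidden state ---
h = np.zeros(hidden)
for x in src:
    h = rnn_step(x, h, enc_Wx, enc_Wh, b)
context = h  # the context vector summarising the source sentence

# --- Decoder (teacher forcing): feed the ground-truth previous token ---
h = context  # decoder starts from the encoder's final state
outputs = []
for x in tgt:
    h = rnn_step(x, h, dec_Wx, dec_Wh, b)
    outputs.append(h)  # in a full model, each h would be projected to vocab logits
```

At inference time the loop differs only in its input: instead of the ground-truth token `x`, the decoder would embed its own previous prediction and feed that back in.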

Attention Mechanism

In psychology, attention is the cognitive process of selectively concentrating on one or a few things while ignoring others. The attention mechanism was born to help memorize long source sentences in neural machine translation (NMT).

Rather than building a single context vector out of the encoder's last hidden state, the secret sauce invented by attention is to create shortcuts between the context vector and the entire source input.

Bahdanau Attention

  1. Producing the Encoder Hidden States - the encoder produces a hidden state for each element in the input sequence.

  2. Calculating the Alignment Scores - alignment scores are calculated between the previous decoder hidden state and each of the encoder's hidden states. (Note: the last encoder hidden state can be used as the first hidden state of the decoder.)

  3. Softmaxing the Alignment Scores - the alignment scores for all encoder hidden states are collected into a single vector and softmaxed, yielding the attention weights.

  4. Calculating the Context Vector - each encoder hidden state is multiplied by its alignment score, and the results are summed to form the context vector.

  5. Decoding the Output - the context vector is concatenated with the previous decoder output and fed into the Decoder RNN for that time step, along with the previous decoder hidden state, to produce a new output.

  6. The process (steps 2-5) repeats for each decoder time step until an end-of-sequence token is produced or the output exceeds the specified maximum length.
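Steps 1-4 above can be sketched in NumPy for a single decoder time step. The additive (Bahdanau) score is score(s, h_i) = v · tanh(W1 s + W2 h_i); the weight matrices `W1`, `W2` and vector `v` below are random placeholders for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(1)
hidden, T = 8, 5

# Step 1: encoder hidden states, one per source token
enc_states = rng.normal(size=(T, hidden))
dec_h = enc_states[-1]  # last encoder state seeds the decoder (see the note in step 2)

# Hypothetical learned parameters of the additive score v . tanh(W1 s + W2 h_i)
W1 = rng.normal(size=(hidden, hidden))
W2 = rng.normal(size=(hidden, hidden))
v = rng.normal(size=hidden)

# Step 2: alignment score between the decoder state and each encoder state
scores = np.tanh(dec_h @ W1 + enc_states @ W2) @ v  # shape (T,)

# Step 3: softmax the scores into attention weights that sum to 1
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Step 4: context vector = attention-weighted sum of the encoder states
context = weights @ enc_states  # shape (hidden,)
```

In step 5, `context` would be concatenated with the previous decoder output and fed to the decoder RNN; the whole computation is then repeated with the new decoder state.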



Technologies Used

1. LSTM

2. GRU

3. Sequence-to-Sequence Model (Encoder-Decoder)

4. Bahdanau Attention Mechanism

Usage

The model can be used as a translator: given a sentence in the source language, it produces the corresponding sentence in the target language.

Improvement

We can improve the model by improving the dataset and then training for more epochs on the improved data.

Author

Rahul Kumar Patro

