The aim of this project is to build a predictive typing system for an arbitrary language. The system relies on a language model built from a large corpus given as input. The language model should incorporate advanced smoothing methods such as Witten-Bell or Kneser-Ney. The implemented language models, each using a different smoothing technique, should be evaluated by perplexity and cross-entropy on a separate held-out test fraction of the corpus. Finally, a text editor with predictive typing capabilities should be developed.
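To make the pieces named above concrete, here is a minimal sketch of a bigram model with Witten-Bell interpolation, together with cross-entropy and perplexity on held-out sentences. It is an illustration under stated assumptions, not the repository's implementation: the names `WittenBellBigram`, `cross_entropy`, and `perplexity` are hypothetical, the model order is fixed at two, and the lower-order unigram distribution uses add-one smoothing with a single reserved slot for unseen words.

```python
import math
from collections import Counter, defaultdict


class WittenBellBigram:
    """Bigram language model with Witten-Bell interpolation (illustrative sketch)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.followers = defaultdict(Counter)  # history -> counts of next words
        for sent in sentences:
            tokens = ["<s>"] + list(sent) + ["</s>"]
            self.unigrams.update(tokens[1:])  # "<s>" is never predicted
            for prev, word in zip(tokens, tokens[1:]):
                self.followers[prev][word] += 1
        self.total = sum(self.unigrams.values())
        # Assumption: one extra vocabulary slot stands in for all unseen words.
        self.vocab_size = len(self.unigrams) + 1

    def unigram_prob(self, word):
        # Add-one smoothed unigram estimate used as the lower-order model.
        return (self.unigrams[word] + 1) / (self.total + self.vocab_size)

    def prob(self, prev, word):
        nexts = self.followers.get(prev)
        if not nexts:
            # Unseen history: fall back to the unigram estimate.
            return self.unigram_prob(word)
        count = sum(nexts.values())  # c(prev)
        types = len(nexts)           # T(prev): distinct words seen after prev
        # Witten-Bell: (c(prev, w) + T(prev) * P_uni(w)) / (c(prev) + T(prev))
        return (nexts[word] + types * self.unigram_prob(word)) / (count + types)

    def predict(self, prev, k=3):
        # Rank every known word by its smoothed probability after `prev`.
        ranked = sorted(self.unigrams, key=lambda w: self.prob(prev, w), reverse=True)
        return ranked[:k]


def cross_entropy(model, sentences):
    """Average negative log2 probability per predicted token."""
    log_sum, n = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + list(sent) + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            log_sum += math.log2(model.prob(prev, word))
            n += 1
    return -log_sum / n


def perplexity(model, sentences):
    """Perplexity is 2 raised to the cross-entropy (in bits)."""
    return 2 ** cross_entropy(model, sentences)


if __name__ == "__main__":
    train = [["the", "cat", "sat"], ["the", "dog", "sat"], ["the", "cat", "ran"]]
    test = [["the", "dog", "ran"]]
    model = WittenBellBigram(train)
    print("suggestions after 'the':", model.predict("the", k=2))
    print("cross-entropy (bits):", cross_entropy(model, test))
    print("perplexity:", perplexity(model, test))
```

In a predictive editor, `predict` would be called with the last typed word to rank completion candidates. A Kneser-Ney variant would keep the same interface but replace the interpolation weight and the lower-order distribution, so both models could be compared with the same `perplexity` function on the same test fraction.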
msantl / apt: Text Analysis and Retrieval - project repository