- Build and evaluate Word2Vec models using different text corpora.
- Use the best model to implement Word Mover's Distance.
- Optimize the Word Mover's Distance using three techniques proposed by Kusner et al.:
   a. Word Centroid Distance
   b. Relaxed Word Mover's Distance
   c. Prefetch and Prune
- Use Word Mover's Distance to explore different heuristics for generating summaries.
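
As a rough illustration of the first two optimizations listed above, here is a minimal pure-Python sketch of Word Centroid Distance and Relaxed Word Mover's Distance. It is not this project's implementation. The `EMBEDDINGS` table holds made-up 2-D vectors standing in for trained Word2Vec vectors, and the names `nbow`, `wcd`, `one_sided_rwmd` and `rwmd` are hypothetical helpers chosen for this example. It assumes each document contains at least one in-vocabulary word.

```python
import math
from collections import Counter

# Hypothetical toy embeddings; a real run would use vectors from the trained Word2Vec model.
EMBEDDINGS = {
    "obama": (1.0, 0.9),
    "president": (0.9, 1.0),
    "speaks": (0.2, 0.8),
    "greets": (0.3, 0.7),
    "media": (0.8, 0.1),
    "press": (0.7, 0.2),
}


def nbow(tokens):
    """Normalized bag-of-words: word -> relative frequency (out-of-vocabulary words dropped)."""
    counts = Counter(t for t in tokens if t in EMBEDDINGS)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def centroid(weights):
    """Weighted average of the word vectors of a document."""
    dim = len(next(iter(EMBEDDINGS.values())))
    return tuple(
        sum(wt * EMBEDDINGS[word][k] for word, wt in weights.items())
        for k in range(dim)
    )


def wcd(doc_a, doc_b):
    """Word Centroid Distance: distance between the two document centroids."""
    return euclidean(centroid(nbow(doc_a)), centroid(nbow(doc_b)))


def one_sided_rwmd(src, dst):
    """Relaxation with one constraint dropped: each source word moves entirely
    to its nearest word in the destination document."""
    return sum(
        wt * min(euclidean(EMBEDDINGS[word], EMBEDDINGS[other]) for other in dst)
        for word, wt in src.items()
    )


def rwmd(doc_a, doc_b):
    """Relaxed Word Mover's Distance: the larger of the two one-sided relaxations."""
    a, b = nbow(doc_a), nbow(doc_b)
    return max(one_sided_rwmd(a, b), one_sided_rwmd(b, a))


doc_a = "obama speaks media".split()
doc_b = "president greets press".split()
print("WCD :", wcd(doc_a, doc_b))
print("RWMD:", rwmd(doc_a, doc_b))
```

Both values are lower bounds on the exact Word Mover's Distance, with WCD the cheaper and looser of the two. Prefetch and Prune builds on this: candidates are ranked by WCD, and the exact distance is computed only for those whose RWMD does not already rule them out.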
References:
- Thomas Mikolov et al., "Distributed Representations of Words and Phrases and their Compositionality"
- Matt J. Kusner et al., "From Word Embeddings To Document Distances"
Note: The Word2Vec model must be built first, because the word vectors it produces are the input for computing document distances and generating summaries.