This project builds a plagiarism detector that compares answer files against source files using custom similarity features, including the longest common subsequence. I trained and deployed a decision tree model on Amazon SageMaker.
The feature engineering follows the methods presented in this paper and is shown in this notebook. Training, evaluation, and deployment of the model on AWS are shown in this notebook.
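As a rough illustration of what such similarity features can look like, here is a minimal sketch in plain Python. It is not the notebook's code: the function names `containment` and `lcs_norm_word` are hypothetical, the n-gram containment feature is an assumed example of a "custom similarity feature", and normalizing the longest common subsequence by the answer's word count is likewise an assumption.

```python
from collections import Counter


def _ngrams(text: str, n: int) -> Counter:
    """Count word n-grams in lowercased, whitespace-tokenized text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def containment(answer: str, source: str, n: int = 1) -> float:
    """Share of the answer's word n-grams that also occur in the source.

    Assumed definition: sum of shared n-gram counts divided by the
    number of n-grams in the answer.
    """
    answer_counts = _ngrams(answer, n)
    source_counts = _ngrams(source, n)
    total = sum(answer_counts.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((answer_counts & source_counts).values())
    return shared / total


def lcs_norm_word(answer: str, source: str) -> float:
    """Word-level longest common subsequence, divided by the answer length."""
    a = answer.lower().split()
    s = source.lower().split()
    if not a:
        return 0.0
    # Dynamic programming over two rows: prev[j] holds the LCS length of
    # the answer words processed so far and the first j source words.
    prev = [0] * (len(s) + 1)
    for word_a in a:
        curr = [0]
        for j, word_s in enumerate(s, start=1):
            if word_a == word_s:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / len(a)
```

Both functions return a value between 0 and 1, where higher values mean the answer shares more text with the source. Values like these could be computed per answer file and used as input columns for a classifier such as the decision tree.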
This project was part of the Udacity Machine Learning Engineer Nanodegree.