With the growing amount of publicly available data and the increased focus on unstructured text, understanding how to clean, process, and analyze text data is tremendously valuable. This project starts with a quick summary of basic natural language processing (NLP) concepts, covers advanced data cleaning and vectorization techniques, and then takes a deep dive into building machine learning classifiers. In this last step, it shows how to build two different types of machine learning models, as well as how to evaluate and test variations of those models. Topics include:
• What are NLP and NLTK?
• Using regular expressions.
• Using stemming and lemmatizing.
• Methods to vectorize raw data.
• Building and evaluating machine learning classifiers.
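To make the regular-expression and vectorization topics more concrete, here is a minimal sketch of how raw messages might be cleaned and turned into count vectors. It uses only the Python standard library rather than NLTK, and it is not taken from the repository: the function names (`clean_and_tokenize`, `build_vocabulary`, `count_vectorize`) and the two sample messages are assumptions made up for illustration.

```python
import re
from collections import Counter


def clean_and_tokenize(text):
    """Lowercase the text and extract word tokens with a regular expression."""
    # Keeps runs of letters, digits, and apostrophes; punctuation is dropped.
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(token_lists):
    """Map every distinct token to a column index, in sorted order."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    return {tok: i for i, tok in enumerate(vocab)}


def count_vectorize(tokens, vocabulary):
    """Turn one token list into a bag-of-words count vector."""
    counts = Counter(tokens)
    # The vocabulary dict preserves insertion (sorted) order, so position i
    # in the result matches the token with index i.
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical sample messages, not data from the project.
messages = [
    "WIN a FREE prize now!!!",
    "Are we still meeting for lunch?",
]

token_lists = [clean_and_tokenize(m) for m in messages]
vocabulary = build_vocabulary(token_lists)
vectors = [count_vectorize(tokens, vocabulary) for tokens in token_lists]

for message, vector in zip(messages, vectors):
    print(message, "->", vector)
```

Vectors like these are the kind of numeric input a classifier can be trained on. A fuller pipeline would likely add stemming or lemmatizing before building the vocabulary, which this sketch leaves out.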
seema200 / spam_filter_nlp
SPAM Filter using Natural Language Processing (NLP)