This repository contains materials for sentiment analysis of the Large Movie Review dataset.
The dataset used is available at: https://ai.stanford.edu/~amaas/data/sentiment/
The objectives of this project are to:
a) utilize both statistical and semantic approaches for sentiment analysis and compare their performance across models;
b) identify the feature representations that work best on the Large Movie Review dataset; and
c) determine the machine learning model that performs best for Sentiment Analysis and explore the impact of optimizing hyperparameters on accuracy and computational efficiency.
- Data collection
  - type, shape, manipulation
- Exploratory Data Analysis
  - Data distribution
  - Word count
  - Word length
  - Identified positive and negative words
  - Most occurring word
- Data Preprocessing
  - Removal of stop words, special characters, extra white space, and punctuation
  - Contraction expansion
  - Lowercase transformation
- Model Training
  - Train-test set split
  - Feature extraction
  - Cross validation
  - Training and testing
  - Prediction
- Model Evaluation
  - Classification report
  - Accuracy
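
The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the repository's actual code: the function name `preprocess`, the small `CONTRACTIONS` map, and the `STOP_WORDS` set are hypothetical stand-ins, and a real run would likely use fuller lists. The `<br />` stripping is included on the assumption that the raw reviews are used, since they contain that markup.

```python
import re
import string

# Hypothetical, deliberately tiny contraction map (assumption, not the project's list).
# Whole-word entries come first so "can't" is not split by the generic "n't" rule.
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "it's": "it is",
    "i'm": "i am",
    "n't": " not",
    "'re": " are",
    "'ve": " have",
    "'ll": " will",
}

# Hypothetical stop-word set. Negations such as "not" are kept on purpose,
# because they often carry sentiment.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "that",
              "and", "of", "to", "was", "i", "am", "are"}

# Translation table that maps every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(text: str) -> str:
    """Lowercase, expand contractions, strip markup and punctuation, drop stop words."""
    text = text.lower()                        # lowercase transformation
    text = re.sub(r"<[^>]+>", " ", text)       # drop HTML-like tags such as <br />
    for short, full in CONTRACTIONS.items():   # contraction expansion, in dict order
        text = text.replace(short, full)
    text = text.translate(PUNCT_TABLE)         # punctuation and special characters
    tokens = text.split()                      # split() also collapses extra white space
    return " ".join(t for t in tokens if t not in STOP_WORDS)


if __name__ == "__main__":
    print(preprocess("I can't believe it's THIS good!<br />  Wasn't boring."))
    # cannot believe good not boring
```

Contractions are expanded before punctuation is removed because the apostrophes would otherwise be lost, leaving fragments such as "can t" that no longer match the map.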