movie-review-project's Introduction

Large Movie Review Sentiment Analysis

This repository contains materials for sentiment analysis of the Large Movie Review Dataset.

The dataset used is available at: https://ai.stanford.edu/~amaas/data/sentiment/

Project Objectives

a) apply both statistical and semantic approaches to sentiment analysis and compare their performance across models;

b) identify the feature representations that work best on the Large Movie Review dataset; and

c) determine the machine learning model that performs best for Sentiment Analysis and explore the impact of optimizing hyperparameters on accuracy and computational efficiency.

Processes

Data collection
- Data type, shape, and manipulation
Exploratory Data Analysis
- Data distribution
- Word count
- Word length
- Identification of positive and negative words
- Most frequently occurring words
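
The exploratory steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the repository's actual code: the three toy reviews, the `tokenize` helper, and all variable names are assumptions made for the example.

```python
from collections import Counter

# Toy stand-ins for (review text, label) pairs; the real dataset is far larger.
reviews = [
    ("A wonderful, moving film with a great cast.", "pos"),
    ("A dull plot and a terrible script.", "neg"),
    ("Great acting, great story.", "pos"),
]


def tokenize(text):
    """Lowercase, split on whitespace, and trim surrounding punctuation."""
    tokens = (tok.strip(".,!?;:\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


# Data distribution: number of reviews per label.
label_counts = Counter(label for _, label in reviews)

# Word count per review and mean word length over the whole corpus.
token_lists = [tokenize(text) for text, _ in reviews]
word_counts = [len(tokens) for tokens in token_lists]
all_tokens = [tok for tokens in token_lists for tok in tokens]
mean_word_length = sum(len(tok) for tok in all_tokens) / len(all_tokens)

# Most frequently occurring words across all reviews.
top_words = Counter(all_tokens).most_common(3)

print(label_counts, word_counts, round(mean_word_length, 2), top_words)
```

Running the same counts separately for the positive and negative reviews is one simple way to surface words that are characteristic of each class.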
Data Preprocessing
- Removal of stop words, special characters, extra whitespace, and punctuation
- Contraction expansion
- Lowercase transformation
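
A minimal sketch of the preprocessing steps listed above, using only the standard library. The `CONTRACTIONS` and `STOP_WORDS` tables are tiny illustrative subsets chosen for this example, and the `preprocess` function name is hypothetical; the project itself may rely on fuller lists or a dedicated library.

```python
import re

# Illustrative subsets only (assumptions for this sketch).
CONTRACTIONS = {"don't": "do not", "it's": "it is", "can't": "cannot", "i'm": "i am"}
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "was", "and", "of", "to", "i"}

# Match any known contraction as a whole word.
_CONTRACTION_RE = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in CONTRACTIONS) + r")\b"
)


def preprocess(text):
    """Lowercase, expand contractions, strip non-letters, drop stop words."""
    text = text.lower()
    text = _CONTRACTION_RE.sub(lambda m: CONTRACTIONS[m.group(0)], text)
    # Replace punctuation, digits, and other special characters with spaces.
    text = re.sub(r"[^a-z\s]", " ", text)
    # split() with no arguments also collapses repeated whitespace.
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(preprocess("I don't like it!  It's   BORING & slow..."))
# -> do not like boring slow
```

The stop-word set here deliberately leaves out "not", because negation carries sentiment; whether to keep negation words is a design choice worth checking against whichever stop-word list is actually used.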
Model Training
- Train-test set split
- Feature extraction
- Cross-validation
- Training and testing
- Prediction
Model Evaluation
- Classification report
- Accuracy
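
The training and evaluation steps above might look roughly like the following with scikit-learn. This is a sketch under stated assumptions: the README does not name the feature extractor or the model, so TF-IDF features and logistic regression are stand-ins, and the tiny repeated corpus exists only to make the example runnable.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

# Toy corpus (assumption): 20 positive and 20 negative snippets.
pos_phrases = ["great film", "wonderful acting", "loved this movie", "brilliant and moving"]
neg_phrases = ["terrible film", "awful acting", "hated this movie", "boring and dull"]
texts = pos_phrases * 5 + neg_phrases * 5
labels = [1] * 20 + [0] * 20

# Train-test split, stratified so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# Feature extraction and classifier combined, so the vectorizer is
# fitted only on the training portion of each fold.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# Cross-validation on the training set.
cv_scores = cross_val_score(model, X_train, y_train, cv=3)

# Training, prediction, and evaluation on the held-out test set.
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
report = classification_report(y_test, y_pred)

print("CV accuracy per fold:", cv_scores)
print("Test accuracy:", accuracy)
print(report)
```

Swapping `TfidfVectorizer` for another feature representation, or `LogisticRegression` for another classifier, leaves the rest of the pipeline unchanged, which keeps comparisons between approaches on the same split.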
