review_sentiment_analysis's Introduction

Sentiment Analysis using Word2Vec and FastText

This repository contains code for sentiment analysis using Word2Vec and FastText models. The project aims to analyze movie reviews and classify them as positive or negative.

Instructions

To execute the code, follow these steps:

  1. Install the required dependencies:

    pip install -r requirements.txt
  2. Unzip the data and models archives so that the data/ and models/ folders are available.

  3. Execute the Jupyter Notebook SE23MAID010_Assign3.ipynb:

    jupyter notebook SE23MAID010_Assign3.ipynb
  4. Follow the instructions and run each cell in the notebook to train the models, process the test data, and evaluate the results.

  5. Preprocessed data is included with the code. To skip the data preprocessing step, go to section 5 of the notebook's Table of Contents, "Building Word2Vec Model", and run the code cells from there.
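The unzip step can also be done from Python instead of a file manager. The following is a minimal sketch using only the standard library; the helper name `unzip_archives` and the archive names `data.zip` and `models.zip` are assumptions, since the actual archive file names are not stated here.

```python
import zipfile
from pathlib import Path


def unzip_archives(archive_paths, dest_dir="."):
    """Extract each existing .zip archive into dest_dir and return the names extracted."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in map(Path, archive_paths):
        if not archive.is_file():
            # Skip archives that are not present instead of raising an error.
            continue
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        extracted.append(archive.name)
    return extracted


# Hypothetical usage; replace the names with the archives shipped in the repository:
# unzip_archives(["data.zip", "models.zip"])
```

The function returns the list of archives it actually extracted, so an empty list indicates that none of the given names were found.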

Implementation Details

  • The code is implemented in Python using Jupyter Notebook.
  • The models are trained using the Word2Vec and FastText algorithms.
  • The preprocess_text function preprocesses the raw text data by tokenizing it, removing stopwords, and applying other text preprocessing techniques.
  • The trained models are evaluated using metrics such as accuracy, precision, recall, and AUC.
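To illustrate the kind of work preprocess_text does, here is a minimal sketch covering tokenization and stopword removal. It is not the notebook's implementation: the function body, the short stopword list, and the HTML-tag stripping are all assumptions made for this example.

```python
import re

# Small illustrative stopword list (assumption); the notebook may use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "that", "of", "and", "to", "in"}


def preprocess_text(text, stopwords=STOPWORDS):
    """Lowercase, strip HTML tags, tokenize into words, and drop stopwords."""
    text = re.sub(r"<[^>]+>", " ", text.lower())       # remove tags such as <br />
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text)   # words, allowing one apostrophe
    return [tok for tok in tokens if tok not in stopwords]
```

For example, `preprocess_text("This movie was <br /> GREAT, and the acting is superb!")` returns `['movie', 'great', 'acting', 'superb']`. The resulting token lists are the kind of input that word-embedding models are typically trained on.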

Directory Structure

  • data/: Contains the raw data files.
  • models/: Stores the trained Word2Vec and FastText models.

Results

Word2Vec Model

Training Results:

  • Accuracy: 59.71%
  • AUC: 63.16%
  • Precision: 58.14%
  • Recall: 74.48%

Test Results:

  • Test Loss: 0.538
  • Test Accuracy: 74.80%
  • Test Precision: 69.27%
  • Test Recall: 88.66%
  • Test AUC: 89.14%

FastText Model

Training Results:

  • Accuracy: 99.88%
  • AUC: 99.98%
  • Precision: 99.89%
  • Recall: 99.88%

Test Results:

  • Test Loss: 0.829
  • Test Accuracy: 84.87%
  • Test Precision: 89.23%
  • Test Recall: 84.19%
  • Test AUC: 85.87%
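For reference, the accuracy, precision, recall, and AUC figures listed above are standard binary-classification metrics. The sketch below shows how they can be computed from true labels and predicted scores in plain Python. The function name `binary_metrics` and the 0.5 decision threshold are assumptions; the notebook most likely relies on its framework's built-in metrics.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Return accuracy, precision, recall, and ROC AUC for binary labels (0 or 1)."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0

    # AUC as the probability that a random positive scores above a random
    # negative, counting ties as half a win.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg)) if pos and neg else float("nan")

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "auc": auc}
```

Note that accuracy, precision, and recall depend on the chosen threshold, while AUC depends only on how the scores rank positives against negatives. This is why a model can have higher AUC but lower accuracy than another.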

Analysis and Observations

  • Both Word2Vec and FastText models are trained and evaluated.
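  • On the test set, FastText outperforms Word2Vec in accuracy (84.87% vs. 74.80%) and precision (89.23% vs. 69.27%), while Word2Vec achieves higher recall (88.66% vs. 84.19%) and AUC (89.14% vs. 85.87%).
  • FastText's training accuracy (99.88%) is well above its test accuracy (84.87%), and its test loss is higher than Word2Vec's (0.829 vs. 0.538), which suggests the FastText model overfits the training data.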
  • FastText converges faster during training compared to Word2Vec.
  • Adjustments to hyperparameters and model architecture may further improve performance.
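The comparison between the two models can be checked directly from the reported numbers. The following sketch copies the figures from the Results section and computes the per-metric difference on the test set and the gap between training and test accuracy. The variable names are illustrative only.

```python
# Test metrics (percent) copied from the Results section.
test_metrics = {
    "Word2Vec": {"accuracy": 74.80, "precision": 69.27, "recall": 88.66, "auc": 89.14},
    "FastText": {"accuracy": 84.87, "precision": 89.23, "recall": 84.19, "auc": 85.87},
}
# Training accuracy (percent) copied from the Results section.
train_accuracy = {"Word2Vec": 59.71, "FastText": 99.88}

# FastText minus Word2Vec on each test metric, in percentage points.
test_delta = {
    name: round(test_metrics["FastText"][name] - test_metrics["Word2Vec"][name], 2)
    for name in test_metrics["FastText"]
}

# Training accuracy minus test accuracy; a large positive gap hints at overfitting.
train_test_gap = {
    model: round(train_accuracy[model] - test_metrics[model]["accuracy"], 2)
    for model in train_accuracy
}

print("FastText - Word2Vec (test):", test_delta)
print("Train - test accuracy:", train_test_gap)
```

A positive value in `test_delta` means FastText scores higher on that metric, and a negative value means Word2Vec does.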

review_sentiment_analysis's People

Contributors

batul02
