

License: MIT License

Jupyter Notebook 100.00%


🐦 Twitter Text Classification 🐦

This project focuses on sentiment analysis of tweets in Portuguese using machine learning techniques. The main goal is to classify tweets as positive or negative based on their content. Below is a detailed explanation of the project components, from data preprocessing to model evaluation.

🎯 Project Objectives 🎯

The project classifies tweets as positive or negative using natural language processing (NLP) techniques. Its specific objectives are:

  • Develop a text classification model that accurately categorizes tweets by sentiment.
  • Utilize preprocessing techniques to clean and prepare text data for analysis.
  • Evaluate the performance of the classification model on training and testing datasets.
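
To make the evaluation objective concrete, here is a minimal, dependency-free sketch of the two metrics the Key Features section names: accuracy and a confusion matrix. The project lists scikit-learn among its libraries, but the notebook's actual calls are not shown here, so the function names and the POSITIVE/NEGATIVE label strings below are assumptions for illustration only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_matrix(y_true, y_pred, labels):
    """Count (true, predicted) pairs; rows are true labels, columns are predictions."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


# Hypothetical labels and predictions, not results from the project.
y_true = ["POSITIVE", "NEGATIVE", "POSITIVE", "NEGATIVE"]
y_pred = ["POSITIVE", "NEGATIVE", "NEGATIVE", "NEGATIVE"]

print(accuracy(y_true, y_pred))
print(confusion_matrix(y_true, y_pred, labels=["NEGATIVE", "POSITIVE"]))
```

Running the same functions on both the training and the testing predictions gives a quick view of how far the two scores diverge, which is one common sign of overfitting.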

ℹ️ Key Features ℹ️

  • Importing necessary libraries for data manipulation, visualization, and machine learning.
  • Loading the training and testing datasets of tweets for sentiment analysis.
  • Text preprocessing steps including lowercase conversion, username handling, URL handling, emoticon handling, irrelevant word removal, lemmatization, and punctuation removal.
  • Creating a text classification model using the textcat component of spaCy.
  • Training the model using the training dataset and evaluating its performance.
  • Testing the trained model on sample sentences and evaluating its predictions.
  • Model evaluation using accuracy score and confusion matrix.
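
Several of the preprocessing steps listed above can be sketched with the standard library alone. This is an illustrative sketch, not the notebook's implementation: the placeholder tokens, the emoticon table, and the function name `preprocess` are assumptions. Lemmatization and irrelevant-word removal are left out because they depend on the spaCy pipeline.

```python
import re
import string

# Assumed placeholder tokens and emoticon table (hypothetical, for illustration).
USER_TOKEN = "usuario"
URL_TOKEN = "url"
EMOTICONS = {
    ":)": "emocaopositiva",
    ":-)": "emocaopositiva",
    ":(": "emocaonegativa",
    ":-(": "emocaonegativa",
}

USERNAME_RE = re.compile(r"@\w+")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(tweet: str) -> str:
    """Lowercase a tweet, normalize URLs, usernames and emoticons, drop punctuation."""
    text = tweet.lower()
    # Replace URLs first so that their punctuation is not processed piecemeal.
    text = URL_RE.sub(URL_TOKEN, text)
    text = USERNAME_RE.sub(USER_TOKEN, text)
    # Handle longer emoticons first so that none is matched only in part.
    for emoticon in sorted(EMOTICONS, key=len, reverse=True):
        text = text.replace(emoticon, f" {EMOTICONS[emoticon]} ")
    # Remove ASCII punctuation, then collapse runs of whitespace.
    text = text.translate(PUNCT_TABLE)
    return " ".join(text.split())


print(preprocess("@joao Adorei o filme! :) https://t.co/abc"))
```

Mapping usernames, URLs, and emoticons to fixed tokens, rather than deleting them, keeps a signal the classifier can still learn from; whether the project replaces or removes them is not stated in this README.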

💻 Technologies Used 💻

The project utilizes the following technologies and libraries:

  • Python 🐍
  • spaCy 🧠
  • NumPy 🔢
  • pandas 🐼
  • scikit-learn 🧮
  • Matplotlib 📊
  • Seaborn 🌊

📋 Requirements 📋

To run the project, make sure you have the following installed:

  • Python 3.x
  • Jupyter Notebook or another Python environment
  • Required Python libraries: spaCy, NumPy, pandas, scikit-learn, Matplotlib, Seaborn

▶️ Setting Up the Project ▶️

Setting up the environment on Linux

  1. Clone this repository using the command git clone https://github.com/BrunoTanabe/twitter-text-classification.
  2. Navigate to the twitter-text-classification folder using the command cd twitter-text-classification.
  3. Create a virtual environment using the command python3 -m venv venv.
  4. Activate the virtual environment using the command source venv/bin/activate.
  5. Install requirements using the command pip install -r requirements.txt.
  6. Execute the command python -m spacy download pt_core_news_lg to download the NLP model for text processing.

Setting up the environment on Windows

  1. Clone this repository using the command git clone https://github.com/BrunoTanabe/twitter-text-classification.
  2. Navigate to the twitter-text-classification folder using the command cd twitter-text-classification.
  3. Create a virtual environment using the command python -m venv venv.
  4. Activate the virtual environment using the command .\venv\Scripts\activate.
  5. Install requirements using the command pip install -r requirements.txt.
  6. Execute the command python -m spacy download pt_core_news_lg to download the NLP model for text processing.
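
After following the steps for either platform, a short check like the one below can confirm that the libraries and the Portuguese model are importable from the active virtual environment. It is a sketch under assumptions: the helper name `missing_modules` is hypothetical, and the module list is inferred from the Requirements section (scikit-learn is imported as `sklearn`; spaCy models installed with `spacy download` are regular Python packages).

```python
from importlib.util import find_spec

# Import names inferred from the Requirements section (an assumption,
# not copied from the project's requirements.txt).
REQUIRED_MODULES = [
    "spacy",
    "numpy",
    "pandas",
    "sklearn",
    "matplotlib",
    "seaborn",
    "pt_core_news_lg",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If `pt_core_news_lg` is reported missing, rerunning step 6 with the virtual environment activated is the usual fix.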

⚠️ Important Note ⚠️

Ensure that the file paths for loading and saving data/models are correctly configured based on the structure of your local directory.
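
One way to keep those paths in a single place is to build them from one project root with `pathlib`. The directory and file names below (`data/train.csv`, `data/test.csv`, `model`) are hypothetical placeholders, since this README does not list the repository layout; replace them with the names used in your copy.

```python
from pathlib import Path


def project_paths(root="."):
    """Build absolute paths for data and model files from a single project root."""
    root = Path(root).expanduser().resolve()
    return {
        "train": root / "data" / "train.csv",  # hypothetical file name
        "test": root / "data" / "test.csv",    # hypothetical file name
        "model": root / "model",               # hypothetical output directory
    }


def missing_inputs(paths, keys=("train", "test")):
    """Return the input paths that do not exist yet on disk."""
    return [paths[key] for key in keys if not paths[key].exists()]


paths = project_paths(".")
for name, path in paths.items():
    print(f"{name}: {path}")
```

Calling `missing_inputs(paths)` before loading the data turns a wrong path into an explicit list of missing files rather than an error partway through the notebook.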

✍️ Authors ✍️

This project was created by Bruno Tanabe. For any questions or feedback, please contact [email protected].
