

100-lines-text-emotion-classification


Introduction

An easy-to-build emotion classification pipeline with ktrain, a lightweight wrapper for the deep learning library TensorFlow Keras. We fine-tuned a pre-trained BERT model on a unified dataset that combines 12 emotion corpora.

Motivation

  1. ktrain is very beginner-friendly.
  2. Emotion datasets are usually built in different ways (different annotation schemes, label sets, and formats). As a result, supervised models often use only a limited subset of the available resources. We therefore use a promising unified emotion corpus from the Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart. The authors selected 12 emotion datasets and reannotated them. We expect this dataset to help us build a more general model.

How to use

  1. Build a virtual environment:
python3 -m venv venv
. venv/bin/activate
  2. Install the required packages:
pip install -r requirements.txt
(or) pip install -e .
  3. Prepare the data:
  • You can download the unified emotion corpus here
  • The default dataset is in JSON or TSV format.
  • Each data point has the following fields: 'ID', 'Corpora', 'Text', 'Emotion'
  4. Fine-tune the pre-trained model:
  • You can download the model I already fine-tuned here
  • Create a checkpoints directory in the project root and put the checkpoint files inside
  • Alternatively, you can fine-tune your own model
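
The data preparation and fine-tuning steps above could look roughly like the sketch below. It is not the repository's actual code. It assumes a TSV file with a header row naming the four fields listed above, and a JSON variant with one object per line; both layouts are assumptions. The helper names (load_tsv, load_json_lines, split_texts_and_labels, fine_tune) are hypothetical, and the hyperparameters (maxlen, batch size, learning rate, epochs) are illustrative only. ktrain is imported inside fine_tune so the loading helpers work without it installed.

```python
import csv
import io
import json

# Field names as listed in the README.
FIELDS = ("ID", "Corpora", "Text", "Emotion")


def load_tsv(handle):
    """Read tab-separated rows; assumes a header row naming the four fields."""
    # QUOTE_NONE keeps quote characters inside tweets as ordinary text.
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [{name: row[name] for name in FIELDS} for row in reader]


def load_json_lines(handle):
    """Read one JSON object per line (assumed layout of the JSON variant)."""
    return [json.loads(line) for line in handle if line.strip()]


def split_texts_and_labels(rows):
    """Separate the input texts from their emotion labels."""
    texts = [row["Text"] for row in rows]
    labels = [row["Emotion"] for row in rows]
    return texts, labels


def fine_tune(x_train, y_train, x_val, y_val):
    """Fine-tune BERT with ktrain and return a predictor (illustrative settings)."""
    import ktrain
    from ktrain import text

    # String labels are passed directly; ktrain derives the class names.
    trn, val, preproc = text.texts_from_array(
        x_train=x_train, y_train=y_train,
        x_test=x_val, y_test=y_val,
        preprocess_mode="bert",
        maxlen=128,
    )
    model = text.text_classifier("bert", train_data=trn, preproc=preproc)
    learner = ktrain.get_learner(model, train_data=trn, val_data=val, batch_size=16)
    learner.fit_onecycle(2e-5, 1)
    return ktrain.get_predictor(learner.model, preproc)


if __name__ == "__main__":
    # Tiny in-memory example; the rows are made up for illustration.
    demo = "ID\tCorpora\tText\tEmotion\n1\tdemo\tI love this\tjoy\n"
    texts, labels = split_texts_and_labels(load_tsv(io.StringIO(demo)))
    print(texts, labels)
```

The predictor returned by fine_tune can then be called as predictor.predict("some text") to get the predicted emotion label.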

Result sample

Input Text: Always love 'Jeni's' ice cream🍨💓#my #favorite #icecream #ohaio #yum #delicious #happy… https://t.co/JtQ9a1Ag1z
Predicted Label --> joy /      Ground Truth Label --> joy

Input Text: @masters_say sounds like a perfect night that I miss spending with you! #imiss12303
Predicted Label --> sadness /   Ground Truth Label --> sadness

Input Text: You can't beat a bit of Division. Interzone.
Predicted Label --> noemo /     Ground Truth Label --> joy

Input Text: @JordanWooten yeah, yeah. we'll see....can't ruin Christmas.
Predicted Label --> surprise /  Ground Truth Label --> surprise

Input Text: Bon. On va tenter la cuisine avec l'huile d'arachide ...
Predicted Label --> fear /      Ground Truth Label --> fear
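
As a rough illustration of how output in this format might be produced and scored, the sketch below formats one result and computes plain accuracy over (predicted, ground truth) pairs. The function names are hypothetical and not taken from the repository. The pairs are the five samples shown above, so the number says nothing about performance on the full dataset.

```python
def format_result(text, predicted, gold):
    """Render one prediction in the style of the samples above."""
    return (
        f"Input Text: {text}\n"
        f"Predicted Label --> {predicted} / Ground Truth Label --> {gold}"
    )


def accuracy(pairs):
    """Fraction of (predicted, gold) pairs whose labels match."""
    if not pairs:
        return 0.0
    correct = sum(1 for predicted, gold in pairs if predicted == gold)
    return correct / len(pairs)


# (predicted, ground truth) labels of the five samples listed above.
SAMPLE_PAIRS = [
    ("joy", "joy"),
    ("sadness", "sadness"),
    ("noemo", "joy"),
    ("surprise", "surprise"),
    ("fear", "fear"),
]

if __name__ == "__main__":
    print(format_result("You can't beat a bit of Division. Interzone.", "noemo", "joy"))
    print(accuracy(SAMPLE_PAIRS))  # 4 of 5 match -> 0.8
```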

Evaluation

[Figure: screenshot of the evaluation results]


To-dos

  • Evaluate on each individual dataset
  • Build a similar pipeline with pytorch-lightning
  • Make the pipeline friendlier to other datasets

Contact

If you have any questions or suggestions, feel free to contact me at [email protected]. Contributions are also welcome. Please open a pull request or an issue in this repository.


Citation

@inproceedings{bostan-klinger-2018-analysis,
    title = "An Analysis of Annotated Corpora for Emotion Classification in Text",
    author = "Bostan, Laura-Ana-Maria  and
      Klinger, Roman",
    booktitle = "Proceedings of the 27th International Conference on Computational Linguistics",
    month = aug,
    year = "2018",
    address = "Santa Fe, New Mexico, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/C18-1179",
}
