MEC

Music emotion classifiers based on lyrics using LDA, SVM, and AdaBoost

Setup instructions

  1. Install git-lfs (Follow instructions on https://git-lfs.github.com/)
  2. Clone repo
  3. Pull using git-lfs
  4. Create virtual env (Optional)
  5. Run 'pip install -r requirements.txt'

Dataset

The original datasets are in the 'data/spotify/' and 'data/deezer/' directories and do not include lyrics. Lyrics for each song were scraped from Genius using a tweaked version of lyricsgenius (with a slight modification to the search_song API). Upon completion, a gen_{dataset_name}_data.csv and a gen_{dataset_name}_error_log.txt are created for each dataset.

gen_dataset.py

Running 'python shared/gen_dataset.py' starts scraping lyrics for each song in the datasets. The result of each API call is verified with a quick check that the song title and artist name found online match the dataset values before the result is stored.
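The title and artist check could look roughly like the sketch below. This is not the code in shared/gen_dataset.py: the helper names `normalize` and `is_match` are hypothetical, and the normalization rules (case folding, accent and punctuation removal) are an assumption about what a reasonable "quick check" might do.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Case-fold, strip accents and punctuation, and collapse whitespace."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Replace punctuation with spaces so "Don't" and "Don’t" compare equal.
    text = re.sub(r"[^\w\s]", " ", text.casefold())
    return " ".join(text.split())


def is_match(song: str, artist: str, found_song: str, found_artist: str) -> bool:
    """Hypothetical check: accept a result only if both fields agree."""
    return (
        normalize(song) == normalize(found_song)
        and normalize(artist) == normalize(found_artist)
    )
```

A row would be written to the CSV only when `is_match(...)` returns True; otherwise the song would presumably end up in the error log.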

Columns

  1. song - song title (string)
  2. artist - artist name (string)
  3. valence - valence rating (numeric)
  4. arousal - arousal rating (numeric)
  5. lyrics - lyrics (string)
  6. found_song - song title found online (string)
  7. found_artist - artist name found online (string)

clean_dataset.py

Running 'python shared/clean_dataset.py' performs basic cleaning and preprocessing of the datasets generated by shared/gen_dataset.py. Songs are cleaned based on lyric length, lyric tags (e.g. [VERSE 1]), language (English only, detected with langdetect), a word count limit, and a unique word count threshold. After that, an emotion class label ('y') is generated from the quadrants of the valence-arousal space (see fig. below).

[Figure: quadrants of the valence-arousal space]
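As a rough illustration of the tag and word count steps, here is a minimal sketch. It assumes that lyric tags are stripped from the text and that the remaining criteria act as filters; the threshold values and the names `strip_tags` and `keep_song` are hypothetical and are not taken from shared/clean_dataset.py. The language check is left out so the example runs with the standard library only.

```python
import re

# Matches bracketed section tags such as [VERSE 1] or [Chorus].
TAG_PATTERN = re.compile(r"\[[^\]]*\]")

# Hypothetical thresholds; the real script may use different values.
MIN_WORDS = 30
MAX_WORDS = 1000
MIN_UNIQUE_WORDS = 10


def strip_tags(lyrics: str) -> str:
    """Remove bracketed tags and drop the empty lines they leave behind."""
    without_tags = TAG_PATTERN.sub(" ", lyrics)
    lines = (" ".join(line.split()) for line in without_tags.splitlines())
    return "\n".join(line for line in lines if line)


def keep_song(
    lyrics: str,
    min_words: int = MIN_WORDS,
    max_words: int = MAX_WORDS,
    min_unique: int = MIN_UNIQUE_WORDS,
) -> bool:
    """Keep a song only if its cleaned lyrics pass the word count checks."""
    words = strip_tags(lyrics).lower().split()
    return min_words <= len(words) <= max_words and len(set(words)) >= min_unique
```

Counting words after the tags are removed keeps markers like [Chorus] from inflating either the total or the unique word count.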

Columns

  1. song - song title (string)
  2. artist - artist name (string)
  3. valence - valence rating (numeric)
  4. arousal - arousal rating (numeric)
  5. lyrics - lyrics (string)
  6. found_song - song title found online (string)
  7. found_artist - artist name found online (string)
  8. y - emotion class label from 1-4 (numeric)
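The quadrant-based label can be sketched as follows. The numbering of the quadrants and the `center` value that splits the valence-arousal space are assumptions made for illustration; the README only states that 'y' ranges from 1 to 4, so the mapping in shared/clean_dataset.py may differ. The function name `emotion_label` is hypothetical.

```python
def emotion_label(valence: float, arousal: float, center: float = 0.0) -> int:
    """Map a (valence, arousal) pair to a quadrant label from 1 to 4.

    Assumed numbering (counter-clockwise from the top-right quadrant):
      1 = high valence, high arousal
      2 = low valence,  high arousal
      3 = low valence,  low arousal
      4 = high valence, low arousal
    """
    high_valence = valence >= center
    high_arousal = arousal >= center
    if high_valence and high_arousal:
        return 1
    if not high_valence and high_arousal:
        return 2
    if not high_valence and not high_arousal:
        return 3
    return 4
```

If the ratings are on a 0 to 1 scale rather than centered on zero, passing `center=0.5` would split the space at the midpoint instead.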
