muse-wild_2020

Multi-modal Continuous Dimensional Emotion Recognition Using Recurrent Neural Network and Self-Attention Mechanism

In this repo, we present our solutions to the MuSe-Wild sub-challenge of MuSe 2020, the Multimodal Sentiment in Real-life Media Challenge.

Requirements

  • Python 3.7
  • PyTorch 1.4.0
  • Pandas
  • Matplotlib
  • NumPy
  • pickle (part of the Python standard library)

Model

We use a Long Short-Term Memory (LSTM) recurrent neural network together with a self-attention mechanism (denoted by the dotted lines in the architecture figure) for continuous dimensional emotion recognition.
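The following is a minimal PyTorch sketch of how an LSTM encoder and a self-attention layer might be combined for frame-level regression. It is an illustration only, not the model implemented in this repo: the class name, the layer order (LSTM first, then attention with a residual connection), treating attention as an optional switch, and the input size of 16 are all assumptions. The default sizes are borrowed from the example command in the Usage section.

```python
import torch
import torch.nn as nn


class LstmAttentionRegressor(nn.Module):
    """Hypothetical LSTM + self-attention model predicting one value per time step."""

    def __init__(self, d_in, d_rnn=64, rnn_n_layers=1, rnn_bi=True,
                 attn=True, n_heads=8):
        super().__init__()
        self.rnn = nn.LSTM(d_in, d_rnn, num_layers=rnn_n_layers,
                           bidirectional=rnn_bi, batch_first=True)
        d_model = d_rnn * (2 if rnn_bi else 1)
        # Assumption: self-attention can be switched off, like the --attn flag.
        self.attn = nn.MultiheadAttention(d_model, n_heads) if attn else None
        self.norm = nn.LayerNorm(d_model)
        self.out = nn.Linear(d_model, 1)

    def forward(self, x):
        # x: (batch, time, d_in)
        h, _ = self.rnn(x)                        # (batch, time, d_model)
        if self.attn is not None:
            h_t = h.transpose(0, 1)               # (time, batch, d_model)
            # Queries, keys and values all come from the same sequence.
            a, _ = self.attn(h_t, h_t, h_t)
            h = self.norm(h + a.transpose(0, 1))  # residual + layer norm
        return self.out(h).squeeze(-1)            # (batch, time)


model = LstmAttentionRegressor(d_in=16)           # 16 is an arbitrary feature size
pred = model(torch.randn(2, 200, 16))             # 2 segments of 200 frames
print(pred.shape)
```

The model emits one prediction per frame, which matches the time-continuous arousal and valence labels of the sub-challenge.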

Architecture

Fusion

We adopt both early fusion and late fusion for multi-modal emotion recognition in this challenge. For early fusion, we simply concatenate multiple uni-modal features and feed them into the model. For late fusion, we employ a second-level LSTM model to fuse the predictions obtained from several uni-modal features.
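The difference between the two strategies lies in what is combined per frame: raw features (early) or first-level predictions (late). The sketch below illustrates only this data arrangement in plain Python. The function names and the toy values are hypothetical, and the second-level LSTM that would consume the late fusion inputs is not shown.

```python
def _check_same_length(sequences):
    """Raise if the per-modality sequences do not share one length."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("all modalities must cover the same number of frames")


def early_fusion(features_by_modality):
    """Concatenate the feature vectors of all modalities frame by frame."""
    sequences = list(features_by_modality.values())
    _check_same_length(sequences)
    return [[v for frame in frames for v in frame] for frames in zip(*sequences)]


def late_fusion_inputs(predictions_by_modality):
    """Stack per-frame predictions of several first-level models into vectors."""
    sequences = list(predictions_by_modality.values())
    _check_same_length(sequences)
    return [list(frame) for frame in zip(*sequences)]


# Two frames, a 2-dim acoustic feature and a 1-dim text feature (toy values).
fused = early_fusion({
    "acoustic": [[0.1, 0.2], [0.3, 0.4]],
    "text": [[1.0], [2.0]],
})
print(fused)    # [[0.1, 0.2, 1.0], [0.3, 0.4, 2.0]]

# Two frames, one prediction per first-level model and frame (toy values).
stacked = late_fusion_inputs({"acoustic": [0.5, 0.6], "text": [0.7, 0.8]})
print(stacked)  # [[0.5, 0.7], [0.6, 0.8]]
```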

Usage

  • Set the dataset path PATH_TO_MUSE_2020 in config.py to the location of your copy of the dataset.
  • To train a uni-modal or early fusion model, run a command like this:
python main.py --emo_dim_set [arousal or valence] --feature_set [directory of your feature set] ...

e.g.,

python main.py --emo_dim_set valence --feature_set egemaps fasttext --d_rnn 64 --rnn_n_layers 1 --rnn_bi --attn --n_layers 1 --n_heads 8 --epochs 100 --batch_size 1024 --lr 0.005 --seed 43 --n_seeds 1 --min_lr 1e-5 --rnn_dr 0.0 --attn_dr 0.0 --out_dr 0.0 --win_len 200 --hop_len 100 --add_seg_id --log --gpu 6

All of the above options are defined in main.py. By default, each feature_set value is the name of a directory under [path to your dataset]/c1_muse_wild/feature_segments/label_aligned; pass more than one name for the multi-modal setting.
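The example command sets --win_len 200 and --hop_len 100. Assuming these flags describe a sliding window of win_len frames that advances by hop_len frames (so consecutive windows overlap by half), the segment boundaries could be computed roughly as follows. The function name is hypothetical, and shortening the last window instead of padding it is an assumption; check main.py for the actual behaviour.

```python
def segment_windows(n_frames, win_len=200, hop_len=100):
    """Return (start, end) frame indices of sliding windows over a sequence."""
    if win_len <= 0 or hop_len <= 0:
        raise ValueError("win_len and hop_len must be positive")
    windows = []
    if n_frames <= 0:
        return windows
    start = 0
    while True:
        end = min(start + win_len, n_frames)  # assumption: last window is shortened
        windows.append((start, end))
        if end >= n_frames:
            break
        start += hop_len
    return windows


print(segment_windows(450))
# [(0, 200), (100, 300), (200, 400), (300, 450)]
```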

  • To train the late fusion model, run a command like this:
python main_fusion.py --emo_dim_set [arousal or valence] --base_dir [your fusion folder] ...

e.g.,

python main_fusion.py --emo_dim_set arousal --base_dir ./fusion/test --d_model 32 --rnn_bi --n_layers 1 --epochs 15 --batch_size 64 --lr 0.001 --seed 42 --n_seeds 3 --loss ccc --min_lr 1e-5 --gpu 3 --log

All of the above options are defined in main_fusion.py. Note that the uni-modal predictions to be fused (generated by the uni-modal or early fusion models) must first be placed in the source sub-folder of base_dir.

Results

In the sub-challenge, the Concordance Correlation Coefficient (CCC) is used as the evaluation metric. Our best submission results on the validation set and the test set are as follows.
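For reference, CCC combines correlation with a penalty for differences in mean and scale: CCC = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2), using population statistics. The following is a minimal plain-Python sketch of this standard definition; it is not taken from this repo's code, and the function name is hypothetical.

```python
def ccc(pred, gold):
    """Concordance Correlation Coefficient between two equal-length sequences."""
    n = len(pred)
    if n == 0 or n != len(gold):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov / denom


print(ccc([1, 2, 3], [1, 2, 3]))  # perfect agreement gives 1.0
print(ccc([1, 2, 3], [2, 3, 4]))  # perfectly correlated but shifted by 1: 4/7
```

The second call shows why CCC is stricter than Pearson correlation: a constant offset leaves the correlation at 1 but lowers CCC to about 0.571.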

Emotion   Partition   Baseline [1]   Ours
Arousal   Val         0.3978         0.5616
Valence   Val         0.1506         0.4876
Arousal   Test        0.2834         0.4726
Valence   Test        0.2431         0.5996

References

[1] Lukas Stappen, Alice Baird, Georgios Rizos, Panagiotis Tzirakis, Xinchen Du, Felix Hafner, Lea Schumann, Adria Mallol-Ragolta, Björn W. Schuller, Iulia Lefter, Erik Cambria, Ioannis Kompatsiaris: "The 2020 Multimodal Sentiment Analysis in Real-life Media Workshop and Challenge: Emotional Car Reviews in-the-wild", Proceedings of ACM-MM 2020, Seattle, United States, 2020.

Contributors

  • youcaisun
