
Web interface application to visualize multilingual music genre embeddings and generated cross-lingual music genre annotations for Wikipedia music entities (artists, bands, albums, tracks).

Home Page: https://research.deezer.com/muzeeglot



Muzeeglot

In this repository, we present Muzeeglot, a prototype illustrating how multilingual music genre embedding space representations can be leveraged to generate cross-lingual music genre annotations for DBpedia music entities (artists, albums, tracks, etc.).

Muzeeglot includes a web interface to visualize these multilingual music genre embeddings.

How it works

Based on annotations from one or several source languages, our system automatically predicts the corresponding annotations in a target language.
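As a rough illustration of the idea, and not the authors' actual model (which is described in the papers listed in the Cite section), cross-lingual prediction can be pictured as a nearest-neighbour search in a shared embedding space: average the vectors of the source-language tags, then rank the target-language tags by cosine similarity to that average. The toy vectors, the tag names, and the `predict_tags` function below are all hypothetical.

```python
import math

# Toy shared embedding space: (language, tag) -> vector.
# All values are invented for illustration only.
EMBEDDINGS = {
    ("fr", "rock alternatif"): [0.90, 0.10, 0.00],
    ("es", "rock alternativo"): [0.85, 0.15, 0.05],
    ("en", "alternative rock"): [0.88, 0.12, 0.02],
    ("en", "jazz"): [0.00, 0.20, 0.95],
    ("en", "folk"): [0.10, 0.90, 0.10],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def predict_tags(source_tags, target_lang, top_k=1):
    """Rank target-language tags by similarity to the mean source vector."""
    if not source_tags:
        raise ValueError("at least one source tag is required")
    vectors = [EMBEDDINGS[tag] for tag in source_tags]
    # Component-wise mean of the source tag vectors.
    centroid = [sum(column) / len(vectors) for column in zip(*vectors)]
    candidates = [
        (tag, cosine(centroid, vector))
        for (lang, tag), vector in EMBEDDINGS.items()
        if lang == target_lang
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [tag for tag, _ in candidates[:top_k]]


source = [("fr", "rock alternatif"), ("es", "rock alternativo")]
print(predict_tags(source, "en"))
```

Using several source languages at once simply adds more vectors to the average, which is why the sketch accepts a list of `(language, tag)` pairs.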

Languages supported:

  • 🇫🇷 French
  • 🇬🇧 English
  • 🇪🇸 Spanish
  • 🇳🇱 Dutch
  • 🇨🇿 Czech
  • 🇯🇵 Japanese

You will find more information about application usage here.

Architecture

Muzeeglot is based on a classic N-tier architecture including:

  • A Redis instance as storage engine.
  • A REST API developed in Python with FastAPI.
  • A frontend developed with VueJS, as a SPA (Single Page Application).

The overall stack is load-balanced using an Nginx web server.

Data such as entities, tags, and languages are stored in the Redis instance. Additionally, a text search index based on Whoosh is maintained using n-gram tokenization on entity names.
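To show why n-gram tokenization suits entity name search, here is a minimal pure-Python sketch of the technique. It does not use Whoosh and is not the project's indexing code; the function names, the n-gram sizes (2 to 4), and the sample entity names are assumptions chosen for illustration.

```python
from collections import defaultdict


def ngrams(text, min_size=2, max_size=4):
    """Return the set of character n-grams of each lowercased word."""
    grams = set()
    for word in text.lower().split():
        for size in range(min_size, max_size + 1):
            for start in range(len(word) - size + 1):
                grams.add(word[start:start + size])
    return grams


def build_index(names):
    """Map every n-gram to the set of names that contain it."""
    index = defaultdict(set)
    for name in names:
        for gram in ngrams(name):
            index[gram].add(name)
    return index


def search(index, query):
    """Return, sorted, the names indexed under every n-gram of the query."""
    grams = ngrams(query)
    if not grams:
        return []
    matches = set.intersection(*(index.get(gram, set()) for gram in grams))
    return sorted(matches)


index = build_index(["Daft Punk", "Daftside", "Pink Floyd"])
print(search(index, "daf"))  # partial input still matches both "Daft..." names
```

Because names are indexed by fragments rather than whole words, a partially typed query can already retrieve candidates, which is convenient for search-as-you-type interfaces.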

Deployment

Deploying Muzeeglot requires the following tools to be installed:

You can then clone this repository and start Muzeeglot¹:

git clone https://github.com/deezer/muzeeglot
cd muzeeglot
make start

Behind the scenes, this builds the required Docker images and runs a compose file with everything required locally in daemon mode.

¹ The first deployment takes a long time because it requires data ingestion and indexing.

SSL support

If you want to deploy Muzeeglot with SSL using LetsEncrypt, you first need to create a certificate using the provided bot challenge. Start by editing the following configuration files to add your target domain:

  • frontend/nginx/certificate-builder.conf
  • frontend/nginx/muzeeglot-ssl.conf

Once this is done, run the following command to generate the SSL certificates:

make letsencrypt DOMAIN=mydomain.tld

This creates a Docker volume and provisions it with the certificate. You can then run Muzeeglot as follows:

make ssl start

Development

The project can be managed using GNU Make through the following goals:

Goal          Description
api           Build the API image
frontend      Build the frontend image
run           Start the entire stack using docker-compose
start         Start the entire stack in daemon mode
stop          Stop the entire stack using docker-compose
logs          Display stack logs when running in daemon mode
clean         Clean the Docker volume for storage and indexes
letsencrypt   Generate the certificate volume

Additional goals can be used to provide extra parameters:

Goal          Description
no-cache      Build images using the --no-cache flag
ssl           Enable SSL support

If you want to use your own data, please provide the following files in the api/data directory²:

  • Tag embeddings such as music genres are expected through embeddings.csv CSV file.
  • Reduced embeddings for display are expected through embeddings_reduced.csv CSV file.
  • Supported languages are expected through languages.csv CSV file.
  • Indexed entities are expected through entites.csv CSV file.
  • Test corpus is expected through corpus.csv CSV file.

² You need to clean the data storage and index (see the clean goal) to force data ingestion when you redeploy.
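Before redeploying with custom data, it can help to confirm that every expected file is in place. The following is a small, hypothetical helper (it is not part of the repository) that reports which of the files listed above are missing from a data directory. The file names are copied from that list as written; the function name `missing_data_files` is an assumption.

```python
import tempfile
from pathlib import Path

# File names copied from the list above, spelling kept as written there.
EXPECTED_FILES = [
    "embeddings.csv",
    "embeddings_reduced.csv",
    "languages.csv",
    "entites.csv",
    "corpus.csv",
]


def missing_data_files(data_dir):
    """Return the expected file names that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demo on a temporary directory that holds only one of the five files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "embeddings.csv").write_text("")
    print(missing_data_files(tmp))
```

In practice you would call `missing_data_files("api/data")` from the repository root and proceed only when the returned list is empty.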

Cite

@inproceedings{epure2020muzeeglot,
  title={Muzeeglot: annotation multilingue et multi-sources d'entit{\'e}s musicales {\`a} partir de repr{\'e}sentations de genres musicaux},
  author={Epure, Elena V and Salha, Guillaume and Voituret, F{\'e}lix and Baranes, Marion and Hennequin, Romain},
  booktitle={Actes de la 6e conf{\'e}rence conjointe Journ{\'e}es d'{\'E}tudes sur la Parole (JEP, 31e {\'e}dition), Traitement Automatique des Langues Naturelles (TALN, 27e {\'e}dition), Rencontre des {\'E}tudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (R{\'E}CITAL, 22e {\'e}dition). Volume 4: D{\'e}monstrations et r{\'e}sum{\'e}s d'articles internationaux},
  pages={18--21},
  year={2020},
  organization={ATALA}
}

For more detail on how we learn multilingual music genre embeddings:

@inproceedings{epure2020modeling,
  title={Modeling the Music Genre Perception across Language-Bound Cultures},
  author={Epure, Elena V and Salha, Guillaume and Moussallam, Manuel and Hennequin, Romain},
  booktitle={The 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020)},
  month = nov,
  year={2020},
  publisher = {Association for Computational Linguistics},
}
