AudioStyleNet - Controlling StyleGAN through Audio

This repository contains the code for my master's thesis on talking head generation by controlling the latent space of a pretrained StyleGAN model. The work was done in the Visual Computing and Artificial Intelligence Group at the Technical University of Munich under the supervision of Matthias Niessner and Justus Thies.

Video

See the demo video for more details and results.

Set-up

The code uses Python 3.7.5 and was tested with PyTorch 1.4.0 and CUDA 10.1. A GPU with CUDA support is required.

Clone the git project:

$ git clone https://github.com/FeliMe/Emotion-Aware-Facial-Animation.git

Create two virtual environments:

$ conda env create -f environment.yml
$ conda create -n deepspeech python=3.6

Install requirements:

$ conda activate audiostylenet
$ pip install -r requirements.txt
$ conda activate deepspeech
$ pip install -r deepspeech/deepspeech_requirements.txt
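Since the project needs a CUDA-capable GPU, it can help to confirm that PyTorch is importable and sees the GPU before going further. The following is a minimal sketch, not part of the repository; the helper name check_torch is hypothetical, and it assumes the audiostylenet environment is active.

```shell
# Sketch: report whether PyTorch is importable and whether it sees a CUDA device.
# check_torch is a hypothetical helper, not part of this repository.
# Run it inside the activated audiostylenet environment.
check_torch() {
    if python -c "import torch" >/dev/null 2>&1; then
        # Print the installed version and the CUDA availability flag
        python -c "import torch; print('torch', torch.__version__, 'cuda available:', torch.cuda.is_available())"
    else
        echo "PyTorch is not importable in this environment"
    fi
}

check_torch
```

If the last value printed is False, PyTorch was installed without a usable CUDA runtime or no GPU is visible to it.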

Install ffmpeg:

$ sudo apt install ffmpeg

Demo

Download the pretrained AudioStyleNet model and the StyleGAN model from Google Drive and place them in the model/ folder.
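A quick way to confirm the download step worked is to check that model/ exists and is not empty. This is only a sketch: the README does not list the model file names, so the hypothetical helper check_models looks for any file rather than specific ones.

```shell
# Sketch: confirm that the model folder exists and is not empty before running the demo.
# check_models is a hypothetical helper; file names are not listed in this README,
# so the check only looks for the presence of any file.
check_models() {
    dir="${1:-model}"
    if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "Files in $dir/:"
        ls -1 "$dir"
    else
        echo "No files found in $dir/; download the pretrained models first"
        return 1
    fi
}

check_models model || true
```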

Run the demo:

$ python run_audiostylenet.py

Use your own audio

To test the model with your own audio, first convert it to a .wav file, then run the following:

$ cd deepspeech
$ conda activate deepspeech
$ python run_voca_feature_extraction.py --audiofiles <path to .wav file> --out_dir ../data/audio/
$ conda deactivate

Then run python run_audiostylenet.py with the arguments adapted to your audio.
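The audio conversion mentioned at the start of this section is not spelled out. One possible approach, sketched here with the ffmpeg installed during set-up, is shown below. The helper names wav_name and to_wav are hypothetical, and the mono 16 kHz target is an assumption based on what DeepSpeech models typically expect; this README does not state a required sample rate.

```shell
# Sketch: convert an audio file to a mono 16 kHz WAV file with ffmpeg.
# Assumption: mono 16 kHz suits the DeepSpeech-based feature extraction;
# the README does not specify the required format beyond ".wav".

wav_name() {
    # Replace the final extension with .wav (e.g. speech.mp3 -> speech.wav)
    printf '%s\n' "${1%.*}.wav"
}

to_wav() {
    # -ac 1: one audio channel, -ar 16000: 16 kHz sample rate, -y: overwrite output
    # The input should not already be named *.wav, since the output would collide with it.
    ffmpeg -y -i "$1" -ac 1 -ar 16000 "$(wav_name "$1")"
}
```

For example, to_wav my_recording.mp3 (a hypothetical file name) would write my_recording.wav next to the input, which can then be passed to --audiofiles.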