Comments (14)
Yes.
from livespeechportraits.
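Hello, when training audio2mouth, do the mouth landmarks need any special alignment, such as the pre-registration in MakeItTalk or the face alignment used in other methods? How much do the audio features matter when training audio2mouth, and is it feasible to train the mouth landmarks directly on MFCC? Also, if the LLE from the paper is skipped, how much does that affect this part of the training?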
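- My landmarks are computed in a normalized 3D face space (similar to the registration in MakeItTalk).
- The audio features matter a lot; personally, I think MFCC generalizes worse.
- LLE is not involved in training; it is only a post-processing step.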
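As a rough illustration of what putting landmarks into a shared normalized space can mean, here is a minimal sketch that removes translation and scale from a set of 3D landmarks. It is an assumption-laden toy, not the author's pipeline: the function name `normalize_landmarks` is hypothetical, and a full registration like the one described above would also remove head rotation, which this sketch does not attempt.

```python
import math

def normalize_landmarks(points):
    """Center 3D landmarks at their centroid and scale them to unit RMS radius.

    Hypothetical helper: handles translation and scale only, not rotation.
    """
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    centered = [[p[i] - centroid[i] for i in range(3)] for p in points]
    # Root-mean-square distance of the landmarks from the centroid
    rms = math.sqrt(sum(c * c for p in centered for c in p) / n)
    if rms == 0.0:
        raise ValueError("degenerate landmark set: all points coincide")
    return [[c / rms for c in p] for p in centered]

# Toy data: the same shape, once as-is and once shifted and scaled by 3
shape_a = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.0)]
shape_b = [(10.0 + 3.0 * x, -5.0 + 3.0 * y, 1.0 + 3.0 * z) for x, y, z in shape_a]

norm_a = normalize_landmarks(shape_a)
norm_b = normalize_landmarks(shape_b)
```

After normalization, `norm_a` and `norm_b` agree up to floating-point error, so a model trained on them sees the same target regardless of where the face sat in the frame or how large it was.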
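Hello, have you tried training on a Chinese dataset? I tried it myself, but the results were poor. I trained the speech content branch of MakeItTalk, i.e. the audio2landmark part, and found that the mouth is not learned well; it barely moves. For feature extraction I used AutoVC as described in the paper, and I am not sure how much the domain shift (English -> Chinese) affects it, so I plan to try training directly on MFCC features. I found that the audio2landmark step is critical and also fairly hard to get right. Normalization is very important, but MakeItTalk does not provide its data preprocessing code. Training works with the data provided by the authors, but performs poorly on a batch of data I built myself.
Have you tried retraining MakeItTalk before? I am new to this area and only half understand many of the issues. I would like to build a Chinese demo based on MakeItTalk, but I am stuck at the audio2landmark step. Do you have any experience to share, or details that need special attention in this process? Also, do you know of any suitable Chinese datasets?
Thanks 🙏!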
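Sorry, I have not run any related tests, and I have not trained MakeItTalk. Landmark normalization is definitely important: the learning targets need to live in the same space.
For Chinese datasets, you could look at Common Voice and AISHELL.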
Thank you very much. These datasets seem to contain audio only, with no video. Are there any related datasets that include video?
Sorry, I am not familiar with that.
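Hello author, I looked through the details. The paper says video frames are extracted at 60 fps, and the MFCC features are computed with a frame length of 1/60 s but a frame shift of 1/120 s, so the MFCC sequence is twice as long as the video frame sequence. How are the frames aligned when training audio2mouth?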
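I did not use MFCC. I used the mel spectrogram as the shallow feature and fed it into APC to get the deep feature.
The 2:1 length ratio is easy to handle: just use two features to generate one frame.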
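The 2:1 pairing can be sketched as follows. This is only one possible reading of "use two features to generate one frame": the reply does not say whether the two feature vectors are concatenated, stacked, or consumed by a strided layer, so the concatenation here is an assumption, and `pair_features` is a hypothetical helper name. The toy 3-dimensional features stand in for real deep audio features.

```python
def pair_features(audio_feats, feats_per_frame=2):
    """Concatenate consecutive audio feature vectors so each group maps to one video frame.

    Hypothetical helper; trailing features that do not fill a full group are dropped.
    """
    usable = len(audio_feats) - len(audio_feats) % feats_per_frame
    return [
        [v for feat in audio_feats[i:i + feats_per_frame] for v in feat]
        for i in range(0, usable, feats_per_frame)
    ]

video_fps = 60            # video frames per second, as stated in the thread
audio_feature_rate = 120  # one audio feature every 1/120 s
seconds = 1

# Toy 3-dimensional features, one per audio step
audio_feats = [
    [float(t), float(t) + 0.1, float(t) + 0.2]
    for t in range(audio_feature_rate * seconds)
]

paired = pair_features(audio_feats)
```

For one second of audio this turns 120 feature vectors of dimension 3 into 60 vectors of dimension 6, one per video frame.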
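OK, so in practice the MFCC features are fed into the APC model, and the resulting number of predicted frames matches the video sequence length. Was the MFCC extraction designed this way with the characteristics of the APC model in mind?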
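There is no MFCC. APC is one kind of deep speech feature, and it uses the mel spectrogram as its shallow feature. Of course, you can also use deep features computed from the raw waveform, such as wav2vec.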
Hello, while training audio2mouth I found that the mouth landmarks can jump abruptly at a certain frame. Why is that?
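"Hello, while training audio2mouth I found that the mouth landmarks can jump abruptly at a certain frame. Why is that?"
Does this problem occur in LSP or in MakeItTalk? If you have ruled out audio noise, take a look at the hyperparameters. I ran into a similar problem when training MakeItTalk and solved it by tuning the hyperparameters.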
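Before tuning anything, it can help to locate exactly which frames jump. The sketch below is a generic diagnostic and not part of LSP or MakeItTalk: it flags frames whose mean landmark displacement from the previous frame exceeds a threshold. The function name `find_jumps`, the 2D toy data, and the threshold value are all assumptions chosen for illustration.

```python
import math

def find_jumps(frames, threshold):
    """Return indices of frames whose mean landmark displacement
    from the previous frame exceeds the threshold (hypothetical helper)."""
    jumps = []
    for t in range(1, len(frames)):
        prev, cur = frames[t - 1], frames[t]
        mean_disp = sum(math.dist(p, q) for p, q in zip(prev, cur)) / len(cur)
        if mean_disp > threshold:
            jumps.append(t)
    return jumps

# Toy sequence: two landmarks drifting slowly, with a glitch at frame 3
frames = [
    [(0.0, 0.00), (1.0, 0.00)],
    [(0.0, 0.01), (1.0, 0.01)],
    [(0.0, 0.02), (1.0, 0.02)],
    [(0.0, 0.50), (1.0, 0.50)],  # abrupt jump
    [(0.0, 0.04), (1.0, 0.04)],  # returns to the smooth trajectory
]

jump_frames = find_jumps(frames, threshold=0.1)
```

Here both the glitch frame and the frame that snaps back are flagged, so `jump_frames` is `[3, 4]`. Running the same check on the training targets and on the predictions shows whether the jumps are already present in the data or are introduced by the model.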
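- My landmarks are computed in a normalized 3D face space (similar to the registration in MakeItTalk).
- The audio features matter a lot; personally, I think MFCC generalizes worse.
- LLE is not involved in training; it is only a post-processing step.
Hello, how is APC_feat_database generated? For example, if I have a 5-minute talking video, is all 5 minutes of data used to generate it, or is a segment selected according to some rule?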