
talkshow's People

Contributors

feifeifeiliu, yhw-yhw


talkshow's Issues

ValueError: need at least one array to concatenate

Hello,

I am getting this ValueError while trying to run your scripts following the README.md.

ValueError: need at least one array to concatenate

This is the traceback for the error:

Traceback (most recent call last):
  File "scripts/train.py", line 10, in <module>
    trainer = Trainer()
  File "/data/users1/user/TalkSHOW/trainer/Trainer.py", line 72, in __init__
    self.init_dataloader()
  File "/data/users1/user/TalkSHOW/trainer/Trainer.py", line 168, in init_dataloader
    config=self.config
  File "/data/users1/user/TalkSHOW/data_utils/dataloader_torch.py", line 255, in __init__
    self.complete_data=np.concatenate(self.complete_data, axis=0)
  File "<__array_function__ internals>", line 6, in concatenate
ValueError: need at least one array to concatenate


For now, I am trying to load only two speakers, oliver and conan, to make sure everything works smoothly.

I have changed this in apply_split.py:

speakers = ['oliver', 'conan']

I did the same in body_vq.json, and set data_root to my dataset path.

Kindly let me know how to fix this issue.

Thanks a bunch in advance.
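
Not a definitive fix, but a quick way to narrow this down: the error fires when self.complete_data ends up as an empty list, i.e. the loader matched no files for any configured speaker, which usually points to a wrong data_root or an unexpected directory layout. A minimal sanity-check sketch, assuming one sub-directory per speaker (the path and layout here are assumptions, not the repo's confirmed structure):

import os

data_root = '/path/to/dataset'  # hypothetical; substitute your actual data_root
speakers = ['oliver', 'conan']

for speaker in speakers:
    speaker_dir = os.path.join(data_root, speaker)
    if not os.path.isdir(speaker_dir):
        print('missing speaker directory:', speaker_dir)
    else:
        # count every file below the speaker directory
        n_files = sum(len(files) for _, _, files in os.walk(speaker_dir))
        print(speaker, 'has', n_files, 'files')

If a speaker directory is missing or empty, the dataloader has nothing to concatenate and raises exactly this ValueError.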

Generalization Ability To Chinese

Really excellent project! I am wondering whether TalkSHOW could generalize to Chinese speech. If not, how can I make this possible? Perhaps I should create paired Chinese speech and SMPL-X labels (fitted from videos using the pipeline in your paper).

Thank you very much.

Configuration files are inconsistent

The configuration files under ./experiments in the pre-trained weights folder are inconsistent with the ones in ./TalkSHOW/config.
e.g. ./TalkSHOW/experiments/2022-11-02-smplx_S2G-body-pixel-3d/smplx_S2G.json vs. ./TalkSHOW/config/body_pixel.json
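
If it helps, here is a minimal sketch for listing exactly which keys differ between the two files (the paths are the ones from the example above; the flattening helper is just illustrative):

import json

def flatten(d, prefix=''):
    # flatten nested dicts into dotted-key/value pairs
    out = {}
    for k, v in d.items():
        key = prefix + k
        if isinstance(v, dict):
            out.update(flatten(v, key + '.'))
        else:
            out[key] = v
    return out

with open('experiments/2022-11-02-smplx_S2G-body-pixel-3d/smplx_S2G.json') as f:
    a = flatten(json.load(f))
with open('config/body_pixel.json') as f:
    b = flatten(json.load(f))

for key in sorted(set(a) | set(b)):
    if a.get(key) != b.get(key):
        print(key, ':', a.get(key), 'vs', b.get(key))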

Missing train_3d_mfcc.pkl?

I get the following error when running 'train_body_vq.sh'. It says that train_3d_mfcc.pkl is needed. Where can I get it?
(screenshot of the error)
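
For what it's worth, the filename suggests a cache of MFCC audio features. Purely as an assumption about its contents (this is NOT the repo's confirmed format), a minimal sketch of building such a cache with librosa:

import pickle
import librosa

audio_files = ['/path/to/clip1.wav', '/path/to/clip2.wav']  # hypothetical paths

features = {}
for path in audio_files:
    y, sr = librosa.load(path, sr=16000)  # resample to 16 kHz
    features[path] = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # (13, n_frames)

with open('train_3d_mfcc.pkl', 'wb') as f:
    pickle.dump(features, f)

The real file is presumably produced by the repo's own preprocessing, so this only illustrates the general idea.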

Possible Miscalculation of MSELoss in Face Generator

I would like to report a possible miscalculation of the loss in the face generator.

Issue description

Please have a look at the following code snippet:

TalkSHOW/nets/smplx_face.py

Lines 155 to 159 in 38aab30

MSELoss = torch.mean(torch.abs(pred_poses[:, :, :6] - gt_poses[:, :, :6]))
if self.expression:
    expl = torch.mean((pred_poses[:, :, -100:] - gt_poses[:, :, -100:])**2)
else:
    expl = 0

I believe the loss calculation at line 155 is wrong: the slice should go up to index 3, not 6, because jaw_pose has only 3 dimensions.

Note that pred_poses has shape (N, seq_length, 103), where the first 3 dimensions are the jaw_pose and the remaining 100 are the expression.

gt_poses has shape (N, seq_length, 265), where the first 3 dimensions are the jaw pose, the next 3 are the left eye, and the last 100 are the expression.

When we compute MSELoss = torch.mean(torch.abs(pred_poses[:, :, :6] - gt_poses[:, :, :6])), the first 3 jaw_pose features are compared correctly, but dimensions 3:6 compare 3 left-eye features from gt_poses against 3 expression features from pred_poses.

Proposed Fix:

I believe the correct way to calculate the loss is to change 6 to 3:
MSELoss = torch.mean(torch.abs(pred_poses[:, :, :3] - gt_poses[:, :, :3]))

Please let me know if my assertion is correct or whether I misunderstood something.
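
A minimal sketch that illustrates the mismatch with dummy tensors (the shapes and channel layouts are taken from the description above):

import torch

N, T = 2, 8
pred_poses = torch.randn(N, T, 103)  # 3 jaw + 100 expression
gt_poses = torch.randn(N, T, 265)    # 3 jaw + 3 left eye + ... + 100 expression

# current code: channels 3:6 of pred_poses are expression values,
# while channels 3:6 of gt_poses are left-eye values
bad = torch.mean(torch.abs(pred_poses[:, :, :6] - gt_poses[:, :, :6]))

# proposed fix: compare only the 3 jaw_pose channels
good = torch.mean(torch.abs(pred_poses[:, :, :3] - gt_poses[:, :, :3]))

print(bad.item(), good.item())  # both run, but only `good` compares matching features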

Can we still expect a Colab for TalkSHOW?

Thanks for the Colab for SHOW. That's very helpful. I'm still hopeful that there will be a Colab for TalkSHOW. Please let us know if we can still expect one. Thanks again!

Different Results with Paper

I got very different metric values for the face using your provided checkpoint, compared with the values in the paper.

In the paper:
(screenshot of the paper's metrics)

My test:
(screenshot of my measured metrics)

Questions about autoregressive models

Hi everyone, I have a couple of questions.

When I was training the pixel autoregressive model, I encountered the following two problems:

  1. Gated PixelCNN overfits very easily: after 1 or 2 epochs, the validation loss keeps rising.
  2. The autoregressive model has an identity-leakage problem: for example, when generating for speaker A, the motions and gestures of speaker B appear.

What strategies would you suggest for solving these problems?

@yhw-yhw @feifeifeiliu

Windows 11

Are there any clear instructions to follow for Windows 11 with a Conda environment?

File config/style_gestures.json is Missing

I did not find config/style_gestures.json in your repo. When I run test.py, it reports "FileNotFoundError: [Errno 2] No such file or directory: './config/style_gestures.json'", and when I run test_face.sh, it reports "ValueError: need at least one array to concatenate". I'm not sure how to fix this problem.

rendering problem

AttributeError: 'GLXPlatform' object has no attribute 'OSMesa'

The error occurs at /NeRF/TalkSHOW/scripts/diversity.py(340) in main().

I have tried:

sudo apt-get install libosmesa6
sudo apt-get install libosmesa6-dev

and set os.environ['PYOPENGL_PLATFORM'] = 'osmesa', but it still doesn't work.
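
One gotcha worth checking (not a guaranteed fix): PyOpenGL reads PYOPENGL_PLATFORM when it is first imported, so setting the variable after pyrender (or anything else that pulls in PyOpenGL) has already been imported has no effect. A minimal sketch of the required ordering:

# the environment variable must be set BEFORE pyrender/PyOpenGL is imported;
# otherwise the GLX platform is already selected and 'osmesa' is ignored
import os
os.environ['PYOPENGL_PLATFORM'] = 'osmesa'

import pyrender  # imported only after the variable is set

renderer = pyrender.OffscreenRenderer(viewport_width=640, viewport_height=480)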

new data training

Awesome work! I used new video data for training, but I find the test results are shaky and jittery.
The test data is the same as the training data, and the scores are:

LVD = 0.06410979024071979
error = 10.288773031787722
diverse = 0.006435648955709916

Is the training data too small, are other training parameters set incorrectly, or are the SHOW outputs themselves shaky?
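
Not a root-cause fix, but if the fitted SHOW outputs themselves are jittery, one common workaround is to temporally smooth the pose sequences. A minimal sketch using a Savitzky-Golay filter (the window and order are arbitrary starting points, and poses is a hypothetical (T, D) array, not the repo's actual data format):

import numpy as np
from scipy.signal import savgol_filter

T, D = 300, 265  # hypothetical sequence length and pose dimension
poses = np.random.randn(T, D).astype(np.float32)

# smooth along the time axis to reduce frame-to-frame jitter
smoothed = savgol_filter(poses, window_length=9, polyorder=2, axis=0)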
