First of all, thank you very much for providing the code, but I have encountered some

Some questions about a model trained on the 768-size TED dataset (articulated-animation, open)

snap-research commented on June 17, 2024

from articulated-animation.

Comments (4)

AliaksandrSiarohin commented on June 17, 2024

Hi, sorry, but your questions are really confusing:

I don't understand the question. There is no L1 loss in train mode.
What is 768?
Could you provide an example?
It depends on what objects will be in the new dataset.

Zenobia7 commented on June 17, 2024

I calculated the L1 loss on the reconstruction results of train mode, and the reconstruction results of avd mode are almost the same, so I think avd mode is not effective.
I cropped the TED dataset to 768*768.
The new dataset consists of half-speaker videos. Some videos from the new dataset are below. The new dataset is highly heterogeneous and diverse.
https://user-images.githubusercontent.com/28126038/182800076-b9e4dea5-d927-41cd-ab7d-038e2cfccbf3.mp4
https://user-images.githubusercontent.com/28126038/182800140-632904d1-27e7-4a4a-9ec2-142fc59e01b5.mp4
https://user-images.githubusercontent.com/28126038/182800340-c7f54217-72a0-4a01-99d4-6cd7c4ec64e9.mp4

3. Train mode visualization results

0gks6ceq4eQ.004737.004870.mp4.mp4

avd mode visualization Results

0gks6ceq4eQ.004737.004870.mp4.mp4

train log visualization

Would you be able to share your training log? I would like to compare it with mine. Thank you. Is anything still unclear?
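The L1 comparison mentioned in point 1 can be sketched roughly as below. This is only a minimal illustration, not code from the articulated-animation repository: `mean_l1` is a hypothetical helper, and the arrays are synthetic stand-ins for decoded frames, assumed to have the layout (frames, height, width, channels).

```python
import numpy as np


def mean_l1(video_a: np.ndarray, video_b: np.ndarray) -> float:
    """Mean absolute per-pixel difference between two equally shaped videos."""
    if video_a.shape != video_b.shape:
        raise ValueError(f"shape mismatch: {video_a.shape} vs {video_b.shape}")
    # Cast to float first so uint8 frames do not wrap around on subtraction.
    a = video_a.astype(np.float64)
    b = video_b.astype(np.float64)
    return float(np.abs(a - b).mean())


# Synthetic stand-ins (assumption): random "ground truth" frames in [0, 1]
# and two noisy copies playing the role of the two reconstructions.
rng = np.random.default_rng(0)
ground_truth = rng.random((4, 16, 16, 3))
recon_train = np.clip(ground_truth + rng.normal(0.0, 0.02, ground_truth.shape), 0.0, 1.0)
recon_avd = np.clip(ground_truth + rng.normal(0.0, 0.02, ground_truth.shape), 0.0, 1.0)

print("train mode L1:", mean_l1(recon_train, ground_truth))
print("avd mode L1:  ", mean_l1(recon_avd, ground_truth))
```

Both inputs need to be in the same value range (for example, both in [0, 1] or both in [0, 255]) for the two numbers to be comparable.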

AliaksandrSiarohin commented on June 17, 2024

Reconstruction does not make sense for avd, since it is specifically designed for cross-identity animation, where the shapes of the objects can differ.
There is no explicit handling of parts that are not visible most of the time; I guess you will have to devise some way of handling that.
I can't see what bothers you in the optical flow map.
Unfortunately, I don't have the logs anymore.

Zenobia7 commented on June 17, 2024

Thank you for your prompt reply.

Since there is no problem with the optical flow map, does that mean the problem is that the details of the reconstruction are not clear? Are the details unclear because the generator is not strong enough, or because the information in the optical flow map is not fully used?
Do you think it is OK for me to use half-speaker videos with complex backgrounds and inconsistent heights in my self-built dataset? At present, the loss decreases rapidly at first and then stops decreasing.