Comments (6)
Hi, thanks for posting this! This is likely because we only support recording up to 20 seconds of audio for now. The conditioning size for 20 seconds results in a max embedding sequence length of 1998, which unfortunately is hardcoded here:
audio2photoreal/model/diffusion.py
Line 136 in 548aeeb
I believe the above error occurs if you end up with an audio embedding longer than that sequence length. But please let me know if you're still having issues with recordings shorter than 20 seconds.
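Given the hardcoded cap described above, one workaround is to truncate the input audio before it reaches the model. This is a minimal sketch, not the project's actual preprocessing: the function name and the 48 kHz sample rate are assumptions, so adjust them to the demo's real values.

```python
import numpy as np

# Assumptions: MAX_SECONDS comes from the 20-second limit discussed above;
# SAMPLE_RATE and the function name are hypothetical placeholders.
MAX_SECONDS = 20
SAMPLE_RATE = 48_000

def clip_audio(audio: np.ndarray, sample_rate: int = SAMPLE_RATE) -> np.ndarray:
    """Truncate audio so the conditioning embedding stays under the hardcoded cap."""
    max_samples = MAX_SECONDS * sample_rate
    return audio[:max_samples]
```

Calling this on the raw waveform before inference keeps any recording, however long, within the 20-second window the model supports.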
from audio2photoreal.
Thanks for replying. After I posted this, I found that fixed number by searching the git history, but even after uploading a 6-second audio clip, I am getting the following error:
This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run gradio deploy
from Terminal to deploy to Spaces (https://huggingface.co/spaces)
100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:25<00:00, 3.98it/s]
created 3 samples
100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:11<00:00, 9.01it/s]
created 3 samples
0%| | 0/120 [00:03<?, ?it/s]
Traceback (most recent call last):
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/queueing.py", line 489, in call_prediction
output = await route_utils.call_process_api(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/route_utils.py", line 232, in call_process_api
output = await app.get_blocks().process_api(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/blocks.py", line 1561, in process_api
result = await self.call_function(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/blocks.py", line 1179, in call_function
prediction = await anyio.to_thread.run_sync(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
return await future
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/utils.py", line 678, in wrapper
response = f(*args, **kwargs)
File "/home/jupyter/audio2photoreal/demo/demo.py", line 232, in audio_to_avatar
gradio_model.body_renderer.render_full_video(
File "/home/jupyter/audio2photoreal/visualize/render_codes.py", line 153, in render_full_video
self._write_video_stream(
File "/home/jupyter/audio2photoreal/visualize/render_codes.py", line 94, in _write_video_stream
out = self._render_loop(motion, face)
File "/home/jupyter/audio2photoreal/visualize/render_codes.py", line 121, in _render_loop
preds = self.model(**default_inputs_copy)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/jupyter/audio2photoreal/visualize/ca_body/models/mesh_vae_drivable.py", line 301, in forward
enc_preds = self.encode(geom, lbs_motion, face_embs)
File "/home/jupyter/audio2photoreal/visualize/ca_body/models/mesh_vae_drivable.py", line 266, in encode
face_dec_preds = self.decoder_face(face_embs_hqlp)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/jupyter/audio2photoreal/visualize/ca_body/nn/face.py", line 80, in forward
texout = self.texmod(self.texmod2(encview).view(-1, 256, 4, 4))
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/container.py", line 217, in forward
input = module(input)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
File "/home/jupyter/audio2photoreal/visualize/ca_body/nn/layers.py", line 335, in forward
output = thf.conv_transpose2d(
RuntimeError: GET was unable to find an engine to execute this computation
All the torch and CUDA versions (11.7) are installed as per the requirements file.
Glad to hear the other issue is solved!
Regarding the GET issue, could you check whether your PyTorch and CUDA versions are compatible? Specifically, is your torch build compiled with CUDA 11.7, and is your system actually running CUDA 11.7?
The expected torch version can be installed with:
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
The mismatch seems to be the core reason you may be getting this error. Here are some links that are hopefully helpful:
- it could be that there is a mismatch between the CUDA version you're running and the CUDA version torch was compiled with link
- your CUDA version is not properly linked link
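To compare the two versions concretely, you can print `torch.__version__` and `torch.version.cuda` in Python and compare them against `nvcc --version` on the system. The small helper below is a hypothetical sketch (the function name is made up) that checks whether a torch wheel tag like `2.0.1+cu117` matches a system CUDA version string like `11.7`:

```python
def cuda_tag_matches(torch_version: str, system_cuda: str) -> bool:
    """Return True if a torch wheel tag like '2.0.1+cu117' matches system CUDA '11.7'."""
    if "+cu" not in torch_version:
        return False  # CPU-only wheel: no CUDA support compiled in
    wheel_tag = torch_version.split("+cu", 1)[1]      # e.g. '117'
    return wheel_tag == system_cuda.replace(".", "")  # '11.7' -> '117'
```

For example, `cuda_tag_matches("2.0.1+cu117", "11.7")` is True, while a `cu118` wheel on a CUDA 11.7 system would fail the check, which is the kind of mismatch that can surface as the conv_transpose2d engine error above.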
These issues have been solved; I resolved them by reinstalling CUDA 11.7. Thanks for the reply. One more question: if I want to replace the avatar with a new person without having any training data for them, how can we do that?
Glad to hear the issue is resolved.
Unfortunately, you can't replace the avatar with a new person without having training data. See my reply here #33