Comments (6)
Hi, thanks for posting this! This is likely because we only support recording up to 20 seconds of audio for now. The conditioning size for 20 seconds results in a max embedding sequence length of 1998, which unfortunately is hardcoded here:
audio2photoreal/model/diffusion.py
Line 136 in 548aeeb
I believe the above error occurs if you end up with an audio embedding longer than that sequence length. But please let me know if you're still having issues with recordings shorter than 20 seconds.
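Given the hardcoded cap described above, one workaround is to truncate the input audio before it reaches the model. This is a minimal sketch, not the project's actual preprocessing: the function name and the 48 kHz sample rate are assumptions, so adjust them to the demo's real values.

```python
import numpy as np

# Assumptions: MAX_SECONDS comes from the 20-second limit discussed above;
# SAMPLE_RATE and the function name are hypothetical placeholders.
MAX_SECONDS = 20
SAMPLE_RATE = 48_000

def clip_audio(audio: np.ndarray, sample_rate: int = SAMPLE_RATE) -> np.ndarray:
    """Truncate audio so the conditioning embedding stays under the hardcoded cap."""
    max_samples = MAX_SECONDS * sample_rate
    return audio[:max_samples]
```

Calling this on the raw waveform before inference keeps any recording, however long, within the 20-second window the model supports.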
from audio2photoreal.
Thanks for replying. After I posted this, I found that fixed number by searching the git history, but even after uploading a 6-second audio clip, I am getting the following error:
This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run gradio deploy
from Terminal to deploy to Spaces (https://huggingface.co/spaces)
100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:25<00:00, 3.98it/s]
created 3 samples
100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [00:11<00:00, 9.01it/s]
created 3 samples
0%| | 0/120 [00:03<?, ?it/s]
Traceback (most recent call last):
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/queueing.py", line 489, in call_prediction
output = await route_utils.call_process_api(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/route_utils.py", line 232, in call_process_api
output = await app.get_blocks().process_api(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/blocks.py", line 1561, in process_api
result = await self.call_function(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/blocks.py", line 1179, in call_function
prediction = await anyio.to_thread.run_sync(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
return await future
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/gradio/utils.py", line 678, in wrapper
response = f(*args, **kwargs)
File "/home/jupyter/audio2photoreal/demo/demo.py", line 232, in audio_to_avatar
gradio_model.body_renderer.render_full_video(
File "/home/jupyter/audio2photoreal/visualize/render_codes.py", line 153, in render_full_video
self._write_video_stream(
File "/home/jupyter/audio2photoreal/visualize/render_codes.py", line 94, in _write_video_stream
out = self._render_loop(motion, face)
File "/home/jupyter/audio2photoreal/visualize/render_codes.py", line 121, in _render_loop
preds = self.model(**default_inputs_copy)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/jupyter/audio2photoreal/visualize/ca_body/models/mesh_vae_drivable.py", line 301, in forward
enc_preds = self.encode(geom, lbs_motion, face_embs)
File "/home/jupyter/audio2photoreal/visualize/ca_body/models/mesh_vae_drivable.py", line 266, in encode
face_dec_preds = self.decoder_face(face_embs_hqlp)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/jupyter/audio2photoreal/visualize/ca_body/nn/face.py", line 80, in forward
texout = self.texmod(self.texmod2(encview).view(-1, 256, 4, 4))
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/container.py", line 217, in forward
input = module(input)
File "/opt/conda/envs/a2p_env/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
File "/home/jupyter/audio2photoreal/visualize/ca_body/nn/layers.py", line 335, in forward
output = thf.conv_transpose2d(
RuntimeError: GET was unable to find an engine to execute this computation
All the torch and CUDA versions (11.7) are installed as per the requirements file.
Glad to hear the other issue is solved!
Regarding the GET issue, could you check whether your PyTorch and CUDA versions are compatible? Specifically, is your torch build compiled with CUDA 11.7, and is your system actually running CUDA 11.7?
The expected torch version can be installed with:
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2
The mismatch seems to be the core reason you may be getting this error. Here are some links that are hopefully helpful:
- it could be that there is a mismatch between the CUDA version you're running and the CUDA version torch was compiled with link
- your CUDA version is not properly linked link
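To compare the two versions concretely, you can print `torch.__version__` and `torch.version.cuda` in Python and compare them against `nvcc --version` on the system. The small helper below is a hypothetical sketch (the function name is made up) that checks whether a torch wheel tag like `2.0.1+cu117` matches a system CUDA version string like `11.7`:

```python
def cuda_tag_matches(torch_version: str, system_cuda: str) -> bool:
    """Return True if a torch wheel tag like '2.0.1+cu117' matches system CUDA '11.7'."""
    if "+cu" not in torch_version:
        return False  # CPU-only wheel: no CUDA support compiled in
    wheel_tag = torch_version.split("+cu", 1)[1]      # e.g. '117'
    return wheel_tag == system_cuda.replace(".", "")  # '11.7' -> '117'
```

For example, `cuda_tag_matches("2.0.1+cu117", "11.7")` is True, while a `cu118` wheel on a CUDA 11.7 system would fail the check, which is the kind of mismatch that can surface as the conv_transpose2d engine error above.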
These issues have been solved; I resolved them by reinstalling CUDA 11.7. Thanks for the reply. One more question: if I want to replace the avatar with a new person without having any training data for them, how can we do that?
Glad to hear the issue is resolved.
Unfortunately, you can't replace the avatar with a new person without having training data. See my reply here #33