
followyourpose's Introduction

🕺🕺🕺 Follow-Your-Pose 💃💃💃
Pose-Guided Text-to-Video Generation using Pose-Free Videos (AAAI 2024)

Yue Ma*, Yingqing He*, Xiaodong Cun, Xintao Wang, Siran Chen, Ying Shan, Xiu Li, and Qifeng Chen

Open In Colab Hugging Face Spaces Open in OpenXLab visitors GitHub

"The man is sitting on chair, on the park" "The Iron man, on the street "
"The stormtrooper, in the gym " "The astronaut, earth background, Cartoon Style "

💃💃💃 Demo Video

demo3.mp4

💃💃💃 Abstract

TL;DR: We tune a text-to-image model (e.g., Stable Diffusion) to generate character videos from a pose sequence and a text description.

Full abstract

Generating text-editable and pose-controllable character videos is in high demand for creating various digital humans. Nevertheless, this task has been restricted by the absence of a comprehensive dataset with paired video-pose captions and of generative prior models for videos. In this work, we design a novel two-stage training scheme that uses easily obtained datasets (i.e., image-pose pairs and pose-free videos) and a pre-trained text-to-image (T2I) model to obtain pose-controllable character videos. Specifically, in the first stage, only keypoint-image pairs are used for controllable text-to-image generation: we learn a zero-initialized convolutional encoder to encode the pose information. In the second stage, we fine-tune the motion of the above network on a pose-free video dataset by adding learnable temporal self-attention and reformed cross-frame self-attention blocks. Powered by our new designs, our method successfully generates continuously pose-controllable character videos while keeping the editing and concept-composition abilities of the pre-trained T2I model. The code and models will be made publicly available.
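
To make the first-stage design concrete, below is a minimal PyTorch sketch of a zero-initialized convolutional pose encoder; the layer widths, depth, and names are illustrative assumptions, not the repository's exact architecture.

import torch
import torch.nn as nn

class PoseEncoderSketch(nn.Module):
    """Illustrative pose encoder: a small conv stack whose output projection is
    zero-initialized, so at step 0 the pre-trained T2I model is left untouched."""

    def __init__(self, in_channels: int = 3, hidden: int = 64, out_channels: int = 320):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, hidden, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv2d(hidden, hidden, kernel_size=3, stride=2, padding=1),
            nn.SiLU(),
        )
        self.proj = nn.Conv2d(hidden, out_channels, kernel_size=1)
        nn.init.zeros_(self.proj.weight)   # zero init: the pose branch contributes nothing at first
        nn.init.zeros_(self.proj.bias)

    def forward(self, pose_map: torch.Tensor) -> torch.Tensor:
        # pose_map: (B, 3, H, W) rendered skeleton image; the output is added to UNet features
        return self.proj(self.backbone(pose_map))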

🕺🕺🕺 Changelog

  • [2024.03.15] 🔥🔥🔥 We release our second follower, Follow-Your-Click, the first framework to achieve regional image animation. Try it now, and please give us a star! ⭐️⭐️⭐️ 😄
  • [2023.12.09] 🔥 The paper is accepted by AAAI 2024!
  • [2023.08.30] 🔥 Release some new results!
  • [2023.07.06] 🔥 Release a new version of the demo on OpenXLab (浦源内容平台)! Thanks to Shanghai AI Lab for their support!
  • [2023.04.12] 🔥 Release the local Gradio demo; you can run it locally with just an A100 or a 3090.
  • [2023.04.11] 🔥 Release some example cases in the Hugging Face demo.
  • [2023.04.10] 🔥 Release a new version of the Hugging Face Spaces demo, which supports both raw video and skeleton video as input. Enjoy it!
  • [2023.04.07] Release the first version of the Hugging Face demo. Enjoy the fun of following your pose! You need to download a skeleton video or make your own skeleton video with mmpose. A second version that accepts raw video as input is coming.
  • [2023.04.07] Release a Colab notebook and update the installation requirements!
  • [2023.04.06] Release code, configs, and checkpoints!
  • [2023.04.03] Release the paper and project page!

💃💃💃 HuggingFace Demo

🎤🎤🎤 Todo

  • Release the code, config and checkpoints for teaser
  • Colab
  • Hugging Face Gradio demo
  • Release more applications

๐Ÿป๐Ÿป๐Ÿป Setup Environment

Our method is trained with CUDA 11, accelerate, and xformers on 8 A100 GPUs.

conda create -n fupose python=3.8
conda activate fupose

pip install -r requirements.txt

xformers is recommended on A100 GPUs to save memory and running time.

xformers installation

We find its installation can be unstable. You may try the following wheel:

wget https://github.com/ShivamShrirao/xformers-wheels/releases/download/4c06c79/xformers-0.0.15.dev0+4c06c79.d20221201-cp38-cp38-linux_x86_64.whl
pip install xformers-0.0.15.dev0+4c06c79.d20221201-cp38-cp38-linux_x86_64.whl

Our environment is similar to Tune-A-Video (official, unofficial). You may check those repos for more details.
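
As a quick post-install sanity check, the short Python snippet below (not part of the repo; adjust to your setup) verifies that CUDA is visible and xformers is importable:

# Quick environment sanity check.
import torch

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
try:
    import xformers
    print("xformers:", xformers.__version__)
except ImportError:
    print("xformers not installed; memory-efficient attention will be unavailable")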

💃💃💃 Training

We fix a bug in Tune-A-Video and fine-tune Stable Diffusion 1.4 on 8 A100 GPUs. To fine-tune the text-to-image diffusion model for text-to-video generation, run this command (a sketch of the cross-frame attention used in this stage follows the command):

TORCH_DISTRIBUTED_DEBUG=DETAIL accelerate launch \
    --multi_gpu --num_processes=8 --gpu_ids '0,1,2,3,4,5,6,7' \
    train_followyourpose.py \
    --config="configs/pose_train.yaml" 

🕺🕺🕺 Inference

Once the training is done, run inference:

TORCH_DISTRIBUTED_DEBUG=DETAIL accelerate launch \
    --gpu_ids '0' \
    txt2video.py \
    --config="configs/pose_sample.yaml" \
    --skeleton_path="./pose_example/vis_ikun_pose2.mov"

You can make the pose video with mmpose; we detect the skeleton with HRNet. Just run the mmpose video demo to obtain the pose video, and remember to replace the background with black (a hedged sketch of this step is shown below).
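
The following is a sketch of one way to render such a black-background skeleton video with the mmpose 0.x top-down API; the HRNet config/checkpoint paths, file names, and API details are assumptions, so check your installed mmpose version and the official video demo.

import cv2
import numpy as np
from mmpose.apis import init_pose_model, inference_top_down_pose_model, vis_pose_result

# Assumed HRNet config/checkpoint from the mmpose model zoo; replace with your local paths.
pose_model = init_pose_model(
    'configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w48_coco_256x192.py',
    'hrnet_w48_coco_256x192.pth',
    device='cuda:0')

cap = cv2.VideoCapture('dance.mp4')          # your raw input video (assumed name)
fps = cap.get(cv2.CAP_PROP_FPS) or 25
writer = None
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # With person_results=None, the whole frame is treated as a single person box.
    pose_results, _ = inference_top_down_pose_model(pose_model, frame, person_results=None)
    canvas = np.zeros_like(frame)            # draw the skeleton on a black background
    vis = vis_pose_result(pose_model, canvas, pose_results, kpt_score_thr=0.3)
    if writer is None:
        h, w = vis.shape[:2]
        writer = cv2.VideoWriter('skeleton.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
    writer.write(vis)
cap.release()
writer.release()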

💃💃💃 Local Gradio Demo

You can run the Gradio demo locally; it only needs an A100 or a 3090.

python app.py

The demo then runs on a local URL: http://0.0.0.0:<port> (see the sketch below for choosing the port).
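
If the default port is busy or you want to expose the demo on a specific port, Gradio's launch() accepts an explicit host and port. A minimal sketch follows; the demo object here is a placeholder, so adapt it to the Blocks app built in app.py.

# Hedged sketch: launching a Gradio Blocks app on an explicit host/port.
import gradio as gr

with gr.Blocks() as demo:
    gr.Markdown("Follow-Your-Pose local demo placeholder")

demo.launch(server_name="0.0.0.0", server_port=7860)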

🕺🕺🕺 Weights

[Stable Diffusion] Stable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input. The pre-trained Stable Diffusion models can be downloaded from Hugging Face (e.g., Stable Diffusion v1-4).

[FollowYourPose] We also provide our pretrained checkpoints on Hugging Face. You can download them and put them into the checkpoints folder to run inference with our models; a download sketch follows the directory layout below.

FollowYourPose
├── checkpoints
│   ├── followyourpose_checkpoint-1000
│   │   ├── ...
│   ├── stable-diffusion-v1-4
│   │   ├── ...
│   └── pose_encoder.pth
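
One hedged way to fetch the public Stable Diffusion weights into this layout is with huggingface_hub; the snippet below is a sketch (it assumes a recent huggingface_hub release with local_dir support), and the FollowYourPose checkpoints themselves should be downloaded from the Hugging Face page linked above.

# Download Stable Diffusion v1-4 into the expected checkpoints layout.
# Requires `pip install huggingface_hub` and, if the model is gated, a prior
# `huggingface-cli login` after accepting the license on Hugging Face.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="CompVis/stable-diffusion-v1-4",
    local_dir="checkpoints/stable-diffusion-v1-4",
)
# Place the FollowYourPose checkpoint folder and pose_encoder.pth (from the project's
# Hugging Face page) next to it, matching the tree above.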

💃💃💃 Results

We show our results regarding various pose sequences and text prompts.

Note: the mp4 and GIF files on this GitHub page are compressed. Please check our Project Page for the original mp4 video results.

"Trump, on the mountain " "man, on the mountain " "astronaut, on mountain"
"girl, simple background" "A Iron man, on the beach" "A Hulk, on the mountain"
"A policeman, on the street" "A girl, in the forest" "A Iron man, on the street"
"A Robot, in Sahara desert" "A Iron man, on the beach" "A panda, son the sea"
"A man in the park, Van Gogh style" "The fireman in the beach" "Batman, brown background"
"A Hulk, on the sea" "A superman, in the forest" "A Iron man, in the snow"
"A man in the forest, Minecraft." "A man in the sea, at sunset" "James Bond, grey simple background"
"A Panda on the sea." "A Stormtrooper on the sea" "A astronaut on the moon"
"A astronaut on the moon." "A Robot in Antarctica." "A Iron man on the beach."
"The Obama in the desert" "Astronaut on the beach." "Iron man on the snow"
"A Stormtrooper on the sea" "A Iron man on the beach." "A astronaut on the moon."
"Astronaut on the beach" "Superman on the forest" "Iron man on the beach"
"Astronaut on the beach" "Robot in Antarctica" "The Stormtroopers, on the beach"

🎼🎼🎼 Citation

If you find this project helpful, please feel free to leave a star ⭐️⭐️⭐️ and cite our paper:

@article{ma2023follow,
  title={Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos},
  author={Ma, Yue and He, Yingqing and Cun, Xiaodong and Wang, Xintao and Shan, Ying and Li, Xiu and Chen, Qifeng},
  journal={arXiv preprint arXiv:2304.01186},
  year={2023}
}

👯👯👯 Acknowledgements

This repository borrows heavily from Tune-A-Video and FateZero. Thanks to the authors for sharing their code and models.

🕺🕺🕺 Maintenance

This is the codebase for our research work. We are still working hard to update this repo, and more details are coming in the next few days. If you have any questions or ideas to discuss, feel free to contact Yue Ma, Yingqing He, or Xiaodong Cun.

โญ๏ธโญ๏ธโญ๏ธ Star History

Star History Chart


followyourpose's Issues

SD AUTOMATIC1111 UI extension?

The project looks really promising so far!
But sadly I'm not able to make it work in the Hugging Face demo. Are you planning on releasing an extension for AUTOMATIC1111's UI?

No such file or directory: 'Your dataset path/caption_rm2048_train.csv'

In the training command below, it seems the script cannot find a file named "caption_rm2048_train.csv", which hdvila.py tries to load. Would you please provide the file or guide me through getting it? Thank you.

TORCH_DISTRIBUTED_DEBUG=DETAIL accelerate launch train_followyourpose.py --config="configs/pose_train.yaml"

A portion of the error log is below:

File "/home/cc/FollowYourPose/followyourpose/data/hdvila.py", line 109, in _load_metadata
with open(caption_path, 'r',encoding="utf-8") as csvfile: #41s
FileNotFoundError: [Errno 2] No such file or directory: 'Your dataset path/caption_rm2048_train.csv'

What parts of Cross-Frame Attention have been reformed in your project relative to Tune-A-Video?

As the authors mention in the abstract: "In the second stage, we finetune the motion of the above network via a pose-free video dataset by adding the learnable temporal self-attention and reformed cross-frame self-attention blocks."

Can I understand the cross-frame attention mentioned in your paper to be the SparseCausalAttention class in your open-source code, which is the same as the SparseCausalAttention class written in Tune-A-Video? If so, how is the cross-frame attention reformed in your project, and which part of the code embodies it?

Hand + face Pose Guide to generate

Hi,
Is it possible to generate a single, consistent character from a pose for about 5 seconds?

I have a pose video (OpenPose + hands + face) and I was wondering if it is possible to generate an output video of about 5 seconds with a consistent character/avatar that dances, etc., following the controlled (pose) input.

I want to generate a human-like animation (no matter what, just a consistent character/avatar).
Sample Video

Thanks
Best regards

Cannot connect to the server

[W socket.cpp:697] [c10d] The client socket has failed to connect to [DESKTOP-SB3DEO9]:29500 (system error: 10049 - The requested address is not valid in its context.).

The web paper shows examples of poses, but they all seem to be about "dancing"?

Hello,
I wanted to know how many types of poses there are, please? And how much control do we have?

I actually tried to read the prompts, and the prompts were never used to describe the pose animations; am I wrong?
I did not install it, but I wanted to learn more before starting to use it. Can we actually have some realistic control over the poses, please? Thanks.

Dataset

Hi,
Great work! Are there any plans to release the LAION-Pose dataset?

Is Tesla T4 usable in Colab?

I noticed in quick_demo.ipynb that the GPU used is a Tesla T4 and the whole process seems to be fine. But when I run it myself, I get a CUDA out-of-memory error. Your homepage says that it needs an A100/3090, so I want to know whether a Tesla T4 is usable in Colab, and how to fix the CUDA out-of-memory error in Colab without changing the GPU. Thanks a lot!

dataset

Excuse me, can you explain how to get the dataset needed for training? I did not figure it out. Thanks!

error colab

/content/FollowYourPose
/content/FollowYourPose
The following values were not passed to `accelerate launch` and had defaults used instead:
	`--num_processes` was set to a value of `1`
	`--num_machines` was set to a value of `1`
	`--mixed_precision` was set to a value of `'no'`
	`--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: module 'triton.language' has no attribute 'constexpr'
/usr/local/lib/python3.8/dist-packages/torchvision/transforms/_functional_video.py:6: UserWarning: The 'torchvision.transforms._functional_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms.functional' module instead.
  warnings.warn(
/usr/local/lib/python3.8/dist-packages/torchvision/transforms/_transforms_video.py:22: UserWarning: The 'torchvision.transforms._transforms_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms' module instead.
  warnings.warn(
Traceback (most recent call last):
  File "txt2video.py", line 28, in <module>
    from followyourpose.pipelines.pipeline_followyourpose import FollowYourPosePipeline
  File "/content/FollowYourPose/followyourpose/pipelines/pipeline_followyourpose.py", line 43, in <module>
    class FollowYourPosePipeline(DiffusionPipeline):
  File "/content/FollowYourPose/followyourpose/pipelines/pipeline_followyourpose.py", line 333, in FollowYourPosePipeline
    **kwargs,
NameError: name 'kwargs' is not defined
Traceback (most recent call last):
  File "/usr/local/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/accelerate_cli.py", line 45, in main
    args.func(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 1097, in launch_command
    simple_launcher(args)
  File "/usr/local/lib/python3.8/dist-packages/accelerate/commands/launch.py", line 552, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/usr/bin/python3.8', 'txt2video.py', '--config=configs/pose_sample.yaml', '--skeleton_path=./pose_example/vis_ikun_pose2.mov']' returned non-zero exit status 1.

pose video

Hi, can you elaborate on the production process of the pose video? Why do the videos I make with mmpose always come out with key points? Thank you.

Could not load the Space: fffiloni/mmpose-estimation

Fetching Space from: https://huggingface.co/spaces/fffiloni/mmpose-estimation
Traceback (most recent call last):
File "/root/miniconda3/envs/env39/lib/python3.9/site-packages/gradio/external.py", line 436, in from_spaces
config = json.loads(result.group(1)) # type: ignore
AttributeError: 'NoneType' object has no attribute 'group'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/root/autodl-tmp/FollowYourPose/app.py", line 28, in
pipe = merge_config_then_run()
File "/root/autodl-tmp/FollowYourPose/inference_followyourpose.py", line 25, in init
self.mmpose = gr.load(name="spaces/fffiloni/mmpose-estimation")
File "/root/miniconda3/envs/env39/lib/python3.9/site-packages/gradio/external.py", line 68, in load
return load_blocks_from_repo(
File "/root/miniconda3/envs/env39/lib/python3.9/site-packages/gradio/external.py", line 107, in load_blocks_from_repo
blocks: gradio.Blocks = factory_methods[src](name, api_key, alias, **kwargs)
File "/root/miniconda3/envs/env39/lib/python3.9/site-packages/gradio/external.py", line 438, in from_spaces
raise ValueError("Could not load the Space: {}".format(space_name))
ValueError: Could not load the Space: fffiloni/mmpose-estimation

This used to work for me, but today it suddenly stopped working. Has anyone had the same problem?

Training Code

Is the output config file from running the training command:

TORCH_DISTRIBUTED_DEBUG=DETAIL accelerate launch \
    --multi_gpu --num_processes=8 --gpu_ids '0,1,2,3,4,5,6,7' \
    train_followyourpose.py \
    --config="configs/pose_train.yaml" 

supposed to be used in the Inference code?:

TORCH_DISTRIBUTED_DEBUG=DETAIL accelerate launch \
    --gpu_ids '0' \
    txt2video.py \
    --config="configs/pose_sample.yaml" \
    --skeleton_path="./pose_example/vis_ikun_pose2.mov"

The output config.yaml file doesn't seem to look like the pose_sample.yaml file that is supposed to be used in the inference command.

Encoder path hardcoded.

Hey, thank you for the repo! Really cool.

While trying to get the sample running, I realized that the encoder path is hardcoded in followyourpose/models/unet.py, line 215:

adapter_weight = torch.load('./checkpoints/pose_encoder.pth')

I think I can open a PR so that the path can be passed in through the OmegaConf config, but before I do that, I just want to ask whether this is some kind of workaround that I'm not aware of.

Cheers!

Code for stage 1 training

Hello,

Can you share the code for the first stage of training your model (pose-controllable text-to-image generation)?

Thank you in advance.

mmpose TypeError: wrapper() got an unexpected keyword argument 'fn_index'

Traceback (most recent call last):
File "/FollowYourPose/app.py", line 177, in
gr.Examples(examples=examples,
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/gradio/helpers.py", line 75, in create_examples
examples_obj.create()
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/gradio/helpers.py", line 301, in create
client_utils.synchronize_async(self.cache)
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/gradio_client/utils.py", line 808, in synchronize_async
return fsspec.asyn.sync(fsspec.asyn.get_loop(), func, *args, **kwargs) # type: ignore
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/fsspec/asyn.py", line 103, in sync
raise return_result
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/fsspec/asyn.py", line 56, in _runner
result[0] = await coro
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/gradio/helpers.py", line 362, in cache
prediction = await Context.root_block.process_api(
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/gradio/blocks.py", line 1561, in process_api
result = await self.call_function(
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/gradio/blocks.py", line 1179, in call_function
prediction = await anyio.to_thread.run_sync(
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
return await future
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/gradio/utils.py", line 695, in wrapper
response = f(*args, **kwargs)
File "/FollowYourPose/inference_followyourpose.py", line 50, in run
infer_skeleton(self.mmpose, data_path)
File "/FollowYourPose/inference_mmpose.py", line 97, in infer_skeleton
mmpose_frame = get_mmpose_filter(mmpose, i)
File "/FollowYourPose/inference_mmpose.py", line 63, in get_mmpose_filter
image = mmpose(i, fn_index=0)[1]
File "/root/miniconda3/envs/python-app/lib/python3.9/site-packages/gradio/events.py", line 74, in call
return self.fn(*args, **kwargs)
TypeError: wrapper() got an unexpected keyword argument 'fn_index'

how to enhance the generation quality?

Hello, I've been using your project for action video generation. I conducted experiments on a V100-32G GPU, inputting a skeleton video of around seven seconds and trying various prompts. However, the generated results didn't quite match the showcased results. I'm wondering, without shortening the video, which parameters can be modified to enhance the generation quality?

Plan to support new version of diffusers?

Hello, I would like to personalize the original SD 1.4 model with DreamBooth and integrate it with your pipeline for inference. However, I use the latest version of diffusers to train DreamBooth, so when loading the model I encounter this error:

ValueError: unknown mid_block_type : UNetMidBlock2DCrossAttn

Would you please help me with this error?

Stage 1 code

Hi

I am looking for the training code for stage 1 (the pose encoder) in this repo, but didn't find it. Will this code be released, or do you have any suggestions for training on my own pose or other condition datasets?

Thanks!

LAION-Pose Link

Hi,

I was looking for links to the LAION-Pose dataset from the paper. I want to help a team in the Hugging Face JAX sprint train a MediaPipe hand-tracking annotator for ControlNet.

Is the dataset publicly available?

What will be needed if we want to generate a higher-fps video sequence?

The current demo from this repo seems to generate 4-8 fps videos pretty decently. However, to make the video really useful, I imagine we would want smoother video, and 30 fps would be a better target.
Is it easy to configure the current code to generate a 30 fps video? The computation will increase for sure, but is there anything else we should be mindful of? For example, background flickering might become more obvious as we increase the fps; what is the best way to achieve the best quality when increasing the fps with the current code?
