Comments (15)
@maxin-cn How can I generate train_256_list.txt? Are there any processing scripts I can follow if I want to train with ucf101_img_train.yaml?
Please refer to the following code (first convert the videos to frames and resize them at the same time; you can refer to this):

import os
from tqdm import tqdm

ffs_image_root = '/UCF101/images_256/'
ffs_image_txt = '/UCF101/train_256_list.txt'

def get_filelist(file_path):
    # Recursively collect the full paths of all frame images.
    filelist = []
    for home, dirs, files in os.walk(file_path):
        for filename in files:
            filelist.append(os.path.join(home, filename))
    return filelist

ffs_files = get_filelist(ffs_image_root)

# Write each path relative to the image root, one per line.
# Open the file once in 'w' mode so repeated runs don't append duplicates.
with open(ffs_image_txt, 'w') as f:
    for i in tqdm(ffs_files):
        relative_path = i.split(ffs_image_root)[-1]
        f.write(relative_path + '\n')
from latte.
Dear authors,
I really appreciate your exceptional open-source work. As a newcomer to the video field, I have been exploring your repository and have a couple of questions that I hope you can help clarify.
- Training Step vs. Epoch: the training procedure specifies max_train_steps rather than max_train_epochs. My understanding is that when using multiple GPUs, the steps per epoch decrease, but the total number of epochs increases, which doesn't seem to leverage the acceleration benefits of multi-GPU training. Could you please explain the rationale behind this setting? And why does Fig. 8 use at most 150k training iterations, while ffs_train.yaml and ucf101_train.yaml adopt max_train_steps=1000k?
- UCF-101 Dataset Training Details: I would also like to ask about the training resources, time, and performance of the model on the UCF-101 dataset. According to the code, training the Latte-S/2 model on UCF-101 with two 32G NVIDIA V100 GPUs takes approximately 2.5 minutes per 100 steps. This would translate to an 18-day training period for 1 million steps, which seems quite extensive. Could you provide details on the configuration used for training? Information on the training time and Inception Score (IS) of the Latte-S/2 model on UCF-101 would also be invaluable. You mentioned in Issue #39 that 8 cards were used for all experiments, and I wonder which type of GPUs were used.

Thank you for your time and for maintaining such a fantastic resource for the community.
Best regards,
Shun Lu
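The 18-day figure above follows from quick arithmetic on the quoted measurement (a sanity check, not new data):

```python
# Measured throughput from the question: ~2.5 minutes per 100 steps on 2x V100.
minutes_per_step = 2.5 / 100
total_steps = 1_000_000

total_minutes = minutes_per_step * total_steps  # 25,000 minutes
total_days = total_minutes / 60 / 24            # ~17.4 days

print(round(total_days, 1))  # ~17.4, i.e. roughly the 18 days quoted above
```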
Hi, thanks for your interest.
- The max_train_steps parameter in ffs_train.yaml does not necessarily mean that the model needs to be trained for the full 1000k steps; you can consider it a maximum value.
- The configuration in this repository is almost the same as the one I used. I used 8 A100 GPUs (80G) to conduct all the experiments shown in the paper (except for LatteT2V).
If you have any questions, please let me know.
Thanks for your great work! I want to know how to generate /path/to/datasets/UCF101/train_256_list.txt for UCF101 training. After downloading the UCF101 videos, and given the paper's statement "We extract 16-frame video clips from these datasets", are there any processing scripts we can follow?
Hi, thanks for your interest. train_256_list.txt contains the following information (video-class-name_video-name_frame):
As for the second question, please follow this link.
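For the frame-extraction step itself, a rough sketch (not the authors' script) is to drive ffmpeg once per video, resizing the short side to 256. build_ffmpeg_cmd and the paths below are illustrative assumptions, and ffmpeg must be installed for the command to actually run:

```python
import os
import tempfile

def build_ffmpeg_cmd(video_path, out_dir, short_side=256):
    """Build (but do not run) an ffmpeg command dumping resized frames.

    The scale filter resizes the short side to `short_side` while keeping
    the aspect ratio; %06d matches the zero-padded frame names
    (e.g. 000148.jpg) shown elsewhere in this thread.
    """
    scale = (
        f"scale='if(gt(iw,ih),-2,{short_side})'"
        f":'if(gt(iw,ih),{short_side},-2)'"
    )
    return [
        "ffmpeg", "-i", video_path,
        "-vf", scale,
        os.path.join(out_dir, "%06d.jpg"),
    ]

# Illustrative paths only; in practice, loop this over every UCF101 video.
out_dir = os.path.join(tempfile.gettempdir(), "v_WritingOnBoard_g02_c04")
cmd = build_ffmpeg_cmd("/UCF101/videos/v_WritingOnBoard_g02_c04.avi", out_dir)
print(" ".join(cmd))
```

After extracting frames for every video this way, the listing script earlier in the thread can generate train_256_list.txt from the resulting directory tree.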
Hi, @maxin-cn
I processed the UCF101 dataset following your above suggestions and the data is organized as follows:
But when I try to start training using train_with_img.py, it freezes with very high CPU usage. Any idea?
[2024-03-01 12:24:03] Experiment directory created at ./results_img/000-LatteIMG-S-2-F16S3-ucf101_img
Starting rank=1, local rank=1, seed=3408, world_size=2.
[2024-03-01 12:24:05] Model Parameters: 32,624,288
[2024-03-01 12:24:07] Dataset contains 2,486,613 videos (./data/UCF-101)
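One possible culprit (a guess, not confirmed in this thread) is the cost of building the index over ~2.5 million frame entries before training starts; the log's "2,486,613 videos" presumably counts frame entries from train_256_list.txt. A stdlib-only sketch to time just the list parsing, with load_frame_list as a made-up stand-in for the dataset's indexing:

```python
import time

def load_frame_list(lines):
    """Split 'ClipDir/000148.jpg' entries into (clip, frame) pairs.

    `lines` stands in for the contents of train_256_list.txt.
    """
    entries = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        clip, frame = line.rsplit("/", 1)
        entries.append((clip, frame))
    return entries

# Synthetic stand-in for a large list: 100k entries across 1k clips.
fake_lines = [f"clip_{i % 1000}/{i:06d}.jpg" for i in range(100_000)]

t0 = time.time()
entries = load_frame_list(fake_lines)
print(len(entries), f"{time.time() - t0:.2f}s")
```

If this scales poorly toward 2.5M lines, doing the parsing once before forking (or setting the DataLoader's num_workers to 0 while debugging) may help isolate the freeze.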
@valencebond Hi, could you share any experience with @xszheng2020? Thank you very much~
Hi, @maxin-cn
The images should be placed as follows, right?
['./UCF101/images_256/WritingOnBoardv_WritingOnBoard_g02_c04/000148.jpg',
'./UCF101/images_256/WritingOnBoardv_WritingOnBoard_g02_c04/000105.jpg',
'./UCF101/images_256/WritingOnBoardv_WritingOnBoard_g02_c04/000155.jpg',
'./UCF101/images_256/WritingOnBoardv_WritingOnBoard_g02_c04/000037.jpg',
'./UCF101/images_256/WritingOnBoardv_WritingOnBoard_g02_c04/000027.jpg',
'./UCF101/images_256/WritingOnBoardv_WritingOnBoard_g02_c04/000126.jpg',
...]
And the train_256_list.txt contains:
WritingOnBoardv_WritingOnBoard_g02_c04/000148.jpg
WritingOnBoardv_WritingOnBoard_g02_c04/000105.jpg
WritingOnBoardv_WritingOnBoard_g02_c04/000155.jpg
WritingOnBoardv_WritingOnBoard_g02_c04/000037.jpg
WritingOnBoardv_WritingOnBoard_g02_c04/000027.jpg
WritingOnBoardv_WritingOnBoard_g02_c04/000126.jpg
...
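Given a list in that format, slicing it into the paper's 16-frame clips can be sketched as below (stdlib only; group_into_clips is an illustrative helper, not the repo's dataset code, and it takes non-overlapping windows as one simple choice):

```python
from collections import defaultdict

def group_into_clips(frame_paths, clip_len=16):
    """Group 'ClipDir/000148.jpg' entries by clip and cut 16-frame windows."""
    by_clip = defaultdict(list)
    for path in frame_paths:
        clip, frame = path.rsplit("/", 1)
        by_clip[clip].append(frame)

    windows = []
    for clip, frames in by_clip.items():
        frames.sort()  # zero-padded names sort chronologically
        for start in range(0, len(frames) - clip_len + 1, clip_len):
            windows.append([f"{clip}/{f}" for f in frames[start:start + clip_len]])
    return windows

# 40 synthetic frames for one clip -> two non-overlapping 16-frame windows.
paths = [f"WritingOnBoardv_WritingOnBoard_g02_c04/{i:06d}.jpg" for i in range(1, 41)]
windows = group_into_clips(paths)
print(len(windows), len(windows[0]))  # 2 16
```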
That's right.
Many thanks for your prompt and insightful response. I have two more questions while reproducing the results:
- Training Steps and Duration for Latte-XL/2 on UCF-101: Could you please specify the exact number of training steps and the training time when training Latte-XL/2 using eight A100 GPUs on the UCF-101 dataset?
- Availability of Latte-S/2 Model on UCF-101: Have you trained the Latte-S/2 model on the UCF-101 dataset? If so, would it be possible for you to share such a model for reference? It would be useful for verifying my experiments.
Thank you once again for your time and support.
Warm regards
- You can refer to #58. I remember training on 8 A100s for about 2 days, and the model could generate video.
- I have only trained models of different sizes on the FFS dataset, and I can share these models with you if you need them.
Sincere appreciation for the valuable information.
- Could you kindly inform me of the exact/approximate number of training steps required to reach the 68.53 Inception Score on UCF101 for the model listed in Table 1?
- Additionally, I really need these models trained on the FFS dataset, along with their exact/approximate training steps. I kindly request them at your convenience via my email: [email protected].
Many thanks for your generous support.
- It is difficult for me to tell you the exact number of training steps for the model that achieved this Inception Score, because I forget which model was used. But it must have taken a long time (several weeks or so).
- I have uploaded these models here (all models were trained for about 250k iterations).
Got it and thanks a lot.
Hi There! 👋
This issue has been marked as stale due to inactivity for 14 days.
We would like to inquire if you still have the same problem or if it has been resolved.
If you need further assistance, please feel free to respond to this comment within the next 7 days. Otherwise, the issue will be automatically closed.
We appreciate your understanding and would like to express our gratitude for your contribution to Latte. Thank you for your support. 🙏