Comments (6)
Hi @JialeTao, training ViT-VQGAN small is faster than I expected, and it has just been released. The speed is 1.05s per iteration at a batch size of 8; it can even go up to 16, but 8 is good enough. Again, this is on an A100 40GB. Also, if you don't have any further questions, I will close this issue. Feel free to reopen.
Hi @thuanz123, thanks for sharing the checkpoint of ViT-VQGAN small. For this small model, how many GPUs did you use and how many iterations did you train?
For quick training, I use 32 A100 40GB GPUs and train for 500,000 iterations on ImageNet. But I think a decent GPU with 8GB VRAM is enough; just lower the batch size and train longer.
from enhancing-transformers.
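To put some rough numbers on the "lower the batch size and train longer" advice, here is a back-of-envelope sketch. The reference-run numbers (32 GPUs, batch 8 per GPU, 500,000 iterations) come from this thread; the single-GPU local batch size of 2 is purely an assumption for illustration.

```python
# Rough arithmetic for matching the quoted small run on a low-VRAM GPU.
# Reference numbers are from this thread; local_batch is an assumption.
ref_gpus, ref_batch, ref_iters = 32, 8, 500_000  # quoted quick-training run
local_batch = 2                                  # assumed fit for 8GB VRAM

images_seen = ref_gpus * ref_batch * ref_iters   # images in the reference run
local_iters = images_seen // local_batch         # iterations to see as many images

print(f"reference run sees {images_seen:,} images")
print(f"one GPU at batch {local_batch} needs {local_iters:,} iterations to match")
```

This only equalizes the number of images seen; with a much smaller batch you would typically also retune the learning rate, so treat it as a lower bound on the extra training needed.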
Hi @JialeTao, depending on the config, training can be fast or slow. For the config in this repo, which is ViT-VQGAN base, it takes about 1.45s per iteration with a batch size of 4 on an A100, so it is quite demanding. If you don't have many GPUs, I recommend training a much smaller config than the one in this repo. There are also plans to train smaller models, so you can wait if you want, but that will be a long time from now since I'm too busy these days 😭
Thanks for the reply. For ViT-VQGAN base, how many iterations did you train? Does the 1.45s refer to stage 1 or stage 2 training? And lastly, is that an A100 with 40GB or 80GB of memory?
Hi @JialeTao, 1.45s per iteration is for stage 1 training, and the GPU is an A100 40GB. I trained ViT-VQGAN base for 1,000,000 iterations on 32 A100s, with a batch size of 4 per GPU. Stage 2 training is currently buggy, so I don't have any estimates or numbers for it 😅
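From the numbers quoted for the base run, a quick wall-clock estimate can be derived. This assumes standard data parallelism (1.45s per global step regardless of GPU count) and ignores data-loading, logging, and checkpointing overhead, so it is a sketch rather than a measured figure.

```python
# Back-of-envelope wall-clock estimate for the stage-1 ViT-VQGAN base
# run described above. Assumes data-parallel training, where each of
# the 1,000,000 iterations is one global step taking ~1.45s.
iterations = 1_000_000
sec_per_iter = 1.45
gpus, batch_per_gpu = 32, 4

effective_batch = gpus * batch_per_gpu           # global batch per step
wall_clock_days = iterations * sec_per_iter / 86_400

print(f"effective batch size: {effective_batch}")
print(f"estimated wall clock: {wall_clock_days:.1f} days")
```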