
Memory Error during tokenization while fine-tuning LLaVA1.5-7B-Chat on more than 8000 images (llama-factory, closed, 2 comments)

Hassaan68 commented on September 10, 2024

Memory error during tokenization while fine-tuning LLaVA1.5-7B-Chat on more than 8000 images.


Comments (2)

hiyouga commented on September 10, 2024

Try dataset streaming by setting streaming: true


Hassaan68 commented on September 10, 2024

@hiyouga I am still facing the issue with streaming: true and max_steps: 10000. I am fine-tuning LLaVA on 93,000 images, and the tokenizer reports a "No space left on device" error after tokenizing around 52,000 images. By that point my SageMaker cache has grown to 75 GB, which fills the disk. How can I work around this?

Full Command:

llamafactory-cli train \
    --stage sft \
    --do_train True \
    --model_name_or_path llava-hf/llava-1.5-7b-hf \
    --preprocessing_num_workers 16 \
    --finetuning_type lora \
    --template vicuna \
    --flash_attn fa2 \
    --visual_inputs True \
    --dataset_dir data \
    --dataset icentia11k \
    --cutoff_len 1024 \
    --learning_rate 5e-05 \
    --num_train_epochs 10.0 \
    --max_steps 10000 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/LLaVA1.5-7B-Chat/lora/train_2024-06-26-11-09-00 \
    --fp16 True \
    --plot_loss True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --lora_rank 8 \
    --lora_alpha 32 \
    --lora_dropout 0 \
    --use_dora True \
    --lora_target all \
    --streaming True
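
For anyone hitting the same "No space left on device" error, it may help to confirm which directory is actually filling the disk before changing training options. The following is a minimal diagnostic sketch, not part of llama-factory. It assumes the preprocessing cache is the Hugging Face datasets cache, located either at the path in the HF_DATASETS_CACHE environment variable or at the default ~/.cache/huggingface/datasets; on a SageMaker instance the real location may differ. The helper names dir_size_bytes and report are hypothetical.

```python
import os
import shutil


def dir_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file may disappear while preprocessing is still running.
                pass
    return total


def report(cache_dir):
    """Print the cache size and the free space on the disk that holds it."""
    gib = 1024 ** 3
    usage = shutil.disk_usage(cache_dir)
    print(f"cache dir  : {cache_dir}")
    print(f"cache size : {dir_size_bytes(cache_dir) / gib:.2f} GiB")
    print(f"disk free  : {usage.free / gib:.2f} GiB of {usage.total / gib:.2f} GiB")


if __name__ == "__main__":
    # Assumption: default Hugging Face datasets cache location unless overridden.
    cache_dir = os.environ.get(
        "HF_DATASETS_CACHE",
        os.path.expanduser("~/.cache/huggingface/datasets"),
    )
    if os.path.isdir(cache_dir):
        report(cache_dir)
    else:
        print(f"no cache directory found at {cache_dir}")
```

If the cache turns out to be the culprit, pointing HF_DATASETS_CACHE at a larger volume before launching training is one possible workaround. Whether that resolves this particular case is not confirmed in the thread.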
