Comments (7)

frozenarctic commented on August 21, 2024

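On line 80 of cli.py, change the condition to
if (not disable_torchrun) and (get_device_count() > 1):
Then go to the repository root and reinstall with
pip install -e '.[torch,metrics]'
After that, it uses the local single GPU by default.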
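
To illustrate the logic of the condition above, here is a minimal, self-contained sketch. The function name `should_use_torchrun` is hypothetical and is not part of LLaMA-Factory; the device count is passed in as a plain integer instead of calling `get_device_count()`.

```python
def should_use_torchrun(disable_torchrun: bool, device_count: int) -> bool:
    """Hypothetical helper mirroring the condition suggested above.

    torchrun is chosen only when it has not been disabled AND more than
    one device is visible; otherwise training stays on a single device.
    """
    return (not disable_torchrun) and (device_count > 1)


if __name__ == "__main__":
    # One visible GPU: stays on the single-device path.
    print(should_use_torchrun(disable_torchrun=False, device_count=1))
    # Several GPUs and torchrun not disabled: distributed launch.
    print(should_use_torchrun(disable_torchrun=False, device_count=4))
    # Several GPUs but torchrun explicitly disabled: single-device path.
    print(should_use_torchrun(disable_torchrun=True, device_count=4))
```

With this condition, a machine with a single GPU never goes through the distributed launcher, which matches the behavior described in the comment.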

from llama-factory.

hiyouga commented on August 21, 2024

Fixed.

yxl23 commented on August 21, 2024

I ran llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_bitsandbytes.yaml to fine-tune qwen2 with 8-bit quantization, and the training log shows:
{'loss': 0.0, 'grad_norm': nan, 'learning_rate': 0.0, 'epoch': 0.51}
{'loss': 0.0, 'grad_norm': nan, 'learning_rate': 0.0, 'epoch': 1.02}
{'loss': 0.0, 'grad_norm': nan, 'learning_rate': 0.0, 'epoch': 1.53}

hiyouga commented on August 21, 2024

Use bf16.

yxl23 commented on August 21, 2024

Where do I add that? Here is my config:

model

model_name_or_path: E:\LLaMA-Factory\qwen\Qwen2-7B-Instruct
quantization_bit: 8

method

stage: sft
do_train: true
finetuning_type: lora
lora_target: all

dataset

dataset: xunlian
template: qwen
cutoff_len: 1024
max_samples: 1000
overwrite_cache: true
preprocessing_num_workers: 16

output

output_dir: saves/qwen2-7b/lora/sft
logging_steps: 10
save_steps: 500
plot_loss: true
overwrite_output_dir: true

train

per_device_train_batch_size: 1
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 100.0
lr_scheduler_type: cosine
warmup_ratio: 0.1
fp16: true

eval

val_size: 0.1
per_device_eval_batch_size: 1
eval_strategy: steps
eval_steps: 500

hiyouga commented on August 21, 2024

Replace fp16 with bf16.
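
As a rough sketch of that change, assuming the YAML config has already been loaded into a plain Python dict: the helper name `use_bf16` is hypothetical and not part of LLaMA-Factory, and the `bf16` key is assumed to be the counterpart of the `fp16: true` line in the config posted above. Note that bf16 also needs a GPU that supports it.

```python
def use_bf16(config: dict) -> dict:
    """Hypothetical helper: return a copy of the config with fp16 swapped for bf16.

    The input dict is left untouched. If fp16 is not enabled, the copy is
    returned unchanged.
    """
    updated = dict(config)
    if updated.get("fp16"):
        del updated["fp16"]      # drop the fp16 flag
        updated["bf16"] = True   # enable bf16 in its place (assumed key name)
    return updated


if __name__ == "__main__":
    # A few keys taken from the "train" part of the config in this thread.
    train_config = {
        "per_device_train_batch_size": 1,
        "gradient_accumulation_steps": 8,
        "learning_rate": 1.0e-4,
        "fp16": True,
    }
    print(use_bf16(train_config))
```

In the YAML file itself, the same edit amounts to changing the single `fp16: true` line so that it sets `bf16` instead.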

yxl23 commented on August 21, 2024

OK, thanks.
