Comments (5)

zhoushaoxiang commented on July 20, 2024

Has anyone successfully fine-tuned GLM-4 on Ascend (NPU)? I keep getting this error:
The param dtype not implemented for DT_BFLOAT16, should be in dtype support list [DT_FLOAT16,DT_FLOAT,DT_DOUBLE,DT_INT8,DT_UINT8,DT_INT16,DT_INT32,DT_INT64,DT_BOOL,DT_COMPLEX64,DT_COMPLEX128,]
Has anyone managed to solve it?
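
For reference, a workaround sometimes tried for this class of error (not verified on this exact setup) is to run in float16 rather than bfloat16, since the op's supported list above includes DT_FLOAT16 but not DT_BFLOAT16. A minimal loading sketch, assuming the local ModelScope cache path that appears in the logs further down:

# Sketch (assumption: the missing bfloat16 kernel is the blocker): load GLM-4 in float16
# so parameters use DT_FLOAT16 instead of DT_BFLOAT16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat"  # path taken from the logs; adjust as needed
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # avoid torch.bfloat16 on kernels lacking DT_BFLOAT16 support
    trust_remote_code=True,
)

In LLaMA-Factory terms this roughly corresponds to passing --fp16 True instead of --bf16 True, assuming fp16 precision is acceptable for the run.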

hiyouga commented on July 20, 2024

Try changing this variable to False?

self.is_chatglm_model = getattr(unwrapped_model.config, "model_type", None) == "chatglm"
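
For context, GLM-4's config still reports model_type == "chatglm" (see the ChatGLMConfig dump further down), which is why this flag evaluates to True. A small check plus the suggested one-line experiment; the file holding that assignment is not identified in this thread, so apply it wherever it lives in your copy:

# Confirm why the flag fires: GLM-4 still identifies itself as "chatglm".
from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "/root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat",  # local path from the logs below
    trust_remote_code=True,
)
print(getattr(config, "model_type", None) == "chatglm")  # prints True

# Suggested experiment, applied in place of the detection above:
# self.is_chatglm_model = False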

1737686924 commented on July 20, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

[2024-06-07 10:17:14,980] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)

  • llamafactory version: 0.7.2.dev0
  • Platform: Linux-5.10.0-198.0.0.111.oe2203sp3.aarch64-aarch64-with-glibc2.34
  • Python version: 3.10.14
  • PyTorch version: 2.2.0 (NPU)
  • Transformers version: 4.41.2
  • Datasets version: 2.19.2
  • Accelerate version: 0.30.1
  • PEFT version: 0.11.1
  • TRL version: 0.9.3
  • NPU type: Ascend910B2
  • CANN version: 8.0.RC2.alpha001
  • DeepSpeed version: 0.13.2

Reproduction

llamafactory-cli train \
    --stage ppo \
    --do_train True \
    --model_name_or_path ZhipuAI/glm-4-9b-chat \
    --preprocessing_num_workers 16 \
    --finetuning_type lora \
    --template glm4 \
    --flash_attn auto \
    --dataset_dir data \
    --dataset disc-law-sft-triplet \
    --cutoff_len 8192 \
    --learning_rate 5e-05 \
    --num_train_epochs 3.0 \
    --max_samples 100000 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --max_grad_norm 1.0 \
    --logging_steps 5 \
    --save_steps 100 \
    --warmup_steps 0 \
    --optim adamw_torch \
    --packing False \
    --report_to none \
    --output_dir saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-44-37 \
    --bf16 True \
    --plot_loss True \
    --ddp_timeout 180000000 \
    --include_num_input_tokens_seen True \
    --adapter_name_or_path saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all \
    --reward_model saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06 \
    --reward_model_type lora \
    --ppo_score_norm True \
    --top_k 0 \
    --top_p 0.9

Expected behavior

_No response_

Others

[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] 
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] *****************************************
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed. 
[2024-06-07 10:10:55,970] torch.distributed.run: [WARNING] *****************************************
[2024-06-07 10:11:03,623] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,661] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,705] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,818] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,836] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,905] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,955] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
[2024-06-07 10:11:03,991] [INFO] [real_accelerator.py:191:get_accelerator] Setting ds_accelerator to npu (auto detect)
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 0, device: npu:0, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
2024-06-07 10:11:17,434 - modelscope - INFO - PyTorch version 2.2.0 Found.
2024-06-07 10:11:17,436 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-06-07 10:11:17,490 - modelscope - INFO - Loading done! Current index file version is 1.14.0, with md5 ceb78a2ac746b5506819a47dbbf0e37c and a total number of 976 components indexed
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 7, device: npu:7, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 4, device: npu:4, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 6, device: npu:6, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 2, device: npu:2, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:17 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:17 - INFO - llamafactory.hparams.parser - Process rank: 1, device: npu:1, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
06/07/2024 10:11:18 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:18 - INFO - llamafactory.hparams.parser - Process rank: 5, device: npu:5, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,235 >> loading file tokenizer.model
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,235 >> loading file added_tokens.json
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,236 >> loading file special_tokens_map.json
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,236 >> loading file tokenizer_config.json
[INFO|tokenization_utils_base.py:2106] 2024-06-07 10:11:18,236 >> loading file tokenizer.json
06/07/2024 10:11:18 - WARNING - llamafactory.hparams.parser - `ddp_find_unused_parameters` needs to be set as False for LoRA in DDP training.
06/07/2024 10:11:18 - INFO - llamafactory.hparams.parser - Process rank: 3, device: npu:3, n_gpu: 1, distributed training: True, compute dtype: torch.bfloat16
[WARNING|logging.py:314] 2024-06-07 10:11:19,288 >> Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:19 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
06/07/2024 10:11:19 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:19 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:19 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:20 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
06/07/2024 10:11:22 - INFO - llamafactory.data.template - Add <|user|>,<|observation|> to stop words.
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
06/07/2024 10:11:26 - INFO - llamafactory.data.loader - Loading dataset disc-law-sft-triplet.json...
Running tokenizer on dataset (num_proc=16): 100%|█████████████████████████████████████████████████████████████████| 16000/16000 [00:38<00:00, 416.91 examples/s]
input_ids:
[151331, 151333, 151336, 198, 100698, 103309, 101138, 3837, 113094, 110590, 105177, 99312, 8994, 98379, 106170, 117921, 3837, 98546, 20, 98334, 21, 98424, 99146, 98385, 99082, 117225, 3837, 108592, 98696, 105181, 103757, 117537, 98380, 99043, 100451, 102337, 103273, 106156, 118828, 98798, 105181, 101376, 98314, 117055, 98550, 109534, 3837, 98459, 101247, 105079, 98634, 123900, 98324, 117537, 98595, 101676, 111602, 99916, 98760, 101642, 98335, 3837, 108592, 98696, 105181, 98453, 105529, 109290, 98396, 98381, 103941, 98798, 105181, 99195, 118894, 3837, 103078, 98711, 109534, 105079, 98322, 107801, 98993, 114731, 100129, 101242, 3837, 98547, 110664, 99999, 105181, 109487, 98365, 3837, 108592, 98696, 105181, 98701, 107801, 98993, 114731, 103941, 98798, 105181, 98314, 99527, 113995, 3837, 99704, 124187, 116767, 101806, 98583, 109695, 98829, 110960, 99416, 121952, 109055, 112246, 117442, 101242, 3837, 117442, 101242, 100048, 98875, 121424, 99054, 99893, 98649, 105862, 98433, 112998, 99108, 120250, 106318, 100035, 1773, 98365, 98379, 118828, 98798, 105181, 105420, 3837, 101113, 99131, 100588, 98634, 100059, 98493, 108592, 98696, 105181, 98607, 103278, 98344, 98817, 1773, 98379, 103171, 3837, 109534, 108634, 99532, 102492, 20, 11, 124206, 13, 24, 98575, 3837, 109055, 108634, 99532, 102492, 16, 11, 19, 101474, 13, 102486, 98575, 3837, 117442, 101242, 108634, 99532, 102492, 17, 11, 24, 99951, 13, 99082, 98575, 3837, 99054, 99893, 98649, 106508, 99108, 120250, 108634, 99532, 102492, 24, 11, 102114, 21, 98575, 3837, 111086, 101832, 99532, 106234, 102492, 98729, 11, 101135, 17, 13, 21, 98575, 1773, 101409, 100867, 3837, 108592, 98696, 105181, 98319, 119626, 98322, 100297, 98479, 110416, 3837, 118828, 98798, 105181, 5373, 100547, 105181, 5373, 104464, 105181, 110065, 3837, 110664, 99999, 105181, 98314, 98697, 98856, 3837, 100059, 111413, 99565, 98990, 3837, 116550, 99304, 3837, 103171, 102622, 98560, 3837, 108592, 98696, 105181, 98314, 127251, 98381, 102070, 98539, 98404, 102243, 105483, 3837, 106144, 102919, 1773, 151337]
inputs:
[gMASK] <sop> <|user|> 
基于下列案件,推测可能的判决结果。
经审理查明,2015年6月21日15时许,被告人白某某在大东区小河沿公交车站乘坐被害人张某某驾驶的133路公交车,当车辆行驶至沈阳市大东区东陵西路26号附近时,被告人白某某因未能下车而与司机张某某发生争执,并在该公交车行驶中用手拉拽档杆,被证人韩某某拉开后,被告人白某某又用手拉拽司机张某某的右胳膊,导致该车失控撞向右侧马路边停放的轿车和一个路灯杆,路灯杆折断后将福锅记炖品店的牌匾砸坏。后经被害人张某某报警,公安人员赶至现场将被告人白某某传唤到案。经鉴定,公交车受损价值人民币5,189.9元,轿车受损价值人民币1,449.57元,路灯杆受损价值人民币2,927.15元,福锅记饭店牌匾受损价值人民币9,776元,本案损失价值共计人民币19,342.6元。上述事实,被告人白某某在庭审中亦无异议,被害人张某某、朱某某、詹某某陈述,证人韩某某的证言,现场勘察笔录,视听资料,鉴定结论书,被告人白某某的供述与辩解等证据证实,足以认定。 <|assistant|>
[INFO|configuration_utils.py:731] 2024-06-07 10:12:08,107 >> loading configuration file /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat/config.json
[INFO|configuration_utils.py:731] 2024-06-07 10:12:08,110 >> loading configuration file /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat/config.json
[INFO|configuration_utils.py:796] 2024-06-07 10:12:08,111 >> Model config ChatGLMConfig {
  "_name_or_path": "/root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat",
  "add_bias_linear": false,
  "add_qkv_bias": true,
  "apply_query_key_layer_scaling": true,
  "apply_residual_connection_post_layernorm": false,
  "architectures": [
    "ChatGLMModel"
  ],
  "attention_dropout": 0.0,
  "attention_softmax_in_fp32": true,
  "auto_map": {
    "AutoConfig": "configuration_chatglm.ChatGLMConfig",
    "AutoModel": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForCausalLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSeq2SeqLM": "modeling_chatglm.ChatGLMForConditionalGeneration",
    "AutoModelForSequenceClassification": "modeling_chatglm.ChatGLMForSequenceClassification"
  },
  "bias_dropout_fusion": true,
  "classifier_dropout": null,
  "eos_token_id": [
    151329,
    151336,
    151338
  ],
  "ffn_hidden_size": 13696,
  "fp32_residual_connection": false,
  "hidden_dropout": 0.0,
  "hidden_size": 4096,
  "kv_channels": 128,
  "layernorm_epsilon": 1.5625e-07,
  "model_type": "chatglm",
  "multi_query_attention": true,
  "multi_query_group_num": 2,
  "num_attention_heads": 32,
  "num_hidden_layers": 40,
  "num_layers": 40,
  "original_rope": true,
  "pad_token_id": 151329,
  "padded_vocab_size": 151552,
  "post_layer_norm": true,
  "rmsnorm": true,
  "rope_ratio": 500,
  "seq_length": 131072,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.41.2",
  "use_cache": true,
  "vocab_size": 151552
}

[INFO|modeling_utils.py:3471] 2024-06-07 10:12:08,159 >> loading weights file /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat/model.safetensors.index.json
[INFO|modeling_utils.py:1519] 2024-06-07 10:12:08,160 >> Instantiating ChatGLMForConditionalGeneration model under default dtype torch.bfloat16.
[INFO|configuration_utils.py:962] 2024-06-07 10:12:08,162 >> Generate config GenerationConfig {
  "eos_token_id": [
    151329,
    151336,
    151338
  ],
  "pad_token_id": 151329
}

Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:06<00:00,  1.45it/s]
[INFO|modeling_utils.py:4280] 2024-06-07 10:12:15,224 >> All model checkpoint weights were used when initializing ChatGLMForConditionalGeneration.

[INFO|modeling_utils.py:4288] 2024-06-07 10:12:15,224 >> All the weights of ChatGLMForConditionalGeneration were initialized from the model checkpoint at /root/.cache/modelscope/hub/ZhipuAI/glm-4-9b-chat.
If your task is similar to the task the model of the checkpoint was trained on, you can already use ChatGLMForConditionalGeneration for predictions without further training.
[INFO|modeling_utils.py:3797] 2024-06-07 10:12:15,231 >> Generation config file not found, using a generation config created from the model config.
06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:15 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:15 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards:  60%|██████████████████████████████████████████████████████████▏                                      | 6/10 [00:04<00:02,  1.35it/s]06/07/2024 10:12:15 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:15 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:15 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
Loading checkpoint shards:  70%|███████████████████████████████████████████████████████████████████▉                             | 7/10 [00:05<00:02,  1.39it/s]06/07/2024 10:12:16 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:06<00:00,  1.51it/s]
06/07/2024 10:12:17 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:17 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:17 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:17 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00,  1.42it/s]
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00,  1.36it/s]
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00,  1.35it/s]
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:07<00:00,  1.34it/s]
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
Loading checkpoint shards:  90%|███████████████████████████████████████████████████████████████████████████████████████▎         | 9/10 [00:07<00:00,  1.19it/s]06/07/2024 10:12:18 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:18 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:18 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:08<00:00,  1.19it/s]
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:19 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:19 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:19 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:19 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:19 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:20 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:20 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
Loading checkpoint shards:  90%|███████████████████████████████████████████████████████████████████████████████████████▎         | 9/10 [00:09<00:01,  1.02s/it]06/07/2024 10:12:20 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:20 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:20 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:20 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:20 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:21 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:10<00:00,  1.05s/it]
06/07/2024 10:12:21 - INFO - llamafactory.model.model_utils.checkpointing - Gradient checkpointing enabled.
06/07/2024 10:12:21 - INFO - llamafactory.model.model_utils.attention - Using vanilla attention implementation.
06/07/2024 10:12:21 - INFO - llamafactory.model.adapter - Upcasting trainable params to float32.
06/07/2024 10:12:21 - INFO - llamafactory.model.adapter - Fine-tuning method: LoRA
06/07/2024 10:12:22 - INFO - llamafactory.model.adapter - Loaded adapter(s): saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03
06/07/2024 10:12:22 - INFO - llamafactory.model.model_utils.valuehead - Provided path (saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03) does not contain value head weights: saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03 does not appear to have a file named value_head.bin. Checkout 'https://huggingface.co/saves/GLM-4-9B-Chat/lora/train_2024-06-06-15-42-03/tree/None' for available files..
06/07/2024 10:12:22 - INFO - llamafactory.model.model_utils.valuehead - Ignore the above message if you are not resuming the training of a value head model.
06/07/2024 10:12:22 - INFO - llamafactory.model.loader - trainable params: 21180417 || all params: 9421131777 || trainable%: 0.2248
06/07/2024 10:12:23 - INFO - llamafactory.train.trainer_utils - Loaded adapter weights of reward model from saves/GLM-4-9B-Chat/lora/train_2024-06-07-09-37-06
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer - ***** Running training *****
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer -   Num examples = 16000
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer -   Num Epochs = 3.0
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer -   Instantaneous batch size per device = 1
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer -   Total train batch size (w. parallel, buffer, distributed & accumulation) = 64
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer -   Gradient Accumulation steps = 8
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer -   Num optimization epochs per batch = 4
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer -   Total training steps = 750
06/07/2024 10:12:23 - INFO - llamafactory.train.ppo.trainer -   Number of trainable parameters = 21180417
  0%|                                                                                                                                   | 0/750 [00:00<?, ?it/s]
/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/logits_process.py:1591: UserWarning: AutoNonVariableTypeMode is deprecated and will be removed in 1.10 release. For kernel implementations please use AutoDispatchBelowADInplaceOrView instead, If you are looking for a user facing API to enable running your inference-only workload, please use c10::InferenceMode. Using AutoDispatchBelowADInplaceOrView in user code is under risk of producing silent wrong result in some edge cases. See Note [AutoDispatchBelowAutograd] for more details. (Triggered internally at torch_npu/csrc/aten/common/TensorFactories.cpp:74.)
  scores_processed = torch.where(scores != scores, 0.0, scores)
(the same UserWarning was emitted by each of the 8 ranks)
[rank1]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank0]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank7]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank2]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank6]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank3]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank4]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[rank5]:[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
  0%|                                                                                                                                   | 0/750 [00:14<?, ?it/s]
Traceback (most recent call last):
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 9, in <module>
    launch()
  File "/data/LLaMA-Factory/src/llamafactory/launcher.py", line 5, in launch
    run_exp()
  File "/data/LLaMA-Factory/src/llamafactory/train/tuner.py", line 37, in run_exp
    run_ppo(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/workflow.py", line 59, in run_ppo
    ppo_trainer.ppo_train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 220, in ppo_train
    mini_batch_rewards = self.get_rewards(mini_batch_queries, mini_batch_responses)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/LLaMA-Factory/src/llamafactory/train/ppo/trainer.py", line 387, in get_rewards
    rewards.append(values[i, end_index].float().detach().cpu())  # use fp32 type
IndexError: index 213 is out of bounds for dimension 1 with size 1
(The same traceback was raised by the other seven ranks, partly interleaved in the original output, differing only in the reported index: 67, 379, 390, 408, 499, 501, and 488.)
[2024-06-07 10:12:46,085] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227860 closing signal SIGTERM
[2024-06-07 10:12:46,085] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227861 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227862 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227863 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227864 closing signal SIGTERM
[2024-06-07 10:12:46,086] torch.distributed.elastic.multiprocessing.api: [WARNING] Sending process 2227865 closing signal SIGTERM
[2024-06-07 10:12:46,451] torch.distributed.elastic.multiprocessing.api: [ERROR] failed (exitcode: 1) local_rank: 0 (pid: 2227858) of binary: /data/anaconda3/envs/llama_factory/bin/python
Traceback (most recent call last):
  File "/data/anaconda3/envs/llama_factory/bin/torchrun", line 8, in <module>
    sys.exit(main())
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 347, in wrapper
    return f(*args, **kwargs)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/run.py", line 812, in main
    run(args)
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/run.py", line 803, in run
    elastic_launch(
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 135, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/data/anaconda3/envs/llama_factory/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 268, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError: 
============================================================
/data/LLaMA-Factory/src/llamafactory/launcher.py FAILED
------------------------------------------------------------
Failures:
[1]:
  time      : 2024-06-07_10:12:46
  host      : localhost.localdomain
  rank      : 1 (local_rank: 1)
  exitcode  : 1 (pid: 2227859)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-06-07_10:12:46
  host      : localhost.localdomain
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 2227858)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
===========================================================
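
The IndexError pattern above (index N out of bounds for dimension 1 with size 1, with a per-device batch size of 1) is what you would see if the value tensor reaches values[i, end_index] with its batch and sequence dimensions swapped, i.e. shaped (seq_len, 1) instead of (1, seq_len). That is also why the earlier suggestion to force is_chatglm_model to False is worth trying, if that flag gates a ChatGLM-specific transpose in your version. A minimal illustration of the shape mismatch (not the actual trainer code):

# Illustration only: a (seq_len, batch) layout reproduces the reported error with batch size 1.
import torch

batch_size, seq_len = 1, 500
values_ok = torch.zeros(batch_size, seq_len)   # (batch, seq_len): values_ok[0, 213] is valid
values_swapped = values_ok.transpose(0, 1)     # (seq_len, batch) == (500, 1)

i, end_index = 0, 213
print(values_ok[i, end_index])                 # tensor(0.)
try:
    values_swapped[i, end_index]
except IndexError as err:
    print(err)                                 # index 213 is out of bounds for dimension 1 with size 1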

Hi, did you manage to get GLM-4 fine-tuning working on Ascend? Could you share your script?

jesuswa commented on July 20, 2024

Is there an inference script available?

DaozeZhang commented on July 20, 2024

I would also like a GLM-4 fine-tuning script; a recommended set of configuration parameters or a command would work too. Thanks.
