How to resolve an inference error with a quantized model (about swift, OPEN)

greatheart1000 commented on June 7, 2024

How can I fix the error raised when running inference with a quantized model?

Comments (1)

greatheart1000 commented on June 7, 2024

Deploying the quantized model as a server also fails with an error:
CUDA_VISIBLE_DEVICES=0 swift deploy --ckpt_dir 'output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged'
run sh: python /root/swift/swift/cli/deploy.py --ckpt_dir output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged
2024-05-15 18:36:47,690 - modelscope - INFO - PyTorch version 2.1.2 Found.
2024-05-15 18:36:47,691 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-05-15 18:36:47,715 - modelscope - INFO - Loading done! Current index file version is 1.14.0, with md5 20ca4569cd83063597978af789773db5 and a total number of 976 components indexed
[INFO:swift] Successfully registered /root/swift/swift/llm/data/dataset_info.json
[INFO:swift] Start time of running main: 2024-05-15 18:36:48.482545
[INFO:swift] ckpt_dir: /root/swift/examples/pytorch/llm/output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged
[INFO:swift] Setting args.model_type: baichuan2-7b
[INFO:swift] Setting model_info['revision']: master
[INFO:swift] Setting self.eval_human: True
[INFO:swift] Setting overwrite_generation_config: True
[INFO:swift] args: DeployArguments(model_type='baichuan2-7b', model_id_or_path='baichuan-inc/Baichuan2-7B-Base', model_revision='master', sft_type='full', template_type='default-generation', infer_backend='pt', ckpt_dir='/root/swift/examples/pytorch/llm/output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged', load_args_from_ckpt_dir=True, load_dataset_config=False, eval_human=True, device_map_config_path=None, seed=42, dtype='bf16', dataset=[], dataset_seed=42, dataset_test_ratio=1, show_dataset_sample=10, save_result=True, system=None, max_length=None, truncation_strategy='delete', check_dataset_strategy='none', model_name=[None, None], model_author=[None, None], quantization_bit=4, bnb_4bit_comp_dtype='bf16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, bnb_4bit_quant_storage=None, max_new_tokens=2048, do_sample=True, temperature=0.3, top_k=20, top_p=0.7, repetition_penalty=1.0, num_beams=1, stop_words=None, use_flash_attn=None, ignore_args_error=False, stream=True, merge_lora=False, merge_device_map='cpu', save_safetensors=True, overwrite_generation_config=True, verbose=None, custom_register_path=None, custom_dataset_info=None, gpu_memory_utilization=0.9, tensor_parallel_size=1, max_model_len=None, vllm_enable_lora=False, vllm_max_lora_rank=16, lora_modules=[], self_cognition_sample=0, train_dataset_sample=-1, val_dataset_sample=None, safe_serialization=None, model_cache_dir=None, merge_lora_and_save=None, custom_train_dataset_path=[], custom_val_dataset_path=[], vllm_lora_modules=None, host='127.0.0.1', port=8000, ssl_keyfile=None, ssl_certfile=None)
[INFO:swift] Global seed set to 42
INFO: 2024-05-15 18:36:48,526 infer.py:131] device_count: 1
INFO: 2024-05-15 18:36:48,526 infer.py:148] quantization_config: {'quant_method': <QuantizationMethod.BITS_AND_BYTES: 'bitsandbytes'>, '_load_in_8bit': False, '_load_in_4bit': True, 'llm_int8_threshold': 6.0, 'llm_int8_skip_modules': None, 'llm_int8_enable_fp32_cpu_offload': False, 'llm_int8_has_fp16_weight': False, 'bnb_4bit_quant_type': 'nf4', 'bnb_4bit_use_double_quant': True, 'bnb_4bit_compute_dtype': torch.bfloat16, 'bnb_4bit_quant_storage': torch.uint8}
INFO: 2024-05-15 18:36:48,526 model.py:3941] Loading the model using model_dir: /root/swift/examples/pytorch/llm/output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged
Traceback (most recent call last):
File "/root/swift/swift/cli/deploy.py", line 5, in <module>
deploy_main()
File "/root/swift/swift/utils/run_utils.py", line 27, in x_main
result = llm_x(args, **kwargs)
File "/root/swift/swift/llm/deploy.py", line 442, in llm_deploy
model, template = prepare_model_template(args)
File "/root/swift/swift/llm/infer.py", line 161, in prepare_model_template
model, tokenizer = get_model_tokenizer(
File "/root/swift/swift/llm/utils/model.py", line 4004, in get_model_tokenizer
model, tokenizer = get_function(model_dir, torch_dtype, model_kwargs, load_model, **kwargs)
File "/root/swift/swift/llm/utils/model.py", line 1106, in get_model_tokenizer_baichuan2
model, tokenizer = get_model_tokenizer_from_repo(
File "/root/swift/swift/llm/utils/model.py", line 815, in get_model_tokenizer_from_repo
model = automodel_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/hf_util.py", line 113, in from_pretrained
module_obj = module_class.from_pretrained(model_dir, *model_args,
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/root/.cache/huggingface/modules/transformers_modules/checkpoint-1200-merged/modeling_baichuan.py", line 609, in from_pretrained
state_dict = torch.load(model_file, map_location="cpu")
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 986, in load
with _open_file_like(f, 'rb') as opened_file:
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 435, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 416, in __init__
super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/root/swift/examples/pytorch/llm/output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged/pytorch_model.bin'

How can I fix this? The weights are saved in safetensors format.
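
One reading of the traceback, offered as an assumption rather than a confirmed diagnosis: with 4-bit quantization enabled, the custom `from_pretrained` in `modeling_baichuan.py` calls `torch.load` on `pytorch_model.bin`, while the merged checkpoint directory seems to contain only safetensors files. A minimal sketch for checking which weight formats are actually present is below. The helper names `summarize_weight_files` and `has_legacy_weights` are hypothetical and not part of swift; only the Python standard library is used.

```python
from pathlib import Path


def summarize_weight_files(ckpt_dir):
    """Group the weight files found directly inside ckpt_dir by format.

    Hypothetical helper for illustration; not part of swift or transformers.
    """
    root = Path(ckpt_dir)
    return {
        "safetensors": sorted(p.name for p in root.glob("*.safetensors")),
        "bin": sorted(p.name for p in root.glob("*.bin")),
    }


def has_legacy_weights(ckpt_dir):
    """Return True if the file named in the FileNotFoundError exists."""
    return (Path(ckpt_dir) / "pytorch_model.bin").is_file()


if __name__ == "__main__":
    import sys

    # Pass the checkpoint directory from the log as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(summarize_weight_files(target))
    print("pytorch_model.bin present:", has_legacy_weights(target))
```

If the output lists only `.safetensors` files, that would be consistent with the `FileNotFoundError` above, since the `torch.load` call in the traceback is given a `pytorch_model.bin` path that does not exist.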
