greatheart1000 commented on June 7, 2024

Deploying the quantized model as a server also fails with an error:
CUDA_VISIBLE_DEVICES=0 swift deploy --ckpt_dir 'output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged'
run sh: python /root/swift/swift/cli/deploy.py --ckpt_dir output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged
2024-05-15 18:36:47,690 - modelscope - INFO - PyTorch version 2.1.2 Found.
2024-05-15 18:36:47,691 - modelscope - INFO - Loading ast index from /root/.cache/modelscope/ast_indexer
2024-05-15 18:36:47,715 - modelscope - INFO - Loading done! Current index file version is 1.14.0, with md5 20ca4569cd83063597978af789773db5 and a total number of 976 components indexed
[INFO:swift] Successfully registered /root/swift/swift/llm/data/dataset_info.json
[INFO:swift] Start time of running main: 2024-05-15 18:36:48.482545
[INFO:swift] ckpt_dir: /root/swift/examples/pytorch/llm/output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged
[INFO:swift] Setting args.model_type: baichuan2-7b
[INFO:swift] Setting model_info['revision']: master
[INFO:swift] Setting self.eval_human: True
[INFO:swift] Setting overwrite_generation_config: True
[INFO:swift] args: DeployArguments(model_type='baichuan2-7b', model_id_or_path='baichuan-inc/Baichuan2-7B-Base', model_revision='master', sft_type='full', template_type='default-generation', infer_backend='pt', ckpt_dir='/root/swift/examples/pytorch/llm/output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged', load_args_from_ckpt_dir=True, load_dataset_config=False, eval_human=True, device_map_config_path=None, seed=42, dtype='bf16', dataset=[], dataset_seed=42, dataset_test_ratio=1, show_dataset_sample=10, save_result=True, system=None, max_length=None, truncation_strategy='delete', check_dataset_strategy='none', model_name=[None, None], model_author=[None, None], quantization_bit=4, bnb_4bit_comp_dtype='bf16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, bnb_4bit_quant_storage=None, max_new_tokens=2048, do_sample=True, temperature=0.3, top_k=20, top_p=0.7, repetition_penalty=1.0, num_beams=1, stop_words=None, use_flash_attn=None, ignore_args_error=False, stream=True, merge_lora=False, merge_device_map='cpu', save_safetensors=True, overwrite_generation_config=True, verbose=None, custom_register_path=None, custom_dataset_info=None, gpu_memory_utilization=0.9, tensor_parallel_size=1, max_model_len=None, vllm_enable_lora=False, vllm_max_lora_rank=16, lora_modules=[], self_cognition_sample=0, train_dataset_sample=-1, val_dataset_sample=None, safe_serialization=None, model_cache_dir=None, merge_lora_and_save=None, custom_train_dataset_path=[], custom_val_dataset_path=[], vllm_lora_modules=None, host='127.0.0.1', port=8000, ssl_keyfile=None, ssl_certfile=None)
[INFO:swift] Global seed set to 42
INFO: 2024-05-15 18:36:48,526 infer.py:131] device_count: 1
INFO: 2024-05-15 18:36:48,526 infer.py:148] quantization_config: {'quant_method': <QuantizationMethod.BITS_AND_BYTES: 'bitsandbytes'>, '_load_in_8bit': False, '_load_in_4bit': True, 'llm_int8_threshold': 6.0, 'llm_int8_skip_modules': None, 'llm_int8_enable_fp32_cpu_offload': False, 'llm_int8_has_fp16_weight': False, 'bnb_4bit_quant_type': 'nf4', 'bnb_4bit_use_double_quant': True, 'bnb_4bit_compute_dtype': torch.bfloat16, 'bnb_4bit_quant_storage': torch.uint8}
INFO: 2024-05-15 18:36:48,526 model.py:3941] Loading the model using model_dir: /root/swift/examples/pytorch/llm/output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged
Traceback (most recent call last):
File "/root/swift/swift/cli/deploy.py", line 5, in <module>
deploy_main()
File "/root/swift/swift/utils/run_utils.py", line 27, in x_main
result = llm_x(args, **kwargs)
File "/root/swift/swift/llm/deploy.py", line 442, in llm_deploy
model, template = prepare_model_template(args)
File "/root/swift/swift/llm/infer.py", line 161, in prepare_model_template
model, tokenizer = get_model_tokenizer(
File "/root/swift/swift/llm/utils/model.py", line 4004, in get_model_tokenizer
model, tokenizer = get_function(model_dir, torch_dtype, model_kwargs, load_model, **kwargs)
File "/root/swift/swift/llm/utils/model.py", line 1106, in get_model_tokenizer_baichuan2
model, tokenizer = get_model_tokenizer_from_repo(
File "/root/swift/swift/llm/utils/model.py", line 815, in get_model_tokenizer_from_repo
model = automodel_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/modelscope/utils/hf_util.py", line 113, in from_pretrained
module_obj = module_class.from_pretrained(model_dir, *model_args,
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 558, in from_pretrained
return model_class.from_pretrained(
File "/root/.cache/huggingface/modules/transformers_modules/checkpoint-1200-merged/modeling_baichuan.py", line 609, in from_pretrained
state_dict = torch.load(model_file, map_location="cpu")
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 986, in load
with _open_file_like(f, 'rb') as opened_file:
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 435, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 416, in __init__
super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: '/root/swift/examples/pytorch/llm/output/baichuan2-7b/v11-20240511-210615/checkpoint-1200-merged/pytorch_model.bin'

How can I fix this? The weights are saved in safetensors format.
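
Going by the traceback, the checkpoint's own `modeling_baichuan.py` calls `torch.load` on `pytorch_model.bin` directly, so a directory holding only safetensors files fails at that step. The following is a minimal sketch of one possible workaround, not a confirmed fix from the swift maintainers: check which weight formats the directory contains, then merge the safetensors shards into a single `pytorch_model.bin`. The helper names `list_weight_files` and `convert_safetensors_to_bin` are hypothetical. The sketch assumes the shards match `*.safetensors`, that `torch` and `safetensors` are installed, and that CPU memory can hold the full state dict.

```python
from pathlib import Path


def list_weight_files(ckpt_dir):
    """Group the weight files in ckpt_dir by format, judged by extension only."""
    found = {"safetensors": [], "bin": []}
    for path in sorted(Path(ckpt_dir).iterdir()):
        if path.suffix == ".safetensors":
            found["safetensors"].append(path.name)
        elif path.suffix == ".bin":
            found["bin"].append(path.name)
    return found


def convert_safetensors_to_bin(ckpt_dir, out_name="pytorch_model.bin"):
    """Merge every *.safetensors shard into one torch checkpoint file.

    Assumption: the loader in the traceback only needs a single
    pytorch_model.bin holding the complete state dict.
    """
    # Imported lazily so that list_weight_files works without these packages.
    import torch
    from safetensors.torch import load_file

    state_dict = {}
    for shard in sorted(Path(ckpt_dir).glob("*.safetensors")):
        # Each shard holds a disjoint subset of the parameter tensors.
        state_dict.update(load_file(str(shard), device="cpu"))
    if not state_dict:
        raise FileNotFoundError(f"no *.safetensors files found in {ckpt_dir}")

    out_path = Path(ckpt_dir) / out_name
    torch.save(state_dict, out_path)
    return out_path
```

Calling `list_weight_files(ckpt_dir)` on the merged checkpoint directory should show whether any `.bin` file exists. If only safetensors files are listed, `convert_safetensors_to_bin(ckpt_dir)` writes the file that the failing `torch.load` call expects. Whether quantized loading then succeeds is untested here.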