lich99 / ChatGLM-finetune-LoRA
Code for fine-tuning ChatGLM-6b using low-rank adaptation (LoRA)
License: Apache License 2.0
ValueError: 150004 is not in list
---> 75 return {'input_ids': torch.tensor(input_ids).long(),
76 'attention_mask': torch.stack(attention_mask),
77 'labels': torch.stack(labels),
ValueError: expected sequence of length 115 at dim 1 (got 62)
return {'input_ids': torch.tensor(input_ids).long(),
'attention_mask': attention_mask,
'labels': torch.stack(labels),
'position_ids':torch.stack(position_ids)}
Why are both torch.tensor() and torch.stack() used here?
If torch.stack() is not used, it sometimes raises ValueError: expected sequence of length 50 at dim 1 (got 49). Could you explain why?
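For illustration, a minimal sketch (not the repository's collate_fn; the pad id below is an assumption) of why the batch has to be padded to one length before either torch.tensor() or torch.stack() can build it:

import torch

input_ids = [[5, 6, 7], [5, 6]]                     # ragged batch: torch.tensor() would raise ValueError here
pad_id = 3                                          # assumed pad token id, for illustration only
max_len = max(len(x) for x in input_ids)
padded = [x + [pad_id] * (max_len - len(x)) for x in input_ids]
batch = torch.tensor(padded).long()                 # works once every row has the same length
print(batch.shape)                                  # torch.Size([2, 3])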
During training I keep seeing what looks like a gradient-overflow warning. Could someone explain the cause and how to fix it?
Rank 0 Skipping step. Attempted loss scale: 65536, reducing to 32768
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.
Can training be run on a 22 GB or even 20 GB device?
[[1., 0., 0.],
 [1., 1., 0.],
 [1., 1., 1.]]
Something like that.
So isn't attention_mask = (attention_mask < 0.5).bool() redundant?
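As far as I can tell it is not redundant: the float mask above uses 1.0 for positions that may be attended to, while the boolean mask the model's attention expects uses True for positions that are blocked, so the comparison flips the meaning as well as the dtype. A small illustration (not the repository's code):

import torch

attention_mask = torch.tril(torch.ones(3, 3))       # 1.0 on and below the diagonal, as in the matrix above
masked = (attention_mask < 0.5).bool()               # True exactly where attention should be blocked
print(masked)
# tensor([[False,  True,  True],
#         [False, False,  True],
#         [False, False, False]])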
peft 0.3.0 added an adapter_name argument to LoraModel; what should this argument be set to?
Does this project use the 2D position encoding?
But doesn't ChatGLM use rotary position encoding?
@lich99
position_ids.append(torch.stack([torch.arange(0, _max_length, device=device),
torch.concat([torch.zeros(context_length - 1, device=device),
torch.arange(0, _max_length - context_length + 1, device=device)])]).long())
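For reference, a toy illustration of the snippet above (the values of context_length and _max_length are chosen only for illustration). As far as I understand, ChatGLM-6b applies rotary embeddings on top of these two rows, so the 2D position ids and rotary position encoding are used together rather than being alternatives:

import torch

context_length, _max_length = 4, 6                   # assumed toy values
position_ids = torch.stack([
    torch.arange(0, _max_length),
    torch.concat([torch.zeros(context_length - 1),
                  torch.arange(0, _max_length - context_length + 1)]),
]).long()
print(position_ids)
# tensor([[0, 1, 2, 3, 4, 5],
#         [0, 0, 0, 0, 1, 2]])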
I'm a student trying your code on Colab, but the GPU only has 15 GB, so it can't hold this large model; I switched the model path to the quantized chatGLM-6b-int4 model instead.
However, when running train.py, line 215 outputs = model(**batch) raises: self and mat2 must have same dtype.
Is this because the quantized model's parameters don't match the input dtype? Can fine-tuning be done on the int4-quantized model at all?
If it's convenient, could you write a demo of fine-tuning the quantized model? Many thanks!
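A hypothetical diagnostic sketch (not part of the repository): list the distinct parameter dtypes of the loaded model to confirm whether the quantized weights are being mixed with fp16/fp32 LoRA layers, which is one common way to end up with "self and mat2 must have same dtype":

from collections import Counter

# count how many parameters the model holds in each dtype
dtype_counts = Counter(p.dtype for p in model.parameters())
print(dtype_counts)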
Roughly how long does the LoRA training take?
When debugging with a single sample, I found that the gradient of the LoRA_A matrix stays 0 and it is never updated.
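A hypothetical first check (not from the repository): confirm that the LoRA parameters are actually marked trainable before looking at their gradients:

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(len(trainable))            # should be non-zero and contain only the LoRA weights
print(trainable[:5])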
The documentation says: Try ZeRO 2 and no offload first, unless you encounter OOM. ZeRO 2 (no offload) > ZeRO 2 (offload) > ZeRO 3 (no offload) > ZeRO 3 (offload).
Why is ZeRO 2 the recommendation? Is it because ZeRO 3 and ZeRO 3 with offload, although they use GPU memory more efficiently, sacrifice some computation precision and therefore hurt the training results?
Thanks for the reply!
The model loads normally, but an error occurs after a prompt is sent.
Loading checkpoint shards: 100%|█████████████████████████████████████████████| 7/7 [00:05<00:00, 1.18it/s]
Setting eos_token is not supported, use the default one.
Setting pad_token is not supported, use the default one.
Setting unk_token is not supported, use the default one.
binary_path: F:\Anaconda\envs\chatglm\lib\site-packages\bitsandbytes\cuda_setup\libbitsandbytes_cuda116.dll
CUDA SETUP: Loading binary F:\Anaconda\envs\chatglm\lib\site-packages\bitsandbytes\cuda_setup\libbitsandbytes_cuda116.dll...
Setting eos_token is not supported, use the default one.
Setting pad_token is not supported, use the default one.
Setting unk_token is not supported, use the default one.
The error message is as follows:
┌─────────────────────────────── Traceback (most recent call last) ────────────── ──────────────────┐
│ F:\ChatGLM\chatglm3_6b_finetune\inference_hf.py:51 in main │
│ │
│ 48 │ │ prompt: Annotated[str, typer.Option(help='')], │
│ 49 ): │
│ 50 │ model, tokenizer = load_model_and_tokenizer(model_dir) │
│ > 51 │ response, _ = model.chat(tokenizer, prompt) │
│ 52 │ print(response) │
│ 53 │
│ 54 │
│ │
│ F:\Anaconda\envs\chatglm\lib\site-packages\torch\autograd\grad_mode.py:27 in decorate_context │
│ │
│ 24 │ │ @functools.wraps(func) │
│ 25 │ │ def decorate_context(*args, **kwargs): │
│ 26 │ │ │ with self.clone(): │
│ > 27 │ │ │ │ return func(*args, **kwargs) │
│ 28 │ │ return cast(F, decorate_context) │
│ 29 │ │
│ 30 │ def _wrap_generator(self, func): │
│ │
│ C:\Users\Administrator.cache\huggingface\modules\transformers_modules\chatglm3-6b\modeling_chat │
│ glm.py:1042 in chat │
│ │
│ 1039 │ │ inputs = inputs.to(self.device) │
│ 1040 │ │ eos_token_id = [tokenizer.eos_token_id, tokenizer.get_command("<|user|>"), │
│ 1041 │ │ │ │ │ │ tokenizer.get_command("<|observation|>")] │
│ > 1042 │ │ outputs = self.generate(**inputs, **gen_kwargs, eos_token_id=eos_token_id) │
│ 1043 │ │ outputs = outputs.tolist()[0][len(inputs["input_ids"][0]):-1] │
│ 1044 │ │ response = tokenizer.decode(outputs) │
│ 1045 │ │ history.append({"role": role, "content": query}) │
│ │
│ F:\Anaconda\envs\chatglm\lib\site-packages\torch\autograd\grad_mode.py:27 in decorate_context │
│ │
│ 24 │ │ @functools.wraps(func) │
│ 25 │ │ def decorate_context(*args, **kwargs): │
│ 26 │ │ │ with self.clone(): │
│ > 27 │ │ │ │ return func(*args, **kwargs) │
│ 28 │ │ return cast(F, decorate_context) │
│ 29 │ │
│ 30 │ def _wrap_generator(self, func): │
│ │
│ F:\Anaconda\envs\chatglm\lib\site-packages\transformers\generation\utils.py:1575 in generate │
│ │
│ 1572 │ │ │ ) │
│ 1573 │ │ │ │
│ 1574 │ │ │ # 13. run sample │
│ > 1575 │ │ │ result = self._sample( │
│ 1576 │ │ │ │ input_ids, │
│ 1577 │ │ │ │ logits_processor=prepared_logits_processor, │
│ 1578 │ │ │ │ logits_warper=logits_warper, │
│ │
│ F:\Anaconda\envs\chatglm\lib\site-packages\transformers\generation\utils.py:2697 in _sample │
│ │
│ 2694 │ │ │ model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs) │
│ 2695 │ │ │ │
│ 2696 │ │ │ # forward pass to get next token │
│ > 2697 │ │ │ outputs = self( │
│ 2698 │ │ │ │ **model_inputs, │
│ 2699 │ │ │ │ return_dict=True, │
│ 2700 │ │ │ │ output_attentions=output_attentions, │
│ │
│ F:\Anaconda\envs\chatglm\lib\site-packages\torch\nn\modules\module.py:1130 in _call_impl │
│ │
│ 1127 │ │ # this function, and just call forward. │
│ 1128 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1129 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ > 1130 │ │ │ return forward_call(*input, **kwargs) │
│ 1131 │ │ # Do not call functions when jit is used │
│ 1132 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1133 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ C:\Users\Administrator.cache\huggingface\modules\transformers_modules\chatglm3-6b\modeling_chat │
│ glm.py:941 in forward │
│ │
│ 938 │ │ use_cache = use_cache if use_cache is not None else self.config.use_cache │
│ 939 │ │ return_dict = return_dict if return_dict is not None else self.config.use_return │
│ 940 │ │ │
│ > 941 │ │ transformer_outputs = self.transformer( │
│ 942 │ │ │ input_ids=input_ids, │
│ 943 │ │ │ position_ids=position_ids, │
│ 944 │ │ │ attention_mask=attention_mask, │
│ │
│ F:\Anaconda\envs\chatglm\lib\site-packages\torch\nn\modules\module.py:1130 in _call_impl │
│ │
│ 1127 │ │ # this function, and just call forward. │
│ 1128 │ │ if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks o │
│ 1129 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ > 1130 │ │ │ return forward_call(*input, **kwargs) │
│ 1131 │ │ # Do not call functions when jit is used │
│ 1132 │ │ full_backward_hooks, non_full_backward_hooks = [], [] │
│ 1133 │ │ if self._backward_hooks or _global_backward_hooks: │
│ │
│ C:\Users\Administrator.cache\huggingface\modules\transformers_modules\chatglm3-6b\modeling_chat │
│ glm.py:822 in forward │
│ │
│ 819 │ │ │ │ │ │ │ │ │ │ │ attention_mask], dim=-1) │
│ 820 │ │ │
│ 821 │ │ if full_attention_mask is None: │
│ > 822 │ │ │ if (attention_mask is not None and not attention_mask.all()) or (past_key_va │
│ 823 │ │ │ │ full_attention_mask = self.get_masks(input_ids, past_key_values, padding │
│ 824 │ │ │
│ 825 │ │ # Rotary positional embeddings │
└──────────────────────────────────────────────────────────────────────────────────────────────────┘
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
The environment is as follows:
absl-py==2.1.0
accelerate==0.27.2
aiofiles==23.2.1
aiohttp==3.9.3
aiosignal==1.3.1
altair==5.2.0
annotated-types==0.6.0
antlr4-python3-runtime==4.9.3
anyio==4.3.0
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arxiv==2.1.0
async-timeout==4.0.3
attrs==23.2.0
azure-core==1.30.1
azure-storage-blob==12.19.1
backoff==2.2.1
beautifulsoup4==4.12.3
bitsandbytes==0.37.1
bitsandbytes-windows==0.37.5
blinker==1.7.0
blis==0.7.11
Brotli==1.1.0
cachetools==5.3.3
catalogue==2.0.10
certifi==2024.2.2
cffi==1.16.0
chardet==5.2.0
charset-normalizer==3.3.2
click==8.1.7
cloudpathlib==0.16.0
colorama==0.4.6
coloredlogs==15.0.1
confection==0.1.4
contourpy==1.2.0
cpm-kernels==1.0.11
cryptography==42.0.5
curl_cffi==0.6.2
cycler==0.12.1
cymem==2.0.8
dashscope==1.13.6
dataclasses-json==0.6.4
datasets==2.18.0
deepdiff==6.7.1
Deprecated==1.2.14
deprecation==2.1.0
dill==0.3.8
distro==1.9.0
duckduckgo_search==5.1.0
effdet==0.4.1
einops==0.7.0
emoji==2.10.1
environs==9.5.0
et-xmlfile==1.1.0
exceptiongroup==1.2.0
faiss-cpu==1.7.4
fake-useragent==1.5.1
fastapi==0.109.0
feedparser==6.0.10
ffmpy==0.3.2
filelock==3.13.1
filetype==1.2.0
flatbuffers==24.3.7
fonttools==4.49.0
frozenlist==1.4.1
fschat==0.2.35
fsspec==2024.2.0
gitdb==4.0.11
GitPython==3.1.42
google-auth==2.29.0
google-auth-oauthlib==0.4.6
gradio==3.50.0
gradio_client==0.6.1
greenlet==3.0.3
grpcio==1.60.0
h11==0.14.0
h2==4.1.0
hpack==4.0.0
httpcore==1.0.4
httpx==0.27.0
httpx-sse==0.4.0
huggingface-hub==0.21.4
humanfriendly==10.0
hyperframe==6.0.1
idna==3.4
imageio==2.34.0
importlib_metadata==7.0.2
importlib_resources==6.3.1
iniconfig==2.0.0
iopath==0.1.10
isodate==0.6.1
jieba==0.42.1
Jinja2==3.1.3
joblib==1.3.2
jsonpatch==1.33
jsonpath-python==1.0.6
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
kiwisolver==1.4.5
langchain==0.0.354
langchain-community==0.0.20
langchain-core==0.1.23
langchain-experimental==0.0.47
langcodes==3.3.0
langdetect==1.0.9
langsmith==0.0.87
latex2mathml==3.77.0
layoutparser==0.3.4
lazy_loader==0.3
llama-index==0.9.35
loguru==0.7.2
lxml==5.1.0
Markdown==3.5.2
markdown-it-py==3.0.0
markdown2==2.4.13
markdownify==0.11.6
MarkupSafe==2.1.5
marshmallow==3.21.1
matplotlib==3.8.3
mdtex2html==1.3.0
mdurl==0.1.2
metaphor-python==0.1.23
minio==7.2.5
mkl-fft==1.3.8
mkl-random==1.2.4
mkl-service==2.4.0
mpmath==1.3.0
msg-parser==1.2.0
multidict==6.0.5
multiprocess==0.70.16
murmurhash==1.0.10
mypy-extensions==1.0.0
nest-asyncio==1.6.0
networkx==3.2.1
nh3==0.2.15
nltk==3.8.1
numexpr==2.8.6
numpy==1.24.4
oauthlib==3.2.2
olefile==0.47
omegaconf==2.3.0
onnx==1.15.0
onnxruntime==1.15.1
openai==1.9.0
opencv-python==4.9.0.80
openpyxl==3.1.2
ordered-set==4.1.0
orjson==3.9.15
packaging==23.2
pandas==2.0.3
pathlib==1.0.1
pdf2image==1.17.0
pdfminer.six==20231228
pdfplumber==0.11.0
peft==0.9.0
pikepdf==8.4.1
Pillow==9.5.0
pillow_heif==0.15.0
pip==23.3.1
pluggy==1.4.0
portalocker==2.8.2
preshed==3.0.9
prompt-toolkit==3.0.43
protobuf==3.20.3
psutil==5.9.8
pyarrow==15.0.1
pyarrow-hotfix==0.6
pyasn1==0.6.0
pyasn1_modules==0.4.0
pyclipper==1.3.0.post5
pycocotools==2.0.7
pycparser==2.21
pycryptodome==3.20.0
pydantic==1.10.13
pydantic_core==2.16.3
pydash==7.0.7
pydeck==0.8.1b0
pydub==0.25.1
PyExecJS==1.5.1
Pygments==2.17.2
PyJWT==2.8.0
pymilvus==2.4.0
PyMuPDF==1.23.16
PyMuPDFb==1.23.9
pypandoc==1.13
pyparsing==3.1.2
pypdf==4.1.0
pypdfium2==4.28.0
pyreadline3==3.4.1
pytesseract==0.3.10
pytest==7.4.3
python-dateutil==2.9.0.post0
python-decouple==3.8
python-docx==1.1.0
python-dotenv==1.0.1
python-iso639==2024.2.7
python-magic==0.4.27
python-magic-bin==0.4.14
python-multipart==0.0.9
python-pptx==0.6.23
pytz==2024.1
pywencai==0.12.2
pywin32==306
PyYAML==6.0.1
rapidfuzz==3.6.2
rapidocr-onnxruntime==1.3.8
referencing==0.33.0
regex==2023.12.25
requests==2.31.0
requests-oauthlib==2.0.0
rich==13.7.1
rouge-chinese==1.0.3
rpds-py==0.18.0
rsa==4.9
ruamel.yaml==0.18.6
ruamel.yaml.clib==0.2.8
ruff==0.3.3
safetensors==0.4.2
scikit-image==0.22.0
scikit-learn==1.4.1.post1
scipy==1.12.0
semantic-version==2.10.0
sentence-transformers==2.2.2
sentencepiece==0.2.0
setuptools==68.2.2
sgmllib3k==1.0.0
shapely==2.0.3
shellingham==1.5.4
shortuuid==1.0.13
simplejson==3.19.2
six==1.16.0
smart-open==6.4.0
smmap==5.0.1
sniffio==1.3.1
socksio==1.0.0
soupsieve==2.5
spacy==3.7.2
spacy-legacy==3.0.12
spacy-loggers==1.0.5
SQLAlchemy==2.0.19
srsly==2.4.8
sse-starlette==1.8.2
starlette==0.35.0
streamlit==1.30.0
streamlit-aggrid==0.3.4.post3
streamlit-antd-components==0.3.1
streamlit-chatbox==1.1.11
streamlit-feedback==0.1.3
streamlit-modal==0.1.0
streamlit-option-menu==0.3.12
strsimpy==0.2.1
svgwrite==1.4.3
sympy==1.12
tabulate==0.9.0
tenacity==8.2.3
tensorboard==2.10.0
tensorboard-data-server==0.6.1
tensorboard-plugin-wit==1.8.1
text2vec==1.2.9
thinc==8.2.3
threadpoolctl==3.3.0
tifffile==2024.2.12
tiktoken==0.5.2
timm==0.9.16
tokenizers==0.15.2
toml==0.10.2
tomli==2.0.1
tomlkit==0.12.0
toolz==0.12.1
torch==1.12.0+cu113
torchaudio==0.12.0+cu113
torchvision==0.13.0+cu113
tornado==6.4
tqdm==4.66.1
transformers==4.39.3
transformers-stream-generator==0.0.4
typer==0.9.0
typing_extensions==4.10.0
typing-inspect==0.9.0
tzdata==2024.1
tzlocal==5.2
ujson==5.9.0
unstructured==0.11.0
unstructured-client==0.22.0
unstructured-inference==0.7.15
unstructured.pytesseract==0.3.12
urllib3==2.1.0
uvicorn==0.28.0
validators==0.22.0
visdom==0.2.4
wasabi==1.1.2
watchdog==3.0.0
wavedrom==2.0.3.post3
wcwidth==0.2.13
weasel==0.3.4
websocket-client==1.7.0
websockets==12.0
Werkzeug==3.0.2
wheel==0.41.2
win32-setctime==1.1.0
wrapt==1.16.0
xformers==0.0.23.post1
xlrd==2.0.1
XlsxWriter==3.2.0
xxhash==3.4.1
yarl==1.9.4
youtube-search==2.1.2
zipp==3.18.0
Is there an example command and YAML config file for launching train.py with accelerate?
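A minimal sketch of what this might look like, assuming a single machine with two GPUs; the file name and every field value below are assumptions (in practice the file is usually generated interactively with accelerate config, and this repository's own training command points at config/default_config.yaml, as seen further down):

# config/accelerate_example.yaml  -- hypothetical minimal config
compute_environment: LOCAL_MACHINE
distributed_type: MULTI_GPU
mixed_precision: fp16
num_machines: 1
num_processes: 2        # one process per visible GPU
main_process_port: 29500

# launch command
accelerate launch --config_file config/accelerate_example.yaml train.py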
The preceding code is unchanged; model(**batch).loss was changed to:

for i in range(10):
    output = model(**batch)
    loss = output.loss
    loss.backward()
    optimizer.step()
    lr_scheduler.step()
    optimizer.zero_grad()
    print(loss.detach().float())
Output:
tensor(3.2207, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
tensor(nan, device='cuda:0')
Only in the first step is the loss computed normally. Why is that?
The lr is set to 1e-8.
lr_scheduler = get_linear_schedule_with_warmup(
    optimizer=optimizer,
    num_warmup_steps=int(len(train_dataloader) / accumulate_step),
    num_training_steps=(int(len(train_dataloader) / accumulate_step) * NUM_EPOCHS),
)
Running accelerate launch --config_file config/default_config.yaml train_new.py produces the following error:
Traceback (most recent call last):
File "/home/searchgpt/yq/ChatGLM-finetune-LoRA/train_new.py", line 38, in <module>
accelerator = Accelerator(mixed_precision=mixed_precision, gradient_accumulation_steps=accumulate_step, deepspeed_plugin=deepspeed_plugin)
File "/home/searchgpt/anaconda3/envs/stanford_alpaca/lib/python3.10/site-packages/accelerate/accelerator.py", line 340, in __init__
self.state = AcceleratorState(
File "/home/searchgpt/anaconda3/envs/stanford_alpaca/lib/python3.10/site-packages/accelerate/state.py", line 539, in __init__
PartialState(cpu, **kwargs)
File "/home/searchgpt/anaconda3/envs/stanford_alpaca/lib/python3.10/site-packages/accelerate/state.py", line 123, in __init__
torch.cuda.set_device(self.device)
File "/home/searchgpt/anaconda3/envs/stanford_alpaca/lib/python3.10/site-packages/torch/cuda/__init__.py", line 350, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: CUDA error: invalid device ordinal
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
The environment is a single machine with 2 A100 GPUs.
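"invalid device ordinal" usually means the launcher is asking for more GPUs than the process can see. A hypothetical quick check (not part of the repository): compare the device count PyTorch reports with the num_processes set in config/default_config.yaml and with any CUDA_VISIBLE_DEVICES setting:

import torch

print(torch.cuda.device_count())   # should be >= num_processes configured for accelerate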
After backpropagation the loss becomes NaN.
accumulate_step is set to 16.
The system CUDA is 10.2; the CUDA bundled with PyTorch is 11.6.
The code (single GPU) is as follows:
import os
import tqdm
import json
import torch
import loralib as lora
import lora_utils.insert_lora
import dataset.GLM as GLM_Data
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModel
# from accelerate import Accelerator, DeepSpeedPlugin
from transformers import get_linear_schedule_with_warmup
device = "cuda"
checkpoint = "THUDM/chatglm-6b"
# mixed_precision = 'bf16'
lora_config = {
    'r': 32,
    'lora_alpha': 32,
    'lora_dropout': 0.1,
    'enable_lora': [True, True, True],
}
max_length = 256
LR = 2e-5
NUM_EPOCHS = 2
batch_size = 4
accumulate_step = 8
warm_up_ratio = 0.1
tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True, revision = 'main')
model = AutoModel.from_pretrained(checkpoint, trust_remote_code=True, revision = 'main')
model = lora_utils.insert_lora.get_lora_model(model, lora_config)
import dataset.Alpaca as Alpaca_Data
pairs = Alpaca_Data.load('./data/alpaca_data.json')
GLM_Data.device = device
pairs_encoded = GLM_Data.encode_pairs(pairs, tokenizer)
pairs_encoded = list(filter(lambda pair: len(pair['prompt'])+len(pair['completion']) <= max_length, pairs_encoded))
train_dataset = GLM_Data.GLMDataset(pairs_encoded)
train_dataloader = DataLoader(dataset=train_dataset, collate_fn = GLM_Data.collate_fn, shuffle=True, batch_size=batch_size)
optimizer = torch.optim.AdamW(model.parameters(), lr=LR)
lr_scheduler = get_linear_schedule_with_warmup(
    optimizer=optimizer,
    num_warmup_steps=int(len(train_dataloader) / accumulate_step * warm_up_ratio),
    num_training_steps=(int(len(train_dataloader) / accumulate_step) * NUM_EPOCHS),
)
# model, optimizer, train_dataloader = accelerator.prepare(model, optimizer, train_dataloader)
# model.to(device).train()
from torch.cuda.amp import autocast
LR = 2e-5
NUM_EPOCHS = 2
accumulate_step = 16
version = 'test'
optimizer = torch.optim.AdamW(model.parameters(), lr=LR)
lr_scheduler = get_linear_schedule_with_warmup(
    optimizer=optimizer,
    num_warmup_steps=int(len(train_dataloader) / accumulate_step),
    num_training_steps=(int(len(train_dataloader) / accumulate_step) * NUM_EPOCHS),
)
model.half().to(device).train()
for epoch in range(NUM_EPOCHS):
    epoch_loss_local = 0
    for step, batch in enumerate(t := tqdm.tqdm(train_dataloader)):
        batch = {k: v.to('cuda') for k, v in batch.items()}
        outputs = model(**batch)
        loss_d = outputs.loss.detach()
        epoch_loss_local += loss_d
        t.set_description(f"loss: {epoch_loss_local.cpu().float() / step}")
        loss = outputs.loss / accumulate_step
        loss.backward()
        if (step + 1) % accumulate_step == 0:
            optimizer.step()
            lr_scheduler.step()
            optimizer.zero_grad()
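Running a 6B model with model.half() and a plain fp16 AdamW is a common way to end up with NaN losses. For comparison, a minimal sketch (an assumption, not this repository's method, and it needs considerably more memory because the master weights stay in fp32) of the same loop using torch.cuda.amp autocast plus GradScaler, which skips the update when gradients overflow instead of propagating NaN:

from torch.cuda.amp import autocast, GradScaler

scaler = GradScaler()
model.float().to(device).train()                     # keep fp32 master weights

for epoch in range(NUM_EPOCHS):
    for step, batch in enumerate(train_dataloader):
        batch = {k: v.to(device) for k, v in batch.items()}
        with autocast(dtype=torch.float16):          # forward pass runs in fp16
            loss = model(**batch).loss / accumulate_step
        scaler.scale(loss).backward()                # scale the loss before backward
        if (step + 1) % accumulate_step == 0:
            scaler.step(optimizer)                   # unscales grads, skips the step on inf/NaN
            scaler.update()
            lr_scheduler.step()
            optimizer.zero_grad()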
I finished training with train.py and got the file ./saved/finetune_0.pt, but since there is no inference part, I used the final inference cell of LoRA_finetune_with_stanford_alpaca.ipynb and got this error:
Traceback (most recent call last):
File "inference.py", line 49, in <module>
module.query_key_value = peft.tuners.lora.LoraModel(config, module.query_key_value)
File "C:\Users\xxxx\AppData\Local\Programs\Python\Python38\lib\site-packages\peft\tuners\lora.py", line 118, in __init__
self._find_and_replace()
File "C:\Users\xxxx\AppData\Local\Programs\Python\Python38\lib\site-packages\peft\tuners\lora.py", line 181, in _find_and_replace
raise ValueError(
ValueError: Target modules ['q', 'k', 'v'] not found in the base model. Please check the target modules and try again.
The LoraConfig used is (the target_modules are identical in the ipynb and train.py):
config = LoraConfig(
    peft_type="LORA",
    r=32,
    lora_alpha=32,
    target_modules=["q", "k", "v"],
    lora_dropout=0.1,
)
Why does this error occur? I've hit it before but couldn't find anything by searching... I wonder whether anyone else can get inference to run.
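A hypothetical way to check which target_modules actually exist (not part of the repository): list the model's module names and look for the attention projection layers; in ChatGLM-6b the query/key/value projection is a single fused layer rather than separate q, k and v modules:

for name, module in model.named_modules():
    if 'query_key_value' in name:                    # assumed name of the fused QKV projection
        print(name, type(module).__name__)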
No such file or directory: '/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/tokenization_chatglm.py'
How much GPU memory does training this model need at minimum? I have a V100 with 32 GB.
What GPU is everyone training on? A V100 can't handle it at all; it keeps running into CUDA out of memory.
WARNING:torch.distributed.run:
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision. (the warning is repeated once per process, for both the model and the configuration)

Where does 130004 come from, and why is this parameter set?
What is this while cnt < retry_cnt: loop doing?
Do I just run python3 train.py directly? Are there any arguments that need to be set?
How does this training support distributed multi-GPU training? Is there a parameter that lets one model be trained in parallel across GPUs? I only know that changing num_processes trains a separate copy of the model on each GPU, but can a single model be trained across multiple GPUs? My card only has 16 GB.
As the title says.
I fine-tuned on the "who are you" data, but the output is still the original model's answer. Loading the newly fine-tuned .pt model should be fine, right? As follows:
tokenizer = AutoTokenizer.from_pretrained("ChatGLM-6B/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("ChatGLM-6B/chatglm-6b", trust_remote_code=True).half().cuda()
peft_path = "ChatGLM-finetune-LoRA/saved/finetune_test/finetune_test_epoch_2.pt"
model.load_state_dict(torch.load(peft_path), strict=False)
model.eval()
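A hypothetical sanity check (not from the repository): strict=False silently ignores keys that don't match, so it's worth inspecting the return value to confirm the LoRA weights were actually applied (the LoRA layers also have to be inserted into the model before loading, otherwise every checkpoint key ends up unexpected):

result = model.load_state_dict(torch.load(peft_path), strict=False)
print(result.unexpected_keys)   # should be empty if every saved LoRA weight found its target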
I want to fine-tune with multiple GPUs on very long text (over 1024 tokens). Four cards can barely handle a length of 1024, so I'd like to fine-tune with 8 cards, but running on more than 4 cards always errors out. Has anyone tried fine-tuning with 8 cards? What is the longest text length that can be supported?
How can lora_finetune_with_stanford_alpaca.ipynb be changed directly to multi-GPU training? DeepSpeed doesn't support Windows. Also, how much does DeepSpeed improve fine-tuning?
'SelfAttention' object has no attribute 'query_key_valu — does anyone know how to fix this?
I tried increasing NUM_EPOCHS, but the total loss never changes.
How do you load the saved chatglm-6b_alpaca_5.pt file into ChatGLM?