Comments (5)

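charent commented on July 29, 2024

Please post your training parameters (learning rate, optimizer, and so on).

Also, please check that your training data was processed correctly. An output of [(,,)] looks as if the triple information was lost.

Calling bot.chat should not output [EOS]. Did you make any other changes?

The complete fine-tuning code is here: finetune_IE_task.ipynb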
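One rough way to run the data check suggested above is to scan the target strings for triples whose three fields are all empty. This is only a sketch: the thread does not show the training file's schema, so the helper works on plain strings, and the name `find_empty_triples` and the sample data are hypothetical.

```python
import re

# Matches a parenthesised triple whose three comma-separated fields are all
# empty, e.g. "(,,)" or "( , , )". ASCII and full-width punctuation both match.
EMPTY_TRIPLE = re.compile(r"[(（]\s*[,，]\s*[,，]\s*[)）]")


def find_empty_triples(targets):
    """Return the indices of target strings that contain an empty triple."""
    return [i for i, text in enumerate(targets) if EMPTY_TRIPLE.search(text)]


# Hypothetical target strings, not taken from the real dataset.
samples = [
    "[(Zhang San,born in,Beijing)]",
    "[(,,)]",
    "[(Li Si,works at,ACME),( , , )]",
]
print(find_empty_triples(samples))  # → [1, 2]
```

If the real targets use a different triple layout, the regular expression would need to be adapted accordingly.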

from chatlm-mini-chinese.

qiutzh commented on July 29, 2024

Hello. The code I am running matches the main branch; I only changed the base path of the T5 model (google-t5/t5-base).

The parameters are as follows:

SFTconfig(max_seq_len=512, tokenizer_dir='/home/xxx/mycode/demo2/model_save/', sft_train_file='./data/my_train.json', batch_size=16, num_train_epochs=6, save_steps=3000, gradient_accumulation_steps=4, learning_rate=5e-05, logging_first_step=True, logging_steps=20, output_dir='./model_save/ie_task', warmup_steps=1000, fp16=True, seed=23333)

training_args = Seq2SeqTrainingArguments(
        output_dir=config.output_dir,
        per_device_train_batch_size=config.batch_size,
        auto_find_batch_size=True,  # avoid OOM
        gradient_accumulation_steps=config.gradient_accumulation_steps,
        learning_rate=config.learning_rate,
        logging_steps=config.logging_steps,
        num_train_epochs=config.num_train_epochs,
        optim="adafactor",
        report_to='tensorboard',
        log_level='info',
        save_steps=config.save_steps,
        save_total_limit=3,
        fp16=config.fp16,
        logging_first_step=config.logging_first_step,
        warmup_steps=config.warmup_steps,
        seed=config.seed,
        generation_config=generation_config,
    )
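
As a sanity check on the parameters above, the effective batch size and the approximate number of optimizer steps can be estimated by hand. This is a rough sketch, not the Trainer's exact accounting: the dataset size is not given in the thread, so `num_examples=10_000` is a made-up figure, the helper name `training_schedule` is hypothetical, and `auto_find_batch_size=True` may lower the per-device batch size at run time.

```python
import math


def training_schedule(num_examples, batch_size, grad_accum, epochs,
                      warmup_steps, num_devices=1):
    """Estimate effective batch size and total optimizer steps (approximate)."""
    effective_batch = batch_size * grad_accum * num_devices
    # Batches per epoch, then optimizer steps after gradient accumulation.
    batches_per_epoch = math.ceil(num_examples / (batch_size * num_devices))
    steps_per_epoch = max(batches_per_epoch // grad_accum, 1)
    total_steps = steps_per_epoch * epochs
    return {
        "effective_batch": effective_batch,
        "total_steps": total_steps,
        "warmup_exceeds_training": warmup_steps >= total_steps,
    }


# Values from the posted SFTconfig, except num_examples, which is assumed.
plan = training_schedule(num_examples=10_000, batch_size=16, grad_accum=4,
                         epochs=6, warmup_steps=1000)
print(plan)
# → {'effective_batch': 64, 'total_steps': 936, 'warmup_exceeds_training': True}
```

With a dataset of that assumed size, warmup would last longer than the whole run, so the learning rate would never reach 5e-05; whether this applies here depends on the real number of training examples.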

I found it. I had changed one line of the code myself:

def sft_train(config: SFTconfig) -> None:
    # step 1. load the tokenizer
    tokenizer = PreTrainedTokenizerFast.from_pretrained(config.tokenizer_dir)
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})  # the line I added
    ...

I added it because an earlier run raised this error:

ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as `pad_token` `(tokenizer.pad_token = tokenizer.eos_token e.g.)` or add a new pad token via `tokenizer.add_special_tokens({'pad_token': '[PAD]'})`

It looks like this is caused by a difference in package versions.

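qiutzh commented on July 29, 2024

I created a fresh virtual environment from requirements.txt and commented out the line tokenizer.add_special_tokens({'pad_token': '[PAD]'}), but the run still fails with the [PAD] error shown above. Very strange.

The error is raised at trainer = sft_train(config). Error log:
log_error_pad.txt

The training data format appears to be normal:
(image attachment)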

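charent commented on July 29, 2024

If you use the project's tokenizer, pad_token already exists and you do not need to add it yourself. If the dependencies are the same, the model files are most likely incomplete, so try downloading them again. From within China you can download them through modelscope. Remember to set tokenizer_dir to ./model_save, i.e. the tokenizer and the model must be in the same folder.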

from modelscope import snapshot_download
model_id = 'charent/ChatLM-mini-Chinese'
model_id = snapshot_download(model_id, cache_dir='./model_save')
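
After re-downloading, a quick check of the tokenizer directory can show whether the files are complete and whether a pad token is declared. This is a minimal sketch using only the standard library. The file list in `EXPECTED_FILES` is an assumption based on what a fast tokenizer saved by transformers typically contains, the helper name `check_tokenizer_dir` is hypothetical, and the real repository layout may differ.

```python
import json
from pathlib import Path

# Assumed file names; adjust to the actual contents of the downloaded model.
EXPECTED_FILES = ("tokenizer.json", "tokenizer_config.json",
                  "special_tokens_map.json")


def check_tokenizer_dir(tokenizer_dir):
    """Return a list of human-readable problems found in tokenizer_dir."""
    root = Path(tokenizer_dir)
    problems = [f"missing file: {name}" for name in EXPECTED_FILES
                if not (root / name).is_file()]
    special_map = root / "special_tokens_map.json"
    if special_map.is_file():
        mapping = json.loads(special_map.read_text(encoding="utf-8"))
        if "pad_token" not in mapping:
            problems.append("special_tokens_map.json has no pad_token entry")
    return problems


# Point this at the directory used as tokenizer_dir (for example the path
# returned by snapshot_download); an empty list means nothing was flagged.
print(check_tokenizer_dir("./model_save"))
```

An empty result does not prove the model weights themselves are intact; it only covers the tokenizer files named above.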

qiutzh commented on July 29, 2024

OK, understood. Thank you.
