
llm-quickstart's Issues

Importing AutoAWQForCausalLM from awq fails: cannot import name 'MoeModelOutputWithPast' from 'transformers.modeling_outputs'

ImportError                               Traceback (most recent call last)
[<ipython-input-9-10f3d88ac51c>](https://localhost:8080/#) in <cell line: 1>()
----> 1 from awq import AutoAWQForCausalLM
      2 from transformers import AutoTokenizer
      3 
      4 model_path = 'facebook/opt-6.7b'
      5 quant_path = "/Content/drive/models/opt-6.7b-awq"

3 frames
[/content/AutoAWQ/awq/modules/fused/model.py](https://localhost:8080/#) in <module>
      3 from typing import List
      4 from awq.utils import fused_utils
----> 5 from transformers.modeling_outputs import BaseModelOutputWithPast, MoeModelOutputWithPast
      6 from awq.modules.fused.block import MPTBlock, FalconDecoderLayer, LlamaLikeBlock, MixtralBlock
      7 

ImportError: cannot import name 'MoeModelOutputWithPast' from 'transformers.modeling_outputs' (/usr/local/lib/python3.10/dist-packages/transformers/modeling_outputs.py)
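
A likely cause is a transformers version that predates MoeModelOutputWithPast, which was added together with Mixtral support (roughly transformers 4.36). A minimal check, assuming the failure is such a version mismatch:

import transformers
print(transformers.__version__)

try:
    # The symbol AutoAWQ expects; older transformers builds do not export it.
    from transformers.modeling_outputs import MoeModelOutputWithPast  # noqa: F401
    print("MoeModelOutputWithPast is available; the awq import should work.")
except ImportError:
    # Upgrading usually resolves it; restart the Colab runtime after installing:
    #   pip install -U "transformers>=4.36.0"
    print("transformers is too old for this AutoAWQ build; upgrade and restart.")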

【fine-tune-quickstart.ipynb】Post-training accuracy changes with the size of the evaluation set

Assignment: fine-tune-quickstart.ipynb

Due to time constraints I trained on only 50,000 samples.
After training, running trainer.evaluate() on 100 examples from the original test set gave the following result:

{'eval_loss': 1.2431399822235107,
'eval_accuracy': 0.57,
'eval_runtime': 1.7855,
'eval_samples_per_second': 56.007,
'eval_steps_per_second': 7.281}

Evaluating again on 1,000 examples gave a different result:

{'eval_loss': 1.016939640045166,
'eval_accuracy': 0.64,
'eval_runtime': 15.7258,
'eval_samples_per_second': 63.59,
'eval_steps_per_second': 7.949}

GPT's explanation was:

  • Dataset size: the evaluation set grew from the earlier, smaller one to 1,000 samples, so the model is evaluated on more data, which may reflect its performance more comprehensively.
  • Model improvement: between the two evaluations the model may have been further improved or tuned, making it perform better on the new evaluation set.
  • Data bias: the two evaluation sets may differ in sample distribution, data quality, and so on, leading to different results.
  • Randomness: the evaluation process involves some randomness, e.g. which samples are drawn and how parameters are initialized, so part of the difference between the two results may simply be due to chance.

Code:

https://github.com/simson2010/LLM-quickstart/blob/feature/homework/homework/fine-tune-quickstart.ipynb

Question

After fine-tuning on the same dataset, evaluating with different numbers of samples gives noticeably different results: with 100 samples the accuracy is 0.57, but with 1,000 samples it rises to 0.64. How should this be understood, and when evaluating like this, how do I decide how much evaluation data is enough?
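
One way to sanity-check the gap is to look at how wide the sampling error of an accuracy estimate is at these sample sizes. A back-of-the-envelope calculation (not code from the notebook), treating each test example as an independent Bernoulli trial:

import math

def acc_conf_halfwidth(acc: float, n: int, z: float = 1.96) -> float:
    # Half-width of an approximate 95% confidence interval for an accuracy estimate.
    return z * math.sqrt(acc * (1.0 - acc) / n)

for acc, n in [(0.57, 100), (0.64, 1000)]:
    print(f"n={n}: acc={acc:.2f} +/- {acc_conf_halfwidth(acc, n):.3f}")

With only 100 examples the interval is roughly ±0.10, and with 1,000 it is about ±0.03, so 0.57 and 0.64 are statistically compatible. In practice, evaluating on the full held-out test set (or at least enough examples that the interval is narrower than the differences you care about) gives a more stable number.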

【AWQ_transformers.ipynb】Error when downloading the facebook/opt-6.7b model locally and then loading it

Assignment: AWQ_transformers.ipynb

Goal:

Because my network connection is unstable, I planned to download the "facebook/opt-6.7b" model to local disk first, then load it from there for quantization.

The code is as follows:

# Download the model
(screenshot of the download code)

# Load the model
(screenshot of the loading code)

The error was:

(screenshot of the error message)

On inspection, the files saved locally were:
(screenshot of the local directory contents)

The layout is completely different from the files in the official Hub repo; switching to reading the already-downloaded cache files directly instead works without any problem.

Question

Does this mean that the workflow of downloading the original model from HF with transformers' from_pretrained, saving it with save_pretrained, and then loading it again with from_pretrained from the local path has limitations?

Why does this problem occur?
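
For comparison, a sketch of the two usual ways to keep a local copy, assuming the goal is just to avoid re-downloading; the paths below are illustrative, not taken from the notebook:

from huggingface_hub import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "facebook/opt-6.7b"
mirror_dir = "./models/opt-6.7b"         # hypothetical local paths
saved_dir = "./models/opt-6.7b-saved"

# Option A: mirror the repo exactly as it is laid out on the Hub.
# The original file names are kept, so loaders that expect the Hub layout
# (including AutoAWQForCausalLM.from_pretrained) can point at mirror_dir directly.
snapshot_download(repo_id=model_id, local_dir=mirror_dir)

# Option B: load once, then re-serialize with save_pretrained.
# This writes transformers' own format (config.json plus sharded weight files),
# which can look different from the original Hub repo but is still loadable
# with from_pretrained(saved_dir).
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model.save_pretrained(saved_dir)
tokenizer.save_pretrained(saved_dir)

reloaded = AutoModelForCausalLM.from_pretrained(saved_dir)

If the save_pretrained output looks completely different from the Hub files, that is expected: it is transformers' own serialization rather than a byte-for-byte copy, and third-party loaders that expect the Hub layout may not accept it, whereas a snapshot_download mirror keeps the original layout.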

Missing adapter files

Problem:
When I run peft_lora_whisper-large-v2.ipynb in JupyterLab, I hit a missing-adapter-file problem.
Description:
models/whisper-large-v2-asr-int8/ is missing the two files adapter_config.json and adapter_model.bin; instead it contains an extra pytorch_model.bin and a Run folder, as shown below:
(screenshot of the issue)

Could the instructor help analyse the cause and suggest a fix? Thanks!
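
A common cause is saving through the Trainer (which can write a full pytorch_model.bin) instead of saving the PEFT wrapper itself. A minimal sketch, assuming model is the PeftModel created with get_peft_model in the notebook:

output_dir = "models/whisper-large-v2-asr-int8"   # path from the issue

# `model` is assumed to be the PeftModel built in the notebook.
# Saving the PEFT-wrapped model writes adapter_config.json plus the adapter
# weights (adapter_model.bin / adapter_model.safetensors), not the full model.
model.save_pretrained(output_dir)

# Reloading later: attach the saved adapter to the base model.
from transformers import WhisperForConditionalGeneration
from peft import PeftModel

base = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large-v2")
peft_model = PeftModel.from_pretrained(base, output_dir)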

Couldn't reach 'mozilla-foundation/common_voice_11_0' on the Hub (ConnectionError)

Calling load_dataset(dataset_name, language_abbr, data_dir="./dataset", split="train", trust_remote_code=True) raises:
raise ConnectionError(f"Couldn't reach '{path}' on the Hub ({type(e).__name__})")
ConnectionError: Couldn't reach 'mozilla-foundation/common_voice_11_0' on the Hub (ConnectionError)
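
A hedged workaround when the Hub is not directly reachable from the machine: point huggingface_hub/datasets at a mirror or proxy via the HF_ENDPOINT environment variable before anything from the Hub is imported. The mirror URL and language code below are assumptions; substitute whatever is reachable and whatever language the notebook uses.

import os
# Must be set before datasets / huggingface_hub are imported.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"   # assumption: any reachable mirror or proxy

from datasets import load_dataset

dataset_name = "mozilla-foundation/common_voice_11_0"
language_abbr = "zh-CN"                               # assumption: the language used in the notebook

ds = load_dataset(dataset_name, language_abbr, data_dir="./dataset",
                  split="train", trust_remote_code=True)

Note that common_voice_11_0 is a gated dataset, so you also need to accept its terms on its Hub page and be logged in (huggingface-cli login); an authentication problem can otherwise compound the connection error.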

BuilderConfig 'allenai--c4' not found. Available: ['default']

AutoGPTQ_transformers homework

When executing this:
quant_model67 = AutoModelForCausalLM.from_pretrained(model_id67, quantization_config=quantization_config67, device_map='auto')

this error is raised:

BuilderConfig 'allenai--c4' not found. Available: ['default']

/usr/local/lib/python3.10/dist-packages/huggingface_hub/repocard.py:105: UserWarning: Repo card metadata block was not found. Setting CardData to empty.
warnings.warn("Repo card metadata block was not found. Setting CardData to empty.")

ValueError Traceback (most recent call last)
in <cell line: 2>()
1 tokenizer67 = AutoTokenizer.from_pretrained(model_id67)
----> 2 quant_model67 = AutoModelForCausalLM.from_pretrained(model_id67, quantization_config=quantization_config67, device_map='auto')

9 frames
/usr/local/lib/python3.10/dist-packages/datasets/builder.py in _create_builder_config(self, config_name, custom_features, **config_kwargs)
    588         builder_config = self.builder_configs.get(config_name)
    589         if builder_config is None and self.BUILDER_CONFIGS:
--> 590             raise ValueError(
    591                 f"BuilderConfig '{config_name}' not found. Available: {list(self.builder_configs.keys())}"
    592             )

ValueError: BuilderConfig 'allenai--c4' not found. Available: ['default']
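
The error comes from the GPTQ calibration step trying to load the c4 dataset under the legacy 'allenai--c4' config name, which newer datasets releases no longer provide. A hedged workaround is to sidestep c4 entirely by choosing a different calibration dataset in GPTQConfig, or by passing your own calibration texts. Variable names follow the issue's *67 suffix; the model id and bit width are assumptions:

from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id67 = "facebook/opt-6.7b"   # assumption: whichever model the homework uses

tokenizer67 = AutoTokenizer.from_pretrained(model_id67)

# Option A: calibrate on wikitext2 instead of c4, so 'allenai--c4' is never requested.
quantization_config67 = GPTQConfig(bits=4, dataset="wikitext2", tokenizer=tokenizer67)

# Option B: supply your own list of calibration texts, so no Hub dataset is loaded at all.
# quantization_config67 = GPTQConfig(
#     bits=4,
#     dataset=["a few hundred representative sentences ..."] * 128,
#     tokenizer=tokenizer67,
# )

quant_model67 = AutoModelForCausalLM.from_pretrained(
    model_id67, quantization_config=quantization_config67, device_map="auto"
)

Upgrading optimum (which performs the calibration-data loading) may also resolve it, since newer releases changed how the c4 calibration set is fetched.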

The README lacks details on how to start Jupyter in a cloud-server environment

As a newcomer to cloud servers, I followed the README and started Jupyter with nohup jupyter lab --port=8000 --NotebookApp.token='替换为你的密码' --notebook-dir=./ & , but the terminal showed nothing afterwards and I did not know what to do next.
In practice I found two pitfalls when using a cloud server:

  1. Port 8000 has to be opened in the cloud provider's security-group settings.
  2. It is not obvious how to reach the Jupyter web UI from a browser.

In short: please at least cover this in the README.
