

InternLM

👋 join us on Discord and WeChat

Introduction

InternLM2 series are released with the following features:

  • 200K Context window: Near-perfect needle-in-a-haystack retrieval over a 200K-token context, with leading performance on long-context tasks such as LongBench and L-Eval. Try it with LMDeploy for 200K-context inference.

  • Outstanding comprehensive performance: Significantly better than the last generation in all dimensions, especially in reasoning, math, code, chat experience, instruction following, and creative writing, with leading performance among open-source models of similar size. In some evaluations, InternLM2-Chat-20B may match or even surpass ChatGPT (GPT-3.5).

  • Code interpreter & Data analysis: With a code interpreter, InternLM2-Chat-20B achieves performance comparable to GPT-4 on GSM8K and MATH. InternLM2-Chat also provides data-analysis capabilities.

  • Stronger tool use: With improved instruction following, tool selection, and reflection capabilities, InternLM2 supports more kinds of agents and multi-step tool calling for complex tasks. See examples.

News

[2024.03.26] We release InternLM2 technical report. See arXiv for details.

[2024.01.31] We release InternLM2-1.8B, along with the associated chat model. They provide a cheaper deployment option while maintaining leading performance.

[2024.01.23] We release InternLM2-Math-7B and InternLM2-Math-20B with pretraining and SFT checkpoints. Despite their small size, they surpass ChatGPT. See InternLM-Math for details and download.

[2024.01.17] We release InternLM2-7B and InternLM2-20B and their corresponding chat models with stronger capabilities in all dimensions. See model zoo below for download or model cards for more details.

[2023.12.13] InternLM-7B-Chat and InternLM-20B-Chat checkpoints are updated. With an improved finetuning strategy, the new chat models can generate higher quality responses with greater stylistic diversity.

[2023.09.20] InternLM-20B is released with base and chat versions.

Model Zoo

Model Transformers(HF) ModelScope(HF) OpenXLab(HF) OpenXLab(Origin) Release Date
InternLM2-1.8B 🤗internlm2-1.8b internlm2-1.8b Open in OpenXLab Open in OpenXLab 2024-01-31
InternLM2-Chat-1.8B-SFT 🤗internlm2-chat-1.8b-sft internlm2-chat-1.8b-sft Open in OpenXLab Open in OpenXLab 2024-01-31
InternLM2-Chat-1.8B 🤗internlm2-chat-1.8b internlm2-chat-1.8b Open in OpenXLab Open in OpenXLab 2024-02-19
InternLM2-Base-7B 🤗internlm2-base-7b internlm2-base-7b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-7B 🤗internlm2-7b internlm2-7b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Chat-7B-SFT 🤗internlm2-chat-7b-sft internlm2-chat-7b-sft Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Chat-7B 🤗internlm2-chat-7b internlm2-chat-7b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Base-20B 🤗internlm2-base-20b internlm2-base-20b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-20B 🤗internlm2-20b internlm2-20b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Chat-20B-SFT 🤗internlm2-chat-20b-sft internlm2-chat-20b-sft Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Chat-20B 🤗internlm2-chat-20b internlm2-chat-20b Open in OpenXLab Open in OpenXLab 2024-01-17

Notes:

The InternLM2 release contains two model sizes: 7B and 20B. The 7B models are efficient for research and application, while the 20B models are more powerful and support more complex scenarios. The relationships among these models are as follows.

  1. InternLM2-Base: Foundation models with high quality and high adaptation flexibility, which serve as a good starting point for downstream deep adaptations.
  2. InternLM2: Further pretrained on general-domain data and a domain-enhanced corpus, achieving state-of-the-art performance in evaluations along with strong language capability. InternLM2 models are recommended for most applications.
  3. InternLM2-Chat-SFT: Intermediate version of InternLM2-Chat that only undergoes supervised fine-tuning (SFT), based on the InternLM2-Base model. We release them to benefit research on alignment.
  4. InternLM2-Chat: Further aligned on top of InternLM2-Chat-SFT through online RLHF. InternLM2-Chat exhibits better instruction following, chat experience, and function calling, and is recommended for downstream applications.

Limitations: Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

Supplements: HF refers to the format used by HuggingFace in transformers, whereas Origin denotes the format adopted by the InternLM team in InternEvo.

Performance

Objective Evaluation

Dataset Baichuan2-7B-Chat Mistral-7B-Instruct-v0.2 Qwen-7B-Chat InternLM2-Chat-7B ChatGLM3-6B Baichuan2-13B-Chat Mixtral-8x7B-Instruct-v0.1 Qwen-14B-Chat InternLM2-Chat-20B
MMLU 50.1 59.2 57.1 63.7 58.0 56.6 70.3 66.7 66.5
CMMLU 53.4 42.0 57.9 63.0 57.8 54.8 50.6 68.1 65.1
AGIEval 35.3 34.5 39.7 47.2 44.2 40.0 41.7 46.5 50.3
C-Eval 53.9 42.4 59.8 60.8 59.1 56.3 54.0 71.5 63.0
TriviaQA 37.6 35.0 46.1 50.8 38.1 40.3 57.7 54.5 53.9
NaturalQuestions 12.8 8.1 18.6 24.1 14.0 12.7 22.5 22.9 25.9
C3 78.5 66.9 84.4 91.5 79.3 84.4 82.1 91.5 93.5
CMRC 8.1 5.6 14.6 63.8 43.2 27.8 5.3 13.0 50.4
WinoGrande 49.9 50.8 54.2 65.8 61.7 50.9 60.9 55.7 74.8
BBH 35.9 46.5 45.5 61.2 56.0 42.5 57.3 55.8 68.3
GSM-8K 32.4 48.3 44.1 70.7 53.8 56.0 71.7 57.7 79.6
Math 5.7 8.6 12.0 23.0 20.4 4.3 22.5 27.6 31.9
HumanEval 17.7 35.4 36.0 59.8 52.4 19.5 37.8 40.9 67.1
MBPP 37.7 25.7 33.9 51.4 55.6 40.9 40.9 30.0 65.8
  • Performance on MBPP is reported with MBPP (Sanitized)

Alignment Evaluation

  • We evaluated our model on AlpacaEval 2.0, where InternLM2-Chat-20B surpasses Claude 2, GPT-4 (0613), and Gemini Pro.
Model Name Win Rate Length
GPT-4 Turbo 50.00% 2049
GPT-4 23.58% 1365
GPT-4 0314 22.07% 1371
Mistral Medium 21.86% 1500
XwinLM 70b V0.1 21.81% 1775
InternLM2 Chat 20B 21.75% 2373
Mixtral 8x7B v0.1 18.26% 1465
Claude 2 17.19% 1069
Gemini Pro 16.85% 1315
GPT-4 0613 15.76% 1140
Claude 2.1 15.73% 1096
  • Results are taken from the AlpacaEval leaderboard as released on 2024-01-17.

Requirements

  • Python >= 3.8
  • PyTorch >= 1.12.0 (2.0.0 and above are recommended)
  • Transformers >= 4.34
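As a quick sanity check, the minimum-version constraints above can be verified at runtime. A minimal sketch using only plain string handling (numeric comparison only; pre-release tags are ignored, so treat this as a rough check rather than full PEP 440 semantics):

```python
def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare dotted version strings numerically (pre-release tags ignored)."""
    as_tuple = lambda v: tuple(int(p) for p in v.split(".") if p.isdigit())
    return as_tuple(installed) >= as_tuple(minimum)

# The constraints above, checked against example installed versions.
print(meets_minimum("4.34.0", "4.34"))   # transformers requirement
print(meets_minimum("2.0.1", "1.12.0"))  # torch requirement
```

For production use, `packaging.version.Version` gives a fully spec-compliant comparison.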

Usages

We briefly show usage with Transformers, ModelScope, and Web demos. The chat models adopt the chatml format to support both chat and agent applications. For best results, make sure the installed transformers library meets the following requirement before running inference with Transformers or ModelScope:

transformers >= 4.34
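For reference, the chatml-style conversation template mentioned above can be rendered by hand. A minimal sketch — the exact special-token names (`<|im_start|>`, `<|im_end|>`) follow the public InternLM2 chat template and should be treated as an assumption; prefer the tokenizer's own chat template when available:

```python
def build_chatml_prompt(history, query):
    """Render a conversation turn-by-turn in a chatml-style template.

    history: list of (user_message, assistant_message) pairs.
    """
    parts = []
    for user_msg, assistant_msg in history:
        parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>\n")
        parts.append(f"<|im_start|>assistant\n{assistant_msg}<|im_end|>\n")
    # Open an assistant turn for the model to complete.
    parts.append(f"<|im_start|>user\n{query}<|im_end|>\n<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt(
    [("hello", "Hello! How can I help you today?")],
    "please provide three suggestions about time management",
)
```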

Import from Transformers

To load the InternLM2-7B-Chat model using Transformers, use the following code:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded in float32, which may cause an OOM error.
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
# (Optional) On low-resource devices, you can load the model in 4-bit or 8-bit via bitsandbytes to further save GPU memory.
# InternLM 7B in 4-bit takes roughly 8 GB of GPU memory.
# pip install -U bitsandbytes
# 8-bit: model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", device_map="auto", trust_remote_code=True, load_in_8bit=True)
# 4-bit: model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", device_map="auto", trust_remote_code=True, load_in_4bit=True)
model = model.eval()
response, history = model.chat(tokenizer, "hello", history=[])
print(response)
# Output: Hello? How can I help you today?
response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
print(response)
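To decide between float16 and the 4-bit/8-bit options above, a back-of-the-envelope estimate of weight memory is often enough. A rough sketch — weights only, so KV cache, activations, and CUDA overhead come on top (which is why 4-bit InternLM 7B ends up near 8 GB in practice):

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate GiB needed to hold the model weights alone."""
    return n_params * bits_per_param / 8 / 1024**3

N_7B = 7e9  # parameter count of a 7B model
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_gib(N_7B, bits):5.1f} GiB")
```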

Import from ModelScope

To load the InternLM2-7B-Chat model using ModelScope, use the following code:

import torch
from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b')
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded in float32, which may cause an OOM error.
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
# (Optional) On low-resource devices, you can load the model in 4-bit or 8-bit via bitsandbytes to further save GPU memory.
# InternLM 7B in 4-bit takes roughly 8 GB of GPU memory.
# pip install -U bitsandbytes
# 8-bit: model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, load_in_8bit=True)
# 4-bit: model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, load_in_4bit=True)
model = model.eval()
response, history = model.chat(tokenizer, "hello", history=[])
print(response)
response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
print(response)

Dialogue

You can interact with the InternLM Chat 7B model through a frontend interface by running the following code:

pip install streamlit
pip install "transformers>=4.34"
streamlit run ./chat/web_demo.py

Deployment

We use LMDeploy for fast deployment of InternLM.

With only four lines of code, you can perform internlm2-chat-7b inference after pip install "lmdeploy>=0.2.1".

from lmdeploy import pipeline
pipe = pipeline("internlm/internlm2-chat-7b")
response = pipe(["Hi, pls intro yourself", "Shanghai is"])
print(response)

Please refer to the guidance for more details on model deployment. For additional deployment tutorials, feel free to explore here.

200K-long-context Inference

By enabling LMDeploy's Dynamic NTK feature, you can perform long-context inference.

from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig

backend_config = TurbomindEngineConfig(rope_scaling_factor=2.0, session_len=200000)
pipe = pipeline('internlm/internlm2-chat-7b', backend_config=backend_config)
prompt = 'Use a long prompt to replace this sentence'
response = pipe(prompt)
print(response)
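The rope_scaling_factor above works by rescaling RoPE's base frequency rather than truncating positions. A sketch of the common "NTK-aware" interpolation formula — treating the exact way LMDeploy applies the factor as an assumption, with head_dim=128 as a typical value for 7B-class models:

```python
def ntk_scaled_rope_base(base: float, scaling_factor: float, head_dim: int) -> float:
    """Rescale the rotary base so the lowest-frequency dimensions span a
    longer context, per the NTK-aware interpolation formula."""
    return base * scaling_factor ** (head_dim / (head_dim - 2))

# With factor 2.0, the effective base roughly doubles.
new_base = ntk_scaled_rope_base(10000.0, 2.0, 128)
```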

Agent

InternLM2-Chat models have excellent tool-utilization capabilities and can work with function calls in a zero-shot manner. See more examples in the agent section.

Fine-tuning

Please refer to finetune docs for fine-tuning with InternLM.

Note: We have migrated all training functionality of this project to InternEvo for an easier user experience; InternEvo provides efficient pre-training and fine-tuning infrastructure for training InternLM.

Evaluation

We use OpenCompass for model evaluation. For InternLM2, we primarily focus on standard objective evaluation, long-context evaluation (needle in a haystack), data contamination assessment, agent evaluation, and subjective evaluation.

Objective Evaluation

To evaluate the InternLM model, please follow the guidelines in the OpenCompass tutorial. Typically, we use ppl for multiple-choice questions on the Base model and gen for all questions on the Chat model.
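The distinction between ppl and gen can be made concrete: ppl scores each candidate answer by the model's likelihood and picks the best one, while gen parses the generated text. A toy sketch of the ppl side — the scoring callable here is a stand-in; OpenCompass wires in the real model's per-token loss:

```python
def pick_answer_by_ppl(question, choices, avg_nll):
    """Score 'question + choice' with an average negative-log-likelihood
    callable and return the lowest-loss (most likely) choice."""
    return min(choices, key=lambda c: avg_nll(f"{question} {c}"))

# Stand-in scorer that simply prefers shorter strings, to exercise the selection.
toy_nll = lambda text: 0.1 * len(text)
best = pick_answer_by_ppl("2 + 2 =", ["four hundred", "4"], toy_nll)
```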

Long-Context Evaluation (Needle in a Haystack)

For the Needle in a Haystack evaluation, refer to the tutorial provided in the documentation. Feel free to try it out.

Data Contamination Assessment

To learn more about data contamination assessment, please check the contamination eval.

Agent Evaluation

  • To evaluate tool utilization, please refer to T-Eval.
  • For code interpreter evaluation, use the Math Agent Evaluation provided in the repository.

Subjective Evaluation

  • Please follow the tutorial for subjective evaluation.

Contribution

We appreciate all the contributors for their efforts to improve and enhance InternLM. Community users are highly encouraged to participate in the project. Please refer to the contribution guidelines for instructions on how to contribute to the project.

License

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (English)/申请表(中文). For other questions or collaborations, please contact [email protected].

Citation

@misc{cai2024internlm2,
      title={InternLM2 Technical Report},
      author={Zheng Cai and Maosong Cao and Haojiong Chen and Kai Chen and Keyu Chen and Xin Chen and Xun Chen and Zehui Chen and Zhi Chen and Pei Chu and Xiaoyi Dong and Haodong Duan and Qi Fan and Zhaoye Fei and Yang Gao and Jiaye Ge and Chenya Gu and Yuzhe Gu and Tao Gui and Aijia Guo and Qipeng Guo and Conghui He and Yingfan Hu and Ting Huang and Tao Jiang and Penglong Jiao and Zhenjiang Jin and Zhikai Lei and Jiaxing Li and Jingwen Li and Linyang Li and Shuaibin Li and Wei Li and Yining Li and Hongwei Liu and Jiangning Liu and Jiawei Hong and Kaiwen Liu and Kuikun Liu and Xiaoran Liu and Chengqi Lv and Haijun Lv and Kai Lv and Li Ma and Runyuan Ma and Zerun Ma and Wenchang Ning and Linke Ouyang and Jiantao Qiu and Yuan Qu and Fukai Shang and Yunfan Shao and Demin Song and Zifan Song and Zhihao Sui and Peng Sun and Yu Sun and Huanze Tang and Bin Wang and Guoteng Wang and Jiaqi Wang and Jiayu Wang and Rui Wang and Yudong Wang and Ziyi Wang and Xingjian Wei and Qizhen Weng and Fan Wu and Yingtong Xiong and Chao Xu and Ruiliang Xu and Hang Yan and Yirong Yan and Xiaogui Yang and Haochen Ye and Huaiyuan Ying and Jia Yu and Jing Yu and Yuhang Zang and Chuyu Zhang and Li Zhang and Pan Zhang and Peng Zhang and Ruijie Zhang and Shuo Zhang and Songyang Zhang and Wenjian Zhang and Wenwei Zhang and Xingcheng Zhang and Xinyue Zhang and Hui Zhao and Qian Zhao and Xiaomeng Zhao and Fengzhe Zhou and Zaida Zhou and Jingming Zhuo and Yicheng Zou and Xipeng Qiu and Yu Qiao and Dahua Lin},
      year={2024},
      eprint={2403.17297},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}


internlm's Issues

[Bug] CUDA out of memory on 8× RTX 4090

Describe the bug

I found that a single RTX 4090 (24 GB) cannot load the InternLM-Chat-7B model. But when I use device_map="auto", an exception occurs:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB (GPU 0; 23.65 GiB total capacity; 22.99 GiB already allocated; 154.06 MiB free; 22.99 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I don't know why it didn't utilize the multiple GPUs evenly; as a result, the first GPU runs out of memory.


Here is the code I use:

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", device_map="auto", trust_remote_code=True).cuda()
model = model.eval()

response, history = model.chat(tokenizer, "hello", history=[])
print(response)

The error happens when executing
model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", device_map="auto", trust_remote_code=True).cuda()

What's the reason for this error?

Environment

python 3.8
cuda 11.7

Other information

No response

Some mistakes in the README about weight conversion

Describe the bug

  1. In tools/transformers/README.md, "python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer ../v7_sft.model":
    v7_sft.model should be V7_sft.model.
  2. In README.md, under Convert to Transformers Format, "With the help of tools/convert2hf.py": tools/convert2hf.py should be tools/transformers/convert2hf.py.
  3. In README.md, under Convert to Transformers Format, "python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer tokenizes/tokenizer.model": this demo should be updated according to tools/transformers/README.md.

The same mistakes appear in the docs for other languages.

Environment

conda env

Other information

No response

Model runs multi-turn dialogue on its own

Describe the bug

Hello, when calling the internlm-chat-7b model, we found that for some prompts the model continues the conversation for multiple turns on its own. The official chat example, model.chat, internally uses response.split("")[0] to ensure that only the first reply is returned. However, if we want to use this model for follow-up experiments such as RLHF, calling model.generate also outputs the subsequent turns, and those tokens enter the computation, which is unacceptable. Is there a way to avoid generating multi-turn content?

Environment

Other information

No response

[Bug] Need to import torch

Describe the bug

It is a small problem. When I follow the README section Import from Transformers to initialize the model, an error occurs (NameError: name 'torch' is not defined), so it is necessary to import torch first.

Also, I wonder whether internlm-chat-7b can be loaded across multiple GPUs, since my machine has 8 GPUs with 11 GB each.

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", torch_dtype=torch.float16, trust_remote_code=True).cuda()

Environment

torch              1.13.1+cu117
torch-scatter      2.1.1+pt113cu117
torchaudio         0.13.1+cu117
torchvision        0.14.1+cu117

Other information

No response

Installation per requirements/torch.txt error

Hi, I tried to install per requirements/torch.txt and got the following errors. Did I miss anything?

 ➜  InternLM git:(main) pip install -r requirements/torch.txt 
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu117
ERROR: Could not find a version that satisfies the requirement torch==1.13.1+cu117 (from versions: 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==1.13.1+cu117

My environment is:

  • macOS 13.4.1
  • python 3.11
  • venv

How can I get a training throughput of over 180 TFLOPS?

I ran the code but only got 90+ TFLOPS.

INFO train.py:317 in record_current_batch_training_metrics -- tflops=93.48098385143103,step=9,loss=7.502509117126465,tgs (tokens/gpu/second)=2104.89,lr=2.2e-06,loss_scale=65536.0,grad_norm=20.60409540743281,micro_num=4,num_consumed_tokens=2621440,inf_nan_skip_batches=0,num_samples_in_batch=13,largest_length=2048,largest_batch=4,smallest_batch=3,adam_beta2=0.95,fwd_bwd_time=6.15

What is the prompt format used in the chat model?

I want to fine-tune the chat model on an instruction dataset.

Can you share the prompt format used in your chat model?

USER: <instruction>\nASSISTANT: <output>
Is this right?

Does your company misunderstand the Apache-2.0 open-source license?

Describe the feature

Your open-source license says:

The code in this repository is open-sourced under the Apache-2.0 license. Model weights are fully open for academic research, and free commercial-use authorization can also be applied for (application form). For other questions and collaborations, please contact [email protected].

But the Apache-2.0 license itself already permits commercial use; there is no need to apply to you for it!

Application restrictions

Moreover, your application form only supports companies, not individuals. I don't quite understand what "fully open source, free for commercial use, Shanghai AI Lab lowers the bar for large models" is supposed to mean.

Can you explain how you reached this decision?

Would you like to implement this feature yourself?

  • I'd like to implement this feature myself and contribute code to InternLM!

Can you provide a prompt for Retrieval QA?

Thanks for open-sourcing!

I want to run a set of Retrieval QA tests. Which prompt would be most suitable? I tried several prompts but none worked well; for example, with the one below internLM-7b-chat answers in English, while answers obtained directly through the chat interface are very short.

    prompt_template = f"""User: 已知信息:
{context}

根据上述已知信息,专业的回答用户的问题。如果无法从中得到答案,请说 “抱歉,根据已知信息无法回答该问题”,不允许在答案中添加编造成分,答案请使用中文。

问题:{question}

Assistant:"""

[Bug]

Describe the bug

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x90 in position 25401: invalid start byte
The above exception was the direct cause of the following exception:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /data/llm/anaconda3/envs/llm/lib/python3.9/site-packages/transformers/modeling_utils.py:457 in │
│ load_state_dict │
│ │
│ 454 │ │ │ │ │ │ "you cloned." │
│ 455 │ │ │ │ │ ) │
│ 456 │ │ │ │ else: │
│ ❱ 457 │ │ │ │ │ raise ValueError( │
│ 458 │ │ │ │ │ │ f"Unable to locate the file {checkpoint_file} which is necessary │
│ 459 │ │ │ │ │ │ "model. Make sure you have saved the model properly." │
│ 460 │ │ │ │ │ ) from e │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Unable to locate the file /data/llm/internlm-chat-7b/pytorch_model-00001-of-00002.bin which is
necessary to load this pretrained model. Make sure you have saved the model properly.

Environment

python3.9

Other information

No response

[Feature] Standardize the pre-commit hook

Describe the feature

The current pre-commit hook rules are not fully aligned with those of lmdeploy and opencompass; they need to be aligned and verified uniformly.

  1. Align the lint rules
  2. Make the current repository pass pre-commit

Would you like to implement this feature yourself?

  • I'd like to implement this feature myself and contribute code to InternLM!

[Feature] Support running without apex or flash-attention

Describe the feature

The repository currently depends on two complex libraries, apex and flash-attention; it should be able to fall back to running without them.

  1. Provide a fallback execution mode that works without apex or flash-attention
  2. Support running opencompass evaluation directly
  3. Support installation via pip

Would you like to implement this feature yourself?

  • I'd like to implement this feature myself and contribute code to InternLM!

Error when running inference according to the instructions

model = AutoModelForCausalLM.from_pretrained("/home/hope/work/models/internlm-chat-7b-8k", trust_remote_code=True).to(torch.bfloat16).cuda()
Loading checkpoint shards: 100%|███████████████████████████████| 2/2 [00:11<00:00, 5.90s/it]
Traceback (most recent call last):
File "", line 1, in
NameError: name 'torch' is not defined
import torch
tokenizer = AutoTokenizer.from_pretrained("/home/hope/work/models/internlm-chat-7b-8k", trust_remote_code=True)
model = model.eval()
Traceback (most recent call last):
File "", line 1, in
NameError: name 'model' is not defined

[Bug] 'Aborted' when tokenizing alpaca_data.json

Describe the bug

Hi, I want to fine-tune your model, but this error occurred directly on the cluster. What could be causing it?
python tools/alpaca_tokenizer.py /mnt/petrelfs/wangxiaochen/workspace/alpaca_data.json /mnt/petrelfs/wangxiaochen/workspace/dataset /mnt/petrelfs/wangxiaochen/workspace/pretrained_models/internlm-chat-7b --split_ratio 0.1

Aborted

Environment

Everything was set up per the instructions, but mpfr and mpc are not yet configured. The files exist in the cluster's share area but cannot be compiled or referenced.

Other information

No response

The CLI prints the prompt log when I exit the dialogue with Ctrl+C

Describe the bug

The prompt log printed when exiting the dialogue:

Loading checkpoint shards: 100%|██████████| 2/2 [00:18<00:00, 9.30s/it]
^Csrun: interrupt (one more within 1 sec to abort)
srun: StepId=7154040.0 task 0: running
^Csrun: sending Ctrl-C to StepId=7154040.0
load model end.
load model begin.
load model end.
cur real input:
<|User|>:介绍下你自己~
<|Bot|>:

cur total response:
你好!我是一个人工智能助手,我的名字是书生·浦语,我致力于帮助用户解决问题和提供信息服务。我能够回答问题、提供定义和解释、将文本从一种语言翻译成另一种语言、总结文本、生成文本、编写故事、分析情感、提供推荐、开发算法、编写代码以及其他任何基于语言的任务。我的设计理念是有用、诚实并且无害,我使用深度学习技术进行构建并且不断从与用户的对话中学习。

load model begin.
load model end.
cur real input:
<|User|>:介绍下你自己~
<|Bot|>:你好!我是一个人工智能助手,我的名字是书生·浦语,我致力于帮助用户解决问题和提供信息服务。我能够回答问题、提供定义和解释、将文本从一种语言翻译成另一种语言、总结文本、生成文本、编写故事、分析情感、提供推荐、开发算法、编写代码以及其他任何基于语言的任务。我的设计理念是有用、诚实并且无害,我使用深度学习技术进行构建并且不断从与用户的对话中学习。
<|User|>:我们来对联吧,上联:生意如春意
<|Bot|>:

cur total response:
下联:婚姻若水纹

load model begin.
load model end.
cur real input:
<|User|>:介绍下你自己~
<|Bot|>:你好!我是一个人工智能助手,我的名字是书生·浦语,我致力于帮助用户解决问题和提供信息服务。我能够回答问题、提供定义和解释、将文本从一种语言翻译成另一种语言、总结文本、生成文本、编写故事、分析情感、提供推荐、开发算法、编写代码以及其他任何基于语言的任务。我的设计理念是有用、诚实并且无害,我使用深度学习技术进行构建并且不断从与用户的对话中学习。
<|User|>:我们来对联吧,上联:生意如春意
<|Bot|>:下联:婚姻若水纹
<|User|>:GOOD
<|Bot|>:

cur total response:
非常感谢!

load model begin.
load model end.
cur real input:
<|User|>:介绍下你自己~
<|Bot|>:你好!我是一个人工智能助手,我的名字是书生·浦语,我致力于帮助用户解决问题和提供信息服务。我能够回答问题、提供定义和解释、将文本从一种语言翻译成另一种语言、总结文本、生成文本、编写故事、分析情感、提供推荐、开发算法、编写代码以及其他任何基于语言的任务。我的设计理念是有用、诚实并且无害,我使用深度学习技术进行构建并且不断从与用户的对话中学习。
<|User|>:我们来对联吧,上联:生意如春意
<|Bot|>:下联:婚姻若水纹
<|User|>:GOOD
<|Bot|>:非常感谢!
<|User|>:展开说明为什么这样对下联
<|Bot|>:

Environment

slurm platform

Other information

No response

Loading the 7B model per the README only outputs token IDs

Describe the bug

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("internlm/internlm-7b", trust_remote_code=True).cuda()
model = model.eval()
inputs = tokenizer(["A beautiful flower"], return_tensors="pt")
for k,v in inputs.items():
    inputs[k] = v.cuda()
gen_kwargs = {"max_length": 128, "top_p": 0.8, "temperature": 0.8, "do_sample": True, "repetition_penalty": 1.1}
output = model.generate(**inputs, **gen_kwargs)
print(output)
Running this code only outputs token IDs.

Environment


Other information

No response

Docker

Can you provide a Docker image so we can deploy it out of the box?

Does it support Windows?

Describe the feature

Thank you for sharing your work.
I'd like to confirm whether Windows support is currently planned, and if so, roughly when. Thanks!

Would you like to implement this feature yourself?

  • I'd like to implement this feature myself and contribute code to InternLM!

[Bug] rotary_emb module not found

Describe the bug

When trying to fine-tune with torchrun, the run fails because the rotary_emb module cannot be found.

Environment

CUDA Version: 12.1
torch 1.13.1

Other information

No response

[Bug] apex fails to install

Describe the bug

  • Python == 3.10
  • GCC == 10.2.0
  • MPFR == 4.1.0
  • CUDA == 11.7
  • Pytorch == 1.13.1+cu117
  • Transformers >= 4.25.1
  • All of the above are installed, yet apex still fails to compile normally.

Environment

packages/torch/include/torch/csrc/api/include/torch/serialize/archive.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/samplers/serialize.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/samplers.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/datasets/chunk.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/datasets.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/ir/attributes.h:35:27: note: ‘torch::jit::toString’
35 | static inline const char* toString(AttributeKind kind) {
| ^~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp: In function ‘std::vectorat::Tensor linear_gelu_linear_backward(at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor)’:
/data2/InternLM/apex/csrc/fused_dense.cpp:149:73: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
149 | auto d_weight1 = at::empty({hidden_features, in_features}, input.type());
| ^
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/core/Tensor.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/DeviceGuard.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/ATen.h:11,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/core/TensorBody.h:213:30: note: declared here
213 | DeprecatedTypeProperties & type() const {
| ^~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:150:74: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
150 | auto d_weight2 = at::empty({out_features, hidden_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:151:58: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
151 | auto d_bias1 = at::empty({hidden_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:152:55: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
152 | auto d_bias2 = at::empty({out_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:153:66: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
153 | auto d_input = at::empty({batch_size, in_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:154:72: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
154 | auto d_output1 = at::empty({batch_size, hidden_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:157:55: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
157 | auto lt_workspace = at::empty({1 << 22}, input.type());
| ^
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/ATen.h:13,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data2/InternLM/apex/csrc/fused_dense.cpp: In lambda function:
/data2/InternLM/apex/csrc/fused_dense.cpp:159:94: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:312:28: note: in definition of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
312 | const auto& the_type = TYPE;
| ^~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:314:56: warning: ‘c10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [-Wdeprecated-declarations]
314 | at::ScalarType st = ::detail::scalar_type(the_type);
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:178:23: note: declared here
178 | inline at::ScalarType scalar_type(const at::DeprecatedTypeProperties& t) {
| ^~~~~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp: In lambda function:
/data2/InternLM/apex/csrc/fused_dense.cpp:163:10: warning: unused variable ‘result’ [-Wunused-variable]
163 | auto result = linear_gelu_linear_backward_cuda<scalar_t>(
| ^~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:68:12: note: in definition of macro ‘AT_PRIVATE_CASE_TYPE_USING_HINT’
68 | return VA_ARGS();
| ^~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:317:7: note: in expansion of macro ‘AT_PRIVATE_CASE_TYPE’
317 | AT_PRIVATE_CASE_TYPE(NAME, at::ScalarType::Double, double, VA_ARGS)
| ^~~~~~~~~~~~~~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/core/Device.h:5,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/core/Allocator.h:6,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/ATen.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data2/InternLM/apex/csrc/fused_dense.cpp: In lambda function:
/data2/InternLM/apex/csrc/fused_dense.cpp:159:94: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:260:39: note: in definition of macro ‘C10_EXPAND_MSVC_WORKAROUND’
260 | #define C10_EXPAND_MSVC_WORKAROUND(x) x
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:465:9: note: in expansion of macro ‘TORCH_CHECK_MSG’
465 | TORCH_CHECK_MSG(cond, "", ##VA_ARGS));
| ^~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:637:32: note: in expansion of macro ‘TORCH_CHECK’
637 | C10_EXPAND_MSVC_WORKAROUND(TORCH_CHECK(false, ::c10::str(VA_ARGS)));
| ^~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:9: note: in expansion of macro ‘AT_ERROR’
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:51: error: ‘toString’ was not declared in this scope
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:260:39: note: in definition of macro ‘C10_EXPAND_MSVC_WORKAROUND’
260 | #define C10_EXPAND_MSVC_WORKAROUND(x) x
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:465:9: note: in expansion of macro ‘TORCH_CHECK_MSG’
465 | TORCH_CHECK_MSG(cond, "", ##VA_ARGS));
| ^~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:637:32: note: in expansion of macro ‘TORCH_CHECK’
637 | C10_EXPAND_MSVC_WORKAROUND(TORCH_CHECK(false, ::c10::str(VA_ARGS)));
| ^~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:9: note: in expansion of macro ‘AT_ERROR’
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:51: note: suggested alternatives:
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:260:39: note: in definition of macro ‘C10_EXPAND_MSVC_WORKAROUND’
260 | #define C10_EXPAND_MSVC_WORKAROUND(x) x
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:465:9: note: in expansion of macro ‘TORCH_CHECK_MSG’
465 | TORCH_CHECK_MSG(cond, "", ##VA_ARGS));
| ^~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:637:32: note: in expansion of macro ‘TORCH_CHECK’
637 | C10_EXPAND_MSVC_WORKAROUND(TORCH_CHECK(false, ::c10::str(VA_ARGS)));
| ^~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:9: note: in expansion of macro ‘AT_ERROR’
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/core/op_registration/infer_schema.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/library.h:61,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/types.h:12,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/core/function_schema.h:522:20: note: ‘c10::toString’
522 | inline std::string toString(const FunctionSchema& schema) {
| ^~~~~~~~
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/ir/ir.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/api/function_impl.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/api/method.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/api/object.h:6,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/api/module.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/serialize/input-archive.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/serialize/archive.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/samplers/serialize.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/samplers.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/datasets/chunk.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/datasets.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/ir/attributes.h:35:27: note: ‘torch::jit::toString’
35 | static inline const char* toString(AttributeKind kind) {
| ^~~~~~~~
[2/2] /usr/local/cuda-11.3/bin/nvcc -I/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include -I/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/TH -I/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-11.3/include -I/data1/lym/anaconda3/envs/lab2/include/python3.10 -c -c /data2/InternLM/apex/csrc/fused_dense_cuda.cu -o /data2/InternLM/apex/build/temp.linux-x86_64-cpython-310/csrc/fused_dense_cuda.o -D__CUDA_NO_HALF_OPERATORS__
-D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_dense_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++14
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/core/SymInt.h(84): warning: integer conversion resulted in a change of sign

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1631): warning: variable "beta_zero" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1755): warning: variable "alpha" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1756): warning: variable "beta_zero" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1757): warning: variable "status" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1812): warning: variable "alpha" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1813): warning: variable "beta_zero" was declared but never referenced

/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/core/SymInt.h(84): warning: integer conversion resulted in a change of sign

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
subprocess.run(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<string>", line 34, in <module>
File "/data2/InternLM/apex/setup.py", line 795, in <module>
setup(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/command/install.py", line 74, in run
return orig.install.run(self)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/install.py", line 697, in run
self.run_command('build')
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
_build_ext.run(self)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
self.build_extensions()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 765, in build_extensions
build_ext.build_extensions(self)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
self._build_extensions_serial()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
self.build_extension(ext)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
_build_ext.build_extension(self, ext)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
objects = self.compiler.compile(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 586, in unix_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1487, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
error: subprocess-exited-with-error

× Running setup.py install for apex did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /data1/lym/anaconda3/envs/lab2/bin/python -u -c '
exec(compile('"'"''"'"''"'"'

This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py

- It imports setuptools before invoking setup.py, to enable projects that directly

import from distutils.core to work with newer packaging standards.

- It provides a clear error message when setuptools is not installed.

- It sets sys.argv[0] to the underlying setup.py, when invoking setup.py so

setuptools doesn'"'"'t think the script is -c. This avoids the following warning:

manifest_maker: standard file '"'"'-c'"'"' not found".

- It generates a shim setup.py, for handling setup.cfg-only projects.

import os, sys, tokenize

try:
import setuptools
except ImportError as error:
print(
"ERROR: Can not execute setup.py since setuptools is not available in "
"the build environment.",
file=sys.stderr,
)
sys.exit(1)

file = %r
sys.argv[0] = file

if os.path.exists(file):
filename = file
with tokenize.open(file) as f:
setup_py_code = f.read()
else:
filename = "<auto-generated setuptools caller>"
setup_py_code = "from setuptools import setup; setup()"

exec(compile(setup_py_code, filename, "exec"))
'"'"''"'"''"'"' % ('"'"'/data2/InternLM/apex/setup.py'"'"',), "", "exec"))' --cpp_ext --cuda_ext install --record /tmp/pip-record-_mp6xe2r/install-record.txt --single-version-externally-managed --compile --install-headers /data1/lym/anaconda3/envs/lab2/include/python3.10/apex
cwd: /data2/InternLM/apex/
Running setup.py install for apex ... error
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> apex

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Other information

No response
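A frequent root cause of apex extension build failures like the one above is a mismatch between the CUDA toolkit that nvcc belongs to and the CUDA version PyTorch was built against; apex's setup.py checks for this and refuses or warns when they disagree. The helper below is purely illustrative (it is not part of apex) and sketches that version comparison:

```python
# Illustrative helper (not from the apex repo): apex's setup.py compares the
# nvcc CUDA version with torch.version.cuda, and a mismatch is a common cause
# of extension compile failures like the one shown above.
def cuda_versions_compatible(torch_cuda: str, nvcc_cuda: str) -> bool:
    """Compare major.minor CUDA versions, e.g. torch.version.cuda vs nvcc's."""
    return torch_cuda.split(".")[:2] == nvcc_cuda.split(".")[:2]

print(cuda_versions_compatible("11.3", "11.3"))  # True: versions agree
print(cuda_versions_compatible("11.3", "11.7"))  # False: reinstall torch or the toolkit
```

In practice the two versions can be inspected with `python -c "import torch; print(torch.version.cuda)"` and `nvcc --version` before attempting the build.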

Which languages are available in the training dataset?

It would be extremely helpful if you could share the list of languages that are included in the dataset used to train the model. If the dataset is available on Hugging Face, it would be even better if you could provide a direct link to it.

[Bug] from_pretrained raise ValueError

Describe the bug

With the environment already set up (including the transformers library), the error says the corresponding class cannot be found. How can this be resolved?
File "/HOME/scz3924/.conda/envs/InternLM/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 719, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.internlm.internlm-chat-7b.dd2fa16d14c8b21fea4b4c168b9fef839154e305.configuration_internlm.InternLMConfig'> to build an AutoTokenizer.
Model type should be one of AlbertConfig, AlignConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, Blip2Config, BloomConfig, BridgeTowerConfig, CamembertConfig, CanineConfig, ChineseCLIPConfig, ClapConfig, CLIPConfig, CLIPSegConfig, CodeGenConfig, ConvBertConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DebertaConfig, DebertaV2Config, DistilBertConfig, DPRConfig, ElectraConfig, ErnieConfig, ErnieMConfig, EsmConfig, FlaubertConfig, FNetConfig, FSMTConfig, FunnelConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GPTSanJapaneseConfig, GroupViTConfig, HubertConfig, IBertConfig, JukeboxConfig, LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config, LEDConfig, LiltConfig, LlamaConfig, LongformerConfig, LongT5Config, LukeConfig, LxmertConfig, M2M100Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MgpstrConfig, MobileBertConfig, MPNetConfig, MT5Config, MvpConfig, NezhaConfig, NllbMoeConfig, NystromformerConfig, OneFormerConfig, OpenAIGPTConfig, OPTConfig, OwlViTConfig, PegasusConfig, PegasusXConfig, PerceiverConfig, Pix2StructConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, RagConfig, RealmConfig, ReformerConfig, RemBertConfig, RetriBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2TextConfig, Speech2Text2Config, SpeechT5Config, SplinterConfig, SqueezeBertConfig, SwitchTransformersConfig, T5Config, TapasConfig, TransfoXLConfig, ViltConfig, VisualBertConfig, Wav2Vec2Config, Wav2Vec2ConformerConfig, WhisperConfig, XCLIPConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, YosoConfig.

Environment

The requirements are satisfied.

Other information

No response
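The "Unrecognized configuration class" error above typically appears when transformers is not allowed to load the custom classes shipped inside the model repository. A minimal sketch of the commonly suggested fix, using the standard transformers API (the helper name and default path are illustrative):

```python
def load_internlm_tokenizer(model_path="internlm/internlm-chat-7b"):
    # trust_remote_code=True lets transformers import the custom
    # InternLMConfig / tokenizer classes bundled with the model repo,
    # which the built-in AutoTokenizer mapping (the long list above) lacks.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```

Without `trust_remote_code=True`, AutoTokenizer falls back to its built-in config-to-tokenizer table and raises exactly the ValueError shown.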

The demo code about tokenizer.py in tools dir exists some mistakes

Describe the bug

In the demo command python tools/tokenizer.py --raw_data_name your_raw_data_file_name(without suffix) --input_file_type 'text' or 'json' or 'jsonl' --bin your_output_bin_path, 'text' should be 'txt'.
The same applies to this demo command: python tools/tokenizer.py --raw_data_name raw_data --input_file_type 'text' --bin cn/output.bin
This affects both the Chinese and English versions of the documentation, and any other language versions if they exist.

Environment

any browser

Other information

No response

[Bug] Model loading fails when running in a Python 3.10 environment

Describe the bug

After setting up the environment following the official documentation and loading the model under Python 3.10, model loading fails as shown in the screenshot below:
(screenshot)
While the model is being loaded into memory, RAM usage grows rapidly to 21 GB, after which the process is killed, so the model fails to load.
(screenshot)

Environment

(screenshot)

Other information

No response
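Being killed at around 21 GB of host RAM suggests the process ran out of memory while materializing weights in full precision. A sketch of `from_pretrained` keyword arguments often suggested to reduce peak host memory; the helper itself is illustrative, while `torch_dtype`, `low_cpu_mem_usage`, and `trust_remote_code` are standard transformers options:

```python
def low_ram_load_kwargs():
    # Illustrative: options commonly passed to AutoModelForCausalLM.from_pretrained
    # to keep peak host RAM low while loading a large checkpoint.
    return {
        "torch_dtype": "auto",      # keep the checkpoint's dtype (fp16/bf16) instead of upcasting to fp32
        "low_cpu_mem_usage": True,  # materialize weights lazily instead of building a full fp32 copy first
        "trust_remote_code": True,  # required for InternLM's custom model classes
    }

print(low_ram_load_kwargs()["low_cpu_mem_usage"])  # True
```

These would be splatted into the call, e.g. `AutoModelForCausalLM.from_pretrained(path, **low_ram_load_kwargs())`.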

[Bug] CUDA out of memory on A10 24G with device_map auto

Describe the bug

I can load the model into VRAM. However, when I call the chat methods, an exception occurs:

CUDA out of memory. Tried to allocate 1.57 GiB (GPU 0; 22.05 GiB total capacity; 19.76 GiB already allocated; 1.32 GiB free; 19.77 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Is it an expected behavior? How much VRAM do I need to run the model?

Environment

Python 3.8
Cuda 12.1
CPU: 8 cores, RAM: 32GB
GPU: A10, VRAM: 24GB

Other information

I set the device_map to auto since I need to load the model into GPU. Otherwise, it says

RuntimeError: [enforce fail at alloc_cpu.cpp:75] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 845152256 bytes. Error code 12 (Cannot allocate memory)
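Rough weight-memory arithmetic (a back-of-the-envelope sketch that ignores activations and the KV cache) shows why a 7B model is tight on a 24 GiB A10:

```python
def weights_gib(n_params: float, bytes_per_param: int) -> float:
    # Memory for the model weights alone; activations and the KV cache
    # grown during generation need additional headroom on top of this.
    return n_params * bytes_per_param / 1024**3

print(round(weights_gib(7e9, 4), 1))  # fp32: ~26.1 GiB, already over 24 GiB
print(round(weights_gib(7e9, 2), 1))  # fp16: ~13.0 GiB, fits with room for the KV cache
```

So loading in fp16/bf16 (e.g. `torch_dtype=torch.float16`) is generally needed to run a 7B model on 24 GB, and even then long generations can exhaust the remaining headroom, as the fragmentation hint in the error message suggests.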

Question about InternLM's general tool-calling capability

Describe the feature

The README lists this among the model's features:
"3. Provides a versatile toolset that lets users flexibly build their own workflows."
What does "workflow" mean here? Does it refer to the LLM calling external tools?

Would you like to implement this feature yourself?

  • I would like to implement this feature myself and contribute code to InternLM!

Training details?

Hi InternLM team, thank you for this open source contribution! InternLM looks like a really strong 7B model.

I think the research community would greatly benefit from learning about the training details of InternLM. Are you open to sharing the token budget and global batch size used for this model?

In the README I see this comment which suggests a token budget over 1T tokens:

It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.

And in the training performance README I see that the max performance was achieved at 16k tokens per GPU. If this was used across 1024 GPUs for pretraining it would imply a global batch size of 16M tokens which is larger than I've seen before (especially for 7B models).

Thank you again!
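The global batch size implied in the question above can be checked with simple arithmetic (assuming "16k tokens per GPU" means 16384; the GPU count is the questioner's reading of the training-performance README, not a confirmed figure):

```python
tokens_per_gpu = 16384   # "16k tokens per GPU" from the training-performance README
num_gpus = 1024          # hypothetical pretraining cluster size from the question
global_batch_tokens = tokens_per_gpu * num_gpus
print(global_batch_tokens)           # 16777216
print(global_batch_tokens / 2**20)   # 16.0 -> a ~16M-token global batch
```

A 16M-token global batch would indeed be unusually large for a 7B model, which is what makes the question worth asking.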
