

InternLM

👋 join us on Discord and WeChat

Introduction

InternLM2 series are released with the following features:

  • 200K Context window: Near-perfect needle-in-a-haystack retrieval over a 200K-token context, with leading performance on long-context tasks such as LongBench and L-Eval. Try it with LMDeploy for 200K-context inference.

  • Outstanding comprehensive performance: Significantly better than the last generation in all dimensions, especially in reasoning, math, code, chat experience, instruction following, and creative writing, with leading performance among open-source models of similar size. In some evaluations, InternLM2-Chat-20B may match or even surpass ChatGPT (GPT-3.5).

  • Code interpreter & Data analysis: With a code interpreter, InternLM2-Chat-20B achieves performance comparable to GPT-4 on GSM8K and MATH. InternLM2-Chat also provides data-analysis capabilities.

  • Stronger tool use: With improved instruction following, tool selection, and reflection capabilities, InternLM2 supports more kinds of agents and multi-step tool calling for complex tasks. See examples.

News

[2024.03.26] We release InternLM2 technical report. See arXiv for details.

[2024.01.31] We release InternLM2-1.8B, along with the associated chat model. They provide a cheaper deployment option while maintaining leading performance.

[2024.01.23] We release InternLM2-Math-7B and InternLM2-Math-20B with pretraining and SFT checkpoints. Despite their small size, they surpass ChatGPT. See InternLM-Math for details and download.

[2024.01.17] We release InternLM2-7B and InternLM2-20B and their corresponding chat models with stronger capabilities in all dimensions. See model zoo below for download or model cards for more details.

[2023.12.13] InternLM-7B-Chat and InternLM-20B-Chat checkpoints are updated. With an improved finetuning strategy, the new chat models can generate higher quality responses with greater stylistic diversity.

[2023.09.20] InternLM-20B is released with base and chat versions.

Model Zoo

Model Transformers(HF) ModelScope(HF) OpenXLab(HF) OpenXLab(Origin) Release Date
InternLM2-1.8B 🤗internlm2-1.8b internlm2-1.8b Open in OpenXLab Open in OpenXLab 2024-01-31
InternLM2-Chat-1.8B-SFT 🤗internlm2-chat-1.8b-sft internlm2-chat-1.8b-sft Open in OpenXLab Open in OpenXLab 2024-01-31
InternLM2-Chat-1.8B 🤗internlm2-chat-1.8b internlm2-chat-1.8b Open in OpenXLab Open in OpenXLab 2024-02-19
InternLM2-Base-7B 🤗internlm2-base-7b internlm2-base-7b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-7B 🤗internlm2-7b internlm2-7b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Chat-7B-SFT 🤗internlm2-chat-7b-sft internlm2-chat-7b-sft Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Chat-7B 🤗internlm2-chat-7b internlm2-chat-7b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Base-20B 🤗internlm2-base-20b internlm2-base-20b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-20B 🤗internlm2-20b internlm2-20b Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Chat-20B-SFT 🤗internlm2-chat-20b-sft internlm2-chat-20b-sft Open in OpenXLab Open in OpenXLab 2024-01-17
InternLM2-Chat-20B 🤗internlm2-chat-20b internlm2-chat-20b Open in OpenXLab Open in OpenXLab 2024-01-17

Notes:

The InternLM2 release contains two model sizes: 7B and 20B. The 7B models are efficient for research and application, while the 20B models are more powerful and support more complex scenarios. The relationships among these models are as follows.

  1. InternLM2-Base: Foundation models with high quality and high adaptation flexibility, which serve as a good starting point for downstream deep adaptations.
  2. InternLM2: Further pretrained on general-domain data and a domain-enhanced corpus, achieving state-of-the-art performance in evaluations along with strong language capability. InternLM2 models are recommended for most applications.
  3. InternLM2-Chat-SFT: Intermediate version of InternLM2-Chat that only undergoes supervised fine-tuning (SFT), based on the InternLM2-Base model. We release them to benefit research on alignment.
  4. InternLM2-Chat: Further aligned on top of InternLM2-Chat-SFT through online RLHF. InternLM2-Chat exhibits better instruction following, chat experience, and function calling, and is recommended for downstream applications.

Limitations: Although we have made efforts to ensure the safety of the model during the training process and to encourage the model to generate text that complies with ethical and legal requirements, the model may still produce unexpected outputs due to its size and probabilistic generation paradigm. For example, the generated responses may contain biases, discrimination, or other harmful content. Please do not propagate such content. We are not responsible for any consequences resulting from the dissemination of harmful information.

Supplements: HF refers to the format used by HuggingFace in transformers, whereas Origin denotes the format adopted by the InternLM team in InternEvo.

Performance

Objective Evaluation

Dataset Baichuan2-7B-Chat Mistral-7B-Instruct-v0.2 Qwen-7B-Chat InternLM2-Chat-7B ChatGLM3-6B Baichuan2-13B-Chat Mixtral-8x7B-Instruct-v0.1 Qwen-14B-Chat InternLM2-Chat-20B
MMLU 50.1 59.2 57.1 63.7 58.0 56.6 70.3 66.7 66.5
CMMLU 53.4 42.0 57.9 63.0 57.8 54.8 50.6 68.1 65.1
AGIEval 35.3 34.5 39.7 47.2 44.2 40.0 41.7 46.5 50.3
C-Eval 53.9 42.4 59.8 60.8 59.1 56.3 54.0 71.5 63.0
TriviaQA 37.6 35.0 46.1 50.8 38.1 40.3 57.7 54.5 53.9
NaturalQuestions 12.8 8.1 18.6 24.1 14.0 12.7 22.5 22.9 25.9
C3 78.5 66.9 84.4 91.5 79.3 84.4 82.1 91.5 93.5
CMRC 8.1 5.6 14.6 63.8 43.2 27.8 5.3 13.0 50.4
WinoGrande 49.9 50.8 54.2 65.8 61.7 50.9 60.9 55.7 74.8
BBH 35.9 46.5 45.5 61.2 56.0 42.5 57.3 55.8 68.3
GSM-8K 32.4 48.3 44.1 70.7 53.8 56.0 71.7 57.7 79.6
Math 5.7 8.6 12.0 23.0 20.4 4.3 22.5 27.6 31.9
HumanEval 17.7 35.4 36.0 59.8 52.4 19.5 37.8 40.9 67.1
MBPP 37.7 25.7 33.9 51.4 55.6 40.9 40.9 30.0 65.8
  • Performance on MBPP is reported with MBPP (Sanitized)

Alignment Evaluation

  • We evaluated our model on AlpacaEval 2.0, where InternLM2-Chat-20B surpasses Claude 2, GPT-4 (0613), and Gemini Pro.
Model Name Win Rate Length
GPT-4 Turbo 50.00% 2049
GPT-4 23.58% 1365
GPT-4 0314 22.07% 1371
Mistral Medium 21.86% 1500
XwinLM 70b V0.1 21.81% 1775
InternLM2 Chat 20B 21.75% 2373
Mixtral 8x7B v0.1 18.26% 1465
Claude 2 17.19% 1069
Gemini Pro 16.85% 1315
GPT-4 0613 15.76% 1140
Claude 2.1 15.73% 1096
  • Results are taken from the AlpacaEval leaderboard as released on 2024-01-17.

Requirements

  • Python >= 3.8
  • PyTorch >= 1.12.0 (2.0.0 and above are recommended)
  • Transformers >= 4.34
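As a quick sanity check, the minimum-version constraints above can be verified at runtime. A minimal sketch using only plain string handling (numeric comparison only; pre-release tags are ignored, so treat this as a rough check rather than full PEP 440 semantics):

```python
def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare dotted version strings numerically (pre-release tags ignored)."""
    as_tuple = lambda v: tuple(int(p) for p in v.split(".") if p.isdigit())
    return as_tuple(installed) >= as_tuple(minimum)

# The constraints above, checked against example installed versions.
print(meets_minimum("4.34.0", "4.34"))   # transformers requirement
print(meets_minimum("2.0.1", "1.12.0"))  # torch requirement
```

For production use, `packaging.version.Version` gives a fully spec-compliant comparison.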

Usages

We briefly show usage with Transformers, ModelScope, and Web demos. The chat models adopt the chatml format to support both chat and agent applications. For best results, make sure the installed transformers library meets the following requirement before running inference with Transformers or ModelScope:

transformers >= 4.34
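For reference, the chatml-style conversation template mentioned above can be rendered by hand. A minimal sketch — the exact special-token names (`<|im_start|>`, `<|im_end|>`) follow the public InternLM2 chat template and should be treated as an assumption; prefer the tokenizer's own chat template when available:

```python
def build_chatml_prompt(history, query):
    """Render a conversation turn-by-turn in a chatml-style template.

    history: list of (user_message, assistant_message) pairs.
    """
    parts = []
    for user_msg, assistant_msg in history:
        parts.append(f"<|im_start|>user\n{user_msg}<|im_end|>\n")
        parts.append(f"<|im_start|>assistant\n{assistant_msg}<|im_end|>\n")
    # Open an assistant turn for the model to complete.
    parts.append(f"<|im_start|>user\n{query}<|im_end|>\n<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt(
    [("hello", "Hello! How can I help you today?")],
    "please provide three suggestions about time management",
)
```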

Import from Transformers

To load the InternLM2-7B-Chat model using Transformers, use the following code:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded in float32, which may cause an OOM error.
model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
# (Optional) On low-resource devices, you can load the model in 4-bit or 8-bit via bitsandbytes to further save GPU memory.
# InternLM 7B in 4-bit takes roughly 8 GB of GPU memory.
# pip install -U bitsandbytes
# 8-bit: model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", device_map="auto", trust_remote_code=True, load_in_8bit=True)
# 4-bit: model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", device_map="auto", trust_remote_code=True, load_in_4bit=True)
model = model.eval()
response, history = model.chat(tokenizer, "hello", history=[])
print(response)
# Output: Hello? How can I help you today?
response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
print(response)
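To decide between float16 and the 4-bit/8-bit options above, a back-of-the-envelope estimate of weight memory is often enough. A rough sketch — weights only, so KV cache, activations, and CUDA overhead come on top (which is why 4-bit InternLM 7B ends up near 8 GB in practice):

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate GiB needed to hold the model weights alone."""
    return n_params * bits_per_param / 8 / 1024**3

N_7B = 7e9  # parameter count of a 7B model
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_gib(N_7B, bits):5.1f} GiB")
```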

Import from ModelScope

To load the InternLM2-7B-Chat model using ModelScope, use the following code:

import torch
from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b')
tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)
# Set `torch_dtype=torch.float16` to load the model in float16; otherwise it is loaded in float32, which may cause an OOM error.
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
# (Optional) On low-resource devices, you can load the model in 4-bit or 8-bit via bitsandbytes to further save GPU memory.
# InternLM 7B in 4-bit takes roughly 8 GB of GPU memory.
# pip install -U bitsandbytes
# 8-bit: model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, load_in_8bit=True)
# 4-bit: model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, load_in_4bit=True)
model = model.eval()
response, history = model.chat(tokenizer, "hello", history=[])
print(response)
response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
print(response)

Dialogue

You can interact with the InternLM Chat 7B model through a frontend interface by running the following code:

pip install streamlit
pip install "transformers>=4.34"
streamlit run ./chat/web_demo.py

Deployment

We use LMDeploy for fast deployment of InternLM.

With only four lines of code, you can perform internlm2-chat-7b inference after pip install "lmdeploy>=0.2.1".

from lmdeploy import pipeline
pipe = pipeline("internlm/internlm2-chat-7b")
response = pipe(["Hi, pls intro yourself", "Shanghai is"])
print(response)

Please refer to the guidance for more details on model deployment. For additional deployment tutorials, feel free to explore here.

200K-long-context Inference

By enabling LMDeploy's Dynamic NTK feature, you can perform long-context inference.

from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig

backend_config = TurbomindEngineConfig(rope_scaling_factor=2.0, session_len=200000)
pipe = pipeline('internlm/internlm2-chat-7b', backend_config=backend_config)
prompt = 'Use a long prompt to replace this sentence'
response = pipe(prompt)
print(response)
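The rope_scaling_factor above works by rescaling RoPE's base frequency rather than truncating positions. A sketch of the common "NTK-aware" interpolation formula — treating the exact way LMDeploy applies the factor as an assumption, with head_dim=128 as a typical value for 7B-class models:

```python
def ntk_scaled_rope_base(base: float, scaling_factor: float, head_dim: int) -> float:
    """Rescale the rotary base so the lowest-frequency dimensions span a
    longer context, per the NTK-aware interpolation formula."""
    return base * scaling_factor ** (head_dim / (head_dim - 2))

# With factor 2.0, the effective base roughly doubles.
new_base = ntk_scaled_rope_base(10000.0, 2.0, 128)
```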

Agent

InternLM2-Chat models have excellent tool-utilization capabilities and can work with function calls in a zero-shot manner. See more examples in the agent section.

Fine-tuning

Please refer to finetune docs for fine-tuning with InternLM.

Note: We have migrated all training functionality of this project to InternEvo for an easier user experience; InternEvo provides efficient pre-training and fine-tuning infrastructure for training InternLM.

Evaluation

We use OpenCompass for model evaluation. For InternLM2, we primarily focus on standard objective evaluation, long-context evaluation (needle in a haystack), data contamination assessment, agent evaluation, and subjective evaluation.

Objective Evaluation

To evaluate the InternLM model, please follow the guidelines in the OpenCompass tutorial. Typically, we use ppl for multiple-choice questions on the Base model and gen for all questions on the Chat model.
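The distinction between ppl and gen can be made concrete: ppl scores each candidate answer by the model's likelihood and picks the best one, while gen parses the generated text. A toy sketch of the ppl side — the scoring callable here is a stand-in; OpenCompass wires in the real model's per-token loss:

```python
def pick_answer_by_ppl(question, choices, avg_nll):
    """Score 'question + choice' with an average negative-log-likelihood
    callable and return the lowest-loss (most likely) choice."""
    return min(choices, key=lambda c: avg_nll(f"{question} {c}"))

# Stand-in scorer that simply prefers shorter strings, to exercise the selection.
toy_nll = lambda text: 0.1 * len(text)
best = pick_answer_by_ppl("2 + 2 =", ["four hundred", "4"], toy_nll)
```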

Long-Context Evaluation (Needle in a Haystack)

For the Needle in a Haystack evaluation, refer to the tutorial provided in the documentation. Feel free to try it out.

Data Contamination Assessment

To learn more about data contamination assessment, please check the contamination eval.

Agent Evaluation

  • To evaluate tool utilization, please refer to T-Eval.
  • For code interpreter evaluation, use the Math Agent Evaluation provided in the repository.

Subjective Evaluation

  • Please follow the tutorial for subjective evaluation.

Contribution

We appreciate all the contributors for their efforts to improve and enhance InternLM. Community users are highly encouraged to participate in the project. Please refer to the contribution guidelines for instructions on how to contribute to the project.

License

The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (English)/申请表(中文). For other questions or collaborations, please contact [email protected].

Citation

@misc{cai2024internlm2,
      title={InternLM2 Technical Report},
      author={Zheng Cai and Maosong Cao and Haojiong Chen and Kai Chen and Keyu Chen and Xin Chen and Xun Chen and Zehui Chen and Zhi Chen and Pei Chu and Xiaoyi Dong and Haodong Duan and Qi Fan and Zhaoye Fei and Yang Gao and Jiaye Ge and Chenya Gu and Yuzhe Gu and Tao Gui and Aijia Guo and Qipeng Guo and Conghui He and Yingfan Hu and Ting Huang and Tao Jiang and Penglong Jiao and Zhenjiang Jin and Zhikai Lei and Jiaxing Li and Jingwen Li and Linyang Li and Shuaibin Li and Wei Li and Yining Li and Hongwei Liu and Jiangning Liu and Jiawei Hong and Kaiwen Liu and Kuikun Liu and Xiaoran Liu and Chengqi Lv and Haijun Lv and Kai Lv and Li Ma and Runyuan Ma and Zerun Ma and Wenchang Ning and Linke Ouyang and Jiantao Qiu and Yuan Qu and Fukai Shang and Yunfan Shao and Demin Song and Zifan Song and Zhihao Sui and Peng Sun and Yu Sun and Huanze Tang and Bin Wang and Guoteng Wang and Jiaqi Wang and Jiayu Wang and Rui Wang and Yudong Wang and Ziyi Wang and Xingjian Wei and Qizhen Weng and Fan Wu and Yingtong Xiong and Chao Xu and Ruiliang Xu and Hang Yan and Yirong Yan and Xiaogui Yang and Haochen Ye and Huaiyuan Ying and Jia Yu and Jing Yu and Yuhang Zang and Chuyu Zhang and Li Zhang and Pan Zhang and Peng Zhang and Ruijie Zhang and Shuo Zhang and Songyang Zhang and Wenjian Zhang and Wenwei Zhang and Xingcheng Zhang and Xinyue Zhang and Hui Zhao and Qian Zhao and Xiaomeng Zhao and Fengzhe Zhou and Zaida Zhou and Jingming Zhuo and Yicheng Zou and Xipeng Qiu and Yu Qiao and Dahua Lin},
      year={2024},
      eprint={2403.17297},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}


internlm's Issues

[Bug] CUDA out of memory on 8× RTX 4090

Describe the bug

I found that a single RTX 4090 (24 GB) cannot load the InternLM-Chat-7B model. But when I use device_map="auto", an exception occurs:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 172.00 MiB (GPU 0; 23.65 GiB total capacity; 22.99 GiB already allocated; 154.06 MiB free; 22.99 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I don't know why it didn't utilize the multiple GPUs evenly; as a result, the first GPU runs out of memory.


Here is the code I use:

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", device_map="auto", trust_remote_code=True).cuda()
model = model.eval()

response, history = model.chat(tokenizer, "hello", history=[])
print(response)

The error happens when executing
model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", device_map="auto", trust_remote_code=True).cuda()

What's the reason for this error?

Environment

python 3.8
cuda 11.7

Other information

No response

Some mistakes in the README about weight conversion

Describe the bug

  1. In tools/transformers/README.md, "python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer ../v7_sft.model":
    v7_sft.model should be V7_sft.model.
  2. In README.md, under Convert to Transformers Format, "With the help of tools/convert2hf.py": tools/convert2hf.py should be tools/transformers/convert2hf.py.
  3. In README.md, under Convert to Transformers Format, "python convert2hf.py --src_folder origin_ckpt/ --tgt_folder hf_ckpt/ --tokenizer tokenizes/tokenizer.model": this demo should be updated according to tools/transformers/README.md.

The same mistakes appear in the docs for other languages.

Environment

conda env

Other information

No response

Model runs multi-turn dialogue on its own

Describe the bug

Hello, when calling the internlm-chat-7b model, we found that for some prompts the model continues the conversation for multiple turns on its own. The official chat example, model.chat, internally uses response.split("")[0] to ensure that only the first reply is returned. However, if we want to use this model for follow-up experiments such as RLHF, calling model.generate also outputs the subsequent turns, and those tokens enter the computation, which is unacceptable. Is there a way to avoid generating multi-turn content?

Environment

Other information

No response

[Bug] Need to import torch

Describe the bug

It is a small problem. When I follow the README section Import from Transformers to initialize the model, an error occurs (NameError: name 'torch' is not defined), so it is necessary to import torch first.

Also, I wonder whether internlm-chat-7b can be loaded across multiple GPUs, since my machine has 8 GPUs with 11 GB each.

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-chat-7b", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm-chat-7b", torch_dtype=torch.float16, trust_remote_code=True).cuda()

Environment

torch              1.13.1+cu117
torch-scatter      2.1.1+pt113cu117
torchaudio         0.13.1+cu117
torchvision        0.14.1+cu117

Other information

No response

Installation per requirements/torch.txt error

Hi, I tried to install per requirements/torch.txt and got the following errors. Did I miss anything?

 ➜  InternLM git:(main) pip install -r requirements/torch.txt 
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu117
ERROR: Could not find a version that satisfies the requirement torch==1.13.1+cu117 (from versions: 2.0.0, 2.0.1)
ERROR: No matching distribution found for torch==1.13.1+cu117

My environment is:

  • macOS 13.4.1
  • python 3.11
  • venv

How can I get a training throughput of over 180 TFLOPS?

I ran the code but only got 90+ TFLOPS.

INFO train.py:317 in record_current_batch_training_metrics -- tflops=93.48098385143103,step=9,loss=7.502509117126465,tgs (tokens/gpu/second)=2104.89,lr=2.2e-06,loss_scale=65536.0,grad_norm=20.60409540743281,micro_num=4,num_consumed_tokens=2621440,inf_nan_skip_batches=0,num_samples_in_batch=13,largest_length=2048,largest_batch=4,smallest_batch=3,adam_beta2=0.95,fwd_bwd_time=6.15

What is the prompt format used in the chat model?

I want to fine-tune the chat model on an instruction dataset.

Can you share the prompt format used in your chat model?

USER: <instruction>\nASSISTANT: <output>
Is this right?

Does your company misunderstand the Apache-2.0 open-source license?

Describe the feature

Your open-source license says:

The code in this repository is open-sourced under the Apache-2.0 license. Model weights are fully open for academic research, and free commercial-use authorization can also be applied for (application form). For other questions and collaborations, please contact [email protected].

But the Apache-2.0 license itself already permits commercial use; there is no need to apply to you for it!

Application restrictions

Moreover, your application form only supports companies, not individuals. I don't quite understand what "fully open source, free for commercial use, Shanghai AI Lab lowers the bar for large models" is supposed to mean.

Can you explain how you reached this decision?

Would you like to implement this feature yourself?

  • I'd like to implement this feature myself and contribute code to InternLM!

Can you provide a prompt for Retrieval QA?

Thanks for open-sourcing!

I want to run a set of Retrieval QA tests. Which prompt would be most suitable? I tried several prompts but none worked well; for example, with the one below internLM-7b-chat answers in English, while answers obtained directly through the chat interface are very short.

    prompt_template = f"""User: 已知信息:
{context}

根据上述已知信息,专业的回答用户的问题。如果无法从中得到答案,请说 “抱歉,根据已知信息无法回答该问题”,不允许在答案中添加编造成分,答案请使用中文。

问题:{question}

Assistant:"""

[Bug]

Describe the bug

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x90 in position 25401: invalid start byte
The above exception was the direct cause of the following exception:

╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /data/llm/anaconda3/envs/llm/lib/python3.9/site-packages/transformers/modeling_utils.py:457 in │
│ load_state_dict │
│ │
│ 454 │ │ │ │ │ │ "you cloned." │
│ 455 │ │ │ │ │ ) │
│ 456 │ │ │ │ else: │
│ ❱ 457 │ │ │ │ │ raise ValueError( │
│ 458 │ │ │ │ │ │ f"Unable to locate the file {checkpoint_file} which is necessary │
│ 459 │ │ │ │ │ │ "model. Make sure you have saved the model properly." │
│ 460 │ │ │ │ │ ) from e │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: Unable to locate the file /data/llm/internlm-chat-7b/pytorch_model-00001-of-00002.bin which is
necessary to load this pretrained model. Make sure you have saved the model properly.

Environment

python3.9

Other information

No response

[Feature] Standardize the pre-commit hook

Describe the feature

The current pre-commit hook rules are not fully aligned with those of lmdeploy and opencompass; they need to be aligned and verified uniformly.

  1. Align the lint rules
  2. Make the current repository pass pre-commit

Would you like to implement this feature yourself?

  • I'd like to implement this feature myself and contribute code to InternLM!

[Feature] Support running without apex or flash-attention

Describe the feature

The repository currently depends on two complex libraries, apex and flash-attention; it should be able to fall back to running without them.

  1. Provide a fallback execution mode that works without apex or flash-attention
  2. Support running opencompass evaluation directly
  3. Support installation via pip

Would you like to implement this feature yourself?

  • I'd like to implement this feature myself and contribute code to InternLM!

Error when running inference according to the instructions

model = AutoModelForCausalLM.from_pretrained("/home/hope/work/models/internlm-chat-7b-8k", trust_remote_code=True).to(torch.bfloat16).cuda()
Loading checkpoint shards: 100%|███████████████████████████████| 2/2 [00:11<00:00, 5.90s/it]
Traceback (most recent call last):
File "", line 1, in
NameError: name 'torch' is not defined
import torch
tokenizer = AutoTokenizer.from_pretrained("/home/hope/work/models/internlm-chat-7b-8k", trust_remote_code=True)
model = model.eval()
Traceback (most recent call last):
File "", line 1, in
NameError: name 'model' is not defined

[Bug] 'Aborted' when tokenizing alpaca_data.json

Describe the bug

Hi, I want to fine-tune your model, but this error occurred directly on the cluster. What could be causing it?
python tools/alpaca_tokenizer.py /mnt/petrelfs/wangxiaochen/workspace/alpaca_data.json /mnt/petrelfs/wangxiaochen/workspace/dataset /mnt/petrelfs/wangxiaochen/workspace/pretrained_models/internlm-chat-7b --split_ratio 0.1

Aborted

Environment

Everything was set up per the instructions, but mpfr and mpc are not yet configured. The files exist in the cluster's share area but cannot be compiled or referenced.

Other information

No response

The CLI prints the prompt log when I exit the dialogue with Ctrl+C

Describe the bug

The prompt log printed when exiting the dialogue:

Loading checkpoint shards: 100%|██████████| 2/2 [00:18<00:00, 9.30s/it]
^Csrun: interrupt (one more within 1 sec to abort)
srun: StepId=7154040.0 task 0: running
^Csrun: sending Ctrl-C to StepId=7154040.0
load model end.
load model begin.
load model end.
cur real input:
<|User|>:介绍下你自己~
<|Bot|>:

cur total response:
你好!我是一个人工智能助手,我的名字是书生·浦语,我致力于帮助用户解决问题和提供信息服务。我能够回答问题、提供定义和解释、将文本从一种语言翻译成另一种语言、总结文本、生成文本、编写故事、分析情感、提供推荐、开发算法、编写代码以及其他任何基于语言的任务。我的设计理念是有用、诚实并且无害,我使用深度学习技术进行构建并且不断从与用户的对话中学习。

load model begin.
load model end.
cur real input:
<|User|>:介绍下你自己~
<|Bot|>:你好!我是一个人工智能助手,我的名字是书生·浦语,我致力于帮助用户解决问题和提供信息服务。我能够回答问题、提供定义和解释、将文本从一种语言翻译成另一种语言、总结文本、生成文本、编写故事、分析情感、提供推荐、开发算法、编写代码以及其他任何基于语言的任务。我的设计理念是有用、诚实并且无害,我使用深度学习技术进行构建并且不断从与用户的对话中学习。
<|User|>:我们来对联吧,上联:生意如春意
<|Bot|>:

cur total response:
下联:婚姻若水纹

load model begin.
load model end.
cur real input:
<|User|>:介绍下你自己~
<|Bot|>:你好!我是一个人工智能助手,我的名字是书生·浦语,我致力于帮助用户解决问题和提供信息服务。我能够回答问题、提供定义和解释、将文本从一种语言翻译成另一种语言、总结文本、生成文本、编写故事、分析情感、提供推荐、开发算法、编写代码以及其他任何基于语言的任务。我的设计理念是有用、诚实并且无害,我使用深度学习技术进行构建并且不断从与用户的对话中学习。
<|User|>:我们来对联吧,上联:生意如春意
<|Bot|>:下联:婚姻若水纹
<|User|>:GOOD
<|Bot|>:

cur total response:
非常感谢!

load model begin.
load model end.
cur real input:
<|User|>:介绍下你自己~
<|Bot|>:你好!我是一个人工智能助手,我的名字是书生·浦语,我致力于帮助用户解决问题和提供信息服务。我能够回答问题、提供定义和解释、将文本从一种语言翻译成另一种语言、总结文本、生成文本、编写故事、分析情感、提供推荐、开发算法、编写代码以及其他任何基于语言的任务。我的设计理念是有用、诚实并且无害,我使用深度学习技术进行构建并且不断从与用户的对话中学习。
<|User|>:我们来对联吧,上联:生意如春意
<|Bot|>:下联:婚姻若水纹
<|User|>:GOOD
<|Bot|>:非常感谢!
<|User|>:展开说明为什么这样对下联
<|Bot|>:

Environment

slurm platform

Other information

No response

Loading the 7B model per the README only outputs token IDs

Describe the bug

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("internlm/internlm-7b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("internlm/internlm-7b", trust_remote_code=True).cuda()
model = model.eval()
inputs = tokenizer(["A beautiful flower"], return_tensors="pt")
for k,v in inputs.items():
    inputs[k] = v.cuda()
gen_kwargs = {"max_length": 128, "top_p": 0.8, "temperature": 0.8, "do_sample": True, "repetition_penalty": 1.1}
output = model.generate(**inputs, **gen_kwargs)
print(output)
Running this code only outputs token IDs.

Environment


Other information

No response

Docker

Can you provide a Docker image so we can deploy it out of the box?

Does it support Windows?

Describe the feature

Thank you for sharing your work.
I'd like to confirm whether Windows support is currently planned, and if so, roughly when. Thanks!

Would you like to implement this feature yourself?

  • I'd like to implement this feature myself and contribute code to InternLM!

[Bug] rotary_emb module not found

Describe the bug

When trying to fine-tune with torchrun, the run fails because the rotary_emb module cannot be found.

Environment

CUDA Version: 12.1
torch 1.13.1

Other information

No response

[Bug] apex fails to install

Describe the bug

  • Python == 3.10
  • GCC == 10.2.0
  • MPFR == 4.1.0
  • CUDA == 11.7
  • Pytorch == 1.13.1+cu117
  • Transformers >= 4.25.1
  • All of the above are installed, yet apex still fails to compile normally.

Environment

packages/torch/include/torch/csrc/api/include/torch/serialize/archive.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/samplers/serialize.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/samplers.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/datasets/chunk.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/datasets.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/ir/attributes.h:35:27: note: ‘torch::jit::toString’
35 | static inline const char* toString(AttributeKind kind) {
| ^~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp: In function ‘std::vectorat::Tensor linear_gelu_linear_backward(at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor, at::Tensor)’:
/data2/InternLM/apex/csrc/fused_dense.cpp:149:73: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
149 | auto d_weight1 = at::empty({hidden_features, in_features}, input.type());
| ^
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/core/Tensor.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/DeviceGuard.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/ATen.h:11,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/core/TensorBody.h:213:30: note: declared here
213 | DeprecatedTypeProperties & type() const {
| ^~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:150:74: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
150 | auto d_weight2 = at::empty({out_features, hidden_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:151:58: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
151 | auto d_bias1 = at::empty({hidden_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:152:55: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
152 | auto d_bias2 = at::empty({out_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:153:66: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
153 | auto d_input = at::empty({batch_size, in_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:154:72: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
154 | auto d_output1 = at::empty({batch_size, hidden_features}, input.type());
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:157:55: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
157 | auto lt_workspace = at::empty({1 << 22}, input.type());
| ^
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/ATen.h:13,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data2/InternLM/apex/csrc/fused_dense.cpp: In lambda function:
/data2/InternLM/apex/csrc/fused_dense.cpp:159:94: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:312:28: note: in definition of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
312 | const auto& the_type = TYPE;
| ^~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:314:56: warning: ‘c10::ScalarType detail::scalar_type(const at::DeprecatedTypeProperties&)’ is deprecated: passing at::DeprecatedTypeProperties to an AT_DISPATCH macro is deprecated, pass an at::ScalarType instead [-Wdeprecated-declarations]
314 | at::ScalarType st = ::detail::scalar_type(the_type);
| ^
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:178:23: note: declared here
178 | inline at::ScalarType scalar_type(const at::DeprecatedTypeProperties& t) {
| ^~~~~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp: In lambda function:
/data2/InternLM/apex/csrc/fused_dense.cpp:163:10: warning: unused variable ‘result’ [-Wunused-variable]
163 | auto result = linear_gelu_linear_backward_cuda<scalar_t>(
| ^~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:68:12: note: in definition of macro ‘AT_PRIVATE_CASE_TYPE_USING_HINT’
68 | return VA_ARGS();
| ^~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:317:7: note: in expansion of macro ‘AT_PRIVATE_CASE_TYPE’
317 | AT_PRIVATE_CASE_TYPE(NAME, at::ScalarType::Double, double, VA_ARGS)
| ^~~~~~~~~~~~~~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/core/Device.h:5,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/core/Allocator.h:6,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/ATen.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/types.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data2/InternLM/apex/csrc/fused_dense.cpp: In lambda function:
/data2/InternLM/apex/csrc/fused_dense.cpp:159:94: warning: ‘at::DeprecatedTypeProperties& at::Tensor::type() const’ is deprecated: Tensor.type() is deprecated. Instead use Tensor.options(), which in many cases (e.g. in a constructor) is a drop-in replacement. If you were using data from type(), that is now available from Tensor itself, so instead of tensor.type().scalar_type(), use tensor.scalar_type() instead and instead of tensor.type().backend() use tensor.device(). [-Wdeprecated-declarations]
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:260:39: note: in definition of macro ‘C10_EXPAND_MSVC_WORKAROUND’
260 | #define C10_EXPAND_MSVC_WORKAROUND(x) x
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:465:9: note: in expansion of macro ‘TORCH_CHECK_MSG’
465 | TORCH_CHECK_MSG(cond, "", ##VA_ARGS));
| ^~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:637:32: note: in expansion of macro ‘TORCH_CHECK’
637 | C10_EXPAND_MSVC_WORKAROUND(TORCH_CHECK(false, ::c10::str(VA_ARGS)));
| ^~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:9: note: in expansion of macro ‘AT_ERROR’
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:51: error: ‘toString’ was not declared in this scope
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:260:39: note: in definition of macro ‘C10_EXPAND_MSVC_WORKAROUND’
260 | #define C10_EXPAND_MSVC_WORKAROUND(x) x
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:465:9: note: in expansion of macro ‘TORCH_CHECK_MSG’
465 | TORCH_CHECK_MSG(cond, "", ##VA_ARGS));
| ^~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:637:32: note: in expansion of macro ‘TORCH_CHECK’
637 | C10_EXPAND_MSVC_WORKAROUND(TORCH_CHECK(false, ::c10::str(VA_ARGS)));
| ^~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:9: note: in expansion of macro ‘AT_ERROR’
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:51: note: suggested alternatives:
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:260:39: note: in definition of macro ‘C10_EXPAND_MSVC_WORKAROUND’
260 | #define C10_EXPAND_MSVC_WORKAROUND(x) x
| ^
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:465:9: note: in expansion of macro ‘TORCH_CHECK_MSG’
465 | TORCH_CHECK_MSG(cond, "", ##VA_ARGS));
| ^~~~~~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/util/Exception.h:637:32: note: in expansion of macro ‘TORCH_CHECK’
637 | C10_EXPAND_MSVC_WORKAROUND(TORCH_CHECK(false, ::c10::str(VA_ARGS)));
| ^~~~~~~~~~~
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/Dispatch.h:330:9: note: in expansion of macro ‘AT_ERROR’
330 | AT_ERROR(#NAME, " not implemented for '", toString(TYPE), "'");
| ^~~~~~~~
/data2/InternLM/apex/csrc/fused_dense.cpp:159:3: note: in expansion of macro ‘AT_DISPATCH_FLOATING_TYPES_AND2’
159 | AT_DISPATCH_FLOATING_TYPES_AND2(at::ScalarType::Half, at::ScalarType::BFloat16, input.type(), "linear_bias_backward", [&] {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/core/op_registration/infer_schema.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/library.h:61,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/types.h:12,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader_options.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/base.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader/stateful.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/dataloader.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/ATen/core/function_schema.h:522:20: note: ‘c10::toString’
522 | inline std::string toString(const FunctionSchema& schema) {
| ^~~~~~~~
In file included from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/ir/ir.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/api/function_impl.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/api/method.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/api/object.h:6,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/api/module.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/serialize/input-archive.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/serialize/archive.h:3,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/samplers/serialize.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/samplers.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/datasets/chunk.h:7,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data/datasets.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/data.h:4,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:8,
from /data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/extension.h:4,
from /data2/InternLM/apex/csrc/fused_dense.cpp:1:
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/jit/ir/attributes.h:35:27: note: ‘torch::jit::toString’
35 | static inline const char* toString(AttributeKind kind) {
| ^~~~~~~~
[2/2] /usr/local/cuda-11.3/bin/nvcc -I/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include -I/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/TH -I/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda-11.3/include -I/data1/lym/anaconda3/envs/lab2/include/python3.10 -c -c /data2/InternLM/apex/csrc/fused_dense_cuda.cu -o /data2/InternLM/apex/build/temp.linux-x86_64-cpython-310/csrc/fused_dense_cuda.o -D__CUDA_NO_HALF_OPERATORS__
-D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DVERSION_GE_1_1 -DVERSION_GE_1_3 -DVERSION_GE_1_5 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=fused_dense_cuda -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++14
/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/core/SymInt.h(84): warning: integer conversion resulted in a change of sign

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1631): warning: variable "beta_zero" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1755): warning: variable "alpha" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1756): warning: variable "beta_zero" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1757): warning: variable "status" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1812): warning: variable "alpha" was declared but never referenced

/data2/InternLM/apex/csrc/fused_dense_cuda.cu(1813): warning: variable "beta_zero" was declared but never referenced

/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/include/c10/core/SymInt.h(84): warning: integer conversion resulted in a change of sign

ninja: build stopped: subcommand failed.
Traceback (most recent call last):
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1808, in _run_ninja_build
subprocess.run(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/subprocess.py", line 524, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<string>", line 34, in <module>
File "/data2/InternLM/apex/setup.py", line 795, in <module>
setup(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/command/install.py", line 74, in run
return orig.install.run(self)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/install.py", line 697, in run
self.run_command('build')
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 131, in run
self.run_command(cmd_name)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 318, in run_command
self.distribution.run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/dist.py", line 1234, in run_command
super().run_command(command)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 84, in run
_build_ext.run(self)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 345, in run
self.build_extensions()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 765, in build_extensions
build_ext.build_extensions(self)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 467, in build_extensions
self._build_extensions_serial()
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 493, in _build_extensions_serial
self.build_extension(ext)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 246, in build_extension
_build_ext.build_extension(self, ext)
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 548, in build_extension
objects = self.compiler.compile(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 586, in unix_wrap_ninja_compile
_write_ninja_file_and_compile_objects(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1487, in _write_ninja_file_and_compile_objects
_run_ninja_build(
File "/data1/lym/anaconda3/envs/lab2/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1824, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension
error: subprocess-exited-with-error

× Running setup.py install for apex did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /data1/lym/anaconda3/envs/lab2/bin/python -u -c '
exec(compile('"'"''"'"''"'"'

This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py

- It imports setuptools before invoking setup.py, to enable projects that directly

import from distutils.core to work with newer packaging standards.

- It provides a clear error message when setuptools is not installed.

- It sets sys.argv[0] to the underlying setup.py, when invoking setup.py so

setuptools doesn'"'"'t think the script is -c. This avoids the following warning:

manifest_maker: standard file '"'"'-c'"'"' not found".

- It generates a shim setup.py, for handling setup.cfg-only projects.

import os, sys, tokenize

try:
import setuptools
except ImportError as error:
print(
"ERROR: Can not execute setup.py since setuptools is not available in "
"the build environment.",
file=sys.stderr,
)
sys.exit(1)

file = %r
sys.argv[0] = file

if os.path.exists(file):
filename = file
with tokenize.open(file) as f:
setup_py_code = f.read()
else:
filename = "<auto-generated setuptools caller>"
setup_py_code = "from setuptools import setup; setup()"

exec(compile(setup_py_code, filename, "exec"))
'"'"''"'"''"'"' % ('"'"'/data2/InternLM/apex/setup.py'"'"',), "", "exec"))' --cpp_ext --cuda_ext install --record /tmp/pip-record-_mp6xe2r/install-record.txt --single-version-externally-managed --compile --install-headers /data1/lym/anaconda3/envs/lab2/include/python3.10/apex
cwd: /data2/InternLM/apex/
Running setup.py install for apex ... error
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> apex

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Other information

No response
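A frequent root cause of apex extension build failures like the one above is a mismatch between the CUDA toolkit that nvcc belongs to and the CUDA version PyTorch was built against; apex's setup.py checks for this and refuses or warns when they disagree. The helper below is purely illustrative (it is not part of apex) and sketches that version comparison:

```python
# Illustrative helper (not from the apex repo): apex's setup.py compares the
# nvcc CUDA version with torch.version.cuda, and a mismatch is a common cause
# of extension compile failures like the one shown above.
def cuda_versions_compatible(torch_cuda: str, nvcc_cuda: str) -> bool:
    """Compare major.minor CUDA versions, e.g. torch.version.cuda vs nvcc's."""
    return torch_cuda.split(".")[:2] == nvcc_cuda.split(".")[:2]

print(cuda_versions_compatible("11.3", "11.3"))  # True: versions agree
print(cuda_versions_compatible("11.3", "11.7"))  # False: reinstall torch or the toolkit
```

In practice the two versions can be inspected with `python -c "import torch; print(torch.version.cuda)"` and `nvcc --version` before attempting the build.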

Which languages are available in the training dataset?

It would be extremely helpful if you could share the list of languages that are included in the dataset used to train the model. If the dataset is available on Hugging Face, it would be even better if you could provide a direct link to it.

[Bug] from_pretrained raise ValueError

Describe the bug

With the environment already set up (including the transformers library), the error says the corresponding class cannot be found. How can this be resolved?
File "/HOME/scz3924/.conda/envs/InternLM/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 719, in from_pretrained
raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers_modules.internlm.internlm-chat-7b.dd2fa16d14c8b21fea4b4c168b9fef839154e305.configuration_internlm.InternLMConfig'> to build an AutoTokenizer.
Model type should be one of AlbertConfig, AlignConfig, BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, BlenderbotConfig, BlenderbotSmallConfig, BlipConfig, Blip2Config, BloomConfig, BridgeTowerConfig, CamembertConfig, CanineConfig, ChineseCLIPConfig, ClapConfig, CLIPConfig, CLIPSegConfig, CodeGenConfig, ConvBertConfig, CpmAntConfig, CTRLConfig, Data2VecTextConfig, DebertaConfig, DebertaV2Config, DistilBertConfig, DPRConfig, ElectraConfig, ErnieConfig, ErnieMConfig, EsmConfig, FlaubertConfig, FNetConfig, FSMTConfig, FunnelConfig, GitConfig, GPT2Config, GPT2Config, GPTBigCodeConfig, GPTNeoConfig, GPTNeoXConfig, GPTNeoXJapaneseConfig, GPTJConfig, GPTSanJapaneseConfig, GroupViTConfig, HubertConfig, IBertConfig, JukeboxConfig, LayoutLMConfig, LayoutLMv2Config, LayoutLMv3Config, LEDConfig, LiltConfig, LlamaConfig, LongformerConfig, LongT5Config, LukeConfig, LxmertConfig, M2M100Config, MarianConfig, MBartConfig, MegaConfig, MegatronBertConfig, MgpstrConfig, MobileBertConfig, MPNetConfig, MT5Config, MvpConfig, NezhaConfig, NllbMoeConfig, NystromformerConfig, OneFormerConfig, OpenAIGPTConfig, OPTConfig, OwlViTConfig, PegasusConfig, PegasusXConfig, PerceiverConfig, Pix2StructConfig, PLBartConfig, ProphetNetConfig, QDQBertConfig, RagConfig, RealmConfig, ReformerConfig, RemBertConfig, RetriBertConfig, RobertaConfig, RobertaPreLayerNormConfig, RoCBertConfig, RoFormerConfig, RwkvConfig, Speech2TextConfig, Speech2Text2Config, SpeechT5Config, SplinterConfig, SqueezeBertConfig, SwitchTransformersConfig, T5Config, TapasConfig, TransfoXLConfig, ViltConfig, VisualBertConfig, Wav2Vec2Config, Wav2Vec2ConformerConfig, WhisperConfig, XCLIPConfig, XGLMConfig, XLMConfig, XLMProphetNetConfig, XLMRobertaConfig, XLMRobertaXLConfig, XLNetConfig, XmodConfig, YosoConfig.

Environment

The requirements are satisfied.

Other information

No response
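The "Unrecognized configuration class" error above typically appears when transformers is not allowed to load the custom classes shipped inside the model repository. A minimal sketch of the commonly suggested fix, using the standard transformers API (the helper name and default path are illustrative):

```python
def load_internlm_tokenizer(model_path="internlm/internlm-chat-7b"):
    # trust_remote_code=True lets transformers import the custom
    # InternLMConfig / tokenizer classes bundled with the model repo,
    # which the built-in AutoTokenizer mapping (the long list above) lacks.
    from transformers import AutoTokenizer
    return AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
```

Without `trust_remote_code=True`, AutoTokenizer falls back to its built-in config-to-tokenizer table and raises exactly the ValueError shown.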

The demo code about tokenizer.py in tools dir exists some mistakes

Describe the bug

In the demo command python tools/tokenizer.py --raw_data_name your_raw_data_file_name(without suffix) --input_file_type 'text' or 'json' or 'jsonl' --bin your_output_bin_path, 'text' should be 'txt'.
The same applies to this demo command: python tools/tokenizer.py --raw_data_name raw_data --input_file_type 'text' --bin cn/output.bin
This affects both the Chinese and English versions of the documentation, and any other language versions if they exist.

Environment

any browser

Other information

No response

[Bug] Model loading fails when running in a Python 3.10 environment

Describe the bug

After setting up the environment following the official documentation and loading the model under Python 3.10, model loading fails as shown in the screenshot below:
(screenshot)
While the model is being loaded into memory, RAM usage grows rapidly to 21 GB, after which the process is killed, so the model fails to load.
(screenshot)

Environment

(screenshot)

Other information

No response
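Being killed at around 21 GB of host RAM suggests the process ran out of memory while materializing weights in full precision. A sketch of `from_pretrained` keyword arguments often suggested to reduce peak host memory; the helper itself is illustrative, while `torch_dtype`, `low_cpu_mem_usage`, and `trust_remote_code` are standard transformers options:

```python
def low_ram_load_kwargs():
    # Illustrative: options commonly passed to AutoModelForCausalLM.from_pretrained
    # to keep peak host RAM low while loading a large checkpoint.
    return {
        "torch_dtype": "auto",      # keep the checkpoint's dtype (fp16/bf16) instead of upcasting to fp32
        "low_cpu_mem_usage": True,  # materialize weights lazily instead of building a full fp32 copy first
        "trust_remote_code": True,  # required for InternLM's custom model classes
    }

print(low_ram_load_kwargs()["low_cpu_mem_usage"])  # True
```

These would be splatted into the call, e.g. `AutoModelForCausalLM.from_pretrained(path, **low_ram_load_kwargs())`.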

[Bug] CUDA out of memory on A10 24G with device_map auto

Describe the bug

I can load the model into VRAM. However, when I call the chat methods, an exception occurs:

CUDA out of memory. Tried to allocate 1.57 GiB (GPU 0; 22.05 GiB total capacity; 19.76 GiB already allocated; 1.32 GiB free; 19.77 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Is it an expected behavior? How much VRAM do I need to run the model?

Environment

Python 3.8
Cuda 12.1
CPU: 8 cores, RAM: 32GB
GPU: A10, VRAM: 24GB

Other information

I set the device_map to auto since I need to load the model into GPU. Otherwise, it says

RuntimeError: [enforce fail at alloc_cpu.cpp:75] err == 0. DefaultCPUAllocator: can't allocate memory: you tried to allocate 845152256 bytes. Error code 12 (Cannot allocate memory)
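Rough weight-memory arithmetic (a back-of-the-envelope sketch that ignores activations and the KV cache) shows why a 7B model is tight on a 24 GiB A10:

```python
def weights_gib(n_params: float, bytes_per_param: int) -> float:
    # Memory for the model weights alone; activations and the KV cache
    # grown during generation need additional headroom on top of this.
    return n_params * bytes_per_param / 1024**3

print(round(weights_gib(7e9, 4), 1))  # fp32: ~26.1 GiB, already over 24 GiB
print(round(weights_gib(7e9, 2), 1))  # fp16: ~13.0 GiB, fits with room for the KV cache
```

So loading in fp16/bf16 (e.g. `torch_dtype=torch.float16`) is generally needed to run a 7B model on 24 GB, and even then long generations can exhaust the remaining headroom, as the fragmentation hint in the error message suggests.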

Question about InternLM's general tool-calling capability

Describe the feature

The README lists this among the model's features:
"3. Provides a versatile toolset that lets users flexibly build their own workflows."
What does "workflow" mean here? Does it refer to the LLM calling external tools?

Would you like to implement this feature yourself?

  • I would like to implement this feature myself and contribute code to InternLM!

Training details?

Hi InternLM team, thank you for this open source contribution! InternLM looks like a really strong 7B model.

I think the research community would greatly benefit from learning about the training details of InternLM. Are you open to sharing the token budget and global batch size used for this model?

In the README I see this comment which suggests a token budget over 1T tokens:

It leverages trillions of high-quality tokens for training to establish a powerful knowledge base.

And in the training performance README I see that the max performance was achieved at 16k tokens per GPU. If this was used across 1024 GPUs for pretraining it would imply a global batch size of 16M tokens which is larger than I've seen before (especially for 7B models).

Thank you again!
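The global batch size implied in the question above can be checked with simple arithmetic (assuming "16k tokens per GPU" means 16384; the GPU count is the questioner's reading of the training-performance README, not a confirmed figure):

```python
tokens_per_gpu = 16384   # "16k tokens per GPU" from the training-performance README
num_gpus = 1024          # hypothetical pretraining cluster size from the question
global_batch_tokens = tokens_per_gpu * num_gpus
print(global_batch_tokens)           # 16777216
print(global_batch_tokens / 2**20)   # 16.0 -> a ~16M-token global batch
```

A 16M-token global batch would indeed be unusually large for a 7B model, which is what makes the question worth asking.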
