Describe the bug: Following the Quick Start in the documentation, with ceval as the dataset and OPT-125M as the model; running python run.py con

Running python run.py configs/eval_demo.py -w outputs/demo is abnormally slow (opencompass issue, closed)

open-compass commented on May 20, 2024

Running python run.py configs/eval_demo.py -w outputs/demo is abnormally slow

Comments (2)

Ezra-Yu commented on May 20, 2024 1

The infer tasks finished at 22:45; nvidia-smi showed no GPU info because eval tasks do not need a GPU.

For infer tasks, an LLM in 'gen' mode is inherently slow, especially with pretrained models, which keep generating text until they hit the generation length limit. For example, the demo takes around 90 minutes on my test machine with a single 1660 Ti GPU.

niexufei commented on May 20, 2024 1

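The screenshot in the issue shows a run of the eval_demo.py config, but with the dataset changed to just two ceval subjects (to keep the data small and get the pipeline working end to end). As a result, the infer tasks finished very quickly, and by the time I checked the GPU status the run was already over.

Your explanation clears it up. Eval tasks are indeed meant to run on the CPU.

Separately, I later enabled all ceval subjects and ran again, and it was very slow once more.
The eventual cause: the ceval data was already available locally, but each run apparently still tried to download it from huggingface first and only fell back to the local copy after the request timed out, so every time execution reached:
Found cached dataset csv (/root/.cache/huggingface/datasets/csv/default-e2ed8a58cfad59df/0.0.0/eea64c71ca8b46dd3f537ed218fc9bf495d5707789152eb2764f5c78fa66d59d)
it hung for several minutes.

The fix is to set the following environment variables: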
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
export HF_EVALUATE_OFFLINE=1
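
If exporting in the shell is inconvenient, the same three flags can probably be set from Python instead. This is only a minimal sketch: the helper name `enable_hf_offline` is made up for illustration, and it assumes the libraries read these variables when they are imported, so the call has to happen before `datasets`, `transformers`, or `evaluate` are imported.

```python
import os

# The three variable names are the ones from the export lines above.
OFFLINE_VARS = ("HF_DATASETS_OFFLINE", "TRANSFORMERS_OFFLINE", "HF_EVALUATE_OFFLINE")


def enable_hf_offline(env=None):
    """Set the offline flags to "1" in `env` (os.environ by default) and return it.

    Hypothetical helper for illustration; not part of opencompass.
    """
    target = os.environ if env is None else env
    for name in OFFLINE_VARS:
        target[name] = "1"
    return target


# Assumption: call this before importing datasets / transformers / evaluate,
# since those libraries are expected to read the flags at import time.
enable_hf_offline()
```

Child processes started afterwards inherit the modified environment, so this should also cover tasks launched from the same Python process.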

Posting this here for anyone who hits the same problem later.
Thanks again.
