Comments (2)
The infer tasks finished at 22:45. nvidia-smi showed no GPU info because eval tasks do not need a GPU.
For infer tasks, the LLM in 'gen' mode is itself pretty slow, especially since pretrained models keep generating text until they hit the generation length limit. For example, the demo takes around 90 minutes on my test machine with one 1660 Ti GPU.
from opencompass.
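The state in the screenshot above is from running the eval_demo.py config, but with the datasets changed to just two subjects from ceval (to shrink the data and get the pipeline working end to end). So the infer tasks finished very quickly, and by the time I checked the GPU status they had already completed.
With your explanation, it is clear now. Eval tasks are supposed to run on the CPU.
Separately, I later enabled all of the Ceval subjects and started a run, and found that it was very slow again.
The root cause: the ceval data was already on local disk, but it seems that every run tried to download it from huggingface first and only loaded the local copy after that attempt timed out. As a result, every time execution reached:
Found cached dataset csv (/root/.cache/huggingface/datasets/csv/default-e2ed8a58cfad59df/0.0.0/eea64c71ca8b46dd3f537ed218fc9bf495d5707789152eb2764f5c78fa66d59d)
it hung for several minutes.
The fix is to set the following environment variables:
export HF_DATASETS_OFFLINE=1
export TRANSFORMERS_OFFLINE=1
export HF_EVALUATE_OFFLINE=1
I am posting this here as a reference for anyone who hits the same problem later.
Thanks again.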
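For anyone who would rather not export the offline flags for the whole shell session, they can also be scoped to a single command. The following is only a minimal sketch: `run_offline` is a hypothetical helper name, not part of opencompass, and it relies on the standard `env` utility to pass the three variables to one child process.

```shell
# Hypothetical helper (not part of opencompass): run one command with the
# Hugging Face offline flags set only for that command, so the rest of the
# shell session is left untouched.
run_offline() {
    env HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 HF_EVALUATE_OFFLINE=1 "$@"
}

# Quick check that the flags actually reach the child process.
run_offline sh -c 'echo "$HF_DATASETS_OFFLINE $TRANSFORMERS_OFFLINE $HF_EVALUATE_OFFLINE"'
```

Assuming the evaluation is started with something like `python run.py configs/eval_demo.py`, the same helper would be used as `run_offline python run.py configs/eval_demo.py`. The flags only help when the datasets and models are already in the local cache.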