[Bug] Partitioned tasks hang when using VLLM, but not with HuggingFaceCausalLM (opencompass, OPEN)

IcyFeather233 commented on June 12, 2024

[Bug] Partitioned tasks hang when using VLLM, but not with HuggingFaceCausalLM

from opencompass.

Comments (6)

ww0o0 commented on June 12, 2024 1

I ran into this problem too: with vllm, the run hangs once the first task has finished evaluating.

ww0o0 commented on June 12, 2024 1

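For now, you can work around this by setting --max-partition-size to a very large value at run time, so that the dataset is not partitioned. That is only a temporary workaround, though, and I hope the developers can look into a proper fix.

--max-partition-size does solve it for a single dataset, but when evaluating multiple datasets the run is still divided into multiple tasks, and the same problem appears.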

Zbaoli commented on June 12, 2024 1

In opencompass/models/vllm.py:

import ray
if ray.is_initialized():
    self.logger.info('shutdown ray instance to avoid "Calling ray.init() again" error.')
    ray.shutdown()

Add the code above just before the vllm LLM class is instantiated, at around line 52.
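
A minimal sketch of the same guard as a standalone helper, in case it is easier to reuse or test that way. The function name `shutdown_ray_if_running` and the idea of passing the module in as a parameter are my own assumptions, not part of opencompass; the only Ray calls used are `ray.is_initialized()` and `ray.shutdown()`, the same two as in the snippet above.

```python
import logging


def shutdown_ray_if_running(ray_module, logger=None):
    """Shut down an already-initialized Ray instance, if there is one.

    `ray_module` must expose `is_initialized()` and `shutdown()`, as the
    real `ray` package does. Returns True if a shutdown was performed.
    (Hypothetical helper, not an opencompass or vllm API.)
    """
    logger = logger or logging.getLogger(__name__)
    if ray_module.is_initialized():
        logger.info('Shutting down the existing Ray instance to avoid the '
                    '"Calling ray.init() again" error.')
        ray_module.shutdown()
        return True
    return False


# Hypothetical usage, placed right before the vllm engine is created:
#
#     import ray
#     from vllm import LLM
#
#     shutdown_ray_if_running(ray)
#     llm = LLM(model=path, **model_kwargs)  # `path`, `model_kwargs` are placeholders
```

Passing the module in rather than importing it inside the helper keeps the function importable on machines without Ray, and lets it be exercised with a stub object.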

IcyFeather233 commented on June 12, 2024

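For now, you can work around this by setting --max-partition-size to a very large value at run time, so that the dataset is not partitioned. That is only a temporary workaround, though, and I hope the developers can look into a proper fix.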

Zbaoli commented on June 12, 2024

Same problem here. I get the "Calling ray.init() again after it has already been called." error.

IcyFeather233 commented on June 12, 2024

In opencompass/models/vllm.py:

import ray
if ray.is_initialized():
    self.logger.info('shutdown ray instance to avoid "Calling ray.init() again" error.')
    ray.shutdown()

Add the code above just before the vllm LLM class is instantiated, at around line 52.

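After applying this fix, I found that with a single model and multiple datasets, the model seems to restart Ray for every new dataset. That is, each dataset it processes prints the following: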

2024-04-12 01:59:04,123 INFO worker.py:1743 -- Started a local Ray instance. View the dashboard at 127.0.0.1:8266 
INFO 04-12 01:59:44 llm_engine.py:75] Initializing an LLM engine (v0.4.0) with config: model='xxx', tokenizer='xxx)

(RayWorkerVllm pid=108846) INFO 04-12 02:01:37 selector.py:16] Using FlashAttention backend.

However, this startup is very time-consuming. Is there a way to start Ray only once and run through all the datasets in one go?
