Comments (6)
@xmshi-trio Hi, please check this issue
from opencompass.
"0": {
"origin_prompt": "<|im_start|>user\n给定病历或者医学影像报告,要求从中抽取临床发现事件的四个属性:主体词、解剖部位、描述词、发生状态。\n主体词:指患者的电子病历中的疾病名称或者由疾病引发的症状,也包括患者的一般情况如饮食,二便,睡眠等。\n描述词:对主体词的发生时序特征、轻重程度、形态颜色等多个维度的刻画,也包括疾病的起病缓急、突发。\n解剖部位:指主体词发生在患者的身体部位,也包括组织,细胞,系统等,也包括部位的方向和数量。\n发生状态:“不确定”或“否定”,肯定的情况不标注发生状态。\n\n\n\n要求输出所有的临床发生事件,每个临床发现事件占一行,临床发现事件中包含主体词、解剖部位、描述词、发生状态四个字段,每个字段用“;”分隔,然后每个字段是“字段名:字段值”的格式,字段值有多个,则用“,”符号分隔。。主体词是字符串。发生状态是字符串,如果没有发生状态,默认为\"不确定\"。如果没有描述词,默认为\"空\"。如果没有解剖部位,默认为\"空\"。\n示例:\n输入病历或者医学影像所见报告:\n精神好,食欲好。\n输出这个句子里的临床发现事件为:\n\n主体词:精神;发生状态:不确定;描述词:好;解剖部位:空。\n主体词:食欲;发生状态:不确定;描述词:好;解剖部位:空。\n\n请从以下病历或者医学影像所见报告中,直接抽取临床发现事件的四个属性:主体词、解剖部位、描述词、发生状态,不要进行额外的解释:患者本次发病以来,食欲正常,神志清醒,精神尚可,睡眠欠佳,大便正常,小便正常,体重无明显变化。\n答:<|im_end|>\n<|im_start|>assistant\n",
"prediction": "以下是根据您提供的病历或医学影像报告中抽取的临床发现事件及其四个属性:\n\n1. **主体词**:食欲\n - **解剖部位**:空\n - **描述词**:正常\n - **发生状态**:不确定\n\n2. **主体词**:神志\n - **解剖部位**:空\n - **描述词**:清醒\n - **发生状态**:不确定\n\n3. **主体词**"
},
这是中间结果
from opencompass.
是否 gt 和 reference 参数放反了?怀疑是 Medbench 的 eval 方法写错了,或者数据有问题?我下载的数据是name_test.json,code base 是 name.json,我的没有 answer,不过我看官网确实也没有 answer。还有不是很明白,我的数据中没有problem_input列
{"question": "请回答以下单选题。要求只输出选项,不输出解释:\n儿童可以使用八子补肾胶囊吗?\n(A)是\n(B)否。\n答:", "options": ["(A)是", "(B)否"], "answer": null, "other": {"source": "DrugCA", "id": 1}}
{"question": "请回答以下单选题。要求只输出选项,不输出解释:\n儿童可以使用八子补肾胶囊吗?\n(A)否\n(B)是。\n答:", "options": ["(A)否", "(B)是"], "answer": null, "other": {"source": "DrugCA", "id": 1}}
medbench_reader_cfg = dict(
input_columns=['problem_input'], output_column='label')
但是代码中说读取这个列
from opencompass.
放出来的测试数据的答案为空吧。
from opencompass.
是否 gt 和 reference 参数放反了?怀疑是 Medbench 的 eval 方法写错了,或者数据有问题?我下载的数据是name_test.json,code base 是 name.json,我的没有 answer,不过我看官网确实也没有 answer。还有不是很明白,我的数据中没有problem_input列
{"question": "请回答以下单选题。要求只输出选项,不输出解释:\n儿童可以使用八子补肾胶囊吗?\n(A)是\n(B)否。\n答:", "options": ["(A)是", "(B)否"], "answer": null, "other": {"source": "DrugCA", "id": 1}} {"question": "请回答以下单选题。要求只输出选项,不输出解释:\n儿童可以使用八子补肾胶囊吗?\n(A)否\n(B)是。\n答:", "options": ["(A)否", "(B)是"], "answer": null, "other": {"source": "DrugCA", "id": 1}}medbench_reader_cfg = dict( input_columns=['problem_input'], output_column='label')但是代码中说读取这个列
您好,测试数据的答案是不对外放的。后续如果我们释放带答案的开发集才能在本地进行评测。如果想进行评测,可以本地进行推理,然后通过https://medbench.opencompass.org.cn/home提交结果进行评测。
from opencompass.
是否 gt 和 reference 参数放反了?怀疑是 Medbench 的 eval 方法写错了,或者数据有问题?我下载的数据是name_test.json,code base 是 name.json,我的没有 answer,不过我看官网确实也没有 answer。还有不是很明白,我的数据中没有problem_input列
{"question": "请回答以下单选题。要求只输出选项,不输出解释:\n儿童可以使用八子补肾胶囊吗?\n(A)是\n(B)否。\n答:", "options": ["(A)是", "(B)否"], "answer": null, "other": {"source": "DrugCA", "id": 1}} {"question": "请回答以下单选题。要求只输出选项,不输出解释:\n儿童可以使用八子补肾胶囊吗?\n(A)否\n(B)是。\n答:", "options": ["(A)否", "(B)是"], "answer": null, "other": {"source": "DrugCA", "id": 1}}medbench_reader_cfg = dict( input_columns=['problem_input'], output_column='label')但是代码中说读取这个列
您好,测试数据的答案是不对外放的。后续如果我们释放带答案的开发集才能在本地进行评测。如果想进行评测,可以本地进行推理,然后通过https://medbench.opencompass.org.cn/home提交结果进行评测。
感谢。Opencompass 真挺好用的。Salute!
from opencompass.
Related Issues (20)
- [Bug] Official website ranking page, unable to view configuration items when hovering over scores with mouse HOT 3
- [Bug] 运行成功后在summary文件夹中的结果都是为空的
- 多卡推理,内存溢出[Bug] HOT 3
- [Feature] Add some examples in the documentation of how to sandbox the humaneval code execution
- [Bug] CMB Dataset HOT 2
- [Feature] Why is the leaderboard called "Multi-modal Modal Leaderboard"? HOT 2
- Is qwen1.5 supported? HOT 2
- [Bug] 无法测评openai接口格式部署的模型 HOT 5
- [Bug] TypeError: Fields of type "<class 'typing.IO'>" are not supported.
- [Bug] FileNotFoundError: Couldn't find a module script at xxx/accuracy/accuracy.py. Module 'accuracy' doesn't exist on the Hugging Face Hub either. HOT 1
- [Bug] 新增API模型报错 KeyError: 'opencompass.models.xxx is not in the opencompass::model registry. HOT 2
- [Bug] mmlu_gen评测日志college_chemistry中只有100道题,而mmlu本地/data目录下(zip解压)数据集college_chemistry则有116道题,少了16道题 HOT 9
- 关于 医疗方面 MedBench, 在连接模型测试时的问题 HOT 4
- [Bug] 评测mbpp数据集时,infer过程报错TypeError: can only concatenate tuple (not "str") to tuple
- [Bug] 'NoneType' object cannot be interpreted as an integer HOT 1
- [Bug] Evaluations on Mistral-7B-v0.1 couldn't be reproduced HOT 1
- [Bug] KeyError: 'opencompass.openicl.icl_evaluator.TEvalEvaluator is not in the opencompass::icl_evaluators registry. HOT 2
- [Bug] KeyError: 'path' when executing python run.py configs/multimodal/tasks.py --mm-eval HOT 1
- 关于大海捞针测试的问题
- [Bug] 🐛 Type Err in Humaneval
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from opencompass.