Comments (4)
The results reported in the paper are only on the TACO-test split.
from taco.
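Sorry, I'm not sure whether I've misunderstood something. The paper uses pass@k. My understanding is that pass@1 means generating code once per problem, then dividing the number of problems passed by the total number of problems. If the evaluation was run only on the test split, which has 1000 problems, then taking the pass@1 of codellama-7b-python as an example, which is 9.32%, the total number passed would be 93.2? Why is there a fractional value? I'm not sure where my understanding went wrong.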
We compute pass@k according to the formula given in the Codex paper.
I think it will help you :)
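For readers landing here with the same question, the estimator from the Codex paper can be sketched as below. For each problem, n samples are generated, c of them pass, and pass@k is estimated as 1 - C(n-c, k) / C(n, k), then averaged over problems. This is a minimal sketch and not the repository's evaluation code; the function names `pass_at_k` and `mean_pass_at_k` and the sample counts in the example are assumptions made for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased per-problem pass@k estimate (Codex paper formula).

    n: samples generated for the problem
    c: samples that passed all tests
    k: the k in pass@k (requires k <= n)
    """
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average the per-problem estimates over (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical data: 4 problems, 10 samples each, with 3, 0, 10 and 1 passes.
example = [(10, 3), (10, 0), (10, 10), (10, 1)]
score = mean_pass_at_k(example, k=1)
```

With k = 1 the per-problem estimate reduces to c / n, so each problem contributes a fraction rather than 0 or 1. That is why a pass@1 of 9.32% over 1000 problems does not have to correspond to a whole number of solved problems.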
I see! I had misunderstood the formula. Thank you!