BigCodeBench: No instructions for evaluating pass@k where k is greater than 1
#12
by JohnLins - opened
I generated a file called model--bigcodebench-complete--vllm-0.8-2.jsonl with 2 samples per task, so each task has two solutions.
Something like this:
{"solution": "...", "task_id": "BigCodeBench/0"}
{"solution": "...", "task_id": "BigCodeBench/0"}
{"solution": "...", "task_id": "BigCodeBench/1"}
{"solution": "...", "task_id": "BigCodeBench/1"}
...
But when I run bigcodebench.evaluate --split complete --subset full --samples model--bigcodebench-complete--vllm-0.8-2.jsonl
it only reports a pass@1 score, not pass@2.
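For reference, here is a minimal sketch of how pass@k could be computed by hand once per-sample pass/fail results are available. It uses the standard unbiased estimator 1 - C(n-c, k) / C(n, k), where n is the number of samples for a task and c is the number that passed. This is not taken from BigCodeBench itself: the function names and the `results` mapping (task_id to a list of booleans) are hypothetical, and how the evaluator stores its per-sample results is an assumption you would need to check.

```python
import json
from collections import Counter
from math import comb


def samples_per_task(jsonl_lines):
    """Count how many solutions each task_id has in a samples file."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if line:  # skip blank lines
            counts[json.loads(line)["task_id"]] += 1
    return counts


def pass_at_k(n, c, k):
    """Unbiased pass@k for one task: 1 - C(n - c, k) / C(n, k)."""
    if k > n:
        raise ValueError("k must not exceed the number of samples n")
    if n - c < k:
        # Fewer than k failing samples: any k-subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k):
    """Average pass@k over tasks.

    `results` is a hypothetical mapping: task_id -> list of bools,
    one bool per generated sample (True if that sample passed).
    """
    scores = [pass_at_k(len(p), sum(p), k) for p in results.values()]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up outcomes for illustration only, not real benchmark results.
    example = {
        "BigCodeBench/0": [True, False],
        "BigCodeBench/1": [False, False],
    }
    print("pass@1:", mean_pass_at_k(example, 1))
    print("pass@2:", mean_pass_at_k(example, 2))
```

With two samples per task, pass@2 for a task is simply 1 if at least one of the two solutions passed and 0 otherwise. `samples_per_task` can be used first to confirm that every task_id really appears twice in the JSONL file.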
JohnLins changed discussion status to closed