BigCodeBench: No instructions for evaluating pass@k where k is greater than 1

#12
by JohnLins - opened

I generated a file called model--bigcodebench-complete--vllm-0.8-2.jsonl with the number of samples (n) set to 2, so each task has two solutions.
It looks something like this:
{"solution": "...", "task_id": "BigCodeBench/0"}
{"solution": "...", "task_id": "BigCodeBench/0"}
{"solution": "...", "task_id": "BigCodeBench/1"}
{"solution": "...", "task_id": "BigCodeBench/1"}
...

But when I run

bigcodebench.evaluate --split complete --subset full --samples model--bigcodebench-complete--vllm-0.8-2.jsonl

it gives me a pass@1 score, not pass@2.
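
For reference, here is what I would expect a pass@2 number to mean. This is a minimal sketch of the standard unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples per task and c is how many of them pass. The function names and the `results` dictionary (task_id mapped to a list of pass/fail flags) are my own assumptions for illustration. They are not part of the bigcodebench package or its output format.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one task: 1 - C(n-c, k) / C(n, k).

    n: samples generated for the task, c: samples that passed, k: budget.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # math.comb returns 0 when n - c < k, so the estimate becomes 1.0
    # whenever every size-k subset must contain at least one passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: dict[str, list[bool]], k: int) -> float:
    """Average the per-task estimates over all tasks.

    `results` is a hypothetical structure: task_id -> per-sample pass flags.
    """
    scores = [pass_at_k(len(flags), sum(flags), k) for flags in results.values()]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up outcomes for two tasks with n = 2 samples each.
    example = {
        "BigCodeBench/0": [True, False],
        "BigCodeBench/1": [False, False],
    }
    print("pass@1:", mean_pass_at_k(example, 1))
    print("pass@2:", mean_pass_at_k(example, 2))
```

With n = 2 and k = 2 the estimator reduces to "at least one of the two solutions passes", which is the number I was hoping the evaluator would report.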

JohnLins changed discussion status to closed
