BigCodeBench: No instructions for evaluating pass@k where k is greater than 1

#12

by JohnLins - opened Sep 14, 2024

I generated a file called model--bigcodebench-complete--vllm-0.8-2.jsonl with n samples set to 2, so each task has two solutions.
It looks something like this:
{"solution": "...", "task_id": "BigCodeBench/0"}
{"solution": "...", "task_id": "BigCodeBench/0"}
{"solution": "...", "task_id": "BigCodeBench/1"}
{"solution": "...", "task_id": "BigCodeBench/1"}
...

But when I run

bigcodebench.evaluate --split complete --subset full --samples model--bigcodebench-complete--vllm-0.8-2.jsonl

it reports a pass@1 score, not pass@2.
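For reference, pass@k is commonly estimated per task with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n is the number of samples for the task and c is the number that pass, then averaged over tasks. Below is a minimal sketch of that calculation, plus a check that the samples file really has n solutions per task. It is not BigCodeBench's own implementation: the function names and the `results` dict of per-sample pass/fail flags are hypothetical, and how those flags are obtained from the evaluator is left open.

```python
import json
from collections import Counter
from math import comb


def samples_per_task(jsonl_lines):
    """Count how many solutions each task_id has in a samples file."""
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if line:
            counts[json.loads(line)["task_id"]] += 1
    return counts


def pass_at_k(n, c, k):
    """Unbiased pass@k for one task: 1 - C(n-c, k) / C(n, k)."""
    if k > n:
        raise ValueError("k must not exceed the number of samples n")
    if n - c < k:
        # Fewer than k failing samples: every size-k subset contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results, k):
    """Average pass@k over tasks.

    `results` is a hypothetical mapping {task_id: [bool, ...]} holding one
    pass/fail flag per generated sample.
    """
    scores = [pass_at_k(len(flags), sum(flags), k) for flags in results.values()]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    lines = [
        '{"solution": "...", "task_id": "BigCodeBench/0"}',
        '{"solution": "...", "task_id": "BigCodeBench/0"}',
        '{"solution": "...", "task_id": "BigCodeBench/1"}',
        '{"solution": "...", "task_id": "BigCodeBench/1"}',
    ]
    print(samples_per_task(lines))

    # Made-up pass/fail flags, for illustration only.
    results = {
        "BigCodeBench/0": [True, False],
        "BigCodeBench/1": [False, False],
    }
    print(mean_pass_at_k(results, k=1), mean_pass_at_k(results, k=2))
```

With n = 2 samples per task, k can be at most 2. pass@2 for a task is then 1.0 if either sample passes and 0.0 otherwise.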

JohnLins changed discussion status to closed Sep 14, 2024
