Pierre Erbacher

erbacher

AI & ML interests

None yet

Recent Activity

published an article about 1 month ago

Distribution Matching Prevents Mode Collapse in Training Reasoning Models

updated a model 6 months ago

erbacher/Qwen2.5-Kimina-1.7B-SFT

updated a dataset 6 months ago

erbacher/trl-NuminaMath-LEAN

Organizations

published an article about 1 month ago

Article

Distribution Matching Prevents Mode Collapse in Training Reasoning Models

Mar 17

updated a model 6 months ago

erbacher/Qwen2.5-Kimina-1.7B-SFT

Text Generation • 2B • Updated Nov 10, 2025 • 6

updated a dataset 6 months ago

erbacher/trl-NuminaMath-LEAN

Viewer • Updated Nov 7, 2025 • 9.48k • 8

published a model 6 months ago

erbacher/Qwen2.5-Kimina-1.7B-SFT

Text Generation • 2B • Updated Nov 10, 2025 • 6

liked a Space 6 months ago

The Smol Training Playbook

📚

3.13k

The secrets to building world-class LLMs

published a dataset 7 months ago

erbacher/LeanRank-test

Viewer • Updated Sep 29, 2025 • 6.57k • 10

updated 3 datasets 7 months ago

published 2 datasets 8 months ago

erbacher/LeanRank-corpus

Viewer • Updated Sep 29, 2025 • 249k • 4

erbacher/LeanRank-data

Viewer • Updated Sep 29, 2025 • 2.09M • 23

published a dataset 9 months ago

erbacher/trl-NuminaMath-LEAN

Viewer • Updated Nov 7, 2025 • 9.48k • 8

upvoted a paper 10 months ago

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30, 2025 • 51

updated a dataset 11 months ago

erbacher/MATH_TTT

Viewer • Updated Jun 14, 2025 • 12k • 9

updated a model about 1 year ago

erbacher/wiki_categories

Updated Mar 7, 2025

published a model about 1 year ago

erbacher/wiki_categories

Updated Mar 7, 2025

updated a dataset about 1 year ago

erbacher/open-math-instruct-steps

Updated Mar 4, 2025 • 8

published a dataset about 1 year ago

erbacher/open-math-instruct-steps

Updated Mar 4, 2025 • 8

liked a Space about 1 year ago

The Ultra-Scale Playbook

🌌

3.82k

The ultimate guide to training LLM on large GPU Clusters

updated a model over 1 year ago

erbacher/Llama-3.2-Tulu-3-1B-SFT

1B • Updated Jan 2, 2025 • 1
