Olmo 3 Pre-training Collection All artifacts related to Olmo 3 pre-training • 10 items • Updated Dec 23, 2025 • 36
Nemotron v3 Pre-Training Collection Large scale pre-training datasets used in the Nemotron family of models. • 11 items • Updated 9 days ago • 17
SmolLM3 pretraining datasets Collection datasets used in SmolLM3 pretraining • 15 items • Updated Aug 12, 2025 • 51
Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset Paper • 2508.15096 • Published Aug 20, 2025 • 9
Article SmolLM3: smol, multilingual, long-context reasoner +21 eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 777
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 12 items • Updated 9 days ago • 149
🧠 SmolLM3 Collection Smol, multilingual, long-context reasoner • 14 items • Updated Oct 9, 2025 • 104
Training Time Calculator 🚀 Calculates the amount of time it will take to train a model • Running • Agents • 13