KletterMix: Climbing Toward High-Quality German Pretraining Data Paper • 2606.03773 • Published 15 days ago • 21
Nemotron-Pre-Training-Datasets Collection Large scale pre-training datasets used in the Nemotron family of models. • 15 items • Updated 6 days ago • 164
Evaluation-Suite Collection Multilingual Evaluation Suite supporting 21 European Languages • 15 items • Updated Jan 8