Coder SFT Data
ise-uiuc/Magicoder-Evol-Instruct-110K • Updated Dec 28, 2023 • 111k • 29.3k • 178
theblackcat102/evol-codealpaca-v1 • Updated Mar 10, 2024 • 111k • 13.6k • 181
Multilingual-Multimodal-NLP/McEval-Instruct • Updated Jun 12, 2024 • 35.9k • 202 • 37
KodCode/KodCode-V1-SFT-4o • Updated Mar 16, 2025 • 410k • 2.15k • 10

Coder DPO
argilla/ultrafeedback-binarized-preferences-cleaned • Updated Dec 11, 2023 • 60.9k • 15.1k • 162
argilla/ultrafeedback-multi-binarized-quality-preferences-cleaned • Updated Dec 11, 2023 • 155k • 119 • 5

Funny Questions (Long-COT)
JackGao/brain-teaser-chinese • Updated Mar 4, 2025 • 1.15k • 18 • 5
Conard/fortune-telling • Updated Feb 17, 2025 • 207 • 505 • 171

Reasoning Model
deepcogito/cogito-v1-preview-qwen-32B • Text Generation • 33B • Updated Apr 8, 2025 • 161k • 116

Pretrain Data Utils
mlfoundations/fasttext-oh-eli5 • Updated Aug 1, 2024 • 30
hkust-nlp/preselect-fasttext-classifier • Text Classification • Updated Mar 6, 2025 • 70 • 8
HuggingFaceFW/fineweb-edu-classifier • Text Classification • 0.1B • Updated Nov 17, 2024 • 73.5k • 216

Coder SFT Data (Long-COT)
nvidia/Llama-Nemotron-Post-Training-Dataset • Updated May 8, 2025 • 3.91M • 4.74k • 668
open-r1/codeforces-cots • Updated Mar 28, 2025 • 254k • 7.81k • 222
nvidia/OpenCodeReasoning • Updated May 4, 2025 • 753k • 10.7k • 540
nvidia/OpenCodeReasoning-2 • Updated May 17, 2025 • 2.16M • 2.46k • 57

Math SFT Data
BytedTsinghua-SIA/DAPO-Math-17k • Updated Apr 18, 2025 • 1.79M • 10.8k • 175
nvidia/OpenMathInstruct-2 • Updated Nov 25, 2024 • 22M • 60.6k • 243
nvidia/OpenMathReasoning • Updated May 27, 2025 • 5.68M • 13.8k • 463
miromind-ai/MiroMind-M1-SFT-719K • Updated Jul 22, 2025 • 719k • 1.56k • 20

WebPage Related
HuggingFaceM4/WebSight • Updated Mar 26, 2024 • 2.75M • 19.9k • 393
bytedance-research/Web-Bench • Updated May 19, 2025 • 1k • 1.06k • 11
luzimu/WebGen-Bench • Updated Sep 29, 2025 • 6.77k • 374 • 3

Coder Models
agentica-org/DeepCoder-14B-Preview • Text Generation • 15B • Updated May 11, 2025 • 476 • 682
Qwen/Qwen2.5-Coder-32B-Instruct • Text Generation • 33B • Updated Jan 12, 2025 • 1.08M • 2.02k