Running • 110 • Unlocking On-Policy Distillation for Any Model Family • Visualize on-policy distillation token alignment
Running • Agents • 7 • Dataset Length Profiler • Estimate optimal max_length for SFT training with token analysis
Running • 3.88k • The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
Running • Agents • 88 • Large Reasoning Models Leaderboard • A leaderboard to rank large reasoning models
Running • 600 • Scaling test-time compute • Boost LLM answers with flexible test-time search strategies
Running • Agents • 431 • Reward Bench Leaderboard • Explore and compare model scores on RewardBench benchmarks
HuggingFaceH4/zephyr-7b-alpha • Text Generation • 7B • Updated Oct 16, 2024 • 4.47k • 1.12k
Running on CPU Upgrade • 14k • Open LLM Leaderboard • Track, rank and evaluate open LLMs and chatbots