The Ultra-Scale Playbook 🌌 • Running • 3.86k • The ultimate guide to training LLM on large GPU Clusters
principled-intelligence/gemma-4-E2B-it-text-only Feature Extraction • 5B • Updated Apr 3 • 2.59k • 6
Qwen2.5 Collection Qwen2.5 language models, pretrained and instruction-tuned, in 7 sizes: 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 720
meta-llama/Meta-Llama-3-8B-Instruct Text Generation • 8B • Updated Jun 18, 2025 • 1.67M • 4.53k
mistralai/Mistral-7B-Instruct-v0.2 Text Generation • 7B • Updated Jul 24, 2025 • 3.25M • 3.15k
TheBloke/Mixtral-8x7B-Instruct-v0.1-GPTQ Text Generation • 47B • Updated Dec 14, 2023 • 5.61k • 141