Article How we OCR'ed 30,000 papers using Codex, open OCR models and Jobs 21 days ago • 59
TutorBench: A Benchmark To Assess Tutoring Capabilities Of Large Language Models Paper • 2510.02663 • Published Oct 3, 2025 • 2
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published 20 days ago • 187
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI Paper • 2311.16502 • Published Nov 27, 2023 • 40
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled Image-Text-to-Text • 28B • Updated 22 days ago • 470k • 2.8k
MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors Paper • 2502.18940 • Published Feb 26, 2025 • 3
From Problem-Solving to Teaching Problem-Solving: Aligning LLMs with Pedagogy using Reinforcement Learning Paper • 2505.15607 • Published May 21, 2025 • 4