Article Welcome Gemma 4: Frontier multimodal intelligence on device • merve, pcuenq, sergiopaniego, burtenshaw, Steveeeeeeen, alvarobartt, SaylorTwift • Apr 2 • 897
Article Strand-Rust-Coder-v1: Rust Coding Model Fine-Tuned on Peer-Ranked Synthetic Data • Fortytwo-Network • Dec 11, 2025 • 8
Article Custom Kernels for All from Codex and Claude • burtenshaw, sayakpaul, ariG23498, evalstate • Feb 13 • 78
ResearchRubrics: A Benchmark of Prompts and Rubrics For Evaluating Deep Research Agents Paper • 2511.07685 • Published Nov 10, 2025 • 10
Rubric-Based Benchmarking and Reinforcement Learning for Advancing LLM Instruction Following Paper • 2511.10507 • Published Nov 13, 2025 • 10
Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published Nov 7, 2025 • 57
Granite 4.0 Language Models Collection Efficient language models for multilingual generation, coding, RAG, and AI assistant workflows. • 11 items • Updated 22 days ago • 221
ParallelMuse: Agentic Parallel Thinking for Deep Information Seeking Paper • 2510.24698 • Published Oct 28, 2025 • 21
C2S-Scale-Gemma-Models Collection C2S-Scale Gemma models trained using the Cell2Sentence framework, described in the C2S-Scale paper. • 2 items • Updated Oct 13, 2025 • 13
Article Gaia2 and ARE: Empowering the community to study agents • clefourrier, gregmialz, mlcu, mortimerp9, XciD, tfrere, evijit, RomainFroger, dheeraj7596, CarolinePascal, upiter • Sep 22, 2025 • 134