Multi-view Consistent 3D Gaussian Head Avatars 'without' Multi-view Generation Paper • 2605.25220 • Published 8 days ago • 4
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 12 days ago • 204
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published 26 days ago • 101
UniDoc-RL: Coarse-to-Fine Visual RAG with Hierarchical Actions and Dense Rewards Paper • 2604.14967 • Published Apr 16 • 15
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published Apr 13 • 102
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published Apr 9 • 291
Faithful GRPO: Improving Visual Spatial Reasoning in Multimodal Language Models via Constrained Policy Optimization Paper • 2604.08476 • Published Apr 9 • 8
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning Paper • 2604.02721 • Published Apr 3 • 630