DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards Paper • 2605.21467 • Published 5 days ago • 197
OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization Paper • 2605.17757 • Published 7 days ago • 62
StableVLA: Towards Robust Vision-Language-Action Models without Extra Data Paper • 2605.18287 • Published 7 days ago • 15
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published 13 days ago • 190
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 12 days ago • 264
WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation Paper • 2605.10912 • Published 14 days ago • 45
Skill1: Unified Evolution of Skill-Augmented Agents via Reinforcement Learning Paper • 2605.06130 • Published 18 days ago • 110
From Context to Skills: Can Language Models Learn from Context Skillfully? Paper • 2604.27660 • Published 22 days ago • 162
WorldMark: A Unified Benchmark Suite for Interactive Video World Models Paper • 2604.21686 • Published Apr 23 • 36
LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model Paper • 2604.20796 • Published Apr 22 • 242