Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 13 days ago • 90
Sandboxed Coding Agents are Competitive Omni-modal Task Solvers Paper • 2606.00579 • Published 10 days ago • 1
AdaPlanBench: Evaluating Adaptive Planning in Large Language Model Agents under World and User Constraints Paper • 2606.05622 • Published 5 days ago • 40
You Only Need Minimal RLVR Training: Extrapolating LLMs via Rank-1 Trajectories Paper • 2605.21468 • Published 20 days ago • 50
Rethinking Cross-Layer Information Routing in Diffusion Transformers Paper • 2605.20708 • Published 20 days ago • 109
AutoRubric-T2I: Robust Rule-Based Reward Model for Text-to-Image Alignment Paper • 2605.17602 • Published 20 days ago • 19