Nudging Beyond the Comfort Zone: Efficient Strategy-Guided Exploration for RLVR • Paper • 2605.15726 • Published 10 days ago • 33
Video2GUI: Synthesizing Large-Scale Interaction Trajectories for Generalized GUI Agent Pretraining • Paper • 2605.14747 • Published 11 days ago • 142
jackf857/qwen3-8b-base-new-dpo-hh-helpful-4xh200-batch-64-q_t-0.45-s_star-0.4-eta-1 • Text Generation • 8B • Updated 24 days ago • 17 • 1
Adam's Law: Textual Frequency Law on Large Language Models • Paper • 2604.02176 • Published Apr 2 • 503
VideoZeroBench: Probing the Limits of Video MLLMs with Spatio-Temporal Evidence Verification • Paper • 2604.01569 • Published Apr 2 • 13
GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning • Paper • 2604.02721 • Published Apr 3 • 629
All Roads Lead to Rome: Incentivizing Divergent Thinking in Vision-Language Models • Paper • 2604.00479 • Published Apr 1 • 69
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence • Paper • 2603.28032 • Published Mar 30 • 342