GRASP: Learning to Ground Social Reasoning in Multi-Person Non-Verbal Interactions Paper • 2605.15764 • Published 11 days ago • 3
Learning to Foresee: Unveiling the Unlocking Efficiency of On-Policy Distillation Paper • 2605.11739 • Published 13 days ago • 58
End-to-End Autoregressive Image Generation with 1D Semantic Tokenizer Paper • 2605.00503 • Published 25 days ago • 11
DiPO: Disentangled Perplexity Policy Optimization for Fine-grained Exploration-Exploitation Trade-Off Paper • 2604.13902 • Published Apr 15 • 62
DARE: Diffusion Large Language Models Alignment and Reinforcement Executor Paper • 2604.04215 • Published Apr 5 • 21