MACE-Dance: Motion-Appearance Cascaded Experts for Music-Driven Dance Video Generation Paper • 2512.18181 • Published 8 days ago • 84
Visually-Guided Policy Optimization for Multimodal Reasoning Paper • 2604.09349 • Published Apr 10 • 2
SpatialGenEval Collection [ICLR 2026] Everything in Its Place: Benchmarking Spatial Intelligence of Text-to-Image Models • 1 item • Updated 7 days ago
VGPO-RL Collection [ACL 2026] Visually-Guided Policy Optimization for Multimodal Reasoning • 3 items • Updated 7 days ago
HY-Embodied-0.5: Embodied Foundation Models for Real-World Agents Paper • 2604.07430 • Published Apr 8 • 187
LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics Paper • 2604.17295 • Published 26 days ago • 84
Elucidating the SNR-t Bias of Diffusion Probabilistic Models Paper • 2604.16044 • Published 28 days ago • 74
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver Paper • 2604.08377 • Published Apr 9 • 289
Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models Paper • 2603.22212 • Published Mar 23 • 126
Video-CoE: Reinforcing Video Event Prediction via Chain of Events Paper • 2603.14935 • Published Mar 16 • 91
Geometry-Guided Reinforcement Learning for Multi-view Consistent 3D Scene Editing Paper • 2603.03143 • Published Mar 3 • 145