WorldAct: Activating Monolithic 3D Worlds into Interactive-Ready Object-Centric Scenes • Paper • 2605.15843 • Published 7 days ago • 5
ReactiveGWM: Steering NPC in Reactive Game World Models • Paper • 2605.15256 • Published 8 days ago • 26
DexJoCo: A Benchmark and Toolkit for Task-Oriented Dexterous Manipulation on MuJoCo • Paper • 2605.16257 • Published 7 days ago • 48
Sparse Autoencoders enable Robust and Interpretable Fine-tuning of CLIP models • Paper • 2605.15961 • Published 7 days ago • 7
World Model for Robot Learning: A Comprehensive Survey • Paper • 2605.00080 • Published 22 days ago • 16
Map2World: Segment Map Conditioned Text to 3D World Generation • Paper • 2605.00781 • Published 21 days ago • 25
When Do Diffusion Models learn to Generate Multiple Objects? • Paper • 2605.00273 • Published 22 days ago • 9
Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models • Paper • 2602.24264 • Published Feb 27 • 14
Enhancing Multi-Image Understanding through Delimiter Token Scaling • Paper • 2602.01984 • Published Feb 2 • 5
DISCO: Diversifying Sample Condensation for Efficient Model Evaluation • Paper • 2510.07959 • Published Oct 9, 2025 • 15
Does Data Scaling Lead to Visual Compositional Generalization? • Paper • 2507.07102 • Published Jul 9, 2025 • 2
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion • Paper • 2507.06165 • Published Jul 8, 2025 • 60
High-Resolution Visual Reasoning via Multi-Turn Grounding-Based Reinforcement Learning • Paper • 2507.05920 • Published Jul 8, 2025 • 12
Is Diversity All You Need for Scalable Robotic Manipulation? • Paper • 2507.06219 • Published Jul 8, 2025 • 21