SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published 8 days ago • 181
HyperEyes: Dual-Grained Efficiency-Aware Reinforcement Learning for Parallel Multimodal Search Agents Paper • 2605.07177 • Published 12 days ago • 62
From Web to Pixels: Bringing Agentic Search into Visual Perception Paper • 2605.12497 • Published 8 days ago • 14
AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery Paper • 2604.25256 • Published 22 days ago • 29
DR^{3}-Eval: Towards Realistic and Reproducible Deep Research Evaluation Paper • 2604.14683 • Published Apr 16 • 36
OpenSpatial: A Principled Data Engine for Empowering Spatial Intelligence Paper • 2604.07296 • Published Apr 8 • 40
MolmoWeb: Open Visual Web Agent and Open Data for the Open Web Paper • 2604.08516 • Published Apr 9 • 43
Act Wisely: Cultivating Meta-Cognitive Tool Use in Agentic Multimodal Models Paper • 2604.08545 • Published Apr 9 • 41
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering Paper • 2604.08224 • Published Apr 9 • 51
KnowU-Bench: Towards Interactive, Proactive, and Personalized Mobile Agent Evaluation Paper • 2604.08455 • Published Apr 9 • 47
OpenVLThinkerV2: A Generalist Multimodal Reasoning Model for Multi-domain Visual Tasks Paper • 2604.08539 • Published Apr 9 • 49
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning Paper • 2604.04746 • Published Apr 8 • 72
FIPO: Eliciting Deep Reasoning with Future-KL Influenced Policy Optimization Paper • 2603.19835 • Published Mar 20 • 351