WANG HAIBO

WHB139426

AI & ML interests

None yet

Recent Activity

authored a paper about 10 hours ago

Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding

authored a paper about 10 hours ago

Learning Geometric Representations from Videos for Spatial Intelligent Multimodal Large Language Models

updated a model 2 days ago

WHB139426/GeoVR

Organizations

None yet

Papers 6

arxiv:2606.05833

arxiv:2604.00528

arxiv:2505.05467

arxiv:2410.03290

Models 4

WHB139426/GeoVR

Video-Text-to-Text • Updated 2 days ago • 11

WHB139426/QAprompts

Updated Nov 20, 2024

WHB139426/Grounded-Video-LLM

Updated Nov 11, 2024 • 5

WHB139426/GCG

Updated Jul 16, 2024

Datasets 3

WHB139426/Scannet

Viewer • Updated Apr 5 • 1k • 4.14k

WHB139426/Grounded-VideoLLM

Updated Apr 10, 2025 • 2.28k • 12

WHB139426/cc3m

Preview • Updated Jan 16, 2025 • 11