arxiv:2604.04771
Bin Wang
wanderkid
AI & ML interests
Computer Vision, Multimodal Large Language Model
Recent Activity
authored a paper about 13 hours ago
MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale
liked a model 6 days ago
opendatalab/MinerU2.5-Pro-2604-1.2B
authored a paper 9 days ago
TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition