ModalityDance/PhysTool-Bench
We focus on Natural Language Processing and Multimodal Learning, exploring generative AI across different modalities.
Beyond APIs: Probing the Limits of MLLMs in Physical Tool Use
Optical Reasoning: Rethinking Images as an Expressive Reasoning Medium Beyond Text