Yi Ding
Tuwhy
Sherlock
Model series for the paper "Sherlock: Self-Correcting Reasoning in Vision-Language Models"
- Tuwhy/Llama-3.2V-11B-Sherlock-SFT
  Image-Text-to-Text • 11B • Updated • 5
- Tuwhy/Llama-3.2V-11B-Sherlock-Offline
  Image-Text-to-Text • 11B • Updated • 3
- Tuwhy/Llama-3.2V-11B-Sherlock-iter1
  Image-Text-to-Text • 11B • Updated • 3
- Tuwhy/Llama-3.2V-11B-Sherlock-iter2
  Image-Text-to-Text • 11B • Updated • 11 • 2
MIRage
Official model collection for the paper: Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models
Octopus
RL checkpoints of Octopus-8B and baselines for the paper: Learning Self-Correction in Vision–Language Models via Rollout Augmentation
MIS
Official dataset collection for the paper: Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models