Falconss1/VideoThinker-R1-3B

Tags: Video-Text-to-Text, image-text-to-text, video-understanding, reinforcement-learning, question-answering, text-generation-inference


Improve model card and link to paper (#1)

by nielsr HF Staff - opened 21 days ago

base: refs/heads/main ← from: refs/pr/1


This PR improves the model card for VideoThinker-R1-3B. Key changes include:

- Linked the model to the corresponding research paper on Hugging Face: "Beyond Perceptual Shortcuts: Causal-Inspired Debiasing Optimization for Generalizable Video Reasoning in Lightweight MLLMs".
- Replaced the full paper abstract with a concise summary of the framework and results to improve readability.
- Maintained and organized metadata tags for better discoverability.
- Provided clear links to the official code repository.

Improve model card and link to paper (14e95849)

Falconss1 changed pull request status to merged 21 days ago
