HuggingFaceTB/SmolVLM2-500M-Video-Instruct • Image-Text-to-Text • 0.5B • Updated Apr 8, 2025 • 666k • 153
Phased Consistency Model PCM 🐠 • Runtime error • Agents • Featured • 104 • Generate images from text prompts
TTS Arena V2 🗣 • Running on CPU Upgrade • Featured • 962 • Compare TTS voices and vote for the more human-sounding one
YOLO-World + EfficientSAM 🔥 • Build error • Agents • Featured • 259 • Detect and segment objects in images or videos