Extract video frames and upload them to Hugging Face
Start processing movies into zip archives of frames
Sync datasets efficiently
Generate captions for images using Florence-2
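
As an illustration of the frame-zip step named above, here is a minimal sketch that packs already-extracted frame images into fixed-size zip archives using only the Python standard library. The function name `pack_frames`, the `frames_per_zip` parameter, the `*.jpg` pattern, and the `frames_00000.zip` naming scheme are all hypothetical choices for this example, not taken from the original scripts. It assumes frames have already been written to a single directory.

```python
import zipfile
from pathlib import Path


def pack_frames(frame_dir, out_dir, frames_per_zip=1000, pattern="*.jpg"):
    """Pack image files from frame_dir into numbered zip archives in out_dir.

    Hypothetical helper: names, chunk size, and file pattern are assumptions.
    """
    frame_dir, out_dir = Path(frame_dir), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Sort so that archive contents follow the frame file names.
    frames = sorted(frame_dir.glob(pattern))

    archives = []
    for start in range(0, len(frames), frames_per_zip):
        zip_path = out_dir / f"frames_{start // frames_per_zip:05d}.zip"
        # JPEG data is already compressed, so store entries without recompression.
        with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
            for frame in frames[start:start + frames_per_zip]:
                zf.write(frame, arcname=frame.name)
        archives.append(zip_path)
    return archives
```

Grouping many small frame files into a few archives keeps the file count low, which tends to make later upload and sync steps cheaper than transferring each frame individually.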