facebook/ijepa_vith14_1k
Image Feature Extraction • 0.6B • Updated • 3.08k • 15
Inference-time Physics Alignment of Video Generative Models with Latent World Models
VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice