Distilling Vision-Language Models on Millions of Videos • Paper 2401.06129 • Published Jan 11, 2024