Visual Representation Learning with Stochastic Frame Prediction Paper • 2406.07398 • Published Jun 11, 2024 • 1
Efficient Long Video Tokenization via Coordinated-based Patch Reconstruction Paper • 2411.14762 • Published Nov 22, 2024 • 11
Efficient Long Video Tokenization via Coordinated-based Patch Reconstruction Paper • 2411.14762 • Published Nov 22, 2024 • 11
Meta-Transformer: A Unified Framework for Multimodal Learning Paper • 2307.10802 • Published Jul 20, 2023 • 43
Collaborative Score Distillation for Consistent Visual Synthesis Paper • 2307.04787 • Published Jul 4, 2023 • 28