VideoGrain: Modulating Space-Time Attention for Multi-grained Video Editing Paper • 2502.17258 • Published 4 days ago • 58
view article Article PaliGemma 2 Mix - New Instruction Vision Language Models by Google 10 days ago • 60
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published 8 days ago • 118
Temporal Preference Optimization Collection Temporal Preference Optimization for Long-form Video Understanding • 3 items • Updated Jan 19 • 4
Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models Paper • 2402.14207 • Published Feb 22, 2024 • 8
view article Article Introducing the Synthetic Data Generator - Build Datasets with Natural Language Dec 16, 2024 • 109
view article Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 64
view article Article Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo Dec 23, 2024 • 39