- InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
  Paper • 2309.03895 • Published • 14
- ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning
  Paper • 2309.16650 • Published • 10
- CCEdit: Creative and Controllable Video Editing via Diffusion Models
  Paper • 2309.16496 • Published • 9
- FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling
  Paper • 2310.15169 • Published • 10
Collections
Collections trending this week
- Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts
  Paper • 2309.04354 • Published • 14
- Vision Transformers Need Registers
  Paper • 2309.16588 • Published • 78
- AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models
  Paper • 2309.16414 • Published • 19
- MotionLM: Multi-Agent Motion Forecasting as Language Modeling
  Paper • 2309.16534 • Published • 15

- LEAP Hand: Low-Cost, Efficient, and Anthropomorphic Hand for Robot Learning
  Paper • 2309.06440 • Published • 9
- Robotic Table Tennis: A Case Study into a High Speed Learning System
  Paper • 2309.03315 • Published • 7
- Video Language Planning
  Paper • 2310.10625 • Published • 10
- RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
  Paper • 2311.01455 • Published • 29

- Uncovering mesa-optimization algorithms in Transformers
  Paper • 2309.05858 • Published • 12
- ProPainter: Improving Propagation and Transformer for Video Inpainting
  Paper • 2309.03897 • Published • 27
- Approximating Two-Layer Feedforward Networks for Efficient Transformers
  Paper • 2310.10837 • Published • 11
- CLEX: Continuous Length Extrapolation for Large Language Models
  Paper • 2310.16450 • Published • 10

- Natural Language Supervision for General-Purpose Audio Representations
  Paper • 2309.05767 • Published • 9
- AudioSR: Versatile Audio Super-resolution at Scale
  Paper • 2309.07314 • Published • 26
- FoleyGen: Visually-Guided Audio Generation
  Paper • 2309.10537 • Published • 8
- Toward Joint Language Modeling for Speech Units and Text
  Paper • 2310.08715 • Published • 8

- PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models
  Paper • 2309.05793 • Published • 50
- InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation
  Paper • 2309.06380 • Published • 32
- ImageBind-LLM: Multi-modality Instruction Tuning
  Paper • 2309.03905 • Published • 17
- DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models
  Paper • 2309.06933 • Published • 12