Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images Paper • 2308.16582 • Published Aug 31, 2023 • 10
DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation Paper • 2310.13119 • Published Oct 19, 2023 • 11
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior Paper • 2310.16818 • Published Oct 25, 2023 • 30
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction Paper • 2310.20700 • Published Oct 31, 2023 • 9
Controlling Text-to-Image Diffusion by Orthogonal Finetuning Paper • 2306.07280 • Published Jun 12, 2023 • 20
DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model Paper • 2311.09217 • Published Nov 15, 2023 • 21
One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion Paper • 2311.07885 • Published Nov 14, 2023 • 39
Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning Paper • 2311.10709 • Published Nov 17, 2023 • 24
LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching Paper • 2311.11284 • Published Nov 19, 2023 • 16
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models Paper • 2311.12092 • Published Nov 20, 2023 • 20
Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models Paper • 2311.13141 • Published Nov 22, 2023 • 12
ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs Paper • 2311.13600 • Published Nov 22, 2023 • 41
LEDITS++: Limitless Image Editing using Text-to-Image Models Paper • 2311.16711 • Published Nov 28, 2023 • 20
GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs Paper • 2312.00093 • Published Nov 30, 2023 • 14
CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation Paper • 2311.18775 • Published Nov 30, 2023 • 6
VideoBooth: Diffusion-based Video Generation with Image Prompts Paper • 2312.00777 • Published Dec 1, 2023 • 20
HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models Paper • 2312.00079 • Published Nov 30, 2023 • 14
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training Paper • 2312.01663 • Published Dec 4, 2023 • 3
HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting Paper • 2312.03461 • Published Dec 6, 2023 • 15
Cache Me if You Can: Accelerating Diffusion Models through Block Caching Paper • 2312.03209 • Published Dec 6, 2023 • 17
TokenCompose: Grounding Diffusion with Token-level Supervision Paper • 2312.03626 • Published Dec 6, 2023 • 5
HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces Paper • 2312.03160 • Published Dec 5, 2023 • 5
MotionCtrl: A Unified and Flexible Motion Controller for Video Generation Paper • 2312.03641 • Published Dec 6, 2023 • 20
LooseControl: Lifting ControlNet for Generalized Depth Conditioning Paper • 2312.03079 • Published Dec 5, 2023 • 12
Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors Paper • 2312.04963 • Published Dec 7, 2023 • 16
3D-LLM: Injecting the 3D World into Large Language Models Paper • 2307.12981 • Published Jul 24, 2023 • 35
DreaMoving: A Human Dance Video Generation Framework based on Diffusion Models Paper • 2312.05107 • Published Dec 8, 2023 • 38
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution Paper • 2312.06640 • Published Dec 11, 2023 • 45
NeRFiller: Completing Scenes via Generative 3D Inpainting Paper • 2312.04560 • Published Dec 7, 2023 • 11
Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation Paper • 2312.07231 • Published Dec 12, 2023 • 6
Clockwork Diffusion: Efficient Generation With Model-Step Distillation Paper • 2312.08128 • Published Dec 13, 2023 • 12
CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor Paper • 2312.07661 • Published Dec 12, 2023 • 16
UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation Paper • 2312.08754 • Published Dec 14, 2023 • 6
Holodeck: Language Guided Generation of 3D Embodied AI Environments Paper • 2312.09067 • Published Dec 14, 2023 • 13
FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection Paper • 2312.09252 • Published Dec 14, 2023 • 9
SHAP-EDITOR: Instruction-guided Latent 3D Editing in Seconds Paper • 2312.09246 • Published Dec 14, 2023 • 5
SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance Paper • 2312.08889 • Published Dec 13, 2023 • 11
Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models Paper • 2312.09608 • Published Dec 15, 2023 • 13
Stable Score Distillation for High-Quality 3D Generation Paper • 2312.09305 • Published Dec 14, 2023 • 7
Dream2Real: Zero-Shot 3D Object Rearrangement with Vision-Language Models Paper • 2312.04533 • Published Dec 7, 2023 • 1
Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method Paper • 2312.12030 • Published Dec 19, 2023 • 4
TIP: Text-Driven Image Processing with Semantic and Restoration Instructions Paper • 2312.11595 • Published Dec 18, 2023 • 5
MixRT: Mixed Neural Representations For Real-Time NeRF Rendering Paper • 2312.11841 • Published Dec 19, 2023 • 10
Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior Paper • 2312.11535 • Published Dec 15, 2023 • 5
GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning Paper • 2312.11461 • Published Dec 18, 2023 • 18
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting Paper • 2312.13271 • Published Dec 20, 2023 • 4
SpecNeRF: Gaussian Directional Encoding for Specular Reflections Paper • 2312.13102 • Published Dec 20, 2023 • 5
InstructVideo: Instructing Video Diffusion Models with Human Feedback Paper • 2312.12490 • Published Dec 19, 2023 • 17
StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation Paper • 2312.12491 • Published Dec 19, 2023 • 69
Splatter Image: Ultra-Fast Single-View 3D Reconstruction Paper • 2312.13150 • Published Dec 20, 2023 • 14
UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections Paper • 2312.13285 • Published Dec 20, 2023 • 5
LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset Paper • 2312.12418 • Published Dec 19, 2023 • 2
Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation Paper • 2312.13469 • Published Dec 20, 2023 • 10
Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models Paper • 2312.13913 • Published Dec 21, 2023 • 22
HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models Paper • 2312.14091 • Published Dec 21, 2023 • 15
HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs Paper • 2312.14140 • Published Dec 21, 2023 • 6
Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis Paper • 2312.13834 • Published Dec 20, 2023 • 26
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning Paper • 2312.13980 • Published Dec 21, 2023 • 13
ControlRoom3D: Room Generation using Semantic Proxy Rooms Paper • 2312.05208 • Published Dec 8, 2023 • 8
DyBluRF: Dynamic Deblurring Neural Radiance Fields for Blurry Monocular Video Paper • 2312.13528 • Published Dec 21, 2023 • 6
DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis Paper • 2312.13016 • Published Dec 20, 2023 • 6
ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors Paper • 2312.13324 • Published Dec 20, 2023 • 9
MACS: Mass Conditioned 3D Hand and Object Motion Synthesis Paper • 2312.14929 • Published Dec 22, 2023 • 4
City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web Paper • 2312.16457 • Published Dec 27, 2023 • 13
Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis Paper • 2312.16812 • Published Dec 28, 2023 • 9
DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors Paper • 2312.16837 • Published Dec 28, 2023 • 5
PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion Paper • 2312.16486 • Published Dec 27, 2023 • 6
I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models Paper • 2312.16693 • Published Dec 27, 2023 • 13
Prompt Expansion for Adaptive Text-to-Image Generation Paper • 2312.16720 • Published Dec 27, 2023 • 5
En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data Paper • 2401.01173 • Published Jan 2, 2024 • 11
SIGNeRF: Scene Integrated Generation for Neural Radiance Fields Paper • 2401.01647 • Published Jan 3, 2024 • 12
Instruct-Imagen: Image Generation with Multi-modal Instruction Paper • 2401.01952 • Published Jan 3, 2024 • 30
CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection Paper • 2310.02960 • Published Oct 4, 2023 • 1
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation Paper • 2401.04092 • Published Jan 8, 2024 • 20
PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models Paper • 2401.05252 • Published Jan 10, 2024 • 45
InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes Paper • 2401.05335 • Published Jan 10, 2024 • 26
PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding Paper • 2312.04461 • Published Dec 7, 2023 • 56
HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation Paper • 2401.07727 • Published Jan 15, 2024 • 8
Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation Paper • 2401.08559 • Published Jan 16, 2024 • 8
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects Paper • 2401.09962 • Published Jan 18, 2024 • 7
SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers Paper • 2401.08740 • Published Jan 16, 2024 • 11
SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities Paper • 2401.12168 • Published Jan 22, 2024 • 24
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs Paper • 2401.11708 • Published Jan 22, 2024 • 28
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data Paper • 2401.10891 • Published Jan 19, 2024 • 58
Fast Registration of Photorealistic Avatars for VR Facial Animation Paper • 2401.11002 • Published Jan 19, 2024 • 1
Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation Paper • 2401.14257 • Published Jan 25, 2024 • 9
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation Paper • 2402.08682 • Published Feb 13, 2024 • 12
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation Paper • 2402.10210 • Published Feb 15, 2024 • 29
CityDreamer: Compositional Generative Model of Unbounded 3D Cities Paper • 2309.00610 • Published Sep 1, 2023 • 18
Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation Paper • 2403.19319 • Published Mar 28, 2024 • 10
FruitNeRF: A Unified Neural Radiance Field based Fruit Counting Framework Paper • 2408.06190 • Published Aug 12, 2024 • 17
xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations Paper • 2408.12590 • Published Aug 22, 2024 • 33
Towards Realistic Example-based Modeling via 3D Gaussian Stitching Paper • 2408.15708 • Published Aug 28, 2024 • 7
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos Paper • 2409.02095 • Published Sep 3, 2024 • 32
Open-MAGVIT2: An Open-Source Project Toward Democratizing Auto-regressive Visual Generation Paper • 2409.04410 • Published Sep 6, 2024 • 23
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models Paper • 2409.07452 • Published Sep 11, 2024 • 18
Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think Paper • 2409.11355 • Published Sep 17, 2024 • 24