Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives Paper • 2501.04003 • Published 5 days ago • 20
FRNet: Frustum-Range Networks for Scalable LiDAR Segmentation Paper • 2312.04484 • Published Dec 7, 2023
LiMoE: Mixture of LiDAR Representation Learners from Automotive Scenes Paper • 2501.04004 • Published 5 days ago • 1
LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving Paper • 2501.04005 • Published 5 days ago
OVGaussian: Generalizable 3D Gaussian Segmentation with Open Vocabularies Paper • 2501.00326 • Published 12 days ago • 1
Diffusion as Shader: 3D-aware Video Diffusion for Versatile Video Generation Control Paper • 2501.03847 • Published 5 days ago • 17
Efficient Diffusion Model for Image Restoration by Residual Shifting Paper • 2403.07319 • Published Mar 12, 2024
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration Paper • 2501.01320 • Published 10 days ago • 10
Post: Google drops Gemini 2.0 Flash Thinking, a new experimental model that unlocks stronger reasoning capabilities and shows its thoughts. The model plans (with thoughts visible), can solve complex problems with Flash speeds, and more. Now available in anychat, try it out: akhaliq/anychat
Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution Paper • 2312.06640 • Published Dec 11, 2023 • 46
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models Paper • 2412.09645 • Published Dec 10, 2024 • 35
Arbitrary-steps Image Super-resolution via Diffusion Inversion Paper • 2412.09013 • Published Dec 12, 2024 • 11
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion Paper • 2412.09626 • Published about 1 month ago • 20
ObjCtrl-2.5D: Training-free Object Control with Camera Poses Paper • 2412.07721 • Published Dec 10, 2024 • 8
FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models Paper • 2412.07674 • Published Dec 10, 2024 • 20
FlexEvent: Event Camera Object Detection at Arbitrary Frequencies Paper • 2412.06708 • Published Dec 9, 2024
Unsupervised Video Domain Adaptation for Action Recognition: A Disentanglement Perspective Paper • 2208.07365 • Published Aug 15, 2022