RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques Paper • 2501.14492 • Published 13 days ago • 29
Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding Paper • 2401.15708 • Published Jan 28, 2024 • 12
StableIdentity: Inserting Anybody into Anywhere at First Sight Paper • 2401.15975 • Published Jan 29, 2024 • 18
SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning Paper • 2401.16013 • Published Jan 29, 2024 • 24
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models Paper • 2401.15947 • Published Jan 29, 2024 • 51
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling Paper • 2401.16380 • Published Jan 29, 2024 • 49
Generative Expressive Robot Behaviors using Large Language Models Paper • 2401.14673 • Published Jan 26, 2024 • 7
TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts Paper • 2401.14828 • Published Jan 26, 2024 • 9
SliceGPT: Compress Large Language Models by Deleting Rows and Columns Paper • 2401.15024 • Published Jan 26, 2024 • 72
From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities Paper • 2401.15071 • Published Jan 26, 2024 • 37
Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI Paper • 2401.14019 • Published Jan 25, 2024 • 23
Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All Paper • 2401.13795 • Published Jan 24, 2024 • 68
CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion Paper • 2401.14066 • Published Jan 25, 2024 • 10
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models Paper • 2401.13919 • Published Jan 25, 2024 • 30
BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models Paper • 2401.13974 • Published Jan 25, 2024 • 13
GALA: Generating Animatable Layered Assets from a Single Scan Paper • 2401.12979 • Published Jan 23, 2024 • 9
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study Paper • 2401.12789 • Published Jan 23, 2024 • 9
Multiview Equivariance Improves 3D Correspondence Understanding with Minimal Feature Finetuning Paper • 2411.19458 • Published Nov 29, 2024 • 5