microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • Updated about 4 hours ago • 7.35k • 513
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data • Paper • 2502.14397 • Published 8 days ago • 35
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation • Paper • 2502.13128 • Published 10 days ago • 35
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening • Paper • 2502.12146 • Published 11 days ago • 15 • 3