Malav Warke

malavwarke

https://www.creatosaurus.io/

AI & ML interests

https://www.creatosaurus.io/

Recent Activity

liked a model 3 days ago

deepseek-ai/DeepSeek-R1

liked a model 3 days ago

Wan-AI/Wan2.1-T2V-14B

liked a model 3 days ago

microsoft/Phi-4-multimodal-instruct

View all activity

Organizations

malavwarke's activity

upvoted a paper 17 days ago

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Paper • 2502.13128 • Published 20 days ago • 37

upvoted a paper 25 days ago

TextAtlas5M: A Large-scale Dataset for Dense Text Image Generation

Paper • 2502.07870 • Published 27 days ago • 43

upvoted a paper 27 days ago

On-device Sora: Enabling Diffusion-Based Text-to-Video Generation for Mobile Devices

Paper • 2502.04363 • Published Feb 5 • 12

upvoted an article about 1 month ago

Article

Open-source DeepResearch – Freeing our search agents

Feb 4

• 1.15k

upvoted a paper about 1 month ago

FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces

Paper • 2501.12909 • Published Jan 22 • 68

upvoted 2 articles about 2 months ago

Article

SmolVLM Grows Smaller – Introducing the 250M & 500M Models!

Jan 23

• 147

Article

The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about...

•

Jan 20

• 63

upvoted a paper 2 months ago

Edicho: Consistent Image Editing in the Wild

Paper • 2412.21079 • Published Dec 30, 2024 • 23

upvoted 12 papers 3 months ago

FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Paper • 2412.09626 • Published Dec 12, 2024 • 20

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Paper • 2412.09619 • Published Dec 12, 2024 • 25

UniReal: Universal Image Generation and Editing via Learning Real-world Dynamics

Paper • 2412.07774 • Published Dec 10, 2024 • 28

DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation

Paper • 2412.07589 • Published Dec 10, 2024 • 47

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion

Paper • 2412.04424 • Published Dec 5, 2024 • 61

TryOffDiff: Virtual-Try-Off via High-Fidelity Garment Reconstruction using Diffusion Models

Paper • 2411.18350 • Published Nov 27, 2024 • 27

CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models

Paper • 2411.18613 • Published Nov 27, 2024 • 52

Pathways on the Image Manifold: Image Editing via Video Generation

Paper • 2411.16819 • Published Nov 25, 2024 • 33

VideoRepair: Improving Text-to-Video Generation via Misalignment Evaluation and Localized Refinement

Paper • 2411.15115 • Published Nov 22, 2024 • 9