iFormer: Integrating ConvNet and Transformer for Mobile Application Paper • 2501.15369 • Published 10 days ago • 10
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 29 days ago • 293
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 10 items • Updated 6 days ago • 85
🧠 Abliteration Collection Uncensored models using abliteration. See this article for more information: huggingface.co/blog/mlabonne/abliteration • 7 items • Updated Nov 18, 2024 • 27
Transcription Collection Transcribe interviews for free with Whisper in Spaces. • 10 items • Updated Oct 1, 2024 • 8
Mantis Collection Mantis model family optimized for multi-image reasoning with interleaved text/image format • 11 items • Updated Jul 2, 2024 • 9
PHAnToM: Personality Has An Effect on Theory-of-Mind Reasoning in Large Language Models Paper • 2403.02246 • Published Mar 4, 2024 • 1
Refining Text-to-Image Generation: Towards Accurate Training-Free Glyph-Enhanced Image Generation Paper • 2403.16422 • Published Mar 25, 2024 • 1
Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians Paper • 2403.17898 • Published Mar 26, 2024 • 15
GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers Paper • 2409.04196 • Published Sep 6, 2024 • 14
Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency Paper • 2409.02634 • Published Sep 4, 2024 • 92
Sapiens Collection Foundation models for human tasks. Code: https://github.com/facebookresearch/sapiens • 72 items • Updated Sep 18, 2024 • 53
Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24, 2024 • 183