Phi-4 Collection Phi-4 family of small language and multi-modal models. • 7 items • Updated about 8 hours ago • 83
olmOCR Collection olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 3 items • Updated 2 days ago • 46
Ovis2 Collection Our latest advancement in multi-modal large language models (MLLMs) • 8 items • Updated 12 days ago • 52
view article Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 11 days ago • 89
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 11 items • Updated 16 days ago • 91
TransPixar: Advancing Text-to-Video Generation with Transparency Paper • 2501.03006 • Published Jan 6 • 23
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases Paper • 2412.04862 • Published Dec 6, 2024 • 50
Llama 3.3 Collection This collection hosts the transformers and original repos of the Llama 3.3 • 1 item • Updated Dec 6, 2024 • 136
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 143
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 74
Insight-V Collection Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models • 5 items • Updated Nov 22, 2024 • 9
WhisperNER Collection Collection of WhisperNER models for joint open type NER and ASR • 3 items • Updated Nov 30, 2024 • 6
OpenScholar_V1 Collection The set of models, index, data associated with the paper "OpenScholar: Synthesizing Scientific Literature with Retrieval-Augmented LMs". • 8 items • Updated Nov 22, 2024 • 33
LLaVA-o1: Let Vision Language Models Reason Step-by-Step Paper • 2411.10440 • Published Nov 15, 2024 • 114