GHOST 2.0: generative high-fidelity one shot transfer of heads Paper • 2502.18417 • Published 3 days ago • 56
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published 8 days ago • 151
Magma: A Foundation Model for Multimodal AI Agents Paper • 2502.13130 • Published 10 days ago • 47
You Do Not Fully Utilize Transformer's Representation Capacity Paper • 2502.09245 • Published 15 days ago • 33
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models Paper • 2502.03032 • Published 23 days ago • 55
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published 25 days ago • 111
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published about 1 month ago • 108
SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding Paper • 2412.09604 • Published Dec 12, 2024 • 35
Mechanistic Permutability: Match Features Across Layers Paper • 2410.07656 • Published Oct 10, 2024 • 18
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated 18 days ago • 297
XGen-MM-1 models and datasets Collection A collection of all XGen-MM (Foundation LMM) models! • 18 items • Updated 10 days ago • 38
PDF Document / OCR Datasets Collection Document datasets with .pdf files that are usable with pixparse libraries and tools. • 2 items • Updated Mar 30, 2024 • 47
Visual Scorers! Collection Variants of Visual Evaluation Models proposed in "Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-defined Levels". Use via `model.score()`! • 10 items • Updated Dec 2, 2024 • 3
Gemma 2 2B Release Collection The 2.6B parameter version of Gemma 2. • 6 items • Updated Dec 13, 2024 • 78
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents Paper • 2407.18901 • Published Jul 26, 2024 • 33
Theia: Distilling Diverse Vision Foundation Models for Robot Learning Paper • 2407.20179 • Published Jul 29, 2024 • 47
Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published Jul 26, 2024 • 32
WebUI (CHI 2023) Collection Learning Mobile User Interface Representation with Web Semantics • 23 items • Updated Nov 1, 2024 • 5