Reducing Transformer Key-Value Cache Size with Cross-Layer Attention Paper • 2405.12981 • Published May 21, 2024 • 32
Core ML Text Generation Collection [WIP] On-device LLMs https://huggingface.co/blog/swift-coreml-llm • 3 items • Updated Sep 7, 2023 • 3
Article Halo: Open Source Health Tracking with Wearables By cyrilzakka • Nov 19, 2024 • 107