deepseek-mla / README.md

Commit History

Update README.md: clarify this is an attention implementation, not a trained model
f628f42

bird-of-paradise committed on

Initial commit: DeepSeek Multi-Latent Attention implementation
550eb56

Yan Wei committed on