Commit 098730b
Parent(s): 1919884

Fix: Rename to Multi-Head Latent Attention

Files changed:
- README.md +3 -3
- insights/architecture.md +1 -1
README.md CHANGED

@@ -13,7 +13,7 @@ license: mit
 
 # DeepSeek Multi-Latent Attention
 
-This repository provides a PyTorch implementation of the Multi-Latent Attention (MLA) mechanism introduced in the DeepSeek-V2 paper. **This is not a trained model, but rather a modular attention implementation** that significantly reduces KV cache for efficient inference while maintaining model performance through its innovative architecture. It can be used as a drop-in attention module in transformer architectures.
+This repository provides a PyTorch implementation of the Multi-Head Latent Attention (MLA) mechanism introduced in the DeepSeek-V2 paper. **This is not a trained model, but rather a modular attention implementation** that significantly reduces KV cache for efficient inference while maintaining model performance through its innovative architecture. It can be used as a drop-in attention module in transformer architectures.
 
 ## Key Features
 
@@ -33,10 +33,10 @@ Or download directly from the HuggingFace repository page.
 
 ```python
 import torch
-from src.mla import
+from src.mla import MultiHeadLatentAttention
 
 # Initialize MLA
-mla =
+mla = MultiHeadLatentAttention(
     d_model=512,   # Model dimension
     num_head=8,    # Number of attention heads
     d_embed=512,   # Embedding dimension
insights/architecture.md CHANGED

@@ -1,4 +1,4 @@
-# Advanced Insights: Multi-Latent Attention Architecture
+# Advanced Insights: Multi-Head Latent Attention Architecture
 
 ## Key Architectural Innovations
 
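For context on the claim in the README that MLA "significantly reduces KV cache", below is a minimal, self-contained sketch of the latent KV-compression idea from the DeepSeek-V2 paper. It is not the repository's `MultiHeadLatentAttention` implementation: the class name, the `d_latent` size, and the split into `kv_down`/`k_up`/`v_up` projections are illustrative assumptions, and causal masking and RoPE handling are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentKVAttentionSketch(nn.Module):
    """Illustrative sketch of MLA-style KV compression (not the repo's code):
    keys/values are compressed into a small shared latent that is cached,
    then up-projected per head at attention time."""

    def __init__(self, d_model=512, num_head=8, d_latent=64):
        super().__init__()
        self.num_head = num_head
        self.d_head = d_model // num_head
        self.q_proj = nn.Linear(d_model, d_model)
        self.kv_down = nn.Linear(d_model, d_latent)  # compression; only this output is cached
        self.k_up = nn.Linear(d_latent, d_model)     # reconstruct per-head keys from the latent
        self.v_up = nn.Linear(d_latent, d_model)     # reconstruct per-head values from the latent
        self.o_proj = nn.Linear(d_model, d_model)

    def forward(self, x, latent_cache=None):
        b, t, _ = x.shape
        latent = self.kv_down(x)                      # (b, t, d_latent) -- the small cache entry
        if latent_cache is not None:
            latent = torch.cat([latent_cache, latent], dim=1)
        q = self.q_proj(x).view(b, t, self.num_head, self.d_head).transpose(1, 2)
        k = self.k_up(latent).view(b, -1, self.num_head, self.d_head).transpose(1, 2)
        v = self.v_up(latent).view(b, -1, self.num_head, self.d_head).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)  # standard attention over reconstructed K/V
        out = out.transpose(1, 2).reshape(b, t, -1)
        return self.o_proj(out), latent                # return the latent to cache for the next step


# Dummy usage: the cache grows by d_latent (=64) values per token instead of
# 2 * d_model (=1024) for conventional multi-head attention keys plus values.
mla = LatentKVAttentionSketch()
x = torch.randn(2, 16, 512)
out, cache = mla(x)
print(out.shape, cache.shape)  # torch.Size([2, 16, 512]) torch.Size([2, 16, 64])
```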