Update README.md
README.md
CHANGED
@@ -1,6 +1,6 @@
 # Mixture of Attentions for Speculative Decoding
 
-This
+This checkpoint was obtained from "[Mixture of Attentions For Speculative Decoding](https://arxiv.org/abs/2410.03804)" by Matthieu Zimmer*, Milan Gritta*, Gerasimos Lampouras, Haitham Bou Ammar, and Jun Wang.
 The paper introduces a novel architecture for speculative decoding that enhances the speed of large language model (LLM) inference.
 
 It is supported in vLLM; see our [GitHub repository](https://github.com/huawei-noah/HEBO/tree/mixture-of-attentions/).
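For orientation, below is a minimal sketch of how a draft checkpoint like this one is typically plugged into vLLM's offline speculative-decoding interface. The target model name, the checkpoint path, and the `num_speculative_tokens` value are placeholders, and the parameter names follow standard vLLM releases; the linked fork may expose a different or extended API, so refer to the GitHub repository for the actual usage.

```python
# Sketch only: assumes vLLM's standard speculative-decoding arguments.
# The target model, draft checkpoint path, and token count are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # target model (assumed)
    speculative_model="path/to/this-checkpoint",  # draft/speculator checkpoint
    num_speculative_tokens=5,                     # draft tokens proposed per step (assumed)
)

outputs = llm.generate(
    ["Speculative decoding speeds up LLM inference by"],
    SamplingParams(temperature=0.0, max_tokens=64),
)
print(outputs[0].outputs[0].text)
```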