Accelerating Large Language Model Decoding with Speculative Sampling • arXiv:2302.01318 • Published Feb 2, 2023
Fast Inference from Transformers via Speculative Decoding • arXiv:2211.17192 • Published Nov 30, 2022
AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling • arXiv:2011.09011 • Published Nov 18, 2020