Distil-Whisper Models Collection The first version of the Distil-Whisper models released with the Distil-Whisper paper. • 4 items • Updated Mar 21 • 35
Retentive Network: A Successor to Transformer for Large Language Models Paper • 2307.08621 • Published Jul 17, 2023 • 170
YaRN: Efficient Context Window Extension of Large Language Models Paper • 2309.00071 • Published Aug 31, 2023 • 65