MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models Paper • 2409.17481 • Published Sep 26
Minitron Collection A family of compressed models obtained via pruning and knowledge distillation • 9 items • Updated Oct 3
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21