@vladbogo on Hugging Face: "A recent paper titled "ShortGPT: Layers in Large Language Models are More…"


A recent paper titled "ShortGPT: Layers in Large Language Models are More Redundant Than You Expect" proposes a simple and effective approach to pruning Large Language Models (LLMs) by removing redundant layers.

Key points:
* Discovers significant redundancy across layers in LLMs, with some layers contributing negligibly to final performance.
* Defines a new metric called Block Influence (BI) to quantify the importance of each layer in an LLM, based on how much the layer changes the hidden states passing through it.
* Removes layers with low BI scores, achieving up to 25% reduction in parameters and computation while maintaining 92% of the LLM's performance.
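
To make the BI idea concrete, here is a minimal sketch in plain Python. It assumes BI is computed as one minus the average cosine similarity between a layer's input and output hidden states (my reading of the paper, so check the paper for the exact definition). The function names and the toy data are hypothetical and are not taken from the authors' code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def block_influence(hidden_in, hidden_out):
    """Assumed BI: 1 - mean per-token cosine similarity between the
    hidden states entering a layer and the hidden states leaving it."""
    sims = [cosine(x, y) for x, y in zip(hidden_in, hidden_out)]
    return 1.0 - sum(sims) / len(sims)


def layers_to_remove(states, n_remove):
    """states[i] holds the hidden states entering layer i, and states[-1]
    is the output of the last layer. Returns the indices of the n_remove
    layers with the lowest BI, plus all BI scores."""
    scores = [
        block_influence(states[i], states[i + 1])
        for i in range(len(states) - 1)
    ]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(ranked[:n_remove]), scores


# Toy example (made-up numbers): 3 layers, 2 tokens, hidden size 2.
toy_states = [
    [[1.0, 0.0], [0.0, 1.0]],  # input to layer 0
    [[1.0, 0.0], [0.0, 1.0]],  # layer 0 leaves the states unchanged
    [[0.0, 1.0], [1.0, 0.0]],  # layer 1 rotates every token by 90 degrees
    [[0.0, 2.0], [2.0, 0.0]],  # layer 2 only rescales, direction unchanged
]
removed, scores = layers_to_remove(toy_states, 2)
print(removed, scores)
```

In this toy setup, layers 0 and 2 do not change the direction of any hidden state, so they get the lowest scores and are picked for removal, while layer 1 is kept. With a real model, the hidden states would come from running calibration text through the network and recording each layer's input and output.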

Congrats to the authors for their work!

Paper: ShortGPT: Layers in Large Language Models are More Redundant Than You Expect (2403.03853)
