---
license: llama2
---

**Paper**: [https://arxiv.org/pdf/2310.06694.pdf](https://arxiv.org/pdf/2310.06694.pdf)

**Code**: https://github.com/princeton-nlp/LLM-Shearing

**Models**: [Sheared-LLaMA-1.3B](https://huggingface.co./princeton-nlp/Sheared-LLaMA-1.3B), [Sheared-LLaMA-2.7B](https://huggingface.co./princeton-nlp/Sheared-LLaMA-2.7B)

**Pruned Models without Continued Pre-training**: [Sheared-LLaMA-1.3B-Pruned](https://huggingface.co./princeton-nlp/Sheared-LLaMA-1.3B-Pruned), [Sheared-LLaMA-2.7B-Pruned](https://huggingface.co./princeton-nlp/Sheared-LLaMA-2.7B-Pruned)

**Instruction-tuned Models**: [Sheared-LLaMA-1.3B-ShareGPT](https://huggingface.co./princeton-nlp/Sheared-LLaMA-1.3B-ShareGPT), [Sheared-LLaMA-2.7B-ShareGPT](https://huggingface.co./princeton-nlp/Sheared-LLaMA-2.7B-ShareGPT)

**License**: Must comply with the Llama2 license, since this model is derived from Llama2.

Sheared-LLaMA-2.7B-Pruned is the model pruned from [meta-llama/Llama-2-7b-hf](https://huggingface.co./meta-llama/Llama-2-7b-hf) **without continued pre-training**. We used roughly 0.4B tokens to perform the pruning experiment.
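
The pruned checkpoint can be loaded like any other Llama-architecture model on the Hub. The snippet below is a minimal sketch assuming the standard `transformers` `AutoModelForCausalLM`/`AutoTokenizer` API; note that generation quality is expected to be limited, since this checkpoint has only been pruned and not re-trained.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the pruned-only checkpoint (no continued pre-training) from the Hub.
model_name = "princeton-nlp/Sheared-LLaMA-2.7B-Pruned"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Quick generation check.
inputs = tokenizer("Sheared-LLaMA is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```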