princeton-nlp/Sheared-LLaMA-1.3B-Pruned

	---
	license: llama2
	---


	Paper: [https://arxiv.org/pdf/2310.06694.pdf](https://arxiv.org/pdf/2310.06694.pdf)
	Code: https://github.com/princeton-nlp/LLM-Shearing
	Models: [Sheared-LLaMA-1.3B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B), [Sheared-LLaMA-2.7B](https://huggingface.co/princeton-nlp/Sheared-LLaMA-2.7B)
	Pruned Models without Continued Pre-training: [Sheared-LLaMA-1.3B-Pruned](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B-Pruned), [Sheared-LLaMA-2.7B-Pruned](https://huggingface.co/princeton-nlp/Sheared-LLaMA-2.7B-Pruned)
	Instruction-tuned Models: [Sheared-LLaMA-1.3B-ShareGPT](https://huggingface.co/princeton-nlp/Sheared-LLaMA-1.3B-ShareGPT), [Sheared-LLaMA-2.7B-ShareGPT](https://huggingface.co/princeton-nlp/Sheared-LLaMA-2.7B-ShareGPT)

	License: This model is derived from Llama 2, so its use must comply with the Llama 2 license.

	Sheared-LLaMA-1.3B-Pruned is pruned from [meta-llama/Llama-2-7b-hf](https://huggingface.co/meta-llama/Llama-2-7b-hf) and has received no continued pre-training.
	The pruning run used roughly 0.4B tokens. This model may be useful for studying:
	- effective data mixtures for continued pre-training
	- comparisons to other pruning techniques
	- extensive evaluations to understand how pruning affects knowledge and reasoning capabilities of LLMs
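
As a starting point for the studies listed above, here is a minimal sketch of loading the checkpoint and measuring perplexity on a piece of text. It assumes the repository loads through the standard `transformers` auto classes (`AutoTokenizer`, `AutoModelForCausalLM`) and that `torch` is installed. The helper names `load_pruned_model`, `perplexity_from_nll`, and `text_perplexity` are hypothetical and are not part of the LLM-Shearing codebase.

```python
import math

# Repository id taken from this model card.
MODEL_ID = "princeton-nlp/Sheared-LLaMA-1.3B-Pruned"


def load_pruned_model(model_id: str = MODEL_ID):
    """Download the tokenizer and model (assumes the auto classes work for this repo)."""
    # Imported lazily so the pure-Python helper below works without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model


def perplexity_from_nll(mean_nll: float) -> float:
    """Convert a mean per-token negative log-likelihood (in nats) to perplexity."""
    return math.exp(mean_nll)


def text_perplexity(text: str, tokenizer, model) -> float:
    """Perplexity of `text` under `model`, using the built-in causal LM loss."""
    import torch

    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return the mean next-token cross-entropy.
        outputs = model(**inputs, labels=inputs["input_ids"])
    return perplexity_from_nll(outputs.loss.item())


# Example usage (downloads the weights on first call):
#   tokenizer, model = load_pruned_model()
#   print(text_perplexity("Pruning removes parameters from a network.", tokenizer, model))
```

Running the same `text_perplexity` call on another pruned checkpoint, or on this one after continued pre-training with a given data mixture, gives a simple like-for-like comparison. It does not replace a full downstream evaluation.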