grimulkan/llama2_70b_longlora_fp16_32k_ROPE8

Text Generation

text-generation-inference

Inference Endpoints


grimulkan committed on Jan 13

Commit 1074882 (1 parent: d8d3c0e)

Create README.md

Files changed (1)

README.md +5 -0

README.md ADDED

	@@ -0,0 +1,5 @@

+This is the same as Yukang's [Llama-2-70b-longlora-32k](https://huggingface.co/Yukang/Llama-2-70b-longlora-32k), except that the extra pad token has been stripped from the tokenizer to make it similar to the base Llama model. Please refer to that page for more details.
+It was created by merging [LongAlpaca-70B-lora](https://huggingface.co/Yukang/LongAlpaca-70B-lora) into [Llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b), replacing the embed and norm layers as described in the [LongLoRA repo](https://github.com/dvlab-research/LongLoRA), and removing the pad token along with its extra embedding row.
+This is not an instruct-tuned model, but a base model for further fine-tuning. It supports a 32K-token context using linear RoPE scaling with a factor of 8.
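
The context figure in the README follows from the scaling factor. As a rough, standard-library-only sketch (not the loader code for this model), linear RoPE scaling divides each position index by the factor before the rotary angles are computed, so a position in the extended window maps back into the range the base model was trained on. The native context of 4096, the head dimension of 128, and the RoPE base of 10000 are assumptions taken from the standard Llama 2 70B configuration, not from this commit; `rope_angles` is a hypothetical helper written for illustration.

```python
def rope_angles(position, head_dim=128, base=10000.0, scaling_factor=1.0):
    """Return the rotary angles for one position.

    Linear scaling (position interpolation) divides the position index by
    scaling_factor before multiplying by the per-dimension frequencies.
    head_dim and base are assumed Llama 2 defaults, not read from the model.
    """
    scaled_position = position / scaling_factor
    return [
        scaled_position * base ** (-2.0 * i / head_dim)
        for i in range(head_dim // 2)
    ]


NATIVE_CONTEXT = 4096  # assumed Llama 2 pretraining context length
ROPE_FACTOR = 8        # linear scaling factor stated in the README

# 4096 * 8 = 32768 positions, i.e. the 32K context the README mentions.
extended_context = NATIVE_CONTEXT * ROPE_FACTOR

# With factor 8, position 32760 gets the same angles as unscaled position 4095.
scaled = rope_angles(32760, scaling_factor=ROPE_FACTOR)
native = rope_angles(4095)

print(extended_context)
print(scaled == native)
```

When loading the model with an inference library, the same factor generally has to be supplied through that library's own RoPE scaling setting; the exact option name depends on the library and version, so check its documentation rather than relying on this sketch.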