grimulkan committed on
Commit
1074882
1 Parent(s): d8d3c0e

Create README.md

Files changed (1)
  1. README.md +5 -0
README.md ADDED
@@ -0,0 +1,5 @@
+ This is the same as Yukang's [Llama-2-70b-longlora-32k](https://huggingface.co/Yukang/Llama-2-70b-longlora-32k), except that the extra pad token has been stripped from the tokenizer to bring it in line with the base Llama model's tokenizer. Please refer to that page for more details.
+
+ It was created by merging [LongAlpaca-70B-lora](https://huggingface.co/Yukang/LongAlpaca-70B-lora) into [Llama-2-70b](https://huggingface.co/meta-llama/Llama-2-70b), replacing the embedding and normalization layers as described in the [LongLoRA repo](https://github.com/dvlab-research/LongLoRA), and then removing the extra embedding row and the pad token.
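
As a rough illustration of the merge step described above, the toy sketch below folds a LoRA update into a base weight, using the standard form `W + (alpha / r) * B @ A`, and then drops the last row of an embedding table, which is assumed here to be the row added for the pad token. The matrices, the `alpha` and `r` values, and the helper names are all hypothetical. This is not the script used to build the actual model, only a small-scale picture of the two operations.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, r):
    """Return weight + (alpha / r) * (lora_b @ lora_a) without mutating inputs."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


def strip_last_row(embedding):
    """Drop the final row, assumed (for this sketch) to belong to the pad token."""
    return embedding[:-1]


# Toy 2x2 base weight with a rank-1 adapter (hypothetical values).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]       # shape (r, in_features)  = (1, 2)
lora_b = [[0.5], [0.25]]    # shape (out_features, r) = (2, 1)
merged = merge_lora(base, lora_a, lora_b, alpha=2, r=1)

# Toy embedding table: 3 "real" tokens plus 1 extra pad row.
embedding = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.0, 0.0]]
trimmed = strip_last_row(embedding)
```

After merging, the adapter is no longer needed at inference time, and trimming the extra row leaves the embedding table with one row per remaining token.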
+
+ This is not an instruct-tuned model but a base model intended for further fine-tuning. It supports a 32K-token context using linear RoPE scaling with a factor of 8.
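
To make the scaling arithmetic concrete: with linear RoPE scaling, position indices are divided by the scaling factor before the rotary angles are computed, so a factor of 8 maps 32,768 positions onto the 4,096-position range that Llama 2 was pretrained on. The sketch below is a minimal pure-Python illustration. The function names, the head dimension of 128, and the base of 10000 are assumptions made for the example, not values read from this model's config.

```python
BASE_CONTEXT = 4096   # Llama 2 pretraining context length
SCALING_FACTOR = 8    # linear RoPE scaling factor stated above


def scaled_position(position, factor=SCALING_FACTOR):
    """Linear scaling: compress the position index by the factor."""
    return position / factor


def rope_angles(position, head_dim=128, base=10000.0, factor=SCALING_FACTOR):
    """Rotary angles for one position, one angle per pair of dimensions."""
    pos = scaled_position(position, factor)
    return [pos * base ** (-2 * i / head_dim) for i in range(head_dim // 2)]


extended_context = BASE_CONTEXT * SCALING_FACTOR   # 4096 * 8
last_position = extended_context - 1

# Position 8 under factor-8 scaling gets the same angles as position 1 unscaled.
same_angles = rope_angles(8) == rope_angles(1, factor=1)
```

In Hugging Face Transformers this would typically correspond to a linear `rope_scaling` entry with a factor of 8 in the model config. Check the model's own `config.json` rather than relying on this sketch.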