Text Generation
Transformers
PyTorch
longllama
code
custom_code
Eval Results
syzymon commited on
Commit
b5d73c2
1 Parent(s): c7f5c7c

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +1 -1
README.md CHANGED
@@ -41,7 +41,7 @@ tags:
41
  </div>
42
 
43
  ## TLDR
44
- This repo contains [LongLLaMA-Instruct-3Bv1.1](https://huggingface.co/syzymon/long_llama_3b_v1_1).
45
 
46
  LongLLaMA is built upon the foundation of [OpenLLaMA](https://github.com/openlm-research/open_llama) and fine-tuned using the [Focused Transformer (FoT)](https://arxiv.org/abs/2307.03170) method. We release a smaller 3B base variant (not instruction tuned) of the LongLLaMA model on a permissive license (Apache 2.0) and inference code supporting longer contexts on [Hugging Face](https://huggingface.co/syzymon/long_llama_3b). Our model weights can serve as the drop-in replacement of LLaMA in existing implementations (for short context up to 2048 tokens). Additionally, we provide evaluation results and comparisons against the original OpenLLaMA models. Stay tuned for further updates.
47
 
 
41
  </div>
42
 
43
  ## TLDR
44
+ This repo contains [LongLLaMA-3Bv1.1](https://huggingface.co/syzymon/long_llama_3b_v1_1).
45
 
46
  LongLLaMA is built upon the foundation of [OpenLLaMA](https://github.com/openlm-research/open_llama) and fine-tuned using the [Focused Transformer (FoT)](https://arxiv.org/abs/2307.03170) method. We release a smaller 3B base variant (not instruction tuned) of the LongLLaMA model on a permissive license (Apache 2.0) and inference code supporting longer contexts on [Hugging Face](https://huggingface.co/syzymon/long_llama_3b). Our model weights can serve as the drop-in replacement of LLaMA in existing implementations (for short context up to 2048 tokens). Additionally, we provide evaluation results and comparisons against the original OpenLLaMA models. Stay tuned for further updates.
47