Extended Context (via YaRN) Finetune of Llama-2-13b with airoboros-2.1 (LoRA)
Overview
This is a finetune of NousResearch/Yarn-Llama-2-13b-64k. That starting point is Llama-2-13b with additional pretraining done with YaRN scaling applied to RoPE, extending the useful context length to 64k tokens. Starting from this model, I performed instruction tuning with Jon Durbin's Airoboros 2.1 dataset, with the same scaling approach applied.
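For orientation, a minimal inference sketch is shown below. It assumes a recent transformers and torch install; the upstream Yarn-Llama-2 base models ship custom modeling code, hence trust_remote_code=True, and the prompt shown is a generic placeholder rather than the exact Airoboros chat format (see the full model card for that).

```python
# Minimal inference sketch (assumptions: transformers + torch installed,
# trust_remote_code pulls in the YaRN-scaled RoPE implementation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NousResearch/Yarn-Llama-2-13b-64k"  # base model; substitute the merged finetune as appropriate

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Placeholder prompt; use the Airoboros prompt format from the full model card in practice.
prompt = "USER: Summarize the YaRN approach to context extension. ASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```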
This is a (merged) QLoRA fine-tune (rank 64).
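A QLoRA setup along these lines might look like the following sketch using bitsandbytes and PEFT. The rank (64) matches this finetune, but the alpha, dropout, and target modules are illustrative assumptions, not the values actually used.

```python
# Hypothetical QLoRA configuration sketch; r=64 matches this finetune,
# while lora_alpha, lora_dropout, and target_modules are assumed values.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # QLoRA: base weights quantized to 4-bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Yarn-Llama-2-13b-64k",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
base = prepare_model_for_kbit_training(base)

lora_config = LoraConfig(
    r=64,                                    # LoRA rank used for this finetune
    lora_alpha=16,                           # assumed value
    lora_dropout=0.05,                       # assumed value
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()
```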
The finetune was performed on a single RTX 6000 Ada (~18 hours).
For the full model card, including benchmarks, see the model card of the fp16 merged model.