---
license: apache-2.0
---

THIS MODEL IS EXPERIMENTAL AND MIGHT BE BUGGY, I HAVEN'T PERFECTED THE STRENGTH OF DPO AND SFT YET.

Submitting to the Open LLM Leaderboard with base model yi-34b-200k-llamafied to see whether there's a point in merging a LoRA over a LoRA if both have the same lora_r, or if it doesn't matter.

Another AEZAKMI v2 finetune over Yi-34B-200K-rawrr-r3.

Sequence length was 2200; I was able to squeeze that in using Unsloth, and the script I used is in this repo. Training took around 18 hours on a local RTX 3090 Ti. Will be uploading fp16 and exl2 versions soon. So far it seems like de-contaminating Yi worked nicely.

This LoRA goes over the Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3 LoRA. So first get Yi-34B-200K llamafied, merge in Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3, then merge in this LoRA (see the merge sketch at the end of this card).

Credits to mlabonne (I used pieces of his Mistral fine-tuning script for dataset preparation), and to Daniel Han and Michael Han (Unsloth AI team).

[made with Unsloth](https://github.com/unslothai/unsloth)
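
Below is a minimal sketch of that merge order, assuming standard `transformers` + `peft` tooling; the local paths and output name are placeholders, not exact repo paths.

```python
# Sketch only: merge the DPO adapter into the llamafied base first,
# then merge this AEZAKMI v2 adapter on top. Paths are placeholders.
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Start from the llamafied Yi-34B-200K base model.
base = AutoModelForCausalLM.from_pretrained(
    "path/to/Yi-34B-200K-llamafied", torch_dtype="auto"
)

# Step 1: merge Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3 into the base weights.
model = PeftModel.from_pretrained(
    base, "path/to/Yi-34B-200K-rawrr1-LORA-DPO-experimental-r3"
)
model = model.merge_and_unload()

# Step 2: merge this AEZAKMI v2 LoRA over the result.
model = PeftModel.from_pretrained(model, "path/to/this-aezakmi-v2-lora")
model = model.merge_and_unload()

# Save the fully merged fp16 model.
model.save_pretrained("Yi-34B-200K-rawrr-aezakmi-v2-merged")
```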