# Llama-2-7b-chat-Calgary-Cambridge
[NousResearch/Llama-2-7b-chat-hf](https://huggingface.co/NousResearch/Llama-2-7b-chat-hf) fine-tuned on [kazcfz/Calgary-Cambridge-llama2-37](https://huggingface.co/datasets/kazcfz/Calgary-Cambridge-llama2-37)

<hr style="margin-top: 1.0em; margin-bottom: 1.0em;">

**QLoRA parameters**
Parameter | Value
--- | ---
lora_r | 64
lora_alpha | 16
lora_dropout | 0.1
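
For reference, a minimal sketch of how these values might map onto a `peft.LoraConfig` in a typical QLoRA setup (the `bias` and `task_type` values are assumed defaults, not taken from this card):
```python
from peft import LoraConfig

# LoRA configuration from the table above
peft_config = LoraConfig(
    r=64,                    # lora_r
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",             # assumption: common QLoRA default
    task_type="CAUSAL_LM",   # assumption: causal LM fine-tuning
)
```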

**bitsandbytes parameters**
Parameter | Value
--- | ---
use_4bit | True
bnb_4bit_compute_dtype | "float16"
bnb_4bit_quant_type | "nf4"
use_nested_quant | False
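
A matching sketch of these settings as a `transformers.BitsAndBytesConfig`; the comments show which table entry each argument corresponds to, but treat the mapping as illustrative:
```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit quantization settings from the table above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # use_4bit
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # bnb_4bit_compute_dtype
    bnb_4bit_use_double_quant=False,       # use_nested_quant
)
```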

**TrainingArguments parameters**
Parameter | Value
--- | ---
num_train_epochs | 10
fp16 | False
bf16 | False
per_device_train_batch_size | 4
gradient_accumulation_steps | 1
gradient_checkpointing | True
max_grad_norm | 0.3
learning_rate | 2e-4
weight_decay | 0.001
optim | "paged_adamw_32bit"
lr_scheduler_type | "cosine"
max_steps | -1
warmup_ratio | 0.03
group_by_length | True
save_steps | 0
logging_steps | 25
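
These map directly onto `transformers.TrainingArguments`; a sketch, where `output_dir` is a placeholder and everything else comes from the table:
```python
from transformers import TrainingArguments

training_arguments = TrainingArguments(
    output_dir="./results",  # placeholder, not specified in this card
    num_train_epochs=10,
    fp16=False,
    bf16=False,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    gradient_checkpointing=True,
    max_grad_norm=0.3,
    learning_rate=2e-4,
    weight_decay=0.001,
    optim="paged_adamw_32bit",
    lr_scheduler_type="cosine",
    max_steps=-1,
    warmup_ratio=0.03,
    group_by_length=True,
    save_steps=0,
    logging_steps=25,
)
```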

**SFT parameters**
Parameter | Value
--- | ---
max_seq_length | None
packing | False
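
Putting the pieces together, a hedged sketch of the full SFT run with `trl.SFTTrainer` (the model and dataset names come from this card; the `"text"` column name and the loading details are assumptions, and the `SFTTrainer` signature shown is the older trl-style one):
```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import SFTTrainer

# Assumed loading steps; bnb_config, peft_config, and training_arguments
# are the objects from the sketches above
base = "NousResearch/Llama-2-7b-chat-hf"
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token

dataset = load_dataset("kazcfz/Calgary-Cambridge-llama2-37", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumption: prompts stored in a "text" column
    max_seq_length=None,
    tokenizer=tokenizer,
    args=training_arguments,
    packing=False,
)
trainer.train()
```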

<br>

## Prompt template: Llama-2-Chat
```
<s>[INST] {prompt} [/INST]
```
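
An illustrative way to run inference with this template through the `transformers` pipeline (model and tokenizer as loaded in the sketches above; `max_length=200` matches the samples below):
```python
from transformers import pipeline

prompt = "What?"  # sample query used below
pipe = pipeline(
    task="text-generation", model=model, tokenizer=tokenizer, max_length=200
)
result = pipe(f"<s>[INST] {prompt} [/INST]")
print(result[0]["generated_text"])
```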

<br>

## Sample (actual) output (Max length: 200)
**Base Model**
```
Query: <s>[INST] What? [/INST]
Response:
```
```
Query: <s>[INST] What? [/INST]
Response:
```
```
Query: <s>[INST] What? [/INST]
Response:
```
**Fine-tuned**
```
Query: <s>[INST] What? [/INST]
Response:
```
```
Query: <s>[INST] What? [/INST]
Response:
```
```
Query: <s>[INST] What? [/INST]
Response:
```
**Reload model + Merge LoRA weights?**
```
Query: <s>[INST] What? [/INST]
Response:
```
```
Query: <s>[INST] What? [/INST]
Response:
```
```
Query: <s>[INST] What? [/INST]
Response:
```
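
For that last variant, a sketch of the reload-and-merge step with `peft` (the adapter path `"new_model"` is a placeholder for wherever the fine-tuned LoRA weights were saved):
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Reload the base model in fp16, attach the trained adapter,
# then merge the LoRA weights into the base weights
base_model = AutoModelForCausalLM.from_pretrained(
    "NousResearch/Llama-2-7b-chat-hf",
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "new_model")  # placeholder path
model = model.merge_and_unload()
```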

## Findings
[Current]
- Increasing the epoch count from 5 (previous) to 20 (current) made the fine-tuned model appear to respond much better on Golaif Virus and Dr. Golaif Pootis.
- However, this looks like a solid example of overfitting: 2 of 3 responses (under the 2nd query) are 100% identical to the dataset trained on, apart from the hallucinated continuations from [/INST] onwards that somehow still persist.