---
tags:
- generated_from_trainer
- conversational
model-index:
- name: Qra-1b-dolly-instruction-0.1
  results: []
datasets:
- s3nh/alpaca-dolly-instruction-only-polish
language:
- pl
inference: true
widget:
- messages:
  - role: user
    content: Napisz kod w pythonie.
license: apache-2.0
---
# Qra-1b-dolly-instruction-0.1
This model is a fine-tuned version of [OPI-PG/Qra-1b](https://huggingface.co./OPI-PG/Qra-1b) on the [s3nh/alpaca-dolly-instruction-only-polish](https://huggingface.co./datasets/s3nh/alpaca-dolly-instruction-only-polish) dataset.
## Model Description
Trained from the [OPI-PG/Qra-1b](https://huggingface.co./OPI-PG/Qra-1b) base model.
## Intended uses & limitations
This model has been fine-tuned for a question-answering task. It can be used as a chat model, but it does not perform well in that role because the training data did not contain multi-turn conversations.

Example usage:
```py
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_id = "nie3e/Qra-1b-dolly-instruction-0.1"
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned model and its tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

pipe = pipeline(
    "text-generation", model=model, tokenizer=tokenizer, device=device
)


def get_answer(system_prompt: str, user_prompt: str) -> str:
    # Build a single-turn conversation and render it with the chat template
    input_msg = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt}
    ]
    prompt = pipe.tokenizer.apply_chat_template(
        input_msg, tokenize=False,
        add_generation_prompt=True
    )
    outputs = pipe(
        prompt, max_new_tokens=512, do_sample=False, temperature=0.1, top_k=50,
        top_p=0.1, eos_token_id=pipe.tokenizer.eos_token_id,
        pad_token_id=pipe.tokenizer.pad_token_id
    )
    # Cut the prompt off the generated text and return only the answer
    return outputs[0]['generated_text'][len(prompt):].strip()


print(
    get_answer(
        system_prompt="Jesteś przyjaznym chatbotem",
        user_prompt="Napisz czym jest dokument architectural decision record."
    )
)
```
## Training and evaluation data
Dataset: [s3nh/alpaca-dolly-instruction-only-polish](https://huggingface.co./datasets/s3nh/alpaca-dolly-instruction-only-polish)
Each row has been converted into a conversation using the following function:
```py
# System prompt prepended to every training example
system_message = """Jesteś przyjaznym chatbotem"""


def create_conversation(sample) -> dict:
    strip_characters = "\"'"
    return {
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user",
             "content": f"{sample['instruction'].strip(strip_characters)} "
                        f"{sample['input'].strip(strip_characters)}"},
            {"role": "assistant",
             "content": f"{sample['output'].strip(strip_characters)}"}
        ]
    }
```
Train/test split: 90%/10%
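
For reference, this preparation can be reproduced with the Hugging Face datasets library roughly as follows. This is a minimal sketch: the use of the default train split, the `remove_columns` handling, and the absence of a fixed seed are assumptions, not details taken from the original training script.

```py
from datasets import load_dataset

# Load the instruction dataset (instruction / input / output columns)
dataset = load_dataset("s3nh/alpaca-dolly-instruction-only-polish", split="train")

# Convert every row into the chat format with create_conversation() defined above
dataset = dataset.map(create_conversation, remove_columns=dataset.column_names)

# 90% train / 10% test split
dataset = dataset.train_test_split(test_size=0.1)
train_dataset, eval_dataset = dataset["train"], dataset["test"]
```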
## Training procedure
- GPU: 2x RTX 4060 Ti 16 GB
- Training time: ~1 hour

Trained with accelerate + DeepSpeed using the following config:
```yml
compute_environment: LOCAL_MACHINE
debug: false
deepspeed_config:
  gradient_accumulation_steps: 2
  zero3_init_flag: false
  zero_stage: 1
distributed_type: DEEPSPEED
downcast_bf16: 'no'
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 2
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
### Training hyperparameters
LoRA config:
```py
from peft import LoraConfig

peft_config = LoraConfig(
    lora_alpha=128,
    lora_dropout=0.05,
    r=256,
    bias="none",
    target_modules="all-linear",
    task_type="CAUSAL_LM"
)
```
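
To see how many parameters this configuration actually trains, the config can be applied to the loaded base model with PEFT. This is a minimal sketch assuming `model` holds OPI-PG/Qra-1b loaded via AutoModelForCausalLM; the original run may instead have passed `peft_config` straight to the trainer, as sketched further below.

```py
from peft import get_peft_model

# Wrap the base model with LoRA adapters described by peft_config
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # prints trainable vs. total parameter counts
```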
Training arguments:
```py
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="Qra-1b-dolly-instruction-0.1",
    num_train_epochs=3,
    per_device_train_batch_size=3,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    optim="adamw_torch_fused",
    logging_steps=10,
    save_strategy="epoch",
    learning_rate=2e-4,
    bf16=True,
    tf32=True,
    max_grad_norm=0.3,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    push_to_hub=False,
    report_to=["tensorboard"],
)
```
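
The training script itself is not part of this card. A minimal sketch of how the pieces above could be combined with trl's SFTTrainer is shown below; the trainer choice, the `max_seq_length` value, and the dataset variable names are assumptions, not confirmed details of the original run.

```py
from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,                  # base OPI-PG/Qra-1b model
    args=args,                    # TrainingArguments defined above
    train_dataset=train_dataset,  # 90% split in chat format
    eval_dataset=eval_dataset,    # 10% split in chat format
    peft_config=peft_config,      # LoRA settings defined above
    tokenizer=tokenizer,
    max_seq_length=1024,          # assumption; not stated in this card
)
trainer.train()
trainer.save_model()
```

With the accelerate + DeepSpeed config above saved to a file, a script like this would typically be launched on both GPUs with `accelerate launch --config_file <path-to-config> train.py`.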
### Framework versions
- PEFT 0.10.0
- Transformers 4.39.2
- Pytorch 2.2.2+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2 |