|
--- |
|
tags: |
|
- generated_from_trainer |
|
- conversational |
|
model-index: |
|
- name: Qra-1b-dolly-instruction-0.1 |
|
results: [] |
|
datasets: |
|
- s3nh/alpaca-dolly-instruction-only-polish |
|
language: |
|
- pl |
|
inference: true |
|
widget: |
|
- messages: |
|
- role: user |
|
content: Napisz kod w pythonie. |
|
license: apache-2.0 |
|
--- |
|
|
|
# Qra-1b-dolly-instruction-0.1 |
|
|
|
This model is a fine-tuned version of [OPI-PG/Qra-1b](https://huggingface.co./OPI-PG/Qra-1b) on the [s3nh/alpaca-dolly-instruction-only-polish](https://huggingface.co./datasets/s3nh/alpaca-dolly-instruction-only-polish) dataset.
|
|
|
## Model description
|
|
|
Fine-tuned from [OPI-PG/Qra-1b](https://huggingface.co./OPI-PG/Qra-1b), a Polish causal language model.
|
|
|
## Intended uses & limitations |
|
|
|
This model has been fine-tuned for the question-answering task. It can be used as a chat model, but it performs poorly in multi-turn conversations because the training dataset contains only single-turn instructions, not conversations.
|
|
|
```py |
|
import torch |
|
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline |
|
|
|
model_id = "nie3e/Qra-1b-dolly-instruction-0.1" |
|
device = "cuda" if torch.cuda.is_available() else "cpu" |
|
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
model_id, |
|
torch_dtype=torch.bfloat16, |
|
) |
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
pipe = pipeline( |
|
"text-generation", model=model, tokenizer=tokenizer, device=device |
|
) |
|
|
|
def get_answer(system_prompt: str, user_prompt: str) -> str: |
|
input_msg = [ |
|
{"role": "system", "content": system_prompt}, |
|
{"role": "user", "content": user_prompt} |
|
] |
|
prompt = pipe.tokenizer.apply_chat_template( |
|
input_msg, tokenize=False, |
|
add_generation_prompt=True |
|
) |
|
    outputs = pipe(
        prompt, max_new_tokens=512,
        # do_sample=False selects greedy decoding; temperature, top_k and
        # top_p below take effect only if do_sample=True
        do_sample=False, temperature=0.1, top_k=50, top_p=0.1,
        eos_token_id=pipe.tokenizer.eos_token_id,
        pad_token_id=pipe.tokenizer.pad_token_id
    )
|
return outputs[0]['generated_text'][len(prompt):].strip() |
|
|
|
print( |
|
get_answer( |
|
system_prompt="Jesteś przyjaznym chatbotem", |
|
user_prompt="Napisz czym jest dokument architectural decision record." |
|
) |
|
) |
|
``` |
|
|
|
## Training and evaluation data |
|
|
|
Dataset: [s3nh/alpaca-dolly-instruction-only-polish](https://huggingface.co./datasets/s3nh/alpaca-dolly-instruction-only-polish) |
|
|
|
Each row has been converted into a conversation using this function:
|
```py |
|
system_message = """Jesteś przyjaznym chatbotem""" |
|
|
|
def create_conversation(sample) -> dict:
    # merge the instruction and input into a single user turn,
    # stripping stray quote characters
    strip_characters = "\"'"
|
return { |
|
"messages": [ |
|
{"role": "system", "content": system_message}, |
|
{"role": "user", |
|
"content": f"{sample['instruction'].strip(strip_characters)} " |
|
f"{sample['input'].strip(strip_characters)}"}, |
|
{"role": "assistant", |
|
"content": f"{sample['output'].strip(strip_characters)}"} |
|
] |
|
} |
|
``` |
|
|
|
Train/test split: 90%/10% |
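
The data-loading code itself is not included in the card; below is a minimal sketch of how the preparation could look with the 🤗 Datasets API, assuming the conversion and split are done with the standard `map` and `train_test_split` calls (the exact script is an assumption):

```py
from datasets import load_dataset

# load the instruction dataset
dataset = load_dataset("s3nh/alpaca-dolly-instruction-only-polish", split="train")

# convert each row into a chat-style sample with create_conversation from above
dataset = dataset.map(
    create_conversation, remove_columns=dataset.features, batched=False
)

# 90%/10% train/test split
dataset = dataset.train_test_split(test_size=0.1)
```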
|
|
|
## Training procedure |
|
|
|
- GPU: 2x RTX 4060 Ti 16 GB
- Training time: ~1 hour
|
|
|
Trained with accelerate + DeepSpeed using the following config:
|
```yml |
|
compute_environment: LOCAL_MACHINE |
|
debug: false |
|
deepspeed_config: |
|
gradient_accumulation_steps: 2 |
|
zero3_init_flag: false |
|
zero_stage: 1 |
|
distributed_type: DEEPSPEED |
|
downcast_bf16: 'no' |
|
machine_rank: 0 |
|
main_training_function: main |
|
mixed_precision: bf16 |
|
num_machines: 1 |
|
num_processes: 2 |
|
rdzv_backend: static |
|
same_network: true |
|
tpu_env: [] |
|
tpu_use_cluster: false |
|
tpu_use_sudo: false |
|
use_cpu: false |
|
``` |
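
Assuming this config is saved as e.g. `ds_config.yaml` (file name hypothetical), training would be launched with `accelerate launch --config_file ds_config.yaml train.py`, where `train.py` stands in for the training script, which is not included in this card.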
|
|
|
### Training hyperparameters |
|
|
|
LoRA config:
|
```py |
|
from peft import LoraConfig

peft_config = LoraConfig(
|
lora_alpha=128, |
|
lora_dropout=0.05, |
|
r=256, |
|
bias="none", |
|
target_modules="all-linear", |
|
task_type="CAUSAL_LM" |
|
) |
|
``` |
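
Note: with `lora_alpha=128` and `r=256`, the effective LoRA scaling factor is `lora_alpha / r = 0.5`.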
|
|
|
Training arguments: |
|
```py |
|
from transformers import TrainingArguments

args = TrainingArguments(
|
output_dir="Qra-1b-dolly-instruction-0.1", |
|
num_train_epochs=3, |
|
per_device_train_batch_size=3, |
|
gradient_accumulation_steps=2, |
|
gradient_checkpointing=True, |
|
optim="adamw_torch_fused", |
|
logging_steps=10, |
|
save_strategy="epoch", |
|
learning_rate=2e-4, |
|
bf16=True, |
|
tf32=True, |
|
max_grad_norm=0.3, |
|
warmup_ratio=0.03, |
|
lr_scheduler_type="constant", |
|
push_to_hub=False, |
|
report_to=["tensorboard"], |
|
) |
|
``` |
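
The trainer setup itself is not shown in the card. Below is a minimal sketch of how the pieces above could fit together, assuming trl's `SFTTrainer` was used (the use of trl, the `max_seq_length` value, and `packing` are assumptions):

```py
from trl import SFTTrainer

# hypothetical reconstruction: `model` and `tokenizer` are loaded as in the
# usage example, `dataset` is the 90/10 split from above, and `peft_config`
# and `args` are defined as in this section
trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
    peft_config=peft_config,
    max_seq_length=1024,  # assumed value; not stated in the card
    packing=True,         # assumed; common for instruction fine-tuning
)
trainer.train()
```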
|
|
|
|
|
### Framework versions |
|
|
|
- PEFT 0.10.0 |
|
- Transformers 4.39.2 |
|
- Pytorch 2.2.2+cu121 |
|
- Datasets 2.18.0 |
|
- Tokenizers 0.15.2 |