|
--- |
|
license: apache-2.0 |
|
library_name: transformers |
|
base_model: |
|
- nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated |
|
datasets: |
|
- nbeerbower/Schule-DPO |
|
- nbeerbower/Arkhaios-DPO |
|
- nbeerbower/Purpura-DPO |
|
--- |
|
|
|
![image/png](https://huggingface.co./nbeerbower/mistral-nemo-kartoffel-12B/resolve/main/kartoffel.png?download=true) |
|
|
|
# mistral-nemo-kartoffel-12B |
|
|
|
[Mahou-1.5-mistral-nemo-12B-lorablated](https://huggingface.co./nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated) fine-tuned on Schule-DPO, Arkhaios-DPO, and Purpura-DPO.
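
A minimal usage sketch with transformers (the prompt, chat-template call, and generation settings here are illustrative, not the exact settings used in training or evaluation):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nbeerbower/mistral-nemo-kartoffel-12B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumes a GPU with enough memory for bf16 weights
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a short haiku about potatoes."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```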
|
|
|
### Method |
|
|
|
[ORPO-tuned](https://mlabonne.github.io/blog/posts/2024-04-19_Fine_tune_Llama_3_with_ORPO.html) on 8x A100 GPUs for 2 epochs.
|
|
|
QLoRA config: |
|
```python
# QLoRA config
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

torch_dtype = torch.bfloat16  # compute dtype (assumed; consistent with bf16=True below)

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch_dtype,
    bnb_4bit_use_double_quant=True,
)

# LoRA config
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['up_proj', 'down_proj', 'gate_proj', 'k_proj', 'q_proj', 'v_proj', 'o_proj']
)
```
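
For context, a hedged sketch of how the base model would typically be loaded with this quantization config before training (the exact loading code is not shown on this card; `prepare_model_for_kbit_training` is the usual QLoRA pattern):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import prepare_model_for_kbit_training

base_model = "nbeerbower/Mahou-1.5-mistral-nemo-12B-lorablated"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,  # 4-bit NF4 config defined above
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # prepares the quantized model for k-bit training
```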
|
|
|
Training config: |
|
```python
from trl import ORPOConfig

new_model = "mistral-nemo-kartoffel-12B"  # run name (assumed here; defined elsewhere in the training script)

orpo_args = ORPOConfig(
    run_name=new_model,
    learning_rate=8e-6,
    lr_scheduler_type="linear",
    max_length=2048,
    max_prompt_length=1024,
    max_completion_length=1024,
    beta=0.1,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=1,
    optim="paged_adamw_8bit",
    num_train_epochs=2,
    evaluation_strategy="steps",
    eval_steps=0.2,
    logging_steps=1,
    warmup_steps=10,
    max_grad_norm=10,
    report_to="wandb",
    output_dir="./results/",
    bf16=True,
    gradient_checkpointing=True,
)
```
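
For completeness, a sketch of how the pieces above would be wired together with TRL's `ORPOTrainer`. The dataset handling below is an assumption (the three preference datasets listed in the metadata, concatenated and split); the trainer attaches the LoRA adapters from `peft_config` itself:

```python
from datasets import load_dataset, concatenate_datasets
from trl import ORPOTrainer

# Assumed dataset preparation: concatenate the preference datasets listed above
# (each is expected to provide prompt/chosen/rejected columns) and hold out a small eval split.
dataset = concatenate_datasets([
    load_dataset("nbeerbower/Schule-DPO", split="train"),
    load_dataset("nbeerbower/Arkhaios-DPO", split="train"),
    load_dataset("nbeerbower/Purpura-DPO", split="train"),
]).train_test_split(test_size=0.01)

trainer = ORPOTrainer(
    model=model,                  # quantized base model from the loading sketch
    args=orpo_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    peft_config=peft_config,      # LoRA adapters are attached by the trainer
    tokenizer=tokenizer,          # processing_class= in newer TRL versions
)
trainer.train()
```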
|
|