	---
	library_name: peft
	tags:
	- trl
	- dpo
	- alignment-handbook
	- generated_from_trainer
	base_model: NbAiLab/nb-gpt-j-6B-v2
	model-index:
	- name: aftonposten-6b-align-scan
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# aftonposten-6b-align-scan

	This model is a fine-tuned version of [NbAiLab/nb-gpt-j-6B-v2](https://huggingface.co/NbAiLab/nb-gpt-j-6B-v2) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
	It achieves the following results on the evaluation set:
	- Loss: 1.6201
	- Rewards/chosen: 0.0327
	- Rewards/rejected: 0.0149
	- Rewards/accuracies: 0.5249
	- Rewards/margins: 0.0178
	- Logps/rejected: -37.4793
	- Logps/chosen: -33.9527
	- Logits/rejected: -2.2332
	- Logits/chosen: -2.2381
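The reward figures above are related to one another. The sketch below assumes the usual DPO logging convention (as used by TRL, which the `trl` and `dpo` tags point to): `Rewards/margins` is the mean of chosen minus rejected rewards, and `Rewards/accuracies` is the fraction of pairs whose chosen reward exceeds the rejected one. The helper name `margin_is_consistent` is hypothetical and only illustrates the check.

```python
# Evaluation-set values copied from the summary list above.
summary = {
    "rewards/chosen": 0.0327,
    "rewards/rejected": 0.0149,
    "rewards/margins": 0.0178,
    "rewards/accuracies": 0.5249,
}


def margin_is_consistent(chosen, rejected, margin, tol=5e-4):
    """Return True if margin matches chosen - rejected up to rounding.

    Assumption: margins are logged as mean(chosen - rejected), which equals
    mean(chosen) - mean(rejected), so the reported means should agree.
    """
    return abs((chosen - rejected) - margin) <= tol


implied_margin = summary["rewards/chosen"] - summary["rewards/rejected"]
print(f"implied margin:  {implied_margin:.4f}")
print(f"reported margin: {summary['rewards/margins']:.4f}")
print("consistent:", margin_is_consistent(
    summary["rewards/chosen"],
    summary["rewards/rejected"],
    summary["rewards/margins"],
))

# An accuracy of 0.5 corresponds to chance for pairwise preferences, so this
# shows how far the reported accuracy sits above that baseline.
print(f"accuracy above chance: {summary['rewards/accuracies'] - 0.5:.4f}")
```

Under that convention the reported accuracy of 0.5249 is only slightly above the 0.5 chance level for pairwise comparisons.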

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-06
	- train_batch_size: 4
	- eval_batch_size: 8
	- seed: 42
	- distributed_type: multi-GPU
	- gradient_accumulation_steps: 2
	- total_train_batch_size: 8
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_ratio: 0.1
	- num_epochs: 4
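To make the `cosine` scheduler with `lr_scheduler_warmup_ratio: 0.1` concrete, here is a minimal sketch of a linear-warmup, cosine-decay schedule peaking at the listed `learning_rate` of 5e-06. It follows the common formulation of this schedule and is not taken from the training code itself. The total step count is an assumption: the card does not state it, and 1540 is only an estimate extrapolated from the step/epoch pairs in the results table.

```python
import math


def cosine_lr(step, total_steps, peak_lr=5e-6, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to peak_lr over the warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# ASSUMPTION: roughly 385 optimizer steps per epoch over 4 epochs.
ASSUMED_TOTAL_STEPS = 1540

for step in (0, 100, 154, 500, 1000, 1500, ASSUMED_TOTAL_STEPS):
    print(f"step {step:>4}: lr = {cosine_lr(step, ASSUMED_TOTAL_STEPS):.3e}")
```

With these assumptions, the first evaluation at step 100 would still fall inside the warmup phase.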

	### Training results

	| Training Loss | Epoch | Step | Logits/chosen | Logits/rejected | Logps/chosen | Logps/rejected | Validation Loss | Rewards/accuracies | Rewards/chosen | Rewards/margins | Rewards/rejected |
	|:-------------:|:-----:|:----:|:-------------:|:---------------:|:------------:|:--------------:|:---------------:|:------------------:|:--------------:|:---------------:|:----------------:|
	| 1.4583 | 0.26 | 100 | -2.2357 | -2.2308 | -34.0303 | -37.5236 | 1.6162 | 0.5245 | 0.0017 | 0.0045 | -0.0028 |
	| 1.279 | 0.52 | 200 | -2.2359 | -2.2311 | -34.0825 | -37.5923 | 1.6100 | 0.5257 | -0.0192 | 0.0111 | -0.0303 |
	| 1.0043 | 0.78 | 300 | -2.2312 | -2.2263 | -34.0845 | -37.6004 | 1.5962 | 0.5166 | -0.0200 | 0.0135 | -0.0335 |
	| 0.7239 | 1.04 | 400 | -2.2324 | -2.2276 | -34.0893 | -37.5945 | 1.6461 | 0.5341 | -0.0219 | 0.0092 | -0.0311 |
	| 0.6061 | 1.3 | 500 | -2.2330 | -2.2282 | -34.1030 | -37.6239 | 1.6487 | 0.5395 | -0.0274 | 0.0155 | -0.0429 |
	| 0.9255 | 1.56 | 600 | -2.2321 | -2.2273 | -34.0074 | -37.5464 | 1.5912 | 0.5544 | 0.0108 | 0.0228 | -0.0119 |
	| 0.8252 | 1.82 | 700 | -2.2346 | -2.2298 | -33.9781 | -37.5053 | 1.6334 | 0.5216 | 0.0226 | 0.0180 | 0.0045 |
	| 0.2848 | 2.08 | 800 | -2.2361 | -2.2313 | -33.9964 | -37.5244 | 1.6033 | 0.5249 | 0.0153 | 0.0184 | -0.0031 |
	| 0.3671 | 2.34 | 900 | -2.2358 | -2.2309 | -33.9637 | -37.4723 | 1.6569 | 0.5162 | 0.0283 | 0.0106 | 0.0177 |
	| 0.3936 | 2.6 | 1000 | -2.2374 | -2.2325 | -33.9475 | -37.4698 | 1.6203 | 0.5428 | 0.0348 | 0.0161 | 0.0187 |
	| 0.3156 | 2.86 | 1100 | -2.2375 | -2.2326 | -33.9592 | -37.4896 | 1.6012 | 0.5606 | 0.0302 | 0.0194 | 0.0108 |
	| 0.2893 | 3.12 | 1200 | -2.2383 | -2.2335 | -33.9480 | -37.4909 | 1.5705 | 0.5365 | 0.0346 | 0.0243 | 0.0103 |
	| 0.277 | 3.38 | 1300 | -2.2382 | -2.2333 | -33.9559 | -37.4865 | 1.6102 | 0.5403 | 0.0314 | 0.0194 | 0.0121 |
	| 0.139 | 3.64 | 1400 | -2.2381 | -2.2333 | -33.9663 | -37.4937 | 1.6181 | 0.5307 | 0.0273 | 0.0181 | 0.0092 |
	| 0.24 | 3.9 | 1500 | -2.2381 | -2.2332 | -33.9527 | -37.4793 | 1.6201 | 0.5249 | 0.0327 | 0.0178 | 0.0149 |
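The evaluation metrics fluctuate across checkpoints rather than improving steadily, so it can help to rank the logged steps. The sketch below copies the validation loss, reward accuracy, and reward margin for each step from the table, assigning values to metrics so that the step-1500 row agrees with the evaluation summary at the top of the card. The variable and function names are illustrative, not part of any library.

```python
# (step, validation_loss, rewards_accuracy, rewards_margin), from the table.
EVAL_LOG = [
    (100, 1.6162, 0.5245, 0.0045),
    (200, 1.6100, 0.5257, 0.0111),
    (300, 1.5962, 0.5166, 0.0135),
    (400, 1.6461, 0.5341, 0.0092),
    (500, 1.6487, 0.5395, 0.0155),
    (600, 1.5912, 0.5544, 0.0228),
    (700, 1.6334, 0.5216, 0.0180),
    (800, 1.6033, 0.5249, 0.0184),
    (900, 1.6569, 0.5162, 0.0106),
    (1000, 1.6203, 0.5428, 0.0161),
    (1100, 1.6012, 0.5606, 0.0194),
    (1200, 1.5705, 0.5365, 0.0243),
    (1300, 1.6102, 0.5403, 0.0194),
    (1400, 1.6181, 0.5307, 0.0181),
    (1500, 1.6201, 0.5249, 0.0178),
]


def best_step(log, column, lower_is_better):
    """Return the (step, value) pair that is best for the given column index."""
    pick = min if lower_is_better else max
    row = pick(log, key=lambda r: r[column])
    return row[0], row[column]


best_loss = best_step(EVAL_LOG, column=1, lower_is_better=True)
best_acc = best_step(EVAL_LOG, column=2, lower_is_better=False)
best_margin = best_step(EVAL_LOG, column=3, lower_is_better=False)

print("lowest validation loss:", best_loss)
print("highest reward accuracy:", best_acc)
print("largest reward margin:", best_margin)
```

By these numbers the final checkpoint at step 1500 is not the best on any of the three metrics, which may matter when choosing which checkpoint to evaluate further.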


	### Framework versions

	- PEFT 0.10.0
	- Transformers 4.39.0.dev0
	- PyTorch 2.1.2+cu121
	- Datasets 2.14.6
	- Tokenizers 0.15.1
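Because the metadata lists `library_name: peft` and a `base_model`, the weights are presumably a PEFT adapter applied on top of NbAiLab/nb-gpt-j-6B-v2. The following is a minimal loading sketch under that assumption. The adapter repository id is a hypothetical placeholder, since the card does not give the full repo path, and the choice of float16 is likewise an assumption.

```python
BASE_MODEL_ID = "NbAiLab/nb-gpt-j-6B-v2"  # from the card's `base_model` field
ADAPTER_ID = "your-org/aftonposten-6b-align-scan"  # HYPOTHETICAL repo id


def load_adapter(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the PEFT adapter (sketch only)."""
    # Heavy dependencies are imported lazily so that defining this function
    # does not require them to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.float16,  # assumption: half precision to save memory
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return model, tokenizer


# Example usage (downloads several GB of weights, so it is left commented out):
# model, tokenizer = load_adapter()
# inputs = tokenizer("Oslo er", return_tensors="pt")
# output_ids = model.generate(**inputs, max_new_tokens=20)
# print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Pinning library versions close to those listed above is a reasonable precaution, since adapter loading behavior can differ between PEFT releases.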