ymcki committed
Commit 60bd8d2 · 1 parent: 49edce9

8 epoches README

Files changed (1): README.md (+20 -16)
README.md CHANGED
@@ -38,20 +38,23 @@ This is abliterated model of [google/gemma-2-2b-jpn-it](https://huggingface.co/g
  described by mlabonne.
 
  Layer 17 of the original model was chosen for abliteration.
- I also created another layer 18 abliterated model for comparison.
-
- ORPO fine tuning was performed for eight epoches.
-
- | Epoch | loss | eval_loss |
- | ----- | ---- | --------- |
- | 1 | 1.20152769684791564 | 1.0501047372817993 |
- | 2 | 1.25755584239959716 | 1.0144596099853516 |
- | 3 | 0.93099724054336543 | 0.9957754611968994 |
- | 4 | 0.88664623498916623 | 0.9857067465782166 |
- | 5 | 0.86961059570312504 | 1.0203918218612670 |
- | 6 | 0.98065975904464630 | 0.9958684444427490 |
- | 7 | 0.38512575328350068 | 0.9686505198478699 |
- | 8 | 1.41178888082504270 | 0.9652527570724487 |
+ I also created layer 18 and layer 24 abliterated models for comparison.
+
+ ORPO fine tuning was performed for eight epochs. The lowest eval_loss was at epoch 7.48.
+ The checkpoint at epoch 7.48 was chosen to generate this model.
+
+ | Epoch | loss | eval_loss | eval logps/rejected | eval logps/chosen |
+ | ----- | ------ | --------- | ------------------- | ----------------- |
+ | 1.00 | 1.2015 | 1.0501 | -1.0451 | -0.7449 |
+ | 2.00 | 1.2576 | 1.0145 | -1.1346 | -0.7248 |
+ | 3.00 | 0.9310 | 0.9958 | -1.2629 | -0.7332 |
+ | 4.00 | 0.8866 | 0.9857 | -1.2231 | -0.7019 |
+ | 5.00 | 0.8696 | 1.0204 | -1.2242 | -0.7523 |
+ | 6.00 | 0.9807 | 0.9959 | -1.3093 | -0.7257 |
+ | 7.00 | 0.3851 | 0.9687 | -1.3826 | -0.7103 |
+ | 7.48 | 1.2072 | 0.9638 | -1.4512 | -0.6959 |
+ | 8.00 | 1.4118 | 0.9653 | -1.5047 | -0.6990 |
 
  The fine tuned model is uploaded here to be evaluated by the Open LLM Leaderboard to see if the slightly brain damaged non-ORPO model can be healed. Again, the fine tuning method is also based on one described by [mlabonne](https://towardsdatascience.com/fine-tune-llama-3-with-orpo-56cfab2f9ada) but the input model was read into VRAM by [unsloth](https://github.com/unslothai/unsloth) to allow using the full 40k dataset to run on a single 3090.
 
@@ -63,12 +66,13 @@ Click on the model name go to the raw score json generated by Open LLM Leaderboa
  | ----- | ------- | ------ | ---- | --------- | ---- | ---- | -------- |
  | [gemma-2-2b-jpn-it](https://huggingface.co/datasets/open-llm-leaderboard/results/blob/main/google/gemma-2-2b-jpn-it/results_2024-10-15T15-21-39.173019.json) | 30.82 | 54.11 | 41.43 | 0.0 | 27.52 | 37.17 | 24.67 |
  | [gemma-2-2b-jpn-it-abliterated-17-ORPO (4 epochs)](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO/results_2024-10-20T02-46-59.069357.json) | 29.99 | 50.94 | 38.59 | 2.87 | 27.43 | 38.23 | 21.86 |
- | gemma-2-2b-jpn-it-abliterated-17-ORPO (8 epoches) | TBD | TBD | TBD | TBD | TBD | TBD | TBD |
+ | [gemma-2-2b-jpn-it-abliterated-17-ORPO (8 epochs)](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17-ORPO/results_2024-10-24T00-00-00.000000.json) | 29.42 | 48.95 | 38.27 | 3.17 | 26.93 | 37.43 | 21.77 |
  | [gemma-2-2b-jpn-it-abliterated-18-ORPO (4 epochs)](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-18-ORPO/results_2024-10-22T04-04-56.385050.json) | 29.94 | 48.97 | 40.18 | 3.02 | 26.17 | 39.42 | 21.85 |
  | [gemma-2-2b-jpn-it-abliterated-17](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-17/results_2024-10-18T15-18-46.821674.json) | 30.29 | 52.65 | 40.46 | 0.0 | 27.18 | 36.90 | 24.55 |
  | [gemma-2-2b-jpn-it-abliterated-18](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-18/results_2024-10-18T15-41-42.399571.json) | 30.61 | 53.02 | 40.96 | 0.0 | 27.35 | 37.30 | 25.05 |
+ | [gemma-2-2b-jpn-it-abliterated-24](https://huggingface.co/datasets/open-llm-leaderboard/results/raw/main/ymcki/gemma-2-2b-jpn-it-abliterated-24/results_2024-10-25T16-29-46.542899.json) | 30.61 | 51.37 | 40.77 | 0.0 | 27.77 | 39.02 | 24.73 |
 
- Looks like fine tuning is probably not enough. May need to run more epoches.
+ Looks like fine tuning for 8 epochs is still not enough. May need to run more epochs.
 
  ## How to run this model
 
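For context on the abliteration step referenced in the diff: the method estimates a "refusal direction" from hidden states at the chosen layer (layer 17 here, roughly the mean activation difference between harmful and harmless prompts) and then removes that direction from the weight matrices that write into the residual stream. The snippet below is only a minimal sketch of that orthogonalization step under those assumptions; the direction file name and the exact set of edited matrices are illustrative, not the actual script used for this model.

```python
# Minimal sketch of the abliteration weight edit, assuming a refusal direction
# has already been estimated from layer-17 hidden states and saved to disk.
import torch
from transformers import AutoModelForCausalLM

def orthogonalize(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component along `direction` from a matrix that writes into the residual stream."""
    d = direction / direction.norm()
    return weight - torch.outer(d, d @ weight)   # W - d (d^T W)

model = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-jpn-it", torch_dtype=torch.float32)
# Hypothetical precomputed refusal direction, shape (hidden_size,)
refusal_dir = torch.load("refusal_dir_layer17.pt").float()

with torch.no_grad():
    for layer in model.model.layers:
        # Attention output and MLP down projections both write into the residual stream.
        layer.self_attn.o_proj.weight.copy_(orthogonalize(layer.self_attn.o_proj.weight, refusal_dir))
        layer.mlp.down_proj.weight.copy_(orthogonalize(layer.mlp.down_proj.weight, refusal_dir))

model.save_pretrained("gemma-2-2b-jpn-it-abliterated-17")
```

The ORPO fine tuning described in the diff then tries to recover whatever capability this weight edit damaged.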
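The ORPO fine tuning itself follows mlabonne's recipe, with unsloth loading the model into VRAM so the full 40k-example preference dataset fits on a single 3090. Below is a rough, self-contained sketch of how such a run can be set up with unsloth and TRL's ORPOTrainer; the dataset name, LoRA settings, and hyperparameters are assumptions for illustration and may not match the configuration actually used.

```python
# Rough sketch of an ORPO fine-tuning run with unsloth + TRL (assumed setup:
# LoRA adapters, 4-bit loading, mlabonne/orpo-dpo-mix-40k preference data).
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import ORPOConfig, ORPOTrainer

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="ymcki/gemma-2-2b-jpn-it-abliterated-17",
    max_seq_length=2048,
    load_in_4bit=True,                      # keeps the full 40k-sample run within a 24 GB 3090
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Preference pairs with prompt / chosen / rejected columns; depending on the TRL
# version, the chosen/rejected conversations may first need the chat template applied.
dataset = load_dataset("mlabonne/orpo-dpo-mix-40k", split="train")
dataset = dataset.train_test_split(test_size=0.01, seed=42)

args = ORPOConfig(
    output_dir="gemma-2-2b-jpn-it-abliterated-17-ORPO",
    num_train_epochs=8,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=8e-6,
    beta=0.1,                               # weight of the odds-ratio preference term
    max_length=1024,
    max_prompt_length=512,
    eval_strategy="steps",
    eval_steps=500,
    save_steps=500,                         # intermediate checkpoints allow picking the lowest-eval_loss step
    logging_steps=100,
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,                    # newer TRL versions use processing_class= instead
)
trainer.train()
```

With periodic evaluation like this, the checkpoint with the lowest eval_loss (epoch 7.48 in the table above) can be merged and uploaded instead of the final epoch-8 weights.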