adamo1139 committed commit a7be7fd (verified) · Parent: 3b01473

Update README.md

---
license: other
license_name: yi-license
license_link: LICENSE
---
Yi-34B-200K (llamafied), DPO-trained via unsloth for 1 epoch (about 961 steps total) on the rawrr_v2 dataset, with max_prompt_length 400, max_length 700, and learning rate 0.000045. \
The model was initialized with max_position_embeddings of 4096 so it wouldn't OOM. \
Training took about 14 hours on an RTX 3090 Ti. \
Average memory usage was around 23.89 / 23.99 GiB, so it sat very close to OOM the whole time. \
I trained with XFCE running on a single 1080p monitor; a fancier desktop environment would probably OOM with the same setup. \
I am not sure why max_prompt_length is separate from max_length, so I may have used it wrong; I should read up on it.
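My current understanding (an assumption, not checked against the TRL/unsloth source) is that max_prompt_length caps the prompt tokens on their own, while max_length caps the prompt plus completion together, so the completion gets whatever room is left. A toy sketch with made-up token lists, using the 400/700 values from this run:

```python
MAX_PROMPT_LENGTH = 400  # the value I used for prompt length
MAX_LENGTH = 700         # the value I used for total length

def truncate_pair(prompt_ids, completion_ids,
                  max_prompt_length=MAX_PROMPT_LENGTH, max_length=MAX_LENGTH):
    """Cap the prompt first, then cap the whole prompt+completion sequence.

    Keeping the *end* of an over-long prompt is an assumption here (the
    instruction usually sits at the end); real trainers may differ.
    """
    prompt_ids = prompt_ids[-max_prompt_length:]
    room = max_length - len(prompt_ids)   # tokens left for the completion
    completion_ids = completion_ids[:room]
    return prompt_ids, completion_ids

# A 500-token prompt and 600-token completion get squeezed to 400 + 300.
prompt, completion = truncate_pair(list(range(500)), list(range(600)))
print(len(prompt), len(completion))  # 400 300
```

Under this reading, a short prompt leaves more room for the completion, which is why the two limits are separate knobs rather than one.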