euclaise/crow-1b-attempt1


euclaise committed on Jan 9, 2024

Commit 9fac2b5 · 1 Parent(s): 2af004d

Update README.md

Files changed (1)

README.md +10 -3


@@ -5,10 +5,17 @@ datasets:
 - euclaise/prm800k_preferences
 ---
-Expirements in preference learning.
-Trained with PRO on SuperMC and PRM800K for 3 epochs, using my supertrainer2000 framework.
+Expirements in large-scale preference learning.
+Trained with PRO (preference ranking optimization, see https://arxiv.org/abs/2306.17492) on SuperMC and PRM800K for 3 epochs, using my supertrainer2000 framework.
 This is an expiremental model.
-Benchmarks coming soon.
+Benchmarks coming soon.
+Hyperparameters:
+- AdamW, weight decay of 0.01, otherwise default hyperparams
+- Maximum LR of 1e-5
+- Cosine schedule with a warmup of 5400 steps
+- Batch size of 4 (2 real x 2 accumulated)
+- Maximum of 5 epochs, early stopping (visual observation), stopped after 3
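
The hyperparameters added in this commit describe a warmup-then-cosine learning-rate schedule and a batch size built from gradient accumulation. The following is a minimal sketch of how those numbers fit together, not the supertrainer2000 implementation. Only the peak LR (1e-5), the warmup length (5400 steps), and the 2 x 2 batch split come from the README. The total step count (`TOTAL_STEPS`), the linear warmup shape, and the decay to zero are assumptions made for illustration.

```python
import math

MAX_LR = 1e-5          # "Maximum LR of 1e-5" (from the README)
WARMUP_STEPS = 5400    # "warmup of 5400 steps" (from the README)
MICRO_BATCH = 2        # "2 real" (from the README)
GRAD_ACCUM = 2         # "x 2 accumulated" (from the README)
TOTAL_STEPS = 100_000  # hypothetical: the README does not state the total step count


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step.

    Assumes linear warmup from 0 to MAX_LR, then cosine decay to 0.
    The README only says "cosine schedule with a warmup"; the exact
    warmup shape and final LR are assumptions.
    """
    if step < WARMUP_STEPS:
        return MAX_LR * step / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    progress = min(progress, 1.0)
    return 0.5 * MAX_LR * (1.0 + math.cos(math.pi * progress))


# Effective batch size: samples per forward pass times accumulation steps.
effective_batch = MICRO_BATCH * GRAD_ACCUM

if __name__ == "__main__":
    for s in (0, 2700, 5400, 50_000, TOTAL_STEPS):
        print(f"step {s:>6}: lr = {lr_at(s):.3e}")
    print("effective batch size:", effective_batch)
```

With these assumptions the rate climbs linearly to 1e-5 over the first 5400 steps and then decays along a half cosine. Since training was stopped early after 3 of a maximum of 5 epochs, a schedule sized for the full run would not have reached the end of its decay.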