---
license: apache-2.0
datasets:
- euclaise/SuperMC
- euclaise/prm800k_preferences
---

Experiments in preference learning.

Trained with PRO on SuperMC and PRM800K for 3 epochs, using my supertrainer2000 framework.

This is an experimental model. Benchmarks coming soon.
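
For readers unfamiliar with the training objective, here is a minimal sketch of a list-wise ranking loss, assuming "PRO" refers to Preference Ranking Optimization (this card does not spell out the acronym). The function name `pro_ranking_loss` and the plain-float `scores` input are hypothetical and chosen for illustration; this is not the supertrainer2000 implementation, and it omits the SFT term that is usually added on the top-ranked response.

```python
import math

def pro_ranking_loss(scores):
    """Plackett-Luce style ranking loss over candidates ordered best to worst.

    `scores` is a hypothetical list of policy scores, e.g. length-normalized
    log-probabilities of each candidate response for the same prompt.
    """
    loss = 0.0
    # At each rank k, the k-th candidate should beat every lower-ranked one.
    for k in range(len(scores) - 1):
        tail = scores[k:]
        # Numerically stable log-sum-exp over the remaining candidates.
        m = max(tail)
        log_denom = m + math.log(sum(math.exp(s - m) for s in tail))
        loss -= scores[k] - log_denom
    return loss
```

With two candidates this reduces to the familiar pairwise form `-log(sigmoid(score_best - score_worst))`, which is the relevant case if each preference example is a chosen/rejected pair.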