agi-css/socially-good-lm

agi-css committed on May 27, 2023

Commit 49a6b3c · 1 Parent(s): c677efc

Update README.md

Files changed (1)

README.md +4 -2

@@ -21,7 +21,7 @@ tags:
 ![model image](https://agwarbliu.s3.amazonaws.com/model_select_ours.png)
-**Fast, Effective, and Stable alternative of RLHF!**
+**Efficient, Effective, and Stable alternative of RLHF!**
 **Instead of training an additional reward model that is likely to be gamed, we directly train the model on the social games!** 🕹️ 🎲 🎮
@@ -29,7 +29,9 @@ Full details on simulation and training can be found [here](https://github.com/a
 # Training Procedure
-Trained on 8xA100s for 3H. The start checkpoint is the [SFT model](https://huggingface.co/agi-css/hh-rlhf-sft)). We have also released the [better-base model](https://huggingface.co/agi-css/better-base) which is the start checkpoint of SFT.
+Trained on 8xA100s for 3H. The start checkpoint is the [SFT model](https://huggingface.co/agi-css/hh-rlhf-sft)).
+We have also released the [better-base model](https://huggingface.co/agi-css/better-base) which is the start checkpoint of SFT.
 Here is the training script: