Holarissun committed
Commit e839b99
1 Parent(s): 5bca5a7
Update README.md
README.md
CHANGED
@@ -59,4 +59,15 @@ The following hyperparameters were used during training:
 - Transformers 4.36.2
 - Pytorch 2.1.2
 - Datasets 2.15.0
-- Tokenizers 0.15.0
+- Tokenizers 0.15.0
+
+### BibTex Citation
+If you would like to cite our paper when using the model, please use
+```
+@article{sun2024supervised,
+title={Supervised Fine-Tuning as Inverse Reinforcement Learning},
+author={Sun, Hao},
+journal={arXiv preprint arXiv:2403.12017},
+year={2024}
+}
+```