Update README.md
README.md
CHANGED
@@ -303,7 +303,7 @@ https://github.com/sileod/tasknet/ \
 Training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
 Training took 7 days on RTX6000 24GB gpu.
 
-This is the shared model with the MNLI classifier on top.
+This is the shared model with the MNLI classifier on top. Datasets including bigbench, Anthropic rlhf, anli... alongside many NLI and classification.
 Each task had a specific CLS embedding, which is dropped 10% of the time to facilitate model use without it. All multiple-choice model used the same classification layers. For classification tasks, models shared weights if their labels matched.
 The number of examples per task was capped to 64k. The model was trained for 45k steps with a batch size of 384, and a peak learning rate of 2e-5.
 
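
The training description in the changed README lines can be illustrated with a small sketch. This is not the tasknet implementation. It is a plain-Python illustration under stated assumptions: the helper names (`cap_examples`, `choose_cls_token`, `shared_head_key`, `examples_seen`) are hypothetical, the cap is assumed to be a random subsample, a dropped task embedding is assumed to fall back to a generic `[CLS]` token, and label matching is assumed to be order-sensitive. Only the constants (64k cap, 10% drop rate, 45k steps, batch size 384) come from the README text.

```python
import random

# Values stated in the README text.
MAX_EXAMPLES_PER_TASK = 64_000
TASK_EMBEDDING_DROP_RATE = 0.10
TRAIN_STEPS = 45_000
BATCH_SIZE = 384


def cap_examples(examples, cap=MAX_EXAMPLES_PER_TASK, seed=0):
    """Return at most `cap` examples (assumption: random subsample)."""
    examples = list(examples)
    if len(examples) <= cap:
        return examples
    return random.Random(seed).sample(examples, cap)


def choose_cls_token(task_name, rng, drop_rate=TASK_EMBEDDING_DROP_RATE):
    """Use the task-specific CLS token, except with probability `drop_rate`.

    Assumption: when the task embedding is dropped, a generic token is
    used instead, so the model also learns to work without task identity.
    """
    if rng.random() < drop_rate:
        return "[CLS]"
    return f"[CLS:{task_name}]"


def shared_head_key(task_type, labels=()):
    """Key used to decide which tasks share a classification head.

    All multiple-choice tasks map to one key; classification tasks share
    a key only when their label lists match (assumption: same order).
    """
    if task_type == "multiple_choice":
        return ("multiple_choice",)
    return ("classification", tuple(labels))


def examples_seen(steps=TRAIN_STEPS, batch_size=BATCH_SIZE):
    """Total training examples processed: steps times batch size."""
    return steps * batch_size


if __name__ == "__main__":
    rng = random.Random(0)
    print(choose_cls_token("mnli", rng))
    print(examples_seen())  # 45,000 * 384 = 17,280,000
```

With these numbers, the run processes 45,000 × 384 = 17,280,000 examples in total, so the 64k per-task cap mainly limits how much any single large dataset can dominate the mixture.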