_NOTE: this model card is a WIP_
GPT2-L (774M parameters) fine-tuned on the Wizard of Wikipedia dataset for 40k steps with 34 of 36 layers frozen using `aitextgen`. The model was then further fine-tuned on the [Daily Dialogues](http://yanran.li/dailydialog) dataset for an additional 40k steps, this time with **35** of 36 layers frozen.

Designed for use with [ai-msgbot](https://github.com/pszemraj/ai-msgbot) to create an open-ended chatbot (of course, if other use cases arise, have at it).

## conversation data

The dataset was tokenized and fed to the model as a conversation between two speakers, whose names are given below. This is relevant for writing prompts and for filtering/extracting text from responses.

`script_speaker_name` = `person alpha`

`script_responder_name` = `person beta`
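
The exact serialization ai-msgbot uses is not reproduced here, but a minimal sketch of turning a list of dialogue turns into this two-speaker script format (the line layout is an assumption; only the speaker names come from above) might look like:

```python
# Sketch: serialize alternating dialogue turns into a two-speaker script.
# The precise layout ai-msgbot uses during training may differ.
SPEAKER = "person alpha"
RESPONDER = "person beta"

def to_script(turns):
    """Label alternating utterances with the speaker/responder names."""
    lines = []
    for i, utterance in enumerate(turns):
        name = SPEAKER if i % 2 == 0 else RESPONDER
        lines.append(f"{name}:\n{utterance}\n")
    return "\n".join(lines)

print(to_script(["hi, how are you?", "doing well, thanks!"]))
```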

## examples

- the default inference API examples should work _okay_
- an ideal test is to explicitly add `person beta` to the **end** of the prompt text, so the model is forced to respond to the entered prompt rather than extend it.
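
Following that tip, here is a hedged sketch of building a prompt that ends with `person beta` and extracting only the responder's reply from the generated text (the generation step itself is stubbed out; the output format is an assumption):

```python
# Sketch: prompt construction and reply extraction around the two speaker
# names. How the real model formats its output may differ.
SPEAKER = "person alpha"
RESPONDER = "person beta"

def build_prompt(user_text):
    # End the prompt with the responder's name so the model completes
    # *their* turn instead of extending the user's text.
    return f"{SPEAKER}:\n{user_text}\n\n{RESPONDER}:\n"

def extract_reply(generated, prompt):
    # Keep only the text the model added after the prompt, and cut it off
    # if the model starts writing the next speaker's turn.
    continuation = generated[len(prompt):]
    return continuation.split(f"{SPEAKER}:")[0].strip()

prompt = build_prompt("what is a chatbot?")
# pretend `generated` came back from the model (prompt + completion)
generated = prompt + "a program that talks with you.\n\nperson alpha:\nneat!"
print(extract_reply(generated, prompt))  # -> a program that talks with you.
```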