mylesgoose
commited on
Commit
•
0b4ca29
1
Parent(s):
831cec6
Update README.md
Browse files
README.md
CHANGED
@@ -1,5 +1,41 @@
|
|
1 |
-
---
|
2 |
-
license: other
|
3 |
-
license_name: other
|
4 |
-
license_link: https://ai.meta.com/llama/license
|
5 |
-
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
license: other
|
3 |
+
license_name: other
|
4 |
+
license_link: https://ai.meta.com/llama/license
|
5 |
+
---
|
6 |
+
Repairing the chat template for the model.
|
7 |
+
There is a slight problem with your chat template. If you train a model with that current chat template the model starts to output as the first token <|eot_id|> and naturally the script will then halt generation. the model learns to see this:
|
8 |
+
|
9 |
+
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
|
10 |
+
|
11 |
+
<|image|>If I had to write a haiku for this one, it would be: <|eot_id|><|start_header_id|>assistant<|end_header_id|>
|
12 |
+
|
13 |
+
Here is a haiku for the image:
|
14 |
+
|
15 |
+
Rabbit in a coat
|
16 |
+
Dapper and dignified
|
17 |
+
Country cottage charm<|eot_id|>
|
18 |
+
|
19 |
+
and so the model learns to do this in its first output:
|
20 |
+
<|eot_id|><|start_header_id|>assistant<|end_header_id|>
|
21 |
+
which naturally messes up the training. can you please put a new line character after the eot_id or prior to the start header id in the chat template: so that the format is like so :
|
22 |
+
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
|
23 |
+
|
24 |
+
<|image|>If I had to write a haiku for this one, it would be: <|eot_id|>
|
25 |
+
|
26 |
+
<|start_header_id|>assistant<|end_header_id|>
|
27 |
+
|
28 |
+
this results in a clearer distinction between the end of the user message and the start of the models.
|
29 |
+
<|begin_of_text|>
|
30 |
+
<|start_header_id|>system<|end_header_id|>
|
31 |
+
|
32 |
+
Today Date: 26 Sep 2024
|
33 |
+
|
34 |
+
You are a helpful language and vision assistant. You are able to understand the visual content that the user provides, and assist the user with a variety of tasks using natural language.<|eot_id|>
|
35 |
+
<|start_header_id|>user<|end_header_id|>
|
36 |
+
|
37 |
+
If I had to write a haiku for this one, it would be:<|eot_id|>#notice that this is sending the message to the next line now. which forms a clear distinction for the model. if you train a model with your current prompt it just outputs[ ]
|
38 |
+
<|start_header_id|>assistant<|end_header_id|>
|
39 |
+
|
40 |
+
['A rabbit on a sunny day']
|
41 |
+
this is an example of the 3.1 models chat template. i have not examined your one yet however i have examined the output of it above. to prevent the model learning that eot comes first there need to be a clearer distinction made with a \n
|