euclaise committed · verified
Commit 02856a3 · 1 Parent(s): bfcd1ce

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -31,7 +31,7 @@ In particular, it might be easy to predict a *reasonable* next token, but much m
 > The astronomer pointed his telescope at the distant star, hoping to see
 
 The correct prediction here might be "signs of life". However, the model might predict "and" rather than "signs", since "and" is *reasonable* in the immediate context - it's grammatically correct, but implies a strange ending to the sentence.
-As a result, the model might end up with something like "The astronomer pointed his telescope at the distant star, hoping to see and hear." - which makes little sense.
+As a result, the model might end up with something like *"The astronomer pointed his telescope at the distant star, hoping to see and hear."* - which makes little sense.
 
 SPIN's advantage over SFT likely comes from its partial mitigation of exposure bias.
 SPIN doesn't only train the model to predict the next token accurately, it repeatedly trains the model to identify and fix discrepancies between its generations and the ground-truth.
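
For context, SPIN's objective can be sketched as a DPO-style logistic loss in which the ground-truth response is preferred over a response generated by the previous iteration's model. The snippet below is a minimal, illustrative sketch under that assumption, using a Hugging Face-style causal LM interface; `spin_step`, `beta`, and the tensor names are hypothetical and are not this repo's actual training code.

```python
import torch
import torch.nn.functional as F

def spin_step(model, ref_model, prompt_ids, gold_ids, beta=0.1):
    """One SPIN-style update on a single batch (illustrative sketch)."""
    # Self-play: the frozen previous-iteration model produces the "rejected"
    # response; the human ground truth is the "chosen" response.
    with torch.no_grad():
        out = ref_model.generate(prompt_ids, max_new_tokens=gold_ids.shape[1])
        synth_ids = out[:, prompt_ids.shape[1]:]  # strip the echoed prompt

    def seq_logp(m, response_ids):
        # Sum of next-token log-probabilities of `response_ids` given the prompt.
        ids = torch.cat([prompt_ids, response_ids], dim=1)
        logits = m(ids).logits[:, prompt_ids.shape[1] - 1 : -1, :]
        logp = torch.log_softmax(logits, dim=-1)
        return logp.gather(-1, response_ids.unsqueeze(-1)).squeeze(-1).sum(-1)

    # Margins of the current policy relative to the frozen reference, as in DPO.
    with torch.no_grad():
        ref_gold = seq_logp(ref_model, gold_ids)
        ref_synth = seq_logp(ref_model, synth_ids)
    gold_margin = seq_logp(model, gold_ids) - ref_gold
    synth_margin = seq_logp(model, synth_ids) - ref_synth

    # Logistic loss: push the ground truth above the model's own generation.
    return -F.logsigmoid(beta * (gold_margin - synth_margin)).mean()
```

Because the "rejected" sequence is sampled from the model itself, each iteration directly penalizes the model's own compounding mistakes, which is the exposure-bias mitigation described above.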