Commit bfcd1ce (verified) by euclaise · 1 Parent(s): 7b1caad

Update README.md

Files changed (1)
  1. README.md +0 -2
README.md CHANGED
@@ -33,8 +33,6 @@ In particular, it might be easy to predict a *reasonable* next token, but much m
  The correct prediction here might be "signs of life.". However, the model might predict "and" rather than "signs", since "and" is *reasonable* in the immediate context - it's gramatically correct, but implies a strange ending to the sentence.
  As a result, the model might end up with something like "The astronomer pointed his telescope at the distant star, hoping to see and hear." - which makes little sense.
 
- ---
-
  SPIN's advantage over SFT likely comes from its partial mitigation of exposure bias.
  SPIN doesn't only train the model to predict the next token accurately, it repeatedly trains the model to identify and fix discrepancies between its generations and the ground-truth.
  In order to do this, the model must implicitly learn to think ahead, as exposure bias is likely what causes many of the discrepancies.
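
The contrast between the model's own generations and the ground truth, as described in the context lines above, can be sketched as a per-example loss. This is a minimal illustration and not the training code behind this model: it assumes the logistic form of the SPIN objective, that sequence log-probabilities have already been computed, and a hypothetical scale `lam=0.1`. The function and argument names are invented for the example.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def spin_loss(logp_real_new: float, logp_real_old: float,
              logp_gen_new: float, logp_gen_old: float,
              lam: float = 0.1) -> float:
    """SPIN-style logistic loss for a single prompt (illustrative sketch).

    logp_real_*: log-probability of the ground-truth response.
    logp_gen_*:  log-probability of a response sampled from the
                 previous-iteration model.
    *_new is scored by the model being trained, *_old by the frozen
    previous-iteration model. lam is an assumed scaling constant.
    """
    # How much more likely each response became relative to the old model.
    real_gain = logp_real_new - logp_real_old
    gen_gain = logp_gen_new - logp_gen_old
    # The loss falls when the ground truth gains more than the self-generation.
    margin = lam * (real_gain - gen_gain)
    return softplus(-margin)


# Raising the ground truth while lowering the old generation reduces the loss.
unchanged = spin_loss(-10.0, -10.0, -12.0, -12.0)
improved = spin_loss(-8.0, -10.0, -14.0, -12.0)
print(improved < unchanged)
```

When the trained model scores both responses exactly as the previous iteration did, the margin is zero and the loss is log 2. The loss only falls when the model shifts probability toward the ground truth and away from its earlier output, which is the "identify and fix discrepancies" behaviour the text refers to.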
 