euclaise committed · verified
Commit 02856a3 · 1 Parent(s): bfcd1ce

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -31,7 +31,7 @@ In particular, it might be easy to predict a *reasonable* next token, but much m
 > The astronomer pointed his telescope at the distant star, hoping to see
 
 The correct prediction here might be "signs of life". However, the model might predict "and" rather than "signs", since "and" is *reasonable* in the immediate context - it's grammatically correct, but implies a strange ending to the sentence.
-As a result, the model might end up with something like "The astronomer pointed his telescope at the distant star, hoping to see and hear." - which makes little sense.
+As a result, the model might end up with something like *"The astronomer pointed his telescope at the distant star, hoping to see and hear."* - which makes little sense.
 
 SPIN's advantage over SFT likely comes from its partial mitigation of exposure bias.
 SPIN doesn't only train the model to predict the next token accurately, it repeatedly trains the model to identify and fix discrepancies between its generations and the ground-truth.
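
For context, SPIN's objective can be sketched as a DPO-style logistic loss in which the ground-truth response is preferred over a response generated by the previous iteration's model. The snippet below is a minimal, illustrative sketch under that assumption, using a Hugging Face-style causal LM interface; `spin_step`, `beta`, and the tensor names are hypothetical and are not this repo's actual training code.

```python
import torch
import torch.nn.functional as F

def spin_step(model, ref_model, prompt_ids, gold_ids, beta=0.1):
    """One SPIN-style update on a single batch (illustrative sketch)."""
    # Self-play: the frozen previous-iteration model produces the "rejected"
    # response; the human ground truth is the "chosen" response.
    with torch.no_grad():
        out = ref_model.generate(prompt_ids, max_new_tokens=gold_ids.shape[1])
        synth_ids = out[:, prompt_ids.shape[1]:]  # strip the echoed prompt

    def seq_logp(m, response_ids):
        # Sum of next-token log-probabilities of `response_ids` given the prompt.
        ids = torch.cat([prompt_ids, response_ids], dim=1)
        logits = m(ids).logits[:, prompt_ids.shape[1] - 1 : -1, :]
        logp = torch.log_softmax(logits, dim=-1)
        return logp.gather(-1, response_ids.unsqueeze(-1)).squeeze(-1).sum(-1)

    # Margins of the current policy relative to the frozen reference, as in DPO.
    with torch.no_grad():
        ref_gold = seq_logp(ref_model, gold_ids)
        ref_synth = seq_logp(ref_model, synth_ids)
    gold_margin = seq_logp(model, gold_ids) - ref_gold
    synth_margin = seq_logp(model, synth_ids) - ref_synth

    # Logistic loss: push the ground truth above the model's own generation.
    return -F.logsigmoid(beta * (gold_margin - synth_margin)).mean()
```

Because the "rejected" sequence is sampled from the model itself, each iteration directly penalizes the model's own compounding mistakes, which is the exposure-bias mitigation described above.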