How does file-level span corruption work?
#2 opened by zeromquan
The span corruption part is not clear to me. Could you help explain it?
The paper gives only a brief description:
"We choose span corruption as the base infill objective following InCoder (Fried et al., 2022). However, we take a different approach in selecting the spans for corruption: (1) we first sample a dynamic ratio of sequence to mask out, (2) we then sample the span length and mask out locations such that the total number of tokens match the ratio of the original sequence determined earlier."
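
To make the question concrete, here is my rough reading of those two steps as a minimal Python sketch. It is not the authors' implementation. The quoted passage does not specify the distributions or the sentinel format, so the uniform ratio range, the exponential span-length distribution, the `<mask_i>` sentinel strings, and the function names `sample_spans` and `corrupt` are all my own assumptions.

```python
import random


def sample_spans(n_tokens, rng, min_ratio=0.1, max_ratio=0.5, mean_span_len=3):
    """Pick non-overlapping (start, end) spans whose total length matches a sampled ratio.

    All parameter values and distributions here are assumptions for illustration.
    """
    assert n_tokens >= 1 and 0 < min_ratio <= max_ratio <= 0.5

    # Step (1): sample a dynamic ratio and turn it into a token budget.
    ratio = rng.uniform(min_ratio, max_ratio)
    budget = max(1, int(ratio * n_tokens))

    # Step (2a): sample span lengths until the budget is used up exactly.
    lengths = []
    remaining = budget
    while remaining > 0:
        length = max(1, round(rng.expovariate(1 / mean_span_len)))
        length = min(length, remaining)
        lengths.append(length)
        remaining -= length

    # Step (2b): sample locations. Distinct cut points in the unmasked
    # sequence keep the spans separated by at least one unmasked token.
    n_unmasked = n_tokens - budget
    cuts = sorted(rng.sample(range(n_unmasked + 1), len(lengths)))

    spans = []
    offset = 0  # number of masked tokens placed so far
    for cut, length in zip(cuts, lengths):
        start = cut + offset
        spans.append((start, start + length))
        offset += length
    return spans


def corrupt(tokens, spans):
    """Replace each span with a hypothetical sentinel and collect the targets."""
    corrupted, targets = [], []
    prev = 0
    for i, (start, end) in enumerate(spans):
        sentinel = f"<mask_{i}>"  # hypothetical sentinel format
        corrupted.extend(tokens[prev:start])
        corrupted.append(sentinel)
        targets.append(sentinel)
        targets.extend(tokens[start:end])
        prev = end
    corrupted.extend(tokens[prev:])
    return corrupted, targets


if __name__ == "__main__":
    rng = random.Random(0)
    tokens = "def add ( a , b ) : return a + b".split()
    spans = sample_spans(len(tokens), rng)
    print(corrupt(tokens, spans))
```

In this sketch the masked spans are returned as a separate target list. Whether they are appended to the corrupted sequence (as in InCoder-style causal infilling) or used as a decoder target is a detail I would like confirmed.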