Why doesn't the S1/S2 text encoder use attn_mask or key_padding_mask to deal with padding tokens?
#5 opened 2 days ago by Kinfai
The score distribution is all around 0.2. Why is that?
#4 opened 4 months ago by NemesisPrime