Only the embedding layer is frozen. 10 epochs. Learning rate 0.00001. Batch size 8. Max 512 tokens. AllQuAD dataset.
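
The settings above could be captured in a small config object. This is only a minimal sketch, not the training script used for this commit: `FineTuneConfig`, `is_trainable`, the `embeddings.` prefix, and the sample parameter names are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    # Values taken from the commit message.
    epochs: int = 10
    learning_rate: float = 1e-5
    batch_size: int = 8
    max_tokens: int = 512
    dataset: str = "AllQuAD"
    # Assumption: embedding parameters share this name prefix; real models differ.
    frozen_prefix: str = "embeddings."


def is_trainable(param_name: str, config: FineTuneConfig) -> bool:
    """Return False for embedding parameters and True for all others."""
    return not param_name.startswith(config.frozen_prefix)


config = FineTuneConfig()

# Hypothetical parameter names, for illustration only.
names = [
    "embeddings.word_embeddings.weight",
    "encoder.layer.0.attention.query.weight",
    "qa_outputs.weight",
]
trainable = [n for n in names if is_trainable(n, config)]
```

In a PyTorch setup, the same rule would typically be applied by setting `requires_grad = False` on each parameter whose name matches the frozen prefix, leaving every other layer trainable.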
d2cdea1
alienit
committed on