BERT-based model possible?

#12
by karrr0n - opened

Hi @juierror,
could you also provide a model based on bert-large-uncased?
I tried it myself with your training code, swapping in the model and tokenizer, but I get the error: "The model did not return a loss from the inputs, only the following keys: last_hidden_state, pooler_output".
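For context, here is a minimal sketch of what I think happens (the model loading below mirrors my swap; I am assuming your training code passes the model to the Hugging Face Trainer):

```python
# Sketch of the presumed failure: a plain BERT encoder has no LM head,
# so its forward pass returns no loss for the Trainer to optimize.
from transformers import AutoTokenizer, BertModel

tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")
model = BertModel.from_pretrained("bert-large-uncased")  # encoder only, no head

batch = tokenizer("example input", return_tensors="pt")
outputs = model(**batch)

# Only hidden states come back -- no `loss` key:
print(outputs.keys())  # odict_keys(['last_hidden_state', 'pooler_output'])
```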

Or do you have any further advice?

Best regards!

Hi @cioo,

From my understanding, BERT is an encoder-only model trained on masked-token prediction, so I think it might not be appropriate for this generation task. The T5 model I use, however, is an encoder-decoder model designed for generation. That also explains your error: a plain BERT encoder outputs only hidden states (last_hidden_state, pooler_output) and has no generation head that computes a loss, so the Trainer has nothing to optimize.
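To make the difference concrete, a minimal sketch (the "t5-small" checkpoint and the example texts are illustrative, not from my training code): an encoder-decoder model accepts `labels` and computes the seq2seq loss internally, which is exactly what the Trainer expects.

```python
# Minimal sketch, assuming the Hugging Face transformers library;
# model name and example strings are illustrative only.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Encoder-decoder models take `labels` (the target token ids) and
# return a cross-entropy loss over the decoded sequence.
inputs = tokenizer("question: how many users are there?", return_tensors="pt")
labels = tokenizer("SELECT COUNT(*) FROM users", return_tensors="pt").input_ids

outputs = model(**inputs, labels=labels)
print(outputs.loss)  # a scalar loss tensor -- this is what the Trainer optimizes
```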
