MT5 For Causal LM

#11
by jsegvic - opened

Hi guys,
I'd like to see an MT5ForCausalLM class, which would essentially be just the decoder part of MT5. I would like to use it as the decoder in VisionEncoderDecoderModel. Such a model is already available for MBart (https://github.com/huggingface/transformers/blob/c0f8d055ce7a218e041e20a06946bf0baa8a7d6a/src/transformers/models/mbart/modeling_mbart.py#L1935), so I implemented it myself in PyTorch, and I can now pass it to VisionEncoderDecoderModel's constructor. Would you consider adding this model to your code base? Should I open a PR here or directly on GitHub?
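
For reference, here is a minimal sketch of the pattern I mean, using the existing MBartForCausalLM as the decoder. The tiny ViT and MBart configs are assumptions picked only so that everything is randomly initialised and nothing is downloaded; MT5ForCausalLM is not part of transformers, so it is not shown here, but it would take the place of MBartForCausalLM.

```python
import torch
from transformers import (
    MBartConfig,
    MBartForCausalLM,
    ViTConfig,
    ViTModel,
    VisionEncoderDecoderModel,
)

# Tiny illustrative configs (assumed values, not a real checkpoint).
encoder_config = ViTConfig(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    image_size=32,
    patch_size=16,
)
decoder_config = MBartConfig(
    vocab_size=100,
    d_model=32,
    encoder_layers=1,
    encoder_attention_heads=2,
    encoder_ffn_dim=64,
    decoder_layers=1,
    decoder_attention_heads=2,
    decoder_ffn_dim=64,
)

encoder = ViTModel(encoder_config)
# Decoder-only wrapper around the MBart decoder stack; a hypothetical
# MT5ForCausalLM would be dropped in at this point instead.
decoder = MBartForCausalLM(decoder_config)

# Pass both parts to the constructor.
model = VisionEncoderDecoderModel(encoder=encoder, decoder=decoder)
model.eval()

# One forward pass on random data to check that the pieces fit together.
pixel_values = torch.randn(1, 3, 32, 32)
decoder_input_ids = torch.randint(2, 100, (1, 4))
with torch.no_grad():
    outputs = model(pixel_values=pixel_values, decoder_input_ids=decoder_input_ids)

print(outputs.logits.shape)
```

The logits have one row per decoder token and one column per vocabulary entry, which is what I'd expect from an MT5-based decoder as well.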
