Add the "max_length" parameter to the Generation configuration.

#2
by tomsherborne - opened

The 12B model does not match the performance of the 1.2B model because generation defaults to a max_length of 20, which truncates output well before the model should stop. For example, on WMT14 DE-EN the 12B model scores 15.52 SacreBLEU while the 1.2B model scores 31.786. The default max_length is set correctly in the smaller models (see https://huggingface.co./facebook/m2m100_1.2B/blob/main/generation_config.json), and the 12B models should match this. I am submitting similar PRs for the other 12B models.
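For reference, the fix amounts to adding a single entry to the repository's generation_config.json; a minimal sketch of the added key (the value 200 is an assumption here, and the real value should be copied from the 1.2B file):

{
  "max_length": 200
}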

You can directly pass "max_length=2048" to the generate function:
generated_tokens = model.generate(**encoded_en, forced_bos_token_id=tokenizer.get_lang_id("zh"), max_length=2048)
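For completeness, here is a self-contained sketch of that workaround; the checkpoint name and the English-to-Chinese example sentence are assumptions for illustration:

from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Checkpoint name is an assumption; the same workaround applies to any 12B variant.
model_name = "facebook/m2m100-12B-last-ckpt"
tokenizer = M2M100Tokenizer.from_pretrained(model_name)
model = M2M100ForConditionalGeneration.from_pretrained(model_name)

# Tokenize an English sentence for translation into Chinese.
tokenizer.src_lang = "en"
encoded_en = tokenizer("Life is like a box of chocolates.", return_tensors="pt")

# Passing max_length explicitly overrides the default of 20 until the
# generation config is fixed upstream.
generated_tokens = model.generate(
    **encoded_en,
    forced_bos_token_id=tokenizer.get_lang_id("zh"),
    max_length=2048,
)
print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True))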

That is a short-term workaround; the out-of-the-box interface for the 12B models should match the smaller ones.

Ready to merge
This branch is ready to get merged automatically.
