Complete Sentence Transformers integration + patch inference on CPU & Windows

#4
by tomaarsen HF staff - opened

Hello!

Preface

I work on embedding models on a daily basis, and I worked hard to get ModernBERT integrated & supported nicely in transformers; your work here is very much a culmination of those two efforts coming together. It's wonderful to see such huge advancements for its model size; it really validates our work. The ModernBERT team is very excited about this model & the reranker.
Great work!

Pull Request overview

  • Complete Sentence Transformers integration: 1_Pooling/config.json already existed, but modules.json was missing, so Sentence Transformers did not know to look in 1_Pooling/config.json.
  • Remove the reference_compile config option. When it is not specified in the config, it is set dynamically based on the user's hardware and software: https://github.com/huggingface/transformers/blob/f439e28d32c9fa061c4fd90696ba0b158d273d09/src/transformers/models/modernbert/modeling_modernbert.py#L689-L718
  • Update the README:
    • Add tag for Sentence Transformers to boost visibility
    • Add model outputs so people get a better feel for what the model does
    • Remove 'trust_remote_code', not needed for ModernBERT!
    • Update the minimum 'transformers' version to v4.48.0, as that version introduced the ModernBERT architecture.
    • Mention that flash_attn is recommended (but not required) for faster inference.
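For reference, a minimal modules.json for a Transformer-plus-pooling Sentence Transformers model typically looks like the following. This is a sketch of the standard two-module layout, not necessarily a byte-for-byte copy of the file added in this PR; the pooling mode itself is configured in 1_Pooling/config.json:

```json
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
```

With this file in place, SentenceTransformer(...) can assemble the full embedding pipeline directly from the repository.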

Details

Regarding the reference_compile config change: if that isn't done, then parts of the model are always compiled, even if the user does not have triton (a core requirement for compilation) or is running on CPU (which isn't compatible with compilation). Removing the option lets transformers pick the right behavior dynamically at load time.
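The fallback behavior can be sketched roughly as follows. The function name and exact checks here are illustrative only; the real logic lives in modeling_modernbert.py at the link above:

```python
def resolve_reference_compile(device_type: str) -> bool:
    """Illustrative sketch (not the actual transformers function) of how
    reference_compile can be resolved from the runtime environment when
    the config leaves it unset."""
    # Compilation is only usable on CUDA devices; CPU-only installs
    # (including typical Windows setups) must take the eager path.
    if device_type != "cuda":
        return False
    try:
        import triton  # noqa: F401  # core requirement for compilation
    except ImportError:
        return False
    return True


# On CPU the compiled path is skipped entirely:
print(resolve_reference_compile("cpu"))  # False
```

Because the decision is made at load time rather than baked into the config, the same checkpoint now works on CUDA, CPU, and Windows without user intervention.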

P.S. Will you upload your MTEB scores to the model card metadata? I'd love to see this in MTEB.

  • Tom Aarsen
tomaarsen changed pull request status to open
Alibaba-NLP org

Hi Tom,

Thank you so much for your kind words and contributions to the gte-modernbert series models. It's truly gratifying to hear about your hard work on ModernBERT. We are equally thrilled about the progress and the successful integration with transformers.

Best regards,

Dingkun Long

thenlper changed pull request status to merged
