question on max_seq_length
#1 by botkop - opened
Does this model have the same max_seq_length as LaBSE (256) or can you go beyond this?
Thank you.
Hi, this model does not have a max_seq_length limit. It has static embeddings, so you can process documents of arbitrary length with it. If you want to do this, please set max_length to None, e.g. embeddings = model.encode(["Example sentence"], max_length=None), and it will process your input at whatever length it is.
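For reference, here is a minimal sketch of what that might look like end to end. It assumes the model is loaded as a model2vec-style StaticModel (an assumption on my part; the thread only shows the encode call), and the model id below is a placeholder:

```python
# Minimal sketch: embedding arbitrarily long documents with a static embedding model.
# Assumes a model2vec-style StaticModel; the model id is a placeholder, not the
# actual model discussed in this thread.
from model2vec import StaticModel

model = StaticModel.from_pretrained("your-org/your-static-model")  # placeholder id

# max_length=None disables truncation, so the input is embedded in full
# regardless of its length.
embeddings = model.encode(["Example sentence"], max_length=None)
print(embeddings.shape)  # (1, embedding_dim)
```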
Thank you for the quick response.
Will this affect the quality of the embedding?
That's hard to say; we have not done extensive experiments on long documents yet, as most of our benchmarks (MTEB) were on documents under 512 tokens. We do plan on experimenting with this in the future.