Can’t get Mistral to work in Inference Endpoints
Is there a trick to get this to work?
File "/opt/conda/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1854, in from_pretrained\n return cls._from_pretrained(\n\n File "/opt/conda/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 1886, in _from_pretrained\n slow_tokenizer = (cls.slow_tokenizer_class)._from_pretrained(\n\n File "/opt/conda/lib/python3.9/site-packages/transformers/tokenization_utils_base.py", line 2073, in _from_pretrained\n raise ValueError(\n\nValueError: Non-consecutive added token '' found. Should have index 32000 but has index 0 in saved vocabulary.\n"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]} 2023/10/03 00:35:24 ~ Error: ShardCannotStart
Use llama.cpp and the GGUF format? A sketch of that route follows below.
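If you go that route, here is a minimal sketch using the llama-cpp-python bindings. The GGUF file name is a placeholder; you'd first download or convert a Mistral checkpoint to GGUF yourself.

```python
# Sketch only: assumes llama-cpp-python is installed (pip install llama-cpp-python)
# and that a Mistral GGUF file exists locally at the path below (placeholder name).
from llama_cpp import Llama

llm = Llama(model_path="./mistral-7b-v0.1.Q4_K_M.gguf")  # hypothetical local file

# Simple completion call; returns an OpenAI-style dict of choices.
out = llm("Q: What is the capital of France? A:", max_tokens=32)
print(out["choices"][0]["text"])
```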
You'll need to get into whatever environment you have set up there (e.g. conda) and run:
pip install git+https://github.com/huggingface/transformers
This is because Mistral support in Transformers hasn't been released on PyPI yet, so you need to install from the development branch on GitHub.
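Once that's installed, you can sanity-check that the tokenizer and model now load. A minimal sketch; the model id here is one of the public Mistral checkpoints, swap in whichever one you're actually deploying:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumes the dev install of transformers (with Mistral support) succeeded.
model_id = "mistralai/Mistral-7B-v0.1"  # example checkpoint, adjust as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)  # this is where the error above was raised
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If `from_pretrained` still raises the non-consecutive added token error after the dev install, the problem is more likely in the saved tokenizer files of the checkpoint than in your environment.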