Model doesn't run under HF's Transformers / Inference Endpoints
I tried to run the model as an HF Inference Endpoint. The first error I got was about the missing --trust-remote-code
option, which I got past by setting the environment variable TRUST_REMOTE_CODE=true.
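For context, TRUST_REMOTE_CODE=true should be the endpoint-side equivalent of passing trust_remote_code=True when loading the model with transformers directly; a rough sketch (the model id here is my assumption):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aisingapore/sea-lion-7b-instruct"  # assumed model id

# trust_remote_code=True pulls in the model's custom MPT-derived modeling code,
# which is what TRUST_REMOTE_CODE=true enables on the endpoint side.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
```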
Afterwards, however, I ran into what seems to be a configuration error. The tail end of the traceback in the logs is:
```
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.11/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
> File "/opt/conda/lib/python3.11/site-packages/text_generation_server/server.py", line 229, in serve_inner
    model = get_model_with_lora_adapters(
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/models/__init__.py", line 1219, in get_model_with_lora_adapters
    model = get_model(
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/models/__init__.py", line 632, in get_model
    return CausalLM(
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/models/causal_lm.py", line 569, in __init__
    model = model_class(prefix, config, weights)
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/models/custom_modeling/mpt_modeling.py", line 1099, in __init__
    self.transformer = MPTModel(prefix, config, weights)
  File "/opt/conda/lib/python3.11/site-packages/text_generation_server/models/custom_modeling/mpt_modeling.py", line 791, in __init__
    self.attn_impl = config.attn_config.attn_impl
AttributeError: 'dict' object has no attribute 'attn_impl'
```
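If I'm reading the traceback correctly, TGI's MPT modeling code expects attn_config to be an attribute-style object, but for this model it seems to get loaded as a plain dict. A minimal sketch of the failure mode (the value is hypothetical):

```python
# attn_config as it appears to be loaded from the model's config.json: a plain dict.
attn_config = {"attn_impl": "torch"}  # hypothetical value

print(attn_config["attn_impl"])  # key access works
print(attn_config.attn_impl)     # AttributeError: 'dict' object has no attribute 'attn_impl'
```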
Any idea?
Hi,
Thank you for your interest in SEA-LION.
May I check which transformers version you are using? Also, could you kindly share a minimal code example to reproduce the issue?
We would also like to share that a better-performing version 2 of SEA-LION is out. It is based on the Llama 3 architecture, which should be more deployment-friendly.
You can find SEA-LION v2 here:
https://huggingface.co./aisingapore/llama3-8b-cpt-sea-lionv2.1-instruct
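For a quick local smoke test of v2, something along these lines should work (an untested sketch; since it is Llama 3 based, no trust_remote_code is needed, and we assume the checkpoint ships a chat template):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aisingapore/llama3-8b-cpt-sea-lionv2.1-instruct"

# Standard Llama 3-style loading; device_map="auto" requires accelerate.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Apa khabar?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```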
Thanks for the reply!
> May I check which transformers version you are using? Also, could you kindly share a minimal code example to reproduce the issue?
I'm using whichever transformers version HuggingFace's "Inference Endpoints" come with; there is no code involved on my side for this setup. I'm using the direct "Deploy" option from the model's page and then selecting "Inference Endpoints".
> We would also like to share that a better-performing version 2 of SEA-LION is out. It is based on the Llama 3 architecture, which should be more deployment-friendly.
Thanks for that! Will try it out.
Thank you very much for the clarification.
The SEA-LION model is a modified version of the MPT model. Unfortunately, it is not compatible with the standard MPT model template and is therefore not supported by the Hugging Face Inference Endpoints.
If you would like to test SEA-LION via the Hugging Face Inference Endpoints, we recommend SEA-LION v2 instead, which is based on the Llama 3 model and should not have any issues with the Inference Endpoints.
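Once a v2 endpoint is up, you can query it with huggingface_hub's InferenceClient; a minimal sketch (the endpoint URL is a placeholder):

```python
from huggingface_hub import InferenceClient

# Replace with your own Inference Endpoint URL from the endpoint's overview page.
client = InferenceClient(model="https://<your-endpoint>.endpoints.huggingface.cloud")

output = client.text_generation(
    "Apa khabar?",      # example prompt
    max_new_tokens=64,
)
print(output)
```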