"Model mistralai/Mistral-Nemo-Instruct-2407 time out" in Inference APIs
I have the same timeout issue:
HTTPError Traceback (most recent call last)
File ~/anaconda3/envs/test-hf/lib/python3.12/site-packages/huggingface_hub/utils/_errors.py:304, in hf_raise_for_status(response, endpoint_name)
303 try:
--> 304 response.raise_for_status()
305 except HTTPError as e:
File ~/anaconda3/envs/test-hf/lib/python3.12/site-packages/requests/models.py:1024, in Response.raise_for_status(self)
1023 if http_error_msg:
-> 1024 raise HTTPError(http_error_msg, response=self)
HTTPError: 503 Server Error: Service Unavailable for url: https://api-inference.huggingface.co/models/mistralai/Mistral-Nemo-Instruct-2407/v1/chat/completions
The above exception was the direct cause of the following exception:
HfHubHTTPError Traceback (most recent call last)
File ~/anaconda3/envs/test-hf/lib/python3.12/site-packages/huggingface_hub/inference/_client.py:273, in InferenceClient.post(self, json, data, model, task, stream)
272 try:
--> 273 hf_raise_for_status(response)
274 return response.iter_lines() if stream else response.content
File ~/anaconda3/envs/test-hf/lib/python3.12/site-packages/huggingface_hub/utils/_errors.py:371, in hf_raise_for_status(response, endpoint_name)
369 # Convert HTTPError into a HfHubHTTPError to display request information
370 # as well (request id and/or server error message)
--> 371 raise HfHubHTTPError(str(e), response=response) from e
...
288 ) from error
289 # ...or wait 1s and retry
290 logger.info(f"Waiting for model to be loaded on the server: {error}")
InferenceTimeoutError: Model not loaded on the server: https://api-inference.huggingface.co/models/mistralai/Mistral-Nemo-Instruct-2407/v1/chat/completions. Please retry with a higher timeout (current: 120).
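For anyone hitting this before the fix landed: the quickest workaround is the one the error message suggests, i.e. passing a larger `timeout` to `InferenceClient` so the client keeps waiting while the model cold-starts on the serverless API. A minimal sketch, assuming a recent `huggingface_hub` release (the token value is a placeholder):

```python
from huggingface_hub import InferenceClient

# Raise the client-side timeout above the 120 s that failed in the
# traceback; the client polls the server while the model is being loaded.
client = InferenceClient(
    model="mistralai/Mistral-Nemo-Instruct-2407",
    token="hf_xxx",  # placeholder; use your own access token
    timeout=300,     # seconds; the failing run timed out at 120
)

response = client.chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(response.choices[0].message.content)
```

The larger timeout mainly matters for the first request while the model is still loading; once it is warm on the server, subsequent calls return quickly.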
All fixed.
Sorry, we had to make small corrections to the model's configuration file (the model itself is unchanged; the configuration just expresses the same things differently).
Cheers.
Thanks!