HuggingChat: Input validation error: `inputs` tokens + `max_new_tokens` must be..

#430
by Kostyak - opened

I use the meta-llama/Meta-Llama-3-70B-Instruct model. After a certain number of turns, the AI stops responding and gives this error: "Input validation error: inputs tokens + max_new_tokens must be <= 8192. Given: 6391 inputs tokens and 2047 max_new_tokens". Is this a bug or some new limitation? To be honest, I still don't understand it, and I hope to get an answer here. I'm new to this site.

Same issue all of a sudden today.

Hugging Chat org

Can you see if this still happens? Should be fixed now.

Still the same error, except the numbers have changed a little.

I keep getting this error as well, using CohereForAI.

Same error, "Meta-Llama-3-70B-Instruct" model.

I have also been running into this error. Is there a workaround or solution at all?

"Input validation error: inputs tokens + max_new_tokens must be <= 8192. Given: 6474 inputs tokens and 2047 max_new_tokens"

Using the meta-llama/Meta-Llama-3-70B-Instruct model.

Thanks a lot for the detailed how-to guide, JulienGuy. Appreciate it!

Hi everyone! I'm getting this message: Input validation error: inputs tokens + max_new_tokens must be <= 16384. Given: 16392 inputs tokens and 0 max_new_tokens

Do you know what's going on? I've been using both "Qwen/Qwen2.5-72B-Instruct" and "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B".

Hi everyone! I'm getting this message too: Input validation error: inputs tokens + max_new_tokens must be <= 16000. Given: 14698 inputs tokens and 3072 max_new_tokens

I've been using Qwen/Qwen2.5-Coder-32B-Instruct.

How do I fix it?

Hugging Chat org

@datoreviol @bocahpekael99 if one of you could share one of the conversations where this happens, that would help us a lot with debugging!

@datoreviol @bocahpekael99

Hi guys,

LLMs have a limited context window, that is, a limited amount of text they can process at once. If this limit is exceeded, you typically get the error you are seeing. The limit in your case is around 16k tokens.

What counts towards this limit is the input text PLUS the output text. The input text is your prompt, which may contain a lot of tokens if you are doing RAG (almost 15k in your case). The output text is what you are asking the LLM to generate as an answer, which is 3072 tokens in your case. Together that is 14698 + 3072 = 17770 tokens, which exceeds the 16000-token limit: you are basically asking the model to process more text at once than it is able to.
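To make that check concrete, here is a minimal sketch of how you could count the input tokens yourself before sending a request, assuming you can load the model's tokenizer from the Hub; the model ID, limit, and max_new_tokens values below are just the ones from the error messages in this thread, so substitute your own:

```python
# Minimal sketch: count input tokens locally and compare against the budget.
# MODEL_ID, CONTEXT_LIMIT, and MAX_NEW_TOKENS are taken from this thread's
# error messages; substitute your own values.
from transformers import AutoTokenizer

MODEL_ID = "Qwen/Qwen2.5-Coder-32B-Instruct"
CONTEXT_LIMIT = 16000   # limit reported in the error message
MAX_NEW_TOKENS = 3072   # requested output length

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
prompt = "your full prompt here, including any RAG context"

input_tokens = len(tokenizer.encode(prompt))
total = input_tokens + MAX_NEW_TOKENS
print(f"{input_tokens} input + {MAX_NEW_TOKENS} new = {total} (limit {CONTEXT_LIMIT})")
if total > CONTEXT_LIMIT:
    print("This request would be rejected: shorten the prompt or lower max_new_tokens.")
```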

To fix the error, you have to reduce the amount of text you are asking the LLM to process. You can use any of these approaches:

  • reduce the input size (write a shorter prompt, return fewer chunks from your database if you are doing RAG, or use smaller chunks to begin with)
  • reduce the size of the answer you want the LLM to write. 3072 tokens is a lot for a chatbot or a RAG pipeline; do you really need that much? Try 1024 or 512 (see the sketch after this list).
  • when calling an LLM through some kind of free API from Hugging Face, it seems that the max context window is set to a lower value than what the model can actually handle. If this is how you are using DeepSeek, consider creating a (paid) dedicated endpoint instead, which would allow you to use a bigger context window (Qwen 32B should support 128k).
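For the first two bullet points, here is a rough sketch of what that can look like in code. This is just an illustration, not HuggingChat's actual logic; it assumes a configured Hugging Face token and uses the huggingface_hub InferenceClient, with the limit values again borrowed from the error message:

```python
# Rough sketch of the first two fixes: trim the prompt to fit the token
# budget and request a smaller answer. Illustrative only, not HuggingChat's
# actual behavior; assumes a valid HF token is configured.
from huggingface_hub import InferenceClient
from transformers import AutoTokenizer

MODEL_ID = "Qwen/Qwen2.5-Coder-32B-Instruct"
CONTEXT_LIMIT = 16000    # assumed limit, taken from the error message
MAX_NEW_TOKENS = 1024    # well below the 3072 that triggered the error

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
client = InferenceClient(model=MODEL_ID)

def generate(prompt: str) -> str:
    budget = CONTEXT_LIMIT - MAX_NEW_TOKENS
    ids = tokenizer.encode(prompt)
    if len(ids) > budget:
        # keep only the most recent tokens; for RAG, dropping whole
        # chunks instead of raw tokens is usually cleaner
        prompt = tokenizer.decode(ids[-budget:])
    return client.text_generation(prompt, max_new_tokens=MAX_NEW_TOKENS)
```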

Hope this helps.

Hugging Chat org

TBH this shouldn't be happening; the backend should automatically truncate the input if you exceed the context window. That's why I wanted a conversation, to see where the issue is.

What I don't understand is the following: when the input limit is reached, how long do we have to wait before we can continue asking the model/agent questions?

Still happening to me
