What is 256k?
I was hoping that it was context size, but that is only 8K, so what is this number?
It's 256k
Sorry I forgot to say please.
Is it the number of examples from the Dolphin dataset on which Llama 3 was trained?
It's 256k context size. The model can process text with a length of up to 256k tokens. Hope this answers your question.
EDIT: I think you have to adjust the RoPE theta and similar settings to get the 256k to work.
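For reference, a minimal sketch of where those knobs live when loading with Hugging Face transformers (untested; the repo id is assumed from this thread, and the right theta value is whatever the 256k variant was actually trained with, which I'm not claiming to know):

```python
# Minimal sketch (untested): where the rope_theta / max_position_embeddings
# knobs live when loading with Hugging Face transformers.
# The repo id is assumed from this thread, not confirmed.
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

repo = "cognitivecomputations/dolphin-2.9-llama3-8b-256k"

config = AutoConfig.from_pretrained(repo)
config.max_position_embeddings = 256 * 1024   # 262144 positions
# config.rope_theta = ...                     # set to whatever the authors recommend

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    config=config,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```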
Damn, 256K! That's twice what GPT 3.5 has. How much RAM does this require?
That is a good question; however, I never got past 4K due to limited RAM.
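For a rough sense of scale: assuming the stock Llama-3-8B shape (32 layers, 8 KV heads, head dim 128) and an fp16 cache, the KV cache alone at 256k tokens works out to about 32 GiB, on top of roughly 16 GB of fp16 weights. Back-of-envelope sketch:

```python
# Back-of-envelope KV-cache estimate; the architecture numbers are assumptions
# taken from the stock Llama-3-8B config, not read from this repo.
layers, kv_heads, head_dim, bytes_fp16 = 32, 8, 128, 2
tokens = 256 * 1024

per_token = 2 * layers * kv_heads * head_dim * bytes_fp16   # K and V -> 131072 bytes (~128 KiB)
total_gib = per_token * tokens / 1024**3
print(f"{per_token} bytes/token, ~{total_gib:.0f} GiB KV cache at 256k")   # ~32 GiB
```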
I get that you are replying to him about the context size, but it makes it seem like you weren't really answering the question, given how he asked it.
Ask Google/chatgpt/whatever "how do I use RoPE to set 256k context in (ollama, ooba, vLLM, tgi, etc)"
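For example, with vLLM the served length is capped by max_model_len; a sketch (untested, repo id and length are assumptions, and the checkpoint's config still has to carry the matching RoPE settings):

```python
# Sketch, untested: serving at an extended length with vLLM.
# Repo id and length are assumptions; 256k needs a huge amount of memory.
from vllm import LLM, SamplingParams

llm = LLM(
    model="cognitivecomputations/dolphin-2.9-llama3-8b-256k",
    max_model_len=256 * 1024,
)
params = SamplingParams(max_tokens=128, temperature=0.7)
out = llm.generate(["Summarize this very long document: ..."], params)
print(out[0].outputs[0].text)
```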
How much RAM do you have?
Also, rombodawg is right: the owner didn't answer the OP's question. If it wasn't for CyberTimon, I wouldn't be sure that was really the context size either.
People need to be straightforward when talking about boring shit like tech. Trying to be funny in this kind of environment is just cringe.
If it's 256k context length, why is "max_position_embeddings": 8192?
I will update the max_position_embeddings setting
Fixed
By the way, the 1m version is uploading.
https://huggingface.co./cognitivecomputations/dolphin-2.9-llama3-8b-1m
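If anyone wants to verify what actually ships now, here's a quick check of the Hub config (the repo id for the 256k variant is assumed from this thread, not confirmed):

```python
# Sketch: read the shipped config straight from the Hub and print the
# relevant fields. The repo id is assumed, not confirmed.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("cognitivecomputations/dolphin-2.9-llama3-8b-256k")
print(cfg.max_position_embeddings)
print(getattr(cfg, "rope_theta", None))
```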
Where does the "max_position_embeddings": 524288 come from if it's 256k?
That's a good fucking question lol
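For what it's worth, the arithmetic behind that question: 524288 is 512 * 1024, i.e. 512k, while 256k would be 262144.

```python
# Just the arithmetic: the config value looks like 512k, not 256k.
print(256 * 1024)   # 262144
print(512 * 1024)   # 524288
```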