How many English tokens was the model trained on?
#5
by aslawliet
How many English tokens was the model trained on?
🤔 I would say less than 3T tokens; that's for sure.
It annoyingly mixes in Chinese words at random when I didn't ask for it; maybe the model is better in Chinese, but I don't speak it. It might be due to the GGUF version, though. To make a GGUF quant with an importance matrix you give it some example text for calibration, and I think a poor calibration set can make it worse at some tasks. It might be worth testing whether English-only versus mixed examples make the quant better, and releasing separate versions.
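If anyone wants to try that comparison, here is a rough sketch of how it could be done with llama.cpp's imatrix workflow. The binary names (`llama-imatrix`, `llama-quantize`), the calibration file names, and the base model path are assumptions, and flags can differ between llama.cpp versions, so treat this as illustrative rather than a recipe:

```python
import subprocess
from pathlib import Path

# Hypothetical calibration files -- fill these with your own English-only
# and mixed English/Chinese text before running.
CALIBRATION_SETS = {
    "english-only": Path("calib_english.txt"),
    "mixed-en-zh": Path("calib_mixed.txt"),
}

BASE_MODEL = Path("model-f16.gguf")  # unquantized GGUF export (assumed filename)
QUANT_TYPE = "Q4_K_M"

for name, calib_file in CALIBRATION_SETS.items():
    imatrix_file = Path(f"imatrix-{name}.dat")
    out_model = Path(f"model-{QUANT_TYPE}-{name}.gguf")

    # 1. Build an importance matrix from this calibration text.
    #    Binary and flag names follow recent llama.cpp builds; yours may differ.
    subprocess.run(
        ["llama-imatrix", "-m", str(BASE_MODEL),
         "-f", str(calib_file), "-o", str(imatrix_file)],
        check=True,
    )

    # 2. Quantize the base model using that importance matrix.
    subprocess.run(
        ["llama-quantize", "--imatrix", str(imatrix_file),
         str(BASE_MODEL), str(out_model), QUANT_TYPE],
        check=True,
    )
    print(f"Produced {out_model} calibrated on {name} data")
```

Running the same English prompts against both quants would show whether the calibration data is what causes the random Chinese output, or whether it comes from the base model itself.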