Any plan for 70B?

Opened by LPN64

Hello, do you plan to release the 70B?

I think yes, because the model card mentions it and the 70B folder was renamed to:
Meta-Llama-3-70B-Instruct-GGUF-old

Yup, just having trouble with the server that was running it. I transferred several off, but then it crashed and I need to get it back up.

I think it would make sense to test the perplexity of the models beforehand, as there are allegedly issues with imatrix and I-quants.
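For anyone who wants to check, here is a minimal sketch of such a test with llama.cpp's perplexity tool (the binary name, model file, and dataset URL are assumptions based on llama.cpp's docs; newer builds ship the tool as llama-perplexity):

```
# Grab the wikitext-2 test set commonly used for llama.cpp PPL runs
wget https://huggingface.co/datasets/ggml-org/ci/resolve/main/wikitext-2-raw-v1.zip
unzip wikitext-2-raw-v1.zip

# Run the same test against an imatrix quant and a plain quant, then compare the final PPL
./perplexity -m Meta-Llama-3-8B-Instruct.Q4_K_M.gguf -f wikitext-2-raw/wiki.test.raw
```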

@Dampfinchen

There are perplexity issues, but there are absolutely no generation issues.

Even the exl2 gets a weirdly high 7+ PPL, but it runs great. I almost feel the instruct tune is SO sensitive to its prompt template that it goes off the rails without it.

I've found in using it that, unlike other models, which only slightly misbehave without their template, this one goes absolutely nuts, generating infinitely. That's likely not good for perplexity.
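For reference, this is Meta's documented Llama 3 Instruct format; a sketch of a single turn (the system and user text here are placeholders):

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a helpful assistant.<|eot_id|><|start_header_id|>user<|end_header_id|>

Hello!<|eot_id|><|start_header_id|>assistant<|end_header_id|>

```

Generation is expected to stop at <|eot_id|>; a frontend that drops these headers leaves the model with no signal for where a turn ends, which matches the infinite-generation behavior described above.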

Either way, PPL on raw wikitext is a weak test of a model's performance; use the model if you like it.

As for the 70B version, it's getting close! My internet is being waaay too slow; the upload has been going all day long :') It's just a one-off though, I won't be doing it this way going forward.

Thank you for the great work. Q5_K_M is the best I can use with my CPU/RAM; I think imatrix could have benefits. For smaller models I use Q6 or even Q8.
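For anyone sizing quants the same way, a rough rule of thumb (a sketch only: K-quants mix bit widths across tensors, so real files differ somewhat) is file size ≈ parameters × bits per weight ÷ 8:

```
# Q5_K_M averages roughly 5.5-5.7 bits per weight, so for a 70B model:
# 70e9 * 5.5 / 8 ≈ 48 GB on disk, plus KV cache and runtime overhead in RAM
python3 -c 'print(f"{70e9 * 5.5 / 8 / 1e9:.1f} GB")'
```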

These are with imatrix btw :)
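For anyone curious how these are made, a minimal sketch of the usual llama.cpp imatrix workflow (file names are placeholders; older builds name the tools imatrix and quantize instead of llama-imatrix and llama-quantize):

```
# Collect an importance matrix by running calibration text through the f16 model
./llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize with the matrix so the weights that matter most keep more precision
./llama-quantize --imatrix imatrix.dat model-f16.gguf model-Q5_K_M.gguf Q5_K_M
```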

Yes, that is why I use them. :-)
