If we're gonna get fancy we should have Q8_0_L as well

by lemon07r - opened Jun 26

Discussion

lemon07r

Jun 26

•

edited Jun 26

The new larger quant that bartowski is doing. Would be nice to have here as well.

Experimental, uses f16 for embed and output weights. Please provide any feedback of differences. Extremely high quality, generally unneeded but max available quant.
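
For anyone who wants to try this locally, here is a rough sketch of how such a quant might be produced. It assumes llama.cpp's quantize tool and its `--token-embedding-type` and `--output-tensor-type` options. The binary name (`llama-quantize`; older builds call it `quantize`) and the file names are placeholders, and nothing in this thread confirms this is the exact recipe bartowski uses.

```python
import shlex


def build_quantize_command(src_gguf, dst_gguf, quantize_bin="llama-quantize"):
    """Build an argv list for a Q8_0 quant that keeps embed/output weights in f16.

    Assumption: the llama.cpp quantize tool is available and accepts
    --token-embedding-type and --output-tensor-type. The paths and binary
    name are hypothetical placeholders.
    """
    return [
        quantize_bin,
        "--token-embedding-type", "f16",  # keep token embeddings at f16
        "--output-tensor-type", "f16",    # keep the output tensor at f16
        src_gguf,                         # full-precision source GGUF
        dst_gguf,                         # destination file
        "Q8_0",                           # everything else is quantized to Q8_0
    ]


if __name__ == "__main__":
    cmd = build_quantize_command("model-f16.gguf", "model-Q8_0_L.gguf")
    # Print the command so it can be reviewed before running it by hand.
    print(shlex.join(cmd))
```

The sketch only prints the command rather than running it, so the paths and flags can be checked against your own llama.cpp build first.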

lemon07r changed discussion status to closed Jun 26

lemon07r changed discussion status to open Jun 26

DavidAU

Owner Jun 27

•

edited Jun 27

Sometimes these additions help, sometimes they do not.
It is also a per-model / per-quant issue.
I did a lot of testing with these, across different models and quants.
I do have a "Q8" project coming up; it is just that this quant is a little more difficult to "raise up", with the exception of the embed/output weights.

traveltube

Jun 27

It's so interesting that the model significantly improves with the Command R template! I can also attest to that.

DavidAU

Owner Jun 27

Yes, it was a "lab" error; funny how those work out sometimes.
