16bit? #2
by BigDeeper - opened
Didn't Mistral/Nvidia release a 16bit version? Why bump up to 32?
I upcast to fp32 to avoid clamping the bf16 values.
Basically, if I could use CUDA with bf16 I'd convert directly, but as it stands I can only use CUDA with fp16 and fp32. fp16 has a much smaller exponent range than bf16, so it clamps values it can't represent, while fp32 preserves them all. Since calculating the imatrix is quicker on CUDA, it's worth upcasting to fp32 and calculating it from there.
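To illustrate the clamping issue: bf16 shares fp32's 8-bit exponent, so any bf16 value fits in fp32 exactly, but fp16 only has a 5-bit exponent and overflows above roughly 65504. A minimal sketch using numpy (which has float16/float32 but no native bfloat16, so a large bf16-representable magnitude stands in for the problem values):

```python
import numpy as np

# 1e5 is representable in bf16 (8-bit exponent, same range as fp32),
# but exceeds fp16's maximum finite value of ~65504.
big = np.float32(1e5)

as_fp16 = np.float16(big)  # overflows: clamped to inf
as_fp32 = np.float32(big)  # preserved exactly

print(as_fp16)  # inf
print(as_fp32)  # 100000.0
```

This is why converting bf16 weights to fp16 before computing the imatrix would silently corrupt the large-magnitude entries, while the bf16 → fp32 path is lossless.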