16bit? #2
by BigDeeper - opened
Didn't Mistral/Nvidia release a 16bit version? Why bump up to 32?
I upcast to fp32 to avoid clamping the bf16 values.
Basically, if I could use CUDA with bf16 I'd convert directly, but as it stands I can only use CUDA with fp16 and fp32. fp16 has a much smaller exponent range than bf16, so it clamps values it can't represent, while fp32 preserves them all. Since calculating the imatrix is quicker on CUDA, it's worth upcasting to fp32 and calculating it from there.
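To illustrate the clamping issue: bf16 shares fp32's 8-bit exponent, so any bf16 value fits in fp32 exactly, but fp16 only has a 5-bit exponent and overflows above roughly 65504. A minimal sketch using numpy (which has float16/float32 but no native bfloat16, so a large bf16-representable magnitude stands in for the problem values):

```python
import numpy as np

# 1e5 is representable in bf16 (8-bit exponent, same range as fp32),
# but exceeds fp16's maximum finite value of ~65504.
big = np.float32(1e5)

as_fp16 = np.float16(big)  # overflows: clamped to inf
as_fp32 = np.float32(big)  # preserved exactly

print(as_fp16)  # inf
print(as_fp32)  # 100000.0
```

This is why converting bf16 weights to fp16 before computing the imatrix would silently corrupt the large-magnitude entries, while the bf16 → fp32 path is lossless.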