Are the weights fp16?
#6 by lucasjin - opened
Why are they so big?
Yes. fp16 and bf16 are considered sufficient training precision for LLMs, so the weights are stored in fp16, which means 2 bytes per parameter. That's where the file size comes from.
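A rough back-of-the-envelope sketch of the size calculation (the 7B parameter count below is just an illustrative assumption, not this model's actual size):

```python
# Checkpoint size ≈ number of parameters × bytes per parameter.
# fp16/bf16 store each weight in 2 bytes.
n_params = 7e9  # example parameter count (assumption, not this model's real size)
print(f"fp16 checkpoint: ~{n_params * 2 / 1e9:.0f} GB")  # ~14 GB for a 7B model
```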
But most of the time you'd rather quantize them to 4-bit, which cuts the size to roughly a quarter, so inference is faster and uses less RAM (see the sketch below).
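A minimal sketch of loading in 4-bit with the transformers + bitsandbytes stack, assuming that stack is installed; the model id is a placeholder, not a reference to this specific repo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "your-org/your-model"  # placeholder model id (assumption)

# Quantize the weights to 4-bit on load; compute still runs in bf16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs/CPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
```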