During inference, should I set `torch_dtype` to bf16 (the dtype used during finetuning) or to fp16 (the dtype listed in `config.json`)?
Lianmin Zheng from FastChat says "fp16 is okay".
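For reference, either dtype can be passed explicitly when loading the model with `transformers`. A minimal sketch — the model id here is a placeholder, substitute your own finetuned checkpoint:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model id -- replace with your finetuned checkpoint.
model_id = "your-org/your-finetuned-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load in bf16, matching the finetuning dtype ...
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# ... or in fp16, matching the dtype recorded in config.json:
# model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```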