Can it be used with float16?

#11
by duhaifeng19900711 - opened

We have configured float16, but it threw an error. The code is as follows:
[code screenshot]

[code screenshot]

And the error is as follows:

[error screenshot]

The current BiRefNet weights are all in FP32. I tried to use FP16 for training but encountered NaN errors. Efforts were made but failed...

If you directly use FP16 on this FP32 model, there will be an obvious decrease in performance, but it does run in my experiments (I tested the inference speed in FP16).
The key fix is in this commit; I had forgotten to update it in my HF code, but it is fixed by the time you read this message.
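For anyone who wants to try the same thing, here is a minimal sketch of FP16 inference on the FP32 checkpoint. The `AutoModelForImageSegmentation` call and model ID follow the BiRefNet model card; the 1024×1024 dummy input is only a placeholder for illustration:

```python
import torch
from transformers import AutoModelForImageSegmentation

# Load the FP32 checkpoint but cast the weights to FP16 on load (assumed setup).
model = AutoModelForImageSegmentation.from_pretrained(
    "ZhengPeng7/BiRefNet",
    trust_remote_code=True,
    torch_dtype=torch.float16,
)
model.to("cuda").eval()

with torch.no_grad():
    # Inputs must match the model's dtype, otherwise you hit dtype-mismatch errors.
    dummy = torch.rand(1, 3, 1024, 1024, dtype=torch.float16, device="cuda")
    preds = model(dummy)
```

Expect some loss of segmentation quality compared with FP32, as noted above.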

I'm also planning to port the training part to the HF Trainer. The code will be updated if that is successful.

Thank you very much for your reply. The reason for using float16 is that we want to reduce VRAM usage. We have modified the model code so that it executes correctly with float16, but VRAM usage still hasn't dropped. Is there any other way to reduce GPU VRAM usage?
The modified code is as follows:
[code screenshot]

Yeah, our modifications are the same.
Have you used autocast? If you're already using it, I don't know another way to do it either.
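For reference, what I have in mind with autocast is something like this minimal sketch (keeping the released FP32 weights and only computing activations in FP16; the input size is just a placeholder):

```python
import torch
from transformers import AutoModelForImageSegmentation

# Load the released FP32 weights as-is.
model = AutoModelForImageSegmentation.from_pretrained(
    "ZhengPeng7/BiRefNet", trust_remote_code=True
).to("cuda").eval()

with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
    # Inputs can stay FP32; autocast casts the relevant ops to FP16 internally.
    dummy = torch.rand(1, 3, 1024, 1024, device="cuda")
    preds = model(dummy)
```

Note that autocast only changes the compute dtype; the FP32 weights stay in memory, so it mainly saves activation memory and speeds things up rather than halving the weight footprint.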

Thanks. I will try it based on this feature.

I used autocast like this:

[code screenshot]

but it still doesn't work. However, I used another method: modifying the code and adding _no_split_modules so the model is loaded in a split fashion, which reduced VRAM by nearly 1 GB.

[code screenshot]
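For context, the split-loading part looks roughly like this (a minimal sketch: `_no_split_modules` has to be defined on the model class itself, and the module names shown are hypothetical and depend on your own modified BiRefNet code; the `device_map` / `low_cpu_mem_usage` options come from `transformers` with `accelerate` installed):

```python
import torch
from transformers import AutoModelForImageSegmentation

# In the modified model class (hypothetical names; use the real block classes of your BiRefNet copy):
#     class BiRefNet(PreTrainedModel):
#         _no_split_modules = ["DecoderBlock", "SwinTransformerBlock"]

model = AutoModelForImageSegmentation.from_pretrained(
    "ZhengPeng7/BiRefNet",
    trust_remote_code=True,
    torch_dtype=torch.float16,   # keep the FP16 cast from above
    device_map="auto",           # let accelerate place the weights module by module
    low_cpu_mem_usage=True,      # avoid building a full in-memory copy before dispatch
)
model.eval()
```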

duhaifeng19900711 changed discussion status to closed
