There is an unstated conversion requirement: it needs a GPU newer than the Ampere architecture.
#1
by
AGenchev
- opened
Thanks for providing the instructions. The conversion command fails if the GPU is not an Nvidia card newer than Ampere. Reason: the converter loads to the GPU and uses the Triton compiler to compile a conversion kernel, which requires a specific CUDA compute capability on the graphics card.
Why this is a problem: we poor folks with an A100/80 can't convert it, and the other poor folks who want to run CPU-only inference on EPYC can't convert it either.
We are using WSL2 with an RTX 4090, and the conversion has been running for a while. It takes a long time.
It would be good to mention in the text that conversion requires an Nvidia GPU of the Ada/Hopper family or newer.
I found a converted model here: https://huggingface.co./unsloth/DeepSeek-R1-BF16/tree/main.
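For anyone who wants to check their card before starting a long conversion, here is a minimal sketch. The threshold of compute capability 8.9 is my assumption, based on Ada (RTX 4090, 8.9) and Hopper (H100, 9.0) working while Ampere (A100, 8.0) fails; the conversion script itself may check something different. The helper names `can_run_conversion` and `detect_capability` are made up for illustration, and the detection part only works if PyTorch is installed.

```python
# Assumed minimum: Ada Lovelace (8.9). This value is inferred from the
# behaviour described in this thread, not taken from the converter itself.
MIN_CAPABILITY = (8, 9)


def can_run_conversion(capability, minimum=MIN_CAPABILITY):
    """Return True if a (major, minor) compute capability meets the assumed minimum."""
    return tuple(capability) >= tuple(minimum)


def detect_capability():
    """Return (major, minor) for GPU 0, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = detect_capability()
    if cap is None:
        print("No CUDA GPU detected (or PyTorch is not installed).")
    elif can_run_conversion(cap):
        print(f"Compute capability {cap[0]}.{cap[1]}: conversion should be possible.")
    else:
        print(f"Compute capability {cap[0]}.{cap[1]}: likely too old for the conversion kernel.")
```

With the assumed threshold, an A100 (8.0) or an RTX 30-series card (8.6) is rejected, while an RTX 4090 (8.9) or H100 (9.0) passes.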
AGenchev
changed discussion status to
closed