The 2025-01-09 revision does not run on the CPU with the example config

#50
by KeilahElla - opened

If I try to run the example code in the README on the CPU, I get the following errors:

RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

It looks like it tries to run fp16 on the CPU, which PyTorch still does not support. Overriding the dtype in the following way did not solve the issue:

import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, revision=revision, torch_dtype=torch.float32
)

Any suggestions?

Update: there were a lot of float16 tensors in the inference code, which made it impossible to run the latest version of moondream2 on the CPU.

I have replaced them with float32 types and it now runs great on the CPU. It's really fast even on my low-spec laptop.

If @vikhyatk is interested, I can send my code.
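For anyone who wants to try something similar in the meantime, here is a minimal sketch of the idea. It is not my actual patch (which edits the fp16 tensors inside the inference code); it just casts the whole loaded model to float32, assuming the README-style loading with the 2025-01-09 revision:

import torch
from transformers import AutoModelForCausalLM

model_id = "vikhyatk/moondream2"
revision = "2025-01-09"

model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True, revision=revision
)
# Cast all parameters and buffers to float32 so no fp16 kernels are hit on the CPU.
model = model.to(device="cpu", dtype=torch.float32)
model.eval()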

Final update on this: the problem was that the local version of PyTorch I was using (2.1) was ancient and did not support fp16 math on the CPU. Modern versions (like 2.6) do support fp16 arithmetic on the CPU. If you run into this problem, just update your PyTorch to 2.6+.
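A quick way to check whether your PyTorch build supports fp16 math on the CPU (a rough sketch; the layer and shapes are just placeholders):

import torch

print(torch.__version__)

# A tiny fp16 linear layer on the CPU. On old builds (e.g. 2.1) this raises
# RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
lin = torch.nn.Linear(4, 4).half()
x = torch.randn(1, 4, dtype=torch.float16)
print(lin(x))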

KeilahElla changed discussion status to closed
