Image size versus inference speed/accuracy

#22 opened by logankeenan

I'm curious how I can increase inference speed besides just using more VRAM. I've experimented with reducing the image size. Has anyone else tried anything to increase inference speed?

My test: how well does Molmo perform at finding UI elements on a page as the image resolution is reduced?
https://github.com/logankeenan/molmo-benchmarks/
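
For context, the resizing I'm testing is just a PIL downscale before the image ever reaches the processor. Something along these lines (a sketch; the file name and scale factor are placeholders):

from PIL import Image

image = Image.open("page.png")
scale = 0.5  # illustrative value; smaller means fewer image tokens but less detail
resized = image.resize(
    (int(image.width * scale), int(image.height * scale)),
    Image.LANCZOS,
)
# pass `resized` to the Molmo processor instead of the full-resolution screenshot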

I am also trying to speed up inference. I think reducing the image size has a similar effect to reducing max_crops on the processor: both lead to fewer image tokens being processed, hence the speedup. I will update if my other speed-up efforts work.
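
Roughly what I mean by lowering max_crops (a sketch; I'm assuming the remote-code image processor loaded via AutoProcessor exposes a max_crops setting, and the model id and crop count below are just examples):

from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained(
    "allenai/Molmo-7B-D-0924",
    trust_remote_code=True,
)
# fewer crops -> fewer image tokens -> faster prefill, at some cost in detail
# assumes the Molmo image processor exposes `max_crops` (default is 12)
processor.image_processor.max_crops = 4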

Hi @logankeenan, @yoarkyang,
torch.nn.DataParallel works like a charm for speeding up inference if you have multiple GPUs. Use this code snippet:

import torch

num_gpus = torch.cuda.device_count()
device = torch.device("cuda" if num_gpus > 0 else "cpu")
if num_gpus > 1:
    # replicate the model across all visible GPUs; input batches are split along dim 0
    model = torch.nn.DataParallel(model, device_ids=list(range(num_gpus)))
model.to(device)

@amanrangapur - I've been using the vLLM implementation for now, but I'll give that a try in the future. Thanks so much!
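
For anyone who lands here, this is roughly what my vLLM setup looks like (a sketch; the model id, sampling parameters, and prompt template are only examples, and you need a vLLM build with Molmo support):

from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(model="allenai/Molmo-7B-D-0924", trust_remote_code=True)
sampling = SamplingParams(temperature=0.0, max_tokens=256)

image = Image.open("page.png")
outputs = llm.generate(
    {
        # prompt format is illustrative; check vLLM's vision-language examples
        # for the exact Molmo template
        "prompt": "USER: <image>\nPoint to the login button. ASSISTANT:",
        "multi_modal_data": {"image": image},
    },
    sampling_params=sampling,
)
print(outputs[0].outputs[0].text)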
