Speed up model loading
Hi and thanks very much for your great work!
I have one issue though and would appreciate your help.
On a Jetson Xavier NX, the loading time of the model is very long.
I download and store the model once like this:
from transformers import AutoProcessor, OmDetTurboForObjectDetection
processor = AutoProcessor.from_pretrained("omlab/omdet-turbo-swin-tiny-hf")
model = OmDetTurboForObjectDetection.from_pretrained("omlab/omdet-turbo-swin-tiny-hf")
processor.save_pretrained("processor")
model.save_pretrained("model")
Later on, in a separate script, I load it like this:
from transformers import AutoProcessor, OmDetTurboForObjectDetection
processor = AutoProcessor.from_pretrained("processor")
model = OmDetTurboForObjectDetection.from_pretrained("model")
It takes about 5 minutes to load.
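To narrow down where those minutes go, here is a minimal timing sketch. It assumes the same local "processor" and "model" directories as above; `timed` and `time_loading` are hypothetical helper names I made up for illustration, not part of transformers. It times the library import, the processor load, and the model load separately, since on a small board the import itself may take a noticeable share.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock seconds spent in the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def time_loading(model_dir="model", processor_dir="processor"):
    """Time each loading step separately (hypothetical helper, paths assumed)."""
    results = {}
    # The import is timed too, because importing transformers/torch can be slow.
    with timed("import transformers", results):
        from transformers import AutoProcessor, OmDetTurboForObjectDetection
    with timed("load processor", results):
        AutoProcessor.from_pretrained(processor_dir)
    with timed("load model", results):
        OmDetTurboForObjectDetection.from_pretrained(model_dir)
    return results
```

Calling `time_loading()` in the second script and printing the returned dictionary should show which of the three steps accounts for most of the delay.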
I tried upgrading the "accelerate" package and adding the flag "low_cpu_mem_usage=True", but neither seems to help.
Is it possible to reduce the loading time?
Many thanks.
Thanks Pavel.
Did you mean the long 5 minutes? That is the loading time only, as I wrote.
Regarding the forward pass, it takes ~0.5 sec. I would love to reduce it somehow, e.g. with TensorRT, but I had trouble even converting the model to ONNX. I tried the optimum library, which uses onnxruntime, but it only reduced the time by ~10%; maybe I missed something.