Quantization into int8
#11
by
SajjadIqbal
- opened
How can we quantize this model to int8? Has anyone worked with this?
Also, when served with FastAPI the model uses 300–400 MB more memory than with Gradio. Why is that?
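For PyTorch models, the usual starting point is dynamic quantization via `torch.quantization.quantize_dynamic`, which converts `nn.Linear` weights to int8. If it helps, here is a minimal NumPy sketch of the underlying idea (symmetric per-tensor int8 quantization); the function names are illustrative, not from any library:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-tensor quantization: map [-max|w|, max|w|] to [-127, 127]."""
    scale = np.abs(w).max() / 127.0  # one float scale per tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# reconstruction error is bounded by scale / 2 per element
```

The weights shrink 4x (float32 to int8) and the per-element rounding error stays below half the scale, which is why int8 usually costs little accuracy for linear layers.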