Model is downloaded twice with transformers

#5
by dulacp - opened

My understanding is that the model weights are included twice in this repository: once as the single file consolidated.safetensors and a second time as the weight shards model-....safetensors.

This doubles the Hugging Face cache size in all our pods (94.3 GB instead of 47.2 GB).

Is this the expected behavior?
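
For anyone hitting the same thing, here is a minimal sketch of a possible workaround. It assumes the pods fetch the whole repository snapshot through huggingface_hub and that the loader only needs the sharded files; neither is confirmed for this repo. The helper names (files_to_keep, download_sharded_only) and the shard filename in the usage note are hypothetical, made up for illustration.

```python
from fnmatch import fnmatch

# Assumption: the single-file copy is the one we can skip.
IGNORE_PATTERNS = ["consolidated.safetensors"]


def files_to_keep(filenames, ignore_patterns=IGNORE_PATTERNS):
    """Return the filenames that match none of the ignore patterns.

    Local dry-run helper to preview what a filtered download would keep.
    """
    return [
        name
        for name in filenames
        if not any(fnmatch(name, pattern) for pattern in ignore_patterns)
    ]


def download_sharded_only(repo_id, ignore_patterns=IGNORE_PATTERNS):
    """Download a repo snapshot while skipping the ignored files.

    Needs huggingface_hub installed and network access, so the import
    is kept inside the function.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, ignore_patterns=ignore_patterns)
```

Calling files_to_keep(["consolidated.safetensors", "model-00001-of-00002.safetensors", "config.json"]) with that made-up shard name drops only the consolidated file. If the stack needs consolidated.safetensors instead, the pattern would have to be inverted.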
