phi-3-mini-4k-instruct-awq-4bit
phi-3-mini-4k-instruct-awq-4bit is a version of the Microsoft Phi-3 Mini 4K Instruct model that was quantized using the AWQ method developed by Lin et al. (2023).
Please refer to the original Phi-3 Mini model card for details about the model preparation and training processes.
Dependencies
- autoawq==0.2.5 – AutoAWQ was used to quantize the Phi-3 model.
- vllm==0.4.2 – vLLM was used to host the model for benchmarking.
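The quantization step with AutoAWQ 0.2.5 can be sketched as below. The base-model path and the specific quantization settings (group size, zero point, kernel version) are assumptions chosen to match common 4-bit AWQ defaults, not the exact configuration used for this checkpoint:

```python
# Assumed 4-bit AWQ settings; the actual configuration for this
# checkpoint may differ.
QUANT_CONFIG = {
    "zero_point": True,   # asymmetric (zero-point) quantization
    "q_group_size": 128,  # weights quantized in groups of 128
    "w_bit": 4,           # 4-bit weights
    "version": "GEMM",    # AWQ GEMM kernels
}

def quantize_phi3(
    model_path="microsoft/Phi-3-mini-4k-instruct",  # assumed base model id
    quant_path="phi-3-mini-4k-instruct-awq-4bit",
):
    # Imports are inside the function so the sketch can be read and
    # tested without autoawq installed; the call itself needs a GPU.
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

    # Run AWQ calibration and quantize the weights, then save the
    # quantized model and tokenizer to disk.
    model.quantize(tokenizer, quant_config=QUANT_CONFIG)
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
```

For benchmarking, the quantized checkpoint can then be served with vLLM's OpenAI-compatible server, e.g. `python -m vllm.entrypoints.openai.api_server --model phi-3-mini-4k-instruct-awq-4bit --quantization awq`.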