Add merged and quantized ONNX model files
#6
by petewarden · opened
Based on the great work by @Xenova in https://github.com/usefulsensors/moonshine/pull/73, I adjusted the quantized versions to make them more accurate.
There are three versions of each of the "tiny" and "base" models:
- Float: Unquantized float32 models, from @Xenova's original PR.
- Quantized: 8-bit weights and activations, converted using `dynamic_quantization()` (see the sketch after this list).
- Quantized 4-bit: Lower-precision version using 4-bit weights for the MatMul ops only.
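For readers unfamiliar with dynamic quantization, here is a minimal sketch using ONNX Runtime's stock `quantize_dynamic` entry point. It only illustrates the general technique; the models in this PR were produced with the ONNX Shrink Ray commands shown below, and the file paths here are hypothetical.

```python
# Minimal sketch of 8-bit dynamic quantization with ONNX Runtime's standard API.
# The models in this PR were actually produced with onnx_shrink_ray (commands
# below), whose internals may differ; the paths here are placeholders.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="tiny/float/decoder_model_merged.onnx",    # hypothetical input path
    model_output="tiny/quantized/decoder_model_merged.onnx",
    weight_type=QuantType.QInt8,  # weights stored as int8; activations quantized at runtime
)
```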
The 8-bit quantized versions were created using the ONNX Shrink Ray tool with these commands:
python3 src/onnx_shrink_ray/shrink.py --output_suffix ".onnx" --output_dir tiny/quantized_temp --method "integer_activations" --nodes_to_exclude "/conv1/Conv,/conv2/Conv,/conv3/Conv" tiny/float
python3 src/onnx_shrink_ray/shrink.py --method "integer_weights" --output_suffix ".onnx" --output_dir tiny/quantized tiny/quantized_temp
The second command is needed to shrink the file size further by converting the float32 conv weights, which the first pass leaves untouched via `--nodes_to_exclude`, into int8 equivalents.
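To make that weight conversion concrete, here is a toy numpy sketch of symmetric weight-only int8 quantization: store int8 values plus a float scale, and reconstruct w ≈ scale * q at load time. This illustrates the general idea rather than the tool's actual implementation, and the weight shape below is made up.

```python
# Illustration (not the shrink_ray implementation) of weight-only int8 quantization.
import numpy as np

def quantize_weights_int8(w: np.ndarray):
    """Symmetric per-tensor int8 quantization of a float32 weight array."""
    scale = np.max(np.abs(w)) / 127.0 if np.any(w) else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, np.float32(scale)

w = np.random.randn(288, 288).astype(np.float32)   # made-up weight shape
q, scale = quantize_weights_int8(w)
w_hat = q.astype(np.float32) * scale               # dequantized approximation
print("max abs error:", np.max(np.abs(w - w_hat)))
print("size: float32 %d bytes -> int8 %d bytes" % (w.nbytes, q.nbytes))
```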
The 4-bit versions were created using:
python3 src/onnx_shrink_ray/shrink.py --output_suffix ".onnx" --output_dir tiny/quantized_4bit --method "integer_activations" --nodes_to_exclude "/conv1/Conv,/conv2/Conv,/conv3/Conv" tiny/float
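The storage win from 4-bit weights comes from packing two values into each byte, roughly halving weight storage relative to int8. The toy numpy sketch below shows that packing and unpacking round-trip correctly; it is only an illustration of the layout, not how the tool or the ONNX Runtime MatMul kernels actually store the data.

```python
# Toy sketch of 4-bit weight packing: two values in [-8, 7] per byte.
# Illustration only, not the tool's implementation.
import numpy as np

def pack_int4(q: np.ndarray) -> np.ndarray:
    """Pack an even-length array of values in [-8, 7] into uint8 pairs."""
    u = (q.astype(np.int8) & 0x0F).astype(np.uint8)   # two's-complement nibbles
    return (u[0::2] | (u[1::2] << 4)).astype(np.uint8)

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    lo = (packed & 0x0F).astype(np.int8)
    hi = ((packed >> 4) & 0x0F).astype(np.int8)
    q = np.empty(packed.size * 2, dtype=np.int8)
    q[0::2], q[1::2] = lo, hi
    return np.where(q > 7, q - 16, q)                 # sign-extend 4-bit values

q = np.random.randint(-8, 8, size=1024, dtype=np.int8)
packed = pack_int4(q)
assert np.array_equal(unpack_int4(packed), q)
print("int8 bytes:", q.nbytes, "-> packed 4-bit bytes:", packed.nbytes)
```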
Here are the accuracy numbers (word error rate, lower is better) from the LibriSpeech clean English dataset:

| Model | Quantization | WER | Total file size |
|---|---|---|---|
| Tiny | None | 4.51% | 133MB |
| Tiny | 8-bit | 4.75% | 26MB |
| Tiny | 4-bit | 4.54% | 44MB |
| Base | None | 3.29% | 235MB |
| Base | 8-bit | 3.30% | 59MB |
| Base | 4-bit | 3.35% | 70MB |
The 8-bit quantized files are considerably smaller than the float32 versions, for a small loss in accuracy. I don't yet have inference numbers.
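For anyone wanting to reproduce the table, word error rate can be computed from reference and predicted transcripts with the jiwer package. The evaluation script behind these numbers isn't part of this PR, so the transcripts below are just placeholders.

```python
# Minimal WER computation sketch using the jiwer package. The actual evaluation
# pipeline behind the table above isn't included in this PR; the transcript
# lists below are hypothetical placeholders.
import jiwer

references = [
    "he hoped there would be stew for dinner",
    "turnips and carrots and bruised potatoes",
]
hypotheses = [
    "he hoped there would be stew for dinner",
    "turnips and carrots and bruised potatoes and fat",
]

wer = jiwer.wer(references, hypotheses)   # fraction of word errors, lower is better
print(f"WER: {wer:.2%}")
```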
petewarden changed pull request status to open
petewarden changed pull request status to merged