Difference with Block-wise Int8?

#1
by leo98xh - opened

Could you explain the difference with Block-wise Int8?

The main difference is quantization granularity. In block-wise int8, all elements within each 128x128 block share one quantization scale. In channel-wise int8, all elements within a column share one quantization scale.
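To make the granularity difference concrete, here is a minimal NumPy sketch. It assumes symmetric absmax quantization (scale = max|w| / 127) and a weight matrix whose dimensions are multiples of 128; the function names and the example shape are hypothetical and are not taken from this model's actual quantization code.

```python
import numpy as np

def blockwise_scales(w, block=128):
    # One scale per (block x block) tile of the matrix.
    rows, cols = w.shape
    assert rows % block == 0 and cols % block == 0
    tiles = w.reshape(rows // block, block, cols // block, block)
    return np.abs(tiles).max(axis=(1, 3)) / 127.0

def channelwise_scales(w):
    # One scale per column of the matrix.
    return np.abs(w).max(axis=0) / 127.0

def quantize(w, scale):
    # Assumed symmetric scheme: round to nearest and clip to the int8 range.
    return np.clip(np.round(w / scale), -127, 127).astype(np.int8)

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 384)).astype(np.float32)  # illustrative shape

s_block = blockwise_scales(w)    # shape (2, 3): one scale per 128x128 tile
s_col = channelwise_scales(w)    # shape (384,): one scale per column

# Expand the per-tile scales back to the full matrix shape before dividing.
s_block_full = np.repeat(np.repeat(s_block, 128, axis=0), 128, axis=1)
q_block = quantize(w, s_block_full)
q_col = quantize(w, s_col)       # broadcasts the scale across each column
```

For this 256x384 example, block-wise quantization stores 6 scales and channel-wise stores 384, so the two schemes differ in how many values must share each scale.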
