Difference with Block-wise Int8?
#1, opened by leo98xh
Could you explain the difference with Block-wise Int8?
The main difference is quantization granularity. In block-wise int8, all elements within each 128x128 block share one quantization scale. In channel-wise int8, all elements in a column share one quantization scale.
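To make the granularity difference concrete, here is a minimal NumPy sketch. It assumes symmetric absmax scaling (scale = max|w| / 127) and treats each column of a 2-D weight matrix as a channel. The function names `quantize_channelwise` and `quantize_blockwise` are hypothetical and for illustration only; this is not necessarily how this model's quantization is implemented.

```python
import numpy as np

def quantize_channelwise(w):
    # Assumption: symmetric absmax quantization, one scale per column.
    scale = np.abs(w).max(axis=0, keepdims=True) / 127.0   # shape (1, cols)
    scale = np.where(scale == 0, 1.0, scale)               # avoid divide-by-zero
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def quantize_blockwise(w, block=128):
    # Assumption: symmetric absmax quantization, one scale per block x block tile.
    rows, cols = w.shape
    assert rows % block == 0 and cols % block == 0, "shape must be a multiple of block"
    # View the matrix as (row_blocks, block, col_blocks, block) tiles.
    tiles = w.reshape(rows // block, block, cols // block, block)
    scale = np.abs(tiles).max(axis=(1, 3), keepdims=True) / 127.0
    scale = np.where(scale == 0, 1.0, scale)
    q = np.clip(np.round(tiles / scale), -127, 127).astype(np.int8)
    return q.reshape(rows, cols), scale.reshape(rows // block, cols // block)

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 512)).astype(np.float32)

q_ch, s_ch = quantize_channelwise(w)
q_bl, s_bl = quantize_blockwise(w)

print(s_ch.shape)  # (1, 512): one scale per column
print(s_bl.shape)  # (2, 4): one scale per 128x128 block
```

For a 256x512 matrix, the channel-wise variant stores 512 scales while the block-wise variant stores 8, so the two schemes trade off scale storage against how closely each scale fits its group of values.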