Integer-Only CNNs with 4 Bit Weights and Bit-Shift Quantization Scales at Full-Precision Accuracy
Quantization of neural networks has been one of the most popular techniques to compress models for embedded (IoT) hardware platforms with highly constrained latency, storage, memory-bandwidth, and energy specifications. Limiting the number of bits per weight and activation has been the main focus in the literature. To avoid major degradation of accuracy, common quantization methods introduce additional scale factors to adapt the quantized values to the diverse data ranges present in full-precision (floating-point) neural networks. These scales are usually kept in high precision, requiring the target compute engine to support a few high-precision multiplications, which is undesirable due to the larger hardware cost. Little effort has yet been invested in avoiding high-precision multipliers altogether, especially in combination with 4 bit weights. This work proposes a new quantization scheme, based on power-of-two quantization scales, that performs on par with uniform per-channel quantization using full-precision 32 bit quantization scales when only 4 bit weights are used. This is achieved by adding a low-precision lookup table that translates the stored 4 bit weights into nonuniformly distributed 8 bit weights for internal computation. All our quantized ImageNet CNNs achieved or even exceeded the Top-1 accuracy of their full-precision counterparts, with ResNet18 exceeding its full-precision model by 0.35%. Our MobileNetV2 model achieved state-of-the-art performance with only a slight accuracy drop of 0.51%.
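The mechanism described in the abstract can be illustrated with a short, self-contained sketch. This is a minimal illustration, not the authors' implementation: the quantile-based 16-entry codebook, the tensor sizes, the variable names, and the shift value are assumptions chosen only to show how a 4 bit code to 8 bit weight lookup table combines with a power-of-two scale, so that requantization becomes a plain bit-shift in an integer-only pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- offline: build a toy nonuniform 4-bit -> 8-bit lookup table -------------
# (quantile codebook is an illustrative stand-in, not the paper's method)
float_weights = rng.normal(0.0, 0.02, size=64)                 # one channel's float weights
codebook = np.quantile(float_weights, np.linspace(0, 1, 16))   # 16 nonuniform levels
lut = np.round(codebook / np.max(np.abs(codebook)) * 127).astype(np.int8)  # 8-bit LUT entries

# store only 4-bit codes: index of the nearest codebook entry for each weight
codes = np.argmin(np.abs(float_weights[:, None] - codebook[None, :]), axis=1)

# --- inference: integer-only dot product with a bit-shift requantization -----
activations = rng.integers(-128, 128, size=64, dtype=np.int32) # int8-range inputs, widened
w8 = lut[codes].astype(np.int32)                               # expand 4-bit codes to 8-bit weights
acc = int(activations @ w8)                                    # 32-bit-style accumulator

shift = 9                                                      # power-of-two scale 2**-9 (assumed value)
output = acc >> shift                                          # requantize with a shift, no float multiply
print(output)
```

Because the scale is a power of two, the only non-integer-friendly step of a conventional per-channel scheme (multiplying the accumulator by a 32 bit floating-point scale) reduces to a shift, while the lookup table keeps the stored weights at 4 bits without forcing them onto a uniform grid.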
Saved in: DOAJ
Main authors: Maarten Vandersteegen, Kristof Van Beeck, Toon Goedemé
Format: article
Language: EN
Published: MDPI AG, 2021
Subjects: quantization; neural networks; nonuniform; power-of-two scales; low-cost hardware; Electronics (TK7800-8360)
Online access: https://doaj.org/article/9b83f42050394e609be6a8c4a4b79011
id: oai:doaj.org-article:9b83f42050394e609be6a8c4a4b79011
DOI: 10.3390/electronics10222823
ISSN: 2079-9292
Published online: 2021-11-01
Full text: https://www.mdpi.com/2079-9292/10/22/2823
Journal TOC: https://doaj.org/toc/2079-9292
Source: Electronics, Vol 10, Iss 2823, p 2823 (2021)