Integer-Only CNNs with 4 Bit Weights and Bit-Shift Quantization Scales at Full-Precision Accuracy
Quantization of neural networks has been one of the most popular techniques to compress models for embedded (IoT) hardware platforms with highly constrained latency, storage, memory-bandwidth, and energy specifications. Limiting the number of bits per weight and activation has been the main focus in...
Main authors: , ,
Format: article
Language: EN
Published: MDPI AG, 2021
Subjects:
Online access: https://doaj.org/article/9b83f42050394e609be6a8c4a4b79011
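The title's key idea, replacing real-valued quantization scales with powers of two so that requantization becomes an integer bit shift, can be illustrated with a short sketch. This is not the paper's exact method (the full algorithm is behind the link above); it is a minimal, assumed illustration of 4-bit weight quantization and bit-shift requantization, with all function names chosen here for the example.

```python
import numpy as np

def quantize_weights_4bit(w: np.ndarray, scale: float) -> np.ndarray:
    """Map float weights to signed 4-bit integers in [-8, 7] (round-to-nearest, clamp)."""
    return np.clip(np.round(w / scale), -8, 7).astype(np.int8)

def scale_to_shift(scale: float) -> int:
    """Approximate a real requantization scale by a power of two: scale ~= 2**-shift."""
    return int(round(-np.log2(scale)))

def requantize(acc: np.ndarray, shift: int) -> np.ndarray:
    """Integer-only requantization of an int32 accumulator: multiply by 2**-shift
    implemented as a rounding right shift (no floating point, no multiplier)."""
    rounding = (1 << (shift - 1)) if shift > 0 else 0
    return (acc + rounding) >> shift

# Example: a scale of 1/256 is exactly 2**-8, so requantization is a shift by 8.
shift = scale_to_shift(1 / 256)            # -> 8
out = requantize(np.array([1000], dtype=np.int64), shift)  # round(1000/256) = 4
```

Because the scale is a power of two, the whole inference path stays in integer arithmetic, which is the property the title advertises for constrained embedded hardware.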