Integer-Only CNNs with 4 Bit Weights and Bit-Shift Quantization Scales at Full-Precision Accuracy
Quantization of neural networks has been one of the most popular techniques to compress models for embedded (IoT) hardware platforms with highly constrained latency, storage, memory-bandwidth, and energy specifications. Limiting the number of bits per weight and activation has been the main focus in...
Main authors: , ,
Format: article
Language: EN
Published: MDPI AG, 2021
Subjects:
Online access: https://doaj.org/article/9b83f42050394e609be6a8c4a4b79011
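The title's key idea, replacing real-valued quantization scales with powers of two so that requantization becomes an integer bit shift, can be illustrated with a short sketch. This is not the paper's exact method (the full algorithm is behind the link above); it is a minimal, assumed illustration of 4-bit weight quantization and bit-shift requantization, with all function names chosen here for the example.

```python
import numpy as np

def quantize_weights_4bit(w: np.ndarray, scale: float) -> np.ndarray:
    """Map float weights to signed 4-bit integers in [-8, 7] (round-to-nearest, clamp)."""
    return np.clip(np.round(w / scale), -8, 7).astype(np.int8)

def scale_to_shift(scale: float) -> int:
    """Approximate a real requantization scale by a power of two: scale ~= 2**-shift."""
    return int(round(-np.log2(scale)))

def requantize(acc: np.ndarray, shift: int) -> np.ndarray:
    """Integer-only requantization of an int32 accumulator: multiply by 2**-shift
    implemented as a rounding right shift (no floating point, no multiplier)."""
    rounding = (1 << (shift - 1)) if shift > 0 else 0
    return (acc + rounding) >> shift

# Example: a scale of 1/256 is exactly 2**-8, so requantization is a shift by 8.
shift = scale_to_shift(1 / 256)            # -> 8
out = requantize(np.array([1000], dtype=np.int64), shift)  # round(1000/256) = 4
```

Because the scale is a power of two, the whole inference path stays in integer arithmetic, which is the property the title advertises for constrained embedded hardware.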