Reconfigurable Architecture and Dataflow for Memory Traffic Minimization of CNNs Computation

Computation of a convolutional neural network (CNN) requires a significant amount of memory access, which leads to high energy consumption. As the scale of neural networks increases, this effect becomes more pronounced: the energy consumed by memory access and by data migration between the on-chip buffer and off-chip DRAM can far exceed the computation energy of the processing element array (PE array). To reduce the energy consumption of memory access, a dataflow that maximizes data reuse and minimizes data migration between the on-chip buffer and external DRAM is essential. In particular, the dimensions of the input feature map (ifmap) and the filter weights differ considerably from layer to layer of a neural network. If the array architecture and dataflow cannot be reconfigured layer by layer according to the ifmap and filter dimensions, hardware resources may not be utilized effectively, resulting in a large amount of data migration in certain layers. However, a thorough exploration of all possible configurations is time consuming and impractical. In this paper, we propose a quick and efficient methodology that adapts the configuration of the PE array architecture, buffer assignment, dataflow, and reuse strategy layer by layer for a given CNN architecture and hardware resources. In addition, we explore different combinations of these configuration issues to investigate their effectiveness; the results can serve as a guide to speed up a thorough exploration process.
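The abstract describes a per-layer choice of PE array configuration, buffer assignment, dataflow, and reuse strategy that minimizes data migration between the on-chip buffer and external DRAM. The record contains no code, so the following Python sketch only illustrates that general idea under simplified, assumed conditions: a toy output-stationary tiling, a coarse traffic estimate, and hypothetical names (Layer, buffer_bytes, traffic_bytes, best_config) that are not taken from the paper.

# Hypothetical sketch (not the authors' tool): for each CNN layer, enumerate a few
# candidate output-tile shapes, keep only those whose working set fits the on-chip
# buffer, estimate the DRAM traffic each one would cause, and keep the cheapest.
from dataclasses import dataclass
from itertools import product


@dataclass
class Layer:
    # Assumed layer dimensions: ifmap height/width/channels, filter count/height/width.
    H: int
    W: int
    C: int
    K: int
    R: int
    S: int


def buffer_bytes(layer, th, tw, tk, bytes_per_elem=1):
    # On-chip storage for one tile: its ifmap patch, the tk filters it uses, and its outputs.
    ifmap = (th + layer.R - 1) * (tw + layer.S - 1) * layer.C
    weights = tk * layer.R * layer.S * layer.C
    ofmap = th * tw * tk
    return (ifmap + weights + ofmap) * bytes_per_elem


def traffic_bytes(layer, th, tw, tk, bytes_per_elem=1):
    # Coarse DRAM-traffic model for an output-stationary tiling: every output tile
    # reloads its ifmap patch and the filters it needs; outputs are written once.
    out_h, out_w = layer.H - layer.R + 1, layer.W - layer.S + 1
    n_tiles = -(-out_h // th) * -(-out_w // tw) * -(-layer.K // tk)  # ceiling divisions
    ifmap_per_tile = (th + layer.R - 1) * (tw + layer.S - 1) * layer.C
    weights_per_tile = tk * layer.R * layer.S * layer.C
    ofmap_total = out_h * out_w * layer.K
    return (n_tiles * (ifmap_per_tile + weights_per_tile) + ofmap_total) * bytes_per_elem


def best_config(layer, buffer_budget, tile_choices=(1, 2, 4, 7, 8, 14, 16, 28)):
    # Per-layer search: try a small set of tile shapes and keep the lowest-traffic one.
    best = None
    for th, tw, tk in product(tile_choices, repeat=3):
        if buffer_bytes(layer, th, tw, tk) > buffer_budget:
            continue  # working set must fit in the on-chip buffer
        cost = traffic_bytes(layer, th, tw, tk)
        if best is None or cost < best[0]:
            best = (cost, (th, tw, tk))
    return best


if __name__ == "__main__":
    # Example: two layers with very different ifmap/filter shapes, each configured separately.
    layers = [Layer(H=56, W=56, C=64, K=64, R=3, S=3),
              Layer(H=14, W=14, C=512, K=512, R=3, S=3)]
    for i, layer in enumerate(layers):
        cost, tile = best_config(layer, buffer_budget=108 * 1024)  # assumed 108 KiB buffer
        print(f"layer {i}: tile (th, tw, tk) = {tile}, estimated DRAM traffic = {cost} bytes")

Restricting each layer to a small set of candidate tile sizes keeps the per-layer search fast, which mirrors the abstract's motivation for avoiding a thorough exploration of all possible configurations.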

Bibliographic Details
Main Authors: Wei-Kai Cheng, Xiang-Yi Liu, Hsin-Tzu Wu, Hsin-Yi Pai, Po-Yao Chung
Format: article
Language: EN
Published: MDPI AG, 2021
Subjects: CNN, DRAM, PE array, dataflow, data migration, data reuse
Online Access: https://doaj.org/article/9035f02ded064c9f92bef9c7cfe74fd1
id oai:doaj.org-article:9035f02ded064c9f92bef9c7cfe74fd1
record_format dspace
spelling oai:doaj.org-article:9035f02ded064c9f92bef9c7cfe74fd1 2021-11-25T18:23:25Z
Reconfigurable Architecture and Dataflow for Memory Traffic Minimization of CNNs Computation
DOI: 10.3390/mi12111365
ISSN: 2072-666X
Published: 2021-11-01
https://doaj.org/article/9035f02ded064c9f92bef9c7cfe74fd1
https://www.mdpi.com/2072-666X/12/11/1365
https://doaj.org/toc/2072-666X
Wei-Kai Cheng, Xiang-Yi Liu, Hsin-Tzu Wu, Hsin-Yi Pai, Po-Yao Chung
MDPI AG
CNN, DRAM, PE array, dataflow, data migration, data reuse, Mechanical engineering and machinery (TJ1-1570)
EN
Micromachines, Vol 12, Iss 1365, p 1365 (2021)
institution DOAJ
collection DOAJ
language EN
topic CNN
DRAM
PE array
dataflow
data migration
data reuse
Mechanical engineering and machinery
TJ1-1570
description Computation of a convolutional neural network (CNN) requires a significant amount of memory access, which leads to high energy consumption. As the scale of neural networks increases, this effect becomes more pronounced: the energy consumed by memory access and by data migration between the on-chip buffer and off-chip DRAM can far exceed the computation energy of the processing element array (PE array). To reduce the energy consumption of memory access, a dataflow that maximizes data reuse and minimizes data migration between the on-chip buffer and external DRAM is essential. In particular, the dimensions of the input feature map (ifmap) and the filter weights differ considerably from layer to layer of a neural network. If the array architecture and dataflow cannot be reconfigured layer by layer according to the ifmap and filter dimensions, hardware resources may not be utilized effectively, resulting in a large amount of data migration in certain layers. However, a thorough exploration of all possible configurations is time consuming and impractical. In this paper, we propose a quick and efficient methodology that adapts the configuration of the PE array architecture, buffer assignment, dataflow, and reuse strategy layer by layer for a given CNN architecture and hardware resources. In addition, we explore different combinations of these configuration issues to investigate their effectiveness; the results can serve as a guide to speed up a thorough exploration process.
format article
author Wei-Kai Cheng
Xiang-Yi Liu
Hsin-Tzu Wu
Hsin-Yi Pai
Po-Yao Chung
author_sort Wei-Kai Cheng
title Reconfigurable Architecture and Dataflow for Memory Traffic Minimization of CNNs Computation
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/9035f02ded064c9f92bef9c7cfe74fd1
_version_ 1718411211927715840