FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit

Convolutional neural networks (CNNs) are widely used in modern applications for their versatility and high classification accuracy. Field-programmable gate arrays (FPGAs) are considered suitable platforms for CNNs because of their high performance, rapid development cycle, and reconfigurability. Although many studies have proposed methods for implementing high-performance CNN accelerators on FPGAs using optimized data types and algorithm transformations, accelerators can be optimized further by investigating more efficient uses of FPGA resources. In this paper, we propose an FPGA-based CNN accelerator using multiple approximate accumulation units based on a fixed-point data type. We implemented the LeNet-5 CNN architecture, which classifies handwritten digits from the MNIST dataset. The proposed accelerator was implemented using a high-level synthesis tool on a Xilinx FPGA, and it applies an optimized fixed-point data type and loop parallelization to improve performance. Approximate operation units are implemented using FPGA logic resources instead of high-precision digital signal processing (DSP) blocks, which are inefficient for low-precision data. Compared to a floating-point design, our accelerator model achieves 66% less memory usage and approximately 50% lower network latency, and its resource utilization is optimized to use 78% fewer DSP blocks than general fixed-point designs.
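To illustrate the technique the abstract describes, below is a minimal HLS C++ sketch of a fixed-point convolution multiply-accumulate kernel whose multipliers are steered into LUT fabric rather than DSP blocks. This is not the authors' code: the ap_fixed<16,6> width, the mac_window function name, and the pragma placement are assumptions for illustration, and the pragma syntax shown is Vitis HLS (older Vivado HLS used "#pragma HLS RESOURCE ... core=Mul_LUT" for the same purpose).

#include <ap_fixed.h>

typedef ap_fixed<16, 6> data_t;  // assumed width/integer split; the paper only
                                 // states that an optimized fixed-point type is used
const int K = 5;                 // LeNet-5 convolution kernels are 5x5

// Computes one output pixel: multiply-accumulate over a KxK input window.
data_t mac_window(const data_t window[K][K], const data_t weights[K][K]) {
    data_t acc = 0;
    ROW: for (int i = 0; i < K; i++) {
        COL: for (int j = 0; j < K; j++) {
#pragma HLS UNROLL
            data_t prod;
            // Bind this multiply to LUT fabric instead of a DSP block.
#pragma HLS BIND_OP variable=prod op=mul impl=fabric
            prod = window[i][j] * weights[i][j];
            acc += prod;
        }
    }
    return acc;
}

Fully unrolling the window loop instantiates K*K parallel multipliers per output, which is exactly where mapping low-precision multiplies to logic resources pays off; the resulting accumulation of products is also the natural place to substitute the approximate accumulation units the paper proposes.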

Bibliographic Details
Main Authors: Mannhee Cho, Youngmin Kim
Format: Article
Language: English
Published: MDPI AG, 2021
Published in: Electronics, Vol. 10, Iss. 22, Art. 2859 (2021)
DOI: 10.3390/electronics10222859
ISSN: 2079-9292
Subjects: convolutional neural network; FPGA; high-level synthesis; accelerator; Electronics (TK7800-8360)
Online Access: https://doaj.org/article/f24a1a1f14d7485baa6623c4d2ba1546
Full Text: https://www.mdpi.com/2079-9292/10/22/2859