A Fully Integrated Reprogrammable CMOS-RRAM Compute-in-Memory Coprocessor for Neuromorphic Applications

Analog compute-in-memory with resistive random access memory (RRAM) devices promises to overcome the data movement bottleneck in data-intensive artificial intelligence (AI) and machine learning. RRAM crossbar arrays improve the efficiency of vector-matrix multiplications (VMMs), which is a vital ope...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Justin M. Correll, Vishishtha Bothra, Fuxi Cai, Yong Lim, Seung Hwan Lee, Seungjong Lee, Wei D. Lu, Zhengya Zhang, Michael P. Flynn
Formato: article
Lenguaje:EN
Publicado: IEEE 2020
Materias:
ADC
DAC
Acceso en línea:https://doaj.org/article/3d1d7a52eebe4f4f88627fff7858e676
Etiquetas: Agregar Etiqueta
Sin Etiquetas, Sea el primero en etiquetar este registro!
Descripción
Sumario:Analog compute-in-memory with resistive random access memory (RRAM) devices promises to overcome the data movement bottleneck in data-intensive artificial intelligence (AI) and machine learning. RRAM crossbar arrays improve the efficiency of vector-matrix multiplications (VMMs), which is a vital operation in these applications. The prototype IC is the first complete, fully integrated analog-RRAM CMOS coprocessor. This article focuses on the digital and analog circuitry that supports efficient and flexible RRAM-based computation. A passive <inline-formula> <tex-math notation="LaTeX">$54\times108$ </tex-math></inline-formula> RRAM crossbar array performs VMM in the analog domain. Specialized mixed-signal circuits stimulate and read the outputs of the RRAM crossbar. The single-chip CMOS prototype includes a reduced instruction set computer (RISC) processor interfaced to a memory-mapped mixed-signal core. In the mixed-signal core, ADCs and DACs interface with the passive RRAM crossbar. The RISC processor controls the mixed-signal circuits and the algorithm data path. The system is fully programmable and supports forward and backward propagation. As proof of concept, a fully integrated 0.18- <inline-formula> <tex-math notation="LaTeX">$\mu \text{m}$ </tex-math></inline-formula> CMOS prototype with a postprocessed RRAM array demonstrates several key functions of machine learning, including online learning. The mixed-signal core consumes 64 mW at an operating frequency of 148 MHz. The total system power consumption considering the mixed-signal circuitry, the digital processor, and the passive RRAM array is 307 mW. The maximum theoretical throughput is 2.6 GOPS at an efficiency of 8.5 GOPS/W.