1-D Systolic Arrays Design of LMS Adaptive (FIR) Digital Filtering


Bibliographic Details
Main authors: Ali H. Mahdi, Bakir A. R. Al-Hashemy, Riyadh A.H. AL-Helali
Format: article
Language: EN
Published: Al-Khwarizmi College of Engineering – University of Baghdad, 2010
Online access: https://doaj.org/article/f2aab48085e3412099eb31a4975a842b
Description
Summary: This paper extends the 1-D systolic array approach with a systematic linear design method for systolic algorithms. Past methods for mapping the Least-Mean-Square (LMS) adaptive Finite-Impulse-Response (FIR) filter onto parallel and pipelined architectures either introduce delays in the coefficient updates or have excessive hardware requirements. In this article, we describe an efficient 1-D systolic array for the LMS adaptive FIR filter that produces the same output and error signals as the standard LMS adaptive filter architecture, with the processor functions expressed in single-assignment form.

The proposed systolic architectures operate on a block-by-block basis and exploit flexibility in the design by taking the inner-product step (the convolution sum of the tap-weight vector and the tap-input vector) into consideration. This makes it possible to derive more than one algorithm for the same problem. The input and output data flow sequentially and continuously into and out of the systolic arrays at the system clock rate, and during each clock period processing elements of the same type operate in parallel. The most computationally demanding processing element performs only two consecutive multiplications and two additions/subtractions per clock period. This allows very high throughput and very fast block signal processing with 100% utilization, at the expense of a delay of L samples between the input and output, where L is the block size.
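
For orientation, the standard sample-by-sample LMS adaptive FIR filter that the summary takes as its baseline can be sketched as below. This is a minimal Python illustration of the textbook LMS recursion (inner product, error, immediate weight update), not the paper's systolic design. The function name `lms_filter` and the parameters `num_taps` and `mu` (step size) are assumptions introduced here for illustration.

```python
def lms_filter(x, d, num_taps, mu):
    """Textbook LMS adaptive FIR filter, one sample at a time.

    x        : input samples
    d        : desired-response samples (same length as x)
    num_taps : number of tap weights (hypothetical name)
    mu       : step size (hypothetical name)
    Returns (y, e, w): output samples, error samples, final tap weights.
    """
    w = [0.0] * num_taps      # tap-weight vector, initialised to zero
    taps = [0.0] * num_taps   # tap-input vector (delay line), newest sample first
    y, e = [], []
    for x_n, d_n in zip(x, d):
        # Shift the new sample into the delay line.
        taps = [x_n] + taps[:-1]
        # Inner product of tap-weight and tap-input vectors (convolution sum).
        y_n = sum(w_k * u_k for w_k, u_k in zip(w, taps))
        # Error between the desired response and the filter output.
        e_n = d_n - y_n
        # Coefficient update applied immediately, with no update delay.
        w = [w_k + mu * e_n * u_k for w_k, u_k in zip(w, taps)]
        y.append(y_n)
        e.append(e_n)
    return y, e, w
```

Calling `lms_filter(x, d, 2, 0.1)` with `d` generated by a fixed two-tap FIR system driven by `x` should, under these assumptions, drive the returned weights toward that system's coefficients. Any parallel or pipelined realisation that claims equivalence to the standard architecture would need to reproduce the same `y` and `e` sequences.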