1-D Systolic Arrays Design of LMS Adaptive (FIR) Digital Filtering
Main authors: | , , |
---|---|
Format: | article |
Language: | EN |
Published: | Al-Khwarizmi College of Engineering – University of Baghdad, 2010 |
Subjects: | |
Online access: | https://doaj.org/article/f2aab48085e3412099eb31a4975a842b |
Summary: | This paper extends the 1-D systolic array approach with a method for the systematic linear design of systolic algorithms. Past methods for mapping the Least-Mean-Square (LMS) adaptive Finite-Impulse-Response (FIR) filter onto parallel and pipelined architectures either introduce delays in the coefficient updates or have excessive hardware requirements. In this article, we describe an efficient 1-D systolic array for the LMS adaptive FIR filter that produces the same output and error signals as the standard LMS adaptive filter architecture, with the processor functions written in single-assignment form.<br />The proposed systolic architectures operate on a block-by-block basis and exploit the flexibility in the design by taking the inner product step (the convolution sum of the tap weight vector and the tap input vector) into design consideration. This makes it possible to derive more than one algorithm for the same problem. The input and output data flow sequentially and continuously into and out of the systolic arrays at the system clock rate, and during each clock period the processing elements of the same type operate in parallel. The most computationally demanding processing element performs only two consecutive multiplications and two additions/subtractions per clock period. This allows very high throughput and very fast block signal processing with 100% utilization, at the expense of a delay of L samples between input and output, where L is the block size. |
---|
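
For orientation, the standard LMS adaptive FIR filter that the abstract uses as its reference can be sketched in a few lines of Python. This is only the textbook sample-by-sample recursion (output as the inner product of tap weights and tap inputs, error as desired minus output, then an immediate weight update); it is not the systolic array described in the article. The function name `lms_filter`, its parameters, and the two-tap test system `h` are assumptions made for this illustration.

```python
import random


def lms_filter(x, d, num_taps, mu):
    """Textbook LMS adaptive FIR filter (illustrative sketch, hypothetical names).

    x: input samples, d: desired samples, num_taps: filter length, mu: step size.
    Returns (y, e, w): output signal, error signal, final tap weights.
    """
    w = [0.0] * num_taps       # tap weight vector, initialised to zero
    taps = [0.0] * num_taps    # tap input vector, most recent sample first
    y, e = [], []
    for x_n, d_n in zip(x, d):
        taps = [x_n] + taps[:-1]  # shift the new sample into the delay line
        # Inner product step: convolution sum of tap weights and tap inputs.
        y_n = sum(w_k * u_k for w_k, u_k in zip(w, taps))
        e_n = d_n - y_n
        # Coefficient update applied immediately, with no extra delay.
        w = [w_k + mu * e_n * u_k for w_k, u_k in zip(w, taps)]
        y.append(y_n)
        e.append(e_n)
    return y, e, w


# Usage: identify an assumed, noise-free two-tap system from its input/output data.
rng = random.Random(0)
h = [0.5, -0.3]  # hypothetical system to be identified
x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
d = [h[0] * x[n] + (h[1] * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]
y, e, w = lms_filter(x, d, num_taps=2, mu=0.1)
```

In this noise-free setting the tap weights `w` approach `h` and the error signal decays toward zero. A block-based systolic realisation such as the one summarised above is expected to produce these same `y` and `e` sequences, delayed by L samples.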