Efficient binaural rendering of spherical microphone array data by linear filtering

Abstract High-quality rendering of spatial sound fields in real-time is becoming increasingly important with the steadily growing interest in virtual and augmented reality technologies. Typically, a spherical microphone array (SMA) is used to capture a spatial sound field. The captured sound field c...

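Bibliographic Details
Main authors: Johannes M. Arend, Tim Lübeck, Christoph Pörschmann
Format: article
Language: EN
Published: SpringerOpen 2021
Subjects: Spherical microphone arrays; Binaural rendering; Spatial audio reproduction; Virtual acoustics; Acoustics. Sound (QC221-246); Electronic computers. Computer science (QA75.5-76.95)
Online access: https://doaj.org/article/de6bebedbada47bf877407e188f0b49b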
id oai:doaj.org-article:de6bebedbada47bf877407e188f0b49b
record_format dspace
spelling oai:doaj.org-article:de6bebedbada47bf877407e188f0b49b
2021-11-08T11:02:27Z
Efficient binaural rendering of spherical microphone array data by linear filtering
10.1186/s13636-021-00224-5
1687-4722
https://doaj.org/article/de6bebedbada47bf877407e188f0b49b
2021-11-01T00:00:00Z
https://doi.org/10.1186/s13636-021-00224-5
https://doaj.org/toc/1687-4722
Johannes M. Arend
Tim Lübeck
Christoph Pörschmann
SpringerOpen
article
Spherical microphone arrays
Binaural rendering
Spatial audio reproduction
Virtual acoustics
Acoustics. Sound
QC221-246
Electronic computers. Computer science
QA75.5-76.95
EN
EURASIP Journal on Audio, Speech, and Music Processing, Vol 2021, Iss 1, Pp 1-11 (2021)
institution DOAJ
collection DOAJ
language EN
topic Spherical microphone arrays
Binaural rendering
Spatial audio reproduction
Virtual acoustics
Acoustics. Sound
QC221-246
Electronic computers. Computer science
QA75.5-76.95
description Abstract: High-quality rendering of spatial sound fields in real time is becoming increasingly important with the steadily growing interest in virtual and augmented reality technologies. Typically, a spherical microphone array (SMA) is used to capture a spatial sound field. The captured sound field can be reproduced over headphones in real time using binaural rendering, virtually placing a single listener in the sound field. Common methods for binaural rendering first spatially encode the sound field by transforming it to the spherical harmonics (SH) domain and then decode the sound field binaurally by combining it with head-related transfer functions (HRTFs). However, these rendering methods are computationally demanding, especially for high-order SMAs, and require implementing quite sophisticated real-time signal processing. This paper presents a computationally more efficient method for real-time binaural rendering of SMA signals by linear filtering. The proposed method allows representing any common rendering chain as a set of precomputed finite impulse response (FIR) filters, which are then applied to the SMA signals in real time using fast convolution to produce the binaural signals. Results of the technical evaluation show that the presented approach is equivalent to conventional rendering methods while being computationally less demanding and easier to implement using any real-time convolution system. However, the lower computational complexity comes with lower flexibility. On the one hand, encoding and decoding are no longer decoupled, and on the other hand, sound field transformations in the SH domain can no longer be performed. Consequently, in the proposed method, a filter set must be precomputed and stored for each possible head orientation of the listener, leading to higher memory requirements than the conventional methods. As such, the approach is particularly well suited for efficient real-time binaural rendering of SMA signals in a fixed setup where a limited range of head orientations is usually sufficient, such as live concert streaming or VR teleconferencing.
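The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `render_binaural` and `filter_bank_bytes`, the array layouts, and all sizes are assumptions chosen for illustration, and the filters here are placeholders rather than filters derived from an actual rendering chain. The sketch only shows the general idea that each microphone signal is convolved with one precomputed FIR filter per ear (here via FFT-based fast convolution) and the results are summed, and that storing one filter set per head orientation scales memory linearly with the number of orientations.

```python
import numpy as np


def render_binaural(sma_signals, filters):
    """Apply one precomputed FIR filter per microphone and ear, then sum.

    sma_signals: array of shape (M, N), one row per microphone (assumed layout).
    filters:     array of shape (M, 2, L), FIR taps per microphone and ear
                 for a single head orientation (assumed layout).
    Returns an array of shape (2, N + L - 1): left and right ear signals.
    """
    num_samples = sma_signals.shape[1]
    num_taps = filters.shape[2]
    out_len = num_samples + num_taps - 1
    # Zero-pad to the next power of two so the circular FFT convolution
    # equals the linear convolution over the first out_len samples.
    nfft = 1 << (out_len - 1).bit_length()
    spectra = np.fft.rfft(sma_signals, nfft, axis=1)   # (M, F)
    responses = np.fft.rfft(filters, nfft, axis=2)     # (M, 2, F)
    # Multiply per microphone and sum over microphones for each ear.
    ears = np.einsum("mf,mef->ef", spectra, responses)  # (2, F)
    return np.fft.irfft(ears, nfft, axis=1)[:, :out_len]


def filter_bank_bytes(num_orientations, num_mics, num_taps, bytes_per_tap=4):
    """Memory needed to store one two-ear filter set per head orientation."""
    return num_orientations * num_mics * 2 * num_taps * bytes_per_tap


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mics = rng.standard_normal((32, 4800))       # hypothetical 32-channel block
    fir = rng.standard_normal((32, 2, 2048))     # placeholder filters
    print(render_binaural(mics, fir).shape)
    # Hypothetical setup: 360 orientations, 32 microphones, 2048 float32 taps.
    print(filter_bank_bytes(360, 32, 2048) / 2**20, "MiB")
```

A real-time system would typically process the input block by block (for example with partitioned convolution) and switch between filter sets as the head orientation changes; the offline version above omits both for brevity.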
format article
author Johannes M. Arend
Tim Lübeck
Christoph Pörschmann
title Efficient binaural rendering of spherical microphone array data by linear filtering
publisher SpringerOpen
publishDate 2021
url https://doaj.org/article/de6bebedbada47bf877407e188f0b49b