Real-Time Music Following in Score Sheet Images via Multi-Resolution Prediction

The task of real-time alignment between a music performance and the corresponding score (sheet music), also known as score following, poses a challenging multi-modal machine learning problem. Training a system that can solve this task robustly with live audio and real sheet music (i.e., scans or sco...

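Bibliographic Details
Main authors: Florian Henkel, Gerhard Widmer
Format: article
Language: EN
Published: Frontiers Media S.A. 2021
Subjects: multi-modal deep learning; conditional object detection; score following; audio-to-score alignment; music information retrieval; Electronic computers. Computer science; QA75.5-76.95
Online access: https://doaj.org/article/9287ef02b0ab47c89f2d3e36815a0be3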
id oai:doaj.org-article:9287ef02b0ab47c89f2d3e36815a0be3
record_format dspace
spelling oai:doaj.org-article:9287ef02b0ab47c89f2d3e36815a0be3
2021-11-30T18:27:49Z
Real-Time Music Following in Score Sheet Images via Multi-Resolution Prediction
2624-9898
10.3389/fcomp.2021.718340
https://doaj.org/article/9287ef02b0ab47c89f2d3e36815a0be3
2021-11-01T00:00:00Z
https://www.frontiersin.org/articles/10.3389/fcomp.2021.718340/full
https://doaj.org/toc/2624-9898
Florian Henkel
Gerhard Widmer
Frontiers Media S.A.
article
multi-modal deep learning
conditional object detection
score following
audio-to-score alignment
music information retrieval
Electronic computers. Computer science
QA75.5-76.95
EN
Frontiers in Computer Science, Vol 3 (2021)
institution DOAJ
collection DOAJ
language EN
topic multi-modal deep learning
conditional object detection
score following
audio-to-score alignment
music information retrieval
Electronic computers. Computer science
QA75.5-76.95
spellingShingle multi-modal deep learning
conditional object detection
score following
audio-to-score alignment
music information retrieval
Electronic computers. Computer science
QA75.5-76.95
Florian Henkel 
Gerhard Widmer 
Real-Time Music Following in Score Sheet Images via Multi-Resolution Prediction
description The task of real-time alignment between a music performance and the corresponding score (sheet music), also known as score following, poses a challenging multi-modal machine learning problem. Training a system that can solve this task robustly with live audio and real sheet music (i.e., scans or score images) requires precise ground truth alignments between audio and note-coordinate positions in the score sheet images. However, these kinds of annotations are difficult and costly to obtain, which is why research in this area mainly utilizes synthetic audio and sheet images to train and evaluate score following systems. In this work, we propose a method that does not solely rely on note alignments but is additionally capable of leveraging data with annotations of lower granularity, such as bar or score system alignments. This allows us to use a large collection of real-world piano performance recordings coarsely aligned to scanned score sheet images and, as a consequence, improve over current state-of-the-art approaches.
format article
author Florian Henkel 
Gerhard Widmer 
author_facet Florian Henkel 
Gerhard Widmer 
author_sort Florian Henkel 
title Real-Time Music Following in Score Sheet Images via Multi-Resolution Prediction
title_short Real-Time Music Following in Score Sheet Images via Multi-Resolution Prediction
title_full Real-Time Music Following in Score Sheet Images via Multi-Resolution Prediction
title_fullStr Real-Time Music Following in Score Sheet Images via Multi-Resolution Prediction
title_full_unstemmed Real-Time Music Following in Score Sheet Images via Multi-Resolution Prediction
title_sort real-time music following in score sheet images via multi-resolution prediction
publisher Frontiers Media S.A.
publishDate 2021
url https://doaj.org/article/9287ef02b0ab47c89f2d3e36815a0be3
work_keys_str_mv AT florianhenkel realtimemusicfollowinginscoresheetimagesviamultiresolutionprediction
AT gerhardwidmer realtimemusicfollowinginscoresheetimagesviamultiresolutionprediction
_version_ 1718406394276741120