Frog calling activity detection using lightweight CNN with multi-view spectrogram: A case study on Kroombit tinker frog

Bibliographic Details
Main authors:	Jie Xie, Mingying Zhu, Kai Hu, Jinglan Zhang, Harry Hines, Ya Guo
Format:	article
Language:	EN
Published:	Elsevier 2022
Subjects:	Bioacoustic signal activity detection; Multi-view spectrogram; Lightweight CNN; Loss function; Cybernetics (Q300-390); Electronic computers. Computer science (QA75.5-76.95)
Online access:	https://doaj.org/article/67e7710c4b124390be626699700b55f3

Description
Summary:	Frogs play an important role in ecological systems, yet frog species across the globe are threatened and declining. It is therefore valuable to estimate frog populations with an intelligent computer system. Following the success of deep learning (DL) in various pattern recognition tasks, previous studies have used DL-based methods for frog call analysis. However, the performance of DL-based systems is strongly affected by their input (feature representation). In this study, we develop a frog calling activity detection system for continuous field recordings using a lightweight convolutional neural network (CNN) with multi-view spectrograms. Specifically, a sliding window is first applied to the continuous recordings to obtain audio segments of fixed duration. Then, background noise is filtered out. Next, each segment is characterized by a multi-view spectrogram, which carries more distinctive information than a single-view spectrogram. Finally, a lightweight CNN model with a twin loss is used to detect frog calling activity, and different train and test sets are used to validate the model's robustness. Our experimental results indicate that the highest macro F1-scores were 99.6 ± 0.2 and 96.4 ± 2.0 when using the 2016 and 2017 data for training, respectively, with CNN-GAP as the model and the multi-view spectrogram as the input.
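
Two steps named in the summary, the fixed-duration sliding window and the macro F1-score, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the window and hop lengths, and the toy labels are assumptions chosen for illustration, and the record does not state the segment duration or overlap actually used.

```python
from typing import List, Sequence, Tuple


def sliding_windows(n_samples: int, win: int, hop: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of fixed-length segments.

    win and hop are hypothetical parameters; a trailing remainder
    shorter than win is dropped.
    """
    if win <= 0 or hop <= 0:
        raise ValueError("win and hop must be positive")
    return [(s, s + win) for s in range(0, n_samples - win + 1, hop)]


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores, in the range [0, 1]."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted samples scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy example: 10 samples, window of 4, hop of 2.
    print(sliding_windows(10, 4, 2))
    # Toy labels: 1 = calling activity, 0 = no activity.
    print(macro_f1([1, 1, 0, 0], [1, 0, 0, 0]))
```

Because the macro average weights each class equally, it stays informative when segments without calling activity far outnumber segments with it. The summary reports the score on a 0 to 100 scale, whereas this sketch returns a value between 0 and 1.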
