Federated Learning for Privacy-Preserving Speaker Recognition

State-of-the-art speaker recognition systems are usually trained on a single computer using speech data collected from multiple users. However, these speech samples may contain private information that users may not be willing to share. To overcome potential breaches of privacy, we investigate the use of federated learning with and without secure aggregators for both supervised and unsupervised speaker recognition systems. Federated learning enables training of a shared model without sharing private data, by training the models on the edge devices where the data resides. In the proposed system, each edge device trains an individual model, which is subsequently sent to a secure aggregator or directly to the main server. To provide contrasting data without transmitting data, we use a generative adversarial network to generate imposter data at the edge. Afterwards, the secure aggregator or the main server merges the individual models, builds a global model, and transmits the global model back to the edge devices. Experimental results on the Voxceleb-1 dataset show that the use of federated learning for both supervised and unsupervised speaker recognition systems provides two advantages. Firstly, it retains privacy, since the raw data does not leave the edge devices. Secondly, the aggregated model provides a better average equal error rate than the individual models when the federated model does not use a secure aggregator. Thus, our results quantify the challenges in the practical application of privacy-preserving training of speaker recognition systems, in particular in terms of the trade-off between privacy and accuracy.
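As a rough illustration of the model-merging step described in the abstract, the following Python sketch averages the parameters of edge-trained models into a global model (a minimal FedAvg-style aggregation under assumed data structures; the function and variable names are hypothetical and not taken from the paper):

import numpy as np

def aggregate_models(edge_models, weights=None):
    """Merge edge-trained models by (weighted) parameter averaging.

    edge_models: list of dicts mapping parameter names to numpy arrays,
                 one dict per edge device.
    weights: optional per-device weights (e.g. proportional to local data
             size); defaults to a uniform average.
    """
    if weights is None:
        weights = [1.0 / len(edge_models)] * len(edge_models)
    global_model = {}
    for name in edge_models[0]:
        # Weighted sum of each parameter across the individual models.
        global_model[name] = sum(w * m[name] for w, m in zip(weights, edge_models))
    return global_model

# Example: three edge devices, each contributing a tiny two-parameter model.
edge_models = [
    {"w": np.array([0.1, 0.2]), "b": np.array([0.0])},
    {"w": np.array([0.3, 0.1]), "b": np.array([0.1])},
    {"w": np.array([0.2, 0.3]), "b": np.array([-0.1])},
]
global_model = aggregate_models(edge_models)
print(global_model)  # averaged parameters, broadcast back to the edge devices

In the setting the paper describes, this merging would be performed either by the main server or by a secure aggregator, so that the server only ever sees model parameters (or their aggregate), never the raw speech data.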


Bibliographic Details
Main Authors: Abraham Woubie, Tom Backstrom
Format: article
Language: EN
Published: IEEE 2021
Subjects: Edge computation; federated learning; privacy; secure aggregator; speaker recognition; Electrical engineering. Electronics. Nuclear engineering (TK1-9971)
Online Access: https://doaj.org/article/20568ae1fca8458dab50743444594d16
Record ID: oai:doaj.org-article:20568ae1fca8458dab50743444594d16
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3124029
Published in: IEEE Access, Vol 9, Pp 149477-149485 (2021)
Online at IEEE Xplore: https://ieeexplore.ieee.org/document/9592761/