Adaptive Data Compression for Classification Problems

Data subset selection is a crucial task in deploying machine learning algorithms under strict constraints regarding memory and computation resources. Despite extensive research in this area, a practical difficulty is the lack of rigorous strategies for identifying the optimal size of the reduced dat...

Bibliographic Details
Main authors:	Farhad Pourkamali-Anaraki, Walter D. Bennette
Format:	article
Language:	EN
Published:	IEEE 2021
Subjects:	Adaptive algorithms; supervised learning; classification algorithms; compression algorithms; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
Online access:	https://doaj.org/article/81109cc807f3452398476c90783b5d60

id	oai:doaj.org-article:81109cc807f3452398476c90783b5d60
record_format	dspace
spelling	oai:doaj.org-article:81109cc807f3452398476c90783b5d60 | 2021-12-03T00:01:08Z | Adaptive Data Compression for Classification Problems | 2169-3536 | 10.1109/ACCESS.2021.3130551 | https://doaj.org/article/81109cc807f3452398476c90783b5d60 | 2021-01-01T00:00:00Z | https://ieeexplore.ieee.org/document/9627182/ | https://doaj.org/toc/2169-3536 | Data subset selection is a crucial task in deploying machine learning algorithms under strict constraints regarding memory and computation resources. Despite extensive research in this area, a practical difficulty is the lack of rigorous strategies for identifying the optimal size of the reduced data to regulate trade-offs between accuracy and efficiency. Furthermore, existing methods are often built around specific machine learning models, and translating existing theoretical results into practice is challenging for practitioners. To address these problems, we propose two adaptive compression algorithms for classification problems by formulating data subset selection in the form of interactive teaching. The user interacts with the learning task at hand to adapt to the unique structure of the problem at hand, developing an iterative importance sampling scheme. We also propose to couple importance sampling and a diversity criterion to further control the evolution of the data summary over the rounds of interaction. We conduct extensive experiments on several data sets, including imbalanced and multiclass data, and various classification algorithms, such as ensemble learning and neural networks. Our results demonstrate the performance, efficiency, and ease of implementation of the underlying framework. | Farhad Pourkamali-Anaraki | Walter D. Bennette | IEEE | article | Adaptive algorithms | supervised learning | classification algorithms | compression algorithms | Electrical engineering. Electronics. Nuclear engineering | TK1-9971 | EN | IEEE Access, Vol 9, Pp 157654-157669 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Adaptive algorithms; supervised learning; classification algorithms; compression algorithms; Electrical engineering. Electronics. Nuclear engineering; TK1-9971
description	Data subset selection is a crucial task in deploying machine learning algorithms under strict constraints regarding memory and computation resources. Despite extensive research in this area, a practical difficulty is the lack of rigorous strategies for identifying the optimal size of the reduced data to regulate trade-offs between accuracy and efficiency. Furthermore, existing methods are often built around specific machine learning models, and translating existing theoretical results into practice is challenging for practitioners. To address these problems, we propose two adaptive compression algorithms for classification problems by formulating data subset selection in the form of interactive teaching. The user interacts with the learning task at hand to adapt to the unique structure of the problem at hand, developing an iterative importance sampling scheme. We also propose to couple importance sampling and a diversity criterion to further control the evolution of the data summary over the rounds of interaction. We conduct extensive experiments on several data sets, including imbalanced and multiclass data, and various classification algorithms, such as ensemble learning and neural networks. Our results demonstrate the performance, efficiency, and ease of implementation of the underlying framework.
format	article
author	Farhad Pourkamali-Anaraki; Walter D. Bennette
title	Adaptive Data Compression for Classification Problems
publisher	IEEE
publishDate	2021
url	https://doaj.org/article/81109cc807f3452398476c90783b5d60
