Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly

When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. A principal curve acts as a nonlinear generalization of PCA, and the present paper proposes a novel algorithm to automatically and sequentially learn...

Bibliographic Details
Main authors: Le Li, Benjamin Guedj
Format: article
Language: EN
Published: MDPI AG 2021
Subjects:
Q
Online access: https://doaj.org/article/16617bdbcbda4d09a3e9fb9cc38aeb99
id oai:doaj.org-article:16617bdbcbda4d09a3e9fb9cc38aeb99
record_format dspace
spelling oai:doaj.org-article:16617bdbcbda4d09a3e9fb9cc38aeb99
2021-11-25T17:30:42Z
Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly
10.3390/e23111534
1099-4300
https://doaj.org/article/16617bdbcbda4d09a3e9fb9cc38aeb99
2021-11-01T00:00:00Z
https://www.mdpi.com/1099-4300/23/11/1534
https://doaj.org/toc/1099-4300
When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. A principal curve acts as a nonlinear generalization of PCA, and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our procedure is supported by regret bounds with optimal sublinear remainder terms. A greedy local search implementation (called slpc, for sequential learning principal curves) that incorporates both sleeping experts and multi-armed bandit ingredients is presented, along with its regret computation and performance on synthetic and real-life data.
Le Li
Benjamin Guedj
MDPI AG
article
sequential learning
principal curves
data streams
regret bounds
greedy algorithm
sleeping experts
Science
Q
Astrophysics
QB460-466
Physics
QC1-999
EN
Entropy, Vol 23, Iss 11, p 1534 (2021)
institution DOAJ
collection DOAJ
language EN
topic sequential learning
principal curves
data streams
regret bounds
greedy algorithm
sleeping experts
Science
Q
Astrophysics
QB460-466
Physics
QC1-999
description When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. A principal curve acts as a nonlinear generalization of PCA, and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our procedure is supported by regret bounds with optimal sublinear remainder terms. A greedy local search implementation (called slpc, for sequential learning principal curves) that incorporates both sleeping experts and multi-armed bandit ingredients is presented, along with its regret computation and performance on synthetic and real-life data.
format article
author Le Li
Benjamin Guedj
title Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/16617bdbcbda4d09a3e9fb9cc38aeb99