Sequential Learning of Principal Curves: Summarizing Data Streams on the Fly

Bibliographic details
Main authors:	Le Li, Benjamin Guedj
Format:	article
Language:	EN
Published:	MDPI AG 2021
Subjects:	sequential learning; principal curves; data streams; regret bounds; greedy algorithm; sleeping experts; Science; Q; Astrophysics; QB460-466; Physics; QC1-999
Online access:	https://doaj.org/article/16617bdbcbda4d09a3e9fb9cc38aeb99

Description
Abstract:	When confronted with massive data streams, summarizing data with dimension reduction methods such as PCA raises theoretical and algorithmic pitfalls. A principal curve acts as a nonlinear generalization of PCA, and the present paper proposes a novel algorithm to automatically and sequentially learn principal curves from data streams. We show that our procedure is supported by regret bounds with optimal sublinear remainder terms. A greedy local search implementation (called slpc, for sequential learning principal curves) that incorporates both sleeping experts and multi-armed bandit ingredients is presented, along with its regret computation and performance on synthetic and real-life data.
