A divided and prioritized experience replay approach for streaming regression

Bibliographic Details
Main Authors: Mikkel Leite Arnø, John-Morten Godhavn, Ole Morten Aamo
Format: article
Language: EN
Published: Elsevier 2021
Subjects: Q
Online Access: https://doaj.org/article/0c6548a53bbf4b5aa7205376e59d3d02
Description
Summary: In the streaming learning setting, an agent is presented with a data stream from which to learn in an online fashion. A common problem is catastrophic forgetting of old knowledge due to updates to the model. Mitigating catastrophic forgetting has received considerable attention, and a variety of methods exist to address this problem. In this paper, we present a divided and prioritized experience replay approach for streaming regression, in which relevant observations are retained in the replay memory and extra focus is placed on poorly estimated observations through prioritization. Using a real-world dataset, the method is compared to the standard sliding-window approach. A statistical power analysis is performed, showing how our approach improves performance on rare, important events at a trade-off in performance on more common observations. Close inspections of the dataset are provided, with emphasis on areas where the standard approach fails. The problem is also rephrased as a binary classification problem to separate common events from rare, important ones; these results provide an added perspective on the improvement made on rare events.
• We divide the prediction space in a streaming regression setting.
• Observations in the experience replay are prioritized for further training by the model's current error.
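
The abstract describes the mechanism only at a high level. As a rough illustration, and not the authors' implementation, the Python sketch below shows one way a divided, error-prioritized replay memory could be organized, assuming the prediction (target) space is split by fixed bin edges, priorities are the model's current absolute errors, and the model exposes a scikit-learn-style predict method; the class name DividedPrioritizedReplay and all parameters are hypothetical.

```python
import numpy as np

class DividedPrioritizedReplay:
    """Illustrative sketch (not the paper's exact design): a replay
    memory divided into bins over the target space, with replay
    sampling prioritized by the model's current absolute error."""

    def __init__(self, bin_edges, capacity_per_bin=100):
        self.bin_edges = np.asarray(bin_edges)   # divides the prediction space
        self.capacity = capacity_per_bin
        # One list of (x, y) pairs per region of the target space.
        self.bins = [[] for _ in range(len(bin_edges) + 1)]

    def add(self, x, y):
        """Retain an observation in the bin matching its target value."""
        b = int(np.digitize(y, self.bin_edges))
        self.bins[b].append((x, y))
        if len(self.bins[b]) > self.capacity:    # drop the oldest when full
            self.bins[b].pop(0)

    def sample(self, model, batch_size=32):
        """Sample stored observations with probability proportional to the
        model's current absolute error (the prioritization step)."""
        data = [pair for b in self.bins for pair in b]
        if not data:
            return []
        X = np.stack([x for x, _ in data])
        y = np.array([t for _, t in data])
        errors = np.abs(model.predict(X) - y) + 1e-8   # avoid zero priority
        probs = errors / errors.sum()
        idx = np.random.choice(len(data),
                               size=min(batch_size, len(data)),
                               p=probs, replace=False)
        return [data[i] for i in idx]
```

Under these assumptions, a streaming loop would interleave a model update on each incoming observation with updates on replay batches drawn this way, so that poorly estimated regions, which often correspond to the rare, important events the paper targets, receive extra training focus.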