Meta-Optimization of Bias-Variance Trade-Off in Stochastic Model Learning
Model-based reinforcement learning is expected to be a method that can safely acquire the optimal policy under real-world conditions by using a stochastic dynamics model for planning. Since the stochastic dynamics model of the real world is generally unknown, a method for learning it from state transition data is necessary.
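To make the scalarized loss described in the abstract concrete, below is a minimal sketch of an augmented weighted Tchebycheff scalarization of a bias term and a variance term, reconstructed only from the abstract. The bias/variance surrogates (squared error of the predicted mean and mean predicted variance of a Gaussian model), the weight parametrization (w, 1 - w), the ideal point, and the augmentation coefficient rho are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def augmented_tchebycheff(losses, weights, ideal, rho=0.05):
    """Augmented weighted Tchebycheff scalarization of a list of loss terms.

    Scalarized loss = max_i w_i * (f_i - z_i) + rho * sum_i w_i * (f_i - z_i),
    where f is the vector of objectives, z the ideal point, w the trade-off
    weights, and rho a small augmentation coefficient (sketch, not the
    paper's exact formulation).
    """
    f = torch.stack(losses)
    z = torch.as_tensor(ideal, dtype=f.dtype)
    w = torch.as_tensor(weights, dtype=f.dtype)
    weighted = w * (f - z)
    return weighted.max() + rho * weighted.sum()

# Toy usage: bias term = squared error of the predicted mean,
# variance term = mean predicted variance of a Gaussian model
# (illustrative surrogates only).
mean = torch.randn(32, requires_grad=True)
log_var = torch.zeros(32, requires_grad=True)
target = torch.randn(32)
bias_loss = ((mean - target) ** 2).mean()
var_loss = log_var.exp().mean()
w = 0.7  # trade-off hyperparameter; task-dependent, tuned by meta-optimization
loss = augmented_tchebycheff([bias_loss, var_loss], [w, 1.0 - w], ideal=[0.0, 0.0])
loss.backward()
```

The augmentation term rho * sum_i w_i (f_i - z_i) is the standard reason for preferring the augmented form over the plain weighted Tchebycheff norm: it avoids solutions that are only weakly Pareto-optimal.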
Saved in:
Main Authors: | Takumi Aotani, Taisuke Kobayashi, Kenji Sugimoto |
---|---|
Format: | article |
Language: | EN |
Published: | IEEE, 2021 |
Subjects: | Machine learning algorithms; systems modeling; Pareto optimization; bias-variance trade-off; Electrical engineering. Electronics. Nuclear engineering (TK1-9971) |
Online Access: | https://doaj.org/article/af14030d799748c8865f08da2aa6ba56 |
id |
oai:doaj.org-article:af14030d799748c8865f08da2aa6ba56 |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:af14030d799748c8865f08da2aa6ba56 (2021-11-18T00:05:49Z)
Meta-Optimization of Bias-Variance Trade-Off in Stochastic Model Learning
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3125000
https://doaj.org/article/af14030d799748c8865f08da2aa6ba56
https://ieeexplore.ieee.org/document/9599708/
https://doaj.org/toc/2169-3536
Published: 2021-01-01
Authors: Takumi Aotani; Taisuke Kobayashi; Kenji Sugimoto
Publisher: IEEE
Subjects: Machine learning algorithms; systems modeling; Pareto optimization; bias-variance trade-off; Electrical engineering. Electronics. Nuclear engineering (TK1-9971)
Language: EN
Source: IEEE Access, Vol 9, Pp 148783-148799 (2021)
Abstract: see the description field below. |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
Machine learning algorithms; systems modeling; Pareto optimization; bias-variance trade-off; Electrical engineering. Electronics. Nuclear engineering; TK1-9971 |
spellingShingle |
Machine learning algorithms; systems modeling; Pareto optimization; bias-variance trade-off; Electrical engineering. Electronics. Nuclear engineering; TK1-9971; Takumi Aotani; Taisuke Kobayashi; Kenji Sugimoto; Meta-Optimization of Bias-Variance Trade-Off in Stochastic Model Learning |
description |
Model-based reinforcement learning is expected to be a method that can safely acquire the optimal policy under real-world conditions by using a stochastic dynamics model for planning. Since the stochastic dynamics model of the real world is generally unknown, a method for learning it from state transition data is necessary. However, model learning suffers from the bias-variance trade-off. Conventional model learning can be formulated as a minimization problem of the expected loss; failing to consider higher-order statistics of the loss would lead to fatal errors in long-term model prediction. Although various methods have been proposed to explicitly handle bias and variance, this paper first formulates a new loss function, specifically for sequential training of deep neural networks. To explicitly consider the bias-variance trade-off, a new multi-objective optimization problem with the augmented weighted Tchebycheff scalarization is proposed. In this problem, the bias-variance trade-off can be balanced by adjusting a weight hyperparameter, although its optimal value is task-dependent and unknown. We additionally propose a general-purpose and efficient meta-optimization method for the hyperparameter(s). Based on the validation result at each epoch, the proposed meta-optimization adjusts the hyperparameter(s) toward the preferred solution simultaneously with model learning. In our case, the proposed meta-optimization enables the bias-variance trade-off to be balanced so as to maximize long-term prediction ability. The proposed method was applied to two simulation environments with uncertainty, and the numerical results showed that a well-balanced bias and variance of the stochastic model, suitable for long-term prediction, can be achieved. |
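The description above also states that the weight hyperparameter is adjusted from the validation result at each epoch, simultaneously with model learning. The record does not specify the actual meta-optimization algorithm, so the following is only a stand-in sketch that adjusts the weight by a random-perturbation hill climb on a validation score; the names meta_optimize_weight, train_epoch, and validate are hypothetical.

```python
import random

def meta_optimize_weight(train_epoch, validate, w=0.5, step=0.05,
                         epochs=50, bounds=(0.0, 1.0)):
    """Hedged sketch of per-epoch hyperparameter meta-optimization.

    Not the paper's algorithm: the trade-off weight w is perturbed once per
    epoch and kept only if the validation long-term prediction error
    improves, so hyperparameter tuning runs alongside model learning.
    `train_epoch(w)` trains the model for one epoch with the given weight;
    `validate()` returns the validation error (lower is better).
    """
    best_score = float("inf")
    for _ in range(epochs):
        candidate = min(max(w + random.uniform(-step, step), bounds[0]), bounds[1])
        train_epoch(candidate)
        score = validate()
        if score < best_score:  # keep the perturbed weight only if it helps
            best_score, w = score, candidate
    return w

# Dummy usage with stand-in training/validation callbacks:
history = []
best_w = meta_optimize_weight(
    train_epoch=lambda w: history.append(w),
    validate=lambda: abs(history[-1] - 0.3),  # pretend 0.3 is the ideal weight
    w=0.5, epochs=30,
)
print(best_w)
```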
format |
article |
author |
Takumi Aotani; Taisuke Kobayashi; Kenji Sugimoto |
author_facet |
Takumi Aotani; Taisuke Kobayashi; Kenji Sugimoto |
author_sort |
Takumi Aotani |
title |
Meta-Optimization of Bias-Variance Trade-Off in Stochastic Model Learning |
title_short |
Meta-Optimization of Bias-Variance Trade-Off in Stochastic Model Learning |
title_full |
Meta-Optimization of Bias-Variance Trade-Off in Stochastic Model Learning |
title_fullStr |
Meta-Optimization of Bias-Variance Trade-Off in Stochastic Model Learning |
title_full_unstemmed |
Meta-Optimization of Bias-Variance Trade-Off in Stochastic Model Learning |
title_sort |
meta-optimization of bias-variance trade-off in stochastic model learning |
publisher |
IEEE |
publishDate |
2021 |
url |
https://doaj.org/article/af14030d799748c8865f08da2aa6ba56 |
work_keys_str_mv |
AT takumiaotani metaoptimizationofbiasvariancetradeoffinstochasticmodellearning AT taisukekobayashi metaoptimizationofbiasvariancetradeoffinstochasticmodellearning AT kenjisugimoto metaoptimizationofbiasvariancetradeoffinstochasticmodellearning |
_version_ |
1718425244968943616 |