Deep neural networks for genomic prediction do not estimate marker effects

Bibliographic details
Main authors: Jordan Ubbens, Isobel Parkin, Christina Eynck, Ian Stavness, Andrew G. Sharpe
Format: article
Language: EN
Published: Wiley 2021
Online access: https://doaj.org/article/33a2fce533d7499b90a08552cbaacd06
Description
Summary: Genomic prediction is a promising technology for advancing both plant and animal breeding, and many different prediction models have been evaluated in the literature. It has been suggested that the ability of powerful nonlinear models, such as deep neural networks, to capture complex epistatic effects between markers offers advantages for genomic prediction. However, these methods tend not to outperform classical linear methods, leaving open the question of why this capacity to model nonlinear effects does not seem to result in better predictive capability. In this work, we propose the theory that, because of a previously described principle called shortcut learning, deep neural networks tend to base their predictions on overall genetic relatedness rather than on the effects of particular markers, such as epistatic effects. Using several datasets of crop plants [lentil (Lens culinaris Medik.), wheat (Triticum aestivum L.), and Brassica carinata A. Braun], we demonstrate the network's indifference to the values of the markers by showing that the same network, provided with only the locations of matches between markers for two individuals, is able to perform prediction to the same level of accuracy.
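
The idea of giving a network "only the locations of matches between markers for two individuals" can be illustrated with a small sketch. This is not the authors' implementation: the 0/1/2 genotype coding, the function names `match_mask` and `match_fraction`, and the example genotypes are all assumptions made for illustration. The sketch shows that a match mask discards the marker values themselves while still carrying a simple measure of overall similarity between two individuals.

```python
from typing import List, Sequence


def match_mask(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Return 1 at each position where both individuals have the same
    marker call and 0 elsewhere. The marker values are not retained."""
    if len(a) != len(b):
        raise ValueError("marker vectors must have the same length")
    return [1 if x == y else 0 for x, y in zip(a, b)]


def match_fraction(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of markers at which the two individuals match; a crude
    stand-in for overall genetic relatedness (assumption, not the paper's
    relatedness measure)."""
    mask = match_mask(a, b)
    if not mask:
        raise ValueError("marker vectors must not be empty")
    return sum(mask) / len(mask)


# Hypothetical genotypes, coded as 0/1/2 copies of the alternate allele.
ind_a = [0, 1, 2, 2, 0, 1]
ind_b = [0, 2, 2, 1, 0, 1]

print(match_mask(ind_a, ind_b))      # [1, 0, 1, 0, 1, 1]
print(match_fraction(ind_a, ind_b))  # 4 of 6 markers match
```

Under these assumptions, a model trained on such masks cannot learn the effect of any specific allele, so comparable accuracy from masks alone would be consistent with predictions driven by relatedness rather than marker effects.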