Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network.

Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences to introduce the desired properties. Challenging design problems necessitate the representation of a protein by means of a structural ensemble. Thus, Ros...

Bibliographic Details
Main authors: Julian Nazet, Elmar Lang, Rainer Merkl
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2021
Subjects:
R
Q
Online access: https://doaj.org/article/6796b26ee9344f37b55d5c9ec3cd23aa
id oai:doaj.org-article:6796b26ee9344f37b55d5c9ec3cd23aa
record_format dspace
spelling oai:doaj.org-article:6796b26ee9344f37b55d5c9ec3cd23aa
2021-12-02T20:19:29Z
Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network.
1932-6203
10.1371/journal.pone.0256691
https://doaj.org/article/6796b26ee9344f37b55d5c9ec3cd23aa
2021-01-01T00:00:00Z
https://doi.org/10.1371/journal.pone.0256691
https://doaj.org/toc/1932-6203
Julian Nazet
Elmar Lang
Rainer Merkl
Public Library of Science (PLoS)
article
Medicine
R
Science
Q
EN
PLoS ONE, Vol 16, Iss 8, p e0256691 (2021)
institution DOAJ
collection DOAJ
language EN
topic Medicine
R
Science
Q
description Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences that introduce the desired properties. Challenging design problems necessitate the representation of a protein by means of a structural ensemble. Thus, Rosetta multi-state design (MSD) protocols have been developed wherein each state represents one protein conformation. The computational demands of MSD protocols are high because, for each candidate sequence, a costly three-dimensional (3D) model has to be created and assessed for all states. Each of these scores contributes one data point to a complex, design-specific energy landscape. As neural networks (NNs) have proved well suited to learning such solution spaces, we integrated one into the framework Rosetta:MSF in place of the genetic algorithm used so far, with the aim of reducing computational costs. Like its predecessor, Rosetta:MSF:NN administers a set of candidate sequences and their scores and scans sequence space iteratively. During each iteration, the union of all candidate sequences and their Rosetta scores is used to re-train NNs that possess a design-specific architecture. The enormous speed of the NNs allows an extensive assessment of alternative sequences, which are ranked by the scores predicted by the NN. Costly 3D models are computed only for a small fraction of the best-scoring sequences; these and the corresponding 3D-based scores replace half of the candidate sequences during each iteration. The analysis of two sets of candidate sequences generated for a specific design problem by means of a genetic algorithm confirmed that the NN predicted 3D-based scores quite well; the Pearson correlation coefficient was at least 0.95. Applying Rosetta:MSF:NN:enzdes to a benchmark consisting of 16 ligand-binding problems showed that this protocol converges ten times faster than the genetic algorithm and finds sequences with comparable scores.
format article
author Julian Nazet
Elmar Lang
Rainer Merkl
title Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network.
publisher Public Library of Science (PLoS)
publishDate 2021
url https://doaj.org/article/6796b26ee9344f37b55d5c9ec3cd23aa
_version_ 1718374167792844800
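
The iterative scheme summarized in the description field can be illustrated with a small toy sketch. It is not the authors' implementation and uses no Rosetta API. Everything in it is an assumption made for illustration: `expensive_score` is a hypothetical stand-in for costly 3D modelling (it counts mismatches to a hidden target, lower is better), `SiteSurrogate` is a simple lookup-table regressor standing in for the neural network, and the population size, number of proposals, and single-point mutation move are arbitrary choices. What it does mirror is the loop structure: re-train a cheap model on all scored sequences, rank many alternatives with it, compute the costly score only for a small best-ranked fraction, and let those replace half of the candidate set.

```python
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def expensive_score(seq, target):
    """Hypothetical stand-in for costly 3D modelling; lower is better."""
    return sum(a != b for a, b in zip(seq, target))


class SiteSurrogate:
    """Cheap stand-in for the NN: mean observed score per (position, residue)."""

    def fit(self, seqs, scores):
        self.default = sum(scores) / len(scores)
        sums, counts = {}, {}
        for seq, score in zip(seqs, scores):
            for key in enumerate(seq):  # key is (position, residue)
                sums[key] = sums.get(key, 0.0) + score
                counts[key] = counts.get(key, 0) + 1
        self.table = {key: sums[key] / counts[key] for key in sums}

    def predict(self, seq):
        # Unseen (position, residue) pairs fall back to the global mean.
        return sum(self.table.get(key, self.default) for key in enumerate(seq))


def mutate(seq, rng):
    """Return a copy of seq with one randomly chosen position redrawn."""
    i = rng.randrange(len(seq))
    return seq[:i] + rng.choice(ALPHABET) + seq[i + 1:]


def design(target, pop_size=20, iterations=15, proposals=500, seed=0):
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(ALPHABET) for _ in range(len(target)))
        for _ in range(pop_size)
    ]
    # Every sequence that has ever received a costly score.
    scored = {s: expensive_score(s, target) for s in population}
    history = []
    for _ in range(iterations):
        # Re-train the surrogate on all scored sequences (an assumption
        # about what "the union of all candidate sequences" covers).
        surrogate = SiteSurrogate()
        surrogate.fit(list(scored), list(scored.values()))
        # Cheap step: propose many alternatives and rank them by prediction.
        candidates = {mutate(rng.choice(population), rng) for _ in range(proposals)}
        candidates -= set(scored)
        ranked = sorted(sorted(candidates), key=surrogate.predict)
        # Costly step: score only a small, best-ranked fraction.
        chosen = ranked[: pop_size // 2]
        for s in chosen:
            scored[s] = expensive_score(s, target)
        # The newly scored sequences replace the worse half of the population.
        survivors = sorted(population, key=scored.get)[: pop_size - len(chosen)]
        population = survivors + chosen
        history.append(min(scored[s] for s in population))
    best = min(population, key=scored.get)
    return best, history, len(scored)


best, history, n_evals = design("MKTAYIAKQRQI")
print(best, history[-1], n_evals)
```

With these defaults the sketch spends 20 costly evaluations on the initial set and 10 per iteration, while the surrogate ranks up to 500 proposals per iteration. That ratio is the point of the approach described in the abstract; the specific numbers here are illustrative only.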