MUfoldQA_G: High-accuracy protein model QA via retraining and transformation

Protein tertiary structure prediction is an active research area and has attracted significant attention recently due to the success of AlphaFold from DeepMind. Methods capable of accurately evaluating the quality of predicted models are of great importance. In the past, although many model quality...

Bibliographic details
Main authors:	Wenbo Wang, Junlin Wang, Zhaoyu Li, Dong Xu, Yi Shang
Format:	article
Language:	EN
Published:	Elsevier 2021
Subjects:	Protein structure prediction; Protein model quality assessment; Multi-model QA methods; Biotechnology; TP248.13-248.65
Online access:	https://doaj.org/article/1e7d5a1b894340a7aa9ad7e10b6b75b9

id	oai:doaj.org-article:1e7d5a1b894340a7aa9ad7e10b6b75b9
record_format	dspace
datestamp	2021-11-30T04:15:26Z
date	2021-01-01T00:00:00Z
issn	2001-0370
doi	10.1016/j.csbj.2021.11.021
fulltext	http://www.sciencedirect.com/science/article/pii/S2001037021004864
journal_toc	https://doaj.org/toc/2001-0370
source	Computational and Structural Biotechnology Journal, Vol 19, Pp 6282-6290 (2021)
institution	DOAJ
collection	DOAJ
language	EN
topic	Protein structure prediction; Protein model quality assessment; Multi-model QA methods; Biotechnology; TP248.13-248.65
description	Protein tertiary structure prediction is an active research area and has attracted significant attention recently due to the success of AlphaFold from DeepMind. Methods capable of accurately evaluating the quality of predicted models are of great importance. In the past, although many model quality assessment (QA) methods have been developed, their accuracies are not consistently high across different QA performance metrics for diverse target proteins. In this paper, we propose MUfoldQA_G, a new multi-model QA method that aims at simultaneously optimizing Pearson correlation and average GDT-TS difference, two commonly used QA performance metrics. This method is based on two new algorithms MUfoldQA_Gp and MUfoldQA_Gr. MUfoldQA_Gp uses a new technique to combine information from protein templates and reference protein models to maximize the Pearson correlation QA metric. MUfoldQA_Gr employs a new machine learning technique that resamples training data and retrains adaptively to learn a consensus model that is better than naïve consensus while minimizing average GDT-TS difference. MUfoldQA_G uses a new method to combine the results of MUfoldQA_Gr and MUfoldQA_Gp so that the final QA prediction results achieve low average GDT-TS difference that is close to the results from MUfoldQA_Gr, while maintaining high Pearson correlation that is the same as the results from MUfoldQA_Gp. In CASP14 QA categories, MUfoldQA_G ranked No. 1 in Pearson correlation and No. 2 in average GDT-TS difference.
format	article
author	Wenbo Wang, Junlin Wang, Zhaoyu Li, Dong Xu, Yi Shang
title	MUfoldQA_G: High-accuracy protein model QA via retraining and transformation
publisher	Elsevier
publishDate	2021
url	https://doaj.org/article/1e7d5a1b894340a7aa9ad7e10b6b75b9
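
The description above names two QA performance metrics, Pearson correlation and average GDT-TS difference, and refers to a naïve consensus baseline. The following is a minimal Python sketch of those three quantities, not the authors' implementation. It assumes that the average GDT-TS difference is the mean absolute difference between predicted and true per-model scores, and that naïve consensus scores a model by its mean pairwise similarity to the other models in the pool. All function names and example numbers are hypothetical.

```python
import math


def pearson(pred, true):
    """Pearson correlation between predicted and true per-model scores."""
    n = len(pred)
    if n != len(true) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    mean_p = sum(pred) / n
    mean_t = sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in true)
    if var_p == 0 or var_t == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / math.sqrt(var_p * var_t)


def average_gdt_ts_difference(pred, true):
    """Assumed definition: mean absolute error of the predicted scores."""
    if len(pred) != len(true) or not pred:
        raise ValueError("need two equal-length, non-empty lists")
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(pred)


def naive_consensus(pairwise):
    """Score each model by its mean similarity to every other model.

    pairwise[i][j] is a similarity (e.g. GDT-TS) between models i and j.
    """
    n = len(pairwise)
    if n < 2:
        raise ValueError("need at least 2 models")
    return [
        sum(pairwise[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical pool of three models for one target.
    pairwise = [
        [1.0, 0.8, 0.4],
        [0.8, 1.0, 0.6],
        [0.4, 0.6, 1.0],
    ]
    true_scores = [0.65, 0.70, 0.40]  # hypothetical true GDT-TS values
    predicted = naive_consensus(pairwise)
    print("predicted:", predicted)
    print("Pearson:", pearson(predicted, true_scores))
    print("avg GDT-TS diff:", average_gdt_ts_difference(predicted, true_scores))
```

Reporting both numbers matters because a predictor can rank models correctly (high Pearson correlation) while being systematically offset from the true scores (large average difference), which is the trade-off the description says MUfoldQA_G is designed to address.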
