MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets

Small molecule lipophilicity is often included in generalized rules for medicinal chemistry. These rules aim to reduce time, effort, costs, and attrition rates in drug discovery, allowing the rejection or prioritization of compounds without the need for synthesis and testing. The availability of hig...

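Bibliographic Details
Main authors: Yan-Kai Chen, Steven Shave, Manfred Auer
Format: article
Language: EN
Published: MDPI AG 2021
Subjects: lipophilicity prediction; logP prediction; transfer learning; physicochemical property prediction; Chemical technology; Chemistry
Online access: https://doaj.org/article/9e399a6a966d4276ac9fafce1d8b1586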
id oai:doaj.org-article:9e399a6a966d4276ac9fafce1d8b1586
record_format dspace
spelling oai:doaj.org-article:9e399a6a966d4276ac9fafce1d8b1586
2021-11-25T18:51:35Z
MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets
10.3390/pr9112029
2227-9717
https://doaj.org/article/9e399a6a966d4276ac9fafce1d8b1586
2021-11-01T00:00:00Z
https://www.mdpi.com/2227-9717/9/11/2029
https://doaj.org/toc/2227-9717
Small molecule lipophilicity is often included in generalized rules for medicinal chemistry. These rules aim to reduce time, effort, costs, and attrition rates in drug discovery, allowing the rejection or prioritization of compounds without the need for synthesis and testing. The availability of high quality, abundant training data for machine learning methods can be a major limiting factor in building effective property predictors. We utilize transfer learning techniques to get around this problem, first learning on a large amount of low accuracy predicted logP values before finally tuning our model using a small, accurate dataset of 244 druglike compounds to create MRlogP, a neural network-based predictor of logP capable of outperforming state of the art freely available logP prediction methods for druglike small molecules. MRlogP achieves an average root mean squared error of 0.988 and 0.715 against druglike molecules from Reaxys and PHYSPROP. We have made the trained neural network predictor and all associated code for descriptor generation freely available. In addition, MRlogP may be used online via a web interface.
Yan-Kai Chen
Steven Shave
Manfred Auer
MDPI AG
article
lipophilicity prediction
logP prediction
transfer learning
physicochemical property prediction
Chemical technology
TP1-1185
Chemistry
QD1-999
EN
Processes, Vol 9, Iss 2029, p 2029 (2021)
institution DOAJ
collection DOAJ
language EN
topic lipophilicity prediction
logP prediction
transfer learning
physicochemical property prediction
Chemical technology
TP1-1185
Chemistry
QD1-999
spellingShingle lipophilicity prediction
logP prediction
transfer learning
physicochemical property prediction
Chemical technology
TP1-1185
Chemistry
QD1-999
Yan-Kai Chen
Steven Shave
Manfred Auer
MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets
description Small molecule lipophilicity is often included in generalized rules for medicinal chemistry. These rules aim to reduce time, effort, costs, and attrition rates in drug discovery, allowing the rejection or prioritization of compounds without the need for synthesis and testing. The availability of high quality, abundant training data for machine learning methods can be a major limiting factor in building effective property predictors. We utilize transfer learning techniques to get around this problem, first learning on a large amount of low accuracy predicted logP values before finally tuning our model using a small, accurate dataset of 244 druglike compounds to create MRlogP, a neural network-based predictor of logP capable of outperforming state of the art freely available logP prediction methods for druglike small molecules. MRlogP achieves an average root mean squared error of 0.988 and 0.715 against druglike molecules from Reaxys and PHYSPROP. We have made the trained neural network predictor and all associated code for descriptor generation freely available. In addition, MRlogP may be used online via a web interface.
format article
author Yan-Kai Chen
Steven Shave
Manfred Auer
author_facet Yan-Kai Chen
Steven Shave
Manfred Auer
author_sort Yan-Kai Chen
title MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets
title_short MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets
title_full MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets
title_fullStr MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets
title_full_unstemmed MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets
title_sort mrlogp: transfer learning enables accurate logp prediction using small experimental training datasets
publisher MDPI AG
publishDate 2021
url https://doaj.org/article/9e399a6a966d4276ac9fafce1d8b1586
work_keys_str_mv AT yankaichen mrlogptransferlearningenablesaccuratelogppredictionusingsmallexperimentaltrainingdatasets
AT stevenshave mrlogptransferlearningenablesaccuratelogppredictionusingsmallexperimentaltrainingdatasets
AT manfredauer mrlogptransferlearningenablesaccuratelogppredictionusingsmallexperimentaltrainingdatasets
_version_ 1718410664091844608
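
The description field in this record outlines a two-stage workflow: learn first from a large set of low-accuracy predicted logP values, then tune on a small, accurate set of 244 compounds, and report root mean squared error (RMSE). The sketch below illustrates only that general pattern. It is not the authors' MRlogP code: it swaps the neural network for a one-feature linear model, and the data, the helper names (`rmse`, `train`, `predict`, `true_logp`), the learning rates, and every number other than 244 are assumptions made up for illustration.

```python
import math
import random


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def train(weights, xs, ys, learning_rate, epochs):
    """Full-batch gradient descent on a toy linear model y = w*x + b."""
    w, b = weights
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(weights, xs):
    """Apply the linear model to a list of descriptor values."""
    w, b = weights
    return [w * x + b for x in xs]


def true_logp(x):
    """Hypothetical ground-truth relation between one descriptor and logP."""
    return 1.5 * x + 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Stage 1 data (synthetic): many labels from a cheap predictor that is
# both noisy and systematically biased by +0.4 log units.
big_x = [rng.uniform(-2.0, 2.0) for _ in range(2000)]
big_y = [true_logp(x) + 0.4 + rng.gauss(0.0, 0.5) for x in big_x]

# Stage 2 data (synthetic): 244 accurate labels with little noise.
small_x = [rng.uniform(-2.0, 2.0) for _ in range(244)]
small_y = [true_logp(x) + rng.gauss(0.0, 0.05) for x in small_x]

# Pre-train from scratch on the large noisy set, then continue training
# from those weights on the small accurate set with a lower learning rate.
pretrained = train((0.0, 0.0), big_x, big_y, learning_rate=0.05, epochs=200)
fine_tuned = train(pretrained, small_x, small_y, learning_rate=0.01, epochs=200)

rmse_before = rmse(predict(pretrained, small_x), small_y)
rmse_after = rmse(predict(fine_tuned, small_x), small_y)
print(f"RMSE on accurate set before fine-tuning: {rmse_before:.3f}")
print(f"RMSE on accurate set after fine-tuning:  {rmse_after:.3f}")
```

In this toy setting the pre-trained weights inherit the bias of the cheap labels, and the fine-tuning pass mainly corrects that offset. The RMSE here is computed on the same 244 points used for tuning, so it only shows the mechanics; a real evaluation would score held-out compounds.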