MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets
Small molecule lipophilicity is often included in generalized rules for medicinal chemistry. These rules aim to reduce time, effort, costs, and attrition rates in drug discovery, allowing the rejection or prioritization of compounds without the need for synthesis and testing. The availability of hig...
Main Authors: | Yan-Kai Chen, Steven Shave, Manfred Auer |
---|---|
Format: | article |
Language: | EN |
Published: | MDPI AG, 2021 |
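Subjects: | lipophilicity prediction; logP prediction; transfer learning; physicochemical property prediction; Chemical technology (TP1-1185); Chemistry (QD1-999) |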
Online Access: | https://doaj.org/article/9e399a6a966d4276ac9fafce1d8b1586 |
id | oai:doaj.org-article:9e399a6a966d4276ac9fafce1d8b1586 |
---|---|
record_format | dspace |
spelling | oai:doaj.org-article:9e399a6a966d4276ac9fafce1d8b1586; 2021-11-25T18:51:35Z; MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets; 10.3390/pr9112029; 2227-9717; https://doaj.org/article/9e399a6a966d4276ac9fafce1d8b1586; 2021-11-01T00:00:00Z; https://www.mdpi.com/2227-9717/9/11/2029; https://doaj.org/toc/2227-9717; Small molecule lipophilicity is often included in generalized rules for medicinal chemistry. These rules aim to reduce time, effort, costs, and attrition rates in drug discovery, allowing the rejection or prioritization of compounds without the need for synthesis and testing. The availability of high quality, abundant training data for machine learning methods can be a major limiting factor in building effective property predictors. We utilize transfer learning techniques to get around this problem, first learning on a large amount of low accuracy predicted logP values before finally tuning our model using a small, accurate dataset of 244 druglike compounds to create MRlogP, a neural network-based predictor of logP capable of outperforming state of the art freely available logP prediction methods for druglike small molecules. MRlogP achieves an average root mean squared error of 0.988 and 0.715 against druglike molecules from Reaxys and PHYSPROP. We have made the trained neural network predictor and all associated code for descriptor generation freely available. In addition, MRlogP may be used online via a web interface.; Yan-Kai Chen; Steven Shave; Manfred Auer; MDPI AG; article; lipophilicity prediction; logP prediction; transfer learning; physicochemical property prediction; Chemical technology; TP1-1185; Chemistry; QD1-999; EN; Processes, Vol 9, Iss 2029, p 2029 (2021) |
institution | DOAJ |
collection | DOAJ |
language | EN |
topic | lipophilicity prediction; logP prediction; transfer learning; physicochemical property prediction; Chemical technology; TP1-1185; Chemistry; QD1-999 |
spellingShingle | lipophilicity prediction; logP prediction; transfer learning; physicochemical property prediction; Chemical technology; TP1-1185; Chemistry; QD1-999; Yan-Kai Chen; Steven Shave; Manfred Auer; MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets |
description | Small-molecule lipophilicity is often included in generalized rules for medicinal chemistry. These rules aim to reduce time, effort, costs, and attrition rates in drug discovery, allowing compounds to be rejected or prioritized without the need for synthesis and testing. The availability of high-quality, abundant training data for machine learning methods can be a major limiting factor in building effective property predictors. We use transfer learning to get around this problem: the model first learns from a large set of low-accuracy predicted logP values and is then fine-tuned on a small, accurate dataset of 244 druglike compounds. The result is MRlogP, a neural network-based predictor of logP capable of outperforming state-of-the-art freely available logP prediction methods for druglike small molecules. MRlogP achieves average root mean squared errors of 0.988 and 0.715 against druglike molecules from Reaxys and PHYSPROP, respectively. We have made the trained neural network predictor and all associated code for descriptor generation freely available. In addition, MRlogP may be used online via a web interface. |
format | article |
author | Yan-Kai Chen; Steven Shave; Manfred Auer |
author_facet | Yan-Kai Chen; Steven Shave; Manfred Auer |
author_sort | Yan-Kai Chen |
title | MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets |
title_short | MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets |
title_full | MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets |
title_fullStr | MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets |
title_full_unstemmed | MRlogP: Transfer Learning Enables Accurate logP Prediction Using Small Experimental Training Datasets |
title_sort | mrlogp: transfer learning enables accurate logp prediction using small experimental training datasets |
publisher | MDPI AG |
publishDate | 2021 |
url | https://doaj.org/article/9e399a6a966d4276ac9fafce1d8b1586 |
work_keys_str_mv | AT yankaichen mrlogptransferlearningenablesaccuratelogppredictionusingsmallexperimentaltrainingdatasets; AT stevenshave mrlogptransferlearningenablesaccuratelogppredictionusingsmallexperimentaltrainingdatasets; AT manfredauer mrlogptransferlearningenablesaccuratelogppredictionusingsmallexperimentaltrainingdatasets |
_version_ | 1718410664091844608 |
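
The description field in this record summarises a two-stage workflow: learn first from a large set of low-accuracy labels, then fine-tune on a small, accurate set, and report root mean squared error (RMSE). The sketch below illustrates only that general idea and is not the authors' implementation. MRlogP is described as a neural network working on molecular descriptors, whereas this toy uses a one-variable linear model on synthetic numbers. Every name, constant, and dataset here (`true_value`, the 0.6 bias, the noise levels, the sample sizes) is a hypothetical choice made for illustration.

```python
import math
import random


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def train(weights, xs, ys, learning_rate, epochs):
    """Full-batch gradient descent on mean squared error for y = w*x + b.

    Training starts from `weights`, so passing previously learned weights
    continues from them instead of starting from scratch.
    """
    w, b = weights
    n = len(xs)
    for _ in range(epochs):
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = 2.0 * sum(residuals) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return (w, b)


def predict(weights, xs):
    """Apply the linear model to a list of inputs."""
    w, b = weights
    return [w * x + b for x in xs]


def true_value(x):
    # Hypothetical "experimental" relationship, invented for this toy example.
    return 1.5 * x + 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Stage 1 data: many labels from an imagined low-accuracy predictor
# (assumed here to be both biased and noisy).
large_x = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
large_y = [true_value(x) + 0.6 + rng.gauss(0.0, 0.5) for x in large_x]

# Stage 2 data: a small set of accurate labels.
small_x = [rng.uniform(-1.0, 1.0) for _ in range(20)]
small_y = [true_value(x) + rng.gauss(0.0, 0.05) for x in small_x]

# Held-out points used only for evaluation.
test_x = [i / 50.0 - 1.0 for i in range(101)]
test_y = [true_value(x) for x in test_x]

# Pretrain from zero weights, then fine-tune starting from the pretrained weights.
pretrained = train((0.0, 0.0), large_x, large_y, learning_rate=0.1, epochs=200)
fine_tuned = train(pretrained, small_x, small_y, learning_rate=0.1, epochs=50)

pre_rmse = rmse(predict(pretrained, test_x), test_y)
fine_rmse = rmse(predict(fine_tuned, test_x), test_y)
print(f"RMSE after pretraining only: {pre_rmse:.3f}")
print(f"RMSE after fine-tuning:      {fine_rmse:.3f}")
```

In this toy setting the fine-tuning stage mainly corrects the systematic offset inherited from the low-accuracy labels. With a neural network, the same pattern would typically mean reusing the pretrained weights and continuing training on the small dataset, but the details of how MRlogP does this are not given in this record.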