Chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network

Abstract As safety is one of the most important properties of drugs, chemical toxicology prediction has received increasing attention in drug discovery research. Traditionally, researchers rely on in vitro and in vivo experiments to test the toxicity of chemical compounds. However, not only are...

Bibliographic Details
Main authors: Jiarui Chen, Yain-Whar Si, Chon-Wai Un, Shirley W. I. Siu
Format: article
Language: EN
Published: BMC 2021
Subjects:
Online access: https://doaj.org/article/9bc5fbb95f9047c3bbe067400ba1fe36
id oai:doaj.org-article:9bc5fbb95f9047c3bbe067400ba1fe36
record_format dspace
spelling oai:doaj.org-article:9bc5fbb95f9047c3bbe067400ba1fe36
2021-11-28T12:30:21Z
Chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network
10.1186/s13321-021-00570-8
1758-2946
https://doaj.org/article/9bc5fbb95f9047c3bbe067400ba1fe36
2021-11-01T00:00:00Z
https://doi.org/10.1186/s13321-021-00570-8
https://doaj.org/toc/1758-2946
Jiarui Chen
Yain-Whar Si
Chon-Wai Un
Shirley W. I. Siu
BMC
article
Chemical toxicity
Deep learning
Graph convolutional neural network
Semi-supervised learning
Mean teacher
Tox21
Information technology
T58.5-58.64
Chemistry
QD1-999
EN
Journal of Cheminformatics, Vol 13, Iss 1, Pp 1-16 (2021)
institution DOAJ
collection DOAJ
language EN
topic Chemical toxicity
Deep learning
Graph convolutional neural network
Semi-supervised learning
Mean teacher
Tox21
Information technology
T58.5-58.64
Chemistry
QD1-999
description Abstract As safety is one of the most important properties of drugs, chemical toxicology prediction has received increasing attention in drug discovery research. Traditionally, researchers rely on in vitro and in vivo experiments to test the toxicity of chemical compounds. However, not only are these experiments time-consuming and costly, but experiments that involve animal testing are increasingly subject to ethical concerns. While traditional machine learning (ML) methods have been used in the field with some success, the limited availability of annotated toxicity data is the major hurdle to further improving model performance. Inspired by the success of semi-supervised learning (SSL) algorithms, we propose a graph convolutional neural network (GCN) to predict chemical toxicity and train the network with the Mean Teacher (MT) SSL algorithm. Using the Tox21 data, our optimal SSL-GCN models for predicting the twelve toxicological endpoints achieve an average ROC-AUC score of 0.757 on the test set, which is a 6% improvement over GCN models trained by supervised learning and conventional ML methods. Our SSL-GCN models also exhibit superior performance when compared to models constructed using the built-in DeepChem ML methods. This study demonstrates that SSL can increase the prediction power of models by learning from unannotated data. The optimal unannotated-to-annotated data ratio ranges between 1:1 and 4:1. This study demonstrates the success of SSL in chemical toxicity prediction; the same technique is expected to benefit other chemical property prediction tasks by utilizing existing large chemical databases. Our optimal model SSL-GCN is hosted on an online server accessible at https://app.cbbio.online/ssl-gcn/home.
format article
author Jiarui Chen
Yain-Whar Si
Chon-Wai Un
Shirley W. I. Siu
author_sort Jiarui Chen
title Chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network
title_sort chemical toxicity prediction based on semi-supervised learning and graph convolutional neural network
publisher BMC
publishDate 2021
url https://doaj.org/article/9bc5fbb95f9047c3bbe067400ba1fe36
_version_ 1718407972241014784
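
The abstract names the Mean Teacher (MT) semi-supervised algorithm but does not describe its mechanics. The following is a minimal NumPy sketch of the two ingredients usually associated with MT: a teacher whose weights are an exponential moving average (EMA) of the student's weights, and a loss that adds a student-teacher consistency term computed on all samples to a supervised term computed on labeled samples only. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `alpha` and `consistency_weight` values, and the choice of binary cross-entropy plus mean squared error are all hypothetical, and the GCN itself is left out.

```python
import numpy as np


def sigmoid(z):
    """Map logits to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights as an EMA of the student weights.

    `teacher` and `student` are dicts of NumPy arrays with the same keys.
    `alpha` is an assumed smoothing factor, not a value from the paper.
    """
    return {k: alpha * teacher[k] + (1.0 - alpha) * student[k] for k in teacher}


def mean_teacher_loss(student_logits, teacher_logits, labels, labeled_mask,
                      consistency_weight=1.0):
    """Supervised loss on labeled samples plus consistency loss on all samples.

    labels:       0/1 targets; entries for unlabeled samples are ignored.
    labeled_mask: 1.0 where a label is available, 0.0 otherwise.
    """
    p_student = sigmoid(student_logits)
    p_teacher = sigmoid(teacher_logits)
    eps = 1e-12  # avoids log(0)

    # Binary cross-entropy, averaged over labeled samples only.
    bce = -(labels * np.log(p_student + eps)
            + (1.0 - labels) * np.log(1.0 - p_student + eps))
    supervised = (bce * labeled_mask).sum() / max(labeled_mask.sum(), 1.0)

    # Consistency: the student should agree with the teacher on every sample,
    # which is how unannotated molecules contribute to training.
    consistency = np.mean((p_student - p_teacher) ** 2)

    return supervised + consistency_weight * consistency
```

In a training loop built on this sketch, the student would be updated by gradient descent on `mean_teacher_loss`, and `ema_update` would be called after each step to refresh the teacher.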