Deep Learning Algorithms Achieved Satisfactory Predictions When Trained on a Novel Collection of Anticoronavirus Molecules

Drug discovery and repurposing against COVID-19 is a highly relevant topic with huge efforts dedicated to delivering novel therapeutics targeting SARS-CoV-2. In this context, computer-aided drug discovery is of interest in orienting the early high throughput screenings and in optimizing the hit iden...

Bibliographic Details
Main authors: Emna Harigua-Souiai, Mohamed Mahmoud Heinhane, Yosser Zina Abdelkrim, Oussama Souiai, Ines Abdeljaoued-Tej, Ikram Guizani
Format: article
Language: EN
Published: Frontiers Media S.A. 2021
Subjects:
Online access: https://doaj.org/article/7bc2c5a443dc4da8b6177a50cec908ee
id oai:doaj.org-article:7bc2c5a443dc4da8b6177a50cec908ee
record_format dspace
spelling oai:doaj.org-article:7bc2c5a443dc4da8b6177a50cec908ee
2021-12-01T13:46:45Z
1664-8021
10.3389/fgene.2021.744170
https://doaj.org/article/7bc2c5a443dc4da8b6177a50cec908ee
2021-11-01T00:00:00Z
https://www.frontiersin.org/articles/10.3389/fgene.2021.744170/full
https://doaj.org/toc/1664-8021
Frontiers in Genetics, Vol 12 (2021)
institution DOAJ
collection DOAJ
language EN
topic deep learning
artificial neural network
SARS-CoV-2
machine learning
graph convolutional networks
drug discovery and repurposing
Genetics
QH426-470
description Drug discovery and repurposing against COVID-19 is a highly relevant topic, with huge efforts dedicated to delivering novel therapeutics targeting SARS-CoV-2. In this context, computer-aided drug discovery is of interest for orienting early high-throughput screenings and for optimizing the hit identification rate. We herein propose a pipeline for Ligand-Based Drug Discovery (LBDD) against SARS-CoV-2. Through an extensive search of the literature and multiple steps of filtering, we integrated information on 2,610 molecules having a validated effect against SARS-CoV and/or SARS-CoV-2. The chemical structures of these molecules were encoded through multiple systems to be readily usable as input to conventional machine learning (ML) algorithms or deep learning (DL) architectures. We assessed the performance of seven ML algorithms and four DL algorithms in classifying molecules into two classes: active and inactive. The Random Forests (RF), Graph Convolutional Network (GCN), and Directed Acyclic Graph (DAG) models achieved the best performance. These models were further optimized through hyperparameter tuning and achieved cross-validated ROC-AUC scores of 85, 83, and 79% for the RF, GCN, and DAG models, respectively. An external validation step on the FDA-approved drugs collection revealed a superior potential of DL algorithms to achieve drug repurposing against SARS-CoV-2 based on the dataset herein presented. Namely, GCN and DAG achieved a true positive rate of more than 50% on the confirmed hits of a PubChem bioassay.
format article
author Emna Harigua-Souiai
Mohamed Mahmoud Heinhane
Yosser Zina Abdelkrim
Oussama Souiai
Ines Abdeljaoued-Tej
Ikram Guizani
author_sort Emna Harigua-Souiai
title Deep Learning Algorithms Achieved Satisfactory Predictions When Trained on a Novel Collection of Anticoronavirus Molecules
publisher Frontiers Media S.A.
publishDate 2021
url https://doaj.org/article/7bc2c5a443dc4da8b6177a50cec908ee