A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides
As major components of spider venoms, neurotoxic peptides exhibit structural diversity, target specificity, and have great pharmaceutical potential. Deep learning may be an alternative to the laborious and time-consuming methods for identifying these peptides. However, the major hurdle in developing...
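Main Authors: | Byungjo Lee, Min Kyoung Shin, In-Wook Hwang, Junghyun Jung, Yu Jeong Shim, Go Woon Kim, Seung Tae Kim, Wonhee Jang, Jung-Suk Sung |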
---|---|
Format: | article |
Language: | EN |
Published: | MDPI AG, 2021 |
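Subjects: | deep learning; data augmentation; convolutional neural network; neurotoxic peptide prediction; spider transcriptome; Biology (General); Chemistry |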
Online Access: | https://doaj.org/article/87bd9b8690694363bfe6e4a9362bb36f |
id |
oai:doaj.org-article:87bd9b8690694363bfe6e4a9362bb36f |
---|---|
record_format |
dspace |
spelling |
oai:doaj.org-article:87bd9b8690694363bfe6e4a9362bb36f; 2021-11-25T17:55:06Z; A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides; 10.3390/ijms222212291; 1422-0067; 1661-6596; https://doaj.org/article/87bd9b8690694363bfe6e4a9362bb36f; 2021-11-01T00:00:00Z; https://www.mdpi.com/1422-0067/22/22/12291; https://doaj.org/toc/1661-6596; https://doaj.org/toc/1422-0067; As major components of spider venoms, neurotoxic peptides exhibit structural diversity, target specificity, and have great pharmaceutical potential. Deep learning may be an alternative to the laborious and time-consuming methods for identifying these peptides. However, the major hurdle in developing a deep learning model is the limited data on neurotoxic peptides. Here, we present a peptide data augmentation method that improves the recognition of neurotoxic peptides via a convolutional neural network model. The neurotoxic peptides were augmented with the known neurotoxic peptides from the UniProt database, and the models were trained using a training set with or without the generated sequences to verify the augmented data. The model trained with the augmented dataset outperformed the one trained with the unaugmented dataset, achieving an accuracy of 0.9953, precision of 0.9922, recall of 0.9984, and F1 score of 0.9953 on the simulation dataset. From the set of all RNA transcripts of the spider Callobius koreanus, we discovered neurotoxic peptides via the model, resulting in 275 putative peptides, of which 252 were novel sequences and only 23 showed homology with known peptides according to the Basic Local Alignment Search Tool. Among these 275 peptides, four were selected and shown to have neuromodulatory effects on the human neuroblastoma cell line SH-SY5Y. The augmentation method presented here may be applied to the identification of other functional peptides from biological resources with insufficient data.; Byungjo Lee; Min Kyoung Shin; In-Wook Hwang; Junghyun Jung; Yu Jeong Shim; Go Woon Kim; Seung Tae Kim; Wonhee Jang; Jung-Suk Sung; MDPI AG; article; deep learning; data augmentation; convolutional neural network; neurotoxic peptide prediction; spider transcriptome; Biology (General); QH301-705.5; Chemistry; QD1-999; EN; International Journal of Molecular Sciences, Vol 22, Iss 12291, p 12291 (2021) |
institution |
DOAJ |
collection |
DOAJ |
language |
EN |
topic |
deep learning; data augmentation; convolutional neural network; neurotoxic peptide prediction; spider transcriptome; Biology (General); QH301-705.5; Chemistry; QD1-999 |
spellingShingle |
deep learning; data augmentation; convolutional neural network; neurotoxic peptide prediction; spider transcriptome; Biology (General); QH301-705.5; Chemistry; QD1-999; Byungjo Lee; Min Kyoung Shin; In-Wook Hwang; Junghyun Jung; Yu Jeong Shim; Go Woon Kim; Seung Tae Kim; Wonhee Jang; Jung-Suk Sung; A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides |
description |
As major components of spider venoms, neurotoxic peptides exhibit structural diversity, target specificity, and have great pharmaceutical potential. Deep learning may be an alternative to the laborious and time-consuming methods for identifying these peptides. However, the major hurdle in developing a deep learning model is the limited data on neurotoxic peptides. Here, we present a peptide data augmentation method that improves the recognition of neurotoxic peptides via a convolutional neural network model. The neurotoxic peptides were augmented with the known neurotoxic peptides from the UniProt database, and the models were trained using a training set with or without the generated sequences to verify the augmented data. The model trained with the augmented dataset outperformed the one trained with the unaugmented dataset, achieving an accuracy of 0.9953, precision of 0.9922, recall of 0.9984, and F1 score of 0.9953 on the simulation dataset. From the set of all RNA transcripts of the spider Callobius koreanus, we discovered neurotoxic peptides via the model, resulting in 275 putative peptides, of which 252 were novel sequences and only 23 showed homology with known peptides according to the Basic Local Alignment Search Tool. Among these 275 peptides, four were selected and shown to have neuromodulatory effects on the human neuroblastoma cell line SH-SY5Y. The augmentation method presented here may be applied to the identification of other functional peptides from biological resources with insufficient data. |
format |
article |
author |
Byungjo Lee, Min Kyoung Shin, In-Wook Hwang, Junghyun Jung, Yu Jeong Shim, Go Woon Kim, Seung Tae Kim, Wonhee Jang, Jung-Suk Sung |
author_facet |
Byungjo Lee, Min Kyoung Shin, In-Wook Hwang, Junghyun Jung, Yu Jeong Shim, Go Woon Kim, Seung Tae Kim, Wonhee Jang, Jung-Suk Sung |
author_sort |
Byungjo Lee |
title |
A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides |
title_short |
A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides |
title_full |
A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides |
title_fullStr |
A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides |
title_full_unstemmed |
A Deep Learning Approach with Data Augmentation to Predict Novel Spider Neurotoxic Peptides |
title_sort |
deep learning approach with data augmentation to predict novel spider neurotoxic peptides |
publisher |
MDPI AG |
publishDate |
2021 |
url |
https://doaj.org/article/87bd9b8690694363bfe6e4a9362bb36f |
work_keys_str_mv |
AT byungjolee adeeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT minkyoungshin adeeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT inwookhwang adeeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT junghyunjung adeeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT yujeongshim adeeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT gowoonkim adeeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT seungtaekim adeeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT wonheejang adeeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT jungsuksung adeeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT byungjolee deeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT minkyoungshin deeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT inwookhwang deeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT junghyunjung deeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT yujeongshim deeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT gowoonkim deeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT seungtaekim deeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT wonheejang deeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides AT jungsuksung deeplearningapproachwithdataaugmentationtopredictnovelspiderneurotoxicpeptides |
_version_ |
1718411873084243968 |