Detecting network communities: an application to phylogenetic analysis.

This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be a...

Bibliographic Details
Main authors: Roberto F S Andrade, Ivan C Rocha-Neto, Leonardo B L Santos, Charles N de Santana, Marcelo V C Diniz, Thierry Petit Lobão, Aristóteles Goés-Neto, Suani T R Pinho, Charbel N El-Hani
Format: article
Language: EN
Published: Public Library of Science (PLoS) 2011
Subjects: Biology (General)
Online access: https://doaj.org/article/d5a3ef27cf484961866388fc834d7c1c
id oai:doaj.org-article:d5a3ef27cf484961866388fc834d7c1c
record_format dspace
spelling oai:doaj.org-article:d5a3ef27cf484961866388fc834d7c1c
2021-11-18T05:50:34Z
Detecting network communities: an application to phylogenetic analysis.
1553-734X
1553-7358
10.1371/journal.pcbi.1001131
https://doaj.org/article/d5a3ef27cf484961866388fc834d7c1c
2011-05-01T00:00:00Z
https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21573202/?tool=EBI
https://doaj.org/toc/1553-734X
https://doaj.org/toc/1553-7358
Public Library of Science (PLoS)
article
Biology (General)
QH301-705.5
EN
PLoS Computational Biology, Vol 7, Iss 5, p e1001131 (2011)
institution DOAJ
collection DOAJ
language EN
topic Biology (General)
QH301-705.5
description This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly, revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis.
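The general idea in the abstract (build a network from pairwise similarity values, then look for modules with a Newman-Girvan style edge-removal procedure) can be illustrated with a minimal sketch. This is not the authors' implementation: it does not use the neighborhood matrix or BLAST output, the similarity values are invented toy numbers, and the names `threshold_graph`, `components`, `edge_betweenness`, `girvan_newman_split`, and the cutoff `0.5` are all hypothetical choices made for illustration.

```python
from collections import deque
from itertools import combinations


def threshold_graph(sim, sigma):
    """Unweighted graph (adjacency sets) keeping pairs with similarity >= sigma."""
    adj = {i: set() for i in range(len(sim))}
    for i, j in combinations(range(len(sim)), 2):
        if sim[i][j] >= sigma:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def components(adj):
    """Connected components found by breadth-first search."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        comps.append(comp)
    return comps


def edge_betweenness(adj):
    """Shortest-path edge betweenness (Brandes-style accumulation).

    Each undirected edge is counted from both endpoints, so scores are
    doubled; only their ranking is used here.
    """
    score = {}
    for s in adj:
        dist, paths = {s: 0}, {s: 1}
        preds = {v: [] for v in adj}
        order, queue = [], deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    paths[v] = 0
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    paths[v] += paths[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for v in reversed(order):
            for u in preds[v]:
                flow = paths[u] / paths[v] * (1.0 + delta[v])
                edge = (min(u, v), max(u, v))
                score[edge] = score.get(edge, 0.0) + flow
                delta[u] += flow
    return score


def girvan_newman_split(adj):
    """Remove highest-betweenness edges until the component count increases."""
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:  # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy similarity matrix (invented values): two tight groups and one weak bridge.
n = 6
sim = [[0.1] * n for _ in range(n)]
for group in ([0, 1, 2], [3, 4, 5]):
    for i in group:
        for j in group:
            sim[i][j] = 0.9
sim[2][3] = sim[3][2] = 0.6

adj = threshold_graph(sim, 0.5)
communities = sorted(sorted(c) for c in girvan_newman_split(adj))
print(communities)
```

At the assumed cutoff of 0.5 the toy network is a single connected piece, and removing the bridge edge with the highest betweenness separates the two groups. Raising the cutoff above 0.6 splits the network by thresholding alone, which is a small-scale analogue of the abrupt change in topology that the abstract mentions.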
format article
author Roberto F S Andrade
Ivan C Rocha-Neto
Leonardo B L Santos
Charles N de Santana
Marcelo V C Diniz
Thierry Petit Lobão
Aristóteles Goés-Neto
Suani T R Pinho
Charbel N El-Hani
title Detecting network communities: an application to phylogenetic analysis.
publisher Public Library of Science (PLoS)
publishDate 2011
url https://doaj.org/article/d5a3ef27cf484961866388fc834d7c1c