Towards the prediction of essential genes by integration of network topology, cellular localization and biological process information

dc.contributor.author: Acencio, Marcio Luis [UNESP]
dc.contributor.author: Lemke, Ney [UNESP]
dc.contributor.institution: Universidade Estadual Paulista (Unesp)
dc.date.accessioned: 2014-05-20T13:49:38Z
dc.date.available: 2014-05-20T13:49:38Z
dc.date.issued: 2009-09-16
dc.description.abstract: Background: The identification of essential genes is important for understanding the minimal requirements for cellular life and for practical purposes such as drug design. However, the experimental techniques for discovering essential genes are labor-intensive and time-consuming. Given these experimental constraints, a computational approach capable of accurately predicting essential genes would be of great value. We therefore present a machine learning-based computational approach that relies on network topological features, cellular localization and biological process information to predict essential genes.
Results: We constructed a decision tree-based meta-classifier and trained it on datasets with individual and grouped attributes (network topological features, cellular compartments and biological processes) to generate various predictors of essential genes. We showed that the best-performing predictors are those trained on datasets with integrated attributes. The predictor trained on all attributes, i.e., network topological features, cellular compartments and biological processes, performed best and was then used to classify yeast genes with unknown essentiality status. Finally, we generated decision trees by training the J48 algorithm on datasets with all network topological features, cellular localization and biological process information to discover cellular rules for essentiality. We found that the number of protein physical interactions, the nuclear localization of proteins and the number of regulating transcription factors are the most important factors determining gene essentiality.
Conclusion: We demonstrated that network topological features, cellular localization and biological process information are reliable predictors of essential genes. Moreover, by constructing decision trees based on these data, we could discover cellular rules governing essentiality. (en)
dc.description.affiliation: São Paulo State Univ, Dept Phys & Biophys, São Paulo, Brazil
dc.description.affiliationUnesp: São Paulo State Univ, Dept Phys & Biophys, São Paulo, Brazil
dc.description.sponsorship: Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
dc.description.sponsorship: Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
dc.description.sponsorshipId: FAPESP: 07/02827-9
dc.description.sponsorshipId: FAPESP: 07/01213-7
dc.description.sponsorshipId: CNPq: 474278/2006-9
dc.format.extent: 18
dc.identifier: http://dx.doi.org/10.1186/1471-2105-10-290
dc.identifier.citation: BMC Bioinformatics. London: Biomed Central Ltd., v. 10, p. 18, 2009.
dc.identifier.doi: 10.1186/1471-2105-10-290
dc.identifier.file: WOS000270276400001.pdf
dc.identifier.issn: 1471-2105
dc.identifier.lattes: 7977035910952141
dc.identifier.uri: http://hdl.handle.net/11449/17701
dc.identifier.wos: WOS:000270276400001
dc.language.iso: eng
dc.publisher: Biomed Central Ltd.
dc.relation.ispartof: BMC Bioinformatics
dc.relation.ispartofjcr: 2.213
dc.relation.ispartofsjr: 1.479
dc.rights.accessRights: Open access
dc.source: Web of Science
dc.title: Towards the prediction of essential genes by integration of network topology, cellular localization and biological process information (en)
dc.type: Article
dcterms.license: http://www.biomedcentral.com/about/license
dcterms.rightsHolder: Biomed Central Ltd.
unesp.author.lattes: 7977035910952141
unesp.author.orcid: 0000-0002-8278-240X[1]
unesp.author.orcid: 0000-0001-7463-4303[2]
unesp.campus: Universidade Estadual Paulista (Unesp), Instituto de Biociências, Botucatu (pt)
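
Illustrative sketch (not part of the deposited record): the abstract describes training a decision tree-based meta-classifier on per-gene attributes and then reading rules off J48 trees. J48 is Weka's implementation of C4.5. The Python below is only a rough sketch of that idea under several assumptions: it uses plain information gain instead of C4.5's gain ratio, does no pruning, and uses bootstrap aggregation (bagging) as a stand-in because the abstract does not name the meta-classification scheme. The feature names and every value in the toy table are hypothetical and are not data from the paper.

```python
import math
import random
from collections import Counter

# Hypothetical toy table, NOT data from the paper. Columns:
# (physical_interactions, is_nuclear, regulating_tfs); label 1 = essential.
rows = [(25, 1, 6), (18, 1, 4), (30, 1, 8), (12, 1, 5), (22, 0, 7),
        (3, 0, 1), (2, 1, 0), (5, 0, 2), (1, 0, 0), (4, 0, 1)]
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

def entropy(ys):
    """Shannon entropy (bits) of a non-empty list of class labels."""
    total = len(ys)
    return -sum((c / total) * math.log2(c / total) for c in Counter(ys).values())

def best_split(xs, ys):
    """Return (gain, feature, threshold) with the highest information gain, or None."""
    base, best = entropy(ys), None
    for f in range(len(xs[0])):
        # Every observed value except the largest is a candidate threshold,
        # so both sides of the split are always non-empty.
        for t in sorted({x[f] for x in xs})[:-1]:
            left = [y for x, y in zip(xs, ys) if x[f] <= t]
            right = [y for x, y in zip(xs, ys) if x[f] > t]
            gain = base - (len(left) * entropy(left)
                           + len(right) * entropy(right)) / len(ys)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def build_tree(xs, ys, depth=0, max_depth=3):
    """Grow an unpruned tree; leaves are class labels, nodes are 4-tuples."""
    majority = Counter(ys).most_common(1)[0][0]
    if depth == max_depth or len(set(ys)) == 1:
        return majority
    split = best_split(xs, ys)
    if split is None or split[0] <= 0:
        return majority
    _, f, t = split
    left = [(x, y) for x, y in zip(xs, ys) if x[f] <= t]
    right = [(x, y) for x, y in zip(xs, ys) if x[f] > t]
    return (f, t,
            build_tree([x for x, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([x for x, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict_tree(tree, x):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def bagged_trees(xs, ys, n_trees=15, seed=0):
    """Assumed meta-classifier: one tree per bootstrap resample of the data."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        trees.append(build_tree([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def predict_bagged(trees, x):
    """Majority vote over the individual trees."""
    return Counter(predict_tree(t, x) for t in trees).most_common(1)[0][0]

tree = build_tree(rows, labels)      # single tree, readable as a rule
trees = bagged_trees(rows, labels)   # ensemble used for classification
print(tree, predict_bagged(trees, (28, 1, 7)))
```

A single tree is kept alongside the ensemble because a lone tree can be read as an explicit rule (feature index, threshold, outcome on each side), which mirrors the abstract's use of individual decision trees to extract rules, while the ensemble is what would be used to classify genes of unknown status.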

Files

Original Bundle
Name: WOS000270276400001.pdf
Size: 1.08 MB
Format: Adobe Portable Document Format
License Bundle
Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission