Finding the combination of multiple biomarkers to diagnose oral squamous cell carcinoma – A data mining approach

da Costa, Nattane Luíza; de Sá Alves, Mariana [UNESP]; de Sá Rodrigues, Nayara [UNESP]; Bandeira, Celso Muller [UNESP]; Oliveira Alves, Mônica Ghislaine; Mendes, Maria Anita; Cesar Alves, Levy Anderson; Almeida, Janete Dias [UNESP]; Barbosa, Rommel

doi:10.1016/j.compbiomed.2022.105296

dc.contributor.author	da Costa, Nattane Luíza
dc.contributor.author	de Sá Alves, Mariana [UNESP]
dc.contributor.author	de Sá Rodrigues, Nayara [UNESP]
dc.contributor.author	Bandeira, Celso Muller [UNESP]
dc.contributor.author	Oliveira Alves, Mônica Ghislaine
dc.contributor.author	Mendes, Maria Anita
dc.contributor.author	Cesar Alves, Levy Anderson
dc.contributor.author	Almeida, Janete Dias [UNESP]
dc.contributor.author	Barbosa, Rommel
dc.contributor.institution	Science and Technology
dc.contributor.institution	Universidade Estadual Paulista (UNESP)
dc.contributor.institution	Universidade Mogi das Cruzes
dc.contributor.institution	Anhembi Morumbi University
dc.contributor.institution	Universidade de São Paulo (USP)
dc.contributor.institution	Universidade Paulista
dc.contributor.institution	Universidade Municipal de São Caetano do Sul
dc.contributor.institution	Universidade Federal de Goiás (UFG)
dc.date.accessioned	2022-05-01T13:41:29Z
dc.date.available	2022-05-01T13:41:29Z
dc.date.issued	2022-04-01
dc.description.abstract	Data mining has proven to be a reliable method to analyze and discover useful knowledge about various diseases, including cancer. In particular, the use of data mining and machine learning algorithms to study oral squamous cell carcinoma (OSCC), the most common form of oral cancer, is a new area of research. This malignant neoplasm can be studied using saliva samples. Saliva is an important biofluid that can be used to verify potential biomarkers associated with oral cancer. In this study, first, we provide an overview of OSCC diagnoses based on machine learning and salivary metabolites. To our knowledge, this is the first study to apply advanced data mining techniques to diagnose OSCC. Then, we give new results of classification and feature selection algorithms used to identify potential salivary biomarkers of OSCC. To accomplish this task, we used the random forest importance filter feature selection algorithm and a wrapper methodology to evaluate the importance of metabolites obtained from gas chromatography-mass spectrometry (GC-MS) in differentiating the OSCC group from the control group. Salivary samples (n = 68) were collected from the control group and the OSCC group, with patients matched for gender, age, and smoking habit. Classification was based on the Random Forest (RF) algorithm with 10-fold cross-validation. The results showed that glucuronic acid, maleic acid, and batyl alcohol can classify the samples with an area under the curve (AUC) of 0.91, versus an AUC of 0.76 using all 51 metabolites analyzed. The methodology used in this study can assist healthcare professionals and be adopted to discover diagnostic biomarkers for other diseases.	en
dc.description.affiliation	Informatics Nucleo Goiano Federal Institute of Education Science and Technology, Campus Urutaí
dc.description.affiliation	Department of Biosciences and Oral Diagnosis Institute of Science and Technology São Paulo State University (Unesp)
dc.description.affiliation	Technology Research Center (NPT) Universidade Mogi das Cruzes
dc.description.affiliation	School of Medicine Anhembi Morumbi University
dc.description.affiliation	Dempster MS Lab Universidade de São Paulo
dc.description.affiliation	School of Dentistry Universidade Paulista
dc.description.affiliation	School of Dentistry Universidade Municipal de São Caetano do Sul
dc.description.affiliation	Instituto de Informática Universidade Federal de Goiás
dc.description.affiliationUnesp	Department of Biosciences and Oral Diagnosis Institute of Science and Technology São Paulo State University (Unesp)
dc.description.sponsorship	Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
dc.description.sponsorship	Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
dc.description.sponsorship	Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
dc.description.sponsorshipId	FAPESP: 2016/08633-0
dc.identifier	http://dx.doi.org/10.1016/j.compbiomed.2022.105296
dc.identifier.citation	Computers in Biology and Medicine, v. 143.
dc.identifier.doi	10.1016/j.compbiomed.2022.105296
dc.identifier.issn	1879-0534
dc.identifier.issn	0010-4825
dc.identifier.scopus	2-s2.0-85124169435
dc.identifier.uri	http://hdl.handle.net/11449/234108
dc.language.iso	eng
dc.relation.ispartof	Computers in Biology and Medicine
dc.source	Scopus
dc.subject	Data mining
dc.subject	Feature selection
dc.subject	Machine learning
dc.subject	Metabolites
dc.subject	Oral squamous cell carcinoma
dc.subject	Salivary biomarkers
dc.title	Finding the combination of multiple biomarkers to diagnose oral squamous cell carcinoma – A data mining approach	en
dc.type	Article
dspace.entity.type	Publication
unesp.author.orcid	0000-0001-7310-1150[1]
unesp.author.orcid	0000-0003-0339-698X[3]
unesp.author.orcid	0000-0002-2078-9286[6]
unesp.campus	Universidade Estadual Paulista (UNESP), Instituto de Ciência e Tecnologia, São José dos Campos	pt
unesp.department	Biociências e Diagnóstico Bucal - ICT	pt
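
The abstract reports results as an area under the curve (AUC) estimated with 10-fold cross-validation. As a rough illustration of those two evaluation ideas only, the sketch below computes AUC as a pairwise ranking statistic and builds stratified folds using the Python standard library. It is not the authors' implementation and does not reproduce the Random Forest classifier or the feature selection steps; the function names `auc_score` and `stratified_folds` and the toy labels and scores are assumptions made up for this example.

```python
from typing import List, Sequence


def auc_score(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def stratified_folds(labels: Sequence[int], k: int = 10) -> List[List[int]]:
    """Split sample indices into k folds with similar class proportions.

    Indices of each class are dealt round-robin across the folds.
    No shuffling is done, so the result is deterministic.
    """
    folds: List[List[int]] = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        class_indices = [i for i, y in enumerate(labels) if y == cls]
        for position, index in enumerate(class_indices):
            folds[position % k].append(index)
    return folds


# Hypothetical toy data (not from the study): 1 = case, 0 = control.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_score(toy_labels, toy_scores)

# Hypothetical balanced label vector, used only to show the fold layout.
demo_labels = [0] * 20 + [1] * 20
demo_folds = stratified_folds(demo_labels, k=10)
```

In a cross-validation loop, each fold would serve once as the test set while a classifier is trained on the remaining folds, and the AUC would be computed from the held-out predictions.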

Collections

São José dos Campos - ICT - Instituto de Ciência e Tecnologia
