An Experimental Analysis on Multicepstral Projection Representation Strategies for Dysphonia Detection

dc.contributor.authorContreras, Rodrigo Colnago [UNESP]
dc.contributor.authorViana, Monique Simplicio
dc.contributor.authorFonseca, Everthon Silva
dc.contributor.authordos Santos, Francisco Lledo
dc.contributor.authorZanin, Rodrigo Bruno
dc.contributor.authorGuido, Rodrigo Capobianco [UNESP]
dc.contributor.institutionUniversidade Estadual Paulista (UNESP)
dc.contributor.institutionFederal Institute of São Paulo
dc.contributor.institutionMato Grosso State University
dc.date.accessioned2023-07-29T16:16:17Z
dc.date.available2023-07-29T16:16:17Z
dc.date.issued2023-06-01
dc.description.abstractBiometrics-based authentication has become the most well-established form of user recognition in systems that demand a certain level of security. Common examples include everyday activities such as access to the workplace or to one’s own bank account. Among all biometrics, voice receives special attention due to factors such as ease of collection, the low cost of reading devices, and the large body of literature and software packages available for use. However, the ability of this biometric to represent the individual may be impaired by the phenomenon known as dysphonia, which consists of a change in the sound signal due to some disease that acts on the vocal apparatus. As a consequence, for example, a user with the flu may not be properly authenticated by the recognition system. Therefore, it is important that automatic voice dysphonia detection techniques be developed. In this work, we propose a new framework based on the representation of the voice signal by the multiple projection of cepstral coefficients to promote the detection of dysphonic alterations in the voice through machine learning techniques. Most of the best-known cepstral coefficient extraction techniques in the literature are mapped and analyzed separately and together with measures related to the fundamental frequency of the voice signal, and their representation capacity is evaluated on three classifiers. Finally, experiments on a subset of the Saarbruecken Voice Database demonstrate the effectiveness of the proposed framework in detecting the presence of dysphonia in the voice.
dc.description.affiliationDepartment of Computer Science and Statistics Institute of Biosciences Letters and Exact Sciences São Paulo State University, SP
dc.description.affiliationFederal Institute of São Paulo, SP
dc.description.affiliationFaculty of Architecture and Engineering Mato Grosso State University, MT
dc.description.affiliationUnespDepartment of Computer Science and Statistics Institute of Biosciences Letters and Exact Sciences São Paulo State University, SP
dc.identifierhttp://dx.doi.org/10.3390/s23115196
dc.identifier.citationSensors, v. 23, n. 11, 2023.
dc.identifier.doi10.3390/s23115196
dc.identifier.issn1424-8220
dc.identifier.scopus2-s2.0-85161510694
dc.identifier.urihttp://hdl.handle.net/11449/250048
dc.language.isoeng
dc.relation.ispartofSensors
dc.sourceScopus
dc.subjectcepstral analysis
dc.subjectdysphonia detection
dc.subjectmachine learning
dc.subjectpattern recognition
dc.subjectvoice disorder detection
dc.titleAn Experimental Analysis on Multicepstral Projection Representation Strategies for Dysphonia Detection
dc.typeArticle
unesp.author.orcid0000-0003-4003-7791[1]
unesp.author.orcid0000-0002-2960-8293[2]
unesp.author.orcid0000-0001-6202-0806[3]
unesp.author.orcid0000-0002-7718-8203[4]
unesp.author.orcid0000-0002-4990-0056[5]
unesp.author.orcid0000-0002-0924-8024[6]
unesp.campusUniversidade Estadual Paulista (Unesp), Instituto de Biociências Letras e Ciências Exatas, São José do Rio Preto
unesp.departmentComputer Science and Statistics - IBILCE
