
Dual-Bandwidth Spectrogram Analysis for Speaker Verification

dc.contributor.author: Virgilli, Rafaello
dc.contributor.author: Candido Junior, Arnaldo [UNESP]
dc.contributor.author: da Rosa, Augusto Seben
dc.contributor.author: Oliveira, Frederico S.
dc.contributor.author: Soares, Anderson da Silva
dc.contributor.institution: Universidade Federal de Goiás (UFG)
dc.contributor.institution: Universidade Estadual Paulista (UNESP)
dc.contributor.institution: Universidade Tecnológica Federal do Paraná
dc.date.accessioned: 2025-04-29T20:16:48Z
dc.date.issued: 2025-01-01
dc.description.abstract: The variability of the human voice is a challenge for speaker verification systems, influenced by individual traits and environmental conditions. This research introduces a novel approach that uses dual-bandwidth spectrograms with the Fast ResNet-34 neural network architecture for speaker verification. Dual-bandwidth spectrograms are data structures similar to multi-channel images, generated by stacking spectrograms derived from the same audio segment using two different window sizes. In this study, we employed window sizes of 5 ms and 30 ms. This approach captures a wider range of voice features across multiple temporal and spectral resolutions. Our findings demonstrate a statistically significant improvement in system performance, achieving an Equal Error Rate (EER) of 1.64% ±0.13%. This represents a 26% enhancement over the previously reported benchmark EER of 2.22% ±0.05%, validating our hypothesis that dual-bandwidth spectrograms offer a more detailed and comprehensive representation of voice features for accurate speaker verification. Analysis of individual bandwidth contributions reveals that narrowband spectrograms carry more relevant features for speaker verification, while the combination with broadband spectrograms provides complementary information. [en]
dc.description.affiliation: Universidade Federal de Goiás
dc.description.affiliation: Universidade Estadual Paulista
dc.description.affiliation: Universidade Tecnológica Federal do Paraná
dc.description.affiliationUnesp: Universidade Estadual Paulista
dc.format.extent: 340-351
dc.identifier: http://dx.doi.org/10.1007/978-3-031-79029-4_24
dc.identifier.citation: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 15412 LNAI, p. 340-351.
dc.identifier.doi: 10.1007/978-3-031-79029-4_24
dc.identifier.issn: 1611-3349
dc.identifier.issn: 0302-9743
dc.identifier.scopus: 2-s2.0-85219182680
dc.identifier.uri: https://hdl.handle.net/11449/309817
dc.language.iso: eng
dc.relation.ispartof: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.source: Scopus
dc.subject: broadband
dc.subject: dual-bandwidth spectrogram
dc.subject: feature fusion
dc.subject: narrowband
dc.subject: speaker verification
dc.title: Dual-Bandwidth Spectrogram Analysis for Speaker Verification [en]
dc.type: Conference paper (Trabalho apresentado em evento) [pt]
dspace.entity.type: Publication
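
The abstract describes dual-bandwidth spectrograms as two spectrograms of the same audio segment, computed with 5 ms and 30 ms windows and stacked like the channels of an image. The sketch below illustrates that idea only and is not the authors' implementation. Everything beyond the two window sizes is an assumption made for illustration: the 16 kHz sample rate, the 10 ms hop, the 512-point FFT, the Hamming window, the log-magnitude scaling, and the function names `spectrogram` and `dual_bandwidth_spectrogram`. Both channels share one hop and one FFT size so that they have identical shapes and can be stacked.

```python
import numpy as np


def spectrogram(signal, sample_rate, window_ms, hop_ms=10.0, n_fft=512):
    """Log-magnitude spectrogram using a Hamming window of `window_ms` ms.

    hop_ms and n_fft are illustrative assumptions; the window must not be
    longer than n_fft samples.
    """
    win_len = int(round(sample_rate * window_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    window = np.hamming(win_len)

    # Pad by n_fft // 2 on each side so frames are centred on multiples of
    # hop_len and every window size yields the same number of frames.
    padded = np.pad(signal, n_fft // 2, mode="reflect")
    n_frames = 1 + len(signal) // hop_len

    frames = np.zeros((n_frames, n_fft))
    for i in range(n_frames):
        centre = i * hop_len + n_fft // 2
        start = centre - win_len // 2
        # Windowed samples go at the start of the frame; the rest is zero padding.
        frames[i, :win_len] = padded[start:start + win_len] * window

    magnitude = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    return np.log(magnitude + 1e-6).T  # shape: (n_fft // 2 + 1, n_frames)


def dual_bandwidth_spectrogram(signal, sample_rate, short_ms=5.0, long_ms=30.0):
    """Stack a short-window and a long-window spectrogram as two channels."""
    wide = spectrogram(signal, sample_rate, short_ms)    # fine time resolution
    narrow = spectrogram(signal, sample_rate, long_ms)   # fine frequency resolution
    return np.stack([wide, narrow], axis=0)


# Usage: one second of a 440 Hz tone at an assumed 16 kHz sample rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440.0 * t)
features = dual_bandwidth_spectrogram(signal, sample_rate)
print(features.shape)  # (2, 257, 101)
```

The short window gives the broadband channel (good temporal detail) and the long window gives the narrowband channel (good frequency detail). The resulting array has the channel-first layout that image-style convolutional networks commonly accept, though the exact input format used with Fast ResNet-34 in the paper is not stated in this record.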
