Dual-Bandwidth Spectrogram Analysis for Speaker Verification

Virgilli, Rafaello; Candido Junior, Arnaldo [UNESP]; da Rosa, Augusto Seben; Oliveira, Frederico S.; Soares, Anderson da Silva

doi:10.1007/978-3-031-79029-4_24

dc.contributor.author	Virgilli, Rafaello
dc.contributor.author	Candido Junior, Arnaldo [UNESP]
dc.contributor.author	da Rosa, Augusto Seben
dc.contributor.author	Oliveira, Frederico S.
dc.contributor.author	Soares, Anderson da Silva
dc.contributor.institution	Universidade Federal de Goiás (UFG)
dc.contributor.institution	Universidade Estadual Paulista (UNESP)
dc.contributor.institution	Universidade Tecnológica Federal do Paraná
dc.date.accessioned	2025-04-29T20:16:48Z
dc.date.issued	2025-01-01
dc.description.abstract	The variability of the human voice, which is influenced by individual traits and environmental conditions, is a challenge for speaker verification systems. This research introduces a novel approach that uses dual-bandwidth spectrograms with the Fast ResNet-34 neural network architecture for speaker verification. Dual-bandwidth spectrograms are data structures similar to multi-channel images, generated by stacking spectrograms derived from the same audio segment using two different window sizes. In this study, we employed window sizes of 5 ms (broadband) and 30 ms (narrowband). This approach captures a wider range of voice features across multiple temporal and spectral resolutions. Our findings demonstrate a statistically significant improvement in system performance, achieving an Equal Error Rate (EER) of 1.64% ±0.13%. This is a 26% relative reduction from the previously reported benchmark EER of 2.22% ±0.05%, supporting our hypothesis that dual-bandwidth spectrograms offer a more detailed and comprehensive representation of voice features for accurate speaker verification. Analysis of individual bandwidth contributions reveals that narrowband spectrograms carry more relevant features for speaker verification, while the combination with broadband spectrograms provides complementary information.	en
dc.description.affiliation	Universidade Federal de Goiás
dc.description.affiliation	Universidade Estadual Paulista
dc.description.affiliation	Universidade Tecnológica Federal do Paraná
dc.description.affiliationUnesp	Universidade Estadual Paulista
dc.format.extent	340-351
dc.identifier	http://dx.doi.org/10.1007/978-3-031-79029-4_24
dc.identifier.citation	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), v. 15412 LNAI, p. 340-351.
dc.identifier.doi	10.1007/978-3-031-79029-4_24
dc.identifier.issn	1611-3349
dc.identifier.issn	0302-9743
dc.identifier.scopus	2-s2.0-85219182680
dc.identifier.uri	https://hdl.handle.net/11449/309817
dc.language.iso	eng
dc.relation.ispartof	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
dc.source	Scopus
dc.subject	broadband
dc.subject	dual-bandwidth spectrogram
dc.subject	feature fusion
dc.subject	narrowband
dc.subject	speaker verification
dc.title	Dual-Bandwidth Spectrogram Analysis for Speaker Verification	en
dc.type	Conference paper	en
dspace.entity.type	Publication
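
The abstract describes a dual-bandwidth spectrogram as two spectrograms of the same audio segment, computed with 5 ms and 30 ms windows and stacked like image channels. The following is a minimal sketch of that idea, not the authors' implementation. Only the two window lengths come from the record. The 16 kHz sample rate, 10 ms hop, 512-point FFT, Hann window, log-magnitude scaling, and the function names are assumptions chosen for illustration.

```python
import numpy as np


def log_spectrogram(signal, sample_rate, window_ms, hop_ms=10.0, n_fft=512):
    """Log-magnitude STFT using a Hann window of `window_ms` milliseconds.

    hop_ms and n_fft are illustrative assumptions, not values from the paper.
    """
    win_len = int(round(sample_rate * window_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    if win_len > n_fft:
        raise ValueError("window must not be longer than n_fft")
    window = np.hanning(win_len)

    # Pad so that frame i is centred on sample i * hop_len for any window
    # length; both spectrograms then share the same time axis.
    pad = n_fft // 2
    padded = np.pad(signal, pad, mode="reflect")
    n_frames = 1 + len(signal) // hop_len

    frames = np.zeros((n_frames, n_fft))
    for i in range(n_frames):
        start = pad + i * hop_len - win_len // 2
        frames[i, :win_len] = padded[start:start + win_len] * window

    # Zero-padding every frame to n_fft gives both spectrograms the same
    # number of frequency bins, so they can be stacked.
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)
    return np.log(np.abs(spectrum) + 1e-6).T  # (n_fft // 2 + 1, n_frames)


def dual_bandwidth_spectrogram(signal, sample_rate):
    """Stack a broadband (5 ms) and a narrowband (30 ms) spectrogram."""
    broadband = log_spectrogram(signal, sample_rate, window_ms=5.0)
    narrowband = log_spectrogram(signal, sample_rate, window_ms=30.0)
    return np.stack([broadband, narrowband], axis=0)  # (2, bins, frames)


# Usage: one second of a synthetic 1 kHz tone at an assumed 16 kHz rate.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2.0 * np.pi * 1000.0 * t)
dual = dual_bandwidth_spectrogram(tone, sample_rate)
print(dual.shape)  # (2, 257, 101)
```

Sharing the hop and FFT size between the two channels is one simple way to make the shapes match. The short window gives fine time resolution with coarse frequency resolution, and the long window gives the reverse, which is the complementary information the abstract refers to.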
