A framework for speaker retrieval and identification through unsupervised learning

Campos, Victor de Abreu [UNESP]; Pedronette, Daniel Carlos Guimarães [UNESP]

dc.contributor.author	Campos, Victor de Abreu [UNESP]
dc.contributor.author	Pedronette, Daniel Carlos Guimarães [UNESP]
dc.contributor.institution	Universidade Estadual Paulista (Unesp)
dc.date.accessioned	2019-10-06T15:41:55Z
dc.date.available	2019-10-06T15:41:55Z
dc.date.issued	2019-11-01
dc.description.abstract	Speaker recognition is a highly relevant task with applications in diverse domains. Recently, mainly because audio-visual content has become easy to acquire, the ability to analyze growing datasets without relying on labeled data has become a crucial advantage. This paper presents a speaker recognition approach based on recent unsupervised learning methods, which require neither labeled data nor user intervention. The approach is organized as a framework that exploits a rank-based formulation: the similarity information defined by speaker modeling techniques is encoded in ranked lists, which the unsupervised learning algorithms take as input. Vector quantization, Gaussian mixture models, and i-vectors are employed as modeling techniques, while the RL-Sim and ReckNN algorithms are used for the unsupervised learning tasks. The framework was experimentally evaluated on query-by-example speaker retrieval and speaker identification tasks, on both clean and noisy speech recordings, using three public datasets that cover different languages and recording conditions. Effectiveness gains of up to +56% on retrieval measures were obtained by using the unsupervised learning algorithms, compared with traditional speaker recognition techniques.	en
dc.description.affiliation	Department of Statistics, Applied Mathematics and Computing, State University of São Paulo (UNESP)
dc.description.affiliationUnesp	Department of Statistics, Applied Mathematics and Computing, State University of São Paulo (UNESP)
dc.description.sponsorship	Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
dc.description.sponsorship	Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
dc.description.sponsorshipId	FAPESP: #2015/07934-4
dc.description.sponsorshipId	FAPESP: #2017/25908-6
dc.description.sponsorshipId	FAPESP: #2018/15597-6
dc.description.sponsorshipId	CNPq: #308194/2017-9
dc.format.extent	153-174
dc.identifier	http://dx.doi.org/10.1016/j.csl.2019.04.004
dc.identifier.citation	Computer Speech and Language, v. 58, p. 153-174.
dc.identifier.doi	10.1016/j.csl.2019.04.004
dc.identifier.issn	1095-8363
dc.identifier.issn	0885-2308
dc.identifier.scopus	2-s2.0-85065105944
dc.identifier.uri	http://hdl.handle.net/11449/187617
dc.language.iso	eng
dc.relation.ispartof	Computer Speech and Language
dc.rights.accessRights	Open access
dc.source	Scopus
dc.subject	Gaussian mixture model
dc.subject	i-vector
dc.subject	Speaker recognition
dc.subject	Speaker retrieval
dc.subject	Unsupervised learning
dc.subject	Vector quantization
dc.title	A framework for speaker retrieval and identification through unsupervised learning	en
dc.type	Article
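
The rank-based formulation described in the abstract can be sketched roughly as follows. This is only a minimal illustration, assuming a precomputed pairwise distance matrix between speaker models. The function names, the toy data, and the neighbor-overlap re-ranking rule are hypothetical; they are not the RL-Sim or ReckNN algorithms evaluated in the paper.

```python
import numpy as np


def ranked_lists(dist):
    """For each item, return all item indices sorted by increasing distance."""
    return np.argsort(dist, axis=1, kind="stable")


def rerank_by_neighbor_overlap(dist, k):
    """Re-rank items by how many top-k neighbors they share (illustrative rule)."""
    ranks = ranked_lists(dist)
    n = dist.shape[0]
    # Top-k neighbor set of every item, taken from its ranked list.
    topk = [set(ranks[i, :k].tolist()) for i in range(n)]
    overlap = np.array(
        [[len(topk[i] & topk[j]) for j in range(n)] for i in range(n)]
    )
    # Primary key: larger overlap first; ties broken by the original distance.
    return np.array([np.lexsort((dist[i], -overlap[i])) for i in range(n)])


# Toy 1-D "speaker models" (assumed data): two well-separated groups.
features = np.array([0.0, 1.0, 3.0, 10.0, 12.0])
dist = np.abs(features[:, None] - features[None, :])

ranks = ranked_lists(dist)
reranked = rerank_by_neighbor_overlap(dist, k=2)
print(ranks[3].tolist())
print(reranked[3].tolist())
```

No labels are used at any point: the only input is the distance matrix, which in the paper's setting would come from a speaker modeling technique such as vector quantization, Gaussian mixture models, or i-vectors.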
