
Publication:
MilkQA: A dataset of consumer questions for the task of answer selection

dc.contributor.author: Criscuolo, Marcelo
dc.contributor.author: Fonseca, Erick Rocha
dc.contributor.author: Aluisio, Sandra Maria
dc.contributor.author: Speranca-Criscuolo, Ana Carolina [UNESP]
dc.contributor.institution: Universidade de São Paulo (USP)
dc.contributor.institution: Universidade Estadual Paulista (Unesp)
dc.date.accessioned: 2018-12-11T16:54:17Z
dc.date.available: 2018-12-11T16:54:17Z
dc.date.issued: 2018-01-04
dc.description.abstract: We introduce MilkQA, a question answering dataset from the dairy domain dedicated to the study of consumer questions. The dataset contains 2,657 pairs of questions and answers, written in Portuguese and originally collected by the Brazilian Agricultural Research Corporation (Embrapa). All questions were motivated by real situations and written by thousands of authors with very different backgrounds and levels of literacy, while answers were elaborated by specialists from Embrapa's customer service. Our dataset was filtered and anonymized by three human annotators. Consumer questions are a challenging kind of question, usually posed as a way of seeking information. Although several question answering datasets are available, most such resources are not suitable for research on answer selection models for consumer questions. We aim to fill this gap by making MilkQA publicly available. We study the behavior of four answer selection models on MilkQA: two baseline models and two convolutional neural network architectures. Our results show that MilkQA poses real challenges to computational models, particularly due to the linguistic characteristics of its questions and their unusually long length. Only one of the models tested gives reasonable results, at the cost of high computational requirements. (en)
dc.description.affiliation: University of São Paulo (USP) Institute of Mathematics and Computer Sciences
dc.description.affiliation: São Paulo State University (Unesp) College of Letters and Sciences
dc.description.affiliationUnesp: São Paulo State University (Unesp) College of Letters and Sciences
dc.format.extent: 354-359
dc.identifier: http://dx.doi.org/10.1109/BRACIS.2017.12
dc.identifier.citation: Proceedings - 2017 Brazilian Conference on Intelligent Systems, BRACIS 2017, v. 2018-January, p. 354-359.
dc.identifier.doi: 10.1109/BRACIS.2017.12
dc.identifier.scopus: 2-s2.0-85049513654
dc.identifier.uri: http://hdl.handle.net/11449/171183
dc.language.iso: eng
dc.relation.ispartof: Proceedings - 2017 Brazilian Conference on Intelligent Systems, BRACIS 2017
dc.rights.accessRights: Open access
dc.source: Scopus
dc.title: MilkQA: A dataset of consumer questions for the task of answer selection (en)
dc.type: Conference paper
dspace.entity.type: Publication
