Evaluation of Transformer-Based Large Language Models for Email Spam Detection Using BERT, Phi, and Gemma

dc.contributor.author: Grassmann, Ana Clara C. [UNESP]
dc.contributor.author: Feitosa, Juliana C. [UNESP]
dc.contributor.author: Brega, José R.F. [UNESP]
dc.contributor.author: da Costa, Kelton A.P. [UNESP]
dc.contributor.institution: Universidade Estadual Paulista (UNESP)
dc.date.accessioned: 2025-04-29T20:01:13Z
dc.date.issued: 2025-01-01
dc.description.abstract: In this paper, we study how LLMs based on the transformer architecture work and the possibility of adjusting these models to use only the body of email messages to classify them as spam or ham. The models studied are BERT, Gemma, and Phi. All of them underwent quantization stages, fine-tuning with a real dataset, and evaluation with metrics commonly used in binary classification problems. The Gemma model achieves over 99% accuracy in detecting spam, standing out as the best among the compared models. [en]
dc.description.affiliation: Department of Computing São Paulo State University - UNESP
dc.description.affiliationUnesp: Department of Computing São Paulo State University - UNESP
dc.description.sponsorship: Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
dc.description.sponsorshipId: FAPESP: 2023/12830-0
dc.format.extent: 459-473
dc.identifier: http://dx.doi.org/10.19139/soic-2310-5070-2267
dc.identifier.citation: Statistics, Optimization and Information Computing, v. 13, n. 2, p. 459-473, 2025.
dc.identifier.doi: 10.19139/soic-2310-5070-2267
dc.identifier.issn: 2310-5070
dc.identifier.issn: 2311-004X
dc.identifier.scopus: 2-s2.0-85216991328
dc.identifier.uri: https://hdl.handle.net/11449/304869
dc.language.iso: eng
dc.relation.ispartof: Statistics, Optimization and Information Computing
dc.source: Scopus
dc.subject: Binary Classification
dc.subject: Cybersecurity
dc.subject: Fine-Tuning
dc.subject: Large Language Models
dc.subject: Spam Detection
dc.title: Evaluation of Transformer-Based Large Language Models for Email Spam Detection Using BERT, Phi, and Gemma [en]
dc.type: Artigo (Article) [pt]
dspace.entity.type: Publication
