Semi-supervised approach for detecting malicious domains in TLDs in their first query

dc.contributor.author: Silveira, Marcos Rogério [UNESP]
dc.contributor.author: Cansian, Adriano Mauro [UNESP]
dc.contributor.author: Kobayashi, Hugo Koji
dc.contributor.institution: Universidade Estadual Paulista (UNESP)
dc.contributor.institution: Brazilian Network Information Center (NIC.br)
dc.date.accessioned: 2025-04-29T19:29:47Z
dc.date.issued: 2025-04-01
dc.description.abstract: The Domain Name System (DNS) is essential for the functioning of the Internet, and due to its importance, malicious users exploit this structure to register domains capable of spreading phishing and malware. This study presents a method to detect malicious domains recently registered in Top-Level Domains (TLDs) based on the first DNS query. The approach is semi-supervised, combining supervised and unsupervised machine learning. We use a combination of two supervised algorithms and clustering for analysis. The results of the models feed into a final classifier, providing a probability of maliciousness for the domain. For the training of supervised models, the data is balanced using a hybrid technique of undersampling and oversampling. The training of the unsupervised model focuses exclusively on malicious domains, creating distinct malicious clusters. The models are evaluated in a real environment and can be updated through a retraining module when necessary. The results indicate an AUC of 0.9620 (± 0.01) and an ACC of 0.91 during training, with notable metrics of ACC 0.88, TPR 0.884, TNR 0.875, FPR 0.124, and FNR 0.110 in the testing phase simulating production. This approach provides a robust solution for the early detection of malicious domains in TLDs. (en)
dc.description.affiliation: Sao Paulo State University (UNESP), Cristóvão Colombo, 2265, SP
dc.description.affiliation: Brazilian Network Information Center (NIC.br), Av. das Nações Unidas, 11541, 7th Floor, SP
dc.description.affiliationUnesp: Sao Paulo State University (UNESP), Cristóvão Colombo, 2265, SP
dc.description.sponsorship: Fundação para o Desenvolvimento da UNESP (FUNDUNESP)
dc.description.sponsorshipId: FUNDUNESP: 2764/2018
dc.identifier: http://dx.doi.org/10.1007/s10207-025-00996-3
dc.identifier.citation: International Journal of Information Security, v. 24, n. 2, 2025.
dc.identifier.doi: 10.1007/s10207-025-00996-3
dc.identifier.issn: 1615-5270
dc.identifier.issn: 1615-5262
dc.identifier.scopus: 2-s2.0-85219749862
dc.identifier.uri: https://hdl.handle.net/11449/303495
dc.language.iso: eng
dc.relation.ispartof: International Journal of Information Security
dc.source: Scopus
dc.subject: Domain name system
dc.subject: Machine learning
dc.subject: Malicious domain
dc.subject: Passive DNS
dc.subject: Semi-supervised machine learning
dc.title: Semi-supervised approach for detecting malicious domains in TLDs in their first query (en)
dc.type: Article
dspace.entity.type: Publication
unesp.campus: Universidade Estadual Paulista (UNESP), Instituto de Biociências, Letras e Ciências Exatas, São José do Rio Preto (pt)
