Semi-supervised approach for detecting malicious domains in TLDs in their first query
| dc.contributor.author | Silveira, Marcos Rogério [UNESP] | |
| dc.contributor.author | Cansian, Adriano Mauro [UNESP] | |
| dc.contributor.author | Kobayashi, Hugo Koji | |
| dc.contributor.institution | Universidade Estadual Paulista (UNESP) | |
| dc.contributor.institution | Brazilian Network Information Center (NIC.br) | |
| dc.date.accessioned | 2025-04-29T19:29:47Z | |
| dc.date.issued | 2025-04-01 | |
| dc.description.abstract | The Domain Name System (DNS) is essential for the functioning of the Internet, and due to its importance, malicious users exploit this structure to register domains capable of spreading phishing and malware. This study presents a method to detect malicious domains recently registered in Top-Level Domains (TLDs) based on the first DNS query. The approach is semi-supervised, combining supervised and unsupervised machine learning. We use a combination of two supervised algorithms and clustering for analysis. The results of the models feed into a final classifier, providing a probability of maliciousness for the domain. For the training of supervised models, the data is balanced using a hybrid technique of undersampling and oversampling. The training of the unsupervised model focuses exclusively on malicious domains, creating distinct malicious clusters. The models are evaluated in a real environment and can be updated through a retraining module when necessary. The results indicate an AUC of 0.9620 (± 0.01) and an ACC of 0.91 during training, with notable metrics of ACC 0.88, TPR 0.884, TNR 0.875, FPR 0.124, and FNR 0.110 in the testing phase simulating production. This approach provides a robust solution for the early detection of malicious domains in TLDs. | en |
| dc.description.affiliation | Sao Paulo State University (UNESP), Cristóvão Colombo, 2265, SP | |
| dc.description.affiliation | Brazilian Network Information Center (NIC.br), Av. das Nações Unidas, 11541, 7th Floor, SP | |
| dc.description.affiliationUnesp | Sao Paulo State University (UNESP), Cristóvão Colombo, 2265, SP | |
| dc.description.sponsorship | Fundação para o Desenvolvimento da UNESP (FUNDUNESP) | |
| dc.description.sponsorshipId | FUNDUNESP: 2764/2018 | |
| dc.identifier | http://dx.doi.org/10.1007/s10207-025-00996-3 | |
| dc.identifier.citation | International Journal of Information Security, v. 24, n. 2, 2025. | |
| dc.identifier.doi | 10.1007/s10207-025-00996-3 | |
| dc.identifier.issn | 1615-5270 | |
| dc.identifier.issn | 1615-5262 | |
| dc.identifier.scopus | 2-s2.0-85219749862 | |
| dc.identifier.uri | https://hdl.handle.net/11449/303495 | |
| dc.language.iso | eng | |
| dc.relation.ispartof | International Journal of Information Security | |
| dc.source | Scopus | |
| dc.subject | Domain name system | |
| dc.subject | Machine learning | |
| dc.subject | Malicious domain | |
| dc.subject | Passive DNS | |
| dc.subject | Semi-supervised machine learning | |
| dc.title | Semi-supervised approach for detecting malicious domains in TLDs in their first query | en |
| dc.type | Article | en |
| dspace.entity.type | Publication | |
| unesp.campus | Universidade Estadual Paulista (UNESP), Instituto de Biociências, Letras e Ciências Exatas, São José do Rio Preto | pt |
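
The abstract in this record reports ACC, TPR, TNR, FPR, and FNR for the testing phase. As a reading aid, the sketch below shows how these rates are conventionally derived from confusion-matrix counts, treating "malicious" as the positive class. The `ConfusionCounts` type, the `rates` function, and the counts in the example are hypothetical and chosen only for illustration; they are not taken from the paper and do not reproduce its evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container; 'malicious' is treated as the positive class."""
    tp: int  # malicious domains flagged as malicious
    fn: int  # malicious domains classified as benign
    tn: int  # benign domains classified as benign
    fp: int  # benign domains flagged as malicious


def rates(c: ConfusionCounts) -> dict[str, float]:
    """Return the standard rates computed from confusion-matrix counts."""
    positives = c.tp + c.fn
    negatives = c.tn + c.fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one malicious and one benign sample")
    return {
        "ACC": (c.tp + c.tn) / (positives + negatives),
        "TPR": c.tp / positives,  # true positive rate (recall)
        "FNR": c.fn / positives,  # equals 1 - TPR
        "TNR": c.tn / negatives,  # true negative rate (specificity)
        "FPR": c.fp / negatives,  # equals 1 - TNR
    }


# Illustrative counts only (an assumption, not data from the paper).
example = rates(ConfusionCounts(tp=90, fn=10, tn=80, fp=20))
```

By definition TPR and FNR sum to 1, as do TNR and FPR, so independently rounded published values can differ from these identities in the last digit.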

