Semi-supervised approach for detecting malicious domains in TLDs in their first query
Type
Article
Abstract
The Domain Name System (DNS) is essential to the functioning of the Internet, and because of its importance, malicious users exploit it by registering domains that spread phishing and malware. This study presents a method for detecting malicious domains recently registered in Top-Level Domains (TLDs) based on their first DNS query. The approach is semi-supervised: it combines two supervised algorithms with clustering. The outputs of these models feed a final classifier, which yields a probability that the domain is malicious. For training the supervised models, the data are balanced using a hybrid undersampling and oversampling technique. The unsupervised model is trained exclusively on malicious domains, producing distinct malicious clusters. The models are evaluated in a real environment and can be updated through a retraining module when necessary. The results show an AUC of 0.9620 (± 0.01) and an accuracy (ACC) of 0.91 during training, and an ACC of 0.88, a true positive rate (TPR) of 0.884, a true negative rate (TNR) of 0.875, a false positive rate (FPR) of 0.124, and a false negative rate (FNR) of 0.110 in a testing phase that simulates production. The approach thus supports the early detection of malicious domains in TLDs.
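The abstract mentions a hybrid undersampling and oversampling step but does not say which technique is used. As a rough illustration only, the sketch below assumes the simplest variant: the majority class is randomly undersampled and the minority class is randomly oversampled (by duplication) until both reach the midpoint of the two original class sizes. The function name `hybrid_balance`, the midpoint target, and the toy domain names are all hypothetical and are not taken from the article.

```python
import random
from collections import Counter


def hybrid_balance(samples, labels, seed=0):
    """Balance a binary dataset (illustrative assumption, not the paper's method).

    The larger class is randomly undersampled and the smaller class is
    randomly oversampled with duplicates, both to the midpoint of the two
    original class sizes.
    """
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    if len(by_class) != 2:
        raise ValueError("expected exactly two classes")

    sizes = [len(xs) for xs in by_class.values()]
    target = (min(sizes) + max(sizes)) // 2  # assumed common class size

    balanced = []
    for y, xs in by_class.items():
        if len(xs) >= target:
            # Undersample: draw `target` items without replacement.
            chosen = rng.sample(xs, target)
        else:
            # Oversample: keep every original, then add random duplicates.
            chosen = xs + rng.choices(xs, k=target - len(xs))
        balanced.extend((x, y) for x in chosen)

    rng.shuffle(balanced)
    new_samples = [x for x, _ in balanced]
    new_labels = [y for _, y in balanced]
    return new_samples, new_labels


# Toy data: 90 benign (label 0) and 10 malicious (label 1) placeholder names.
domains = [f"benign{i}.example" for i in range(90)]
domains += [f"malicious{i}.example" for i in range(10)]
labels = [0] * 90 + [1] * 10

bal_x, bal_y = hybrid_balance(domains, labels)
print(Counter(bal_y))  # both classes now have 50 samples
```

In practice, duplication-based oversampling is often replaced by a synthetic method, and the balancing would be applied to the training split only, so that evaluation data keep their natural class distribution.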
Keywords
Domain name system, Machine learning, Malicious domain, Passive DNS, Semi-supervised machine learning
Language
English
Citation
International Journal of Information Security, v. 24, n. 2, 2025.





