Semi-supervised approach for detecting malicious domains in TLDs in their first query

Type

Article

Abstract

The Domain Name System (DNS) is essential for the functioning of the Internet, and due to its importance, malicious users exploit this structure to register domains capable of spreading phishing and malware. This study presents a method to detect malicious domains recently registered in Top-Level Domains (TLDs) based on the first DNS query. The approach is semi-supervised, combining supervised and unsupervised machine learning. We use a combination of two supervised algorithms and clustering for analysis. The results of the models feed into a final classifier, providing a probability of maliciousness for the domain. For the training of supervised models, the data is balanced using a hybrid technique of undersampling and oversampling. The training of the unsupervised model focuses exclusively on malicious domains, creating distinct malicious clusters. The models are evaluated in a real environment and can be updated through a retraining module when necessary. The results indicate an AUC of 0.9620 (± 0.01) and an ACC of 0.91 during training, with notable metrics of ACC 0.88, TPR 0.884, TNR 0.875, FPR 0.124, and FNR 0.110 in the testing phase simulating production. This approach provides a robust solution for the early detection of malicious domains in TLDs.
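The abstract mentions balancing the supervised training data with a hybrid of undersampling and oversampling but does not name the specific technique or the target class ratio. The sketch below is only an illustration of the general idea under stated assumptions: it uses plain random undersampling and random duplication (not necessarily what the authors used), and the function name `hybrid_resample`, the midpoint target size, and the example domain names are all hypothetical.

```python
import random
from collections import Counter


def hybrid_resample(samples, labels, target=None, seed=0):
    """Balance a binary dataset by shrinking the majority class and
    growing the minority class to a common size.

    ASSUMPTION: `target` defaults to the midpoint between the two class
    sizes; the abstract does not state which ratio the authors used.
    """
    rng = random.Random(seed)

    # Group the samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    if len(by_class) != 2:
        raise ValueError("expected exactly two classes")

    if target is None:
        target = sum(len(xs) for xs in by_class.values()) // 2

    balanced = []
    for y, xs in by_class.items():
        if len(xs) >= target:
            # Undersample: draw `target` items without replacement.
            chosen = rng.sample(xs, target)
        else:
            # Oversample: keep every item, then add random duplicates.
            # (A stand-in for synthetic methods such as SMOTE.)
            chosen = xs + rng.choices(xs, k=target - len(xs))
        balanced.extend((x, y) for x in chosen)

    rng.shuffle(balanced)
    return [x for x, _ in balanced], [y for _, y in balanced]


# Hypothetical toy data: 90 benign (label 0) and 10 malicious (label 1) domains.
domains = [f"benign{i}.example" for i in range(90)] + [
    f"bad{i}.example" for i in range(10)
]
labels = [0] * 90 + [1] * 10

balanced_domains, balanced_labels = hybrid_resample(domains, labels)
print(Counter(balanced_labels))
```

With this toy input, the benign class is cut from 90 to 50 and the malicious class is raised from 10 to 50, so both classes contribute equally to training. Random duplication is the simplest possible oversampler; a real pipeline would likely resample feature vectors rather than raw domain strings.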

Keywords

Domain name system, Machine learning, Malicious domain, Passive DNS, Semi-supervised machine learning

Language

English

Citation

International Journal of Information Security, v. 24, n. 2, 2025.
