Early Identification of Abused Domains in TLD through Passive DNS Applying Machine Learning Techniques

Date

2022-04-01

Authors

Silva, Leandro Marcos da [UNESP]
Silveira, Marcos Rogério [UNESP]
Cansian, Adriano Mauro [UNESP]
Kobayashi, Hugo Koji

Abstract

DNS is vital to the proper functioning of the Internet. However, malicious users exploit this infrastructure by registering domains for abuse and using them to carry out a wide variety of attacks. Early detection of abused domains therefore keeps more people from falling for scams. In this work, an approach for identifying abused domains was developed using passive DNS data collected from an authoritative TLD DNS server and enriched with geolocation, which provides a global view of the domains. The system monitors each domain during the first seven days of its life, counted from its first DNS query, and performs two behavior checks: the first at three days and the second at seven days. The models use the LightGBM machine learning algorithm and, because the data are imbalanced, a combination of the Cluster Centroids and K-Means SMOTE resampling techniques. The three-day model obtained an average AUC of 0.9673 and the seven-day model an average AUC of 0.9674. Validation in a test environment reached a TPR of 0.8656 for the three-day model and 0.8682 for the seven-day model. These results indicate that the system performs satisfactorily for the early identification of abused domains and underscore the importance of a TLD in identifying these domains.
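
The monitoring schedule described in the abstract (a window opened by a domain's first DNS query, with behavior checks at three and seven days) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `checkpoints`, `due_checks`, and `queries_in_window` are hypothetical, and the assumption that each check looks only at queries observed since the first query is an illustrative reading of the abstract.

```python
from datetime import datetime, timedelta, timezone

# Checkpoint offsets taken from the abstract: checks at three and seven days.
CHECKPOINT_DAYS = (3, 7)


def checkpoints(first_query):
    """Map each checkpoint length (in days) to the time its check becomes due."""
    return {d: first_query + timedelta(days=d) for d in CHECKPOINT_DAYS}


def due_checks(first_query, now):
    """Return the checkpoint lengths whose observation window has fully elapsed."""
    return [d for d, due in checkpoints(first_query).items() if now >= due]


def queries_in_window(query_times, first_query, days):
    """Keep only queries seen within `days` of the first query (assumed window)."""
    end = first_query + timedelta(days=days)
    return [t for t in query_times if first_query <= t < end]
```

In this sketch, a domain first seen on a given day would be scored by the three-day model once `due_checks` reports `3`, and by the seven-day model once it also reports `7`; the features fed to each model would be computed from the output of `queries_in_window`.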

Keywords

abused domains in TLD, cybersecurity, data imbalanced, machine learning algorithms, passive DNS

How to cite

International Journal of Communication Networks and Information Security, v. 14, n. 1, p. 76-85, 2022.