Deep Convolutional Neural Network and Character Level Embedding for DGA Detection
| dc.contributor.author | Gregório, João Rafael [UNESP] | |
| dc.contributor.author | Cansian, Adriano Mauro [UNESP] | |
| dc.contributor.author | Neves, Leandro Alves [UNESP] | |
| dc.contributor.author | Salvadeo, Denis Henrique Pinheiro [UNESP] | |
| dc.contributor.institution | Universidade Estadual Paulista (UNESP) | |
| dc.date.accessioned | 2025-04-29T18:48:20Z | |
| dc.date.issued | 2024-01-01 | |
| dc.description.abstract | Domain generation algorithms (DGAs) generate domain names that botnets and malware commonly use to maintain and obfuscate communication between a bot client and command and control (C2) servers. In this work, a method is proposed to detect DGAs based on the classification of short texts, highlighting the use of character-level embedding at the neural network input to obtain meta-features related to the morphology of domain names. A convolutional neural network structure is used to extract new meta-features from the vectors provided by the embedding layer. Furthermore, ReLU layers are used to zero out all non-positive values, and max-pooling layers to analyze specific parts of the obtained meta-features. The tests were carried out using the Majestic Million dataset for examples of legitimate domains and the NetLab360 dataset for examples of DGA domains, composed of around 56 DGA families. The method achieves an average accuracy of 99.12% and a precision of 99.33%. This work contributes a natural language processing (NLP) approach to DGA detection, an analysis of the impact of character-level embedding, ReLU, and max pooling on the results obtained, and a DGA detection model based on deep neural networks that requires no feature engineering and achieves competitive metrics. | en |
| dc.description.affiliation | Department of Computer Science and Statistics (DCCE) São Paulo State University (UNESP), São José do Rio Preto | |
| dc.description.affiliation | Institute of Geosciences and Exact Sciences (IGCE) São Paulo State University (UNESP) | |
| dc.description.affiliationUnesp | Department of Computer Science and Statistics (DCCE) São Paulo State University (UNESP), São José do Rio Preto | |
| dc.description.affiliationUnesp | Institute of Geosciences and Exact Sciences (IGCE) São Paulo State University (UNESP) | |
| dc.description.sponsorship | Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) | |
| dc.description.sponsorshipId | CNPq: 313643/2021-0 | |
| dc.format.extent | 167-174 | |
| dc.identifier | http://dx.doi.org/10.5220/0012605700003690 | |
| dc.identifier.citation | International Conference on Enterprise Information Systems, ICEIS - Proceedings, v. 2, p. 167-174. | |
| dc.identifier.doi | 10.5220/0012605700003690 | |
| dc.identifier.issn | 2184-4992 | |
| dc.identifier.scopus | 2-s2.0-85193951302 | |
| dc.identifier.uri | https://hdl.handle.net/11449/300007 | |
| dc.language.iso | eng | |
| dc.relation.ispartof | International Conference on Enterprise Information Systems, ICEIS - Proceedings | |
| dc.source | Scopus | |
| dc.subject | Convolutional Neural Networks | |
| dc.subject | Cybersecurity | |
| dc.subject | DGA | |
| dc.subject | Domain Generation Algorithms | |
| dc.subject | Embedding | |
| dc.subject | NLP | |
| dc.subject | Short Text Classification | |
| dc.title | Deep Convolutional Neural Network and Character Level Embedding for DGA Detection | en |
| dc.type | Conference paper | en |
| dspace.entity.type | Publication | |
| unesp.author.orcid | 0000-0001-7783-2567[1] | |
| unesp.author.orcid | 0000-0003-4494-1454[2] | |
| unesp.author.orcid | 0000-0001-8580-7054[3] | |
| unesp.author.orcid | 0000-0001-8942-0033[4] | |
| unesp.campus | Universidade Estadual Paulista (UNESP), Instituto de Biociências, Letras e Ciências Exatas, São José do Rio Preto | pt |
| unesp.campus | Universidade Estadual Paulista (UNESP), Instituto de Geociências e Ciências Exatas, Rio Claro | pt |
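
The abstract describes a pipeline of character-level embedding, convolution, ReLU, and max pooling applied to domain names. The following is a minimal sketch of that data flow in plain Python, not the authors' implementation. The alphabet, the sequence length, the embedding and kernel sizes, and all function names are assumptions chosen for illustration. The weights are random rather than learned, and no classification layer is included.

```python
import random

# Hypothetical alphabet for domain names: lowercase letters, digits, hyphen, dot.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-."
CHAR_TO_ID = {ch: i + 1 for i, ch in enumerate(ALPHABET)}  # id 0 = padding/unknown


def encode(domain, max_len=16):
    """Map each character to an integer id, then pad or truncate to max_len."""
    ids = [CHAR_TO_ID.get(ch, 0) for ch in domain.lower()[:max_len]]
    return ids + [0] * (max_len - len(ids))


def make_embedding(vocab_size, dim, rng):
    """Random embedding table; in a trained model these weights are learned."""
    table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]
    table[0] = [0.0] * dim  # the padding id maps to the zero vector
    return table


def conv1d(seq, kernel, bias=0.0):
    """Valid 1D convolution of one filter over a sequence of embedding vectors."""
    width = len(kernel)
    out = []
    for start in range(len(seq) - width + 1):
        total = bias
        for k in range(width):
            total += sum(x * w for x, w in zip(seq[start + k], kernel[k]))
        out.append(total)
    return out


def relu(values):
    """Zero out all non-positive values."""
    return [max(0.0, v) for v in values]


def max_pool(values, size):
    """Non-overlapping max pooling over windows of `size`."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def extract_features(domain, embedding, kernels, max_len=16, pool_size=2):
    """Embedding -> convolution -> ReLU -> max pooling, one feature map per kernel."""
    vectors = [embedding[i] for i in encode(domain, max_len)]
    return [max_pool(relu(conv1d(vectors, k)), pool_size) for k in kernels]


# Assumed toy sizes; a real model would use larger values and trained weights.
rng = random.Random(0)
EMBED_DIM, KERNEL_WIDTH, NUM_FILTERS = 4, 3, 2
embedding = make_embedding(len(ALPHABET) + 1, EMBED_DIM, rng)
kernels = [
    [[rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for _ in range(KERNEL_WIDTH)]
    for _ in range(NUM_FILTERS)
]
features = extract_features("example.com", embedding, kernels)
```

With these assumed sizes, each of the two filters yields a feature map of seven non-negative values. In a full model, feature maps of this kind would feed further layers and a final classifier that separates legitimate from DGA-generated domains.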

