Part of: IEEE Big Data 1st International Workshop on Big Data Analytic for Cyber Crime Investigation and Prevention 2017 (IEEE, 2017)
Blacklists and whitelists are often employed to filter outgoing and incoming traffic on computer networks. One central function of these lists is to mitigate the security risks posed by malware by associating a reputation (for instance benign or malicious) with end-point hosts. Creating and maintaining these lists is a complex and time-consuming process for security experts. As a consequence, blacklists and whitelists are prone to various errors, inconsistencies and omissions, as only a tiny fraction of end-point hosts are effectively covered by the reputation lists. In this paper, we present a machine learning model that automatically detects whether domain names and IP addresses are benign, malicious or sinkholes. The model relies on a deep neural architecture and is trained on a large passive DNS database. Evaluation results demonstrate the effectiveness of the approach: the model detects malicious DNS records with an F1 score of 0.96. Concretely, the model detects 95% of the malicious hosts at a false positive rate of 1 in 1,000.
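The F1 score, detection rate (recall) and false positive rate quoted above are all derived from the same confusion matrix. The following minimal sketch shows how they relate. The function name `binary_metrics` and the counts in the example are hypothetical, chosen only to illustrate the arithmetic; they are not the evaluation data of the paper.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp: malicious hosts flagged as malicious
    fp: benign hosts flagged as malicious
    tn: benign hosts left unflagged
    fn: malicious hosts left unflagged
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # detection rate
    fpr = fp / (fp + tn) if (fp + tn) else 0.0         # false positive rate
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0              # harmonic mean of precision and recall
    return {"precision": precision, "recall": recall, "fpr": fpr, "f1": f1}


# Hypothetical counts (NOT from the paper): 1,000 malicious and 100,000 benign hosts.
example = binary_metrics(tp=950, fp=100, tn=99_900, fn=50)
print(example)
```

With these illustrative counts the recall is 0.95 and the false positive rate is 0.001, yet the resulting F1 depends on how many benign hosts there are relative to malicious ones. The same recall and false positive rate can therefore correspond to different F1 scores on differently balanced datasets.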