INTRUSION DETECTION IN COMPUTER NETWORKS USING LATENT SPACE REPRESENTATION AND MACHINE LEARNING
Keywords: intrusion detection, machine learning, clustering, traffic detection, anomalies, neural nets
Anomaly detection (AD) identifies samples that deviate from the overall distribution in the feature space. The problem has a long research history spanning diverse methods, from statistical techniques to modern deep neural networks (DNNs). Non-trivial issues, such as handling ambiguous user actions and the complexity of standard algorithms, have challenged researchers. This article discusses the results of introducing an intrusion detection system based on a machine learning (ML) approach and compares them with the characteristics of Snort, the most common existing rule-based system. Signature-based intrusion detection systems (SBIDS) have critical limitations that are well documented in a large number of previous studies. The crucial disadvantage is that, because all rules are predetermined, only a limited range of variations of the same attack type can be recognized. DNNs with long short-term memory (LSTM) address this problem. However, because of the amount of data and resources required for training, this solution is not suitable for a real-world system. This necessitated a compromise solution based on DNN and latent space techniques.
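The abstract does not specify the model, so the following is only a minimal sketch of the general idea of latent-space anomaly detection, not the authors' method. It assumes a one-dimensional linear latent space (the leading principal direction, found by power iteration) as a stand-in for a trained DNN encoder, synthetic three-feature "traffic" data, and a hypothetical 99th-percentile threshold on reconstruction error. All function names are illustrative.

```python
import math
import random


def mean_vector(rows):
    """Per-feature mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def top_direction(centered, iters=200):
    """Leading principal direction (unit vector) via power iteration."""
    dim = len(centered[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # w = (X^T X) v, computed without forming the covariance matrix.
        proj = [sum(a * b for a, b in zip(x, v)) for x in centered]
        w = [sum(p * x[j] for p, x in zip(proj, centered)) for j in range(dim)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def score(x, mu, v):
    """Reconstruction error after encoding x into the 1-D latent space."""
    c = [a - m for a, m in zip(x, mu)]
    z = sum(ci * vi for ci, vi in zip(c, v))  # latent code
    recon = [z * vi for vi in v]              # decode back to feature space
    return math.sqrt(sum((ci - ri) ** 2 for ci, ri in zip(c, recon)))


def fit(train_rows, quantile=0.99):
    """Fit on normal samples only; threshold is an assumed error quantile."""
    mu = mean_vector(train_rows)
    centered = [[a - m for a, m in zip(x, mu)] for x in train_rows]
    v = top_direction(centered)
    errors = sorted(score(x, mu, v) for x in train_rows)
    threshold = errors[min(len(errors) - 1, int(quantile * len(errors)))]
    return mu, v, threshold


def is_anomaly(x, model):
    mu, v, threshold = model
    return score(x, mu, v) > threshold


# Synthetic "normal" samples lying near a line in 3-D feature space.
rng = random.Random(0)
train = []
for _ in range(200):
    t = rng.uniform(0.0, 10.0)
    train.append([t + rng.gauss(0, 0.05),
                  2 * t + rng.gauss(0, 0.05),
                  0.5 * t + rng.gauss(0, 0.05)])

model = fit(train)
on_pattern = is_anomaly([5.0, 10.0, 2.5], model)   # follows the normal pattern
off_pattern = is_anomaly([5.0, 0.0, 9.0], model)   # far from the normal pattern
```

A sample that follows the pattern of the training data reconstructs well from its latent code and stays below the threshold, while a sample far from that pattern has a large reconstruction error and is flagged. A nonlinear encoder would replace `top_direction` and `score` in a DNN-based variant.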
License
International Journal of Computing is an open access journal. Authors who publish with this journal agree to the following terms:
• Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
• Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
• Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.