Use of transfer learning to improve the detection of zero-day attacks

Protecting critical infrastructures (CIs) has become a priority in recent years. CIs have adopted new technologies that have increased their efficiency but, unfortunately, have also made them more vulnerable to cyberattacks, especially in the Industrial Internet of Things (IIoT), where industrial devices such as sensors and actuators are connected through wireless networks.

There is no doubt that IIoT will positively impact industries across different verticals, such as transportation, healthcare and energy, to name just a few. However, IIoT is vulnerable to a wide range of cyberattacks that can cause reputational and financial harm to organizations. To address this weakness, several ongoing efforts aim to develop efficient security approaches that protect IIoT systems through Intrusion Detection Systems (IDSs).

Indeed, recent works on IIoT security provisioning propose the adoption of machine learning (ML) and deep learning (DL) techniques to enhance IDSs. Early contributions made extensive use of ML techniques, but they lacked feature engineering and achieved low detection rates. In addition, ML-based solutions fail to identify different types of threats and intrusions, especially unforeseen and unpredictable attacks. DL techniques were subsequently adopted to overcome these constraints, resulting in a notable improvement in the ability of ML-based solutions to prevent attacks by identifying patterns that deviate from normal behaviour, thus increasing detection accuracy and reducing false positives.

DL-based IDSs have demonstrated their ability to extract complex patterns when a large collection of labelled data is available to train the classification models that detect intrusions. However, in IIoT environments such large collections of labelled data are lacking for zero-day attacks, and even for known families of attacks. In fact, in IIoT networks training data is scarce, time-consuming to collect, or occasionally non-existent. Moreover, when a new intrusion is detected, DL models must be retrained from scratch with the new data, which requires a huge amount of computing resources and time. Thus, DL-based IDSs struggle with the challenges of IIoT networks, where datasets are scarce and unbalanced and devices usually have limited computing capabilities.

The emergence of transfer learning (TL) helps IDSs overcome the well-known limitations in detecting zero-day attacks and evolving threats, as well as in effectively detecting cyberattacks in networks with scarce and unbalanced datasets. TL is a recent advance in ML that applies knowledge previously learned in a related source domain to a target domain, producing a high-performance learner for the target domain. TL has proved effective in natural language processing (NLP) and computer vision (CV). For example, image classification models trained to detect different categories of objects are repurposed for new, different but related domains, and transferring the knowledge gives better results than training on the new image dataset from scratch. Research works show that the performance of a model built using TL is similar to that obtained by DL models, even when the TL model is trained with only one to ten per cent of the labelled training data.
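The source-to-target idea can be illustrated with a minimal sketch in plain Python. It is not the method used by any particular IDS: the two-feature synthetic "flows", the tiny one-hidden-layer network, and all names (`make_flows`, `init_model`, `train`, `freeze_features`) are assumptions made purely for illustration. A model is first trained on a large labelled source set, then copied and fine-tuned on a handful of target samples while the transferred feature layer stays frozen.

```python
import copy
import math
import random


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def features(model, x):
    """Hidden layer: the part of the network that gets transferred."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(model["W"], model["b"])]


def predict(model, x):
    h = features(model, x)
    return sigmoid(sum(v * hi for v, hi in zip(model["v"], h)) + model["c"])


def train(model, data, epochs, lr, freeze_features=False):
    """Plain SGD on log loss; optionally update only the output layer."""
    for _ in range(epochs):
        for x, y in data:
            h = features(model, x)
            err = predict(model, x) - y  # d(loss)/d(output pre-activation)
            if not freeze_features:
                for j, row in enumerate(model["W"]):
                    g = err * model["v"][j] * (1.0 - h[j] ** 2)
                    for k in range(len(row)):
                        row[k] -= lr * g * x[k]
                    model["b"][j] -= lr * g
            for j in range(len(model["v"])):
                model["v"][j] -= lr * err * h[j]
            model["c"] -= lr * err


def init_model(rng, n_inputs, n_hidden):
    return {
        "W": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
              for _ in range(n_hidden)],
        "b": [0.0] * n_hidden,
        "v": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "c": 0.0,
    }


def make_flows(rng, n, shift):
    """Hypothetical synthetic traffic: label 1 = attack, 0 = benign."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = 1.5 if y else -1.5
        x = [rng.gauss(centre + shift, 1.0), rng.gauss(centre - shift, 1.0)]
        data.append((x, y))
    return data


def accuracy(model, data):
    hits = sum(1 for x, y in data if (predict(model, x) >= 0.5) == (y == 1))
    return hits / len(data)


rng = random.Random(0)
source_data = make_flows(rng, 400, shift=0.0)   # large labelled source set
target_data = make_flows(rng, 20, shift=0.5)    # scarce labelled target set
target_test = make_flows(rng, 200, shift=0.5)

# 1) Train the full network on the source domain.
source_model = init_model(rng, n_inputs=2, n_hidden=4)
train(source_model, source_data, epochs=20, lr=0.05)

# 2) Copy it and fine-tune only the output layer on the target domain.
target_model = copy.deepcopy(source_model)
train(target_model, target_data, epochs=20, lr=0.05, freeze_features=True)

print("target accuracy after fine-tuning:", accuracy(target_model, target_test))
```

Freezing the feature layer is only one possible design choice; fine-tuning all layers with a small learning rate is another common option, and which works better depends on how closely the source and target domains are related.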

Given the benefits TL has brought to other areas, it has recently been explored for IDSs. TL and network fine-tuning have been shown to improve IDSs even with unbalanced datasets and in the detection of zero-day attacks. Initial experimental results show that a TL-based IDS achieves excellent accuracy and a very low false positive rate (FPR). Moreover, detection rates improve significantly for the different families of known and novel attacks compared to previous DL-based IDSs. These promising results in detecting new intrusions point to a new research direction: enhancing existing IDSs by adopting TL models.
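For readers less familiar with the metrics mentioned above, the following sketch shows how accuracy, detection rate and FPR are commonly computed from binary predictions. The function name `detection_metrics` and the sample labels are hypothetical and are not taken from the experiments referred to in this post.

```python
def detection_metrics(y_true, y_pred):
    """Compute IDS metrics from binary labels (1 = attack, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "detection_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Illustrative labels only: four benign flows followed by four attacks.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

On unbalanced datasets accuracy alone can be misleading, which is why the detection rate and the FPR are usually reported alongside it.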

Author: The CRAAX team (UPC).