Anomaly detection in railway infrastructures

Raúl Rabadán, Ester Simó, Eva Rodríguez 

Universitat Politècnica de Catalunya (UPC)

Sensor-based IoT devices are increasingly integrated into critical infrastructures to enable continuous monitoring, which enhances operational performance and safety. Decisions related to operations, maintenance, and safety are based on the historical and real-time data collected from these devices. It is therefore essential to ensure the integrity of the data that reaches the processing systems. Securing IoT networks in critical infrastructure is key to mitigating risks, ensuring data integrity, and maintaining the continuity of essential services.

When monitoring critical systems, anomalies can be detected that may indicate system failures or even attacks on the system. Traditional anomaly detection methods have limitations: most rely on fixed thresholds and raise an alarm when a sensor reading exceeds them. Their effectiveness is limited because they do not take into account the dynamic nature of the monitored variables, whose values can fluctuate at certain times of the day. If a threshold is not adjusted correctly, normal fluctuations in the system can trigger false alerts. Conversely, sophisticated attacks that introduce progressive changes can remain within the thresholds until the problem becomes critical.

Forecasting tools are a key resource for predicting trends and mitigating risks. Along these lines, we propose a forecasting system based on machine learning models that does not depend on fixed thresholds. This approach favours intelligent and effective monitoring, since it learns the expected behaviour of the monitored variables from historical data and thus allows deviations to be detected in real time. The tool is composed of four modules that interact with each other. Each one is relatively simple on its own, but the interaction between them, together with the checks needed to ensure an adequate logical flow, adds a degree of complexity to the system.

Data Collection Module: Collects key system metrics, which are processed and stored both to feed the forecasting models and for future analysis and reference.

Forecasting Module: The time series forecasting component trains several machine learning models on historical data, compares their performance across different metrics, and selects the best-performing one to predict future values of the sensor data. This process is repeated periodically, so that the models used to make predictions are always trained on updated data. In this way, the dynamic nature of the monitored variables is taken into account.
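The train, compare, and select loop described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not name the models or metrics used, so three simple baseline forecasters stand in for the machine learning models, mean absolute error (MAE) stands in for the comparison metrics, and all function names, the holdout size, and the sample series are hypothetical.

```python
from statistics import mean

# Stand-in forecasters (assumption: the actual system trains ML models instead).
def naive_last(train, horizon):
    """Repeat the last observed value."""
    return [train[-1]] * horizon

def moving_average(train, horizon, window=3):
    """Repeat the mean of the last `window` observations."""
    return [mean(train[-window:])] * horizon

def seasonal_naive(train, horizon, period=4):
    """Repeat the last full cycle of length `period`."""
    return [train[-period + (i % period)] for i in range(horizon)]

def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def select_best_model(series, candidates, holdout=4):
    """Fit each candidate on the older data, score it on the most recent
    `holdout` points, and return the name of the lowest-error candidate."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {name: mae(test, forecast(train, len(test)))
              for name, forecast in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

CANDIDATES = {
    "naive_last": naive_last,
    "moving_average": moving_average,
    "seasonal_naive": seasonal_naive,
}

# Hypothetical sensor series with a repeating pattern of length 4.
series = [10, 14, 18, 12] * 4
best, scores = select_best_model(series, CANDIDATES)
print(best, scores)
```

Calling `select_best_model` again on a schedule, with the most recent data, would correspond to the periodic retraining mentioned above.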

Anomaly Detection Module: This component evaluates whether real-time IoT values are within an acceptable range. It relies on the Tukey fence method to find the limits of the acceptance range. For each time series, the upper and lower Tukey limits are periodically recalculated from the historical measurement values collected from the sensors. This method makes it possible to differentiate between normal fluctuations in sensor data and outliers that may indicate abnormal activity. When a sensor measurement falls outside the established boundaries, it is flagged as an abnormal value.
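As a minimal sketch of the Tukey fence check, the fences are placed at Q1 - k·IQR and Q3 + k·IQR, where Q1 and Q3 are the first and third quartiles of the historical values and IQR = Q3 - Q1. The multiplier k = 1.5 is the conventional choice and is an assumption here, as are the quartile interpolation method, the function names, and the sample readings.

```python
from statistics import quantiles

def tukey_fences(history, k=1.5):
    """Return (lower, upper) Tukey fences computed from historical readings.
    k = 1.5 is the conventional multiplier (assumed, not stated in the text)."""
    q1, _, q3 = quantiles(history, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_anomalous(value, fences):
    """True when the reading lies outside the acceptance range."""
    lower, upper = fences
    return value < lower or value > upper

# Hypothetical historical readings for one sensor.
history = [20.1, 20.4, 19.8, 20.0, 20.3, 19.9, 20.2, 20.1]
fences = tukey_fences(history)

print(fences)
print(is_anomalous(20.2, fences))  # reading inside the historical range
print(is_anomalous(25.0, fences))  # reading far above the historical range
```

Recomputing `fences` periodically from a fresh window of history would let the acceptance range follow the behaviour of each time series instead of staying fixed.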

Alerts and Notifications Module: When the system detects a significant deviation, an alert is generated automatically through the monitoring module, so that the administrators of the critical system can take corrective measures in time, allowing a rapid response to possible incidents.
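One possible shape for this step is sketched below. The text does not describe the alert format or the delivery channel, so the `Alert` fields, the `check_reading` function, the sensor identifier, and the use of a plain callback as the notification mechanism are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Alert:
    """Hypothetical alert record; the fields are illustrative assumptions."""
    sensor_id: str
    value: float
    lower: float
    upper: float
    timestamp: datetime

def check_reading(sensor_id, value, lower, upper, notify):
    """Call notify(alert) when value lies outside [lower, upper].
    Returns the alert that was sent, or None for a normal reading."""
    if lower <= value <= upper:
        return None
    alert = Alert(sensor_id, value, lower, upper, datetime.now(timezone.utc))
    notify(alert)
    return alert

# A list stands in for the real notification channel (e-mail, dashboard, ...).
sent = []
check_reading("track-temp-01", 20.2, 19.6, 20.6, sent.append)  # normal reading
check_reading("track-temp-01", 25.0, 19.6, 20.6, sent.append)  # out of range
print(sent)
```

Passing the notification mechanism as a callback keeps the detection logic independent of how administrators are actually reached.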

This approach enables intelligent and adaptive monitoring, improving efficiency by reducing the number of false positives and detecting anomalies more accurately.