This paper presents a generalized multi-layer Intrusion Detection System (IDS) aimed at improving the security of modern smart buildings, where a mix of IoT devices, sensors, and network services creates a wide attack surface. To handle the heterogeneity of these environments, the authors design an IDS that operates at both the network layer and the sensor layer through a unified preprocessing and machine-learning pipeline. The system first converts raw data (network packet captures and IoT sensor logs) into a structured format suitable for analysis. Network traces are processed with Zeek, then merged and cleaned, while sensor logs need only lightweight formatting. After preprocessing, all data passes through an aggregation step that computes summary statistics over rolling time windows. This reduces very large raw datasets to a compact feature set while preserving the temporal patterns that indicate potential attacks or anomalies.
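To make the aggregation step concrete, the following is a minimal sketch of rolling-window summary statistics over a timestamped stream. It is not the authors' implementation: the function name `rolling_features`, the window length `window_s`, and the particular statistics (count, mean, min, max) are assumptions chosen for illustration, since the summary does not state which statistics or window sizes the paper uses.

```python
from collections import deque
from statistics import mean


def rolling_features(records, window_s=2.0):
    """Summarise a stream of (timestamp_seconds, value) pairs.

    For every record, statistics are computed over the trailing window
    (t - window_s, t]. `records` must be sorted by timestamp.
    The window length and chosen statistics are illustrative assumptions.
    """
    window = deque()
    rows = []
    for ts, value in records:
        window.append((ts, value))
        # Drop samples that have fallen out of the trailing window.
        while window[0][0] <= ts - window_s:
            window.popleft()
        values = [v for _, v in window]
        rows.append({
            "t": ts,
            "count": len(values),
            "mean": mean(values),
            "min": min(values),
            "max": max(values),
        })
    return rows


# Hypothetical example: bytes transferred, sampled once per second.
rows = rolling_features([(0, 10), (1, 20), (2, 30), (3, 40)], window_s=2.0)
```

This sketch emits one row per input record. Keeping only one row per window stride, rather than one per record, is one way such a step could yield the kind of size reduction described above.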
A key innovation is the transformation of the aggregated feature vectors into RGB image representations, which allows the IDS to use Convolutional Neural Networks (CNNs) for classification. This lets a single ML pipeline process diverse device types without redesign, since any new sensor or network feature can be encoded automatically into the image format. The resulting dataset is also much smaller than the original raw data. The IDS includes two CNN-based models: one trained to detect cyberattacks in network traffic, and another tailored to identify anomalous patterns in sensor telemetry. Experiments on the ToN-IoT dataset show strong performance at the network layer, with very high accuracy and low false-negative rates. Sensor-layer performance varies by device type: some sensors achieve excellent results, while others show higher false-positive rates, which suggests room for refinement, especially given that the dataset includes simulated sensor data. Overall, the paper demonstrates a practical and scalable framework for securing smart buildings. Its ability to unify heterogeneous data sources, adapt to new devices, and achieve strong detection performance makes it a promising direction for future cyber-physical security systems.
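The idea of encoding feature vectors as RGB images can be illustrated with a short sketch. The summary does not describe the paper's actual encoding, so the scheme here is an assumption: each feature is min-max scaled, quantised to 24 bits, and split into one pixel's R, G, and B bytes, with the pixels laid out on a square grid padded with black. The function name `vector_to_rgb_image` and the `lows`/`highs` range parameters are hypothetical.

```python
import math


def vector_to_rgb_image(features, lows, highs):
    """Encode a feature vector as a square grid of (R, G, B) pixels.

    Assumed scheme (not taken from the paper): one pixel per feature,
    where the scaled value is quantised to 24 bits and its three bytes
    become the R, G and B channels. Unused pixels are black.
    """
    pixels = []
    for x, lo, hi in zip(features, lows, highs):
        # Min-max scale to [0, 1]; constant features map to 0.
        scaled = 0.0 if hi == lo else (x - lo) / (hi - lo)
        # Clip values that fall outside the expected range.
        scaled = min(max(scaled, 0.0), 1.0)
        code = round(scaled * 0xFFFFFF)
        pixels.append(((code >> 16) & 0xFF, (code >> 8) & 0xFF, code & 0xFF))

    # Smallest square grid that holds every pixel.
    side = math.isqrt(len(pixels) - 1) + 1 if pixels else 0
    pixels += [(0, 0, 0)] * (side * side - len(pixels))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


# Hypothetical example: three features, each expected to lie in [0, 10].
image = vector_to_rgb_image([0.0, 5.0, 10.0], [0, 0, 0], [10, 10, 10])
```

Under this assumed layout, adding a feature from a new sensor only appends a pixel (and occasionally grows the grid), which is consistent with the claim that new devices can be accommodated without redesigning the pipeline. A CNN would then consume the grid as an ordinary image tensor.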