OAR@UM Collection: /library/oar/handle/123456789/123835
Thu, 06 Nov 2025 11:32:18 GMT

/library/oar/handle/123456789/140266
Title: Real-time multi-camera tracking and od-matrix estimation of vehicles
Abstract: With computer vision, it is possible to capture data that is of great use to urban planners and infrastructure engineers. Informed decisions can then be taken to evolve existing and new infrastructure in a more robust and greener way. Data can be captured with a single-camera tracker, which detects and tracks vehicles and pedestrians in the camera view. However, in more complex scenarios, such as a roundabout or an intersection, a single camera is not sufficient. For this study, a single-camera tracker, developed by Greenroads Ltd, is readily available [...]
Description: M.Sc. ICT(Melit.)
Mon, 01 Jan 2024 00:00:00 GMT

/library/oar/handle/123456789/140265
Title: Detecting anomalies from roadside video streams
Abstract: The interconnected nature of road networks implies that anomalies on narrow residential roads can ripple through the entire traffic system, particularly in high-traffic areas such as those common on the Maltese Islands. Detecting anomalies in such environments using roadside cameras is challenging due to the multitude of normal and anomalous events, changes in illumination, obstructions, complex anomalies, and difficult viewing angles. This thesis investigates anomaly detection methods tailored to the realistic road and data limitations typical of Maltese urban roads. Classical anomaly detection, which identifies anomalies from structured data, and deep learning-based techniques, which detect anomalies directly from video input, were evaluated. The literature review revealed limited evaluations on realistic datasets for both methods.
The classical method was developed to filter out ID-switch artifacts and identify specific anomalies using a combination of filtering, DBSCAN clustering, masking, and rule-based techniques. For the deep learning method, an autoencoder (AE) model with the STAE [1] architecture was chosen for its ability to capture temporal representations. Both methods were evaluated on video datasets collected in Malta and on a relabelled Street Scene [2] dataset. The classical method demonstrated high reliability in detecting anomalies in structured data, achieving an 82% true positive rate and a 3% false positive rate on a local dataset. However, the data acquisition method did not accurately record all anomalies, which reduced the true positive rate for actual video anomalies. The deep learning method showed strong performance across all datasets, achieving an 83% AUC and a 25% EER on a dataset recorded in the same location. Performance was slightly reduced for locations with heavy shadows, as shown on a second local dataset. Segmenting frames into tiles and augmenting datasets improved performance in shadow-affected conditions, as did masking irrelevant regions. An event-level comparison showed that both methods performed similarly in detecting non-typical vehicle paths. The classical method excelled at identifying non-typical object locations and was more robust against changes in scene dynamics, more modular, and easier to debug. The deep learning method was better at detecting non-typical slow-moving and non-typical vehicles and was more resilient to variations in the data acquisition method within the Intelligent Traffic System (ITS). However, neither method effectively detected unforeseen anomalies. Overall, this thesis provides valuable insights and guidance for choosing the most appropriate anomaly detection methods for different types of anomalies in complex urban road environments.
Description: M.Sc. ICT(Melit.)
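The DBSCAN step used by the classical method can be illustrated with a minimal sketch (not the thesis code): track endpoints that fall in low-density regions are labelled as noise and treated as candidate non-typical object locations. The function name and parameters here are hypothetical choices for illustration.

```python
from math import hypot

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN over 2D points: returns one label per point,
    with -1 marking noise (here, candidate anomalous locations)."""
    labels = [None] * len(points)
    cluster = -1

    def neighbours(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if hypot(points[i][0] - q[0], points[i][1] - q[1]) <= eps]

    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # sparse region: flag as noise
            continue
        cluster += 1                # i is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)   # j is also a core point: keep expanding
    return labels
```

With eps and min_pts tuned to the scene, dense clusters capture typical vehicle stop locations, while isolated endpoints (label -1) are passed on to the rule-based stage.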
Mon, 01 Jan 2024 00:00:00 GMT

/library/oar/handle/123456789/132780
Title: Understanding activity in private and public setups using 3D video content
Abstract: Human action recognition (HAR), which deals with classifying human actions in video, is a core component of behaviour monitoring and has found applications in surveillance, security, sports, traffic management, medical monitoring, and assisted living systems. The state of the art in human action recognition depends on machine learning techniques, which require large datasets for training. This results in huge amounts of time spent in training as well as large power consumption, which reduces the feasibility of adopting such methods in real-world applications. Most of the data present in video consists of background clutter that is irrelevant to human action classification. Thus, training time can be significantly reduced by limiting the data that is processed to only those regions that capture the action taking place. A viable strategy for identifying these regions with low effort is motion saliency detection, given that human action necessitates motion. The main contributions of this work include a novel solution for identifying regions that capture human actions and a new HAR method that uses this solution to achieve a classification accuracy comparable to the state of the art, with a significant reduction in the time spent in training and inference. In this thesis, various solutions to motion saliency detection were explored. A new motion saliency solution was developed because existing solutions were found to be too computationally intensive for any reduction in training time to be realised by their adoption in a HAR pipeline. The use of this motion saliency solution in HAR methods was then explored.
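The idea of motion saliency detection can be approximated, for illustration only, by simple frame differencing: pixels whose intensity changes sharply between consecutive frames are assumed to belong to the action region. This sketch is not the thesis's solution (whose details are not given here); the function name and threshold are hypothetical.

```python
def motion_saliency_bbox(prev, curr, thresh=20):
    """Return the bounding box (r0, c0, r1, c1) enclosing pixels whose
    absolute inter-frame difference exceeds `thresh`, or None if no
    pixel moved. Frames are grayscale images as lists of equal rows."""
    moving = [(r, c)
              for r, (pr, cr) in enumerate(zip(prev, curr))
              for c, (a, b) in enumerate(zip(pr, cr))
              if abs(a - b) > thresh]
    if not moving:
        return None
    rs = [r for r, _ in moving]
    cs = [c for _, c in moving]
    return (min(rs), min(cs), max(rs), max(cs))
```

A HAR pipeline would then crop each frame to this box before classification, so the classifier never processes the static background clutter.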
It was found that the highest classification accuracy is achieved with the Model-based Multimodal Network (MMNet) [1] method, a multimodal HAR method that fuses the classification results of the skeleton and colour modalities. A new HAR method, MMNet with motion saliency (MMNet-MS), was developed based on MMNet. Whilst MMNet relies on the OpenPose [2] tool, which estimates skeleton joint coordinates, to identify the regions relevant to action classification, the proposed MMNet-MS identifies these regions using motion saliency detection, replacing the computationally expensive OpenPose skeleton estimation step. Experimental results showed that the proposed MMNet-MS method achieves a classification accuracy, on average 75.91%, comparable to MMNet, which has an accuracy of 76.67% on average. In private settings such as the TST fall detection dataset [3], the accuracy of the proposed MMNet-MS, 55.69%, surpasses that of MMNet, 29.55%. A significant reduction in training and inference time is achieved: MMNet takes on average 28.63 hours to train and 27 milliseconds to classify an action in a video, while MMNet-MS takes on average 14.92 hours to train and 9 milliseconds to classify an action. This reduction in training time leads to reduced power consumption in most cases, particularly on the NTU-60 dataset, where MMNet consumed on average 3.43 kWh during training while MMNet-MS consumed an average of 1.80 kWh.
Description: Ph.D.(Melit.)
Mon, 01 Jan 2024 00:00:00 GMT
/library/oar/handle/123456789/132780

/library/oar/handle/123456789/123888
Title: Deep learning techniques for network intrusion detection
Abstract: With the increasing prevalence of security threats and the exponential growth of network traffic, the need for robust and efficient Intrusion Detection Systems (IDS) has reached a critical point in the field of cybersecurity.
Machine Learning (ML) and other anomaly detection techniques play a crucial role in identifying and isolating anomalous elements within network traffic data. The aim of this work was to leverage transformer networks, a Deep Learning (DL)-based model, to develop a dynamic and efficient IDS capable of identifying and categorising cyberattacks. To assess the effectiveness of this model, extensive experiments were carried out on the widely accessible Canadian Institute for Cybersecurity IDS 2017 (CICIDS2017) dataset, which encompasses a diverse array of threats such as Distributed Denial of Service (DDoS), brute force, Cross-Site Scripting (XSS), Structured Query Language (SQL) injection, and Botnet (BoT) activities. The proposed model underwent thorough evaluation and comparison against classical ML algorithms, including Random Forest (RF), Support Vector Machine (SVM), and a Vanilla Neural Network (VNN), as well as DL models such as the Recurrent Neural Network (RNN), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). The assessment, using fundamental performance metrics including accuracy, precision, recall, and F1-score, underscored the effectiveness of the transformer network in detecting network intrusion threats. Focusing specifically on benign (non-malicious) versus Denial of Service (DoS) GoldenEye binary classification within the CICIDS2017 dataset, the results yielded impressive metrics: a precision of 99%, recall of 96%, F1-score of 97%, and an accuracy of 98%, achieved within a processing time of 513 seconds.
Description: M.Sc.(Melit.)
Mon, 01 Jan 2024 00:00:00 GMT
/library/oar/handle/123456789/123888
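The precision, recall, F1-score, and accuracy figures reported across these evaluations follow the standard definitions over a binary confusion matrix, which can be sketched as follows. The example counts below are illustrative and are not taken from any of the theses.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion counts:
    tp/fp = true/false positives, fn/tn = false/true negatives."""
    precision = tp / (tp + fp)                    # of flagged, how many are real
    recall = tp / (tp + fn)                       # of real, how many were caught
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    accuracy = (tp + tn) / (tp + fp + fn + tn)    # overall fraction correct
    return precision, recall, f1, accuracy
```

Because intrusion datasets such as CICIDS2017 are heavily imbalanced towards benign traffic, precision, recall, and F1 are more informative than accuracy alone, which is why all four are reported together.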