Expo Stream 5.1 - 5.7 - L-Università ta' Malta

Expo Stream 5.1 - 5.7

AI and Machine Learning Foundations

16:05 - 17:25 | Aula Magna (Level 1)

QARSA: A Hybrid Reinforcement Learning Algorithm Combining Q-Learning and SARSA for Dynamic Control of Nonlinear Systems - Dr Hani Ahmed

Dr Hani Ahmed
Department of Systems and Control Engineering, Faculty of Engineering

QARSA is a novel hybrid reinforcement learning algorithm that integrates off-policy and on-policy learning principles by combining Q-learning and SARSA for the dynamic control of nonlinear systems. The proposed approach is designed to exploit the high sample efficiency and fast convergence characteristics of off-policy learning while maintaining the stability and reduced variance typically associated with on-policy methods. By introducing a tunable blending factor, QARSA enables a controlled balance between optimistic and conservative value updates, making it well-suited for environments that exhibit both deterministic and stochastic dynamics.
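The abstract does not state the QARSA update rule, so the following is only an illustrative sketch of one plausible reading: a tabular update whose bootstrap value is a convex combination of the Q-learning target (greedy value of the next state) and the SARSA target (value of the action actually taken next). The function name `qarsa_update`, the parameter name `beta` for the blending factor, and the tabular representation are all assumptions; a continuous task such as CartPole would additionally need state discretisation or function approximation.

```python
def qarsa_update(Q, s, a, r, s_next, a_next, done,
                 alpha=0.1, gamma=0.99, beta=0.5):
    """Apply one blended temporal-difference update to Q in place.

    Q is a table indexed as Q[state][action] (a list of lists here).
    beta is a hypothetical blending factor: beta=1.0 reduces to the
    Q-learning update and beta=0.0 reduces to the SARSA update.
    Returns the TD error that was applied.
    """
    if done:
        # Terminal transition: there is no future value to bootstrap from.
        bootstrap = 0.0
    else:
        greedy_value = max(Q[s_next])           # off-policy (Q-learning) term
        on_policy_value = Q[s_next][a_next]     # on-policy (SARSA) term
        bootstrap = beta * greedy_value + (1.0 - beta) * on_policy_value

    td_error = r + gamma * bootstrap - Q[s][a]
    Q[s][a] += alpha * td_error
    return td_error


# Usage: two states, two actions; the next action taken is not the greedy one.
Q = [[0.0, 0.0], [1.0, 3.0]]
qarsa_update(Q, s=0, a=0, r=1.0, s_next=1, a_next=0, done=False,
             alpha=0.5, gamma=1.0, beta=0.5)
```

Under this reading, values of `beta` closer to 1 give the more optimistic, greedy target and values closer to 0 give the more conservative, on-policy target, which matches the balance the abstract describes.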

The performance of QARSA is evaluated using the CartPole-v1 benchmark environment within the OpenAI Gym framework and compared against standard Q-learning and SARSA implementations. All algorithms are implemented under identical experimental conditions to ensure a fair and consistent comparison. Performance is assessed using three key metrics: average reward, learning stability, and sample efficiency. Experimental results demonstrate that QARSA achieves higher average rewards while exhibiting improved learning stability and superior sample efficiency relative to its constituent algorithms. In particular, QARSA consistently learns effective control policies using fewer interactions with the environment, indicating enhanced robustness and reduced sensitivity to variance during training.
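The abstract names the three metrics but does not define them. As a rough sketch under assumed definitions, average reward could be the mean episode return, learning stability the standard deviation of returns over the final episodes, and sample efficiency the number of episodes needed before a moving average of returns first reaches a target. The function `summarise_run`, its parameters, and the example returns are hypothetical and are not the study's data or method.

```python
from statistics import mean, pstdev


def summarise_run(episode_rewards, threshold, window=3):
    """Summarise one training run from its list of per-episode returns.

    Assumed metric definitions (not taken from the abstract):
      average_reward        - mean return over all episodes
      stability_std         - population std. dev. of the last `window`
                              returns (lower means more stable)
      episodes_to_threshold - number of episodes completed when the
                              moving average over `window` episodes first
                              reaches `threshold`, or None if it never does
    """
    average_reward = mean(episode_rewards)
    stability_std = pstdev(episode_rewards[-window:])

    episodes_to_threshold = None
    for end in range(window, len(episode_rewards) + 1):
        if mean(episode_rewards[end - window:end]) >= threshold:
            episodes_to_threshold = end
            break

    return {
        "average_reward": average_reward,
        "stability_std": stability_std,
        "episodes_to_threshold": episodes_to_threshold,
    }


# Usage with made-up returns for a single run.
summary = summarise_run([10, 20, 30, 40, 50, 50], threshold=40, window=2)
```

Applying the same summary to each algorithm's runs, with identical thresholds and window sizes, would be one way to keep a comparison consistent across methods.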

These findings suggest that hybrid reinforcement learning strategies such as QARSA can offer significant advantages in control tasks where learning stability, efficient data utilisation, and long-term performance are critical. The results provide valuable insights into the design of combined on-policy and off-policy reinforcement learning algorithms and highlight the potential applicability of QARSA to more complex and high-dimensional dynamic control problems for nonlinear systems.

ASTRA - A Reinforcement Learning Approach for Resolving Complex Air Traffic Events - Ms Cynthia Koopman

TADA: An AI-Driven Decision Support Tool for Arrival Management in Complex Airspace - Mr Samuel Baldacchino

Machine Learning–Based Landing Runway Prediction: Evidence from Zurich Airport - Dr Timothy Kayode Samson

Flight Data Anomaly Detection Using Machine Learning Techniques - Dr Asma Fejjari

Intrusion Detection for Smart Home IoT Using Feature Selection - Mr Musawar Ahmad

An Intelligence and Simulation-Based Stress Test Development for Resilience Assessment in Manufacturing Shop Floors - Dr Tanel Aruväli

Dr Tanel Aruväli
Department of Industrial and Manufacturing Engineering, Faculty of Engineering

Manufacturing companies are increasingly exposed to volatility stemming from multiple severe disruptions in the external environment, which poses significant challenges to shop floor operational stability, strategic planning, and investment management. Current resilience assessment approaches are largely fragmented, generic, and retrospective, and provide limited capability to anticipate future resilience behaviour. To overcome these shortcomings and improve resilience assessment under volatile industrial conditions, the Resistte project introduces a novel approach to assessing resilience on shop floors. The project objective is to develop and validate a scalable simulation-based stress-testing tool that assesses shop floor resilience under plausible disruption scenarios.

The study is inspired by well-established EU-wide stress-testing methodologies used in the financial sector. Such a stress test aims to evaluate the disruption-related risks a shop floor faces and to simulate resilience dynamics over the coming periods, using company-specific data and disruption foresight. A predictive dimension is introduced by contextualising a Large Language Model to interpret processed data from a news database and retrieve potential company-specific external disruptions, with calibration. By applying sensitivity analysis and converting the external disruptions retrieved for the next period into specific effects on shop floor operations, resilience can be assessed across several categories. The company-specific disruption effects will be combined with the manufacturing system logic and adaptive recovery actions to simulate the resilience dynamics.

Such a stress-testing tool enables manufacturing managers and stakeholders to make more informed decisions about long-term investment strategies and strategic planning. In parallel, investors and financial institutions gain deeper insight into prospective resilience conditions ahead of investment decisions.

University of Malta Research Expo 2026
