Safety and Reliability of DRL Agents Through Testing and Safety Monitoring
| dc.contributor.author | Zolfagharian, Amirhossein | |
| dc.contributor.supervisor | Briand, Lionel C. | |
| dc.date.accessioned | 2024-11-29T22:26:41Z | |
| dc.date.available | 2024-11-29T22:26:41Z | |
| dc.date.issued | 2024-11-29 | |
| dc.description.abstract | Deep Reinforcement Learning (DRL) agents have shown significant promise across various domains, including autonomous driving, healthcare, and robotics. However, their deployment in safety-critical applications raises substantial concerns about their safe and reliable behavior. The complexity and unpredictability of DRL environments, combined with the agents' objective of maximizing long-term rewards, can lead to unintended safety violations. This thesis proposes two complementary methods for improving the safety and reliability of DRL agents: (1) a pre-deployment testing approach called STARLA (Search-based Testing Approach for Reinforcement Learning Agents) and (2) a runtime safety monitoring approach called SMARLA (Safety Monitoring Approach for Reinforcement Learning Agents). The first component, STARLA, addresses the challenge of systematically testing DRL agents by employing a search-based strategy to reveal functional faults, i.e., situations where the agent may reach an unsafe state. STARLA combines state abstraction, machine learning models, and evolutionary algorithms to efficiently generate test episodes that expose functional faults within a limited simulation budget. The second component, SMARLA, focuses on runtime safety by predicting potential safety violations during execution. SMARLA is a black-box approach that is agnostic to the DRL agent's inputs. By continuously observing the agent's behavior through the analysis of Q-values and leveraging state abstraction, SMARLA predicts safety violations before they occur. Together, STARLA and SMARLA form a comprehensive framework for improving both pre-deployment quality assurance and runtime safety of DRL agents. Both approaches have been extensively evaluated on complex case studies through large-scale experiments, and empirical results demonstrate their effectiveness in identifying and mitigating safety risks. | |
| dc.identifier.uri | http://hdl.handle.net/10393/49919 | |
| dc.identifier.uri | https://doi.org/10.20381/ruor-30734 | |
| dc.language.iso | en | |
| dc.publisher | Université d'Ottawa / University of Ottawa | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | en |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.subject | Genetic Algorithm | |
| dc.subject | Machine Learning | |
| dc.subject | Reinforcement Learning | |
| dc.subject | State Abstraction | |
| dc.subject | Testing | |
| dc.subject | Safety Monitoring | |
| dc.title | Safety and Reliability of DRL Agents Through Testing and Safety Monitoring | |
| dc.type | Thesis | en |
| thesis.degree.discipline | Génie / Engineering | |
| thesis.degree.level | Doctoral | |
| thesis.degree.name | PhD | |
| uottawa.department | Science informatique et génie électrique / Electrical Engineering and Computer Science |
