Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/184131
Title: PASOT-HRL: probably approximate safety options verification with temporal properties of hierarchical reinforcement learning policy
Authors: Toh, Jing Qiang
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Toh, J. Q. (2025). PASOT-HRL: probably approximate safety options verification with temporal properties of hierarchical reinforcement learning policy. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/184131
Project: CCDS24-0498
Abstract: As Artificial Intelligence (AI) agents are increasingly deployed in real-world applications, ensuring their safety becomes more important. One approach is safety verification, which evaluates the safety probability of an AI agent. This paper presents Probably Approximate Safety Options Verification with Temporal Properties Of Hierarchical Reinforcement Learning Policy (PASOT-HRL), a novel extension to Probably Approximate Safety (PAS) verification. It introduces three key enhancements: (1) extending PAS verification to Hierarchical Reinforcement Learning sub-policies, (2) modifying PAS verification to support larger state spaces for scalability, and (3) incorporating temporal elements to evaluate safety in dynamic environments for time-bound tasks. The proposed solution is validated in the MiniGrid Dynamic Obstacles Environment, where PASOT-HRL provides probabilistic safety guarantees for larger environmental state spaces and time-bound tasks at both the high-level and low-level policy levels, and tracks the safety probabilities of low-level sub-policies over time. PASOT-HRL can also identify and visualize the safety boundaries at each policy level. The proposed solution aims to advance safety verification for Hierarchical Reinforcement Learning policies and to make safety verification more scalable for real-world, time-bound environments.
URI: https://hdl.handle.net/10356/184131
Schools: College of Computing and Data Science 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: CCDS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: FYP_Final_Report - Toh Jing Qiang.pdf
Description: Restricted Access
Size: 2.71 MB
Format: Adobe PDF



Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.