Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/150858
Title: Reinforcement learning and dynamic motion primitives
Authors: Mudgal, Saurabh
Keywords: Engineering::Mechanical engineering
Issue Date: 2021
Publisher: Nanyang Technological University
Source: Mudgal, S. (2021). Reinforcement learning and dynamic motion primitives. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/150858
Abstract: Multi-agent Reinforcement Learning algorithms closely approximate real-world scenarios, in which agents in an unpredictable environment engage in a complex interplay of competition and collaboration. MultiAgent POsthumous Credit Assignment (MA-POCA) is a novel algorithm by Unity that has the potential to adapt the theories of multi-agent Reinforcement Learning to industrial applications. In this thesis, we study the underlying concepts and literature of Reinforcement Learning that lead to such a sophisticated algorithm. Following that, we run evaluative experiments implementing the MA-POCA algorithm in simulated multi-agent environments. We discover that MA-POCA uses a fixed ratio parameter to balance collaborative and competitive self-play. This introduces problems similar to those seen in Trust Region Policy Optimization (TRPO), which can be fixed using concepts from Proximal Policy Optimization (PPO). Further work is suggested to benchmark performance improvements from such modifications.
URI: https://hdl.handle.net/10356/150858
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: MAE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: FYP_U1722228C_SaurabhMudgal.pdf (Restricted Access)
Description: A brief review of Reinforcement Learning algorithms and applications
Size: 1.4 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.