Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/157056
Title: Profit-maximizing sequential task allocation to a team of selfish agents with deep reinforcement learning
Authors: Zhang, Shizhuo
Keywords: Science::Mathematics
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Zhang, S. (2022). Profit-maximizing sequential task allocation to a team of selfish agents with deep reinforcement learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/157056
Abstract: We study the problem of sequential task allocation among selfish agents through the lens of the dynamic mechanism design framework. In this game, the manager has to maximize its own utility in the face of a random team of selfish agents. The problem assumes a discrete-time setting in which each time step comprises two sub-procedures: 1) contracting, where the manager offers payments to ask each agent to pursue certain goals and agents decide whether they are satisfied; and 2) acting. The complication of this set-up lies in that reporting is involved as in traditional mechanism design settings, and truthful revelation of hidden information is impossible. Meanwhile, the agents act in a high-dimensional space, adding to the difficulty of making proper assumptions and devising optimization algorithms. To this end, we leverage the power of deep reinforcement learning. It is necessary to model the agents’ hidden information for the manager to make correct decisions, but this makes the learning problem non-Markovian, which complicates the application of reinforcement learning algorithms. We propose a framework to tackle historical dependency that leverages the strong representation learning capability of deep learning methods and gradient-based multi-task updates, allowing the RL-based manager to act in a Markov latent space. We also propose the use of a successor-representation-based intrinsic reward to encourage strategic exploration. We perform empirical studies in various game settings to demonstrate the power of our proposed framework.
URI: https://hdl.handle.net/10356/157056
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: SPMS Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Interim_Report_FYP (2).pdf
Description: Restricted Access
Size: 1.24 MB
Format: Adobe PDF

Page view(s): 25 (updated on May 18, 2022)
Download(s): 2 (updated on May 18, 2022)

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.