Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/164058
|Title:||Deep reinforcement learning for intractable routing & inverse problems||Authors:||Zhang, Rongkai||Keywords:||Engineering::Computer science and engineering::Computing methodologies::Artificial intelligence
Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
|Issue Date:||2023||Publisher:||Nanyang Technological University||Source:||Zhang, R. (2023). Deep reinforcement learning for intractable routing & inverse problems. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/164058||Abstract:||Solving intractable problems with huge or infinite solution spaces is challenging and has motivated much research. Classical methods mainly focus on fast search via either approximation or (meta)heuristics, with the help of regularizers. However, neither their solution quality nor their inference time is satisfactory. Recently, a popular trend has been to use deep learning to learn to solve intractable problems, and impressive progress has been achieved in both solution quality and inference speed. Among the learning-based methods, those based on deep reinforcement learning (DRL) show superiority, since they learn a more flexible policy with less supervision. Many exciting achievements can be found in board games, video games, and robotics. However, most current methods are proposed for specific tasks and neglect practical settings. To push DRL one step closer to real-life applications, we propose a paradigm that can learn to solve a wider range of intractable problems, and we attempt to provide guidance and insight on how to systematically learn to solve more practical intractable problems via DRL. Following the proposed paradigm, we propose four frameworks for four practical intractable problems, namely the travelling salesman problem with time window and rejection (TSPTWR), multiple TSPTWR (mTSPTWR), robust image denoising, and customized low-light image enhancement. In particular, unlike its counterparts, where the deep neural network (DNN) is the main concern, our paradigm also studies the modelling of the Markov decision process (MDP) and the design of the action and reward.
By doing so, we are able to flexibly circumvent the complex design of DNNs and apply existing DRL-based methods to more practical problems. Extensive experiments show that our proposed frameworks outperform both classical and learning-based baselines for these applications. The success of these four applications demonstrates that our proposed paradigm is a general and promising approach to solving intractable problems efficiently. Finally, we conclude this thesis and point out some interesting directions for future work.||URI:||https://hdl.handle.net/10356/164058||Rights:||This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).||Fulltext Permission:||open||Fulltext Availability:||With Fulltext|
|Appears in Collections:||EEE Theses|
Updated on Jan 27, 2023
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.