Please use this identifier to cite or link to this item:
https://hdl.handle.net/10356/101896
Title: | Cooperative reinforcement learning in topology-based multi-agent systems |
Authors: | Xiao, Dan; Tan, Ah-Hwee |
Keywords: | DRNTU::Engineering::Computer science and engineering |
Issue Date: | 2011 |
Source: | Xiao, D., & Tan, A.-H. (2013). Cooperative reinforcement learning in topology-based multi-agent systems. Autonomous Agents and Multi-Agent Systems, 26(1), 86-119. |
Series/Report no.: | Autonomous agents and multi-agent systems |
Abstract: | Topology-based multi-agent systems (TMAS), wherein agents interact with one another according to their spatial relationship in a network, are well suited for problems with topological constraints. In a TMAS system, however, each agent may have a different state space, which can be rather large. Consequently, traditional approaches to multi-agent cooperative learning may not be able to scale up with the complexity of the network topology. In this paper, we propose a cooperative learning strategy, under which autonomous agents are assembled in a binary tree formation (BTF). By constraining the interaction between agents, we effectively unify the state space of individual agents and enable policy sharing across agents. Our complexity analysis indicates that multi-agent systems with the BTF have a much smaller state space and a higher level of flexibility, compared with the general form of n-ary (n > 2) tree formation. We have applied the proposed cooperative learning strategy to a class of reinforcement learning agents known as temporal difference-fusion architecture for learning and cognition (TD-FALCON). Comparative experiments based on a generic network routing problem, which is a typical TMAS domain, show that the TD-FALCON BTF teams outperform alternative methods, including TD-FALCON teams in single agent and n-ary tree formation, a Q-learning method based on the table lookup mechanism, as well as a classical linear programming algorithm. Our study further shows that TD-FALCON BTF can adapt and function well under various scales of network complexity and traffic volume in TMAS domains. |
URI: | https://hdl.handle.net/10356/101896; http://hdl.handle.net/10220/19819 |
ISSN: | 1387-2532 |
DOI: | 10.1007/s10458-011-9183-4 |
Schools: | School of Computer Engineering |
Rights: | © 2011 The Author(s). |
Fulltext Permission: | none |
Fulltext Availability: | No Fulltext |
Appears in Collections: | SCSE Journal Articles |
Scopus™ citations: | 11 (updated on Mar 14, 2025) |
Web of Science™ citations: | 7 (updated on Oct 30, 2023) |
Page view(s): | 644 (updated on Mar 19, 2025) |
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.