Please use this identifier to cite or link to this item:
|Title:||Adaptive goal selection for agents in dynamic environments||Authors:||Zhang, Huiliang
Miao, Chun Yan
|Keywords:||DRNTU::Engineering::Computer science and engineering||Issue Date:||2013||Source:||Zhang, H., Luo, X., Miao, C., Shen, Z., & You, J. (2013). Adaptive goal selection for agents in dynamic environments. Knowledge and information systems, in press.||Series/Report no.:||Knowledge and information systems||Abstract:||In psychology, goal-setting theory, which has been studied by psychologists for over 35 years, reveals that goals play significant roles in incentive, action and performance for human beings. Based on this theory, a goal net model has been proposed to design intelligent agents that can be viewed as a soft copy of human being somehow. The goal net model has been successfully applied in many agents, specially, non-player-character agents in computer games. Such an agent selects the optimal solution in all possible solutions found by using a recursive algorithm. However, if a goal net is very complex, the time of selection could be too long for the agent to respond quickly when the agent needs to re-select a new solution against the world’s change. Moreover, in some dynamic environments, it is impossible to know the exact outcome of choosing a solution in advance, and so the possible solutions cannot be evaluated precisely. Thus, to address the problem, this paper applies learning algorithm into goal selection in dynamic environments. More specifically, we first develop a reorganization algorithm that can convert a goal net to its equivalent counterpart that a Q-learning algorithm can operate on; then, we define the key component of Q-learning, reward function, according to the feature of goal nets; and finally lots of experiments are conducted to show that, in dynamic environments, the agent with the learning algorithm significantly outperforms the one with the recursive searching algorithm. Therefore, our work suggests an agent model that can effectively be applied in dynamic time-sensitive domain, like computer games and the P2P systems of online movie watching.||URI:||https://hdl.handle.net/10356/96478
|DOI:||10.1007/s10115-013-0645-7||Fulltext Permission:||none||Fulltext Availability:||No Fulltext|
|Appears in Collections:||SCSE Journal Articles|
Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.