Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/176294
Title: Curriculum learning improves compositionality of reinforcement learning agent across concept classes
Authors: Lin, Zijun
Keywords: Computer and Information Science
Issue Date: 2024
Publisher: Nanyang Technological University
Source: Lin, Z. (2024). Curriculum learning improves compositionality of reinforcement learning agent across concept classes. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/176294
Abstract: The compositional structure afforded by language allows humans to decompose complex phrases and map them to novel visual concepts, demonstrating flexible intelligence. Although several algorithms can demonstrate compositionality, they do not give us insight into how humans learn to compose concept classes to ground visual cues. To study this multi-modal learning problem, we created a 3-dimensional environment in which a reinforcement learning agent has to navigate to a location specified by a natural language phrase (instruction). The instruction is composed of nouns, attributes and, additionally, determiners or prepositions. This visual grounding task increases the compositional complexity for reinforcement learning agents: for example, navigating to the blue cubes above some red spheres is not rewarded when the instruction is to navigate to “some blue cubes below the red sphere”. First, we demonstrate that reinforcement learning agents can ground determiner concepts to visual scenes but struggle to ground the more complex preposition concepts. Second, we show that curriculum learning, a strategy employed by humans, improves concept learning efficiency, reducing the total number of training episodes needed to reach a given performance criterion by 15% in the determiner environment. Moreover, it enables the agents to learn the preposition concepts. Lastly, we establish that agents trained on determiner or preposition concepts can decompose held-out test instructions and rapidly map their navigation policies to unseen visual object combinations. We also compare various text encoders to see whether they facilitate the agents’ training. To conclude, our results show that multi-modal reinforcement learning agents can achieve compositional understanding of complex concept classes, and demonstrate the effectiveness of human-like learning strategies in improving the learning efficiency of artificial systems.
URI: https://hdl.handle.net/10356/176294
Schools: School of Electrical and Electronic Engineering 
Fulltext Permission: restricted
Fulltext Availability: With Fulltext
Appears in Collections: EEE Student Reports (FYP/IA/PA/PI)

Files in This Item:
File: Lin Zijun -- Final Report.pdf
Access: Restricted Access
Size: 9.25 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.