Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/182796
Title: Empowering natural language processing in low-resource regimes
Authors: Feng, Zijian
Keywords: Computer and Information Science
Issue Date: 2025
Publisher: Nanyang Technological University
Source: Feng, Z. (2025). Empowering natural language processing in low-resource regimes. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182796
Abstract: As a vital subfield of artificial intelligence (AI), natural language processing (NLP) strives to enable computers to understand, process, and analyze text as humans do. NLP has a broad range of applications, including text classification, information extraction, and chatbot dialogue systems. However, most NLP models demand substantial amounts of training data, which is often labor-intensive and expensive to collect and annotate. This shortage of sufficient data, known as the low-resource NLP challenge, is considered one of the most critical obstacles in NLP.

To tackle this challenge, existing solutions either expand the training data or reduce data requirements by improving training efficiency. Techniques such as data augmentation, distant supervision, and semi-supervised learning generate synthetic data to enlarge the training set, enabling NLP models to generalize better across various NLP tasks. Sequential transfer learning methods, including fine-tuning and prompt-based learning, substantially improve the learned text representations with limited data, thereby reducing the overall data demand. These approaches employ lightweight strategies to add adaptation modules or adjust the parameters of pre-trained language models (PLMs), effectively tailoring powerful PLMs to specific downstream NLP tasks.

Despite their success, most of these solutions overlook interpretability, missing the opportunity to design more targeted approaches for NLP models. Interpretability methods fall into two categories. The first is local explanations, which elucidate a model's decision-making process for individual inputs; by understanding how different input words influence model outputs, we can design more efficient strategies for utilizing and processing limited training data. The second is global explanations, which analyze how a model's overall structure, weights, and parameters affect its predictions. Leveraging these explanations enables us to devise better strategies for adapting PLMs to NLP tasks, thereby reducing data requirements even further. However, effectively integrating interpretability into the development of low-resource NLP solutions remains an open research question.

This thesis integrates interpretability into the design of more targeted methods for addressing the low-resource challenge. By leveraging both local and global explanations, we develop efficient strategies for data augmentation, fine-tuning, and prompt-based learning across various natural language understanding (NLU) and natural language generation (NLG) tasks. Specifically:

- Tailored Text Augmentation Using Local Explanations. We employ local explanations, such as a word's importance and discriminative power for the prediction, to tailor data augmentation operations to different word types. Discriminative words are used to introduce more task-oriented knowledge into synthetic data, while irrelevant words are kept evenly distributed in synthetic data for low-resource sentiment analysis.

- Boosting Fine-Tuning with Knowledge-Driven Interpretability. We utilize external knowledge and local explanations of word importance to create a text-saliency graph for each input text. These graphs are encoded and combined with the original text representations to enhance the discriminative power of text representation learning, significantly improving model performance in low-resource hierarchical text classification (HTC).

- Enhanced Input Saliency for Prompt Manipulation. We develop a simple yet efficient local explanation method that leverages token distribution dynamics (TDD) to elucidate a prompt's influence on large language model (LLM) outputs. Using these reliable local explanations derived from input saliency, we manipulate prompts by identifying and modifying keywords for zero-shot controllable text generation (CTG).

- Learning-Free Text Generation with Global Interpretability. We investigate global explanations, focusing on how the weights of feed-forward network (FFN) modules in LLMs affect their outputs. We then establish control centers from these FFNs and adaptively update their weights for CTG. Remarkably, the proposed method requires no training data or learning process, significantly mitigating the low-resource challenge.
URI: https://hdl.handle.net/10356/182796
DOI: 10.32657/10356/182796
Schools: Interdisciplinary Graduate School (IGS) 
Organisations: Future Resilient Systems (FRS), Singapore-ETH Center 
Research Centres: Institute of Catastrophe Risk Management (ICRM) 
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections:IGS Theses

Files in This Item:
File: thesis_final0226.pdf
Size: 2.05 MB
Format: Adobe PDF

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.