Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/176227
Title: Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning
Authors: Rakaraddi, Appan
Keywords: Computer and Information Science
Issue Date: 2024
Publisher: Nanyang Technological University
Source: Rakaraddi, A. (2024). Advancement in graph data mining: applications in unsupervised, continual, and few-shot learning. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/176227
Abstract: Graph mining has proven to be extremely useful in analysing features and properties of real-world graphs. This enables a number of tasks including the prediction and evaluation of how information varies with changes in the link structure, generating and building models to extract properties such as link prediction, node classification and recommendation, cluster and community detection, etc. Deep learning techniques have become the prevalent approach for graph mining as they are able to take advantage of the increasing availability of graph data. Traditional deep learning models pre-process the graph data by mapping the nodes to a real vector, which can desecrate important node relationships in the graphs. Training on such pre-processed structures with the traditional deep learning models may present extremely unstable and inaccurate results. To overcome these issues, Graph Neural Network (GNN) was introduced to handle the non-Euclidean structure of the graphs for many of the classifier and regressor tasks. However, state-of-the-art GNNs require a large amount of training data, which incur an enormous burden of labelled data annotations for supervised learning. Also, GNNs vastly suffer from degradation with the increasing number of depth layers leading to oversmoothing during neighbourhood feature aggregation. The aim of this research is to overcome the above-mentioned challenges faced by GNNs for graph mining. We propose a method for graph data mining as a regression problem for the estimation of Eigenvector Centrality in graphs with a GNN based approach in a completely unsupervised learning environment. To achieve this, we define an Encoder-Decoder based model architecture called CUL. We show that even when trained on a small number of datasets, the model performed at least on par with the supervised methods in terms of accuracy with different types of embedding schemes like Graph Convolutional Network (GCN), GraphSAGE, and Graph Attention Network (GAT). This is reinforced by the model performance based on top-$\mathcal{N}\%$ accuracy metric, on real-world and synthetically generated datasets. CUL demonstrates at least on par or even a superior performance over existing methods. Next, we focus on the applications of GNNs in the domain of continual learning. We propose a model called GCL that learns across a sequence of tasks for node-classification in graphs under task-incremental and class-incremental settings. GCL comprises of a Reinforced Controller/RLC and an expandable GNN framework called Child Network/CN, along with a memory buffer to store the node samples. We compared our method against state-of-the-art methods in regards to average accuracy and average forgetting on four datasets with different GNNs. Our model performed better than the other methods i.e., it showed higher accuracy values as well as lower forgetting rates across the tasks in both task-incremental and class-incremental settings. Finally, we propose a method for improving few-shot node classification on graphs on any generic GNN backbone framework. We propose an uncertainty-based estimator which is modeled using a GNN that maps the scalar discrete probabilities of a GNN-classifier outputs to a continuous probability distribution. We demonstrate that the few-shot node classification accuracy improves by testing on different graph datasets. We also focus on bridging gaps between different few-shot learning methods for node classification for graphs. The performance variance between the methods is analysed and the pros/cons of each of the architectures are highlighted. This analysis aids in understanding the performance differentiator and development of better architectures for few-shot learning on graph node classification.
URI: https://hdl.handle.net/10356/176227
DOI: 10.32657/10356/176227
Schools: School of Computer Science and Engineering 
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: embargo_20241114
Fulltext Availability: With Fulltext
Appears in Collections:SCSE Theses

Files in This Item:
File Description SizeFormat 
mythesis2.pdf
  Until 2024-11-14
4.92 MBAdobe PDFUnder embargo until Nov 14, 2024

Page view(s)

70
Updated on Jul 18, 2024

Google ScholarTM

Check

Altmetric


Plumx

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.