Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/163960
Title: Efficient optimization of network metrics in large uncertain graphs
Authors: Saha, Arkaprava
Keywords: Engineering::Computer science and engineering
Issue Date: 2022
Publisher: Nanyang Technological University
Source: Saha, A. (2022). Efficient optimization of network metrics in large uncertain graphs. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/163960
Abstract: Graphs constitute an omnipresent data structure that can model objects and their relationships in a wide variety of real-world applications like social networks and geo-spatial road networks. Most of these graphs nowadays, in addition to being huge, are intrinsically uncertain for a variety of reasons. As an example, the well-known social network Twitter alone has at least 187 million users and 132 billion connections as of now, and these numbers keep on increasing. Also, the Twitter network is uncertain in the sense that all connections do not signify complete influence or trust between the concerned users all the time. Thus, extracting relevant information from or performing meaningful operations on these graphs is highly challenging. A network metric is a measure that quantifies a certain property of a given component in a graph. Examples include the length of a given path, the density of a given subgraph and the influence spread of a given node set. The optimization of network metrics, i.e., finding the component maximizing or minimizing a certain metric, finds use in a plethora of real-world applications, e.g., road network routing, finding social network communities and viral marketing. Most of the exact techniques for such tasks turn out to be prohibitively time-consuming and memory-intensive for the huge graphs that are usually encountered. Thus, there is a need for efficient approximation algorithms. Although this may result in some loss in accuracy, the goal is to design techniques which achieve reasonably accurate results and scale well to very large graphs. This thesis focuses on the efficient optimization of network metrics in large uncertain graphs. Specifically, in this thesis, the following research problems have been studied. The first problem aims to find, between a given pair of nodes in an uncertain graph, the k paths having the highest probabilities of being a shortest path. The proposed solution improves the existing filtering-and-verification framework by coupling Dijkstra’s algorithm with Monte Carlo sampling to reduce the running time by some orders of magnitude, with end-to-end accuracy guarantees. Such shortest paths are then used to define a novel concept of betweenness centrality in uncertain graphs. The next problem involves finding the k node sets in an uncertain graph which have the largest probabilities of inducing a densest subgraph according to a given notion of density (edge, clique or any arbitrary pattern). The proposed solution samples some deterministic graphs, for each of which it computes the densest subgraphs, and eventually returns the k subgraphs with the highest frequencies. This problem is extended to computing the node sets with high probabilities of being contained in a densest subgraph. The solution to this is similar to the previous one, except that the final step involves using a maximal frequent itemset mining algorithm on all the sampled densest subgraphs. All the proposed methods are highly efficient and effective, both theoretically as well as experimentally. The final problem is a novel variant of the well-known opinion maximization problem where, given a social network of users with real-valued opinions (about different candidates) which diffuse according to a specified model starting from time 0, the goal is to choose the top-k seed users maximizing a specific voting-based score for the target candidate at a given finite time horizon. In contrast to prior works, the scoring function may not be submodular with respect to the seed set, which necessitates the augmentation of the well-known greedy algorithm with sandwich approximation using some suitably designed bound functions. To improve the efficiency, random walk and sketching-based methods are proposed, with theoretical quality guarantees.
URI: https://hdl.handle.net/10356/163960
DOI: 10.32657/10356/163960
Schools: School of Computer Science and Engineering 
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections:SCSE Theses

Files in This Item:
File Description SizeFormat 
Thesis.pdfPhD Thesis4.6 MBAdobe PDFThumbnail
View/Open

Page view(s)

247
Updated on May 20, 2024

Download(s) 50

139
Updated on May 20, 2024

Google ScholarTM

Check

Altmetric


Plumx

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.