Please use this identifier to cite or link to this item: https://hdl.handle.net/10356/68898
Title: Efficient index structures for reachability and shortest path queries
Authors: Wang, Sibo
Keywords: DRNTU::Engineering::Computer science and engineering::Information systems::Database management
Issue Date: 2016
Source: Wang, S. (2016). Efficient index structures for reachability and shortest path queries. Doctoral thesis, Nanyang Technological University, Singapore.
Abstract: Graphs are a fundamental data structure to represent objects and their relations in various domains, e.g., social science, citation analysis, web link analysis, and navigation systems. Reachability and shortest path queries are two types of primitive and well-studied graph queries. In this thesis, we investigate on how to build efficient index structures to answer the reachability queries and two variants of the shortest path queries. The first variant of the shortest path query considers the shortest path in timetable graphs, where it requires the consideration of spatiotemporal constraints between different edges. The second variant aims to provide the shortest path under an additional cost constraint. These problems find many real world applications, and are challenging to achieve high efficiency on sizable graphs. First, we study the reachability queries on directed graphs. Given a directed graph $G$, a source $s$, and a target $t$, a reachability query asks whether there exists a path from $s$ to $t$ in $G$. The state-of-the-art solutions for answering reachability queries are a class of labelling schemes referred to as {\em Hierarchical Hub Labelings (HHL)}. One distinct characteristic of {\em HHL} is that any {\em HHL} index implicitly or explicitly defines a total order of vertices in the graph, and the performance of the index is solely decided by the ordering. Several heuristic approaches have been proposed for vertex ordering in {\em HHL}, but none of them provides any worst-case guarantee. We present a novel study on the vertex ordering in {\em HHL}, and devise a polynomial-time algorithm for vertex ordering that yields an {\em HHL} index whose size is at most $O(\sqrt{n})$ times the optimal one. Motivated by our theoretical study, we design a new heuristic for vertex ordering, and show that it leads to an {\em HHL} index with superior practical performance for reachability queries. Besides, identifying fast routes in public transportation networks is an important problem with applications in map services and navigation systems. In public transportation networks, any traversal between different network locations relies on transportation services (e.g., buses, subways, and ferries.) that run on fixed routes with pre-defined schedules. A public transportation network can often be modeled as a timetable graph where (i) each node represents a station; and (ii) each directed edge $\langle u, v\rangle $ is associated with a timetable that records the departure (resp. arrival) time of each vehicle at station $u$ (resp. $v$). There are several types of shortest path queries on timetable graphs. We study three types of shortest path queries for route planning that have been extensively studied on timetable graphs: {\em earliest arrival path (EAP)}, {\em latest departure path (LDP)}, and {\em shortest duration path (SDP)} queries. These three types of queries aim to find a path that arrives at a place as early as possible, departs as late as possible but arrives on time, and has the shortest travel time, respectively. We present {\em Timetable Labelling (TTL)}, an efficient indexing technique for shortest path queries on timetable graphs. The basic idea of {\em TTL} is to associate each node $u$ with a set of labels, each of which records the shortest travel time from $u$ to some other node $v$ given a certain departure time from $u$; such labels would then be used during query processing to improve efficiency. In addition, we propose query algorithms that enable {\em TTL} to support EAP, LDP, and SDP queries, and investigate how we reduce the space consumption of {\em TTL} with advanced preprocessing and label compression methods. By conducting an extensive set of experiments on real world datasets, we demonstrate that {\em TTL} significantly outperforms the states of the art in terms of query efficiency, while incurring moderate preprocessing and space overheads. Finally, {\em constrained shortest path (CSP)} is a variant of the shortest path problem with an additional total cost constraint. Specifically, in a CSP query, each edge in the road network is associated with both a length and a cost, e.g., travel distance and time. Given an origin $s$, a destination $t$, and a cost constraint $\theta$, the goal is to find the path from $s$ to $t$ that minimizes its total length, while satisfying that its total cost does not exceed $\theta$. Deriving the exact answer for a CSP query has been proven to be NP-hard, and the majority of previous work focuses on approximate solutions. We propose {\em COLA}, the first practical solution for approximate CSP processing on large road networks. {\em COLA} exploits the facts that a road network can be effectively partitioned, and that there exists a relatively small set of landmark vertices that commonly appear in CSP results. Meanwhile, {\em COLA} includes both an effective indexing scheme for partition boundary vertices, and an efficient on-the-fly algorithm called $\alpha$-Dijk for path computation within a partition. Extensive experiments demonstrate that on continent-sized networks with tens of millions of vertices, {\em COLA} answers an approximate CSP query in sub-second time, whereas existing methods take hours. Interestingly, even without an index, the $\alpha$-Dijk algorithm in {\em COLA} still outperforms previous solutions by more than an order of magnitude.
URI: https://hdl.handle.net/10356/68898
DOI: 10.32657/10356/68898
Fulltext Permission: open
Fulltext Availability: With Fulltext
Appears in Collections:SCSE Theses

Files in This Item:
File Description SizeFormat 
Sibo_WANG_thesis_final.pdfThesis1.33 MBAdobe PDFThumbnail
View/Open

Google ScholarTM

Check

Altmetric


Plumx

Items in DR-NTU are protected by copyright, with all rights reserved, unless otherwise indicated.