Spatial keyword querying beyond the single geo-textual object granularity
Date of Issue2014
School of Computer Engineering
Centre for Advanced Information Systems
With the proliferation of geo-positioning techniques and mobile devices such as smart-phones and tablet computers, the web is increasingly being used by mobile users and accurate user positioning is increasingly available. As a result, a spatial, or geographical web is emerging where contents and users are associated with locations. This leads to the fact that massive amounts of objects are available on the web that possess both a geographical location and a textual description. Such geo-textual objects include stores, tourist attractions, hotels, restaurants, businesses, entertainment services, public transport, etc. The availability of substantial amounts of geo-textual objects gives prominence to spatial keyword queries that target these objects. Such queries exploit both locations and textual descriptions and occur in many types of mobile and traditional web services and applications, e.g., Yellow Pages and Maps services, which greatly facilitate our everyday life. For example, in Google Maps the functionality “search nearby” allows users to retrieve points of interest around a specified location. Spatial keyword querying has attracted significant industrial and academic interests. There exist many proposals on studying spatial keyword queries, which retrieve lists of geo-textual objects that are both textually relevant to the query and satisfy a spatial query predicate. However, most existing work on spatial keyword querying treats the geo-textual objects as independent. This thesis focuses on efficiently processing the spatial keyword queries beyond the single geo-textual object granularity. First, we believe that a relevant result geo-textual object with nearby objects that are also relevant to the query is likely to be preferable over a relevant object without relevant nearby objects. We propose the concept of prestige-based relevance to capture both the textual relevance of an object to a query and the effects of nearby objects. Based on this, a new type of query, the top-k Prestige-based Spatial Keyword (kPSK) query, is proposed that retrieves the top-k geo-textual objects ranked according to both prestige-based relevance and location proximity. Two algorithms are devised to answer kPSK queries. Empirical studies with real-world datasets demonstrate that kPSK queries are more effective in retrieving geo-textual objects than is a previous approach without the consideration of the effects of nearby objects; the experimental results also show that the proposed algorithms are scalable and outperform a baseline approach significantly. Next, proposals for spatial keyword search so far generally focus on finding individual objects rather than finding groups of objects where the objects in a group collectively satisfy the requirements. We define another new type of query named the Spatial Group Keyword (SGK) query, which retrieves a group of geo-textual objects such that the group’s keywords cover the query’s keywords and such that objects are nearest to the query location and have the lowest inter-object distances. Specifically, we study two variants of this problem, both of which are NP-hard. We devise exact solutions as well as approximate solutions with provable approximation bounds to the problems. We present empirical studies that offer insight into the efficiency and accuracy of the solutions. Third, we consider a spatial keyword query over road networks that retrieves routes that are formed by a sequence of geo-textual objects. Identifying a preferable route is an important problem that finds applications in map services. When a user plans a trip within a city, the user may want to find “a most popular route such that it passes by shopping mall, restaurant, and pub, and the travel time to and from my hotel is within 4 hours.” However, no existing algorithms can be used to answer such queries. Motivated by this, we define the Keyword-aware Optimal Route query, denoted by KOR, which finds a route such that it covers a set of user-specified keywords, a specified budget constraint is satisfied, and an objective score of the route is optimal. The problem of answering KOR queries is NP-hard. We first devise an approximation algorithm with provable approximation bounds. Based on this algorithm, a more efficient approximation algorithm is proposed. We also design a greedy approximation algorithm. Results of empirical studies show that all the proposed algorithms are capable of answering KOR queries efficiently. The empirical studies also evaluate the accuracy of the proposed algorithms.
DRNTU::Engineering::Computer science and engineering