Database Systems – Progressive Skyline Computation

The skyline of a d-dimensional data set contains the points that are not dominated by any other point on all dimensions. Skyline computation has recently received considerable attention in the database community, especially for progressive methods that can quickly return the initial results without reading the entire database.

All the existing algorithms, however, have some serious shortcomings which limit their applicability in practice. In this article we develop branch-and bound skyline (BBS), an algorithm based on nearest-neighbor search, which is I/O optimal, that  is, it performs a single access only to those nodes that may contain skyline points. BBS is simple to implement and supports all types of progressive processing (e.g., user preferences, arbitrary dimensionality, etc).

Furthermore, we propose several interesting variations of skyline computation, and show how BBS can be applied for their efficient processing.
Categories and Subject Descriptors: Database Management, Information Storage and Retrieval, Information Search and Retrieval

General Terms: Algorithms, Experimentation

INTRODUCTION

The skyline operator is important for several applications involving multi-criteria decision making. Given a set of objects p1, p2, . . . , pN, the operator returns all objects pi such that pi is not dominated by another object pj . Using the common example in the literature, assume in Figure 1 that we have a set of hotels and for each hotel we store its distance from the beach (x axis) and its price ( y axis). The most interesting hotels are a, i, and k, for which there is no point that is better in both dimensions. Borzsonyi et al. [2001] proposed an SQL syntax for the skyline operator, according to which the above query would be expressed as: [Select *, From Hotels, Skyline of Price min, Distance min], where min indicates that the price and the distance attributes should be minimized.
The syntax can also capture different conditions (such as max), joins, group-by, and so on. For simplicity,we assume that skylines are computed with respect to min conditions on all dimensions; however, all methods discussed can be applied with any combination of conditions. Using the min condition, a point pi dominates1 another point pj if and only if the coordinate of pi on any axis is not larger than the corresponding coordinate of pj . Informally, this implies that pi is preferable to pj according to any preference (scoring) function which is monotone on all attributes. For instance, hotel a in Figure 1 is better than hotels b and e since it is closer to the beach and cheaper (independently of the relative importance of the distance and price attributes).

Attachments

Leave a Reply

Your email address will not be published. Required fields are marked *