C o m p u t a t i o n a l    L o g i c

Spatial Databases and Spatial Logic


 

Spatial Databases and Spatial Logic

Jan Paredaens

Introduction

Spatial databases basically investigate that kind of databases that manage information about one, two or more dimensional real space. In classical database theory the content of databases is abstract and has no property except the equality. This is why genericity [16] is so important here. In spatial databases, the content has some interpretation. All the laws of real geometry hold. This induces a far richer class of datastructures. It augments the possibilities to represent these databases. The content of the latter is indeed infinite, since it contains, mostly, an infinite number of points. There are however, different possibilities to represent this information in a finite way.

The different geometrical properties give rise to many kinds of genericity [49]. They express the importance of the special semantic properties of real spatial databases. Therefore, the search for complete languages for expressing these generic queries is a hard challenge, especially for the generic classes that cannot be described in a finite way. We are not going to discuss this topic but refer to [38] for an elaborated overview.

Algebras and Graphs

The geo-relational algebra was a first design of an extended relational algebra including spatial data types with a comprehensive set of operations [33]. This algebra was implemented within a large prototype system, the Gral System. The ROSE algebra offers more general datatypes for a 2D spatial database [34, 35]. The concept of a realm is fundamental for this kind of work. An important problem in spatial databases that is not solved through the introduction of spatial data types and algebras is the representation and querying of networks, such as roads, rivers or public transport systems. This is handled in the GraphDB model [36, 37]. More information on these topics can be found in [2].

First order in Spatial Databases

The framework of constraint databases, introduced by Kanellakis, Kuper and Revesz [41], provides an elegant and powerful model of spatial databases [49, 51]. In this setting, a spatial database, viewed as a possibly infinite set of points in real space, is represented as a Boolean combination of polynomial inequalities. For example, the unit disk in the real plane except its upper left quadrant would be represented as

{(x,y) *x2 + y2 ? 1*? (x >= 0 * y >= 0)}.

By extending the relational calculus with polynomial inequalities one obtains a simple spatial query language. For example, the query whether the database S contains a circle can be expressed as

(*x0)(*y0)(*r)(*x)(*y)((x-x0)2+(y-y0)2=r2 *S(x,y)).

Although variables now range over the whole of the real numbers, queries expressed in the calculus can still be effectively computed using methods from symbolic computation.

The expressiveness of the calculus is limited, however. For example, the combined results of Benedikt et al. [12] and Grumbach and Su [29] imply that one cannot express in the calculus that the database is topologically connected. The topological connectivity query can be viewed as the spatial analogue of the standard relational query of graph connectivity, which is also not expressible in the standard relational calculus.

4 Deductive Spatial Databases

Consider the following trivial program

R(0)*

R(y)*R(x), y=x+1.

This program does not terminate, regardless of the input database (which is not used at all in the program), because the resulting set R grows unboundedly. Let us therefore see what happens if a fixed bound is put on the result, by requiring that the result must stay within some fixed bounded region.

The following example shows that under this requirement programs may still loop forever:

R(1) *

R(x) * R(x'), x = x'/2.

Relation R remains in the interval [0,1] but the program does not terminate; the interval is repeatedly cut in half and this goes on indefinitely.

In search of termination guarantees, we therefore put the drastic restriction on programs that no arithmetic (polynomial inequalities) is allowed in rule bodies. In other words, we concentrate on pure Datalog programs. Clearly, under this restriction, programs that do not rely on the database, such as the two programs given above, can no longer loop forever.

The classical example in pure Datalog is the transitive closure program TC:

R(x,y) * S(x,y)

R(x,y) * R(x,z), S(x,y).

On finite inputs S, TC always terminates. On infinite inputs, however, this is of course no longer true. For example, TC will loop forever on the semi-algebraic set S defined by y=x+1; in the n-th iteration all points of the form (x,x+n) are added.

It is a basic fact of standard relational database theory that on finite relations, pure Datalog programs always terminate. One might hope that this property carries over to infinite but bounded, semi-algebraic sets. For example, by bounding the above-mentioned set y=x+1 as

y=x+1*0 ? x ? 10,

TC will terminate after ten iterations. This hope is unjustified, however. On the bounded input

S= {(x,y) *y=2x * 0 ? x ? 1/2},

TC will loop forever; every iteration adds a line segment of increasing slope.

The previous example also shows that, even if we require not only the input set but also the output set to be bounded, programs still can loop forever. Indeed, the lines of increasing slope all remain in the unit square.

As a last resort in our search for termination guarantees, we therefore investigate the stronger requirement that the output set is always contained in the input set. In other words, we concentrate on selection queries. The output of a selection query on a bounded input is clearly also bounded.

Whether or not a program expresses a selection query is a semantic property, but we can put a strong syntactic range restriction on the rules so that only selection queries can be expressed: every rule has to be of the form

R(x,y) * S(x,y) , B,

where S is as always the EDB predicate denoting the database and B is the rest of the rule body. A program in which every rule is of this form is called a selection program. The transitive-closure program is not a selection program.

The following program, a range-restricted variation of TC, is an example of a selection program:

R(x,y)*S(x,y), S(z,x)

R(x,y)*S(x,y), S(x,z), R(z,y).

On unbounded semi-algebraic inputs, selection programs can loop forever, as witnessed by the set

S = {(x,y)*y=x+1* x >= 0} E {(0,0)}.

On this S, the above program will not terminate; in the first iteration the point (0,0) is added, and in subsequent iterations subsequent points of the form (n,n+1) are added. However, if we bound S, e.g., by adding the constraint y ? 10, the program will terminate.

Hence, our last hope is that

selection programs on bounded semi-algebraic inputs always terminate.

The following selection program destroys also this last hope:

R(x,y) * S(x,y), S(x,x)

R(x,y) * S(x,y), R(z,x).

On the bounded input

S = {(x,y)* y = x/2 * 0 x ? 1}E{(1,1)},

this program does not terminate; in the first iteration the point (1,1) is added, in the second iteration the endpoint of the line segment in the database is added, and in subsequent iterations the line segment is repeatedly cut in half, going on indefinitely. [44] presents some results in this context.

5 Topological Aspects

Motivated by applications in geographical information systems where only relative positions of spatial data are of importance, a number of researchers have investigated topological properties of spatial databases and query languages in which to express them. Topological queries are those whose result only depends on the topological aspects of the spatial data.

In the polynomial model, the natural query language, as mentioned before, is first-order logic over the reals (expanded with a symbol to address the database). Relative to this query language, we say that two spatial databases are called topologically elementary equivalent if they cannot be distinguished by such topological first-order queries. Kuijpers, Paredaens and Van den Bussche [50] have given an effective characterization of topological elementary equivalence of closed databases in the real plane. This characterization is based on a known topological property of semi-algebraic sets, namely that locally around each point they are "conical''. The points in the database are partitioned according to the types of their cones. Roughly, the characterization then says that two databases are topologically elementary equivalent if and only if the cardinalities of the equivalence classes of their partitions match.

The same authors have also investigated the question of which topological properties can be expressed in first-order logic. They have introduced a two-tiered logic for expressing topological properties, which is subsumed by first-order logic over the reals. and they prove on certain classes of databases that the two logics are actually equivalent.

Another approach to investigating topological properties of planar spatial databases was followed by Papadinitriou, Suciu and Vianu [48]. They consider labeled regions in the plane and investigate topological relationships between them that were introduced by Egenhofer and Franzosa [27]. The query languages that they consider are not coordinate or point based, but rather region based, in the sense that quantifiers range over regions. They have shown that the closure of these relationships under appropriate logical operations yields a language that is complete for topological properties of spatial databases.

Some queries only involve properties of the database that are topological in nature. In this class of queries, concepts such as adjacency, connectivity, and containment are important. Queries like "Is there a highway connecting Brussels to Paris?'' or "Give all countries in Europe adjacent to the Atlantic'' are typical in this respect. Characteristic of topological properties is that they do not distinguish between two databases that can be obtained from each other by a topological deformation. We will call such databases topologically equivalent.

In applications in which only topological properties are under consideration, it may be desirable to be able to work with a representation of the database which is topologically invariant, meaning that two topologically equivalent databases will be represented identically. Ideally, as we will illustrate later, a representation should also be lossless, in the sense that two databases that are not topologically equivalent will be represented differently.

This topological property is elaborated in the context of a limited class of spatial databases consisting of points in the plane R2, lines between these points, and areas formed by these lines. This model is commonly referred to as the topological data model [36, 46]. An example application is a subway or railroad map in which only relative positions of spatial objects such as stations and tracks are depicted without, for instance, taking the actual length of the trajectory into account. A survey of application domains that can be modeled in this manner was given by Laurini and Thompson [46]. Another approach of topological issues in the context of spatial databases can be found in [4].

6 Some Spatial Database Systems

The World Wide Web hosts a multitude of commercial spatial database products and scientific prototypes. Geographic Information Systems (GIS) constitute the most well-known applications of spatial databases, though other applications (e.g. CAD and robotics) also exist. In this paragraph, the first three products are well-known implementations of spatial databases, whereas the last two are examples of numerous commercial GIS products. For an exhaustive list of GIS resources on the Web, we refer to [1].

1.The ROSE algebra (Gueting and Schneider, 1995) [6].

The ROSE algebra offers general spatial data types in 2D, is based on finite resolution geometry, has a complete formal definition of semantics, and interfaces with any DBMS data model. The main techniques used are (parallel) traversal of objects, plane-sweep, and graph algorithms. The ROSE algebra has also been implemented and the Modula-2 source code is freely available from the authors for study or use.

2. Dedale (Stephane Grumbach et al.) [7]

Dedale is a prototype of a constraint database for multidimensional data, conceived for spatio-temporal applications. It is implemented on top of the O2 DBMS, and uses a graphical user interface for querying purposes. Its main characteristics are (i) to offer a real abstraction of geometric data allowing the development of high level extensible query languages with a potential for optimization, while (ii) allowing the use of optimal computation techniques for spatial queries.

3. ARC/INFO (ESRI) [8]

ARC/INFO, running on UNIX or Windows NT, is a powerful GIS server for large or small organizations. ARC/INFO is implemented as a client to ESRI's Spatial Database Engine (SDE ) universal spatial data server. It enables the user to create and maintain geographic information, manage large, multi-user spatial databases, perform sophisticated spatial analysis, etc.

4. TNT-mips (Micro Images)[9]

TNTmips is the Map and Image Processing System from MicroImages, Inc. for geospatial analysis. Some of the analysis features of the product include: Elevation Mapping and Orthoimage Creation, Surface Fitting, Classification and Interpretation, Statistical Measurements, Polygon Fitting, etc.

5. Laser Scan [10]

Laser-Scan provides an advanced object-oriented GIS application development and spatial database environment together with a range of solutions built on this software platform.

7 Conclusions and Future Directions

I hope that the reader realizes that we still do not really understand the interaction between deductive databases and spatial databases. A lot of investigation is still needed in this domain. It could be that we need another data model to represent spatial information. On the other hand a lot of knowledge is available in geometry and computational geometry. Spatial databases are mainly based on some spatial logic. This spatial logic should be studied deeply. A fruitful marriage between computational geometry, deduction, computational logic, the spatial logic and some new data modelling techniques can give birth to a better understanding and a more efficient threatement of spatial information.

Acknowledgement

I especially thank Bart Kuijpers and Stijn Dekeyser for their help during the production of this text.

References

[1] Geographical Information Systems WWW Resource List,http://www.geo.ed.ac.uk/home/giswww.htmly (last visited 9-dec-97)

[2] Ralf Hartmut Gueting's Home Page, http://voss.fernuni-hagen.de/gebiete/pi4/gueting/home2.html (last visited 9-dec-97)

[3] Spatial Databases, http://www.luc.ac.be/research/ group s/theocomp/spatial.html (last visited 9-dec-97)

[4] Spatial Database Research Group, http://spatial. maine.edu/ max/max3.html (last visited 9-dec-97)

[5] Database Group UIA, http://pucky.uia.ac.be/ makke/Database/index.phtml (last visited 9-dec-97)

[6] The ROSE algebra, http://voss.fernuni-hagen.de/gebiete/pi4/gueting/home.html (last visited 9-dec-97)

[7] Dedale prototype, http://www-rocq.inria.fr/verso/dedale/ (last visited 9-dec-97)

[8] ESRI's ARC/INFO, professional GIS, http://www.esri.com/base/products/arcinfo/arcinfo.html (last visited 9-dec-97)

[9] TNTmips, The Map and Image Processing System, http://tnt.microimages.com/tntmips.htm (last visited 9-dec-97)

[10] Laser Scan, http://www.lsl.co.uk/ (last visited 9-dec-97)

[11] Alexandroff P., Elementary Concepts of Topology. Dover Pub., Inc., New York, 1961.

[12] Benedikt M., Dong G., Libkin L., Wong L., Relational expressive Power of Constraint Query Languages. In Proceedings 15th ACM Symposium on Principles of Database Systems, pages 5-16, 1996.

[13] Bochnak J., Coste M., Roy M.-F., Geometry algebrique reelle. Springer-Verlag, 1987.

[14] Burton W., Representation of many-sided polygons and polygonal lines for rapid processing. Comm. of the ACM, 20, 3, 166--171, 1977.

[15] Chan E., Zhu R., QL/G - A Query language for Geometric Data Bases.TR CS-94-25, Univ. of Waterloo, 1994.

[16] Chandra A., Harel D., Computable queries for relational database systems. Journal of Comp. and System Sciences, 21, 2, 156--178, 1980.

[17] Cole P., King C., Quantitative Geography. John Wiley, London, 1968.

[18] Collins G.E., Quantifier elimination for real closed fields by cylindrical algebraic decomposition. LNCS 33, 134--183, 1975.

[19] Corbett J.P., Topological Principles in Cartography. Technical paper 48, US Bureau of the Census, Washington DC, 1979.

[20] Dumortier F., Gyssens M., Vandeurzen L., Van Gucht D., On the decidability of Semi-Linearity for Semi-Algebraic Sets and its Implications for Spatial Databases In Proceedings 16th ACM Symposium on Principles of Database Systems, pages 68-77. ACM Press, 1997.

[21] Egenhofer M., Frank A., Towards a spatial query language: User interface considerations. Proc. of VLDB, 1988.

[22] Egenhofer M., A Formal definition of Binary Topological Relationships. LNCS 367, 457--472, 1989.

[23] Egenhofer M., Reasoning about Binary topological Relations. \newblock LNCS 525, 143--160, 1991.

[24] Egenhofer M., Topological Relations between Regions in R2 and Z2. LNCS 692, 316--336, 1993.

[25] Egenhofer M., Franzosa R.,On the equivalence of topological relations. Int. J. Geographical Information Systems, 523--542, 1994.

[26] Egenhofer M., Spatial SQL: a Query and Presentation Language. IEEE Transactions on Knowledge and Data Engineering, Vol. 6, N o.1, 1994.

[27] Egenhofer M., Franzosa R., Point-set topological spatial relations. Int. J. Geographical Information Systems, 5(2):161--174, 1991.

[28] Egenhofer M., Herring J., Advances in Spatial Databases, Lecture Notes of Computer Science, Volume 951, 1995

[29] Grumbach S., Su J., First-order definability over constraint databases In Montanari and Rossi, Principles and practice of constraint programming, pages 121-136, 1995.

[30] Guenther O., Efficient structures for geometric data management. LNCS 377, Springer-Verlag, 1988.

[31] Guenther O., Buchmann A., Research Issues in Spatial Databases. Sigmod Record Vol 19,4, 61--68, 1990.

[32] Guting R., Geo-Relational Algebra: A model and query language for geometric database systems. Proc of Extending Data Base Technology, Venice, 506--527, 1988.

[33] Guting R., Gral: An Extensible Relational Database System for Geometric Applications. Proc 15th VLDB, Amsterdam, 33-44, 1989.

[34] Guting R., Schneider M., Realm-based Spatial Data Types: The ROSE Algebra. TR 141-3-93 of Fern Universitat, Hagen, 1993.

[35] Guting R., Schneider M., Realms: A Foundation for Spatial Data Types in Database Systems. 3rd Int. Symp. on Large Spatial Databases, 14-35, Singapore, 1993.

[36] Guting R., GraphDB: A Data Model and Query Language for Graphs in Databases. Informatik Berichte 155-2/1994, FernUniversitat, Hagen, 1994.

[37] Guting R., An Introduction to Spatial Database Systems. The VLDB Journal, 3, 4, 1994.

[38] Gyssens M., Van den Bussche J., Van Gucht D., Complete geometrical query languages, In Proceedings 16th ACM Symposium on Principles of Database Systems, pages 62-67. ACM Press, 1997.

[39] Hirst T., Harel D., Completeness Results for Recursive Data Bases. Proc. 12th Symp. on Princ. of Database Systems, Washington, DC, 244--252, 1993.

[40] Hoel E., Performance of Data-parallel Spatial Operations. VLDB94, 156--167, Chili, 1994.

[41] Kanellakis P., Kuper G., Revesz P., Constraint query languages. Proc. 9th Symp. on Princ. of Database Systems, 299--313, 1990.

[42] Kemper A., Wallrath M., An analysis of geometric modeling in database systems. Computing Surveys, 19, 1, 47--91, 1987.

[43] Kuijpers B., Paredaens J., Van den Bussche J., Lossless Representation of Topological Spatial Data. To appear in LNCS, Proceedings of the 4th Symposium on Large Spatial Data Bases, Portland (Maine) , 1995.

[44] Kuijpers B., Paredaens J., Smits M, Van den Bussche J., Termination properties of Spatial Database Programs, Proc. Logic in Databases, LNCS 1154, pages 95-110, 1996

[45] Kuijpers B., Paredaens J., Data Models and Query Languages for Spatial Databases, Accepted in DKE

[46] Laurini R., Thompson D., Fundamentals of Spatial Information Systems. Academic Press, APIC Series 37, 1992.

[47] Moise E.E., Geometric Topology in Dimensions 2 and 3. Graduate Texts in Mathematics, 47, Springer-Verlag, 1977.

[48] Papadimitriou C., Suciu D., Vianu V., Topological queries in spatial databases. In Proceedings 15th ACM Symposium on Principles of Database Systems, pages 81--92. ACM Press, 1996.

[49] Paredaens J., Spatial Databases, the Final Frontier, ICDT95, LNCS 893, pages 18-32, 1995

[50] Paredaens J., Kuijpers B., Van~den Bussche J., On topological elementary equivalence of spatial databases. In F. Afrati and Ph. Kolaitis, editors, Database Theory---ICDT'97, volume 1186 of Lecture Notes in Computer Science, pages 432--446. Springer, 1997.

[51] Paredaens J., Van den Bussche J., Van Gucht D., Towards a Theory of Spatial Database Queries. In Proceedings 13th ACM Symposium on Principles of Database Systems, pages 279-288, 1994.

[52] Paredaens J., Van den Bussche J., Van Gucht D., Full paper of "Towards a theory of spatial database queries''. In preparation.

[53] Peano G., Sur une courbe qui remplit toute une aire plane. Mathematische Annalen, 36(a), 157--160, 1890.

[54] Preparata F., Shamos M., Computational Geometry. Springer-Verlag, 1985.

[55] Rigaux Ph., Scholl M., Multiple Representation Modelling and Querying. IGIS94, 1994.

[56] Samet H., The Design and Analysis of Spatial Data Structures. Addison-Wesley, 1990.

[57] Scholl M., Voisard A., Thematic Map Modeling. Proc. SSD, 167--192, also in LNCS 409, Springer-Verlag, 1989.

[58] Stolboushkin A., Taitslin M., Linear vs. Order Constraint Queries over Relational Databases In Proceedings 15th ACM Symposium on Principles of Database Systems, pages 17-27, 1996.

[59] Tarski A., A decision method for elementary algebra and geometry. University of California Press, 1951.

Jan Paredaens

University of Antwerp, Belgium

pareda@uia.ua.ac.be


Coordinator's Report ] Intelligent Access to Heterogeneous Information Sources ] PODS '97 -- ACM Symposium on Principles of Database Systems ] [ Spatial Databases and Spatial Logic ] From Databases to Web-Bases ] Institute of Information Systems, Technical University of Vienna ]


Home ] Automated Deduction Systems ] Computational Logic & Machine Learning ] Concurrent & Constraint Logic Programming ] Language Design, Semantics & Verification Methods ] Logic Based Databases ] Program Development ] Knowledge Representation & Reasoning ]