Author: Frank, Richard
Spatial classification is the task of learning models that predict class labels for spatial entities based on their own features as well as their spatial relationships to other entities and the features of those entities. One way to classify spatial data is to transform it into multi-relational data, store it in a multi-relational database, and then apply Inductive Logic Programming (ILP) to it. However, this presents challenges not found in other multi-relational data mining problems. One such challenge is that spatial relationships are embedded in space. A multi-relational data mining algorithm therefore needs to determine which relationships are important and which spatial features of an entity to consider. To determine when two entities are spatially related in an adaptive and non-parametric way, this thesis introduces a Voronoi-based neighbourhood definition upon which spatial literals can be built. Compounding the complexity is the need for aggregation, since the effect of a single spatial entity is negligible when it lies in the neighbourhood of hundreds or thousands of other such entities. Properties of these neighbourhoods also need to be described and used for classification. Non-spatial aggregation literals already exist within the multi-relational framework of ILP, but they are not sufficient for comprehensive spatial classification. This thesis adds a formal set of extensions to the ILP framework that can represent the aggregation of multiple features, spatial aggregations, and spatial features and literals. These extensions allow more complex interactions and spatial occurrences, such as spatial trends, to be captured. To perform the rule learning more efficiently and to exploit powerful multi-processor machines, a scalable parallelized method capable of reducing the runtime by several factors is presented.
The method is compared against existing methods by experimental evaluation on several real world crime datasets.
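As an illustration of the neighbourhood idea only, the following Python sketch treats two entities as spatially related when their Voronoi cells share a boundary, which is equivalent to the two points being joined by an edge of the Delaunay triangulation. It is not the thesis's implementation: the function names (`circumcircle`, `voronoi_neighbours`, `neighbourhood_mean`), the brute-force search over point triples, and the choice of the mean as the aggregate are all assumptions made here for clarity, and the approach is only practical for small point sets.

```python
from itertools import combinations


def circumcircle(a, b, c):
    """Return (centre_x, centre_y, radius_squared), or None if a, b, c are collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy, (ax - ux) ** 2 + (ay - uy) ** 2


def voronoi_neighbours(points):
    """Map each point index to the indices whose Voronoi cells border its cell.

    Hypothetical helper: brute-force Delaunay test over all triples of points.
    """
    neighbours = {i: set() for i in range(len(points))}
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        ux, uy, r2 = circle
        # A triple forms a Delaunay triangle when no other point lies
        # strictly inside its circumcircle.
        empty = all(
            (px - ux) ** 2 + (py - uy) ** 2 >= r2 - 1e-9
            for m, (px, py) in enumerate(points)
            if m not in (i, j, k)
        )
        if empty:
            for u, v in ((i, j), (j, k), (i, k)):
                neighbours[u].add(v)
                neighbours[v].add(u)
    return neighbours


def neighbourhood_mean(points, values, index):
    """Aggregate a feature over the Voronoi neighbourhood of one entity (mean is an assumed choice)."""
    nbrs = voronoi_neighbours(points)[index]
    return sum(values[n] for n in nbrs) / len(nbrs) if nbrs else None


if __name__ == "__main__":
    # Made-up coordinates and feature values, for illustration only.
    pts = [(0, 0), (2, 0), (0, 2), (-2, 0), (0, -2), (5, 0)]
    vals = [0, 10, 20, 30, 40, 1000]
    print(sorted(voronoi_neighbours(pts)[0]))  # [1, 2, 3, 4]
    print(neighbourhood_mean(pts, vals, 0))    # 25.0
```

No distance threshold or neighbour count is supplied in this sketch: the neighbourhood of the entity at the origin adapts to the surrounding points, and the distant entity at (5, 0) is excluded because a nearer entity lies between them.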
Copyright is held by the author.
The author has not granted permission for the file to be printed nor for the text to be copied and pasted. If you would like a printable copy of this thesis, please contact email@example.com.