

Spatial data from different geographic databases can differ considerably in object modelling, thematic information, completeness, and currentness. As a result, database objects that represent the same real-world entity, e.g., the same road, can differ strongly between datasets. Integrating two or more spatial data sources therefore requires matching at the instance level to identify and link multiple representations of the same real-world entities. Spatial matching has to analyze the geometry and other attributes of complex objects, as well as the dataset topology, in order to compute a similarity-based matching between geographic objects.
In this paper, we propose an iterative object matching process called SimMatching, which relies strongly on attribute and relational similarity measures. Matching starts from geometric and thematic attribute similarity; relational similarity then increases as neighboring objects are matched. Additionally, strong constraints specifying allowed or forbidden matchings help to improve runtime and result quality. The main goals of our algorithm are adaptability to different input data and efficiency in handling complex objects while still achieving high-quality results. Scalability to large datasets is supported by a partitioning framework and parallel processing.
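
To make the iterative idea concrete, the following is a minimal sketch of similarity-driven matching in which relational similarity grows as neighbors are matched. It is not the SimMatching implementation: the `GeoObject` type, the point-based geometry, the similarity formulas, the weight `w_rel`, the threshold, and the greedy best-pair selection are all assumptions chosen for illustration, and partitioning and parallel processing are left out.

```python
from dataclasses import dataclass, field
from math import hypot


@dataclass
class GeoObject:
    """Hypothetical, heavily simplified geographic object (a point with a theme)."""
    oid: str
    x: float
    y: float
    kind: str
    neighbors: set = field(default_factory=set)  # ids of adjacent objects in the same dataset


def attribute_similarity(a, b, max_dist=100.0):
    """Assumed measure: mean of a distance-based score and a thematic equality score."""
    geo = max(0.0, 1.0 - hypot(a.x - b.x, a.y - b.y) / max_dist)
    theme = 1.0 if a.kind == b.kind else 0.0
    return 0.5 * geo + 0.5 * theme


def relational_similarity(a, b, matches):
    """Share of a's neighbors that are already matched to a neighbor of b."""
    if not a.neighbors:
        return 0.0
    hits = sum(1 for n in a.neighbors if matches.get(n) in b.neighbors)
    return hits / len(a.neighbors)


def iterative_match(left, right, allowed=lambda a, b: True, threshold=0.5, w_rel=0.4):
    """Greedy 1:1 matching: accept the best-scoring pair per round until none qualifies."""
    matches = {}
    while True:
        used = set(matches.values())
        best = None
        for a in left.values():
            if a.oid in matches:
                continue
            for b in right.values():
                # The constraint check prunes forbidden pairs before any scoring.
                if b.oid in used or not allowed(a, b):
                    continue
                score = ((1.0 - w_rel) * attribute_similarity(a, b)
                         + w_rel * relational_similarity(a, b, matches))
                if score >= threshold and (best is None or score > best[0]):
                    best = (score, a.oid, b.oid)
        if best is None:
            return matches
        matches[best[1]] = best[2]


# Two tiny made-up datasets, each with two connected road objects.
left = {
    "a1": GeoObject("a1", 0.0, 0.0, "road", {"a2"}),
    "a2": GeoObject("a2", 50.0, 0.0, "road", {"a1"}),
}
right = {
    "b1": GeoObject("b1", 0.0, 0.0, "road", {"b2"}),
    "b2": GeoObject("b2", 90.0, 0.0, "road", {"b1"}),
}

result = iterative_match(left, right)
print(result)
```

In this toy input, the pair a2/b2 scores below the threshold on attribute similarity alone and is accepted only in the second round, after its neighbors a1 and b1 have been matched. Passing an `allowed` predicate that rejects a pair removes it from consideration entirely, which illustrates how constraints can reduce the number of similarity computations.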