In recent years, knowledge graphs (KGs) and ontologies have been widely adopted for modelling any kind of domain. Many of them are released openly, which benefits those starting new projects, as they have a wide choice of ontologies to reuse and of existing data to link to. Nevertheless, understanding the content of an ontology or a knowledge graph is far from straightforward, and existing methods only partially address this issue. For example, exploring and comparing multiple ontologies is a tedious manual task. This thesis is based on the assumption that identifying the Ontology Design Patterns (ODPs) used in an ontology or a knowledge graph contributes to addressing this problem. Most of the time, reused ODPs are not explicitly annotated, or their reuse may be unintentional. The challenge is therefore to automatically identify ODPs in existing ontologies and knowledge graphs, which is the main focus of this research work. To the best of our knowledge, there is a lack of tools that effectively support this task. This thesis contributes to the state of the art by analysing the role of ODPs in ontology engineering, through experiences in real-world ontology projects, and by placing this analysis in the wider context of existing ontology reuse approaches and implementations. Moreover, this thesis introduces (i) a novel method for extracting empirical ontology design patterns (EODPs) from ontologies, and (ii) a novel method for extracting EODPs from knowledge graphs, whose schemas are implicit.
The first method groups the extracted EODPs into clusters called conceptual components. Each conceptual component represents a generalised modelling problem, for example, representing collections. As EODPs are fragments possibly extracted from different ontologies, some of them will fall into the same cluster, meaning that they are expected to be implemented solutions to the same modelling problem, e.g. different solutions for modelling collections. Hence, EODPs and conceptual components enable the empirical observation of the solutions that different ontologies adopt for common modelling problems, thereby supporting their comparison.
The second method extracts EODPs from a knowledge graph as sets of probabilistic axioms and constraints involving the classes and properties instantiated in the KG. The probabilistic axioms are annotated with relevant provenance information, e.g. the KG in which they were observed. These EODPs may support KG inspection and comparison, as they provide insights into how certain entities are described in a KG and linked to other resources. Both methods are applied to widely adopted and reused ontologies and knowledge graphs, such as Wikidata.
An additional contribution of this thesis is an ontology for annotating ODPs in ontologies and knowledge graphs, which can be used as a basis for both manual and automatic annotation.