
Ebook: Applications and Practices in Ontology Design, Extraction, and Reasoning

Semantic Web technologies enable people to create data stores on the Web, build vocabularies, and write rules for handling data. They have been in use for several years now; in particular, the second version of the Web Ontology Language (OWL), a logical formalism for defining ontologies, has been a W3C recommendation since 2012. Knowledge extraction and knowledge discovery are two key aspects investigated in a number of research fields which can potentially benefit from the application of Semantic Web technologies, and specifically from the development and reuse of ontologies.
The main goal of this book, Applications and Practices in Ontology Design, Extraction, and Reasoning, is to provide an overview of application fields for Semantic Web technologies. In particular, it investigates how state-of-the-art formal languages, models, methods, and applications of Semantic Web technologies reframe research questions and approaches in a number of research fields. The book also aims to showcase practical tools and background knowledge for building and querying ontologies.
The first part of the book presents the state of the art of ontology design, applications, and practices in a number of communities, and in doing so provides an overview of the latest approaches and techniques for building and reusing ontologies according to both domain-dependent and domain-independent requirements. Once data is represented according to ontologies, it is important to be able to query and reason over it, also in the presence of uncertainty, vagueness, and probabilities.
The second part of the book covers some of the latest advances in the fields of ontology, semantics and reasoning, without losing sight of the book’s practical goals.
We provide an in-depth example of modular ontology engineering with ontology design patterns. The style and content of this chapter are adapted from previous work and tutorials on Modular Ontology Modeling, with expanded steps and updated tool information. The tutorial is largely self-contained, but assumes that the reader is familiar with the Web Ontology Language (OWL); we do, however, briefly review some foundational concepts. By the end of the tutorial, we expect the reader to understand the underlying motivation and methodology for producing a modular ontology.
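As a brief illustration of the OWL foundations the tutorial builds on, the sketch below (not taken from the chapter; class names are hypothetical) represents rdfs:subClassOf axioms as pairs and derives the entailed subsumptions by computing the transitive closure, which is the simplest form of the reasoning an OWL reasoner performs over a class hierarchy.

```python
# Illustrative sketch: subClassOf axioms as pairs, with entailed
# subsumptions derived via transitive closure. Class names are made up.
from itertools import product

subclass_axioms = {
    ("Student", "Person"),
    ("GraduateStudent", "Student"),
    ("Person", "Agent"),
}

def entailed_subsumptions(axioms):
    """Return the transitive closure of the subClassOf relation."""
    closure = set(axioms)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(closure, closure):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

# GraduateStudent is entailed to be a subclass of Agent:
print(("GraduateStudent", "Agent") in entailed_subsumptions(subclass_axioms))
# → True
```

A production ontology would of course be authored in OWL and checked with a DL reasoner; the point here is only the distinction between asserted and entailed axioms.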
Ontology reuse aims to foster interoperability and facilitate knowledge reuse. Several approaches are typically evaluated by ontology engineers when bootstrapping a new project. However, current practices are often motivated by subjective, case-by-case decisions, which hamper the definition of a recommended behaviour. In this chapter we argue that to date there are no effective solutions for supporting developers’ decision-making process when deciding on an ontology reuse strategy. The objective is twofold: (i) to survey current approaches to ontology reuse, presenting motivations, strategies, benefits and limits, and (ii) to analyse two representative approaches and discuss their merits.
With the adoption of Semantic Web technologies, an increasing number of vocabularies and ontologies have been developed in different domains, ranging from Biology to Agronomy or Geosciences. However, many of these ontologies are still difficult for researchers to find, access, and understand, due to a lack of documentation, URI resolution issues, versioning problems, etc. In this chapter we describe guidelines and best practices for creating accessible, understandable and reusable ontologies on the Web, using standard practices and pointing to existing tools and frameworks developed by the Semantic Web community. We illustrate our guidelines with concrete examples, in order to help researchers implement these practices in their future vocabularies.
In this chapter, an overview of the state of the art on knowledge graph generation is provided, with a focus on the two prevalent mapping languages: the W3C-recommended R2RML and its generalisation RML. We look in detail at their differences and explain how knowledge graphs, in the form of RDF graphs, can be generated with each of the two mapping languages. We then assess whether the vocabulary terms were properly applied to the data and whether any violations occurred in their use, regardless of whether R2RML or RML was used to generate the desired knowledge graph.
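The core idea behind both mapping languages can be sketched in a few lines: each row of a logical source is turned into RDF triples by a subject IRI template plus a list of predicate-object mappings. The sketch below is a deliberately simplified, hypothetical rendering of that idea in plain Python, not the R2RML vocabulary itself; the template syntax and column names are illustrative only.

```python
# Hypothetical sketch of an R2RML/RML-style mapping: rows from a logical
# source become RDF triples via a subject IRI template and
# predicate-object maps. Data and IRIs are illustrative.
rows = [
    {"id": "1", "name": "Alice"},
    {"id": "2", "name": "Bob"},
]

mapping = {
    "subject_template": "http://example.org/person/{id}",
    "predicate_object_maps": [
        ("http://xmlns.com/foaf/0.1/name", "name"),
    ],
}

def generate_triples(rows, mapping):
    """Apply the mapping to every row, yielding (s, p, o) triples."""
    triples = []
    for row in rows:
        subject = mapping["subject_template"].format(**row)
        for predicate, column in mapping["predicate_object_maps"]:
            triples.append((subject, predicate, row[column]))
    return triples

for t in generate_triples(rows, mapping):
    print(t)
```

In real R2RML the mapping itself is expressed in RDF (a `rr:TriplesMap` with `rr:subjectMap` and `rr:predicateObjectMap`), and RML generalises the logical source beyond relational tables to CSV, JSON, or XML.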
One of the most important goals of digital humanities is to provide researchers with data and tools for new research questions, either by increasing the scale of scholarly studies, linking existing databases, or improving the accessibility of data. Here, the FAIR principles provide a useful framework. Integrating data from diverse humanities domains is not trivial; the research questions (e.g. “was economic wealth equally distributed in the 18th century?”, or “what narratives are constructed around disruptive media events?”) and preparation phases (e.g. data collection, knowledge organisation, cleaning) of scholars need to be taken into account. In this chapter, we describe the ontologies and tools developed and integrated in the Dutch national project CLARIAH to address these issues across datasets from three fundamental domains or “pillars” of the humanities (linguistics, social and economic history, and media studies) that have paradigmatic data representations (textual corpora, structured data, and multimedia). We summarise the lessons learnt from using such ontologies and tools in these domains from a generalisation and reusability perspective.
Ontologies of research areas have been proven to be useful resources for analysing and making sense of scholarly data. In this chapter, we present the Computer Science Ontology (CSO), which is the largest ontology of research areas in the field, and discuss a number of applications that build on CSO to support high-level tasks, such as topic classification, metadata extraction, and recommendation of books.
In Digital Humanities, one of the main challenges is capturing the structure of complex information in data models and ontologies, in particular when the connections between terms are not trivial. This is typically the case for music data curated by libraries. In this chapter, we provide some good practices for representing complex knowledge, using the DOREMUS ontology as an exemplary case. We also show various applications of a knowledge graph built on the ontology, ranging from an exploratory search engine to a recommender system and a conversational agent able to answer questions about classical music.
While there exist several reasoners for Description Logics, very few of them can cope with uncertainty. BUNDLE is an inference framework that can exploit several OWL (non-probabilistic) reasoners to perform inference over Probabilistic Description Logics.
In this chapter, we report the latest advances implemented in BUNDLE. In particular, BUNDLE can now interface with the reasoners of the TRILL system, thus providing a uniform method to execute probabilistic queries using different settings. BUNDLE can be easily extended and can be used either as a standalone desktop application or as a library in OWL API-based applications that need to reason over Probabilistic Description Logics.
Reasoning performance depends heavily on the reasoner and on the method used to compute the probability. We provide a comparison of the different reasoning settings on several datasets.
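To make the kind of probability computed here concrete, the sketch below brute-forces the semantics that systems like BUNDLE implement: each probabilistic axiom holds independently with its probability, and the query probability is the total probability of the “worlds” (subsets of probabilistic axioms) containing at least one explanation of the query. BUNDLE compiles explanations into Binary Decision Diagrams for efficiency; the enumeration below, with made-up axiom names and probabilities, is only illustrative.

```python
# Brute-force sketch of probability computation over independent
# probabilistic axioms: sum the probabilities of all worlds (axiom
# subsets) that include at least one explanation of the query.
# Axiom names and probabilities are illustrative.
from itertools import chain, combinations

prob = {"ax1": 0.4, "ax2": 0.3, "ax3": 0.5}
# Each explanation is a minimal set of axioms entailing the query.
explanations = [{"ax1"}, {"ax2", "ax3"}]

def query_probability(prob, explanations):
    axioms = list(prob)
    total = 0.0
    for world in chain.from_iterable(
            combinations(axioms, r) for r in range(len(axioms) + 1)):
        w = set(world)
        p = 1.0
        for ax in axioms:
            p *= prob[ax] if ax in w else 1.0 - prob[ax]
        if any(e <= w for e in explanations):  # query holds in this world
            total += p
    return total

print(round(query_probability(prob, explanations), 3))  # → 0.49
```

Here P(query) = P(ax1 ∨ (ax2 ∧ ax3)) = 0.4 + 0.15 − 0.4 × 0.15 = 0.49; enumeration is exponential in the number of probabilistic axioms, which is exactly why BDD-based compilation matters in practice.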
In this work we describe preferential Description Logics of typicality, a nonmonotonic extension of standard Description Logics by means of a typicality operator T that allows a knowledge base to be extended with inclusions of the form T(C) ⊑ D, whose intuitive meaning is that “normally/typically, Cs are also Ds”. This extension is based on a minimal model semantics corresponding to a notion of rational closure, built upon preferential models. We recall the basic concepts underlying preferential Description Logics. We also present two extensions of the preferential semantics: on the one hand, probabilistic extensions, based on a distributed semantics suitable for tackling the problem of commonsense concept combination; on the other hand, strengthenings of the rational closure semantics and construction that avoid the so-called “blocking of property inheritance” problem.
Axiom pinpointing refers to the task of finding the specific axioms in an ontology that are responsible for a consequence. This task has been studied, under different names, in many research areas, leading to a reformulation and reinvention of techniques. In this chapter, we present a general overview of axiom pinpointing, providing the basic notions, different approaches for solving it, and some variations and applications that have been considered in the literature. This should serve as a starting point for researchers interested in related problems, with an ample bibliography for delving deeper into the details.
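A minimal sketch of the black-box flavour of this task: repeatedly try removing axioms, keeping only those still needed for the consequence to follow, so that what remains is one minimal justification. The entailment oracle below is a toy reachability check over subClassOf pairs standing in for a DL reasoner; axiom names are hypothetical.

```python
# Black-box axiom pinpointing sketch: shrink the ontology to a minimal
# subset (a justification) that still entails the consequence. The
# "reasoner" is a toy reachability check over subClassOf axioms.
axioms = [
    ("A", "B"), ("B", "C"),      # A ⊑ B ⊑ C: responsible for A ⊑ C
    ("A", "D"),                  # irrelevant to A ⊑ C
]

def entails(axioms, goal):
    """Toy oracle: does the subClassOf graph connect goal[0] to goal[1]?"""
    sub, sup = goal
    reached, frontier = {sub}, [sub]
    while frontier:
        node = frontier.pop()
        for a, b in axioms:
            if a == node and b not in reached:
                reached.add(b)
                frontier.append(b)
    return sup in reached

def one_justification(axioms, goal):
    """Drop axioms one by one, keeping only those needed for entailment."""
    just = list(axioms)
    for ax in list(axioms):
        rest = [a for a in just if a != ax]
        if entails(rest, goal):
            just = rest
    return just

print(one_justification(axioms, ("A", "C")))  # → [('A', 'B'), ('B', 'C')]
```

Real systems find all justifications (e.g. via hitting-set trees) or use glass-box techniques inside the reasoner; this linear shrink finds a single one with one oracle call per axiom.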
DLN is a recent approach that extends Description Logics with defeasible reasoning capabilities. In this chapter we provide an overview of DLN, illustrating the underlying knowledge engineering requirements as well as the characteristic features that shield DLN from some recurrent semantic and computational drawbacks. We also compare DLN with some alternative nonmonotonic semantics, highlighting the relationships between the KLM postulates and DLN.
The problem of querying RDF data is a central issue for the development of the Semantic Web. The query language SPARQL has become the standard language for querying RDF since its W3C standardization in 2008. However, the 2008 version of this language missed some important functionalities: reasoning capabilities to deal with RDFS and OWL vocabularies, navigational capabilities to exploit the graph structure of RDF data, and a general form of recursion much needed to express some natural queries. To overcome those limitations, a new version of SPARQL, called SPARQL 1.1, was released in 2013, which includes entailment regimes for RDFS and OWL vocabularies, and a mechanism to express navigation patterns through regular expressions. Nevertheless, there are useful navigation patterns that cannot be expressed in SPARQL 1.1, and the language lacks a general mechanism for expressing recursive queries. This chapter is a gentle introduction to a tractable rule-based query language (in fact, an extension of Datalog with value invention, stratified negation, and falsum) that is powerful enough to define SPARQL queries enhanced with the desired functionalities, focussing on a core fragment of the OWL 2 QL profile of OWL 2.
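The recursion such a Datalog-based language provides can be illustrated with the textbook reachability program, evaluated bottom-up to a fixpoint. SPARQL 1.1 property paths (e.g. `ex:edge+`) cover this particular pattern, but not arbitrary recursive rules; the edge data below is illustrative only.

```python
# Naive bottom-up evaluation of the recursive Datalog program
#   reach(X, Y) :- edge(X, Y).
#   reach(X, Y) :- edge(X, Z), reach(Z, Y).
# applied until no new facts are derived (a fixpoint).
edges = {("a", "b"), ("b", "c"), ("c", "d")}

def reachable(edges):
    """Apply the two rules until a fixpoint is reached."""
    reach = set(edges)                                   # base rule
    while True:
        new = {(x, w) for (x, y) in edges
                      for (z, w) in reach if y == z}     # recursive rule
        if new <= reach:
            return reach
        reach |= new

print(sorted(reachable(edges)))
```

Naive evaluation rederives old facts on every round; real Datalog engines use semi-naive evaluation (joining only against newly derived facts) and, as the next chapter discusses, optimisations such as Magic Sets to focus the computation on a given query.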
Reasoning over OWL 2 is a very expensive task in general, and therefore the W3C identified tractable profiles exhibiting good computational properties. Ontological reasoning for many fragments of OWL 2 can be reduced to the evaluation of Datalog queries. This chapter surveys some of these compilations, in particular the one addressing queries over Horn-SHIQ knowledge bases and its implementation in DLV2, enhanced by a new version of the Magic Sets algorithm.