Ebook: Abstraction in Ontology-based Data Management
Effectively documenting data services is a crucial issue in any organization, not only for governing data but also for interoperation purposes. Indeed, in order to fully realize the promises and benefits of a data-driven society, data-driven approaches need to be resilient, transparent, and fully accountable.
This book, Abstraction in Ontology-based Data Management, proposes a new approach to automatically associating formal semantic descriptions with data services, thus bringing them into compliance with the FAIR (Findable, Accessible, Interoperable, and Reusable) guiding principles. The approach is founded on the Ontology-based Data Management (OBDM) paradigm, in which a domain ontology provides a high-level semantic layer mapped to the source schema of an organization's data, thus abstracting from the technical details of the data layer implementation. The book introduces a formal framework for a novel reasoning task in OBDM, called Abstraction, in which a data service is assumed to be expressed as a query over the source schema, and the aim is to derive a query over the ontology that best describes the semantics of the given data service with respect to the underlying OBDM specification. An in-depth complexity analysis of two computational problems associated with the framework is carried out in a general scenario that uses the most popular languages in the OBDM literature. Also investigated are the problem of expressing abstractions in a non-monotonic query language and the impact of adding inequalities. For the latter, the problem of answering queries with inequalities over lightweight ontologies is studied first. Lastly, the author illustrates how the achieved results contribute to new results in the Semantic Web context and in relational database theory.
The book will be of interest to all those engaged in Artificial Intelligence and Data Management.
In many aspects of our society there is growing awareness of, and consensus on, the need for data-driven approaches that are resilient, transparent, and fully accountable. But in order to fulfil the promises and benefits of a data-driven society, the data services exposed by organisations' information systems must be well documented, and their semantics must be clearly specified. Effectively documenting data services is indeed a crucial issue for organisations, not only for governing their own data, but also for interoperation purposes.
In this thesis, we propose a new approach to automatically associate formal semantic descriptions with data services, thus bringing them into compliance with the FAIR guiding principles, i.e., making data services automatically Findable, Accessible, Interoperable, and Reusable (FAIR). We base our proposal on the Ontology-based Data Management (OBDM) paradigm, where a domain ontology is used to provide a semantic layer mapped to the data sources of an organisation, thus abstracting from the technical details of the data layer implementation.
The basic idea is to characterise, or explain, the semantics of a given data service, expressed as a query over the source schema, in terms of a query over the ontology. The query over the ontology thus represents an abstraction of the given data service in terms of the domain ontology through the mapping. Together with the elements of the ontology's vocabulary, this abstraction forms a basis for annotating the given data service with suitable metadata expressing its semantics.
We illustrate a formal framework for the task of automatically producing a semantic characterisation of a given data service expressed as a query over the source schema. The framework is based on three semantically well-founded notions, namely perfect, sound, and complete source-to-ontology rewriting, and on two associated basic computational problems, namely verification and computation. The former verifies whether a given query over the ontology is a perfect (respectively, sound, complete) source-to-ontology rewriting of a given data service expressed as a query over the source schema, whereas the latter computes one such rewriting, provided it exists. We provide an in-depth complexity analysis of these two computational problems in a very general scenario that uses some of the most popular languages considered in the literature on managing data through an ontology. Furthermore, since we also study cases where the target query language for expressing source-to-ontology rewritings allows inequality atoms, we investigate the problem of answering queries with inequalities over lightweight ontologies, a problem that has rarely been addressed. In another direction, we study and advocate the use of a non-monotonic target query language for expressing source-to-ontology rewritings. Last but not least, we give a detailed account of related work, which illustrates how the results achieved in this thesis contribute to new results in the Semantic Web context, in relational database theory, and in view-based query processing.
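To make the three notions more concrete, the following is a minimal Python sketch on a toy scenario. It is an illustration only, not the framework's actual definitions or algorithms. It assumes one common reading of the notions: a query over the ontology is sound if its certain answers are always contained in the answers of the data service, complete if the containment holds in the other direction, and perfect if both hold. All names (`source_db`, `retrieve_abox`, `classify`, the relations `EMP` and `MGR`, the concepts `Employee` and `Manager`) are hypothetical. The sketch also checks a single source database only, whereas the verification problem concerns every source database, so a positive result here is just a necessary condition.

```python
# Toy source database: relation name -> set of tuples (hypothetical data).
source_db = {
    "EMP": {("ann", "sales"), ("bob", "it")},
    "MGR": {("ann",)},
}

# Hypothetical mapping: each ontology concept is populated by a query
# over the source schema.
def retrieve_abox(db):
    return {
        "Employee": {(name,) for (name, _dept) in db["EMP"]},
        "Manager": {(name,) for (name,) in db["MGR"]},
    }

# Toy ontology: a single concept inclusion, "every Manager is an Employee".
inclusions = [("Manager", "Employee")]

def saturate(abox, axioms):
    """Apply the inclusions until no new fact is derived."""
    facts = {pred: set(tuples) for pred, tuples in abox.items()}
    changed = True
    while changed:
        changed = False
        for sub, sup in axioms:
            new = facts.get(sub, set()) - facts.setdefault(sup, set())
            if new:
                facts[sup] |= new
                changed = True
    return facts

# The data service: a query over the source schema returning all employees.
def data_service(db):
    return {(name,) for (name, _dept) in db["EMP"]}

# Candidate abstractions: atomic queries over the ontology. For atomic
# queries and atomic inclusions, evaluating over the saturated facts
# yields the certain answers.
def candidate_employee(facts):
    return set(facts["Employee"])

def candidate_manager(facts):
    return set(facts["Manager"])

def classify(q_source, q_onto, db):
    """Compare the two queries on ONE database (assumed reading of the notions)."""
    src_answers = q_source(db)
    cert_answers = q_onto(saturate(retrieve_abox(db), inclusions))
    sound = cert_answers <= src_answers
    complete = src_answers <= cert_answers
    return {"sound": sound, "complete": complete, "perfect": sound and complete}

result_employee = classify(data_service, candidate_employee, source_db)
result_manager = classify(data_service, candidate_manager, source_db)
```

On this toy database, the `Employee` query passes both containment checks, while the `Manager` query passes only the soundness check, since it misses `bob`. A verification procedure in the sense of the framework would have to establish these containments for all source databases, which is where the complexity analysis comes in.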