Managing uncertainty in the Semantic Web is of foremost importance given the nature and origin of the available data. This book presents DISPONTE, a probabilistic semantics for knowledge bases (KBs) inspired by the distribution semantics of Probabilistic Logic Programming. The book also describes approaches for inference and learning. In particular, it discusses three reasoners and two learning algorithms, together with their distributed variants. BUNDLE and TRILL find explanations for queries and compute their probability with regard to DISPONTE KBs, while TRILLP compactly represents explanations using a Boolean formula and computes the probability of queries. The system EDGE learns the parameters of the axioms of DISPONTE KBs. To reduce the computational cost, EDGEMR performs distributed parameter learning. LEAP learns both the structure and the parameters of KBs, with LEAPMR using EDGEMR to reduce the computational cost. The algorithms provide effective techniques for dealing with uncertain KBs and have been extensively tested on various datasets and compared with state-of-the-art systems.
The Semantic Web introduced a new vision of the World Wide Web where the information resources published on the Internet are readable and understandable by machines. However, much information is inherently incomplete or uncertain, especially when it is collected from different sources. We therefore need a way to manage this kind of data.
In this thesis we address this problem and present a complete framework for handling uncertainty in the Semantic Web. Description Logics (DLs) are the basis of the Semantic Web. DL knowledge bases (KBs) contain both assertional and terminological information regarding individuals, classes of individuals and the relationships among them. We first define a probabilistic semantics for DLs, called DISPONTE. It is inspired by the distribution semantics, a well-known approach in probabilistic logic programming. DISPONTE makes it possible to associate degrees of belief with pieces of information and to compute the probability of queries to KBs.
The thesis then proposes a suite of algorithms for reasoning with KBs following DISPONTE:
• BUNDLE, for “Binary decision diagrams for Uncertain reasoNing on Description Logic thEories”, computes the probability of queries w.r.t. DISPONTE KBs by means of the tableau algorithm and knowledge compilation. BUNDLE is based on Pellet, a state-of-the-art reasoner, and is written in Java.
• TRILL, for “Tableau Reasoner for descrIption Logics in Prolog”, performs inference over DISPONTE KBs with the tableau algorithm implemented in the declarative Prolog language. Prolog is useful for managing the nondeterminism of the reasoning process.
• TRILLP, for “TRILL powered by Pinpointing formulas”, differs from TRILL because it encodes the set of all explanations for queries with a more compact Boolean formula.
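To make the link between explanations and query probability concrete, the following is a minimal sketch, not the code of any of the three reasoners. It assumes that the explanations of a query have already been found, that each explanation is given as the set of probabilistic axioms it uses (certain axioms are left out), and that probabilistic axioms are selected independently, as in the distribution semantics. The function name, the axiom labels `a1` to `a3` and the probability values are hypothetical. The sketch enumerates every world, which is exponential in the number of axioms; compiling the explanations into a binary decision diagram, as BUNDLE does, avoids this enumeration.

```python
from itertools import product


def query_probability(axiom_probs, explanations):
    """Sum the probabilities of the worlds that contain at least one explanation.

    axiom_probs:  dict mapping a (hypothetical) axiom label to its probability.
    explanations: iterable of frozensets of axiom labels.
    """
    names = sorted(axiom_probs)
    total = 0.0
    # A world is one choice, for every probabilistic axiom, of "included" or not.
    for choices in product([True, False], repeat=len(names)):
        world = {name for name, chosen in zip(names, choices) if chosen}
        # Independent axioms: the world's probability is a product of factors.
        weight = 1.0
        for name, chosen in zip(names, choices):
            weight *= axiom_probs[name] if chosen else 1.0 - axiom_probs[name]
        # The query holds in the world if some explanation is fully included.
        if any(explanation <= world for explanation in explanations):
            total += weight
    return total


# Hypothetical KB with three probabilistic axioms and two explanations.
axiom_probs = {"a1": 0.4, "a2": 0.3, "a3": 0.6}
explanations = [frozenset({"a1", "a3"}), frozenset({"a2", "a3"})]
p = query_probability(axiom_probs, explanations)
print(round(p, 3))
```

Here the two explanations share `a3`, so the query amounts to `a3` together with at least one of `a1` and `a2`; simply adding the probabilities of the two explanations would count the overlapping worlds twice.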
A second problem is that probability values are difficult for humans to set. However, information that can be leveraged for tuning these parameters is usually available. Moreover, terminological information in KBs may be incomplete or poorly structured. We thus need learning systems able to cope with these problems. We present two learning systems, one for each problem:
• EDGE, for “Em over bDds for description loGics paramEter learning”, learns the parameters of a DISPONTE KB.
• LEAP, for “LEArning Probabilistic description logics”, learns terminological axioms together with their parameters by using EDGE.
However, the size of the data is constantly increasing, leading to so-called Big Data. Datasets are often too large to be handled by a single machine in a reasonable time. Modern computing infrastructures such as clusters and clouds must be used, where the work is divided among different machines. We thus extended both EDGE and LEAP to exploit these facilities by implementing EDGEMR and LEAPMR, which distribute the work using a MapReduce approach.
All systems were tested on real-life problems and their performance was comparable or superior to that of state-of-the-art systems.
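The MapReduce idea can be illustrated with a toy sketch of one EM iteration for parameter learning. It is not EDGEMR's implementation. It assumes that every training example is a positive query given by its explanations, that expectations are computed by brute-force enumeration of worlds rather than on BDDs, and that the map step runs sequentially here instead of on separate machines. All function names, axiom labels and values are hypothetical. Initial probabilities must lie strictly between 0 and 1 so that every example has non-zero probability.

```python
from functools import reduce
from itertools import product


def expected_counts(axiom_probs, explanations):
    """Return {axiom: P(axiom is selected | query holds)} by enumerating worlds."""
    names = sorted(axiom_probs)
    p_query = 0.0
    joint = {name: 0.0 for name in names}
    for choices in product([True, False], repeat=len(names)):
        world = {name for name, chosen in zip(names, choices) if chosen}
        if not any(explanation <= world for explanation in explanations):
            continue  # the query does not hold in this world
        weight = 1.0
        for name, chosen in zip(names, choices):
            weight *= axiom_probs[name] if chosen else 1.0 - axiom_probs[name]
        p_query += weight
        for name in world:
            joint[name] += weight
    return {name: joint[name] / p_query for name in names}


def map_chunk(axiom_probs, chunk):
    """Map step: one worker sums the expected counts over its own examples."""
    totals = {name: 0.0 for name in axiom_probs}
    for explanations in chunk:
        for name, value in expected_counts(axiom_probs, explanations).items():
            totals[name] += value
    return totals


def reduce_counts(left, right):
    """Reduce step: merge the partial sums of two workers."""
    return {name: left[name] + right[name] for name in left}


def em_iteration(axiom_probs, chunks):
    """One EM iteration: map over chunks, reduce, then re-estimate parameters."""
    partials = [map_chunk(axiom_probs, chunk) for chunk in chunks]
    totals = reduce(reduce_counts, partials)
    n_examples = sum(len(chunk) for chunk in chunks)
    return {name: totals[name] / n_examples for name in axiom_probs}


# Four hypothetical examples, split into two chunks of two examples each.
initial = {"a1": 0.5, "a2": 0.5}
chunks = [
    [[frozenset({"a1"})], [frozenset({"a1"})]],
    [[frozenset({"a2"})], [frozenset({"a1"}), frozenset({"a2"})]],
]
updated = em_iteration(initial, chunks)
print(updated)
```

Because the expected counts are plain sums over examples, the result does not depend on how the examples are split into chunks, which is what makes the expectation step easy to divide among machines.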
IOS Press, Inc.