Instance Retrieval for Class Expression Learning Using SPARQL

Karalis, Nikolaos; Bigerl, Alexander; Demir, Caglar; Heidrich, Liss; Ngonga Ngomo, Axel-Cyrille

doi:10.3233/FAIA250207

Abstract

Over 100 billion RDF assertions are available on the Web. With this high availability of knowledge expressed in RDF knowledge bases comes the need to make them amenable to Web-scale machine learning. Knowledge graph embeddings enable the use of knowledge bases in neural settings, but they do so by discarding the explicit semantics that underpin knowledge bases. In contrast, class expression learning makes explicit use of the semantics of RDF knowledge bases expressed in description logics. Unlike neural approaches, this form of machine learning generates models that can be translated into natural language and can thus be understood by domain experts. However, most implementations of this paradigm fail to scale to the large knowledge bases found on the Web and in real-life applications. The corresponding literature suggests that one common bottleneck of these approaches is the instance retrieval function. We address this drawback by introducing an approach based on worst-case optimal multi-way joins for the evaluation of SPARQL queries that correspond to ALC class expressions. We implement our algorithm in a tensor-based triple store and use this triple store as a backend to efficiently answer retrieval queries in ALC under the closed-world assumption. We evaluate the implementation of our approach on five benchmark datasets against four state-of-the-art graph storage solutions for RDF knowledge graphs. The results of our extensive evaluation show that our approach outperforms its competition across all datasets and that it is the only one able to scale to large datasets. With our approach, class expression learning can now be used on Web-scale knowledge bases.
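
To make the notion of an instance retrieval function concrete, the following is a minimal Python sketch of closed-world retrieval for ALC class expressions over a small in-memory list of triples. It is not the approach described in the abstract: it uses naive set operations rather than SPARQL queries, worst-case optimal joins, or a triple store. The class names (Atomic, Not, And, Or, Exists, ForAll), the functions individuals and retrieve, and the toy knowledge base KB are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple, Union

Triple = Tuple[str, str, str]
RDF_TYPE = "rdf:type"  # assumed marker for class assertions in this toy example


@dataclass(frozen=True)
class Atomic:
    name: str


@dataclass(frozen=True)
class Not:
    operand: "Concept"


@dataclass(frozen=True)
class And:
    left: "Concept"
    right: "Concept"


@dataclass(frozen=True)
class Or:
    left: "Concept"
    right: "Concept"


@dataclass(frozen=True)
class Exists:
    role: str
    filler: "Concept"


@dataclass(frozen=True)
class ForAll:
    role: str
    filler: "Concept"


Concept = Union[Atomic, Not, And, Or, Exists, ForAll]


def individuals(triples: List[Triple]) -> Set[str]:
    """Collect every subject, plus every object of a non-type triple."""
    found: Set[str] = set()
    for s, p, o in triples:
        found.add(s)
        if p != RDF_TYPE:
            found.add(o)
    return found


def retrieve(concept: Concept, triples: List[Triple]) -> Set[str]:
    """Return the individuals that are instances of `concept`.

    Closed-world assumption: anything not asserted is treated as false,
    so negation is the complement with respect to the known individuals.
    """
    domain = individuals(triples)
    if isinstance(concept, Atomic):
        return {s for s, p, o in triples if p == RDF_TYPE and o == concept.name}
    if isinstance(concept, Not):
        return domain - retrieve(concept.operand, triples)
    if isinstance(concept, And):
        return retrieve(concept.left, triples) & retrieve(concept.right, triples)
    if isinstance(concept, Or):
        return retrieve(concept.left, triples) | retrieve(concept.right, triples)
    if isinstance(concept, Exists):
        filler = retrieve(concept.filler, triples)
        return {s for s, p, o in triples if p == concept.role and o in filler}
    if isinstance(concept, ForAll):
        filler = retrieve(concept.filler, triples)
        # An individual violates the restriction if some role successor
        # lies outside the filler; individuals without successors qualify.
        violators = {s for s, p, o in triples if p == concept.role and o not in filler}
        return domain - violators
    raise TypeError(f"unsupported concept: {concept!r}")


# Hypothetical toy knowledge base.
KB: List[Triple] = [
    ("anna", RDF_TYPE, "Person"),
    ("bob", RDF_TYPE, "Person"),
    ("rex", RDF_TYPE, "Dog"),
    ("tom", RDF_TYPE, "Cat"),
    ("anna", "hasPet", "rex"),
    ("bob", "hasPet", "tom"),
]

# Person AND (EXISTS hasPet.Dog)
dog_owners = retrieve(And(Atomic("Person"), Exists("hasPet", Atomic("Dog"))), KB)
print(sorted(dog_owners))  # ['anna']
```

Each call rescans the full triple list and recomputes the domain, which is adequate for a toy example but illustrates why retrieval becomes a bottleneck once a learner evaluates many candidate expressions over a large knowledge base.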
