Equipping machines with commonsense as well as domain-specific knowledge in order to endow them with human-like understanding of certain problem domains has been and still is a main goal of artificial intelligence research. In this context, a crucial question is how high the cost actually is for encoding all the relevant knowledge in such a way that it can be exploited by machines for automatic reasoning, inconsistency detection, etc. While there has been some recent work on developing methodologies allowing us to estimate the cost of knowledge engineering projects [12], it is legitimate to assume that not all the relevant knowledge can be encoded manually. Techniques that can extract and discover knowledge by analyzing human behaviour and data produced as a result thereof can offer an important contribution in this respect.
The field of ontology learning, a term coined by Alexander Mädche and Steffen Staab in 2001 [7], is concerned with the development of methods that can induce relevant ontological knowledge from data. The field can by now look back on more than ten years of intense research. Early research in the field focused on applying shallow methods to term and concept extraction as well as hierarchical and non-hierarchical relation extraction [7]. Later on, in my PhD thesis with the title “Ontology Learning and Population from Text: Algorithms, Evaluation and Applications”, I defined ontology learning as the acquisition of a domain model from data and attempted to provide a systematic overview of ontology learning tasks by introducing the so-called ontology learning layer cake, which has received wide attention since then. In recent years, several researchers have attempted to increase the expressivity of the ontologies learned from text data, in particular by attempting to extract deeper axiomatic knowledge (see e.g. [13], [14] and [4]). Some contributions along these lines can also be found in this volume, e.g. aiming at learning OWL axioms by applying inductive techniques (cf. Lehmann et al. [5] and Lisi [6] in this volume).
The problem of ontology learning has turned out to be much more difficult than expected. The main reason for this is, in my view, that an ontology always reflects a way of conceptualizing the world or a given domain, while the results of an ontology learning algorithm that learns from a set of data essentially reflect the idiosyncrasies of the dataset in question. As such, turning the results of an ontology learning algorithm into an ontology that actually reflects the conceptualization one has of a domain can be more costly than building the ontology from scratch. A second problem so far has been the lack of applications for automatically learned ontologies. While Hotho, Bloehdorn and I [2] showed some positive impacts of automatically learned ontologies on classification and clustering tasks, not many other convincing applications were in sight at that stage. Recently, however, ontology learning has seen interesting applications in inducing the semantics of tags used in social media data and folksonomies [1]. Meilicke et al. have also shown that automatically induced knowledge, disjointness axioms in particular, can be deployed to debug ontology mappings [9]. The fact that such applications are emerging is a good sign that there is progress in the field.
A further interesting and very promising application potential for ontology learning lies in the field of Linked Data. Learning from Linked Data will allow us to induce schemata in a bottom-up fashion and let the schema evolve with the data. Ontology population will also continue to play a crucial role in taming and structuring the large amount of unstructured data available, e.g. in scientific publications. In many application domains, one needs to consider all data together in order to extract key facts and knowledge, structure this knowledge in the form of a database in order to aggregate and summarize the data, and provide analytical procedures that support decision making by experts.
In the mid-term, I foresee two very interesting research directions for ontology learning. When modelling knowledge, it is relatively easy to model “the obvious” and straightforward knowledge in a particular domain. However, the “not-so-obvious” and more complex relationships are harder to come up with for a knowledge engineer. This is where algorithms that induce more complex and non-trivial relationships from data can assist a human in the process of modelling more complex axioms. This is especially relevant and valuable in the Linked Data era where not many people seem to want to put effort into axiomatizing the vocabulary used in their datasets.
This is closely related to the second future direction I regard as crucial within the field of ontology learning. So far, there has not been much focus on how humans and machines can collaborate on the task of modelling relevant knowledge for a given domain. The role of machine agents should be to derive interesting axioms and relationships from data, generating hypotheses induced from data and asking the human to validate and clarify them. Humans would then rely on their domain knowledge to confirm the induced knowledge or reject it by providing counter-examples. Methodologies that define and clarify the role of machines and humans in ontology engineering and ontology learning are urgently needed in order to exploit the capabilities of both humans and machines optimally. In this volume, for instance, there is one contribution by Simperl et al. [11] that shows how humans can be involved in the process of ontology learning through games with a purpose. Unless we have good methodologies incorporating the human in the loop, I dare to predict that major breakthroughs in ontology learning cannot be expected.
Concerning applications, in my view there is one very important application for ontology learning, namely natural language processing (NLP). Given that language processing inherently requires world knowledge to guide the interpretation process [3], it is striking that so far – at least to my knowledge – there have not been many convincing applications of proper OWL ontologies within NLP. We may wonder why this is the case. Is the field of NLP not aware of the techniques developed in the Semantic Web community? Or are existing ontology learning techniques too noisy for researchers who are used to working with highly curated, hand-crafted resources such as WordNet, FrameNet, etc.? Or is it simply the case that the knowledge contained in OWL ontologies and produced by state-of-the-art ontology learning tools is not adequate for NLP purposes? Being far from knowing the answer, let me speculate a bit about the reasons:
• Lack of coverage: Domain ontologies – whether learned or not – typically have a limited coverage and scope, whereas NLP researchers are typically used to working with domain-independent resources such as WordNet, FrameNet, etc. An NLP researcher would have to work with many, possibly overlapping ontologies.
• Limited quality: Automatically learned ontologies might be noisy but can still be useful, as some applications have shown. Using them also requires a paradigm shift towards working with imperfect, possibly noisy, incomplete or even logically inconsistent ontologies.
• Limitations of crisp knowledge: In human language, the meaning of words is often vague, so that prototypes or distributions might be better suited to NLP applications than crisp knowledge representation formalisms.
• No need for expressive knowledge (yet!): With the statistical turn in the 80s and 90s, NLP focused mainly on statistical approaches to natural language processing, moving away from purely symbolic approaches. However, one observes a move back towards symbolic approaches as people realize more and more that inference is crucial for NLP. New models such as Markov Logic Networks [10], which combine statistics with symbolic, first-order theories, have in fact received wide attention in the NLP and Semantic Web communities. They might, thus, offer a point of convergence for both communities. Concentrating on learning probabilistic knowledge will thus be an important avenue for future work.
• Lack of awareness: The NLP community is not particularly aware of what is going on in the Semantic Web and ontology learning communities. This is corroborated by the fact that only a few NLP researchers attend Semantic Web conferences. The number of references to Semantic Web related work in papers at major NLP conferences is practically zero, as an informal empirical study by Josef van Genabith showed (he presented this observation at the Dagstuhl Seminar on the “Multilingual Semantic Web” in September 2012). There is a lot that the Semantic Web community could do about this, e.g. organizing tutorials and workshops at NLP conferences, but also regarding NLP as a potential consumer of ontologies – whether learned or not. A number of recent activities in the Semantic Web field, such as the development of NIF (Hellmann et al., this volume) or lemon, a model for the lexicon-ontology interface [8] (see also the standardization activities by the ontolex group: http://www.w3.org/community/ontolex), have already contributed to creating important synergies between both communities and to their convergence.
In the future, ontology learning research should, in my opinion, concentrate not only on what we can learn, but also on the purposes and applications for which the learned knowledge might be useful. Only then will we be able to create a strong case for Semantic Web technologies, ontology learning techniques in particular, outside the Semantic Web community. Having said this, I kindly invite you as a reader to immerse yourself in the current state of the art in ontology learning research and enjoy the great book that Jens Lehmann and Johanna Völker have compiled. For sure, there is no better book to learn about recent developments in ontology learning research. Thanks to Jens and Johanna for editing this great book and for working on this topic so enthusiastically in order to contribute to the further evolution of this (still) very exciting and important field.
References
[1] Dominik Benz, Christian Körner, Andreas Hotho, Gerd Stumme, and Markus Strohmaier. One tag to bind them all: Measuring term abstractness in social metadata. In Grigoris Antoniou, Marko Grobelnik, Elena Simperl, Bijan Parsia, Dimitris Plexousakis, Jeff Pan, and Pieter De Leenheer, editors, Proceedings of the 8th Extended Semantic Web Conference (ESWC 2011), 2011.
[2] Stephan Bloehdorn, Philipp Cimiano, and Andreas Hotho. Learning ontologies to improve text clustering and classification. In Myra Spiliopoulou, Rudolf Kruse, Andreas Nürnberger, Christian Borgelt, and Wolfgang Gaul, editors, From Data and Information Analysis to Knowledge Engineering: Proceedings of the 29th Annual Conference of the German Classification Society (GfKl 2005), March 9-11, 2005, Magdeburg, Germany, volume 30, pages 334–341. Springer, Berlin–Heidelberg, Germany, 2006.
[3] Philipp Cimiano, Christina Unger, and John McCrae. Ontology-based Interpretation of Natural Language. Synthesis Lectures on Human Language Technology. Morgan & Claypool Publishers. to appear.
[4] Daniel Fleischhacker, Johanna Völker, and Heiner Stuckenschmidt. Mining RDF data for property axioms. In Proceedings of the Confederated International Conferences (OTM), pages 718–735, 2012.
[5] Jens Lehmann, Nicola Fanizzi, and Claudia d'Amato. Concept learning. In Jens Lehmann and Johanna Völker, editors, Perspectives on Ontology Learning, Studies on the Semantic Web. AKA Heidelberg / IOS Press, 2014.
[6] Francesca Lisi. Learning onto-relational rules with inductive logic programming. In Jens Lehmann and Johanna Völker, editors, Perspectives on Ontology Learning, Studies on the Semantic Web. AKA Heidelberg / IOS Press, 2014.
[7] Alexander Maedche and Steffen Staab. Ontology learning for the semantic web. IEEE Intelligent Systems, 16(2):72–79, 2001.
[8] J. McCrae, G. Aguado-de Cea, P. Buitelaar, P. Cimiano, T. Declerck, A. Gómez-Pérez, J. Gracia, L. Hollink, E. Montiel-Ponsoda, D. Spohr, and T. Wunner. Interchanging lexical resources on the Semantic Web. Language Resources and Evaluation, 2012.
[9] Christian Meilicke, Johanna Völker, and Heiner Stuckenschmidt. Learning disjointness for debugging mappings between lightweight ontologies. In Proceedings of the 16th International Conference on Knowledge Engineering: Practice and Patterns, pages 93–108, 2008.
[10] Matthew Richardson and Pedro Domingos. Markov logic networks. Machine Learning, 62(1-2):107–136, 2006.
[11] Elena Simperl, Stephan Wölger, Stefan Thaler, and Katharina Siorpaes. Learning ontologies via games with a purpose. In Jens Lehmann and Johanna Völker, editors, Perspectives on Ontology Learning, Studies on the Semantic Web. AKA Heidelberg / IOS Press, 2014.
[12] Elena Simperl, Tobias Bürger, Simon Hangl, Stephan Wörgl, and Igor O. Popov. Ontocom: A reliable cost estimation method for ontology development projects. Journal of Web Semantics, 16:1–16, 2012.
[13] Johanna Völker, Pascal Hitzler, and Philipp Cimiano. Acquisition of OWL DL axioms from lexical resources. In Proceedings of the 4th European Semantic Web Conference (ESWC), pages 670–685, 2007.
[14] Johanna Völker, Denny Vrandecic, York Sure, and Andreas Hotho. Learning disjointness. In Proceedings of the 4th European Semantic Web Conference (ESWC), pages 175–189, 2007.