In recent years, the field of ontology learning from text has attracted a lot of attention, resulting in a wide variety of approaches to the extraction of knowledge from textual data. Yet, results so far are still limited as the semantic gap between human language on the one hand and formalized knowledge on the other is significant. Knowledge formalized in the form of ontologies is declarative, explicit and in general monotonic and crisp. Knowledge expressed by human language is highly diluted, very implicit, vague and even defeasible.
As Brewster et al. [1] have correctly argued, when writing an article, authors assume a large body of background knowledge which they share with their community and potential readers, while focusing on a very specific aspect, i.e. on the specific message they want their text to convey. Thus, most of the knowledge in texts is actually very implicit and remains “under the surface”. Further, natural language lacks conceptual preciseness, allowing people to have very different conceptualizations and yet use the very same words to express them. In addition, for reasons of economy, people use language in a rather vague and underspecified way and are just precise enough to still allow reasonable communication. Thus, knowledge conveyed by means of language has to be considered as implicit, vague and defeasible in general and is consequently far away from current ontology models assuming knowledge that is defined declaratively, explicitly as well as in a crisp and monotonic manner.
By definition, an ontology is an explicit specification of a shared conceptualization (see [2] and [3]). In essence, it is thus a view on how the world or a specific domain is structured as agreed upon by the members of a community. Assuming that we have perfect natural language processing tools for extracting knowledge from text, it is still questionable whether we will be able to actually learn an ontology from text as the conceptualization behind an ontology is typically assumed to be the result of an intentional process. Ontologies therefore cannot be “learned” by machines in the strict sense of the word as they lack intention and purpose. Instead, ontology learning may only support an ontology engineer in defining their conceptualization of a particular part of the world, e.g. a technical domain, on the basis of empirical evidence derived from textual and other data.
If we adopt such a view of ontology learning, we have to conclude however that a number of important questions in this regard remain largely unanswered by the current literature:
• textual evidence: What kind of empirical textual evidence should an ontology engineer actually consider when modelling an ontology?
• evidence-based agreement: How can we foster the process of consensus building and agreement by presenting empirical evidence derived from data for different design choices?
• data-driven ontology engineering: On a more general note, what should the role of data-driven ontology learning be in the overall process of ontology engineering?
• methodological integration: How should ontology learning tools be integrated into a larger framework for ontology engineering from a methodological point of view?
• user interface: What is the best way to support an ontology engineer in presenting empirical evidence at the user interface level and what is the optimal way for a user to interact with such a system?
At least four different research communities may contribute to answering these questions: natural language processing, machine learning, knowledge representation/engineering and user interface design. In fact, it seems to us that the above questions can only be addressed through an interdisciplinary research program across these research communities. We will briefly elaborate why.
The natural language processing community has so far applied their best techniques to the task of ontology learning, mainly for term extraction and for learning paradigmatic relations between terms such as synonymy (see [4] and [5]), hypernymy (see [6,7]) and meronymy (see [8]). However, these are lexical relations which do not hold between concepts with explicitly defined intensions. Lexical relations do not in fact map straightforwardly to relations between concepts. Subsumption between concepts, for instance, is defined extensionally: A is a subconcept of B iff every A is also a B. In contrast, according to Lyons [9], hypernymy is defined as “the relation which holds between a more specific, or subordinate, lexeme and a more general, or superordinate, lexeme”, which is clearly not equivalent to the definition above in terms of subsumption of extension. Typically, hypernym relations are indicated through so-called diagnostic frames. In the case of hypernymy, one useful diagnostic frame is “An X is a kind/type of Y” (see [10]). However, such diagnostic frames clearly lack the necessary preciseness. First of all, they do not distinguish whether the terms are roles (in the sense of OntoClean [11]) or actually types (concepts). Thus, student and person can actually be found in such a diagnostic frame, as in “A student is a person who studies”, whereas the first is clearly a (material) role and the second a type. Similar remarks hold for the meronymy relation. It is well-known in artificial intelligence that there are various types of part-of relations (compare [12]) that clearly cannot be differentiated from each other through diagnostic frames. In summary, an important problem is that there is neither a straightforward mapping from terms in language to concepts with a well-defined intension and extension, nor can lexical relations be mapped to ontological relations in the general case (see also [13]).
NLP research has in many cases ignored such intricate questions in knowledge acquisition and focused instead on learning paradigmatic relations between linguistic objects. In this sense, stronger bonds between the NLP and knowledge representation communities are definitely needed.
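To make the limitation of diagnostic frames concrete, the following is a minimal sketch of how the frame “An X is a kind/type of Y” might be matched in text. The regular expression, the function name extract_candidates and the sample sentences are our own illustrative assumptions and are not taken from any of the cited systems.

```python
import re

# Hypothetical surface pattern for the diagnostic frame
# "An X is a kind/type of Y"; single-word terms only, for illustration.
FRAME = re.compile(
    r"\b(?:an?|the)\s+(?P<hypo>\w+)\s+is\s+a\s+(?:kind|type)\s+of\s+(?P<hyper>\w+)",
    re.IGNORECASE,
)


def extract_candidates(text):
    """Return (hyponym, hypernym) term pairs matched by the frame."""
    return [
        (m.group("hypo").lower(), m.group("hyper").lower())
        for m in FRAME.finditer(text)
    ]


# Invented sample text: the first sentence relates two types,
# the second relates a role (student) to a type (person).
sample = "A sparrow is a kind of bird. A student is a type of person."
pairs = extract_candidates(sample)
print(pairs)
```

Both pairs are returned in exactly the same form, so nothing in the output signals that student is a role while sparrow is a type. Any such distinction would have to come from ontological analysis by an engineer and not from the frame itself.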
The machine learning community provides a large number of sound techniques for data-driven (inductive) learning but, with a few exceptions, is in fact quite opposed to the idea of learning ontologies. Ontologies are logical theories and declarative by nature. Machine learning is in principle concerned with developing analytical models that explain data. In its supervised form (compare [14]), such models serve prediction purposes, i.e. classifying novel examples. In unsupervised learning, one aims to discover regularities or patterns in data such as homogeneous groups or clusters (see [15]) or general associations (see for instance [16]). Many techniques from unsupervised machine learning such as clustering and mining associations have been applied to ontology learning. Mädche and Staab have for example used association rules to discover relations between (lexicalizations of) concepts (compare [17]) and Cimiano et al. have used clustering techniques to group and hierarchically arrange words (see [18]). Most of the papers in this volume also apply machine learning techniques in some way, in particular clustering (Brunzel, Poesio et al.), classification (Poesio et al.), memory-based learning (Tanev et al.) as well as induction of patterns from examples (Pantel et al., Alfonseca et al.). However, analytical models as considered in machine learning are generally not declarative in the sense of a logical theory. Some branches of machine learning research have indeed aimed at learning declarative logical theories from data. This is the case for example for Inductive Logic Programming (ILP) [19]. However, theories learned from data through ILP differ crucially from ontologies. The latter reflect a shared understanding of a domain of interest, produced as the byproduct of reflection and consensus within a certain community and thus representing a commitment to a specific conceptualization.
For logical theories derived inductively from data, it is unclear to what extent they can be seen as expressing a shared conceptualization. The most promising way of applying inductive techniques seems to be in ontology refinement. First blueprints in this direction can be found in the works of Lisi and Esposito [20] and Rudolph et al. [21]. In general, it seems to us that an important avenue for future machine learning work in ontology learning is to systematically analyze the question of how inductively derived models, classifications, associations etc. can support an ontology engineer in formulating or refining their conceptualization in the form of an ontology, seeing ontology learning always as an interactive and cooperative process between an ontology engineer and a system (see also the definition of ontology learning in [22]).
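As an illustration of the kind of empirical evidence that association mining can offer an ontology engineer, the following sketch computes support and confidence for pairs of co-occurring terms. It is a simplified toy version of association rule mining and not the procedure used in [17]; the function name association_rules, the thresholds and the term sets are hypothetical.

```python
from itertools import permutations


def association_rules(transactions, min_support=0.5, min_confidence=0.6):
    """Return rules (a, b, support, confidence) for term pairs a -> b.

    support    = fraction of transactions containing both a and b
    confidence = fraction of transactions containing a that also contain b
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))
    rules = []
    for a, b in permutations(items, 2):
        both = sum(1 for s in sets if a in s and b in s)
        with_a = sum(1 for s in sets if a in s)
        support = both / n
        confidence = both / with_a if with_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


# Invented toy data: each list holds the terms found in one sentence.
sentences = [
    ["hotel", "room"],
    ["hotel", "room", "price"],
    ["hotel", "room"],
    ["city", "price"],
]
rules = association_rules(sentences, min_support=0.5, min_confidence=0.6)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

A rule such as hotel -> room only states that the two terms tend to co-occur. Whether this reflects a part-of relation, an attribute or something else remains a modelling decision for the ontology engineer, which is why such output is best treated as evidence and not as an ontology.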
The knowledge representation community has focused traditionally on methods for efficient reasoning and inference, but to a large extent neglected the following issues: i) integrating insights from linguistics into ontology development (with the exception of some of the work on DOLCE [23]), ii) integrating ontology learning into methodologies for engineering ontologies, and iii) integrating knowledge representation and inferencing paradigms which are closer to the way knowledge is expressed in human language (notable exceptions being the work on computing with words of Zadeh [24], the work on natural logic [25] or the conceptual graphs of Sowa [26]). The linguistics community has in fact developed category systems based on linguistic principles that could be integrated in ontologies, for example Vendler's verb categories [27] or the so-called ‘Aktionsarten’ [28]. Ontologists have largely neglected such distinctions, which might be useful exactly in bridging the gap between text and knowledge. While there is some work on integrating machine learning into traditional knowledge acquisition and engineering methodologies such as CommonKADS [29], the integration of ontology learning with more recent ontology engineering methodologies such as On-To-Knowledge [30], DILIGENT [31] or METHONTOLOGY [32] has not been approached to a satisfactory extent. A first step in this direction is included in this volume (Paslaru-Bontas et al.), while methodological issues related to the interplay between linguistic analysis and ontology engineering are addressed by Aussenac-Gilles et al. (also included in this volume).
Finally, the contribution from the user interface community is urgently needed in ontology learning. We have argued above that ontology learning cannot be, by its very nature, fully automatic. On the contrary, ontology engineering is a highly interactive task in which a user interacts with a system that presents empirical textual evidence in support of the human task of modelling a particular domain. Novel user interface paradigms are needed here. First blueprints considering usability aspects can be found in the work of Wang et al. [33] and Missikoff et al. [34]. Unfortunately, we have no contribution on this issue included in this volume.
In summary, ontology learning research in which the “ontology” is taken seriously requires a joint effort of various communities. Through this volume we therefore aim at forging stronger bonds between these by presenting promising research from the different communities in one collection. In this way we hope to have contributed to the development of a more integrated and cross-disciplinary approach to ontology learning. We hope that this book will stimulate further research in the field and encourage researchers to increasingly tackle also the harder challenges in ontology learning as outlined above.
Paul Buitelaar, Philipp Cimiano, Saarbrücken/Karlsruhe, November 2007
References
[1] C. Brewster, F. Ciravegna, and Y. Wilks. Background and foreground knowledge in dynamic ontology construction. In Proceedings of the SIGIR Semantic Web Workshop, 2003.
[2] T.R. Gruber. A translation approach to portable ontology specifications. Knowledge Acquisition, 5(2):199–220, 1993.
[3] R. Studer, R. Benjamins, and D. Fensel. Knowledge engineering: Principles and methods. Data & Knowledge Engineering, 25(1-2):161–197, 1998.
[4] P.D. Turney. Mining the web for synonyms: PMI-IR versus LSA on TOEFL. In Proceedings of the 12th European Conference on Machine Learning (ECML), pages 491–502, 2001.
[5] D. Lin. Automatic retrieval and clustering of similar words. In Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics (COLING-ACL), pages 768–774, 1998.
[6] M.A. Hearst. Automatic acquisition of hyponyms from large text corpora. In Proceedings of the 14th International Conference on Computational Linguistics (COLING), pages 539–545, 1992.
[7] K. Ahmad and H. Fulford. Knowledge processing: Semantic relations and their use in elaborating terminology. Technical report, University of Surrey, 1992.
[8] M. Berland and E. Charniak. Finding parts in very large corpora. In Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics (ACL), pages 57–64, 1999.
[9] J. Lyons. Semantics: Volume 1. Cambridge University Press, 1977.
[10] D. Cruse. Lexical Semantics. Cambridge University Press, 1986.
[11] C.A. Welty and N. Guarino. Supporting ontological analysis of taxonomic relationships. Data & Knowledge Engineering (DKE), 39(1):51–74, 2001.
[12] A. Artale, E. Franconi, N. Guarino, and L. Pazzi. Part-whole relations in object-centered systems: An overview. Data & Knowledge Engineering, 20(3):347–383, 1996.
[13] J. Völker, P. Hitzler, and P. Cimiano. Acquisition of OWL DL axioms from lexical resources. In Proceedings of the European Semantic Web Conference (ESWC), pages 670–685, 2007.
[14] T. Mitchell. Machine Learning. McGraw Hill, 1997.
[15] A.K. Jain and R.C. Dubes. Algorithms for Clustering Data. Prentice Hall, 1988.
[16] R. Agrawal, T. Imielinski, and A.N. Swami. Mining association rules between sets of items in large databases. SIGMOD Record, 22(2):207–216, 1993.
[17] A. Mädche and S. Staab. Discovering conceptual relations from text. In Proceedings of the 14th European Conference on Artificial Intelligence (ECAI), pages 321–325, 2000.
[18] P. Cimiano, A. Hotho, and S. Staab. Learning concept hierarchies from text corpora using formal concept analysis. Journal of Artificial Intelligence Research (JAIR), 24:305–339, 2005.
[19] N. Lavrac and S. Dzeroski. Inductive Logic Programming: Techniques and Applications. Ellis Horwood, 1994.
[20] F. Lisi and F. Esposito. Two orthogonal biases for choosing the intensions of emerging concepts in ontology refinement. In Proceedings of the European Conference on Artificial Intelligence (ECAI), pages 765–766, 2006.
[21] S. Rudolph, J. Völker, and P. Hitzler. Supporting lexical ontology learning by relational exploration. In Proceedings of the International Conference on Conceptual Structures (ICCS), pages 488–491, 2007.
[22] A. Mädche and S. Staab. Handbook of Ontologies, chapter Ontology Learning, pages 173–190. Handbook of Information Systems. Springer, 2004.
[23] C. Masolo, S. Borgo, A. Gangemi, N. Guarino, and A. Oltramari. Ontology library (final). WonderWeb deliverable D18.
[24] L.A. Zadeh. From computing with numbers to computing with words – from manipulation of measurements to manipulation of perceptions. IEEE Transactions on Circuits and Systems, 45:105–119, 1999.
[25] L.S. Moss. Natural language, natural logic, natural deduction. Draft, available from the author.
[26] J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
[27] Z. Vendler. Verbs and times. The Philosophical Review, 66:143–160, 1957.
[28] B. Comrie. Aspect: Introduction to the Study of Verbal Aspect and Related Problems. Cambridge University Press, 1976.
[29] W. van de Velde. Machine learning issues in CommonKADS. KADS-II Project Deliverable D2.11, 1992.
[30] Y. Sure, S. Staab, and R. Studer. Methodology for development and employment of ontology-based knowledge management applications. SIGMOD Record, 31(4):18–23, 2002.
[31] D. Vrandecic, H.S. Pinto, Y. Sure, and C. Tempich. The DILIGENT knowledge processes. Journal of Knowledge Management, 9(5):85–96, 2005.
[32] M. Fernandez-Lopez, A. Gomez-Perez, and N. Juristo. METHONTOLOGY: From ontological art towards ontological engineering. In Proceedings of the AAAI Spring Symposium on Ontological Engineering, pages 33–40, 1997.
[33] Y. Wang, J. Völker, and P. Haase. Towards semi-automatic ontology building supported by large-scale knowledge acquisition. In Proceedings of the AAAI Fall Symposium On Semantic Web for Collaborative Knowledge Acquisition, pages 70–77, 2006.
[34] M. Missikoff, R. Navigli, and P. Velardi. The usable ontology: An environment for building and assessing a domain ontology. In Proceedings of the International Semantic Web Conference (ISWC), pages 39–53, 2002.