Ebook: Formal Ontology in Information Systems
Since ancient times, ontology, the analysis and categorisation of what exists, has been fundamental to philosophical enquiry. But, until recently, ontology has been seen as an abstract, purely theoretical discipline, far removed from the practical applications of science. However, with the increasing use of sophisticated computerised information systems, solving problems of an ontological nature is now key to the effective use of technologies supporting a wide range of human activities. The ship of Theseus and the tail of Tibbles the cat are no longer merely amusing puzzles. We employ databases and software applications to deal with everything from ships and ship building to anatomy and amputations. When we design a computer to take stock of a ship yard or check that all goes well at the veterinary hospital, we need to ensure that our system operates in a consistent and reliable way even when manipulating information that involves subtle issues of semantics and identity. So, whereas ontologists may once have shied away from practical problems, now the practicalities of achieving cohesion in an information-based society demand that attention must be paid to ontology.
Researchers in such areas as artificial intelligence, formal and computational linguistics, biomedical informatics, conceptual modeling, knowledge engineering and information retrieval have come to realise that a solid foundation for their research calls for serious work in ontology, understood as a general theory of the types of entities and relations that make up their respective domains of inquiry. In all these areas, attention is now being focused on the content of information rather than on just the formats and languages used to represent information. The clearest example of this development is provided by the many initiatives growing up around the project of the Semantic Web. And, as the need for integrating research in these different fields arises, so does the realisation that strong principles for building well-founded ontologies might provide significant advantages over ad hoc, case-based solutions. The tools of formal ontology address precisely these needs, but a real effort is required in order to apply such philosophical tools to the domain of information systems. Reciprocally, research in the information sciences raises specific ontological questions which call for further philosophical investigations.
The purpose of FOIS is to provide a forum for genuine interdisciplinary exchange in the spirit of a unified effort towards solving the problems of ontology, with an eye to both theoretical issues and concrete applications. In our call for papers, we asked for contributions reporting work in a wide range of areas, all of which are important to the development of formal ontologies:
Foundational Issues:
• Kinds of entity: particulars vs. universals, continuants vs. occurrents, abstracta vs. concreta, dependent vs. independent, natural vs. artificial
• Formal relations: parthood, identity, connection, dependence, constitution, subsumption, instantiation
• Vagueness and granularity
• Identity and change
• Formal comparison among ontologies
• Ontology of physical reality (matter, space, time, motion, …)
• Ontology of biological reality (genes, proteins, cells, organisms, …)
• Ontology of mental reality (mental attitudes, emotions, …)
• Ontology of social reality (institutions, organizations, norms, social relationships, artistic expressions, …)
• Ontology of the information society (information, communication, meaning negotiation, …)
• Ontology and natural language semantics, ontology and cognition, ontology and epistemology, semiotics
Methodologies and Applications:
• Top-level vs. application ontologies
• Role of reference ontologies; Ontology integration and alignment
• Ontology-driven information systems design
• Requirements engineering
• Knowledge engineering
• Knowledge management and organization
• Knowledge representation; Qualitative modeling
• Computational lexica; Terminology
• Information retrieval; Question-answering
• Semantic web; Web services; Grid computing
• Domain-specific ontologies, especially for: Linguistics, Geography, Law, Library science, Biomedical science, E-business, Enterprise integration
Out of the 76 papers submitted to FOIS-06, 29 were selected by the Programme Committee, with the help of a number of extra reviewers (listed in the following section on Conference Organisation). With few exceptions, all papers have been refereed by three experts. On behalf of the Organising Committee, we would like to thank the members of the Programme Committee and the additional reviewers for their careful work and constructive suggestions, which have helped us to produce a very high quality conference programme. We are also extremely grateful to the two invited speakers, Doug Lenat and Antony Galton, for enthusiastically agreeing to speak at FOIS. Finally, we would like to thank the Conference Chair, Nicola Guarino, the Local Chair, Bill Andersen, the Publicity Chair, Leo Obrst, the Website Administrator, Sira Greco, and Allan Third for help with editing the camera-ready copy. The hard work and good will of all these people have contributed to the success of FOIS-06.
Brandon Bennett, Christiane Fellbaum
Though Cyc is a formal ontology, the process of building it, over the past 22 years, has been a passionately empirical process. We have had several surprises along the way, some of them scientific, some engineering, and some sociological. For instance, the requirement to represent arbitrary pieces of commonsense knowledge led us, in the mid-1980s, against our intuitions, to move to an increasingly expressive formal representation language. By 1990, we had to admit that the dream of a “Final Encyclopedia” of correct knowledge was a chimera, and what we needed to focus on was a tapestry of locally-consistent “micro-theories” containing contextualized knowledge. Since then, we have begun to work out the fine structure of these micro-theories, their important attributes and the ways in which they relate to each other, and to appreciate the surprising complexity of the calculi required to formally reason across them. We have also experienced a tipping-point, methodologically, over the past few years, as the ontology has grown large enough to serve as an inductive bias for further knowledge acquisition. That is, Cyc increasingly actively helps with its own continuing expansion, and by now almost all the activity going on at Cycorp is related to semi-automatic learning from corpora (including the Web) of text and structured sources, whereas as recently as three years ago the majority of the activity here was a cadre of ontological engineers manually writing more axioms to expand the Cyc Knowledge Base. We have also developed and used, and in most cases discarded, a series of interfaces, training paradigms, and so on, as the ontology has grown. In the talk, I shall survey what we used, and when, and why we moved on. Most of the reasons have to do with the ontology outgrowing the tools, or increasing variety among the types of users and ontological engineers.
Finally, I will discuss some of our ongoing research efforts, and ongoing interface efforts, which are becoming increasingly intermingled — and why that is perhaps inevitable.
The purpose of this talk is to advocate a particular way of thinking about processes and their relationship to objects and events. The point of view put forward is unorthodox in that it regards processes as being in some ways more closely akin to objects than to events, specifically with regard to their relationship to the directly experienced world and their capacity for undergoing change over time. A consequence of this is that the traditional distinction between continuants and occurrents becomes overshadowed by a more prominent distinction, that between the world of direct experience (made up of, inter alia, objects and processes) and the world of historical record (made up of events). In conclusion, a number of remarks are offered concerning the implications of this shift of viewpoint for formal ontology.
The world of ontology development is full of mysteries. Recently, ISO Standard 15926 (“Lifecycle Integration of Process Plant Data Including Oil and Gas Production Facilities”), a data model initially designed to support the integration and handover of large engineering artefacts, has been proposed by its principal custodian for general use as an upper level ontology. As we shall discover, ISO 15926 is, when examined in light of this proposal, marked by a series of quite astonishing defects, which may however provide general lessons for the developers of ontologies in the future.
Ontologies describe a conceptualization of a part of the world relevant to some application. What are the units of conceptualizations? Current ontologies often equate concepts with words from natural languages. Words are certainly not the smallest units of conceptualization, and neither are the sets of synonyms of WordNet or other linguistically justified units. I suggest taking distinctions as basic units and constructing concepts from them, whereas other approaches start with concepts and discover properties that distinguish them. Distinctions separate concepts and produce a taxonomic lattice, which contains the named concepts together with other potential conceptual units. The taxa are organized in a superclass/subclass (better supertaxa/subtaxa) relation, and for any two taxa there is always a single least common supertaxon. Algorithms to maintain such a taxonomic structure and methods to combine different taxonomies are shown, using a four-valued (relevance) logic as introduced by Belnap [1]. The novel aspect of the method is that distinctions that are only meaningful in the context of other distinctions restrict the lattice of concepts to the meaningful subset.
The approach is restricted to the is_a relation between classes; it relates to Formal Concept Analysis, but replaces the “formal attributes” with (necessary) distinctions and uses a four-valued logic. It stresses the focus of recent ontological studies like DOLCE or WonderWeb on qualities; it is expected that distinctions as introduced here for the is_a hierarchy influence the mereological aspects of an ontology (i.e., the part_of relation) and connect to Gibson's affordances [2] and contribute to the classification of operations.
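For readers unfamiliar with the four-valued logic mentioned above, the following is a minimal Python sketch of Belnap's four truth values and their standard connectives. It is offered only as background: the class name `Belnap`, the field names, and the `merge` helper are illustrative assumptions, and the sketch does not reproduce the taxonomy algorithms of the paper, which the abstract does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belnap:
    """One of Belnap's four values, encoded as two independent flags.

    told_true / told_false record whether some source asserted the
    statement or its negation (hypothetical field names).
    """
    told_true: bool
    told_false: bool

    def __invert__(self) -> "Belnap":
        # Negation swaps the evidence for and against.
        return Belnap(self.told_false, self.told_true)

    def __and__(self, other: "Belnap") -> "Belnap":
        # Conjunction: true only if both are true, false if either is false.
        return Belnap(self.told_true and other.told_true,
                      self.told_false or other.told_false)

    def __or__(self, other: "Belnap") -> "Belnap":
        # Disjunction: true if either is true, false only if both are false.
        return Belnap(self.told_true or other.told_true,
                      self.told_false and other.told_false)

    def merge(self, other: "Belnap") -> "Belnap":
        # Pool the information of two sources (join in the knowledge order).
        return Belnap(self.told_true or other.told_true,
                      self.told_false or other.told_false)


NEITHER = Belnap(False, False)  # no information
TRUE = Belnap(True, False)      # told true only
FALSE = Belnap(False, True)     # told false only
BOTH = Belnap(True, True)       # told true and told false
```

Under this encoding, pooling two sources that disagree on a distinction, as might happen when combining taxonomies, gives `TRUE.merge(FALSE) == BOTH`, while a distinction that neither source mentions stays `NEITHER`.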
This article reflects an ongoing effort to systematize the use of terms applied by philosophers and computer scientists in the context of ontology and ontological engineering. We show that a common reference terminology is needed to connect terms in representational artifacts to what they mean ontologically. Without such a reference, statements in and about knowledge representation languages will be ambiguous, both as between various languages and within a single language.
We identify problems common to a number of knowledge representation languages used to formalize ontologies. We show that a reference terminology can be used to disambiguate the meanings of some, and to reveal ontological problems in other, evidently confused, statements in and about different representation languages. Our final conclusion is not that our proposed terminology is the ultimate one to serve as a common reference; rather, we argue that it is necessary to have such a standard with well-defined terms linked to an axiomatized theory, if unambiguous cross-paradigm and cross-language communication is to be achieved.
In line with Nirenburg and Raskin's paradigm of ontological semantics, we adhere to the basic tenet that natural language semantics needs to be captured with respect to an explicitly formalized ontology. Many researchers in computational semantics, however, have neglected the ontological aspects of meaning representation, and even more have neglected aspects of meaning representation related to domain-independent ontologies, i.e. foundational or upper-level ontologies. In this paper we argue for a stronger integration of foundational ontologies in computational semantics. We show that relying on foundational ontologies can, on the one hand, lead to a clean separation between domain-specific and domain-independent components of natural language processing systems. On the other hand, we show how the interplay between foundational ontologies, domain ontologies and lexical semantic resources can elegantly account for disambiguation and allow non-trivial inferences to be drawn. Further, a temporal theory compliant with the foundational ontology is absolutely necessary for supporting temporal reasoning in natural language understanding.
We present a theory of granular parthood based on qualitative cardinality and size measures. Using standard mereological relations and qualitative, context-dependent relations such as roughly the same size, we define a granular parthood relation and distinguish different ways in which a collection of smaller objects may sum to a larger object. At one extreme, an object x may be a mereological sum of a large collection p where the members of p are all negligible in size with respect to x (e.g., x is a human body and p is the collection of its molecules). At the other extreme, x may be a mereological sum of a collection q none of whose members are negligible in size with respect to x (e.g., x is again a human body and q is the collection consisting of its head, neck, torso, and limbs).
We cannot give precise quantitative definitions for relations such as roughly the same size or negligible in size with respect to since these are, even within a fixed context, vague relations. The primary focus in the formal theory presented in this paper is on the context-independent logical properties of these qualitative cardinality and size relations and their interaction with mereological relations. In developing our formal theory, we draw upon work on order of magnitude reasoning.
We discuss how the spatial extent of physical endurants influences the conceptualization of their spatial qualities. Comparing the spatial dimensionality of a physical endurant with the spatial dimensionality of its qualities leads to an interesting formal ontological question. Should a spatial quality be conceptualized as having a value range instead of a single value when its bearer has a higher spatial dimensionality? For example, the one-dimensional depth quality can be conceptualized as having a value range when it is assigned to the three-dimensional water body of a lake. In terms of the foundational ontology DOLCE, the “value” of a quality, sometimes called a quale, is located at an atomic region at a certain time. Allowing a value range at a time amounts to modeling qualities as being located at non-atomic regions at a time. This might be philosophically debatable, yet this modeling approach enables the development of information discovery systems that can cope with ontologically imprecise user queries and can assist the user in defining ontologically precise quality specifications. This brings formal ontology closer to practical applications.
The investigation is based on the foundational ontology DOLCE and introduces a classification for spatial qualities based on their spatial dimensionality.
Biomedical ontologies define entities and relations in order to represent knowledge in the biomedical domain. In this paper we concentrate on the domain of medical imaging. In previous work, we analyzed a representative sample of computed tomography reports in order to determine to which entities and relations the terms used in such reports refer (with regard to the Foundational Model of Anatomy (FMA) and the recently published Open Biomedical Ontology (OBO) Relation Ontology, respectively) in order to construct an imaging ontology for electronic reporting in radiology. In this paper we expand the role of two OBO relations in particular, as they may be applied to radiological image information: the relations located_in and adjacent_to. Defining these relations in terms of the basic topological relations of Region Connection Calculus (RCC), we show how the qualitative description of image feature locations in radiological reporting may be formalized for reasoning.
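As a rough illustration of how qualitative location statements might be checked against image data, the sketch below models regions as sets of pixel coordinates and approximates two RCC relations on them. It rests on several assumptions that the abstract does not confirm: that located_in corresponds to RCC parthood (P), that adjacent_to corresponds to external connection (EC), and that 4-neighbour contact on a pixel grid is an acceptable stand-in for topological connection. All function names and the toy regions are hypothetical.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]
Region = Set[Pixel]


def _neighbours(p: Pixel) -> Set[Pixel]:
    """Return the 4-connected neighbours of a pixel."""
    x, y = p
    return {(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)}


def overlaps(a: Region, b: Region) -> bool:
    """Two regions overlap if they share at least one pixel."""
    return bool(a & b)


def touches(a: Region, b: Region) -> bool:
    """Some pixel of a has a 4-neighbour that belongs to b."""
    return any(_neighbours(p) & b for p in a)


def located_in(a: Region, b: Region) -> bool:
    """Assumed reading: a is part of b (every pixel of a lies in b)."""
    return bool(a) and a <= b


def adjacent_to(a: Region, b: Region) -> bool:
    """Assumed reading: a and b share no pixel but are in contact."""
    return not overlaps(a, b) and touches(a, b)


# Toy regions on a small grid (purely illustrative, not real image data).
liver: Region = {(x, y) for x in range(4) for y in range(4)}
lesion: Region = {(1, 1), (1, 2)}
kidney: Region = {(4, 0), (4, 1)}
```

With these toy regions, `located_in(lesion, liver)` and `adjacent_to(liver, kidney)` both hold, while `adjacent_to(lesion, kidney)` does not, because the lesion lies in the interior of the liver and never touches the kidney.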
The increasing need for advanced ontology-based knowledge management in the life sciences is generally acknowledged but, up to now, the development of biological ontologies has lacked adherence to foundational principles of ontology design. This is particularly true of so-called upper-level ontologies such as the GENIA ontology, which covers biological continuants and has mainly been devised for corpus annotation in a text mining context. As an alternative, we introduce BIOTOP, an upper ontology of physical continuants in the domain of biology, with a coverage similar to the GENIA ontology. We report on design specifications and modeling decisions for BIOTOP which are based upon formal ontology principles. As a major desideratum, these continuants are described in terms of necessary and sufficient conditions. We accomplished this goal for 85 out of the 146 existing GENIA classes. We use OWL-DL as a formal knowledge representation language and may thus use a terminological reasoner for classification in order to check and maintain consistency during the ontology engineering phase.
The field of BioInformatics has become a major venue for the development and application of computational ontologies. Ranging from controlled vocabularies to annotation of experimental data to reasoning tasks, BioOntologies are advancing to form a comprehensive knowledge foundation in this field. With the Glycomics Ontology (GlycO), we are aiming at providing both a sufficiently large knowledge base and a schema that allows classification of and reasoning about the concepts we expect to encounter in the glycoproteomics field. The schema exploits the expressiveness of OWL-DL to place restrictions on relationships, thus making it suitable to be used as a means to classify new instance data. On the instance level, the knowledge is modularized to address granularity issues regularly found in ontology design. Larger structures are semantically composed from smaller canonical building blocks. The information needed to populate the knowledge base is automatically extracted from several partially overlapping sources. In order to avoid multiple entries, transformation and disambiguation techniques are applied. An intelligent search is then used to identify the individual building blocks that model the larger chemical structures. To ensure ontological soundness, GlycO has been annotated with OntoClean properties and evaluated with respect to those. In order to facilitate its use in conjunction with other biomedical Ontologies, GlycO has been checked for NCBO compliance and has been submitted to the OBO website.
This paper examines the concepts biological function (BF) and functioning as they are used in recent work on formal ontology and its applications in the biomedical domain. My purpose is not to offer an entirely new definition of BF. My objectives are: (1) to find out the basic features of BF mentioned in the reviewed articles; (2) to make more explicit the description of BF already present in those articles by relating it to an ontological category system; and (3) to emphasize the distinction between three cases of predication involving BFs, a distinction that should be taken into account when designing an information system. Hopefully, the results will make a contribution to the goal of providing a general, objective description of biological functions.
Some events recur, and some happen only once. Galton refers to the latter as “once-only” events [1]. In a first-order logic of events that makes a type-token distinction, the possibility of concurrent occurrences of the same event renders the characterization of the intuitive once-onliness not very intuitive. In particular, the paradigmatic case of the nth occurrence of a recurring event is shown to be not necessarily once-only. Counter-examples give rise to a classification of events based on the temporal relations among their occurrences. The problematic cases turn out to be those events that involve an indefinite individual; we call these indefinitely-specified events. We consider two options. The first is to restrict our event ontology, as has been implicitly done in most logics of events, to events that are definitely-specified. The second is to admit all sorts of events into our ontology and distinguish those that are definitely-specified from those that are not by statements in the object language. We opt for a representation of events as functional terms in the logic, and those terms denoting indefinitely-specified events seem to inevitably contain variables. Such non-ground terms turn out to be semantically problematic. To smoothly resolve these problems, we adopt Shapiro's logic of arbitrary and indefinite objects in which indefinite individuals are denoted by special terms [2]. Thus, indefinitely-specified events are naturally represented by functional terms with at least one argument denoting an indefinite individual.
Some temporal ontologies require a way of enforcing the temporal qualification of certain assertions—those about changing entities. In a knowledge representation language based on first–order logic, this is straightforwardly done by having a category of temporal regions and augmenting predicates with an additional argument place for the time at which a given predicate holds.
Here, I address the problem of representing entities changing over time and enforcing temporal qualification in first–order languages with predicates at most binary. It is possible, I argue, using temporal entities known as perdurants (events or processes)—towards which binary languages seem prima facie biased. There is however virtually no ontological cost for an ontology which in addition to changing entities recognizes changes, events and processes. Temporal knowledge representation therefore is not a lost cause even with languages with syntax and semantics limited to the representation of binary relations.
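The contrast between a time-indexed ternary predicate and a purely binary encoding can be sketched in Python as follows. This is a generic reification pattern written under stated assumptions, not the author's formal proposal: each time-qualified assertion is replaced by an intermediate node standing in for a state or process, and the relation names `instance_of`, `has_participant`, `has_value` and `occurs_at` are hypothetical.

```python
from itertools import count
from typing import Iterable, List, Tuple

# (predicate, subject, value, time), e.g. weighs(tibbles, 4kg, t1)
TimedFact = Tuple[str, str, str, str]
# (relation, node, object): binary relations only
BinaryFact = Tuple[str, str, str]


def reify(facts: Iterable[TimedFact], prefix: str = "state") -> List[BinaryFact]:
    """Rewrite time-indexed facts using binary relations only.

    Each fact becomes a fresh node linked to its predicate, subject,
    value and time by four binary assertions.
    """
    ids = count(1)
    out: List[BinaryFact] = []
    for predicate, subject, value, time in facts:
        node = f"{prefix}{next(ids)}"
        out.append(("instance_of", node, predicate))
        out.append(("has_participant", node, subject))
        out.append(("has_value", node, value))
        out.append(("occurs_at", node, time))
    return out


def holds_at(facts: Iterable[BinaryFact], predicate: str,
             subject: str, time: str) -> List[str]:
    """Recover the values a predicate assigns to a subject at a time."""
    fact_set = set(facts)
    nodes = {n for (rel, n, obj) in fact_set
             if rel == "instance_of" and obj == predicate}
    return sorted(
        value
        for (rel, n, value) in fact_set
        if rel == "has_value"
        and n in nodes
        and ("has_participant", n, subject) in fact_set
        and ("occurs_at", n, time) in fact_set
    )


timed = [("weighs", "tibbles", "4kg", "t1"),
         ("weighs", "tibbles", "5kg", "t2")]
binary = reify(timed)
```

Here `holds_at(binary, "weighs", "tibbles", "t1")` returns `["4kg"]`, so the temporal qualification survives even though no relation in `binary` has more than two arguments.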
The focus of this paper is on actions in which agents employ instruments in order to achieve desired outcomes. I explore the ontological structure of such actions and the semantic features of the sentences by means of which we refer to these actions. The logical framework for this philosophical enterprise is the theory of the so-called stit operator: … see to it that …. I modify the original formulation in such a way that we can represent those events in which agents see to things with the help of physical objects. As a result, I obtain a formal theory of the operator of instrumental stit: … see to it that … with the help of ….
A variety of disciplines and research areas have separately studied the notions of action, agents and agency, but no integrated and well-developed formal ontology for them is currently available. This paper is a first attempt at bridging this gap, focusing especially on the relationship between agency and action.
The departure point is STIT logic, the most expressive among the current logics of agency. Agency is the relationship between an agent and the states of affairs it brings about, without referring to how this is done, i.e., the actions performed. Since ontological investigations are best done in a first-order framework, making explicit at the language level the domain of quantification, we first propose a first-order theory that is proved equivalent to the propositional modal logic STIT. The domain and language of this theory are then extended to cover actions, obtaining the theory we call OntoSTIT+.
We are on Mars again – the favourite laboratory for philosophical experiments. Our host colleagues introduce us to some Martian stuff referred to as “T”, and ask us to help them to identify T on other possible worlds. Or, technically speaking, we are asked to determine the intension of “T”, i.e., what the term designates with respect to different possible worlds. Following a short series of experiments on the planet, we conclude that the intension of “T” depends upon three factors: (1) The semantic rule linked with the term, i.e., the way in which the term is designed to pick out its referent with respect to different possible worlds (e.g., as a definite description, or as a proper name, or as an actualised description etc.); (2) The properties of the referent of “T” in the actual world; and, (3) What we shall call 'the metaphysical background of the universe', i.e., what counts as a thing vs. what counts as a property of things (e.g., whether the universe is such that it contains material objects that merely happen to have their manifest properties, or whether the universe primarily contains manifest objects that merely happen to have their material constitution). As our experiments show, changing the values of any of these variables will result in a change in the reference of the term with respect to different possible worlds, viz., it will result in a change in the intension of the term. We then demonstrate how the three variables are interrelated, and specify how exactly they combine to produce a particular intension of a term. We conclude with a general “formula” that determines what will deserve to be called “T” relative to the different values of the above variables, i.e., we come up with a calculator of intensions. Finally, we also draw some morals about rigidity.
Natural languages are easy to learn by infants, they can express any thought that any adult might ever conceive, and they accommodate the limitations of human breathing rates and short-term memory. The first property implies a finite vocabulary, the second implies infinite extensibility, and the third implies a small upper bound on the length of phrases. Together, they imply that most words in a natural language will have an open-ended number of senses — ambiguity is inevitable. Peirce and Wittgenstein are two philosophers who understood that vagueness and ambiguity are not defects in language, but essential properties that enable it to accommodate anything that people need to say. In analyzing the ambiguities, Wittgenstein developed his theory of language games, which allow words to have different senses in different contexts, applications, or modes of use. Recent developments in lexical semantics, which are remarkably compatible with the views of Peirce and Wittgenstein, are based on the recognition that words have an open-ended number of dynamically changing and context-dependent microsenses. The resulting flexibility enables natural languages to adapt to any possible subject from any perspective for any humanly conceivable purpose. To achieve a comparable level of flexibility with formal ontologies, this paper proposes an organization with a dynamically evolving collection of formal theories, systematic mappings to formal concept types and informal lexicons of natural language terms, and a modularity that allows independent distributed development and extension of all resources, formal and informal.