Ebook: Information Modelling and Knowledge Bases XVII
The number of abstraction levels of information, the size of databases and knowledge bases, and the amount and complexity of information stored on the WWW are continuously growing. The aim of this series of Information Modelling and Knowledge Bases is to bring together experts from different areas who share a common interest in understanding and solving problems of information modelling and knowledge bases, as well as in applying the results of research to practice. We aim to recognize and pursue research on new topics in the area of information modelling and knowledge bases, but also in connected areas such as philosophy and logic, cognitive science, knowledge management, linguistics, multimedia, the theory and practice of the Semantic Web, software engineering and business management. The papers in this book represent a valuable advancement in information modelling and knowledge bases research and practice.
Information modelling and knowledge bases are becoming very important topics not only in academic communities related to information systems and computer science but also in the business field of information technology.
Currently, the structural complexity of information resources, the variety of abstraction levels of information, and the size of databases and knowledge bases are continuously growing. We face complex problems of structuring, sharing, managing, searching and mining data and knowledge from the large volume of complex information resources held in databases and knowledge bases. New methodologies in many areas of information modelling and knowledge bases are expected to provide sophisticated and reliable solutions to these problems.
The aim of this series of Information Modelling and Knowledge Bases is to provide research communities in information modelling and knowledge bases with scientific results and experiences achieved by using innovative methodologies in computer science and in related disciplines such as linguistics, philosophy, and psychology.
These interdisciplinary research results reflect a common interest in understanding and solving problems of information modelling and knowledge bases, as well as in applying the results to practical application areas.
The research topics in this series concentrate on a variety of themes in the following important domains:
• theoretical and philosophical basis of concept modelling and conceptual modelling,
• conceptual modelling, information modelling and specification,
• conceptual models in intelligent activity,
• collections of data, knowledge, and descriptions of concepts,
• human-computer interaction and modelling,
• database and knowledge base systems,
• software engineering and modelling and
• applications for information modelling and knowledge bases.
It is important to recognize, study and share new areas of information modelling and knowledge bases on which great attention is being focused. Therefore, cognitive science, knowledge management, linguistics, philosophy, psychology, logic, and management science are relevant areas as well. This is reflected in the number of research results dealing with multimedia databases, WWW information management, and temporal-spatial data models. These new directions are pushing the frontier of knowledge and creating novel ways of modelling the real world.
To achieve these aims in this series, the international program committee selected 16 full papers, 11 short papers and 1 position paper from 38 submissions through a rigorous reviewing process.
The selected papers cover many areas, including information modelling, concept theories, database semantics, knowledge bases and systems, software engineering, WWW information management, context-based information access spaces, ontological technology, image databases, temporal and spatial databases, document data management, and more.
The Program Committee consisted of 32 well-known researchers from the areas of information modelling, concept theories, conceptual modelling, database theory, knowledge bases, information systems, linguistics, philosophy, logic, image processing, temporal and spatial databases, document data management and other related fields. We are very grateful for their great work in reviewing the papers.
We hope that the series of Information Modelling and Knowledge Bases will be productive and valuable in advancing research and practice in these academic areas.
The Editors, Yasushi Kiyoki, Jaak Henno, Hannu Jaakkola, Hannu Kangassalo
This paper introduces a model designed to describe and annotate documents. The model is largely indebted to the standard Entity-Relationship approach and the ODMG data model, and adds specific features related to information evolution and flexibility requirements. It has not been implemented as a brand new environment but mapped to existing ones. Relational database management systems have been chosen as the target implementation in order to respond to scalability demands.
The current state of Semantic Web ontology languages is briefly described, and the ontology languages are characterised from the logical point of view. Generally, these languages are based on first-order predicate logic enriched with ad hoc higher-order constructs wherever needed. We argue that in the Semantic Web we need a rich language with transparent semantics in order to build up metadata at the conceptual level of the Semantic Web architecture. A powerful logical tool, Transparent Intensional Logic (TIL), is described, which provides a logico-semantic framework for fine-grained knowledge representation and conceptual analysis. TIL is based on a rich ontology of entities organised in an infinite ramified hierarchy of types. The conceptual role of TIL in building ontologies is described, and we show that such a system can serve as a unifying logical framework. In conclusion, we argue that the conceptual and logical levels of the Web architecture play an important role and deserve due attention.
Ordering objects of interest according to a given criterion often provides a useful piece of information for solving an information processing problem. A new method of linear ordering based on Formal Concept Analysis (FCA) and the Möbius function is described. The method is illustrated with the example of selecting important key indicators in a specific geographical area. An indicator is defined as a characteristic number (weight) representing, in a unique way, certain important features of the area in question. Weights are assigned to each indicator in different categories. Using FCA, interdependencies among the indicators are analysed and the indicators are sorted according to their importance and uniqueness in the area description. A new method based on the Conjugate Möbius Inversion function (CMI) was used to evaluate the set of indicators selected as representatives for a pilot area. The output of the method is a sequence of objects/indicators ordered according to the calculated importance of each indicator. The method described in the article appears to be an important contribution to the evaluation of regional competitiveness in the 5th Framework Programme RTD project “Iron Curtain”.
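As a toy illustration of the FCA machinery involved (a sketch only, not the paper's CMI-based weighting; the areas, indicators and the simple extent-size weight below are hypothetical), a formal context can be represented as object-attribute sets, the derivation operators computed directly, and indicators ranked by how distinctive their extents are:

```python
# Toy formal context: geographical areas (objects) and indicators (attributes).
# All names are hypothetical; the extent-size weight is only an illustrative
# stand-in for the Conjugate Moebius Inversion weighting used in the paper.
context = {
    "area_A": {"tourism", "industry"},
    "area_B": {"tourism"},
    "area_C": {"tourism", "industry", "agriculture"},
}

def extent(indicator):
    """All objects (areas) that possess the given indicator."""
    return {obj for obj, attrs in context.items() if indicator in attrs}

def intent(objects):
    """All indicators shared by every object in the given set."""
    sets = [context[o] for o in objects]
    return set.intersection(*sets) if sets else set()

indicators = set().union(*context.values())

# Rank indicators: a smaller extent means the indicator is more distinctive
# for describing an area, so it is ranked as more "important" here.
for ind in sorted(indicators, key=lambda a: len(extent(a))):
    print(ind, sorted(extent(ind)))
```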
In this paper, we present a learning system with a Semantic Spectrum Analyzer that realizes appropriate and sharp semantic vector spaces for semantic associative search. In semantic associative search systems, a learning system is essential for obtaining semantically related and appropriate information from multimedia databases. We propose a new learning algorithm with a Semantic Spectrum Analyzer for semantic associative search. The Semantic Spectrum Analyzer is essential for adapting retrieval results to individual variation and for improving the accuracy of the retrieval results. The learning algorithm is applied to adjust retrieval results to keywords and retrieval-candidate data. The Semantic Spectrum Analyzer makes it possible to extract semantically related and appropriate information for adjusting the initial positions of semantic vectors to positions adapted to individual query requirements.
In this paper, we present the detailed theory and implementation of an automatic adaptive metadata generation system using content analysis of sample images, together with a variety of experimental results. Instead of relying on costly human-created metadata, our method ranks sample images by distance computation on their structural similarity to query images, and automatically generates metadata as textual labels that represent geometric structural properties of the sample images most similar to the query images. First, our system screens out improper query images for metadata generation by using CBIR, which computes structural similarity between sample images and query images; we have realized automatic selection of proper threshold values in this screening module. Second, the system generates metadata by selecting sample indexes attached to the sample images that are structurally similar to the query images. Third, the system detects improper metadata and re-generates proper metadata by identifying wrongly selected metadata. Our system has improved metadata generation by 23.5% in recall ratio and 37% in fallout ratio compared with using the results of content analysis alone, even with more practical experimental figures. The system is extensible to various types of specific object domains through the inclusion of computer vision techniques.
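A minimal sketch of this three-stage flow, assuming precomputed structural feature vectors and a hypothetical cosine-based similarity in place of the system's CBIR distance computation (threshold values and helper names are illustrative only):

```python
import numpy as np

def structural_similarity(f1, f2):
    # Cosine similarity between precomputed structural feature vectors;
    # a simple stand-in for the system's CBIR distance computation.
    return float(np.dot(f1, f2) /
                 (np.linalg.norm(f1) * np.linalg.norm(f2) + 1e-12))

def generate_metadata(query_features, samples, accept_threshold=0.8):
    """samples: list of (feature_vector, labels) pairs with human-made labels."""
    # 1) Screening: reject query images with no structurally similar sample.
    scored = sorted(((structural_similarity(query_features, f), labels)
                     for f, labels in samples),
                    key=lambda pair: pair[0], reverse=True)
    best_score, best_labels = scored[0]
    if best_score < accept_threshold:
        return None  # improper query image for metadata generation

    # 2) Generation: take the labels of the most similar sample as metadata.
    metadata = set(best_labels)

    # 3) Verification: keep only labels supported by further similar samples,
    #    a crude stand-in for the detection of wrongly selected metadata.
    supported = set()
    for score, labels in scored[1:4]:
        if score >= accept_threshold:
            supported |= set(labels)
    return metadata & supported if supported else metadata
```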
Web applications, which are computer programs ported to the Web, allow us to use various remote services and tools through our Web browsers. There are an enormous number of Web applications on the Web, and they are becoming part of the basic infrastructure of everyday life. At the same time, multimodal character agents, which interact with human users through both verbal and nonverbal behavior, have recently seen remarkable development. It would be of great benefit if we could easily modify existing Web applications by adding a multimodal user interface to them. This paper proposes a framework in which IntelligentPad and the Multimodal Presentation Markup Language work in collaboration to introduce multimodal character agents to the front-end of existing Web applications. Example applications include attaching a multimodal user interface to a news site on the Web. The framework does not require users to write any program code or script.
Privacy is becoming a major issue of social, ethical and legal concern on the Internet. The development of information technology and the Internet has major implications for the privacy of individuals. Studies of private or personal information have related it to such concepts as identifiability, secrecy, anonymity, and control, but few studies have examined the basic features of this type of information. Database models for this type of information (e.g., the 'Hippocratic' database [AKX02]) have also been proposed. This paper studies the nature of private information and develops a new conceptual model for databases that contain exclusively private information. The model utilizes the theory of infons to define “private infons” and develops a taxonomy of these private infons based on the notions of proprietorship and possession. The proposed model also specifies different privacy rules and principles, derives their enforcement, and develops and tests an architecture for this type of database.
Description Logics are a well-known formalism used in artificial intelligence. We present an approach where a Description Logic is used as an intuitive modelling language. Unlike many other modelling methods, this provides a sound semantic basis for modelling. We indicate that the database implementation is effective and unambiguous, since there is a direct mapping between our formalism and dependencies in the relational model. We also give an example applying the method to the Semantic Web.
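As a simple illustration of such a mapping (our own example, not taken from the paper; the relation and attribute names are hypothetical), a concept inclusion axiom corresponds directly to an inclusion dependency over the relations storing the concepts' instances:

```latex
% Illustrative only; relation and attribute names are hypothetical.
\mathit{Manager} \sqsubseteq \mathit{Employee}
\quad\Longrightarrow\quad
\mathit{Manager}[\mathit{id}] \subseteq \mathit{Employee}[\mathit{id}]
```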
As Grid technologies become more mature, tools and methods are needed to enable interoperability and a common understanding of the components in different Grid implementations. In this paper, we survey the existing efforts (GLUE, OGSA) and unify them with emerging WWW technologies. We concentrate on modelling the structure of a Grid computing element (CE) and propose a design of a Grid resource broker so that it can make better use of the information about the CEs when a user requests a Grid resource or service. We expand our view to consider Grid services and data management as well.
An application of NKRL (Narrative Knowledge Representation Language) techniques to (declassified) 'terrorism' documents has been carried out in the context of the IST Parmenides project. To allow the broadest possible exploitation of the 'semantic content' of the original documents, this application has required integrating the two main inference modes of NKRL, 'hypotheses' and 'transformations'. The paper describes the conceptual problems encountered and the solutions adopted.
Formal and semi-formal transfer of knowledge in conventional object-oriented software development is seriously impaired, because it is impossible to assure both the completeness and consistency of the initial body of knowledge from which the knowledge transfer may commence. In this paper we propose that better utilization of the formal transfer of knowledge requires focusing on the functional aspects of the problem domain to a much higher degree than is currently customary. We propose that the two-hemisphere-model-based approach, in which the problem domain knowledge is expressed in terms of the business process model and the conceptual model, offers an effective as well as efficient knowledge transfer mechanism, provided that sufficiently complete, consistent, explicit and structured problem domain knowledge is available.
Recently we proposed a new scheme aiming at scene recognition by understanding human activities, objects and environment from video images including these constituents. The most significant difference of this approach from conventional ones is the proposal of a method for cooperative understanding of humans' actions and related objects through analyzing the motion of humans' faces and hands. It is based on the idea that the usefulness and functions of an object a person is going to deal with can be inferred from analyzing the person's movements, because human motions, especially the trajectories of the face and hands, together with the relative position and movement of the objects, are closely related to the usage and states of the objects. In this paper, we describe the above-mentioned scheme, focusing on the structure of the database and its usage in inference for recognizing human movements and objects.
As the Internet and mobile Internet have been expanding widely, we can now obtain many kinds of information easily and in large quantities. However, it is still difficult, for example, for tourists to get reliable and useful information about locations that they have never visited. In this paper we propose a reliable and useful information distribution system, called the “Kuchicomi Network,” to provide “kuchicomi,” or word-of-mouth, information that is supplied by local informants through mobile phones or PCs connected to the Internet. We introduce the concept of, and business model for, a “Kuchicomi Network” and explain the system structure. We then describe features of the system and outline the development of a prototype system.
Fire is an extremely complex phenomenon and therefore fire spread prediction is not trivial. Spread prediction in grassland fires differs from prediction in forest fires, as the factors influencing each are not the same. Moreover, input values for the algorithms differ depending on the factors they consider to influence fire behavior, even though the algorithms might be applicable to surfaces with similar characteristics. Selecting the most suitable algorithm in each particular case is not a simple task.
In this paper, we describe an easily extensible object-oriented model for fire spread prediction on, potentially, any surface. This is of great importance since new mathematical algorithms can be added in order to predict fire spread on new surfaces. Additionally, we present an approach to derive unavailable data, required by the algorithms, from other available data. Finally, we provide a way to manage the selection of the “most suitable algorithm” from a collection of algorithms applicable to a particular type of surface, depending on the needs of the user (accuracy, execution time, etc.).
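A minimal sketch of the selection idea (not the paper's class model; the algorithm registrations, attribute scales and placeholder formulas are hypothetical):

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class SpreadAlgorithm:
    name: str
    surfaces: set            # surface types the algorithm is applicable to
    accuracy: float          # relative accuracy, hypothetical 0..1 scale
    cost: float              # relative execution cost, hypothetical 0..1 scale
    predict: Callable[[Dict[str, float]], float]  # inputs -> spread rate

def select_algorithm(algorithms, surface, prefer="accuracy"):
    """Pick the 'most suitable' applicable algorithm for the user's needs."""
    applicable = [a for a in algorithms if surface in a.surfaces]
    if not applicable:
        raise ValueError(f"no algorithm applicable to surface {surface!r}")
    if prefer == "accuracy":
        return max(applicable, key=lambda a: a.accuracy)
    return min(applicable, key=lambda a: a.cost)   # prefer execution time

# Hypothetical registrations; the formulas are placeholders, not real models.
algorithms = [
    SpreadAlgorithm("grass_simple", {"grassland"}, 0.6, 0.1,
                    lambda x: 0.1 * x["wind_speed"]),
    SpreadAlgorithm("grass_detailed", {"grassland"}, 0.9, 0.7,
                    lambda x: 0.08 * x["wind_speed"] + 0.02 * x["slope"]),
]
best = select_algorithm(algorithms, "grassland", prefer="accuracy")
print(best.name, best.predict({"wind_speed": 30.0, "slope": 5.0}))
```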
For professionals in the IT industry it is important to have easy access to knowledge about day-to-day business processes. Such knowledge, which is called method knowledge, can be made easily accessible through a knowledge repository system. One factor that determines the accessibility of method knowledge in such a repository is the way in which this knowledge is structured and made accessible. This paper presents a meta-modeling technique for modeling knowledge structures. The technique is called the Knowledge Entry Map and supports the design of knowledge repository systems based on topic maps. To validate the technique, it has been applied in two case studies in the professional IT industry: a software developer and a service provider. The case studies demonstrate the applicability of the meta-modeling technique for capturing the structure of method knowledge in IT organizations. Moreover, the case studies suggested the idea of a generic method knowledge structure for project-oriented and product-oriented IT organizations.
This paper describes a way to build a conceptual model for the diversified purposes of modelling Enterprise Architectures (EA). It is commonly known that, due to their complexity, Enterprise Architectures need to be considered from several viewpoints. This raises an integration problem: how to ensure that parallel EA models are consistent. We believe that the best way to solve this problem is to build a generic conceptual model (or an ontology) that is based on the purpose and needs of EA modelling rather than on the metamodels or modelling techniques of the prevailing (viewpoint-specific) domains of EA modelling. In other words, instead of aggregating existing sub-domains of EA, we should try to find the core concepts by analyzing the EA domain as a whole. We emphasize the importance of the process through which the conceptual model is produced. Therefore, besides the conceptual skeleton and its utilization, we provide an in-depth description of the modelling process we have developed and applied.
We add inflationary and non-inflationary fixed-points to higher-order logics. We show that, for every order, it is sufficient to increase the order of the given logic by one to capture inflationary fixed-points and by two to capture non-inflationary fixed-points. In both cases, restricting to the existential fragment of the corresponding logic turns out to be enough. This also holds for non-deterministic fixed-points.
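For reference, the standard stage-wise definitions on which these operators are built, stated for a generic operator F and independently of the paper's higher-order setting, are:

```latex
% Inflationary stages always grow; non-inflationary (partial) stages simply
% iterate F and yield a fixed-point only if the sequence becomes constant.
\text{inflationary:}\qquad X_0 = \emptyset, \qquad X_{i+1} = X_i \cup F(X_i) \\
\text{non-inflationary:}\quad X_0 = \emptyset, \qquad X_{i+1} = F(X_i)
```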
We propose a framework for discovering narrative relationships between objects on the WWW based on perspective information access. Today people can access Web resources easily; however, it is difficult to reach information that satisfies users' requests. Search engines are the most popular tools for finding information, provided the user can specify appropriate keywords that express the concrete contents of the required information. Our target users, however, are persons looking for new information related to other information, even when the relationship among them is not yet known. To support such users, we propose a perspective information access framework that shows paths to reach the required information. Such a perspective path corresponds to the narrativity between the source and the destination information.
We built a computer science text corpus/search engine called X-Tec. We automatically collected 2.98 million sentences (68.9 million words) from carefully chosen English computer science documents on the Web, taking 678 hours. We also built an interactive sample sentence query system and an automatic expression diagnostic system for graduate students. Our computer science text corpus/search engine can also be used for knowledge search and word co-occurrence frequency retrieval.
With the increase of digital image resources, image retrieval has received widespread research interest. A popular approach for retrieving relevant images from an image database is to match visual features such as histograms, color layout, textures and shapes automatically derived from images. However, visual similarity does not always match the retrieval results humans require. This problem is known as the gap between visual similarity and human semantics. In this paper, we present a method to bridge the gap. In our method, an image's edges and their relative position information are first derived. After that, independent factors hidden in the derived edge and position information are extracted by using a mathematical method, the Singular Value Decomposition (SVD). We present our analysis of the relationship between the extracted independent factors and human semantics. The most important contribution of this paper is that most independent factors extracted by our method are demonstrated to be related to human semantics, according to our experiments performed on 7,000 images.
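A minimal sketch of the factor-extraction step, assuming the edge and relative-position information has already been encoded as a per-image feature matrix (the matrix here is random and purely illustrative; relating the resulting factors to human semantics is what the paper's experiments address):

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((100, 64))      # 100 images x 64 edge/position features

# Center the data and apply SVD: rows of Vt are the independent factors
# (directions in feature space); U scaled by S gives each image's coordinates.
centered = features - features.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

k = 10                                 # keep the k strongest factors
factors = Vt[:k]                       # k x 64 factor directions
image_coords = U[:, :k] * S[:k]        # 100 x k per-image factor weights
print(factors.shape, image_coords.shape)
```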
Information systems and database systems development is a very complex process. In the past, database development has mainly been considered as the development of database structuring; interactivity specification, and partially functionality specification, have been neglected. The derivability of functionality has been one reason for this restriction of the database design approach. At the same time, applications and the required functionality have become more complex. Distribution specification is based on the description of collaboration and exchange frames. The integration of all these parts of the information systems design process has not yet been achieved. The co-design approach aims at bridging all these different aspects of applications and additionally provides a sound methodology for the development of all these aspects. The co-design framework is based on a number of steps that must be performed. These steps must be based on well-recorded management, on development stewardship and on a well-defined process and assessment framework. We used the SPICE framework for the evolution of an information systems development methodology. This framework can be used by organizations involved in planning, managing, monitoring, controlling, and improving the acquisition, supply, development, operation, evolution and support of software.
Database semantics (dbs) was developed originally to model the basic mechanism of natural language communication (Hausser 2001). This important application is extended here to another crucial task of cognitive modelling, namely the task of pattern completion during recognition. It consists in efficiently reconstructing a complex concept from its basic parts.
It is shown that the data structure of dbs, called a word bank, supports recognition by providing, for any elementary concept matching part of a complex external object, all potential candidates connected to it. In this way recognition is supplied with a limited set of concepts to actively try to match against other parts of the complex object. Once a second basic concept has been matched successfully and connected to the first, the database provides a much smaller list of potential candidates for a third basic concept, and so on. This algorithm converges very quickly.
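A toy sketch of this candidate-narrowing idea, assuming the word bank is represented as a mapping from each elementary concept (geon) to the complex concepts it occurs in (all concept names are hypothetical):

```python
# Hypothetical word bank: each elementary concept points to the complex
# concepts it is a part of.
word_bank = {
    "cylinder": {"cup", "torch", "table_leg"},
    "handle":   {"cup", "suitcase"},
    "sphere":   {"ball", "lamp"},
}

def narrow(matched_geons):
    """Intersect the candidate sets of all geons matched so far."""
    candidates = None
    for geon in matched_geons:
        linked = word_bank.get(geon, set())
        candidates = linked if candidates is None else candidates & linked
    return candidates or set()

print(narrow(["cylinder"]))            # {'cup', 'torch', 'table_leg'}
print(narrow(["cylinder", "handle"]))  # {'cup'}  -- candidate set shrinks fast
```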
The basic concepts are defined as geons, in accordance with the RBC (recognition by components) or geon theory of Biederman (1987). Geons are structures of a complexity intermediate between features and templates, e.g., cubes, spheres, or cylinders. The approach also works for the hypothetical reconstruction of the unseen side of a known object, explained by Barsalou (1999) on the basis of frames.
We study the effects of horizontal fragmentation in complex value databases on query processing using a query cost model. We show that optimisation of query processing and optimisation of fragment allocation are largely orthogonal to each other. We then show that if the selection predicates used for horizontal fragmentation are ordered according to their likely impact on the query costs, a binary search procedure can be adopted to find an “optimal” fragmentation and allocation.
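A minimal sketch of the binary search idea under one simplifying assumption: the predicates are already ordered by their likely impact on query costs, and the estimated cost of a fragmentation built from the first k predicates is (roughly) unimodal in k. The predicates and cost function below are hypothetical.

```python
def best_prefix_length(predicates, cost):
    """Binary search for the k in 0..len(predicates) minimising cost(k),
    where cost(k) estimates the query cost of a fragmentation built from
    the first k predicates; assumes cost is unimodal in k."""
    lo, hi = 0, len(predicates)
    while lo < hi:
        mid = (lo + hi) // 2
        if cost(mid) <= cost(mid + 1):
            hi = mid          # the minimum is at mid or to its left
        else:
            lo = mid + 1      # the minimum is to the right of mid
    return lo

# Hypothetical example: four ordered predicates and a toy unimodal cost curve.
predicates = ["region = 'EU'", "year >= 2000", "status = 'open'", "type = 'B'"]
toy_cost = lambda k: (k - 2) ** 2 + 10
print(best_prefix_length(predicates, toy_cost))   # -> 2
```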
Today, there is high demand for digital media archives that can archive a wide variety of academic resources and deliver customized content according to audiences' needs. Key challenges of digital media archives for academic resources are a) a methodology for embedding academically meaningful interpretations of resources, i.e. metadata, into the digital media data themselves, b) a search mechanism for such academic metadata, and c) a sophisticated access methodology for such digital archives. In this position paper, we describe our approach to these challenges and how we plan to implement the digital media archives in our project.