Ebook: Information Modelling and Knowledge Bases XXX
Information modeling and knowledge bases have become essential subjects in the last three decades, not only in academic communities related to information systems and computer science, but also in the areas of business where information technology is applied.
This book presents the proceedings of the 28th International Conference on Information Modelling and Knowledge Bases (EJC2018), held in Riga, Latvia from 4–8 June 2018. The aim of the conference was to bring together experts with a common interest in the understanding and solving of problems on information modelling and knowledge bases, as well as those from different areas of computer science and other disciplines who apply the results of research to practice. The 39 accepted papers collected here cover a variety of topics, including: conceptual modeling; knowledge and information modeling and discovery; linguistic modeling; cross-cultural communication and social computing; multimedia data modeling and systems; and environmental modeling and engineering.
The book will be of interest to researchers and practitioners alike, and to anyone wanting a better understanding of current advances in information technology.
In the last three decades, information modelling and knowledge bases have become essential subjects, not only in academic communities related to information systems and computer science, but also in the business areas where information technology is applied.
The series of International Conferences on Information Modelling and Knowledge Bases (EJC) originally started as a co-operation initiative between Japan and Finland in 1988, following a series of conferences held at the Scandinavian level since 1982. The practical operations were then organized by Professor Ohsuga in Japan and Professors Hannu Kangassalo and Hannu Jaakkola in Finland. The geographical scope has since expanded to cover first Europe and then other countries. In the title of the conference, the original “Scandinavian-Japanese” was replaced by “European-Japanese” in 1991 and by “International” in 2014. The workshop character – discussion, ample time for presentations and a limited number of participants – is still typical of the conference.
The 28th International Conference on Information Modelling and Knowledge Bases (EJC2018) constitutes a worldwide research forum for the exchange of scientific results. In this way a platform has been established which brings together researchers as well as practitioners in information modelling and knowledge bases. The main topics of the EJC conferences cover a variety of themes:
1. Conceptual modelling: Modelling and specification languages; Domain-specific conceptual modelling; Concepts, concept theories and ontologies; Conceptual modelling of large and heterogeneous systems; Conceptual modelling of spatial, temporal and biological data; Methods for developing, validating and communicating conceptual models.
2. Knowledge and information modelling and discovery: Knowledge discovery, knowledge representation and knowledge management; Advanced data mining and analysis methods; Conceptions of knowledge and information; Modelling information requirements; Intelligent information systems; Information recognition and information modelling.
3. Linguistic modelling: Models of HCI; Information delivery to users; Intelligent informal querying; Linguistic foundation of information and knowledge; Fuzzy linguistic models; Philosophical and linguistic foundations of conceptual models.
4. Cross-cultural communication and social computing: Cross-cultural support systems; Integration, evolution and migration of systems; Collaborative societies; Multicultural web-based software systems; Intercultural collaboration and support systems; Social computing, behavioral modeling and prediction.
5. Environmental modelling and engineering: Environmental information systems (architecture); Spatial, temporal and observational information systems; Large-scale environmental systems; Collaborative knowledge base systems; Agent concepts and conceptualization; Hazard prediction, prevention and steering systems.
6. Multimedia data modelling and systems: Modelling multimedia information and knowledge; Content-based multimedia data management; Content-based multimedia retrieval; Privacy and context enhancing technologies; Semantics and pragmatics of multimedia data; Metadata for multimedia information systems.
The Program Committee accepted thirty-nine papers for publication in this book. The papers were evaluated by an international panel of reviewers. In accordance with the conference principles, all papers accepted for publication in this volume were presented at the conference, and all had to be improved and resubmitted after the conference for this publication.
We thank all colleagues for their support in arranging the conference, especially the program committee, the organizing committee, the program coordination team, and the reviewers. Above all, we thank the participants, because they are the ones who make the conference.
The Editors
Tatiana Endrjukaite
Alexander Dudko
Hannu Jaakkola
Bernhard Thalheim
Yasushi Kiyoki
Naofumi Yoshida
We herein present a method that dynamically generates curricula specialized to the learning circumstances of individual learners, given prior learning goals and learning objects. A generated curriculum encourages the selection of learning behaviors according to the learning objects that may be feasibly acquired within a limited timeframe. Our method evaluates the circumstances of an individual learner; by dynamically selecting feasible learning objects based on the individual's learning behaviors and their past records, it finds the best learning tasks within the constraints of time, circumstances, and activities. Using previously known rules and the strengths of causal/dependency relations between learning items, our method enables the discovery, from individual test results, of which learning objects are important and how they should be ordered, on an individual basis. This enables effective support in choosing the most appropriate learning behaviors, tailored to the individual learner. It also enables the selection of effective learning behaviors by examining the behavior records of other individuals, treating the influence of their prior learning behaviors on subsequent learning behaviors as experience quotients and converting them into expected scores for the individual's learning behaviors. Accordingly, we evaluate whether the learning behaviors selected by the individual are indeed learning tasks that would correspond to anticipated learning results, conduct a prior assessment of the influence that the results of this intervention would have upon the learning circumstances, and thus prioritize more effective learning behaviors. When implemented, our method assesses the changes in the individual's learning circumstances based on their learning behaviors on a timeline and subsequently adjusts the recommended behaviors. The method can provide effective support for individual learners: along with effective feedback on learning task selection in response to the individual's circumstances, it dynamically generates an individualized curriculum by measuring the relationships between the individual's learning circumstances and learning items. In summary, we present a method for dynamically generating curricula in response to an individual's learning circumstances by measuring the causal/dependency relations between learning items, thus enabling the calculation of the relationship between an individual's past learning record and the learning behaviors and learning objects available to the individual. We investigate its efficacy and achievability through empirical testing using actual data.
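As a rough illustration of the selection step described above, the following Python sketch greedily picks feasible learning objects (those whose prerequisites are already mastered) by expected score gain per unit of study time, within a time budget; the class, field names and numbers are illustrative assumptions, not taken from the paper:

from dataclasses import dataclass, field

@dataclass
class LearningObject:                      # hypothetical structure, not the paper's data model
    name: str
    time_cost: float                       # estimated study time (hours)
    expected_gain: float                   # expected score improvement from past records
    prerequisites: set = field(default_factory=set)

def select_learning_tasks(objects, mastered, time_budget):
    """Greedily pick feasible objects (all prerequisites mastered) with the best
    expected gain per unit of time until the time budget is exhausted."""
    mastered = set(mastered)
    plan = []
    remaining = {o.name: o for o in objects}
    while True:
        feasible = [o for o in remaining.values()
                    if o.prerequisites <= mastered and o.time_cost <= time_budget]
        if not feasible:
            break
        best = max(feasible, key=lambda o: o.expected_gain / o.time_cost)
        plan.append(best.name)
        mastered.add(best.name)
        time_budget -= best.time_cost
        del remaining[best.name]
    return plan

# Example: "fractions" depends on "arithmetic"; only 3 hours are available.
objs = [LearningObject("arithmetic", 1.0, 5.0),
        LearningObject("fractions", 2.0, 8.0, {"arithmetic"}),
        LearningObject("algebra", 3.0, 9.0, {"fractions"})]
print(select_learning_tasks(objs, mastered=[], time_budget=3.0))   # ['arithmetic', 'fractions']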
We describe a logical calculus for process synthesis that applies reaction rules for consuming, transforming and producing products at given costs. Unlike common logical calculi, which deal with propositions supported by truth semantics, the process synthesis calculus deals with products (in a concrete or abstract sense) and their costs, and with the creation of products by composing appropriate reaction assemblies from a database of reactions or transactions. As such, the calculus may be understood as a resource logic akin to linear logic. The calculus appeals to means-end backwards reasoning using a derivative of definite clause logic. Although rather general in scope, as an interesting case the calculus is applied in this paper to chemical retro-synthesis – in particular to the notoriously difficult Solvay cluster designs.
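As an illustration of means-end backwards reasoning over reaction rules with costs, the following Python sketch composes a reaction assembly for a goal product from a small database of rules; the reactions are loosely modelled on Solvay-style chemistry and all costs are invented, so this is not the paper's calculus:

# Invented reaction database: product -> (input products, reaction cost).
REACTIONS = {
    "NaHCO3": (["NaCl", "NH3", "CO2", "H2O"], 2.0),
    "Na2CO3": (["NaHCO3"], 1.0),
    "NH3":    (["N2", "H2"], 3.0),
}
RAW_COSTS = {"NaCl": 0.5, "CO2": 0.2, "H2O": 0.1, "N2": 0.4, "H2": 0.6}

def synthesize(product, seen=frozenset()):
    """Backwards reasoning: return (total cost, plan) for obtaining `product`,
    or None if no reaction assembly exists."""
    if product in RAW_COSTS:
        return RAW_COSTS[product], [f"buy {product}"]
    if product not in REACTIONS or product in seen:     # unknown or cyclic goal
        return None
    inputs, rule_cost = REACTIONS[product]
    total, plan = rule_cost, []
    for item in inputs:
        sub = synthesize(item, seen | {product})
        if sub is None:
            return None
        total += sub[0]
        plan += sub[1]
    plan.append(f"react {' + '.join(inputs)} -> {product}")
    return total, plan

cost, plan = synthesize("Na2CO3")
print(cost)            # total cost of the whole assembly
print(plan[-1])        # final reaction step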
This paper proposes a discrete-event model of a multi-product (r, Q) inventory control system. The described approach provides a computationally efficient method to evaluate a current inventory policy and test alternatives. In the considered model, values of the reorder level r and the reorder quantity Q are approached through iterative methods. The modeled inventory system operates under stochastic demand and lead time. In addition, limited storage capacity is allocated among several products. Unfulfilled demand is treated as a lost opportunity and no backlog is fulfilled later. Generally, the inventory control system under consideration may be classified as an extended “base stock” system with a common storage resource. The paper includes the mathematical description, the simulation algorithm and a numerical example of the model of a three-product (r, Q) inventory control system with detailed risk and reliability analysis.
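A minimal simulation sketch of a single-product (r, Q) policy with stochastic demand, random lead time, finite capacity and lost sales may help fix ideas; it is not the paper's multi-product model and all parameter values are illustrative:

import math, random

def poisson(lam):
    """Knuth's algorithm for sampling a Poisson-distributed daily demand."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

def simulate_rq(r, Q, days=365, mean_demand=4.0, lead=(2, 6), capacity=200, seed=1):
    random.seed(seed)
    on_hand, pipeline, lost, served = 50, [], 0, 0      # pipeline: (arrival_day, qty)
    for day in range(days):
        on_hand = min(capacity, on_hand + sum(q for t, q in pipeline if t == day))
        pipeline = [(t, q) for t, q in pipeline if t > day]
        demand = poisson(mean_demand)
        sold = min(on_hand, demand)
        on_hand -= sold
        served += sold
        lost += demand - sold                           # lost sales, no backlog
        position = on_hand + sum(q for _, q in pipeline)
        if position <= r:                               # reorder point reached
            pipeline.append((day + random.randint(*lead), Q))
    return {"fill_rate": served / (served + lost), "lost": lost}

print(simulate_rq(r=15, Q=40))

Running the simulation for different (r, Q) pairs and comparing fill rates is the kind of iterative policy evaluation the abstract refers to.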
In order to realize an artificially intelligent system, a basic mechanism should be provided for expressing and processing semantics. We have previously presented semantic computing models in which original data are mapped into a semantic space and represented as points in that space. That is, we presented a method to process semantic information by calculating Euclidean distances between those points in the semantic spaces. In our continuing studies, we note that different mapping matrices are required to map the original data into the semantic space when this model is applied in different application areas. Therefore, developing methods to create the mapping matrices for different areas is an important research topic. Many works have applied the model in the areas of semantic information retrieval, classification, extraction, and cause-and-effect analysis, among others. In these works, the mapping matrices are created based on analyses of the application areas using human knowledge. In this paper, we present a new method to perform the semantic mapping through deep-learning computation. The most important feature of our method is that we implement semantic mapping through training data sets rather than through a mapping matrix created from human analysis. We first discuss five basic operations: semantic space creation, semantic mapping, the semantic mapping matrix, and semantic space expansion and contraction. After that, we present our method. In order to represent correlations of semantic information correctly as Euclidean distances, the axes of a semantic space must be orthogonal to each other. Therefore, we also discuss how to implement semantic orthogonal mapping. We believe that our study will open new application areas in semantic computing and deep learning.
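The two basic operations of semantic mapping and Euclidean-distance comparison can be illustrated with a toy Python sketch; here the mapping matrix is simply a random matrix orthogonalized with a QR decomposition, whereas in the paper it is obtained through deep learning:

import numpy as np

rng = np.random.default_rng(0)
W, _ = np.linalg.qr(rng.normal(size=(5, 5)))    # orthonormal mapping matrix: W.T @ W = I

def to_semantic_space(x):
    """Map an original data vector into the semantic space."""
    return x @ W

def semantic_distance(a, b):
    return np.linalg.norm(to_semantic_space(a) - to_semantic_space(b))

doc_a = np.array([1.0, 0.2, 0.0, 0.5, 0.1])     # toy feature vectors
doc_b = np.array([0.9, 0.1, 0.1, 0.4, 0.0])
print(semantic_distance(doc_a, doc_b))

Because the columns of W are orthonormal, the axes of the target space are mutually orthogonal and the mapping preserves Euclidean distances, which is the property the orthogonal mapping discussed above is meant to guarantee.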
Modelling is an essential part of information systems development. Models are used for communication between interest groups and inside development teams. Models are also used for transferring baseline artefacts between development phases. Models are mainly developed by humans, who represent certain cultures - national, enterprise, professional, team, project etc. Because of that we claim that models, as well as many other information systems related artefacts, are culture dependent. Models are born in a certain context and must also be interpreted by taking that original context into account. In our earlier studies we have analysed the effect of culture on information systems development: culture related aspects at a general level, in information search and interaction, and in web information systems. We now focus on modelling. We therefore briefly answer the question “How do cultures differ from each other?”, reviewing and synthesizing generally accepted frameworks for cultural analysis. In addition, we briefly summarize the results of our earlier studies. Because modelling is a human activity, and information systems are used by humans, we integrate the use context into information systems development. The findings of the culture analysis are transferred to modelling practices via our framework, which defines a model as an instrument transferring elements of its development context to the models – we discuss the roles of normal models, deep models and the modelling matrix. Finally, we concentrate on the problems of cross-cultural modelling using selected national cultures as an example.
The paper is devoted to the problem of multidimensional data visualization, where an overflow of the image with graphic objects makes it difficult to understand the trends and patterns existing in the data. One of the emerging problems is the visual clutter problem, and the aim of the paper is to analyse criteria which can quantitatively assess the quality of clutter reduction methods for data visualization in parallel coordinates. An overview of clutter reduction methods, and of the aesthetic criteria used to evaluate parallel coordinate plots and charts, is presented. The study explores dimension reordering and clustering-based algorithms. The quality of the results was assessed by the aesthetic criteria. The evaluation criteria system could potentially make it possible to select the best clutter reduction technique automatically.
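One simple clutter-reduction heuristic of the dimension-reordering kind can be sketched as follows: axes are ordered greedily so that neighbouring axes are as correlated as possible, since highly correlated neighbours produce fewer line crossings. This is only one possible criterion, not the paper's evaluation system:

import numpy as np

def reorder_dimensions(data):
    """data: (n_samples, n_dims).  Return an axis ordering that keeps highly
    correlated dimensions next to each other."""
    corr = np.abs(np.corrcoef(data, rowvar=False))
    n = data.shape[1]
    order = [int(np.argmax(corr.sum(axis=0)))]          # start with the most connected axis
    remaining = set(range(n)) - set(order)
    while remaining:
        last = order[-1]
        nxt = max(remaining, key=lambda j: corr[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

data = np.random.default_rng(1).normal(size=(200, 6))
data[:, 3] = 0.9 * data[:, 0] + 0.1 * data[:, 3]        # make axes 0 and 3 correlated
print(reorder_dimensions(data))                          # axes 0 and 3 end up adjacent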
As information systems are producing vast amounts of data with ever increasing speed and diversity, the management of data is becoming an important part of gaining the information that we need. With this as the motivation, this paper proposes a Manageable Data Sources framework for the systematic management of data sources. The framework is derived from a new conceptual model of data processing: the Faucet-Sink-Drain model. The framework achieves two aims: first, the unification of data processing, and second, the componentization and decoupling of data-processing-related tasks. The framework is described and a reference architecture is laid out for the creation of a proof-of-concept implementation to solve the given use case.
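A hypothetical reading of the Faucet-Sink-Drain model in Python might look as follows; the class names and interfaces are assumptions made for illustration, not the framework's actual API:

from typing import Callable, Iterable, List

class Faucet:
    """Wraps any iterable data source."""
    def __init__(self, source: Iterable):
        self.source = source
    def stream(self):
        yield from self.source

class Sink:
    """Buffers incoming items and applies a processing step."""
    def __init__(self, process: Callable):
        self.process = process
        self.buffer: List = []
    def receive(self, item):
        self.buffer.append(self.process(item))

class Drain:
    """Consumes processed items, e.g. writes them to storage."""
    def __init__(self, consume: Callable):
        self.consume = consume
    def flush(self, sink: Sink):
        for item in sink.buffer:
            self.consume(item)
        sink.buffer.clear()

# Wire the three decoupled components together.
faucet = Faucet(["sensor:17.2", "sensor:18.1"])
sink = Sink(process=lambda s: float(s.split(":")[1]))
drain = Drain(consume=print)
for item in faucet.stream():
    sink.receive(item)
drain.flush(sink)

The point of the decomposition is that each of the three components can be developed, tested and replaced independently, which is the decoupling aim stated above.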
Collaboration is a crucial part of joint research. It is especially challenging in projects where multiple academic domains and disciplines are involved. Different research foci and individual perspectives are often mismatched. Heterogeneous data, diverging working habits, and differing standards are common. Research itself is agile, dynamic, and evolving. Changes and revisions of data structures, requirements, and actual data values arise continuously. And so on…
Adaptations due to new circumstances and evolution-driven changes are normal for such projects and need to be considered explicitly by their data management strategy. The same applies to heterogeneities in general. No project-wide working standards, models, or structures can be expected to remain stable throughout the complete run time of a project. Thus any static approach to managing data in research projects will fail to provide sufficient support, and in effect collaboration cannot be supported well either.
We propose a new approach based on the separation of data storage and data usage. The actual research is performed in local and individual working environments, which can be set up to best meet the individual research requirements of the project members. The storage therefore does not need to support research-related computation activities or to reflect the working models of project members. Instead it uses a universal model to store data decomposed into values and structures. The individual perspectives can thus be modelled as compositions of these decomposed elements, allowing the creation of customisable interfaces. In effect we get a flexible and modern approach for handling data in research projects and for supporting interdisciplinary research.
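The storage idea can be sketched as follows, with values and structures kept in separate stores and individual perspectives composed from the decomposed elements; the naming and data are illustrative only, not the project's implementation:

values = {}       # value_id -> raw value
structures = {}   # structure_id -> list of (field_name, value_id)

def store(record, structure_id):
    """Decompose a record into values and a structure describing them."""
    for field_name, value in record.items():
        value_id = f"{structure_id}.{field_name}"
        values[value_id] = value
        structures.setdefault(structure_id, []).append((field_name, value_id))

def perspective(structure_id, wanted_fields):
    """Compose an individual view from the decomposed elements."""
    return {name: values[vid] for name, vid in structures.get(structure_id, [])
            if name in wanted_fields}

store({"site": "B7", "depth_cm": 35, "notes": "charcoal layer"}, "find-001")
print(perspective("find-001", {"site", "depth_cm"}))   # one researcher's view
print(perspective("find-001", {"notes"}))              # another researcher's view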
The goal of the paper is to demonstrate the use of microscopic traffic flow simulation for evaluating the environmental impact of a new shopping mall in an urban area. The results of the research demonstrate that the introduction of such an attraction point as a shopping mall can significantly affect the environment in the surrounding area, and that this should be considered during the development of the shopping mall.
The performance of modern information systems in the hydrosphere is based on the robustness of the wireless underwater communication between the cooperating subsystems. It is strongly intertwined with the reliability of coordination for sharing resources and common purposes. Secure underwater coordination is a key functionality (C4I-STAR). The physical laws of nature restrict the bandwidth and reliability of underwater communication links; the quality differs from deep to shallow water areas, with the hydroacoustic parameters, such as the sound speed and sediment profile, and with the weather conditions and time of day. To overcome the low data rates and non-robust transmission links of submerged manned and unmanned platforms, error-tolerant processes are needed to fulfil different handover maneuvers in the submerged teams. But what are efficient process chains in this field? S-BPM has advantages in supporting users and scientists in handling this layer of modeling for a standard network code of conduct.
The integration of semantic computing with deep learning realizes a new artificial brain-memory system. We have presented the concept of “MMM: Semantic Computing System” for analyzing and interpreting environmental phenomena and changes occurring in the oceans and rivers of the world. We also introduce the concept of “SPA (Sensing, Processing and Analytical Actuation Functions)” for realizing a global environmental system, and apply it to the Multi-dimensional World Map (5-Dimensional World Map) System. This concept is effective and advantageous for designing environmental systems with Physical-Cyber integration: environmental phenomena are detected as real data resources in a physical space (real space), mapped to cyber-space for analytical and semantic computing, and the analytically computed results are actuated back to the real space with visualization for expressing environmental phenomena, causalities and influences. This paper presents integration and semantic-analysis methods for the KEIO-MDBL-UN-ESCAP joint system for global ocean-water analysis with coral-image analysis in two environmental-semantic spaces with water-quality and image databases. We have implemented an actual space-integration system for accessing environmental information resources with water-quality and image analysis. We clarify the feasibility and effectiveness of our method and system by showing several experimental results for environmental data. Environmental-semantic space integration realizes deep analysis of environmental phenomena and situations. The essential computation in environmental study is context-dependent differential computation to analyze the changes of various situations (air, water, CO2, habitats, sea level, coral area, etc.). It is important to realize a global environmental computing methodology for analyzing the difference and diversity of nature and living beings in a context-dependent way, with a large amount of information resources on global environments. In the design of environment-analysis systems, one of the most important issues is how to integrate several environmental aspects and analyze environmental data resources with semantic interpretations. In this paper, we present an environmental-semantic computing system which realizes integration and semantic search among environmental-semantic spaces with water-quality and image databases.
In this paper we propose a new best-energy-mixture model and an electricity generation simulation approach. The approach is explained using the case of Latvia, where an existing gas power station is combined with the proposed hybrid bio-wind energy system. The goal is to replace imported fossil fuels for electricity generation with locally produced renewable sources of energy, to secure and increase the energy independence of the country, and to reduce CO2 emissions. The introduced best energy mix is based on two renewable sources: biomass and wind power stations. To make use of the advantages of both, we combine these two types of power stations into one hybrid power system that gives a stable power output to supply the country's energy needs.
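A toy dispatch sketch illustrates the intended role of the mix: hourly demand is covered first by wind, then by biomass, and any remainder by the existing gas plant. All numbers are invented and the model is far simpler than the simulation approach proposed in the paper:

def dispatch(demand_mw, wind_mw, biomass_capacity_mw):
    """Split one hour of demand between wind, biomass and gas backup."""
    wind_used = min(demand_mw, wind_mw)
    biomass_used = min(demand_mw - wind_used, biomass_capacity_mw)
    gas_used = demand_mw - wind_used - biomass_used
    return wind_used, biomass_used, gas_used

# Toy day: (demand, available wind) in MW; biomass capacity fixed at 300 MW.
hours = [(900, 250), (1100, 400), (1200, 150), (950, 500)]
totals = [0.0, 0.0, 0.0]
for demand, wind in hours:
    for i, used in enumerate(dispatch(demand, wind, biomass_capacity_mw=300)):
        totals[i] += used
print(dict(zip(["wind", "biomass", "gas"], totals)))    # gas share = remaining fossil use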
An automatic feature extraction method for classifying clickbait in Thai headlines is presented. The first corpus of 132,948 Thai news headlines was collected. To transform Thai words into features, Word2Vec is utilized to overcome the ambiguity of word segmentation. The features are then automatically extracted using a Convolutional Neural Network (CNN). A number of experiments with the CNN have been conducted to find suitable parameter values that achieve the best classification result. We found that using a non-static modelling technique together with 50-dimensional Word2Vec features, window sizes of {2, 3, 4}, and 5 training epochs achieves an accuracy of 95.25%. The experimental results also showed that the proposed method achieves the best result compared with other classification methods such as Support Vector Machine (SVM) and Naïve Bayes, which achieve 87.17% and 87.32%, respectively.
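The reported configuration corresponds to a standard Kim-style text CNN; a generic PyTorch sketch with 50-dimensional embeddings and window sizes {2, 3, 4} is shown below. This is a common architecture, not the authors' exact code, and the corpus is not included:

import torch
import torch.nn as nn

class HeadlineCNN(nn.Module):
    def __init__(self, vocab_size, embed_dim=50, windows=(2, 3, 4), n_filters=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)     # "non-static": trained further
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, n_filters, kernel_size=w) for w in windows)
        self.fc = nn.Linear(n_filters * len(windows), 2)     # clickbait / not clickbait

    def forward(self, token_ids):                            # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)            # (batch, embed_dim, seq_len)
        pooled = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))             # (batch, 2) logits

model = HeadlineCNN(vocab_size=50_000)                       # vocabulary size is made up
logits = model(torch.randint(0, 50_000, (8, 20)))            # 8 headlines of 20 tokens
print(logits.shape)                                          # torch.Size([8, 2])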
This paper presents a multi-parameter water quality prediction method based on differential computing among sampling sites in Bangkok, Thailand. Two canals were selected for the case study and nine parameters were chosen for water quality prediction, namely temperature, pH, DO, BOD, COD, NH3-N, NO2-N, NO3-N, and TP. The data were obtained from 2007 to November 2017. Differential computing is used to predict the parameters along the sampling sites. The results indicate that the predicted values of temperature and pH are more accurate than those of the other parameters, because their error values are low and both parameters have changed only slightly from the past up to the present. Therefore, differential computing can be used to predict water quality parameters that remain in relatively stable conditions.
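One plausible reading of the differential computation, sketched here purely for illustration, is to predict a parameter at one site from a neighbouring site plus the historical mean difference between the two; the values below are invented:

def predict_downstream(upstream_history, downstream_history, upstream_now):
    """Predict a downstream value from the current upstream value plus the
    average upstream-to-downstream difference observed in the past."""
    mean_diff = sum(d - u for u, d in zip(upstream_history, downstream_history)) \
                / len(upstream_history)
    return upstream_now + mean_diff

# pH measured at two sites along a canal in past campaigns (invented values).
up = [7.1, 7.3, 7.0, 7.2]
down = [6.8, 7.0, 6.7, 6.9]
print(round(predict_downstream(up, down, upstream_now=7.25), 2))   # 6.95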
In this paper we explore the topic of mobile user experience, focusing on native mobile applications. The goal of the paper is to investigate what mobile user experience is, identify important mobile patterns, and define processes and guidelines for the design and measurement of the mobile user experience and the identified mobile patterns. Based on the literature review, there are several elements or factors affecting the mobile user experience. We identified a major advantage of native applications, which are designed for a specific platform, where there is more time to create a specific customer experience. The main findings of the paper show that the mobile user experience differs from the desktop user experience, and that there are many mobile patterns which may significantly improve or degrade the mobile user experience for various target groups.
The success of automated reasoning techniques over large natural-language texts relies heavily on a fine-grained analysis of natural language assumptions. While there is common agreement that the analysis should be hyperintensional, most automatic reasoning systems are still based on an intensional logic at best. In this paper we introduce the TIL-Script language, which is a computational variant of Pavel Tichý's Transparent Intensional Logic (TIL). TIL is a hyperintensional, typed lambda calculus of partial functions: hyperintensional, because TIL terms are interpreted as denoting procedures rather than their products, which are partial functions-in-extension. Thus, in our stratified ontology we have procedures, their products, i.e. functions-in-extension, or even procedures of a lower order, as well as functional values. These procedures are rigorously defined as TIL constructions. With constructions of constructions, constructions of functions, functions, and functional values in our stratified ontology, we need to keep track of the traffic between multiple logical strata. The ramified type hierarchy does just that. The type of first-order objects includes all objects that are not constructions. The type of second-order objects includes constructions of first-order objects. The type of third-order objects includes constructions of first- or second-order objects. And so on, ad infinitum. The goal of this paper is to introduce the algorithm of type control over the results of logical analysis of natural-language expressions, i.e., checking whether the analysis results in a type-theoretically coherent procedure assigned to the expression as its meaning.
In this paper we deal with an extension of the functionalities of the TIL-Script language, namely a proof system based on natural deduction. The system processes a subset of the set of TIL-Script constructions that are typed to v-construct a truth-value. Since TIL-Script is a functional programming language based on a hyperintensional lambda calculus with procedural semantics, we also describe how to validly apply beta conversion and how to operate in a hyperintensional context where the very procedure is an object of predication.
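For readers unfamiliar with beta conversion, a generic sketch over ordinary lambda-calculus terms is given below; it ignores TIL-Script's procedural semantics and variable capture, and serves only to show the substitution step:

def substitute(term, var, value):
    """Substitute `value` for free occurrences of `var` (no capture handling)."""
    kind = term[0]
    if kind == "var":
        return value if term[1] == var else term
    if kind == "lam":
        _, param, body = term
        return term if param == var else ("lam", param, substitute(body, var, value))
    if kind == "app":
        return ("app", substitute(term[1], var, value), substitute(term[2], var, value))
    return term                                   # constants

def beta_reduce(term):
    """Reduce ((lambda x. body) arg) to body[x := arg] at the top level."""
    if term[0] == "app" and term[1][0] == "lam":
        _, (_, param, body), arg = term
        return substitute(body, param, arg)
    return term

# ((lambda x. succ x) 41)  beta-reduces to  (succ 41)
term = ("app", ("lam", "x", ("app", ("var", "succ"), ("var", "x"))), ("const", 41))
print(beta_reduce(term))   # ('app', ('var', 'succ'), ('const', 41))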
The concept space is large, unexplored and open. We focus on concept structures for preserving and analyzing database concepts, and introduce the properties of concept completeness and class/concept completeness. Outside the space where structures preserving and analyzing database concepts are located, incompleteness holds. Exploiting these properties as reference points for locating concepts and concept structures in the space, a frame is proposed. Not all concept structures can be linked with the scientific domain of computer science. In the open concept space, both formal and informal concept structures are located; these can be complete or incomplete, and not necessarily related to computer science. An example of the frame's applicability is given.
The paper proposes an original approach to solving the problem of inefficient processing of qualitative constraints of a subject domain using the constraint programming technique. The approach is based on the use of specialized matrix-like structures providing a “compressed” representation of constraints over finite domains, as well as on the authors' inference techniques over these structures. In comparison with prototypes using the typical representation of multi-place relations in the form of a table, these techniques make it possible to reduce the search space more efficiently. The paper presents practical aspects of implementing user-defined types of constraints and the corresponding propagator algorithms with the help of constraint programming libraries. The performance of the algorithms has been assessed to clearly demonstrate the advantages of representing and processing qualitative constraints of a subject domain by means of the above matrix structures.
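A toy version of the matrix-like representation can be sketched as follows: a binary qualitative constraint over finite domains is stored as a boolean matrix, and a propagator prunes domain values that have no supporting entry. This is a simplification for illustration, not the authors' structures:

import numpy as np

def propagate(matrix, dom_x, dom_y):
    """matrix[i, j] is True iff the pair (x_i, y_j) is allowed.  Shrink both
    domains to values that still have at least one support."""
    xs, ys = sorted(dom_x), sorted(dom_y)
    sub = matrix[np.ix_(xs, ys)]
    new_x = {xs[i] for i in range(len(xs)) if sub[i, :].any()}
    new_y = {ys[j] for j in range(len(ys)) if sub[:, j].any()}
    return new_x, new_y

# Qualitative relation "x < y" over the domain {0, 1, 2, 3} as a boolean matrix.
R = np.array([[i < j for j in range(4)] for i in range(4)])
print(propagate(R, dom_x={0, 1, 2, 3}, dom_y={0, 1}))   # ({0}, {1})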
This paper presents an application of the 5D World Map System with Context-Diversity-Responsive Semantic Associative Search to a large database of environmental news articles. By applying the 5D World Map System to data describing social and natural phenomena, a new news-article subscription environment is realized through the dynamic combination of various semantic, temporal and spatial “contexts”. The new subscription environment makes it possible to create, accumulate and visualize a series of analyzed results of semantic relations among phenomena as an interpreted social story. The system realizes the extraction of correlative and causal relations between phenomena that are potentially included in news articles. A semantic associative search method is applied to this system to realize the concept that the “semantics” of words, documents, and events/phenomena vary according to the “context”. The main feature of this system is to dynamically create various context-dependent patterns of social stories according to users' viewpoints and the diversity of context in phenomena. This system also provides an environment for analyzing the time-series change and spatial expansion of social and natural phenomena on a time-series multi-geographical space. In this paper, we show a prototype system applied to a vast collection of news articles spanning ten years, and several experiments about “global warming”, to clarify the feasibility of the system.
The global environmental analysis system is a new platform for analyzing environmental multimedia data acquired from natural resources. This study aims to recognize and interpret coral reef phenomena and changes occurring on a global scale by utilizing Acropora coral as a bioindicator, or natural sensor. This paper presents a new environmental-semantic computing system for multispectral imagery for automated coral health monitoring and analysis, to recognize coral condition in actual situations. The multispectral semantic-image space for coral monitoring and analysis can be utilized for ocean environment monitoring and assessment by measuring coral reef health, which is highly beneficial in addressing the current ocean pollution problem. Our method applies semantic distance calculation to measure the similarity between multispectral image data and context images covering three coral conditions (healthy, bleaching, and dead). In our experiments, we applied the SPA function, which is an effective concept for designing environmental systems with Physical-Cyber integration. This paper presents a case study of Acropora coral monitoring and assessment at Man-nai Island, Rayong province, Thailand. An additional objective of this research is to apply Artificial Intelligence (AI) and environmental monitoring systems to combatting the ocean pollution problem by transferring knowledge and technology from the computer science fields to fundamental marine science research.
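The semantic distance calculation can be illustrated with a minimal nearest-reference classification sketch, where a multispectral feature vector is assigned the coral condition whose context vector is closest in Euclidean distance; the reference spectra below are invented, not measured values:

import numpy as np

CONTEXT = {                                   # invented reference feature vectors
    "healthy":   np.array([0.10, 0.30, 0.55, 0.70]),
    "bleaching": np.array([0.45, 0.50, 0.60, 0.65]),
    "dead":      np.array([0.35, 0.35, 0.35, 0.30]),
}

def classify(feature_vector):
    """Assign the condition whose context vector is nearest in Euclidean distance."""
    distances = {cond: np.linalg.norm(feature_vector - ref)
                 for cond, ref in CONTEXT.items()}
    return min(distances, key=distances.get), distances

sample = np.array([0.42, 0.48, 0.58, 0.66])   # invented multispectral features
label, dists = classify(sample)
print(label)                                  # 'bleaching' for this made-up sample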
The amount of chlorophyll in a plant can indicate leaf N concentration, as nitrogen is a major component of chlorophyll. Identifying chlorophyll levels in plants therefore leads to appropriate nitrogen fertilizer recommendations and to fertilization at the proper time, optimizing the efficiency of agricultural production and helping to preserve the environment by reducing excess use of nitrogen fertilizer. Measuring leaf N concentration in a high-accuracy laboratory is expensive and time consuming, whereas a SPAD chlorophyll handheld meter can be carried conveniently in the field and can provide rapid assessments. However, the handheld meter's capacity is limited, especially in large areas, where samples must be taken to represent an area. The use of vegetation indices calculated from UAV imagery and correlated with SPAD values enables better estimation of leaf N concentrations over large areas. The results showed that the relationship (R²) between NDVI and SPAD values (chlorophyll content) of waxy corn (Zea mays L. var. ceratina) at the V6 and R1 stages was 0.594 and 0.632, respectively. The results of this study are helpful for estimating the amount of chlorophyll in plants, which can lead to accurate nitrogen fertilizer requirements and better administration of management zones in precision agriculture.
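The index computation behind such studies is straightforward: NDVI is computed per plot from near-infrared and red reflectance, and its coefficient of determination against SPAD readings is reported. The following sketch uses invented numbers purely for illustration:

import numpy as np

def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red)."""
    nir, red = np.asarray(nir, float), np.asarray(red, float)
    return (nir - red) / (nir + red)

def r_squared(x, y):
    return np.corrcoef(x, y)[0, 1] ** 2

nir  = [0.62, 0.55, 0.70, 0.48, 0.66]     # invented reflectance values per plot
red  = [0.10, 0.14, 0.08, 0.18, 0.09]
spad = [42.0, 37.5, 46.1, 33.2, 44.0]     # invented handheld SPAD readings

values = ndvi(nir, red)
print(values.round(3), round(r_squared(values, spad), 3))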
This study explores the use of Christian elements in news discourses and analyzes the functions, as well as differences, in the use of religious elements from a semantic approach. We compare the German, American and Japanese news coverage of the 2011 Great East Japan Earthquake (3.11 disaster) between March 2012 and March 2017, following a media discourse analysis approach. We qualitatively analyzed 1,550 national and regional newspaper articles and created a newspaper database in order to display the frequency and uncover the functions of the Christian elements in the coverage. We find that Christian elements are employed in the German and American coverage as metaphors, metonymies and personifications, to describe the 3.11 disaster and its social and political aftermath. Taking a closer look, Christian elements in the coverage function in the following ways: To describe the emotional state of the catastrophe's victims, to evoke certain feelings, such as fear of nuclear energy, in the reader, and to emphasize the opinion of the journalist. Differences in the use of Christian elements in the German and American news coverage also highlight differences in the respective national political and social discourses on nuclear energy and reflect different cultural values regarding environmental issues.
This paper describes a new method of data retrieval from free-text documents in the medical domain. The proposed approach creates a document summary and highlights the most important keywords in the text. To achieve this, we process the document's natural language text and build a descriptor as an internal representation of the document. This descriptor is a graph with concepts, relations between them, and concept points as a metric of relevance. By means of the points in the descriptor, the approach performs ambiguity resolution, selects the most relevant concepts to display in the summary, and votes for keyword highlighting in the text. Besides the direct representation of identified information in the summary, this work proposes a way to provide an extended summary by using additional knowledge about relations between medications, procedures, diseases and anatomy. The described approach helps to speed up analysis and decision-making processes by providing an aggregated summary for a document and highlighting the most meaningful parts of the document's text. Experimental results demonstrate that automatic summary generation and keyword highlighting can be successfully performed by the proposed approach to achieve meaningful and highly relevant results.
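A rough sketch of the scoring idea, with a simplified descriptor format of our own rather than the system's, is shown below: concepts accumulate points from occurrences and relations, and the highest-scoring concepts are selected for the summary and keyword highlighting:

from collections import defaultdict

def score_concepts(occurrences, relations):
    """occurrences: {concept: count}; relations: list of (concept_a, concept_b)."""
    points = defaultdict(float)
    for concept, count in occurrences.items():
        points[concept] += count
    for a, b in relations:                      # each relation reinforces both ends
        points[a] += 0.5
        points[b] += 0.5
    return dict(points)

def build_summary(points, top_k=2):
    return sorted(points, key=points.get, reverse=True)[:top_k]

occ = {"hypertension": 3, "lisinopril": 2, "headache": 1}   # invented concept counts
rel = [("hypertension", "lisinopril")]                       # medication-disease relation
print(build_summary(score_concepts(occ, rel)))               # ['hypertension', 'lisinopril']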