
Ebook: Databases and Information Systems IX

Databases and information systems are now indispensable for the day-to-day functioning of businesses and society.
This book presents 25 selected papers from those delivered at the 12th International Baltic Conference on Databases and Information Systems 2016 (DB&IS 2016), held in Riga, Latvia, in July 2016. Since it began in 1994, this biennial conference has become an international forum for researchers and developers in the field of databases, information systems and related areas, and the papers collected here cover a wide spectrum of topics related to the development of information systems and data processing. These include: the development of ontology applications; tools, technologies and languages for model-driven development; decision support systems and data mining; natural language processing and building linguistic components of information systems; advanced systems and technologies related to information systems; and databases and information technologies in teaching and learning.
The book will be of interest to all those whose work involves the design, application and use of databases and information systems.
This volume collects the selected papers presented at the 12th International Baltic Conference on Databases and Information Systems 2016 (DB&IS 2016), held on July 4–6, 2016 in Riga, Latvia. DB&IS 2016 continued the series of biennial conferences held in Trakai (1994), Tallinn (1996, 2002, 2008, 2014), Riga (1998, 2004, 2010), and Vilnius (2000, 2006, 2012). A Doctoral Consortium also accompanied the conference. IOS Press has been a partner of the conference for years, and selected papers are published as volumes in the book series Frontiers in Artificial Intelligence and Applications (FAIA).
Over this period, the DB&IS conference has become an international forum for researchers and developers in the field of databases, information systems, and related areas. The conference features original research and application papers on the theory, design, and implementation of today's information systems. Since the conference is popular among researchers from the Baltic countries, the papers represent the research fields in information systems that are most actively pursued in the region.
DB&IS 2016 was organized by the Faculty of Computing, University of Latvia. The International Programme Committee had 56 members from 21 countries. This year, 62 submissions from 16 countries were received. Each conference paper was evaluated by at least three reviewers in a single-blind peer-review process. Accepted papers were published before the conference in a volume of Communications in Computer and Information Science (CCIS), published by Springer, and in the journal Baltic Journal of Modern Computing (BJMC). After the conference, the editors of this book selected improved versions of 25 papers for publication in this FAIA volume.
The accepted papers span a wide spectrum of topics related to the development of information systems and data processing. The most significant topics covered by the conference are the following: the development of ontology applications; tools, technologies, and languages for model-driven development; decision support systems and data mining; natural language processing and building linguistic components for information systems; advanced systems and technologies related to IS and databases; and information technologies in teaching and learning.
We would like to express our warmest thanks to all the authors who contributed to the 12th International Baltic Conference on Databases and Information Systems 2016. Our special thanks go to the invited speakers, Prof. Andris Ambainis, Prof. Gintautas Dzemyda, and Prof. Jaak Vilo, for sharing their knowledge. We are very grateful to the members of the International Programme Committee and the additional referees for their reviews and comments. We are also grateful to the presenters, session chairs, and conference participants for the time and effort that made DB&IS 2016 a success. Finally, we wish to thank the conference organizing team, the University of Latvia, Exigen Services Latvia, the IEEE, the IEEE Latvia Section, and the other supporters for their contributions to making the event possible.
September 2016
Vineta Arnicāne
Guntis Arnicāns
Juris Borzovs
Laila Niedrīte
An expressive query language for arbitrary data ontologies (ER models) that is easily comprehensible to non-programmers (domain experts) is still an open problem. In this paper we introduce an important subset of data ontologies called semistar ontologies; patient health records are typical examples. We propose a new natural-language-based query language for semistar ontologies and provide an efficient implementation of the language. The expressiveness of the query language was demonstrated by turning it into a working language for Riga Children's Clinical University Hospital. Experiments with potential end-users have shown that the proposed query language can be taught to non-programmers in a couple of hours.
To participate in Semantic Web projects, domain experts need to be able to understand the ontologies involved. Visual notations can provide an overview of an ontology and help users understand the connections among entities. However, users first need to learn the visual notation before they can interpret it correctly. A controlled natural language (CNL) representation would be readable right away and might be preferred for complex axioms; however, the structure of the ontology would remain less apparent. To combine the graphical and CNL approaches, we describe the possibility of adding CNL information to the graphical OWL ontology editor OWLGrEd.
We review the RDB2OWL language for specifying relational-database-to-RDF/OWL mappings, as it has evolved from the original notation in the light of experience from practical use cases. We describe the implementation of RDB2OWL mappings via translation into both the D2RQ and the standard R2RML mapping notations.
Extracting OWL ontologies from relational databases is extremely helpful for realising the Semantic Web vision. However, most approaches in this context drop many of the expressive features of OWL, because highly expressive axioms cannot be detected from the database schema alone; they require a combined analysis of the database schema and data. In this paper, we present an approach that transforms a relational schema into a basic OWL schema and then enriches it with expressive OWL 2 constructs using schema and data analysis techniques. We then rely on the user to verify these features. Furthermore, we apply machine learning algorithms to help rank the resulting features based on user-supplied relevance scores. Testing our tool on a number of databases demonstrates that the proposed approach is feasible and effective.
Customization is a very important feature of any manufacturing scheduling system. In many cases, large commercial manufacturing scheduling systems cannot be customized easily and efficiently to meet the requirements of small and medium-sized enterprises. This paper therefore proposes an ontology-based architectural solution for the customization of manufacturing scheduling systems. In this approach, the input to the scheduling system is a customized manufacturing scheduling ontology that extends the manufacturing scheduling ontology provided in this paper. The customized ontology serves as a knowledge base and data access point for the manufacturing scheduling system and enables it to be easily adapted to different products and their manufacturing processes.
The paper addresses the problem of simplifying the development of web applications, which have separate resources (the client and the server side) and must deal with multiple user accounts. We propose a model-driven approach that factors out such web-specific aspects and allows the developer to concentrate on the main functionality of the application. In particular, models are used as a memory abstraction that is transparently synchronized between the client and the server. The two main benefits of the proposed approach are 1) the developer can assume just one target PC, and 2) the same code base can be used for both desktop and web-based applications.
Domain-specific diagram editor building environments nowadays, as a rule, involve some use of metamodels. Normally, however, the metamodel alone is not sufficient to define an editor: frequently the metamodel defines only the abstract syntax of the domain, and mappings or transformations are required to define the editor functionality for diagram building. Another approach [8] is based on a fixed type metamodel describing the possible diagram elements, where an editor definition consists of an instance of this metamodel to be executed by an engine; however, a number of functionality extensions in a transformation language are typically still required. This paper offers a new approach based on metamodel specialization, i.e. on simply creating subclasses. First, the permitted metamodel specialization, based on standard UML class diagrams and OCL, is precisely defined. A universal metamodel and an associated universal engine for the diagram editor domain are described, and it is then shown how a specific editor definition can be obtained by specializing this metamodel. Examples of a flowchart editor and a UML class diagram editor are given.
The approach proposed in this paper is based on the use of Domain-Specific Languages (DSLs) and allows the creation of executable information system models. It lets us apply the principles of Model-Driven Development (MDD) to information systems development and use: it bridges the gap between business and IT and yields an exact specification of the information system, up-to-date documentation, etc. Practical experience proves the viability of the proposed approach.
ajoo is a framework for implementing DSML tools as web applications. The framework consists of a DSML tool building platform and a DSML configuration tool. The platform allows various kinds of DSMLs to be implemented by reusing its components, without programming them from scratch each time. The configuration tool provides the means to specify DSML tools through a graphical user interface, simplifying and accelerating the DSML implementation process. Thus the framework allows a wide range of DSML tools to be implemented without programming.
It is generally accepted that security requirements have to be identified as early as possible to avoid later rework in the systems development process. In practice, however, security aspects are quite often considered either at the later stages of development cycles (increments in agile projects) or only when problems arise. One of the reasons early detection of security requirements is difficult is the complexity of security requirements identification. In this paper we discuss an extension of the method for security requirements elicitation from business processes (SREBP). The extension applies the enterprise model frame to provide an enterprise architecture context for the analyzed business process models. The enterprise model frame covers practically all concepts of the information-security-related definitions; the use of the frame with the SREBP method complies with common enterprise modeling and enterprise architecture approaches; and its use helps to consider security requirements and controls at the business, application, and technology levels simultaneously.
Nowadays, to stay competitive, businesses must be able to adapt quickly to a rapidly changing market environment and/or exploit new market opportunities. Simulation of business processes makes it possible to analyze various business scenarios under different circumstances, provides an understanding of the most important factors affecting a process, and helps to identify its critical parts. However, traditional business process simulation approaches treat a business process as a pre-defined sequence of activities, limiting the possibilities to adapt a process model to changes occurring in the business environment. In this paper we present a goal-oriented approach to simulating dynamic business processes, where the complete set of process activities as well as their execution sequence are not known in advance. The paper highlights the main principles of the dynamic business process simulation approach, presents a prototype tool implementing the approach, and provides an example demonstrating dynamic business process simulation.
Run-time adaptability is a key feature of dynamic business environments. Accordingly, business processes need to be constantly refined and restructured to deal with exceptional situations and changing requirements. Gaining insight into an existing or proposed future situation therefore requires simulation. Simulation is a cost-effective way to analyse several alternatives so that it becomes apparent which parts are critical. However, the simulation of dynamic business processes (DBP) has not been investigated sufficiently, which prevents its effective use. This paper therefore presents a systematic literature review of conference and journal articles on the topic of DBP simulation. The review has been undertaken to define DBP, identify DBP requirements, and analyse the solutions proposed for their implementation and dynamic simulation.
The paper is devoted to self-management and its implementation. Self-management features are intended to support the usage and maintenance processes in the information systems life cycle. Four self-management types are analyzed in the paper. Run-time verification and environment testing can be implemented without any intervention in the base business processes, while self-testing and business process incorporation into the system require an instrumentation of the base business processes. The approach has been applied in practice and shows that the implementation of self-management features requires relatively modest resources.
Smart spaces define a development approach to the creation of service-oriented information systems for computing environments of the Internet of Things (IoT). Semantic-driven resource sharing is applied to fuse the physical and information worlds based on the methods of ontology modeling. Knowledge from both worlds is selectively encompassed in the smart space to serve users' needs. This paper considers several principles of the smart spaces approach to the semantic-driven design of service-oriented information systems. By applying these principles, information system development achieves such properties as (a) the involvement of many surrounding devices and the user's personal mobile devices in service construction, (b) the use of external Internet services and data sources to enhance the constructed services, and (c) information-driven programming of service construction based on resource sharing. The principles are illustrated with application domains such as collaborative work and cultural heritage environments.
The paper presents algorithms for the automatic detection of non-stationary periods of cardiac rhythm during professional activity. During work and subsequent rest, an operator passes through the phases of mobilization, stabilization, work, recovery, and rest. The amplitude and frequency of non-stationary periods of cardiac rhythm indicate the person's resistance to stressful conditions. We introduce and analyze a number of algorithms for non-stationary phase extraction: different approaches to preliminary phase detection, threshold extraction, and final phase extraction are studied experimentally.
Due to the very significant differences between streams obtained from different persons and the relatively small amount of data, common machine learning techniques do not work well with our data. We therefore had to develop algorithms based on domain-specific high-level properties of the data, adjusting their parameters on the basis of a preliminary analysis of the stream; this makes the algorithms adaptive and thus able to capture the individual features of a person.
These algorithms are based on local extremum computation and on the analysis of linear regression coefficient histograms. The algorithms do not need any labeled datasets for training and can be applied to each person individually. The suggested algorithms were experimentally compared and evaluated by human experts.
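As an illustration of the first ingredient, a local extremum pass over an interval series can be sketched as follows. This is a minimal sketch only: the function name and the strict-neighbour criterion are our assumptions, not the authors' exact algorithm.

```python
def local_extrema(series):
    """Return indices of local maxima and minima in a numeric series.

    A point is a local maximum (minimum) if it is strictly greater
    (smaller) than both of its immediate neighbours; endpoints are
    skipped because they have only one neighbour.
    """
    maxima, minima = [], []
    for i in range(1, len(series) - 1):
        if series[i] > series[i - 1] and series[i] > series[i + 1]:
            maxima.append(i)
        elif series[i] < series[i - 1] and series[i] < series[i + 1]:
            minima.append(i)
    return maxima, minima
```

In practice such a pass would typically be preceded by smoothing, since raw cardiac interval streams are noisy.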
Algorithms for mining frequent itemsets appeared in the early 1990s. The problem has important practical applications, and many new methods for finding frequent itemsets have since been proposed. The number of existing algorithms complicates choosing the optimal algorithm for a given task and dataset. The twelve most widely used algorithms for mining frequent itemsets are analyzed and compared in this article. The authors discuss the capabilities of each algorithm and the features of the classes of algorithms. The results of the empirical research demonstrate that the classes of algorithms behave differently depending on certain characteristics of the datasets.
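For readers unfamiliar with the problem, the classic level-wise (Apriori-style) approach from the early 1990s can be sketched as follows. This is a minimal, unoptimized illustration of the task definition, not the article's benchmark code.

```python
def apriori(transactions, min_support):
    """Find all itemsets whose support (the fraction of transactions
    containing them) is at least min_support, level by level."""
    n = len(transactions)
    transactions = [frozenset(t) for t in transactions]
    # level 1: all single-item candidates
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    while current:
        # count each candidate's occurrences by scanning the transactions
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        survivors = {c for c, k in counts.items() if k / n >= min_support}
        for c in survivors:
            frequent[c] = counts[c] / n
        # join surviving k-itemsets into (k+1)-item candidates
        current = {a | b for a in survivors for b in survivors
                   if len(a | b) == len(a) + 1}
    return frequent
```

The algorithms compared in the article differ mainly in how they avoid the repeated dataset scans and large candidate sets that this naive sketch incurs.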
Existing research in the field of large-scale high-resolution display wall systems has mainly focused on specific needs and environments. In many cases, the solutions built for large-scale high-resolution visualization are only valid for a specific use case (e.g. distributed 3D rendering, presentation of HTML content, stereoscopic projection). However, such solutions can soon become outdated and be left unmaintained (as seen with several distributed OpenGL library implementations) due to the appearance of new technologies that solve the same tasks more effectively. A possible solution is to avoid most limitations on the set of technologies that can be used when software is running on a display wall system. One way of achieving this is virtualization. This chapter presents Infiniviz, a virtual-machine-based high-resolution display wall system. Infiniviz approaches the visualization task in a seamless manner: its main aim is to be able to run any common desktop operating system software on a large-scale high-resolution display wall without any modifications. Infiniviz achieves this by running a headless virtual machine with the required operating system, backed by a custom software stack that handles the actual visualization. The authors have performed performance evaluations, virtualization environment comparisons, and comparisons with other display wall architectures; this work, along with key conclusions, is summarized in this chapter.
With the rise of e-commerce, online consumer reviews have become crucial for consumers' purchasing decisions. Most existing research focuses on the detection of explicit features and sentiments in such reviews, thereby ignoring all that is reviewed implicitly. Extending an existing implicit feature algorithm that can assign only one implicit feature to each sentence, this study builds a classifier that predicts the presence of multiple implicit features in sentences. The classifier makes its prediction based on a custom score function and a trained threshold: only if the score exceeds the threshold do we allow the detection of multiple implicit features. In this way, we increase recall while limiting the decrease in precision. In the more realistic scenario, the classifier-based approach improves the F1-score from 62.9% to 64.5% on a restaurant review data set. The precision of the computed sentiment associated with the detected features is 63.9%.
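The thresholded decision rule can be sketched as follows. This is an illustrative reconstruction only: the feature names and scores are hypothetical, and the paper's actual score function and trained threshold are not reproduced here.

```python
def detect_implicit_features(scores, threshold):
    """Return the detected implicit features for one sentence.

    `scores` maps each candidate implicit feature to a (hypothetical)
    score. The single best-scoring feature is always returned, as in
    the base algorithm; additional features are admitted only when
    their score exceeds the trained threshold, trading a small loss
    in precision for higher recall.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return []
    detected = [ranked[0][0]]  # best feature, as in the one-feature baseline
    detected += [f for f, s in ranked[1:] if s > threshold]
    return detected
```

Raising the threshold recovers the original single-feature behaviour; lowering it admits more features per sentence.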
With the volume of daily news growing too large for any individual human to handle, there is a clear need for effective search algorithms. Because traditional bag-of-words approaches are inherently limited, ignoring much of the information embedded in the structure of the text, in this paper we propose a linguistic approach to search called Destiny. With Destiny, sentences, both from news items and from user queries, are represented as graphs in which the nodes represent the words of the sentence and the edges represent the grammatical relations between the words. The proposed algorithm is evaluated against a TF-IDF baseline using a custom corpus of user-rated sentences. Destiny significantly outperforms TF-IDF in terms of Mean Average Precision, normalized Discounted Cumulative Gain, and Spearman's Rho.
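For reference, a bag-of-words TF-IDF baseline of the kind Destiny is compared against can be sketched as follows. This is a minimal illustration with raw term counts and a plain logarithmic IDF; the exact weighting scheme used in the paper may differ.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn a list of token lists into sparse TF-IDF vectors (dicts)."""
    n = len(docs)
    df = Counter(term for d in docs for term in set(d))   # document frequency
    idf = {t: math.log(n / df[t]) for t in df}
    return [{t: c * idf[t] for t, c in Counter(d).items()} for d in docs]

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

A bag-of-words ranker scores each candidate sentence by its cosine similarity to the query vector; Destiny instead matches the grammatical-relation graphs of query and sentence, which is exactly the structural information this baseline discards.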