
Databases and Information Systems V

The Eighth International Baltic Conference on Databases and Information Systems (Baltic DB&IS'2008) took place on June 2–5, 2008 in Tallinn, Estonia. This conference continued a series of successful biennial Baltic conferences on databases and information systems (IS), which have been held in Trakai (1994), Tallinn (1996, 2002), Riga (1998, 2004), and Vilnius (2000, 2006).
The conference was organised by the Institute of Cybernetics at Tallinn University of Technology and the Department of Computer Engineering of Tallinn University of Technology in co-operation with the Estonian Informatics Centre (Ministry of Economic Affairs and Communications of Estonia).
The aim of the Baltic DB&IS series of conferences is to provide a wide international forum for academics and practitioners in the field of databases and modern information systems to exchange their achievements in this area. The objective of the conference is to bring together researchers, practitioners and PhD students to present their work, exchange ideas, and trigger co-operation.
The International Programme Committee had representatives from 22 countries all over the world. The committee received 43 submissions from 12 countries. Each conference paper was reviewed by three referees from different countries. As a result, 29 regular papers were accepted for presentation at the conference. From the presented papers, the 22 best were selected for inclusion in this volume.
The original research results presented in the conference papers mostly belong to novel fields of IS and database research, such as database technology and the semantic web, ontology-based IS, IS and AI technologies, and IS integration. The invited talk by Dr. Jari Palomäki showed how different ontological commitments affect the way we model the world when creating an information system.
As semantic technologies have been gaining more attention recently, a special session on semantic interoperability of IS was organised. The invited talks from each Baltic State gave good insight into how semantic interoperability initiatives are developing in each of the Baltic States and how they relate to the European semantic interoperability framework. Two of these papers are included in this book.
Finally, we would like to thank the authors for their contributions and the invited speakers for sharing their views with us. We are very grateful to the members of the Programme Committee and the additional referees for carefully reviewing the submissions.
We wish to thank the entire organising team and our sponsors, who have made this conference and these proceedings possible. We express our special thanks to Mr. T. Robal for the technical editing of the proceedings. Last, but not least, we thank all the participants, who truly made the conference.
September 2008
Hele-Mai Haav, Ahto Kalja
In this paper the role of ontology is considered in the context of information systems and conceptual modelling. First, we describe information systems and conceptual modelling without using the word “ontology”. Then two senses of ontology are presented: the philosophical and the knowledge representation view. In this connection, both Gruber's and Guarino's definitions of “ontology” are scrutinized. It is proposed that the kind of terminological shifts concerning the word “ontology” practiced, e.g., by Gruber and Guarino have caused more confusion than clarification in the field of information systems and conceptual modelling. Instead of giving a new or better definition of our own, our view is that the word “ontology” should be used in its traditional philosophical sense only, whereas our aim in the field of information systems and conceptual modelling is to reach conceptual clarity.
We propose a label-based technique for the effective management of structural locks in XML databases. In particular, we consider two types of such locks, acquired by transactions on sets of nodes. These are path locks and range locks – the latter understood as a subset of the document's nodes defined by a pair of labels, with the sub-tree lock being a special case of a range lock. We put forward algorithms for the consolidation of shared locks that define a continuous covering, an initial processing step that simplifies and accelerates the lock manager. Two algorithms for the lock manager are framed with the use of XML labels, for both the 2PL protocol and the cooperative one. We also point to some interesting ties between lock management and the shortest path problem. Our motivation is given in the context of the SEDNA native XML DB isolation rules; however, our technique is useful for most fine-granular lock protocols.
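To make the range-lock idea concrete, here is a minimal Python sketch, assuming interval-style node labels so that a range lock is an interval over a total order (a sub-tree lock then spans a node and its descendants); the interval representation and lock modes are illustrative assumptions, not SEDNA's actual manager.

```python
# Range locks as intervals [lo, hi] over totally ordered node labels.
# A sketch of conflict detection and shared-lock consolidation only.

def conflicts(lock_a, lock_b):
    """Two locks conflict if their intervals overlap and either is exclusive."""
    (lo_a, hi_a, mode_a), (lo_b, hi_b, mode_b) = lock_a, lock_b
    overlap = lo_a <= hi_b and lo_b <= hi_a
    return overlap and ("X" in (mode_a, mode_b))

def consolidate_shared(locks):
    """Merge overlapping or adjacent shared locks into a continuous
    covering, shrinking the lock table the manager has to scan."""
    shared = sorted((lo, hi) for lo, hi, mode in locks if mode == "S")
    merged = []
    for lo, hi in shared:
        if merged and lo <= merged[-1][1] + 1:      # touches previous range
            merged[-1][1] = max(merged[-1][1], hi)  # extend the covering
        else:
            merged.append([lo, hi])
    return [(lo, hi, "S") for lo, hi in merged]

# Three shared range locks collapse into one continuous covering.
locks = [(1, 4, "S"), (3, 9, "S"), (10, 12, "S"), (20, 25, "X")]
print(consolidate_shared(locks))               # [(1, 12, 'S')]
print(conflicts((5, 6, "S"), (20, 25, "X")))   # False: intervals disjoint
```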
We explore existing methods to summarize XML data based on document levels, document paths, and element distribution. Using empirical data, we cross-compare them according to the size and time needed for summary representation and estimate computation and, most importantly, the estimation accuracy achieved.
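As a rough illustration of one family of summaries such a comparison covers, the following sketch builds a toy path-count synopsis and uses it to estimate the result size of a simple path query; the element names and synopsis format are invented for the example.

```python
# A toy path-count synopsis: count every root-to-node path once, then
# estimate the cardinality of a simple path query by direct lookup.
from collections import Counter
import xml.etree.ElementTree as ET

def build_path_synopsis(root):
    counts = Counter()
    def walk(node, prefix):
        path = prefix + "/" + node.tag
        counts[path] += 1
        for child in node:
            walk(child, path)
    walk(root, "")
    return counts

doc = ET.fromstring(
    "<lib><book><title/><author/></book><book><title/></book></lib>")
synopsis = build_path_synopsis(doc)
# Estimated result size of the query /lib/book/title:
print(synopsis["/lib/book/title"])   # 2
```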
Websites employ different structures to provide information to visitors. Nevertheless, users browse the web according to their information needs and their comprehension of the sites, based on the conceptual models in their minds. This provides a good basis for evaluating web-based systems. Herein we propose a framework for web system evaluation on the basis of users' actions. Exploiting a special log system and techniques of web usage mining, typical navigational patterns are recognized. As a result, measures for system evaluation are proposed and supported with experimental data.
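A hedged sketch of the pattern-recognition step: given sessions reconstructed from a click log, count frequent consecutive page transitions as a simple stand-in for the navigational patterns such a framework recognizes (the session data and support threshold are illustrative).

```python
# Mine frequent consecutive page transitions from browsing sessions.
from collections import Counter

def frequent_transitions(sessions, min_support=2):
    counts = Counter()
    for pages in sessions:
        counts.update(zip(pages, pages[1:]))   # consecutive page pairs
    return {t: n for t, n in counts.items() if n >= min_support}

sessions = [["home", "products", "cart"],
            ["home", "products", "faq"],
            ["home", "about"]]
print(frequent_transitions(sessions))
# {('home', 'products'): 2}
```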
We explore the possibility of improving a typical learning management system (LMS) by integrating modern achievements in the fields of the semantic web and, especially, ontology engineering. Ontologies owe their popularity to their ability 1) to implement reuse at the knowledge level and 2) to enhance information retrieval processes. Automated information retrieval makes it possible to increase the effectiveness of human-computer interaction, which is also important in e-Learning. In this article, we analyse the possibilities and types of reasoning over different ontology elements. Then we explore a theoretically proposed framework for the conceptual linking of educational resources, which is intended to support learners' navigation in a Distance Study Course (DSC). We demonstrate the application of the proposed framework by means of a designed scenario with a real domain ontology and concrete pedagogical goals in mind: to automatically form a dynamic navigational menu in order to foster predictable learning paths.
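The following is only a toy illustration of how such a dynamic menu could be derived (the ontology, resource annotations, and traversal policy are all invented, not the paper's framework): resources annotated with ontology concepts are ordered by following prerequisite relations towards the goal concept.

```python
# Derive a navigational menu from concept prerequisites (invented data).
ontology = {                      # concept -> prerequisite concepts
    "SQL": ["RelationalModel"],
    "RelationalModel": ["SetTheory"],
    "SetTheory": [],
}
resources = {                     # concept -> learning resources
    "SQL": ["lecture-sql.pdf"],
    "RelationalModel": ["lecture-rm.pdf"],
    "SetTheory": ["primer-sets.pdf"],
}

def navigation_menu(goal):
    """Walk from the goal concept down to its prerequisites, emitting
    resources in the order a learner could study them."""
    ordered = []
    def visit(concept):
        for prereq in ontology.get(concept, []):
            visit(prereq)
        ordered.extend(resources.get(concept, []))
    visit(goal)
    return ordered

print(navigation_menu("SQL"))
# ['primer-sets.pdf', 'lecture-rm.pdf', 'lecture-sql.pdf']
```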
The paper presents a plagiarism detection tool for processing template-based documents, specifying its conception, the method chosen for plagiarism detection, the process of checking the originality of a submitted student's work, and the tool's users and requirements. The logical level of the architecture is given as well. The software needed for the implementation and use of the tool is described.
Business requirements are essential for developing any information system, including a data warehouse. However, requirements written in natural language may be ambiguous and imprecise. This paper offers a business requirement formalization metamodel, which defines business requirements according to a certain pattern specific to data warehouse systems. We also propose a method to create a conceptual model of a data warehouse based on knowledge about the requirements and the conceptual models of previously developed projects. The method is extended with an evaluation of possible ways of selecting the conceptual models of existing data warehouse projects.
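As a loose illustration of the underlying idea (the pattern structure and scoring are assumptions for this sketch, not the paper's metamodel), a requirement formalized as a set of measures and analysis dimensions can be matched against the conceptual models of earlier projects:

```python
# Rank past conceptual models by overlap with a formalized requirement.
from dataclasses import dataclass, field

@dataclass
class Requirement:
    measures: set = field(default_factory=set)
    dimensions: set = field(default_factory=set)

past_models = {
    "sales_dw":  Requirement({"revenue"}, {"time", "product", "store"}),
    "logistics": Requirement({"delivery_time"}, {"time", "route"}),
}

def rank_models(req):
    def score(model):
        return (len(req.measures & model.measures)
                + len(req.dimensions & model.dimensions))
    return sorted(past_models, key=lambda name: score(past_models[name]),
                  reverse=True)

req = Requirement({"revenue"}, {"time", "product"})
print(rank_models(req))   # ['sales_dw', 'logistics']
```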
Schemata of data warehouses often need to be adapted because of evolving business requirements or changes in data sources. To accumulate the history of schemata and data, it is possible to maintain multiple versions of data warehouse schemata. We propose a formal model to store data about data warehouse logical and physical schemata and their versions. For each modification of a data warehouse schema, we outline the changes that need to be made to the formal model. We present a data warehouse framework that is able to track the evolution process and adapt data warehouse schemata and the data extraction, transformation and loading (ETL) processes.
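A minimal sketch of the version-tracking idea, assuming each schema modification is recorded as a change entry so that every schema version remains reconstructable; the change vocabulary is invented for illustration.

```python
# Keep every schema version by applying recorded change entries.
import copy

class SchemaVersions:
    def __init__(self, initial):
        self.versions = [copy.deepcopy(initial)]    # version 0

    def apply(self, change):
        schema = copy.deepcopy(self.versions[-1])
        kind, table, column = change
        if kind == "add_column":
            schema[table].append(column)
        elif kind == "drop_column":
            schema[table].remove(column)
        self.versions.append(schema)                # new version

vs = SchemaVersions({"sales": ["date", "amount"]})
vs.apply(("add_column", "sales", "store_id"))
vs.apply(("drop_column", "sales", "amount"))
print(vs.versions[0], vs.versions[-1])
# {'sales': ['date', 'amount']} {'sales': ['date', 'store_id']}
```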
In this paper we show how semantic web technologies are used in a real application in the domain of national medical databases, where an important technological gap between legacy relational databases and OWL ontologies is bridged by the recently standardized UML profile for OWL. After data has been exported from multiple relational databases into a single shared RDF database structured according to an integrated OWL ontology, a need emerges for convenient end-user query tools. We describe fully graphical access to the exported data through a graphical front-end tool (named the “DEMO” tool) based on the UML profile for OWL and the SPARQL query language.
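For flavour, here is the kind of SPARQL query such a graphical front-end could generate behind the scenes, issued here with the rdflib Python library; the ontology namespace, class and property names, and the exported file are hypothetical, not the actual DEMO tool's schema.

```python
# Query a hypothetical RDF export of medical data with SPARQL via rdflib.
from rdflib import Graph

g = Graph()
g.parse("medical_export.ttl", format="turtle")   # hypothetical RDF export

query = """
PREFIX med: <http://example.org/medical#>
SELECT ?patient ?diagnosis
WHERE {
    ?patient a med:Patient ;
             med:hasDiagnosis ?diagnosis .
}
"""
for patient, diagnosis in g.query(query):
    print(patient, diagnosis)
```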
E-services are built upon the exchange of information between several databases and institutions. The creation process of new e-services may not be efficient, because the discovery, integration and reuse of appropriate information resources is time-consuming, as the meaning of these information resources is not well enough documented. In order to describe the meaning of information resources – poured into hundreds of heterogeneous databases distributed between several institutions – some coordination, principles, architecture and infrastructure are needed. This paper outlines the proposed large-scale semantic interoperability architecture and its pilot implementation in the Estonian public sector's semantic interoperability initiative.
Spiking neural networks (SNNs) are more powerful than their non-spiking predecessors, as they can encode temporal information in their signals, but they also need different and biologically more plausible rules for synaptic plasticity. In this paper, an unsupervised learning algorithm is introduced for a spiking neural network. The algorithm is based on the Hebbian rule, with the addition of the principle of adapting each layer's activation level so as to guarantee frequent firing of neurons in each layer. The proposed algorithm is illustrated with experimental results related to the detection of temporal patterns in an input stream.
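A simplified sketch of the two ingredients the abstract names, in a rate-based approximation rather than a true spiking simulation: a Hebbian weight update plus adaptation of a per-layer activation threshold that keeps neurons firing frequently. All constants and the simplification are assumptions of this sketch.

```python
# Hebbian learning with layer-level threshold adaptation (toy version).
import numpy as np

rng = np.random.default_rng(0)
weights = rng.uniform(0.0, 0.5, size=(8, 4))   # 8 inputs -> 4 neurons
threshold = 1.0                                # layer activation level
target_rate = 0.5                              # desired fraction of firings
eta, gamma = 0.05, 0.1                         # learning / adaptation rates
rates = []

for _ in range(200):
    x = (rng.random(8) < 0.3).astype(float)    # random input spike vector
    fired = (x @ weights >= threshold).astype(float)
    weights += eta * np.outer(x, fired)        # Hebb: co-activity grows
    # Layer adaptation: lower the threshold when firing is too rare,
    # raise it when firing is too frequent.
    threshold += gamma * (fired.mean() - target_rate)
    rates.append(fired.mean())

print("threshold:", round(threshold, 2),
      "recent firing rate:", round(float(np.mean(rates[-50:])), 2))
```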
Testing is a process which can be viewed as a complex system. Techniques for reducing this level of complexity can be drawn from ways of establishing multi-agent systems that have been discovered in the area of artificial intelligence. The idea is that testing processes should, at least in part, be viewed as multi-agent systems. Different paradigms for organising testing processes become possible if many primitive agents are established and if operations and tasks are broken up into primitive units. This makes it possible to use available resources more effectively. The principles which are presented in this paper can gradually develop testing processes into more effective operations. We particularly focus on the organisational aspects of the testing process.
Nowadays, to be competitive, universities as well as other organizations are forced to improve the effectiveness and efficiency of their services. The implementation of different e-initiatives is a typical solution for taking advantage of ICT to reduce an institution's operating costs, to increase responsiveness to customers, etc. This paper describes a conceptual approach and technical solutions for an e-University initiative. The proposed approach makes it possible to set up measurable business goals, to measure business processes in the context of the defined goals, and to search for business process improvements. Two levels of measurement of the systems that support business processes are also considered.
The paper addresses the problem in software operation whereby maintenance tasks may lead to software failures. A model for automated checking of the software execution environment is proposed. The proposed solution is to develop a “profile” for each deployable item, containing information about the software's requirements regarding its execution. The profile document is added to the software deliverables together with a set of tools capable of validating the adequacy of the execution environment according to the document. Regular checks of the execution environment may be performed during system operation. The paper discusses the first practical results of the proposed approach.
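A minimal sketch of the profile idea, with an invented profile format: the deliverable ships a description of its execution requirements, and a bundled checker validates the environment before and during operation.

```python
# Validate the execution environment against a shipped profile (invented).
import shutil, sys

profile = {                      # would normally be read from the profile
    "min_python": (3, 8),        # document shipped with the deliverable
    "required_tools": ["java"],
    "min_free_disk_mb": 100,
}

def check_environment(p):
    problems = []
    if sys.version_info[:2] < p["min_python"]:
        problems.append("Python too old")
    for tool in p["required_tools"]:
        if shutil.which(tool) is None:
            problems.append(f"missing tool: {tool}")
    free_mb = shutil.disk_usage(".").free // (1024 * 1024)
    if free_mb < p["min_free_disk_mb"]:
        problems.append("not enough free disk space")
    return problems

print(check_environment(profile) or "environment OK")
```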
Computer auditing involves the manual collection and review of independent log files that may be completely different in format and contents, yet conceptually related, as they could be parts of the picture of a single incident. The aim of this paper is to investigate the feasibility of heterogeneous log integration via the use of Log Management Information Bases, i.e. metadata descriptions of each proprietary log that would allow for log ‘unification’ in a standardised language such as XML, and their subsequent correlation in a software module that could be either an independent application or an add-on module of generalised audit software.
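An illustrative sketch of the ‘unification’ step (the metadata format here is invented): a tiny per-log description says how to parse a proprietary line, and every record is re-emitted as a common XML element ready for correlation.

```python
# Unify heterogeneous log lines into common XML via per-log metadata.
import xml.etree.ElementTree as ET

log_mib = {
    "firewall":  {"sep": "|", "fields": ["time", "src", "action"]},
    "webserver": {"sep": " ", "fields": ["src", "time", "action"]},
}

def unify(log_name, line):
    mib = log_mib[log_name]
    values = line.strip().split(mib["sep"])
    record = ET.Element("event", source=log_name)
    for field, value in zip(mib["fields"], values):
        ET.SubElement(record, field).text = value
    return ET.tostring(record, encoding="unicode")

print(unify("firewall", "12:00:01|10.0.0.5|DENY"))
# <event source="firewall"><time>12:00:01</time>...</event>
```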
The paper is dedicated to the principles of smart technologies, as well as the advantages of and restrictions on their usage. Self-testing, as a feature of smart technologies, is analyzed more closely. Self-testing is carried out using complete test cases and a built-in test execution mechanism (self-testing mode). The self-testing feature enables testing of software during the whole life cycle, and especially in the maintenance phase, since this feature operates in both testing and production environments. The paper also contains an experience-based report on the economic aspects of using smart technologies, and especially self-testing.
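A toy sketch of a built-in self-testing mode (the structure is an illustration, not the paper's mechanism): the software carries its complete test cases and can replay them on demand in the production environment.

```python
# A deliverable that ships its own test cases and a self-test runner.
def add_vat(price, rate=0.2):
    return round(price * (1 + rate), 2)

SELF_TESTS = [              # complete test cases shipped with the software
    (add_vat, (100,), {}, 120.0),
    (add_vat, (10, 0.1), {}, 11.0),
]

def self_test():
    failures = []
    for func, args, kwargs, expected in SELF_TESTS:
        actual = func(*args, **kwargs)
        if actual != expected:
            failures.append((func.__name__, args, expected, actual))
    return failures

if __name__ == "__main__":
    print(self_test() or "self-test passed")   # runnable in production too
```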
The design and evolution of modern information systems is influenced by many factors: technical, organizational, social, and psychological. This is especially true for open source software systems (OSSS), where many developers from different backgrounds interact, share their ideas and contribute towards the development and improvement of a software product. The evolution of an OSSS is a continuous process of source code development, adaptation, improvement and maintenance. Studying changes to the various characteristics of source code can help us understand the evolution of a software system. In this paper, the software evolution process is analyzed using the proposed Evolution curve (E-curve) method, which is based on information-theoretic metrics of source code. The method makes it possible to identify the major evolution stages and transition points of the analyzed software system. The application of E-curves is demonstrated for the eMule system.
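A hedged sketch of one possible information-theoretic source metric: the Shannon entropy of the identifier distribution per version. Plotted over successive releases, such values give an evolution curve in the spirit of the E-curve; the paper's exact metrics are not reproduced here.

```python
# Shannon entropy of the identifier distribution in a source snapshot.
import math, re
from collections import Counter

def source_entropy(source_code):
    tokens = re.findall(r"[A-Za-z_]\w*", source_code)
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

v1 = "int add(int a, int b) { return a + b; }"
v2 = ("int add(int a, int b) { return a + b; } "
      "int sub(int a, int b) { return a - b; }")
for i, src in enumerate([v1, v2], start=1):
    print(f"version {i}: entropy = {source_entropy(src):.3f}")
```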