Managing Diversity in Knowledge
In the vision of pervasive communications and computing, information and communication technologies seamlessly and invisibly pervade everyday objects and environments, delivering services adapted to the person and the context of use. The communication and computing landscape will sense the physical world via a huge variety of sensors, and control it via a plethora of actuators. Applications and services will therefore increasingly be based on the notions of context and knowledge.
In such foreseeable technology-rich environments, the roles of content providers and content consumers are being reshaped by their immense and unprecedented number, and by the way they generate, preserve, discover, use and abandon information. Pervasive communications call for new architectures based on device autonomy, fragmented connectivity, spatial awareness and data harnessing inside each network node. The realisation of this vision will then depend on the ability to access decentralised data, with demanding performance, scalability and security requirements that cannot be matched by centralised approaches. One of the key research challenges is therefore how to design a distributed data management infrastructure capable of handling very high levels of complexity in the management of distributed, highly heterogeneous data and knowledge sources (as they can be found in the Web), the integration of continuously changing data flows, and, in general, the management of multimedia data (e.g. personal, cultural heritage, education).
Global Data Management is playing a crucial role in the development of our networked distributed society. Its importance has been recognised in the IST programme for several years, in particular in its long-term research part, Future and Emerging Technologies (FET). Many of the papers included in this book refer to IST and FET projects currently running or recently completed. This subject is also one of the focal points identified for long-term FET research in the 7th Framework Programme for Community Research. The basic principles identified in the areas “Pervasive Computing and Communications” and “Managing Diversity in Knowledge” (see http://cordis.europa.eu/ist/fet/), as summarised in this foreword, are very much in line with the goals of this book.
An unforeseen growth in the volume and diversity of data, content and knowledge is taking place all over the globe. Several factors contribute to this growing complexity, among them: Size (the sheer increase in the number of knowledge producers and users, and in their production/use capabilities), Pervasiveness (in space and time, of knowledge, knowledge producers and users), Dynamicity (new and old knowledge items will appear and disappear at virtually any moment), and Unpredictability (the future dynamics of knowledge are unknown not only at design time but also at run time). The situation is made worse by the fact that the complexity of knowledge grows exponentially with the number of interconnected components.
The traditional approach of knowledge management and engineering is top-down and centralised, and depends on fixing at design time what can be expressed and how. The key idea is to design a “general enough” reference representation model. Examples of this top-down approach are the work on (relational) databases, the work on distributed databases, and, lately, the work on information integration (both with databases and ontologies).
There are many reasons why this approach has been, and still largely is, successful. From a technological point of view it is conceptually simple, and it is also the most natural way to extend the technology developed for relational databases and single information systems. From an organisational point of view, this approach satisfies companies' desire to centralise and, consequently, to be in control of their data. Finally, from a cultural point of view, this approach is very much in line with the way knowledge is thought of in western culture and philosophy, and in particular with the basic principle (rooted in ancient Greek philosophy) that it must be possible to say whether a knowledge statement is (universally) true or false. This property is reassuring and also efficient from an organisational point of view, in that it makes it “easy” to decide what is “right” and what is “wrong”.
However, as applications become increasingly open, complex and distributed, the knowledge they contain can no longer be managed in this way, as the requirements are only partially known at design time. The standard solution so far has been to handle the problems which arise during the lifetime of a knowledge system as part of the maintenance process. This, however, comes at a high price because of the increased cost of maintenance (exponentially more complex than the knowledge parts integrated inside it), the decreased lifetime of systems, and the increased load on the users, who must take charge of the complexity which cannot be managed by the system. In several cases this approach has failed simply because people did not come to an agreement on the specifics of the unique global representation.
In pervasive distributed systems, the top-down approach must be combined with a new, bottom-up approach in which the different knowledge parts are designed and kept ‘locally’ and independently, and new knowledge is obtained by adaptation and combination of such items.
The key idea is to make a paradigm shift and to consider diversity as a feature which must be maintained and exploited and not as a defect that must be absorbed in some general schema. People, organisations, communities, populations, cultures build diverse representations of the world for a reason, and this reason lies in the local context, representing a notion of contextual, local knowledge which satisfies, in an optimal way, the (diverse) needs of the knowledge producer and knowledge user.
The bottom-up approach provides a flexible, incremental solution where diverse knowledge parts can be built and used independently, with some degree of complexity arising in their integration.
A second paradigm shift moves from the view where knowledge is mainly assembled by combining basic building blocks to a view where new knowledge is obtained by the design- or run-time adaptation of existing, independently designed, knowledge parts. Knowledge will no longer be produced ab initio, but more and more as adaptations of other, existing knowledge parts, often performed in run-time as a result of a process of evolution. This process will not always be controlled or planned externally but induced by changes perceived in the environment in which systems are embedded.
The challenge is to develop theories, methods, algorithms and tools for harnessing, controlling and using the emergent properties of large, distributed and heterogeneous collections of knowledge, as well as knowledge parts that are created through combination of others. The ability to manage diversity in knowledge will allow the creation of adaptive and, when necessary, self-adaptive knowledge systems.
The complexity in knowledge is a consequence of the complexity resulting from globalisation and the virtualisation of space and time produced by current computing and networking technology, and of the effects that this has on the organisation and social structure of knowledge producers and users. This includes the following focus issues:
• Local vs. global knowledge. The key issue will be to find the right balance and interplay between operations for deriving local knowledge and operations which construct global knowledge.
• Autonomy vs. coordination, namely how peer knowledge producers and users find the right balance between their desired level of autonomy and the need to achieve coordination with others.
• Change and adaptation, developing organisation models which facilitate the combination and coordination of knowledge and which can effectively adapt to unpredictable dynamics.
• Quality, namely how to maintain good enough quality, e.g. through self-certifying algorithms able to demonstrate correct answers (or answers with measurable incorrectness) in the presence of inconsistent, incomplete, or conflicting knowledge components.
• Trust, reputation, and security of knowledge and knowledge communities, for instance as a function of the measured quality; how to guard against deliberate introduction of falsified data.
Europe is very well positioned, given the investment already made in many of these areas. This book represents a further step in the right direction.
Fabrizio Sestini
This text presents solely the opinions of the author, which do not prejudice in any way those of the European Commission.