Grids and P2P systems have emerged as distributed computing paradigms for the development of large-scale distributed applications. With the rapid development of Internet technologies and the continuous improvement of the connected computational resources, Grid and P2P have emerged as disruptive technologies that can greatly impact not only scientific and academic activities but also business and enterprise productivity.
Computational grids were motivated by the need to develop computational frameworks to support large-scale applications that benefit from the large computing potential offered by such distributed infrastructures. As a matter of fact, the first success stories of Grid-enabled applications come from the scientific computing domain. P2P systems, on the other hand, appeared as the new paradigm after client-server and web-based computing. Unlike the centralized or hierarchical models of Grid systems, P2P systems are distinguished by their very large scale and their decentralized, self-organizing nature. Although P2P systems became popular for file sharing, they are rapidly evolving into an important distributed computing paradigm.
In recent years, considerable research effort has been devoted to programming models, middleware and communication libraries for Grid and P2P systems and, as a consequence, important advances have been reported. However, Grid and P2P applications remain difficult for many users to develop in practice, mainly because Grid/P2P middleware provides fundamental services that are often “low level”. Moreover, such middleware is neither standard nor complete with regard to the needs of applications from different domains, and thus requires some ad hoc development. Researchers and developers of Grid and P2P systems are addressing issues that concern both kinds of systems, which is yielding new insights into large-scale distributed application development.
The chapters of this volume present recent advances in Grid and P2P programming models, middleware and communication libraries, as well as their application to solving real-life problems. The book consists of an introductory chapter and eleven chapters selected from nineteen chapter proposals. Chapters were carefully reviewed by the editor and blind reviewers; each chapter received at least two review reports. The chapters of the volume are organized as follows.
Chapter 1 introduces the Grid and P2P paradigms in view of their common and distinguishing features. The most commonly used programming models, middleware and communication libraries are surveyed, with emphasis on their advantages and drawbacks. Some applications of Grid and P2P approaches to the massive processing of regularly sequenced data are also given.
Chapter 2, by Cong-Vinh, addresses the use of categorical structures to establish a formal basis for specifying task-parallel, data-parallel and peer-to-peer structures and self-organization in Large Scale Distributed Networks (LSDNs). The aim of the chapter is to formalize parallel programming in LSDNs, including parallel composition of tasks in LSDNs, category in categories and categorical aspects of self-configuration in P2P networks.
In Chapter 3, Pllana et al. study the use of hybrid models for performance modelling and prediction of large-scale parallel systems. This approach combines mathematical modelling with discrete-event simulation. A high-level performance model is thus obtained, which combines the evaluation speed of mathematical models with the structure awareness and fidelity of the simulation model.
Chapter 4, by Pujol Ahulló and García López, compares recent work on the family of similarity queries, including range queries, k-nearest neighbour queries and spatial queries, as richer queries for distributed systems based on peer-to-peer networks. The authors identify a set of evaluation parameters and use them to compare different systems.
Genaud and Rattanapoka, in the fifth chapter, present P2P-MPI, a peer-to-peer framework for message-passing parallel programs in large-scale distributed systems. It uses the MPJ (Message Passing for Java) communication library. P2P-MPI is intended as a lightweight, self-contained software package that facilitates the development and maintenance of parallel programs with minimal effort for users. Experimental results on allocation and performance, fault tolerance and replicas are presented using the Grid5000 testbed.
In the sixth chapter, Quan and Tang report parallel implementations for mapping Service Level Agreement (SLA)-based workflows onto Grid resources. The proposed approach aims at overcoming the limitations of the mapping module, which can become the bottleneck of the system when many requests are to be processed during a short period of time. Parallelized mapping algorithms that increase the capability of the SLA workflow broker are presented, and their performance is experimentally evaluated.
The seventh chapter, by Gounaris et al., considers parallel query processing on the Grid, together with the research issues and challenges that arise in a Grid setting. The authors analyze Grid-oriented and/or service-based query processors and how they differ from traditional ones. The chapter focuses on scheduling parallel database queries over non-dedicated, distributed resources, and on alleviating the impact of increased data transfer cost and of load balancing in the Grid setting.
In the eighth chapter, Hellinckx et al. deal with runtime prediction in desktop Grids and its application to the prediction-aware mapping of jobs onto available resources, with the aim of reducing the number of jobs prematurely interrupted due to insufficient resource availability. A framework for efficiently executing parameter sweep applications, based on both runtime prediction modelling techniques and resource availability, is introduced. The prediction-based scheduling is experimentally evaluated both by simulation and by real-time scheduling, and the results are compared.
The ninth chapter, by Mateos et al., presents an approach to just-in-time gridification of conventional Java applications. Gridification methods aim to facilitate running Grid applications by semi-automatically deriving the Grid-enabled version of a user application from its binary code. The authors propose BYG, a gridification method for binary Java applications. The feasibility of the approach is shown experimentally.
Choy et al., in the tenth chapter, exploit large-scale distributed computing for solving linear algebra problems such as large eigenvalue problems, and report interesting real-life experiences in solving such problems. Several linear algebra methods are studied, and parallelisation paradigms such as parametric parallelism are considered for solving them efficiently. Real-size problem instances are solved using the proposed approach on a world-wide platform using the XtremeWeb P2P middleware and the OmniRPC Grid middleware.
Chapter eleven, by Muñoz-Marí, presents parallel implementations of Support Vector Machines (SVMs) for earth observation problems (hyperspectral remote sensing). SVMs have shown their efficiency for hyperspectral image classification but suffer from high computational cost. The authors present two parallel versions of SVMs for remote sensing image classification, and report experimental results for the parallel implementation of SVMs via Cholesky factorization on a complex multispectral image classification problem.
In the last chapter, Tantar et al. propose the use of landscape analysis in Grid-enabled meta-heuristics (hierarchical and multistage distributed evolutionary algorithms). Local search algorithms and parallel versions developed in the ParadisEO framework are considered and analyzed. The parallelization model is based on a synchronous multi-start approach. The Protein Structure Prediction conformational sampling problem, which is known for its high computational cost, is considered as a case study. More than 500 MB of raw data were processed for the analysis of the local search algorithms.
By approaching the Grid and P2P paradigms in an integrated and comprehensive way, we believe that this book will serve as a reference for researchers and developers in the Grid and P2P computing community.
I would like to thank the authors of this volume for their contributions and the reviewers for their careful reviewing and interesting feedback to the chapter authors. I am very grateful to Prof. Dr. Gerhard Joubert, Series Editor of “Advances in Parallel Computing” of IOS Press, for his continuous and timely support of this book project, and to Ms. Anne Marie de Rover (Head of Book Department, IOS Press) for her editorial assistance.
Fatos Xhafa
Department of Languages and Informatic Systems
Technical University of Catalonia, Spain
March 2009, Barcelona, Spain