Ebook: Communicating Process Architectures 2007
This publication deals with Computer Science and models of Concurrency. It particularly emphasises on hardware/software co-design, and the understanding of concurrency that results from these systems. A range of papers on this topic have been included, from the formal modeling of buses in co-design systems through to software simulation and development environments. The book includes a contribution by Professor Sir Tony Hoare (FRS), the founding father of the theoretical basis upon which much of the work in this series is based. He shares new thoughts on fine-grained concurrency. Another important contribution is by Professor David May (FRS) on his new architecture for massively multicore processors, its underlying programming model and applications. The editors trust you will find this publication informative and inspirational.
The University of Surrey is delighted to host the Communicating Process Architectures 2007 conference. There are many reasons why this University, and Guildford in particular, is appropriate for a conference about Computer Science and models of Concurrency. Not least is the connection with one of the most famous Computer Scientists of them all: Alan Turing, who grew up only a short distance from the site of the University. A statue of him erected in the main plaza overlooks the conference venue, serving as an inspiration and reminder to the strong theoretical and mathematical basis of the topic of this conference. Delegates may also have paused for enlightenment at the village of Occam (now spelt Ockham), famously the birthplace of William of Occam, as they approached Guildford.
This is the 30th meeting of this conference series. The first was a single day workshop, organised by Inmos, and took place in Bristol in 1985. With the success of the Transputer, the meeting grew into an international conference series, with the proceedings formally published by IOS press since March 1988. The fact that the conference series is growing in strength as technology evolves shows that its founding ideas still push the boundaries of Computer Science and are as relevant as ever.
Since inception, the CPA conference series has always had strong interest from industry, and this year is no exception with the conference being co-sponsored by AWE UK. This year, there is a particular emphasis on hardware/software co-design, and the understanding of concurrency that results from these systems. A range of papers on this topic have been included, from the formal modeling of buses in co-design systems through to software simulation and development environments.
Industrial relevance is further reflected in the achievements of this year's invited speakers. Professor Sir Tony Hoare, FRS, is the founding father of the theoretical basis upon which much of the work in this series is based. The organisers are delighted that he has accepted the invitation to address the conference on his new thoughts on fine-grained concurrency. Professor David May, FRS, has been one of the leading lights of this, and other, communities for many years. He was chief architect for the Transputer and the occam programming language. The organisers are also delighted that he has accepted the invitation to address the conference on his latest work on communicating process architecture for massively multicore processors and how to program them.
We hope you will find the meeting exciting, invigorating, and motivating. We trust you will find the published proceedings informative and inspirational – and the informal Fringe presentations fun and thought provoking. This year, we have published abstracts of several fringe presentations that were offered in advance of the conference.
Finally, the editors would like to thank the Programme Committee, friends of the conference, and reviewers, for all their diligent hard work in reviewing papers, the staff of the University of Surrey – especially Sophie Gautier-O'Shea – for their assistance in organizing the event, and the Systems Assurance Group at the AWE for all their support.
Alistair McEwan (University of Surrey), Steve Schneider (University of Surrey), Peter Welch (University of Kent), Wilson Ifill (AWE UK)
I have been interested in concurrent programming since about 1963, when its associated problems contributed to the failure of the largest software project that I have managed. When I moved to an academic career in 1968, I hoped that I could find a solution to the problems by my research. Quite quickly I decided to concentrate on coarse-grained concurrency, which does not allow concurrent processes to share main memory. The only interaction between processes is confined to explicit input and output commands. This simplification led eventually to the exploration of the theory of Communicating Sequential Processes.
Since joining Microsoft Research in 1999, I have plucked up courage at last to look at fine-grain concurrency, involving threads which interleave their access to main memory at the fine granularity of single instruction execution. By combining the merits of a number of different theories of concurrency, one can paint a relatively simple picture of a theory for the correct design of concurrent systems. Indeed, pictures are a great help in conveying the basic understanding. This paper presents some on-going directions of research that I have been pursuing with colleagues in Cambridge – both at Microsoft Research and in the University Computing Laboratory.
Communicating process architecture can be used to build efficient multicore chips scaling to hundreds of processors. Concurrent processing, communications and input-output are supported directly by the instruction set of the cores and by the protocol used in the on-chip interconnect. Concurrent programs are compiled directly to the chip exploiting novel compiler optimisations. The architecture supports a variety of programming techniques, ranging from statically configured process networks to dynamic reconfiguration and mobile processes.
We have recently constructed a model, and carried out an analysis, of a concurrent extension to an object-oriented language at a level of abstraction above threads. The model was constructed in CSP. We subsequently found that existing CSP tools were unsuitable for reasoning about and analysing this model, so it became necessary to create a new tool to handle CSP models: CSPsim. We describe this tool, its capabilities and algorithms, and compare it with the related tools, FDR2 and ProBE. We illustrate CSPsim's usage with examples from the model. The tool's on-the-fly construction of successor states is important for exhaustive and non-exhaustive state exploration. Thus we found CSPsim to be particularly useful for parallel compositions of components with infinite states that reduce to finite-state systems.
Aldwych is a general purpose programming language which we have developed in order to provide a mechanism for practical programming which can be thought of in an inherently concurrent way. We have described Aldwych elsewhere in terms of a translation to a concurrent logic language. However, it would be more accurate to describe it as translating to a simple operational language which, while able to be represented in a logic-programming like syntax, has lost much of the baggage associated with “logic programming”. This language is only a little more complex than foundational calculi such as the pi-calculus. Its key feature is that all variables are moded with a single producer, and some are linear allowing a reversal of polarity and hence interactive communication.
The ProB model checker provides tool support for an integrated formal specification approach,combiningthe classical state-based B language with the event-based process algebra CSP. In this paper, we present a developing strategy for implementing such a combined ProB specification as a concurrent Java program. A Java implementationof the combinedB and CSP model has been developedusing a similar approach to JCSP. A set of translation rules relates the formal model to its Java implementation, and we also provide a translation tool JCSProB to automatically generate a Java program from a ProB specification. To demonstrate and exercise the tool, several B/CSP models, varying both in syntactic structure and behavioural/concurrency properties, are translated by the tool. The models manifest the presence and absence of various safety, deadlock, and bounded fairness properties; the generated Java code is shown to faithfully reproducethem. Run-time safety and boundedfairness checking is also demonstrated. The Java programs are discussed to demonstrate our implementation of the abstract B/CSP concurrencymodel in Java. In conclusion we consider the effectiveness and generality of the implementation strategy.
Component-based software engineering is becoming an important approach for system development. A crucial issue is to fill the gap between high-level models, needed for design and verification, and implementation. This paper introduces first a component model with explicit protocols based on symbolic transition systems. It then presents a Java implementation for it that relies on a rendezvous mechanism to synchronize events between component protocols. This paper shows how to get a correct implementation of a complex rendezvous in presence of full data types, guarded transitions and, possibly, guarded receipts.
Honeysuckle is a language in which to describe systems with prioritized service architecture (PSA), whereby processes communicate values and (mobile) objects deadlock-free under client-server protocol. A novel syntax for the description of service (rather than process) composition is presented and the relation to implementation discussed. In particular, the proper separation of design and implementation becomes possible, allowing independent abstraction and verification.
This paper deals with the applicability of CSP in general and SystemCSP, as a notation and design methodology based on CSP, in particular in the application area of real-time systems. The paper extends SystemCSP by introducing time-related operators as a way to specify time properties. Since SystemCSP aims to be used in practice of real-time systems development, achieving real-time in practice is also addressed. The mismatch between the classical scheduling theories and CSP paradigm is explored. Some practical ways to deal with this mismatch are presented.
The testing of systems using tools such as JUnit is well known to the sequential programming community. It is perhaps less well known to the parallel computing community because it relies on systems terminating so that system outputs can be compared with expected outputs. A highly parallel architecture is described that allows the JUnit testing of non-terminating MIMD process based parallel systems. The architecture is then extended to permit the sampling of a continuously running system. It is shown that this can be achieved using a small number of additional components that can be easily modified to suit a particular sampling situation. The system architectures are presented using a Groovy implementation of the JCSP and JUnit packages.
The original package developed for network mobility in JCSP, although useful, revealed some limitations in the underlying models permitting code mobility and channel migration. In this paper, we discuss these limitations, as well as describe the new models developed to overcome them. The models are based on two different approaches to mobility in other fields, mobile agents and Mobile IP, providing strong underlying constructs to build upon, permitting process and channel mobility in networked JCSP systems in a manner that is transparent to the user.
The advent of mass-market multicore processors provides exciting new opportunities for parallelism on the desktop. The original C++CSP – a library providing concurrency in C++ – used only user-threads, which would have prevented it taking advantage of this parallelism. This paper details the development of C++CSP2, which has been built around a many-to-many threading model that mixes user-threads and kernel-threads, providing maximum flexibility in taking advantage of multicore and multi-processormachines. New and existing algorithms are described for dealing with the run-queue and implementing channels, barriers and mutexes. The latter two are benchmarked to motivate the choice of algorithm. Most of these algorithms are based on the use of atomic instructions, to gain maximal speed and efficiency. Other issues related to the new design and related to implementing concurrency in a language like C++ that has no direct support for it, are also described. The C++CSP2 library will be publicly released under the LGPL before CPA 2007.
SystemCSP is a graphical design specification language aimed to serve as a basis for the specification of formally verifiable component-based designs. This paper defines a mapping from SystemCSP designs to a software implementation. The possibility to reuse existing practical implementations was analyzed. Comparison is given for different types of execution engines usable in implementing concurrent systems. The main part of the text introduces and explains the design principles behind the software implementation. A synchronization mechanism is introduced that can handle CSP kind of events with event ends possibly scattered on different nodes and OS threads, and with any number of participating event ends, possibly guarded by alternative constructs.
The Python programming language is effective for rapidly specifying programs and experimenting with them. It is increasingly being used in computational sciences, and in teaching computer science. CSP is effective for describing concurrency. It has become especially relevant with the emergence of commodity multi-core architectures. We are interested in exploring how a combination of Python and CSP can benefit both the computational sciences and the hands-on teaching of distributed and parallel computing in computer science. To make this possible, we have developed PyCSP, a CSP library for Python. PyCSP presently supports the core CSP abstractions. We introduce the PyCSP library, its implementation, a few performance benchmarks, and show example code using PyCSP. An early prototype of PyCSP has been used in this year's Extreme Multiprogramming Class at the CS department, university of Copenhagen with promising results.
A fine-grained massively-parallel process-oriented model of platelets (potentially artificial) within a blood vessel is presented. This is a CSP inspired design, expressed and implemented using the occam-pi language. It is part of the TUNA pilot study on nanite assemblers at the universities of York, Surrey and Kent. The aim for this model is to engineeremergentbehaviourfromthe platelets, such that they respond to a wound in the blood vessel wall in a way similar to that found in the human body – i.e. the formation of clots to stem blood flow from the wound and facilitate healing. An architecture for a three dimensional model (relying strongly on the dynamic and mobilecapabilitiesof occam-pi)is given,alongwith mechanismsforvisualisation and interaction. The biological accuracy of the current model is very approximate. However, its process-oriented nature enables simple refinement (through the addition of processes modelling different stimulants/inhibitors of the clotting reaction, different platelet types and other participating organelles) to greater and greater realism. Even with the current system, simple experiments are possible and have scientific interest (e.g. the effect of platelet density on the success of the clotting mechanism in stemming blood flow: too high or too low and the process fails). General principles for the design of large and complex system models are drawn. The described case study runs to millions of processes engaged in ever-changing communication topologies. It is free from deadlock, livelock, race hazards and starvation by design, employing a small set of synchronisation patterns for which we have proven safety theorems.
Concurrency control mechanisms such as turn-taking, locking, serialization, transactional locking mechanism, and operational transformation try to provide data consistency when concurrent activities are permitted in a reactive system. Locks are typically used in transactional models for assurance of data consistency and integrity in a concurrent environment. In addition, recovery management is used to preserve atomicity and durability in transaction models. Unfortunately, conventional lock mechanisms severely (and intentionally) limit concurrency in a transactional environment. Such lock mechanisms also limit recovery capabilities. Finally, existing recovery mechanisms themselves afford a considerable overhead to concurrency. This paper describes a new transaction model that supports release of early results inside and outside of a transaction, decreasing the severe limitations of conventional lock mechanisms, yet still warranties consistency and recoverability of released resources (results). This is achieved through use of a more flexible locking mechanism and by using two types of consistency graph. This provides an integrated solution for transaction management, recovery management and concurrency control. We argue that these are necessary features for management of long-term transactions within “digital ecosystems” of small to medium enterprises.
This paper describes trancell, a translator and associated runtime environment that allows programs written in the occam programming language to be run on the Cell BE microarchitecture. trancell cannot stand alone, but requires the front end from the KRoC/Linux compiler for generating Extended Transputer Code (ETC), which is then translated into native Cell SPU assembly code and linked with the trancell runtime. The paper describes the difficulties in implementing occam on the Cell, notably the runtime support required for implementing channel communications and true parallelism. Various benchmarks are examined to investigate the success of the approach.
In-Situ Monitoring systems measure and relay environmental parameters. From a system design perspective such devices represent one node in a network. This paper aims to extend the networking idea from the system level towards the design level. We describe In-Situ Monitoring systems as network of components. In the proposed design these components can be implemented in either hardware or software. Therefore, we need a versatile hardware-software platform to accommodate the particular requirements of a wide range of In-Situ Monitoring systems. The ideal testing ground for such a versatile hardware-software platform are FPGAs (Field Programmable Gate Arrays) with embedded CPUs. The CPUs execute software processes which represent software components. The FPGA part can be used to implement hardware components in the form of hardware processes and it can be used to interface to other hardware components external to the processor. In effect this setup constitutes a network of communicating sequential processes within a chip. This paper presents a design flow based on the theory of CSP. The idea behind this design flow is to have a CSP model which is turned into a network of hardware and software components. With the proposed design flow we have extended the networking aspect of sensor networks towards the system design level. This allows us to treat In-Situ Measurement systems as sub-networks within a sensor network. Furthermore, the CSP based approach provides abstract models of the functionality which can be tested. This yields more reliable system designs.
This case observation describes how an embedded industrial software architecture was “mapped” onto an office layout. It describes a particular type of program architecture that does this mapping rather well. The more a programmer knows what to do, and so may withdraw to his office and do it, the higher the cohesion or completeness. The less s/he has to know about what is going on in other offices, the lower the coupling or disturbance. The project, which made us aware of this, was an embedded system built on the well-known process data-flow architecture. All interprocess communication that carried data was on synchronous, blocking channels. In this programming paradigm, it is possible for a process to refuse to “listen” on a channel while it is busy doing other things. We think that this in a way corresponds to closing the door to an office. When another process needs to communicate with such a process, it will simply be blocked (and descheduled). No queuing is done. The process, or the programmer, need not worry about holding up others. The net result seems to be good isolation of work and easier implementation. The isolation also enables faster pinpointing of where an error may be and, hence, in fixing the error in one place only. Even before the product was shipped, it was possible to keep the system with close to zero known errors. The paradigm described here has become a valuable tool in our toolbox. However, when this paradigm is used, one must also pay attention should complexity start to grow beyond expectations, as it may be a sign of too high cohesion or too little coupling.
Operating-systems are the core software component of many modern computer systems, ranging from small specialised embeddedsystems through to large distributed operating-systems. The demands placed upon these systems are increasingly complex, in particular the need to handle concurrency: to exploit increasingly parallel (multi-core) hardware; support increasing numbers of user and system processes; and to take advantageof increasinglydistributedand decentralisedsystems. The languages and designs that existing operating-systems employ provide little support for concurrency, leading to unmanageable programming complexities and ultimately errors in the resulting systems; hard to detect, hard to remove, and almost impossible to prove correct.Implemented in occam-π, a CSP derived language that provides guarantees of freedom from race-hazards and aliasing error, the RMoX operating-system represents a novel approach to operating-systems, utilising concurrencyat all levels to simplify design and implementation. This paper presents the USB (universal serial bus) device-driver infrastructure used in the RMoX system, demonstrating that a highly concurrent process-orientated approach to device-driver design and implementation is feasible, efficient and results in systems that are reliable, secure and scalable.
The LEGO Mindstorms RCX is a widely deployed educational robotics platform. This paper presents a concurrent operating environment for the Mindstorms RCX, implemented natively using occam-pi running on the Transterpreter virtual machine. A concurrent hardware abstraction layer aids both the developer of the operating system and facilitates the provision of process-oriented interfaces to the underlying hardware for students and hobbyists interested in small robotics platforms.
This paper presents the extended and re-integrated JCSP library of CSP packages for Java. It integrates the differing advances made by Quickstone's JCSP Network Edition and the “core” library maintained at Kent. A more secure API for connecting networks and manipulating channels is provided, requiring significant internal re-structuring. This mirrors developments in the occam-pi language for mandated direction specifiers on channel-ends. For JCSP, promoting the concept of channel-ends to first-class entities has both semantic benefit (the same as for occam-pi) and increased safety. Major extensions include alting barriers (classes supporting external choice over multiple multi-way synchronisations), channel output guards (straightforward once we have the alting barriers), channel poisoning (for the safe and simple termination of networks or sub-networks) and extended rendezvous on channel communications (that simplify the capture of several useful synchronisation design patterns). Almost all CSP systems can now be directly captured with the new JCSP. The new library is available under the LGPL open source license.
The principal contribution of this paper is the demonstration of a promising technique for the synthesis of hardware and software from a single specification which is also amenable to formal analysis. We also demonstrate how the notion of synchronous observers may provide a way for engineers to express formal assertions about circuits which may be more accessible then the emerging grammar based approaches. We also report that the semantic basis for the system we evaluate pays dividends when formal static analysis is performed using model checking.
In this paper, we present a formal model and analysis of the AMBA Advanced High-performance Bus (AHB) on-chip bus. The model is given in CSP∥B—an integration of the process algebra CSP and the state-based formalism B. We describe the theory behind the integration of CSP and B. We demonstrate how the model is developed from the informal ARM specification of the bus. Analysis is performed using the model-checker ProB. The contribution of this paper may be summarised as follows: presentation of work in progress towards a formal model of the AMBA AHB protocol such that it may be used for inclusion in, and analysis of, co-design systems incorporating the bus, an evaluation of the integration of CSP and B in the production of such a model, and a demonstration and evaluation of the ProB tool in performing this analysis. The work in this paper was carried out under the Future Technologies for Systems Design Project at the University of Surrey, sponsored by AWE.
Research augmenting B machines presented at B2007 has demonstrated how fragments of control flow expressed as annotations can be added to associated machine operations, and shown to be consistent. This enables designers' understanding about local relationships between successive operations to be captured at the point the operations are written, and used later when the controller is developed. This paper introduces several new annotations and I/O into the framework to take advantage of hardware's parallelism and to facilitate refinement and translation. To support the new annotations additional CSP control operations are added to the control language that now includes: recursion, prefixing, external choice, if-then-else, and sequencing. We informally sketch out a translation to Handel-C for prototyping.