Ebook: Communicating Process Architectures 2009
This book is a collection of the papers presented at the 32nd Communicating Process Architecture conference (CPA), held at the Technical University Eindhoven, the Netherlands, from the 1st to the 4th of November 2009. Concurrency is a fundamental mechanism of the universe, existing in all structures and at all levels of granularity. To be useful in this universe, any computer system has to model and reflect an appropriate level of abstraction. For simplicity, therefore, the system needs to be concurrent -- so that this modeling is obvious and correct. Today, the commercial reality of multicore processors means that concurrency issues can no longer be ducked if applications are going to be able to exploit more than an ever-diminishing fraction of their power. This is a second, but very forceful, reason to take this subject seriously. We need theory and programming technology that turns this around and makes concurrency an elementary part of the everyday toolkit of every software engineer. This is what these proceedings are all about. Subjects covered in this volume include: system design and implementation for both hardware and software; tools for concurrent programming languages, libraries and run-time kernels; and formal methods and applications.
This thirty-second Communicating Process Architectures conference, CPA 2009, takes place as part of Formal Methods Week, 1-6 November, 2009. Under the auspices of WoTUG, CPA 2009 has been organised by TASS (formerly Philips TASS) in co-operation with the Technische Universiteit Eindhoven (TU/e). This is for the second time – we had a very successful conference here in 2005 and are very pleased to have been invited back.
We see growing awareness of the ideas characterized by “Communicating Processes Architecture” and their growing adoption. The complexity of modern computing systems has become so great that no one person – maybe not even a small team – can understand all aspects and all interactions. The only hope of making such systems work is to ensure that all components are correct by design and that the components can be combined to achieve scalability and predictable function.
A crucial property is that the cost of making a change to a system depends linearly on the size of that change – not on the size of the system being changed. This must be true whether that change is a matter of maintenance (e.g. to take advantage of increasing multicore capability) or to add new functionality. One key is that system composition (and disassembly) introduces no surprises. A component must behave consistently, no matter the context in which it is used – which means that component interfaces must be explicit, published and free from hidden side-effect. Our view is that concurrency, underpinned by the formal process algebras of Hoare's Communicating Sequential Processes and Milner's π-Calculus, provides the strongest basis for the development of technology that can make this happen. Many current systems cannot be maintained if they do not have concurrency working for them and not against – certainly not if multicores are involved!
We have again received an interesting set of papers covering many different grounds: system design and implementation (for both hardware and software), tools (concurrent programming languages, libraries and run-time kernels), formal methods and applications. They have all been strongly refereed and are of high quality. As these papers are presented in a single stream you won't have to miss out on anything. As always, we will have plenty of space for informal contact and do not forget the evening Fringe Programme – where we will not have to worry about the bar closing at half past ten!
We are very pleased this year to have Professor Michael Goldsmith of the e-Security Group (WMG Digital Laboratory, University of Warwick) as our keynote speaker. He is one of the creators of the CSP model checking program FDR, which is used by many in industry and academia to check concurrent systems for deadlock, livelock and verify correct patterns of behaviour through refinement checks – all before any code is written.
We thank the authors for their submissions and the Programme Committee for their hard work in reviewing the papers. We also thank Tijn Borghuis and Erik de Vink for inviting CPA 2009 to join FMweek 2009 and for making the arrangements with the TU/e.
Peter Welch (University of Kent), Herman Roebbers (TASS, Eindhoven),
Jan Broenink (University of Twente), Frederick Barnes (University of Kent),
Carl Ritson (University of Kent), Adam Sampson (University of Kent),
Dyke Stiles (Utah State University), Brian Vinter (University of Copenhagen).
Process algebras like CSP and CCS inspired the original occam model of communication and process encapsulation. Later the π-calculus and various treatments handling mobility in CSP added support for mobility, as realised in practical programming systems such as occam-π, JCSP, CHP and Sufrin's CSO, which allow a rather abstract notion of motion of processes and channel ends between parents or owners. Milner's Space and Motion of Communicating Agents on the other hand describes the bigraph framework, which makes location more of a first-class citizen of the calculus and evolves through reaction rules which rewrite both place and link graphs of matching sections of a system state, allowing more dramatic dynamic reconfigurations of a system than simple process spawning or migration. I consider the tractability of the notation, and to what extent the additional flexibility reflects or elicits desirable programming paradigms.
SCOOP is a minimal extension to the sequential object-oriented programming model for concurrency. The extension consists of one keyword (separate) that avoids explicit thread declarations, synchronized blocks, explicit waits, and eliminates data races and atomicity violations by construction, through a set of compiler rules. SCOOP was originally described for the Eiffel programming language. This paper makes two contributions. Firstly, it presents a design pattern for SCOOP, which makes it feasible to transfer SCOOP's concepts to different object-oriented programming languages. Secondly, it demonstrates the generality of the SCOOP model by presenting an implementation of the SCOOP design pattern for Java. Additionally, we describe tools that support the SCOOP design pattern, and give a concrete example of its use in Java.
Model checking is an efficient technique for verifying properties on reactive systems. Partial-order reduction (POR) and symbolic model checking are two common approaches to deal with the state space explosion problem in model checking. Traditionally, symbolic model checking uses BDDs which can suffer from space blowup. More recently bounded model checking (BMC) using SAT-based procedures has been used as a very successful alternative to BDDs. However, this approach gives poor results when it is applied to models with a lot of asynchronism. This paper presents an algorithm which combines partial order reduction methods and bounded model checking techniques in an original way that allows efficient verification of temporal logic properties (LTLX) on models featuring asynchronous processes. The encoding to a SAT problem strongly reduces the complexity and non-determinism of each transition step, allowing efficient analysis even with longer execution traces. The starting-point of our work is the Two-Phase algorithm (Namalesu and Gopalakrishnan) which performs partial-order reduction on process-based models. At first, we adapt this algorithm to the bounded model checking method. Then, we describe our approach formally and demonstrate its validity. Finally, we present a prototypal implementation and report encouraging experimental results on a small example.
Representation of scopes of names is important for analysis and verification of concurrent systems. However, it is difficult to represent the scopes of channel names precisely with models based on process algebra.We introduced a model of concurrent systems with higher-order communication based on graph rewriting in our previous work. A bipartite directed acyclic graph represents a concurrent system that consists of a number of processes and messages in that model. The model can represent the scopes of local names precisely. We defined an equivalence relation such that two systems are equivalent not only in their behavior, but also in extrusion of scopes of names. This paper shows that our equivalence relation is a congruence relation w.r.t. τ-prefix, new-name, replication and composition, even when higher-order communication is allowed. We also show our equivalence relation is not congruent w.r.t. input-prefix though it is congruent w.r.t. input-prefix in the first-order case.
This paper presents two algorithms for analysing gCSP models in order to improve their execution performance. Designers tend to create many small separate processes for each task, which results in many (resource intensive) context switches. The research challenge is to convert the model created from a design point of view to models which have better performance during execution, without limiting the designers in their ways of working. The first algorithm analyses the model during run-time execution in order to find static sequential execution traces that allow for optimisation. The second algorithm analyses the gCSP model for multi-core execution. It tries to find a resource-efficient placement on the available cores for the given target systems. Both algorithms are implemented in two tools and are tested. We conclude that both algorithms complement each other and the analysis results are suitable to create optimised models.
As well as being a useful tool for formal reasoning, a trace can provide insight into a concurrent program's behaviour, especially for the purposes of run-time analysis and debugging. Long-running programs tend to produce large traces which can be difficult to comprehend and visualise. We examine the relationship between three types of traces (CSP, VCR and Structural), establish an ordering and describe methods for conversion between the trace types. Structural traces preserve the structure of composition and reveal the repetition of individual processes, and are thus well-suited to visualisation. We introduce the Starving Philosophers to motivate the value of structural traces for reasoning about behaviour not easily predicted from a program's specification. A remaining challenge is to integrate structural traces into a more formal setting, such as the Unifying Theories of Programming – however, structural traces do provide a useful framework for analysing large systems.
This paper describes the application of the Analytical Software Design methodology to the development of a mathematically verified I2C device driver for Linux. A model of an I2C controller from NXP is created, against which the driver component is modelled. From within the ASD tool the composition is checked for deadlock, livelock and other concurrency issues by generating CSP from the models and checking these models with the CSP model checker FDR. Subsequently C code is automatically generated which, when linked with a suitable Linux kernel runtime, provides a complete defect-free Linux device driver. The performance and footprint are comparable to handwritten code.
Escape analysis is the process of discovering boundaries of dynamically allocated objects in programming languages. For object-oriented languages such as C++ and Java, this analysis leads to an understanding of which program objects interact directly, as well as what objects hold references to other objects. Such information can be used to help verify the correctness of an implementation with respect to its design, or provide information to a run-time system about which objects can be allocated on the stack (because they do not “escape” the method in which they are declared). For existing object-oriented languages, this analysis is typically made difficult by aliasing endemic to the language, and is further complicated by inheritance and polymorphism. In contrast, the occam-π programming language is a process-oriented language, with systems built from layered networks of communicating concurrent processes. The language has a strong relationship with the CSP process algebra, that can be used to reason formally about the correctness of occam-π programs. This paper presents early work on a compositional escape analysis technique for mobiles in the occam-π programming language, in a style not dissimilar to existing CSP analyses. The primary aim is to discover the boundaries of mobiles within the communication graph, and to determine whether or not they escape any particular process or network of processes. The technique is demonstrated by analysing some typical occam-π processes and networks, giving a formal understanding of their mobile escape behaviour.
During the design of a small channel-based concurrency runtime system (ChanSched, written in ANSI C), we saw that application timers (which we call egg and repeat timers) could be part of its supported ALT construct, even if their states live through several ALTs. There are no side effects into the ALT semantics, which enable waiting for channels, channel timeout and, now, the new application timers. Application timers are no longer busy polled for timeout by the process. We show how the classical occam language may benefit from a spin-off of this same idea. Secondly, we wanted application programmers to be freed from their earlier practice of explicitly coding communication states at channel synchronisation points, which was needed by a layered in-house scheduler. This led us to develop an alternative to the non-ANSI C “computed goto” (found in gcc). Instead, we use a switch/case with goto line-number-tags in a synch-point-table for scheduling. We call this table, one for each process, a proctor table. The programmer does not need to manage this table, which is generated with a script, and hidden within an #include file.
The LLVM compiler infrastructure project provides a machine independent virtual instruction set, along with tools for its optimisation and compilation to a wide range of machine architectures. Compiler writers can use the LLVM's tools and instruction set to simplify the task of supporting multiple hardware/software platforms. In this paper we present an exploration of translation from stack-based Extended Transputer Code (ETC) to SSA-based LLVM assembly language. This work is intended to be a stepping stone towards direct compilation of occam-π and similar languages to LLVM's instruction set.
This paper describes an implementation of resumable and mobile processes for a new process-oriented language called ProcessJ. ProcessJ is based on CSP and the π-calculus; it is structurally very close to occam-π, but the syntax is much closer to the imperative part of Java (with new constructs added for process orientation). One of the targets of ProcessJ is Java bytecode to be executed on the Java Virtual Machine (JVM), and in this paper we describe how to implement the process mobility features of ProcessJ with respect to the Java Virtual Machine. We show how to add functionality to support resumability (and process mobility) by a combination of code rewriting (adding extra code to the generated Java target code), as well as bytecode rewriting.
OpenComRTOS is one of the few Real-Time Operating Systems for embedded systems that was developed using formal modelling techniques. The goal was to obtain a proven dependable component with a clean architecture that delivers high performance on a wide variety of networked embedded systems, ranging from a single processor to distributed systems. The result is a scalable relibable communication system with real-time capabilities. Besides, a rigorous formal verification of the kernel algorithms led to an architecture which has several properties that enhance safety and real-time properties of the RTOS. The code size in particular is very small, typically 10 times less than a typical equivalent single processor RTOS. The small code size allows a much better use of the on-chip memory resources, which increases the speed of execution due to the reduction of wait states caused by the use of external memory. To this point we ported OpenComRTOS to the MicroBlaze processor from Xilinx, the Leon3 from ESA, the ARM Cortex-M3, the Melexis MLX16, and the XMOS. In this paper we concentrate on the Microblaze port, which is an environment where OpenComRTOS competes with a number of different operating systems, including the standard operating system Xilinx Micro Kernel. This paper reports code size figures of the OpenComRTOS on a MicroBlaze target. We found that this code size is considerably smaller compared with published code sizes of other operating systems.
We describe an experiment which aims to reduce significantly the costs of running a particular large-scale grid-enabled application using commercial cloud computing resources. We incorporate three tactics into our experiment: improving the serial performance of each work unit, seeking the most cost-effective computation cycles, and making use of an optimized resource manager and scheduler. The application selected for this work is a genetics association analysis and is representative of a common class of embarrassingly parallel problems.
In this paper, we discuss the implementation of a simple pedestrian simulation that uses a multi agent based design pattern developed by the CoSMoS research group. Given the nature of Multi Agent Systems (MAS), parallel processing techniques are inevitably used in their implementation. Most of these approaches rely on conventional parallel programming techniques, such as threads, Message Passing Interface (MPI) and Remote Method Invocation (RMI). The CoSMoS design patterns are founded on the use of Communicating Sequential Processes (CSP), a parallel computing paradigm that emphasises a process oriented rather than object oriented programming perspective.
Localised mobile channel support is now a feature of Communicating Process Architecture (CPA) based frameworks, from JCSP and C++CSP to occam-π. Distributed mobile channel support has also been attempted in JCSP Networking and occam-π via the pony framework, although the capabilities of these two separate approaches is limited and has not led to the widespread usage of distributed mobile channel primitives. In this paper, an initial investigation into possible models that can support distributed channel mobility are presented and analysed for features such as transmission time, robustness and reachability. The goal of this work is to discover a set of models which can be used for channel mobility and also supported within the single unified protocol for distributed CPA frameworks. From the analysis presented in this paper, it has been determined that there are models which can be implemented to support channel end mobility within a single unified protocol which provide suitable capabilities for certain application scenarios.
Some message-passing concurrent systems, such as occam 2, prohibit aliasing of data objects. Communicated data must thus be copied, which can be time-intensive for large data packets such as video frames. We introduce automatic mobility, a compiler optimisation that performs communications by reference and deduces when these communications can be performed without copying. We discuss bounds for speed-up and memory use, and benchmark the automatic mobility optimisation. We show that in the best case it can transform an operation from being linear with respect to packet size into constant-time.
This paper introduces a denotational model and refinement theory for a process algebra with mobile channels. Similarly to CSP, process behaviours are recorded as trace sets. To account for branching-time semantics, the traces are decorated by structured locations that are also used to encode the dynamics of channel mobility in a denotational way. We present an original notion of split-equivalence based on elementary trace transformations. It is first characterised coinductively using the notion of split-relation. Building on the principle of trace normalisation, a more denotational characterisation is also proposed. We then exhibit a preorder underlying this equivalence and motivate its use as a proper refinement operator. At the language level, we show refinement to be tightly related to a construct of delayed sums, a generalisation of non-deterministic choices.
PyCSP was introduced two years ago and has since been used by a number of programmers, especially students. The original motivation behind PyCSP was a conviction that both Python and CSP are tools that are especially well suited for programmers and scientists in other fields than computer science. Working under this premise the original PyCSP was very similar to JCSP and the motivation was simply to provide CSP to the Python community in the JCSP tradition. After two years we have concluded that PyCSP is indeed a usable tool for the target users; however many of them have raised some of the same issues with PyCSP as with JCSP. The many channel types, lack of output guards and external choice wrapped in the select-then-execute mechanism were frequent complaints. In this work we revisit PyCSP and address the issues that have been raised. The result is a much simpler PyCSP with only one channel type, support for output guards, and external choice that is closer to that of occam than JCSP.
In this work we motivate and describe three unique implementations of processes for PyCSP: process, thread and greenlet based. The overall purpose is to demonstrate the feasibility of Communicating Sequential Processes as a framework for different application types and target platforms. The result is a set of three implementations of PyCSP with identical interfaces to the point where a PyCSP developer need only change which implementation is imported to switch to any of the other implementations. The three implementations have different strengths; processes favors parallel processing, threading portability and greenlets favor many processes with frequent communication. The paper includes examples of applications in all three categories.
Recently, much discussion has taken place within the Python programming community on how best to support concurrent programming. This paper describes a new Python library, python-csp, which implements synchronous, message-passing concurrency based on Hoare's Communicating Sequential Processes. Although other CSP libraries have been written for Python, python-csp has a number of novel features. The library is implemented both as an object hierarchy and as a domain-specific language, meaning that programmers can compose processes and guards using infix operators, similar to the original CSP syntax. The language design is intended to be idiomatic Python and is therefore quite different to other CSP libraries. python-csp targets the CPython interpreter and has variants which reify CSP process as Python threads and operating system processes. An equivalent library targets the Jython interpreter, where CSP processes are reified as Java threads. jython-csp also has Java wrappers which allow the library to be used from pure Java programs. We describe these aspects of python-csp, together with performance benchmarks and a formal analysis of channel synchronisation and choice, using the model checker SPIN.
This paper investigates the feasibility of developing a CSP to Python translator using a concurrent framework for Python. The objective of this translation framework, developed under the name of Hydra, is to produce a tool that helps programmers implement concurrent software easily using CSP algorithms. This objective was achieved using the ANTLR compiler generator tool, Python Remote Objects and PyCSP. The resulting Hydra prototype takes an algorithm defined in CSP, parses and converts it to Python and then executes the program using multiple instances of the Python interpreter. Testing has revealed that the Hydra prototype appears to function correctly, allowing simultaneous process execution. Therefore, it can be concluded that converting CSP to Python using a concurrent framework such as Hydra is both possible and adds flexibility to CSP with embedded Python statements.
We consider the language of CSP extended with a construct that allows processes to test whether a particular event is available (without actually performing the event). We present an operational semantics for this language, together with two congruent denotational semantic models. We also show how this extended language can be simulated using standard CSP, so as to be able to analyse systems using the model checker FDR.
Toc is an experimental programming language based on occam that combines CSP-based concurrency with integrated specification of timing requirements. In contrast to occam with strict round-robin scheduling, the Toc scheduler is lazy and does not run a process unless there is a deadline associated with its execution. Channels propagate deadlines to dependent tasks. These differences from occam necessitate a different approach to programming, where a new concern is to avoid dependencies and conflicts between timing requirements. This paper introduces client-server design patterns for Toc that allow the programmer precise control of timing. It is shown that if these patterns are used, the deadline propagation graph can be used to provide sufficient conditions for schedulability. An alternative definition of deadlock in deadline-driven systems is given, and it is demonstrated how the use of the suggested design patterns allow the absence of deadlock to be proven in Toc programs. The introduction of extended rendezvous into Toc is shown to be essential to these patterns.
Device and service discovery is a very important topic when considering pervasive environments. The discovery mechanism is required to work in networks with dynamic topology and on limited software, and be able to accept different device descriptions. This paper presents a service and device discovery mechanism using JCSP agents and the JCSP network package jcsp.net2.