Ebook: Dependable Software Systems Engineering
We are all increasingly dependent on software systems to run the technology we use every day, so we need these systems to be both reliable and safe.
This book presents papers from the NATO Advanced Study Institute Summer School Dependable Software Systems Engineering, held in Marktoberdorf, Germany, in July and August 2014. Lecturers were drawn from prestigious research groups representing both industry and academia, and the course was designed as an in-depth presentation and teaching of state-of-the-art scientific techniques and methods covering research and industrial practice as well as scientific principles.
Topics covered included: syntax-guided synthesis; system behaviors and problem frames; dependable human-intensive systems; automatic alias analysis and frame inference; fault-based testing; and mechanized unifying theories of programming.
Marktoberdorf is one of the most renowned international computer science summer schools, and this book, with its detailed overview of current research results and its discussion and development of new ideas, will be of interest to all those whose work involves the engineering of dependable software systems.
To cover the broad range of topics in engineering dependable software systems, we assembled for our NATO Advanced Study Institute Summer School Marktoberdorf 2014 a group of internationally leading researchers and experts, who shared their profound knowledge of and experience in the engineering of dependable software systems. Our lecturers, representing excellent research groups from industry as well as universities, presented a detailed overview of current research results, with special emphasis on solving current problems. To convey this big picture in the collaborative atmosphere of the ASI Summer School Marktoberdorf, the course was designed as an in-depth presentation and teaching of state-of-the-art scientific techniques and methods, covering research and industrial practice as well as scientific foundations.
RAJEEV ALUR lectured on the topic Syntax-Guided Synthesis. The classical formulation of the program-synthesis problem is to find a program that meets a correctness specification given as a logical formula. Recent work on program synthesis and program optimization illustrates many potential benefits of allowing the user to supplement the logical specification with a syntactic template that constrains the space of allowed implementations. The goal was to identify the core computational problem common to these proposals in a logical framework. He described three different instantiations of the counterexample-guided inductive synthesis (CEGIS) strategy for solving the synthesis problem, reported on prototype implementations, and presented experimental results on an initial set of benchmarks. The lecture held by THOMAS BALL was on Automated Test Generation via SAT Solvers. Satisfiability (SAT) and Satisfiability Modulo Theory (SMT) solvers power a new generation of modelling, verification, and testing tools. In these lectures, he focussed on the use of SAT and SMT solvers in automated test generation tools, the applications of these tools in industry (specifically the SAGE and Pex tools from Microsoft), and how interpolating SMT solvers can be used to bridge the gap between testing and verification via the process of lazy annotation.
From Actions, Transactions, and Processes to Services was the lecture series by MANFRED BROY. For the problem domain of business process engineering and the solution domain of service-oriented architectures (SOA), he introduced, modelled, and formalized key notions of business processes such as action, actor, event, business process, business transaction, and service in a systematic way, in terms of a rigorous mathematical system model. The goal was to provide a minimal set of formal modelling concepts, nevertheless expressive enough to formalize key notions and concepts in business process engineering and service-oriented architectures, capturing also their mutual relationships. The purpose of the approach is manifold: a clear definition of terminology, concepts, terms, and models in business process modelling and SOA; a rigorous formal basis for the specification, design, and development of business processes and, in particular, SOAs; and a strictly formal concept for the development steps going from business process models to services as part of a SOA-oriented development process.
MICHAEL JACKSON lectured on System Behaviours and Problem Frames. A cyber-physical system comprises the machine, executing the software, and the problem world in which the system behaviour is evoked by the machine and will be observed and experienced by the stakeholders. The configuration of these elements constitutes a problem with two aspects: design of a behaviour to satisfy the stakeholders' requirements; and specification of a machine that will ensure that behaviour in the problem world. Realistic systems present two major challenges. First, complexity, due to the proliferation and interaction of features and the related need – in critical systems – for a wide operational envelope. Second, the physicality of the problem world. At the relevant granularities the physical world is non-formal, presenting severe difficulties for reliable operation and for the formal reasoning necessary for system development.
AXEL VAN LAMSWEERDE held a lecture on Dependable Human-Intensive Systems. Human-intensive systems involve complex cooperation processes among software applications, devices, and people, where the human contributions require domain expertise and have a significant impact on the success or failure of the overall mission. Such systems are in general safety-critical, as exemplified by flight control, disaster management, or medical process support systems. The course focussed on medical human-intensive systems where decisions regulate safety-critical processes within the system. It presented a variety of formal techniques for building and analyzing models of human-intensive systems. A formal language supporting both event-based and state-based specifications was introduced as a process modelling language enabling suitable analyses. The course discussed formal operators for composing process model views in a coherent way or, conversely, for decomposing process models into specific views that abstract from details irrelevant to a specific dimension. It also showed how goal-oriented techniques can be applied to declarative models of decision-making tasks to support risk analysis and model restructuring.
The topic of JOHN McDERMID's lecture was Development of Safety and Dependability Cases. Many of the systems on which society now depends have the ability to affect safety – either positively or negatively. It is difficult to make such systems safe; it is also difficult to demonstrate that they are safe. The focus of these talks was on demonstrating safety, through the use of safety cases, and on broader issues such as dependability. Development activities produce designs; safety analysis provides information about the potential for unsafe behavior of a design; effective safety management uses the results of analysis to drive development to produce systems that are “safe enough”; safety cases provide the argument (rationale) why (it is believed) the system is “safe enough” to use. Evolving Software Ecosystems was the title of the lecture given by TOM MENS. Software ecosystems are collections of interacting and communicating software projects developed by the same developer community. Managing the complex interactions and dynamics is an important problem that raises additional challenges compared to the maintenance of individual software systems. He explored the state of the art in the empirical analysis of evolving software ecosystems with a focus on the evolutionary aspect, drew the analogy between software ecosystems and natural ecosystems, and examined what can be learnt from ecology and biological evolution in order to increase understanding of software ecosystem evolution.
The lecture series given by BERTRAND MEYER was on Automatic Alias Analysis and Frame Inference. Take two expressions e and f, denoting references (pointers), and a program location p. When an execution reaches p, can e and f ever be attached (i.e., point) to a single object? This is the aliasing problem, undecidable like anything else of interest in computer science, but for which we may nevertheless expect good enough solutions. Applications abound, from the general verification of object-oriented programs to frame analysis and even deadlock analysis. The lectures described various forms of the alias calculus and how it has been applied to addressing these problems, as part of an integrated program development and verification environment.
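To make the question concrete, here is a minimal Python sketch (a hypothetical illustration, not an example from the lectures; the class and names are invented): two reference expressions e and f become attached to the same object, so an update through one is observable through the other.

class Account:
    """A toy class used only to illustrate aliasing."""
    def __init__(self, balance):
        self.balance = balance

def credit_and_read(e, f):
    # Program location p: can e and f ever be attached to the same object here?
    e.balance += 100
    return f.balance           # if e and f are aliased, f observes the change made via e

a = Account(0)
e, f = a, a                    # e and f are now attached to a single object: they alias
print(credit_and_read(e, f))   # prints 100, exposing the alias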
DORON PELED lectured on the topic Automatic Verification and Synthesis of Finite State Systems. With the advancement of computing power, and the increased insight into formal methods, it is only natural to ask whether one could automate the transformation from specification into working code (or at least, a skeleton of code that can be further adjusted to fit one's architecture and some additional constraints). This series of lectures included a fast tutorial on classical model checking and the state of the art in this area, and introduced the basics of software synthesis, together with some theoretical constraints that are responsible for the fact that programming is still a human rather than a computerized activity.
The course given by ALEXANDER PRETSCHNER was about Fault-Based Testing. Intuition and textbooks suggest that good tests are those that unveil faults. Under the hypothesis that correct programs exist, this means that there are no good tests for correct programs. A modified definition of good test cases then demands that tests unveil potential faults, and furthermore do so with good cost effectiveness. In this set of lectures, he studied the theoretical and empirical fault detection ability of partition-based testing, specifically coverage-based testing, both when used for test case selection and test case assessment. To demonstrate the applicability of this model, he showed how this theory captures fault models including simple division-by-zero or stuck-at-1 faults, but also limit testing, testing strategies for finite state machines, testing strategies for object-oriented software, performance as well as security testing, common concurrency faults, and combinatorial testing. Finally, he discussed the usefulness of fault models for the derivation of check lists and presented a life-cycle model for fault models.
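As a small, hypothetical illustration of the division-by-zero fault model mentioned above (the function and test names are invented, not taken from the lectures), the following Python sketch shows how a fault model points to a test input that a "happy path" test would miss.

def average(values):
    return sum(values) / len(values)      # faulty for the empty list

def test_average_happy_path():
    assert average([2, 4]) == 3           # passes, but targets no fault model

def test_average_empty_input():
    # The division-by-zero fault model suggests inputs that make len(values) zero.
    try:
        average([])
    except ZeroDivisionError:
        pass                              # potential fault revealed: the spec must define this case
    else:
        assert False, "expected ZeroDivisionError"

test_average_happy_path()
test_average_empty_input()
print("both tests ran")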
Electronic UTP: Mechanised Unifying Theories of Programming was the title of JIM WOODCOCK's lecture. Hoare & He's Unifying Theories of Programming (UTP) is a long-term research agenda to understand the relationship between different programming theories and pragmatic programming paradigms. The origin of UTP is in the work on predicative programming that was started by Hehner, and it supports three different ways to study such relationships: (i) by computational paradigm; (ii) by level of abstraction; and (iii) by method of presentation. These lectures formed a tutorial introduction to three key theories in UTP and their implementation in the Isabelle/HOL interactive theorem prover. He described mechanised theories of (i) relations, (ii) designs (pre-postcondition pairs), and (iii) reactive processes; the first two in detail, the last in outline.
PAMELA ZAVE gave a lecture on Compositional Models of Network Architecture. In the future, the Internet will need to support an even wider variety of applications, stakeholders, resources, endpoint devices, communication functions, service guarantees, and resource policies than it does today. Extending the capabilities of a network architecture means adding configuration data, functional mechanisms, operating constraints, etc. The first goal of this work was to describe all real networks by specializing and composing modules of this type. Subsequently, she used the model to study enhanced capabilities, discovering the properties and principles that enable them to be synergistic and non-interfering. Ultimately, a rigorous understanding of compositionality can turn into automated tools for network design, analysis, and implementation.
High-level scientific presentations, together with working and learning side by side, characterize the Summer School series. Such activities are essential for future scientific results in computer science and, consequently, for the development of large-scale reliable and secure software systems – Dependable Software Systems. The Summer School once more provided two weeks of learning, discussion, and development of new ideas, and was a productive and beneficial event, at both the professional and the social level.
We thank all lecturers, the staff, and hosts in Marktoberdorf. Special thanks must be addressed to Dr. Katharina Spies and Silke Müller for their great support.
The Marktoberdorf Summer School was arranged as an Advanced Study Institute of the NATO Science for Peace and Security Programme. We thank all authorities involved.
THE EDITORS
The classical formulation of the program-synthesis problem is to find a program that meets a correctness specification given as a logical formula. Recent work on program synthesis and program optimization illustrates many potential benefits of allowing the user to supplement the logical specification with a syntactic template that constrains the space of allowed implementations. Our goal is to identify the core computational problem common to these proposals in a logical framework. The input to the syntax-guided synthesis problem (SyGuS) consists of a background theory, a semantic correctness specification for the desired program given by a logical formula, and a syntactic set of candidate implementations given by a grammar. The computational problem then is to find an implementation from the set of candidate expressions so that it satisfies the specification in the given theory. We describe alternative solution strategies that combine learning, counterexample analysis and constraint solving. We report on prototype implementations, and present experimental results on the set of benchmarks collected as part of the first SyGuS-Comp competition held in July 2014.
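As a rough sketch of a counterexample-guided inductive synthesis (CEGIS) loop of the kind such solution strategies build on (a toy Python illustration with invented names, in which a bounded-domain exhaustive check stands in for an SMT-based verifier; it is not one of the SyGuS solvers evaluated in the chapter):

import itertools

# Toy synthesis task: find f(x, y) that is the maximum of two integers.
# Syntactic template: a tiny grammar of candidate expressions.
CANDIDATES = [
    ("x", lambda x, y: x),
    ("y", lambda x, y: y),
    ("x if x > y else y", lambda x, y: x if x > y else y),
]

def satisfies_spec(f, x, y):
    # Semantic specification: f(x, y) is at least x and y, and equals one of them.
    r = f(x, y)
    return r >= x and r >= y and r in (x, y)

def find_counterexample(f, bound=10):
    # Verifier: exhaustive check over a bounded domain (stand-in for an SMT solver).
    for x, y in itertools.product(range(-bound, bound + 1), repeat=2):
        if not satisfies_spec(f, x, y):
            return (x, y)
    return None

def learn(examples):
    # Learner: first grammar candidate consistent with all examples seen so far.
    for name, f in CANDIDATES:
        if all(satisfies_spec(f, x, y) for x, y in examples):
            return name, f
    return None, None

def cegis():
    examples = [(0, 1)]                  # seed example
    while True:
        name, f = learn(examples)
        if f is None:
            return None                  # grammar exhausted: no candidate fits
        cex = find_counterexample(f)
        if cex is None:
            return name                  # verified against the whole (bounded) domain
        examples.append(cex)             # refine with the counterexample

print(cegis())                           # prints: x if x > y else y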
Dynamic symbolic execution (DSE) is a well-known technique for automatically generating tests to achieve higher levels of coverage in a program. Two key ideas of DSE are to: (1) seed symbolic execution by executing a program on an initial input; (2) use concrete values from the program execution in place of symbolic expressions whenever symbolic reasoning is hard or not desired. We describe DSE for a simple core language and then present a minimalist implementation of DSE for Python (in Python) that follows this basic recipe. The code is available at https://www.github.com/thomasjball/PyExZ3/ (tagged “v1.0”) and has been designed to make it easy to experiment with and extend.
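For orientation only, here is a heavily simplified, hand-instrumented concolic sketch in Python using the Z3 solver; the program under test, its instrumentation, and the exploration loop are all invented for this illustration and are not the PyExZ3 implementation referenced above.

from z3 import Int, Not, Solver, sat

X = Int("x")                      # symbolic counterpart of the concrete input x

def program(x, path):
    # Program under test, hand-instrumented to record its path condition in `path`.
    if x > 10:
        path.append(X > 10)
        if x * x > 200:                   # nonlinear: a spot where DSE may also
            path.append(X * X > 200)      # fall back on the concrete value of x
            return "big"
        path.append(Not(X * X > 200))
        return "medium"
    path.append(Not(X > 10))
    return "small"

def dse(seed):
    worklist, covered = [seed], set()
    while worklist:
        x = worklist.pop()
        path = []
        result = program(x, path)              # 1. run on a concrete input
        key = tuple(str(c) for c in path)
        if key in covered:
            continue
        covered.add(key)
        print(f"x = {x} -> {result}")
        for i in range(len(path)):             # 2. try to flip each branch along the path
            s = Solver()
            for c in path[:i]:
                s.add(c)
            s.add(Not(path[i]))
            if s.check() == sat:               # 3. a model gives an input driving a new path
                m = s.model()
                worklist.append(m.eval(X, model_completion=True).as_long())

dse(0)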
For the problem domain of business process engineering we introduce, model, and formalize notions of business processes such as action, actor, event, business process, business transaction. In addition, for the solution domain of service-oriented architectures (SOA) we introduce, model, and formalize notions of service, service composition, service-oriented architecture, and layered SOA in a systematic way. We do that by a rigorous mathematical system model. For that purpose, we first develop a basic mathematical system model for formalizing fundamental concepts of processes and services. The goal is to provide a minimal set of formal modeling concepts, nevertheless expressive enough to formalize key notions and concepts in business process engineering and service-oriented architectures capturing also their mutual relationships. This way, the relationship between central notions in business process modeling is captured formally, which provides a basis for a methodology for deriving the systematic specification and design of service-oriented architectures from business process modeling. The purpose of the approach is manifold; one goal is a clear definition of terminology, concepts, terms, and models in business process modeling and SOA; another goal is a rigorous formal basis for the specification, design, and development of business processes and, in particular, SOAs. We end up with a strictly formal concept for the development steps from business process models to services as part of a SOA-oriented development process.
Two salient issues in software engineering are the proper relationships of requirements to design and implementation and of formal to non-formal development techniques. Both of these issues are prominent in the development of software for cyber-physical systems—that is, for systems whose explicit purpose is to ensure desired effects and behaviours in the physical world. In such systems the software monitors and controls those parts of the world in which the desired effects and behaviours are located: the system behaviour is the result of this interaction between the machine and the problem world. In this paper these salient issues are addressed using the Problem Frames approach. Requirements are understood as desired properties of the system behaviour: design of the structure and content of that behaviour is the central development task. The relationship between formal and non-formal techniques is one of dependence: pre-formal development activities must precede the deployment of full formalisation and formal verification. These pre-formal activities embody the exploration, investigation and design that are an essential preliminary to formal work. They clarify the structure of system behaviour as an assemblage of constituent behaviours, separating the identification and study of those behaviours from the careful treatment of their interactions. In this way formal reasoning is accorded its proper role and an intellectual structure established within which it can play its due part in the development of dependable systems.
The role of high-quality models is increasingly recognized for driving and documenting processes in safety-critical domains such as medical processes. In general, the environment of a complex process has to deal with multiple views referring to different concerns, agents, resources, locations, and so forth. The variety of needs along those views calls for multiple, complementary and consistent facets of a composite process model, where each facet addresses a specific view. Building multi-view process models is in our experience hard and error-prone. To address this challenge, the paper describes formal operators for composing process model facets in a coherent way and, conversely, for decomposing process models into specific facets that abstract from details irrelevant to a specific view. These operators are grounded on the formal trace semantics provided by our process language and its supporting analysis toolset. The paper shows how these operators are automated and reports on their use in a real case study of model construction for a particle-therapy center.
Computers are key elements of many systems, including utilities and transportation. Such systems are reliant on software to ensure safety, i.e. to protect human life and health. It is important for designers both to achieve safety and to demonstrate that they have done so. This chapter addresses both issues, including ways of demonstrating safety using safety cases. Software also has a role in system availability, reliability, and cyber security. We use the term dependability for this broader set of system attributes; the chapter also considers how to address these attributes, and trade-offs between them, using dependability cases. Finally, the chapter considers some of the emerging challenges of achieving and demonstrating the safety of systems of systems, and of autonomous systems.
Software ecosystems are collections of interacting and communicating software projects developed by the same developer community. Dealing with the complex interaction and dynamics of these ecosystems is an important problem that raises additional challenges compared to the maintenance of individual software systems. In particular, the social aspect becomes crucial, requiring us to understand evolving ecosystems as dynamic socio-technical networks. Based on software repository mining research advances, in this lecture I explore the state of the art in the empirical analysis of software ecosystems, with a focus on the evolutionary aspect. I present some of the challenges that need to be overcome during the extraction and cleaning of data obtained from different data sources (such as version control repositories, bug trackers, and mailing lists). I draw an analogy between software ecosystems and natural ecosystems, and illustrate through previously reported empirical case studies how ecological diversity metrics and econometric inequality metrics can be used to increase understanding of evolving software ecosystems from dual viewpoints. It should be noted that this article has been conceived as a work of reference. It does not provide any novel research results by itself, but rather introduces the important concepts and challenges in the field. It also provides many pointers for further reading for those researchers who wish to start working in this important research domain.
Some parts of the software verification process require human annotation, but as much as possible of the rest should be automatic. An excellent candidate for full automation is change analysis, also known as the frame problem: how to determine which program properties remain unchanged under a given operation. The problem is particularly delicate in the case of programs using pointers or references, especially object-oriented programs. The double frame inference strategy automates both frame specification and frame verification. On the specification side, it deduces the set of permissible changes of a routine (its “modifies clause”) from a simple examination of its postcondition. On the implementation side, it applies the change calculus, itself based on the alias calculus, to determine the set of expressions whose values the routine can actually change. Frame verification then consists of ascertaining that the actual change set is a subset of the permissible change set.
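The core check can be pictured with a small, purely illustrative Python sketch (the routine, the attribute names, and the hand-written sets are hypothetical; in the approach itself the permissible set is deduced from the postcondition and the actual set is computed by the change calculus):

def frame_check(permissible, actual):
    # Frame verification: the actual change set must be contained in the permissible one.
    return actual - permissible           # any leftover expressions are frame violations

# Hypothetical routine `deposit` of an ACCOUNT class:
permissible_changes = {"balance", "history"}                 # deduced from the postcondition
actual_changes = {"balance", "history", "owner.rating"}      # computed by the change calculus

violations = frame_check(permissible_changes, actual_changes)
print("frame verified" if not violations else f"frame violation: {violations}")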
Model checking methods are used to verify the correctness of digital circuits and code against their formal specification. In case of design or programming errors, they provide counterexample evidence of erroneous behavior. Model checking techniques suffer from inherently high complexity, and new model checking methods attempt to speed up verification and reduce its memory requirements. Recently, the more ambitious task of converting the formal specification automatically into correct-by-design code has seen significant progress. In this paper, automata-based techniques for model checking and automatic synthesis are described.
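As a toy illustration of the kind of exhaustive state exploration and counterexample extraction involved (a deliberately unprotected mutual-exclusion model invented for this Python sketch, not an example from the paper):

from collections import deque

# A small transition system: two processes cycling idle -> wait -> critical -> idle,
# with no protection, so the safety property "never both critical" is violated.
INIT = ("idle", "idle")
STEP = {"idle": "wait", "wait": "critical", "critical": "idle"}

def successors(state):
    pc1, pc2 = state
    yield (STEP[pc1], pc2)       # process 1 moves
    yield (pc1, STEP[pc2])       # process 2 moves

def is_bad(state):
    return state == ("critical", "critical")

def check_safety():
    # Breadth-first exploration of the reachable state space.
    parent, frontier = {INIT: None}, deque([INIT])
    while frontier:
        s = frontier.popleft()
        if is_bad(s):
            trace = []           # reconstruct a counterexample run from the parent links
            while s is not None:
                trace.append(s)
                s = parent[s]
            return list(reversed(trace))
        for t in successors(s):
            if t not in parent:
                parent[t] = s
                frontier.append(t)
    return None                  # no bad state reachable: the safety property holds

print(check_safety())            # prints a counterexample trace ending in ('critical', 'critical')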
What is a good test case? One that reveals potential defects with good cost-effectiveness. We provide a generic model of faults and failures, formalize it, and present its various methodological usages for test case generation.
This is an introduction to mechanised theory engineering in Isabelle, an LCF-style interactive theorem prover. We introduce an embedding of Hoare & He's Unifying Theories of Programming (UTP) in Isabelle (named Isabelle/UTP) and show how to mechanise two key theories: relations and designs. These theories are sufficient to give an account of the partial and total correctness of nondeterministic sequential programs and of networks of reactive processes. A tutorial introduction to each theory is interspersed with its formalisation and with mechanised proofs of relevant properties and theorems. The work described here underpins specification languages such as Circus, which combines state-rich imperative operations, communication and concurrency, object orientation, references and pointers, real time, and process mobility, all with denotational, axiomatic, algebraic, and operational semantics.
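For orientation, the standard UTP definitions of relational sequential composition and of designs read roughly as follows in textbook notation (a recap of Hoare & He's definitions, not the Isabelle/UTP syntax used in the chapter):

P \mathbin{;} Q \;\widehat{=}\; \exists v_0 \,\bullet\, P[v_0/v'] \wedge Q[v_0/v]
\qquad\qquad
P \vdash Q \;\widehat{=}\; (\mathit{ok} \wedge P) \Rightarrow (\mathit{ok}' \wedge Q)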
The geomorphic view of networking is a modular theory of contemporary networking, which is currently being developed. This paper presents the basic components and ideas of this theory in the form of a genesis story—what is needed to build the simplest possible network, and why other structures must be added as the network grows bigger and more complex. The story ends with the contemporary Internet, which is vastly more complex than what the classic Internet architecture describes. The paper also summarizes work on formalizing the theory and applying it to solve real problems.