Ebook: Advances in Mathematical Modeling for Reliability
Advances in Mathematical Modeling for Reliability discusses fundamental issues on mathematical modeling in reliability theory and its applications. Beginning with an extensive discussion of graphical modeling and Bayesian networks, the focus shifts towards repairable systems, including a discussion of how sensitive availability calculations are to parameter choices and of how emulators provide the potential to perform such calculations on complicated systems to a fair degree of accuracy and in a computationally efficient manner. Another issue that is addressed is how competing risks arise in reliability and maintenance analysis through the ways in which data is censored. Mixture failure rate modeling is also a point of discussion, as well as the signature of systems, where the signature distinguishes the properties of the system from the probability distributions on the lifetimes of the components. The last three topics of discussion are relations among aging and stochastic dependence, theoretical advances in modeling, inference and computation, and recent advances in recurrent event modeling and inference.
The Mathematical Methods in Reliability conferences serve as a forum for discussing fundamental issues on mathematical modeling in reliability theory and its applications. It is a forum that brings together mathematicians, probabilists, statisticians, and computer scientists with a central focus upon reliability.
The University of Strathclyde hosted the fifth in the series of conferences in Glasgow in 2007. Previous conferences were held in Bucharest, Romania, in Bordeaux, France, in Trondheim, Norway, and in Santa Fe, New Mexico, USA.
This book contains a selection of papers originally presented at the conference and now made available to a wider audience in revised form. The book has been organized into a number of sections that represent different themes from the meeting, and important current research areas within the overall area.
1. Graphical Modeling and Bayesian Networks
Graphical methods are becoming increasingly popular for modeling and supporting the computation of the reliability of complex systems. The papers within this section address a number of challenges currently facing these methods. Langseth provides a brief review of the state of the art of Bayesian Networks in relation to reliability and then focuses on the current challenges of modeling continuous variables within this framework. Hanea and Kurowicka extend the theory for non-parametric continuous Bayesian Networks to include ordinal discrete random variables, where dependence is measured through rank correlations. Donat, Bouillaut and Leray develop a Bayesian Network approach to capture reliability that is changing dynamically. Jonczy and Haenni develop a method using propositional directed acyclic graphs to represent the structure function and hence facilitate the computation of the reliability of networks.
2. Repairable Systems Modeling
One of the fundamental problems in reliability is to find adequate models for failure and repair processes. Simple renewal models provide familiar examples to students of probability and reliability, and provide the basic building blocks for many commercial simulation packages. However, such models do not come near to describing the complex interactions between failure and repair. The paper of Kahle looks at the way (possibly) incomplete repair interacts with the failure process through a Kijima type process. Often the overall failure repair process in real systems follows a homogeneous Poisson process, and Kahle shows that maintenance schedules can be constructed to generate this type of output. Volf looks at models where degradation is modeled through a number of shocks or some other random process, and considers how one can choose optimal repair policies that stabilize the equipment hazard rate. Finally, Daneshkhah and Bedford show how Gaussian emulators can be used to perform computations of availability. A major problem in practice is to understand how sensitive availability calculations are to parameter choices, and emulators provide the potential to perform such calculations on complicated systems to a fair degree of accuracy and in a computationally efficient manner.
3. Competing Risk
Competing risks arise in reliability and maintenance analysis through the ways in which data is censored. Rather than getting “pure” failure data we usually have a messy mixture of data, for there may be many different reasons for taking equipment offline and bringing it back to “as new”, or at least to an improved state. A competing risk model is used to model the times at which such failure causes would be realized, taking into account possible interdependencies between them. There has been a growing interest in competing risk modeling over the last 10–15 years, and the papers presented here demonstrate this. Dewan looks at the interrelationship between various kinds of independence assumptions in competing risk modeling. Sankaran and Ansa consider the problem in which the failure cause is sometimes masked, and additional testing might be required to find the true failure cause. The final two papers of this section move from IID models of Competing Risk to take a point process perspective. Lindqvist surveys a number of recent papers on this topic and discusses the benefits of moving to this wider framework. Finally, Dijoux, Doyen and Gaudoin generalize the “usual” independent competing risk model theory for IID data and show that in the point process generalization one can properly formulate and solve the corresponding identifiability issues.
4. Mixture Failure Rate Modeling
Mixture models provide a means of analyzing reliability problems where there exist, for example, multiple failure modes or heterogeneous populations. It will not always be possible to observe all factors influencing the time to event occurrence, hence a random effect, called a frailty, can be included in the model. A frailty is an unobserved proportionality factor that modifies the hazard function of an item or group of items. Frailty models can be classed as univariate, when there is a single survival endpoint, or multivariate, when there are multiple survival endpoints such as under competing risks or recurrent event processes. There is much interest in modeling mixtures and frailty in survival analysis. We include two papers in this area. Finkelstein and Esaulova derive the asymptotic properties of a bivariate competing risks model, where the lifetime of each component is indexed by a frailty parameter and, under the assumption of conditional independence of the components, the correlated frailty model is considered. The other paper, due to Badía and Berrade, aims to give insights into the properties of the reversed hazard rate, defined as the ratio of the density to the distribution function, and the mean inactivity time in the context of mixtures of distributions.
5. Signature
The signature of a system refers to a vector where the i-th element is the probability that the system fails upon the i-th component failure. The Samaniego representation of the failure time of a system distinguishes the properties of the system through the signature from the probability distributions on the lifetime of the components. Such a representation is effective for comparing the reliability of different systems. This section of papers is concerned with developments of the Samaniego representation. Rychlik develops bounds for the distributions and moments of coherent system lifetimes. Triantafyllou and Koutras develop methods to facilitate the calculation of the signature of a system through generating functions. Hollander and Samaniego develop a new signature-based metric for comparing the reliability of systems. An important generalization of the concept of independence is that of exchangeability. This assumption is key to Bayesian and subjectivist modeling approaches. The paper of Spizzichino considers symmetry properties arising as a result of exchangeability and discusses generalizations to non-exchangeable systems.
6. Relations among Aging and Stochastic Dependence
Aging properties have always played an important role in reliability theory, with a multiplicity of concepts available to describe subtle differences in aging behavior. A particularly interesting development is to place such aging concepts in a multivariate context, and consider how multiple components (or multiple failure modes) interact. The paper of Spizzichino and Suter looks at aging and dependence for generalizations of the Marshall-Olkin model. Their work develops closure results for survival copulas in certain classes with specified aging properties. Belzunce, Mulero and Ruiz develop new variants on multivariate increasing failure rate (IFR) and decreasing mean residual life (DMRL) notions. Some of the basic properties and relationships between these definitions are given.
7. Theoretical Advances in Modeling, Inference and Computation
This collection of papers is concerned with developments in modeling, inference and computation for reliability assessment. Ruggeri and Soyer develop hidden Markov modeling approaches and self exciting point process models to address the issue of imperfect reliability development of software. Huseby extends the use of matroid theory to directed network graphs and derives results to facilitate the calculation of the structure function. Coolen and Coolen-Schrijner extend nonparametric predictive inference techniques to address k-out-of-m systems.
8. Recent Advances in Recurrent Event Modeling and Inference
Recurrent event processes correspond to those processes where repeated events are generated over time. In reliability and maintenance, recurrent event processes may correspond to failure events of repaired systems, processes for detection and removal of software faults, filing of warranty claims for products and so forth. Common objectives for recurrent event analysis include describing the individual event processes, characterizing variation across processes, determining the effect of external factors on the pattern of event occurrence and modeling multi-state event data. Model classes include Poisson, renewal and intensity-based models, for which a variety of parametric, semi-parametric and non-parametric inference methods are being developed. There has been growing interest in recurrent event analysis and modeling in reliability, medical and related fields, as the papers presented here demonstrate. Adekpedjou, Quiton and Peña consider the problem of detecting outlying inter-event times and examine the impact of an informative monitoring period in terms of loss of statistical efficiency. Mercier and Roussignol study and compute the first-order derivatives for some functional of a piecewise deterministic Markov process, used to describe the time-evolution of a system, to support sensitivity analysis in dynamic reliability. Lisnianski considers a multi-state system with a range of performance levels which are observed together with the times at which the system makes a transition in performance state and provides a method for estimating the transition intensities under the assumption that the underlying model is Markovian. Finally, van der Weide, van Noortwijk and Suyono present new results in renewal theory with costs that can be discounted according to any discount function which is non-increasing and monotonic over time.
Acknowledgments
The organization of the conference was made possible by the hard work of a number of different people working at Strathclyde:
Anisah Abdullah, Babakalli Alkali, Samaneh Balali, Tim Bedford, Richard Burnham, Daosheng Cheng, Alireza Daneshkhah, Gavin Hardman, Kenneth Hutchison, Alison Kerr, Haiying Nan, John Quigley, Matthew Revie, Caroline Sisi, Lesley Walls, Bram Wisse.
The conference itself was sponsored by the University of Strathclyde, Glasgow City Council and Scottish Power, whom we thank for their contributions to the event.
Bayesian network (BN) models are becoming increasingly popular as a tool in reliability analysis. In this paper we consider some of the properties of BNs that have made them popular, consider some of the recent developments, and also point to the most important remaining challenges when using BNs in reliability.
This paper introduces mixed non-parametric continuous and discrete Bayesian Belief Nets (BBNs) using the copula-vine modeling approach. We extend the theory for non-parametric continuous BBNs to include ordinal discrete random variables. The dependence structure among the variables is given in terms of (conditional) rank correlations. We use an adjusted rank correlation coefficient for discrete variables, and we emphasize the relationship between the rank correlation of two discrete variables and the rank correlation of their underlying uniforms. The approach presented in this paper is illustrated by means of an example.
Reliability analysis has become an integral part of system design and operation. This is especially true for systems performing critical tasks. Moreover, recent works in reliability involving the use of probabilistic graphical models, also known as Bayesian networks, have proved relevant. This paper describes a specific dynamic graphical model, named graphical duration model (GDM), to represent complex stochastic degradation processes with any kind of state sojourn time distributions. We give qualitative and quantitative descriptions of the proposed model and detail a simple algorithm to estimate the system reliability. Finally, we illustrate our approach with a three-state system subjected to one context variable and non-exponential sojourn time distributions.
This paper proposes a new and flexible approach for network reliability computation. The method is based on Propositional Directed Acyclic Graphs (PDAGs), a general graph-based language for the representation of Boolean functions. We introduce an algorithm which creates in polynomial time a generic structure function representation for reliability networks. In contrast to many existing methods, our method does not rely on the enumeration of all mincuts or minpaths, which may be infeasible in practice. From this representation, we can then derive the structure functions for different network reliability problems. Based on the compact PDAG representation, we can both compute the exact reliability or estimate the reliability by means of an approximation method.
We consider an incomplete repair model, that is, the impact of repair is not minimal as in the homogeneous Poisson process and not “as good as new” as in renewal processes but lies between these boundary cases. The repairs are assumed to impact the failure intensity following a virtual age process of the general form proposed by Kijima. In previous works field data from an industrial setting were used to fit several models. In most cases the estimated rate of occurrence of failures was that of an underlying exponential distribution of the time between failures. In this paper it is shown that there exist maintenance schedules under which the failure behavior of the failure-repair process becomes a homogeneous Poisson process. Further, examples of optimal maintenance under incomplete repair are given.
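The virtual age idea described above can be illustrated with a short simulation. This is a minimal sketch, not the model fitted in the paper: it assumes the two update rules commonly cited as Kijima Type I and Type II, a Weibull baseline with cumulative hazard (t/eta)^beta, and a constant repair degree q in [0, 1]. The function name and all parameters are hypothetical.

```python
import random


def simulate_kijima(n_failures, q, beta, eta, kind=1, seed=0):
    """Simulate failure times under a Kijima-type virtual age model.

    Assumptions (illustrative only): Weibull baseline with cumulative
    hazard H(t) = (t / eta) ** beta and a constant repair degree q.
    q = 0 resets the virtual age (as good as new); q = 1 leaves it
    untouched (minimal repair).
    """
    rng = random.Random(seed)
    virtual_age = 0.0
    clock = 0.0
    failure_times = []
    for _ in range(n_failures):
        # Conditional survival given virtual age v is S(v + x) / S(v), so the
        # next operating time solves H(v + x) - H(v) = E with E ~ Exp(1).
        e = rng.expovariate(1.0)
        x = eta * ((virtual_age / eta) ** beta + e) ** (1.0 / beta) - virtual_age
        clock += x
        failure_times.append(clock)
        if kind == 1:
            # Type I: repair only removes part of the age gained since last repair.
            virtual_age = virtual_age + q * x
        else:
            # Type II: repair rescales the whole accumulated virtual age.
            virtual_age = q * (virtual_age + x)
    return failure_times
```

With beta = 1 the baseline is exponential, so the simulated process is a homogeneous Poisson process whatever the value of q; with beta > 1 the choice of q visibly changes how quickly failures accumulate.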
The models of imperfect repairs are mostly based on the reduction of the cumulated hazard rate, either directly or indirectly (by shifting the virtual age of the system). If the state of the system is characterized by a process of deterioration, the repair degree can be connected with the reduction of the deterioration level. Such a view actually transforms the time scale (or the scale given by the cumulated hazard rate) to the scale of the growing deterioration. From the problems connected with such models (consistency of statistical analysis, model fit assessment etc.) we shall discuss mainly the question of the repair schemes, their consequences, and possibilities of an ‘optimal’ repair policy leading to the hazard rate stabilization.
The availability of a system under a failure/repair process is a function of time which can be calculated numerically. The sensitivity analysis of this quantity with respect to changes in parameters is the main objective of this paper. In the simplest case, where the failure/repair process is (continuous time/discrete state) Markovian, explicit formulas are well known. Unfortunately, in more general cases this quantity can be a complicated function of the parameters. Thus, the computation of the sensitivity measures may be infeasible or time-consuming.
In this paper, we present a Bayesian framework originally introduced by Oakley and O'Hagan [7] which unifies the various tools of probabilistic sensitivity analysis. These tools are well known in the Bayesian Analysis of Computer Code Outputs (BACCO).
In our case, we only need to quantify the availability measure at a few parameter values as the inputs and then use BACCO to obtain the interpolation function and the sensitivity to the parameters.
The paper gives a brief introduction to BACCO methods, and the availability problem. It illustrates the technique through the use of an example and makes a comparison to other methods available.
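For the simplest Markovian case mentioned above, the explicit formula is easy to write down. The sketch below is not the emulator approach of the paper; it assumes a single two-state (up/down) component with constant failure rate lam and repair rate mu, starting in the up state, and approximates the sensitivity by a central finite difference. Function and parameter names are hypothetical.

```python
import math


def availability(t, lam, mu):
    """Point availability A(t) of a two-state Markov component that is up at t = 0.

    A(t) = mu / (lam + mu) + lam / (lam + mu) * exp(-(lam + mu) * t)
    """
    s = lam + mu
    return mu / s + (lam / s) * math.exp(-s * t)


def sensitivity_to_failure_rate(t, lam, mu, h=1e-6):
    """Central finite-difference approximation of dA(t) / d(lam)."""
    return (availability(t, lam + h, mu) - availability(t, lam - h, mu)) / (2.0 * h)
```

For large t the availability tends to mu / (lam + mu), and the sensitivity to lam tends to -mu / (lam + mu)^2. In more general models no such closed form exists, which is the situation the abstract addresses.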
Consider a competing risk set-up with two risks and the latent failure times given by X and Y. Statistical analysis of this model has been done under three different independence assumptions: independence of X and Y, independence of T=min(X,Y) and δ=I(X<Y), and independence of X and δ. We discuss examples where these independence assumptions arise and also the relationships among the three.
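A small simulation makes the observable pair (T, δ) concrete. This is an illustrative sketch, not an example taken from the paper: it assumes independent exponential latent times X and Y with hypothetical rates rate_x and rate_y, a case in which T and δ are known to be independent and P(δ=1) = rate_x / (rate_x + rate_y).

```python
import random


def simulate_competing_risks(n, rate_x, rate_y, seed=0):
    """Return n observations (T, delta) with T = min(X, Y), delta = I(X < Y).

    Assumption (illustrative): X and Y are independent exponentials.
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        x = rng.expovariate(rate_x)
        y = rng.expovariate(rate_y)
        sample.append((min(x, y), 1 if x < y else 0))
    return sample


def mean_time_by_cause(sample):
    """Average observed time T within each cause group, keyed by delta."""
    means = {}
    for cause in (0, 1):
        times = [t for t, d in sample if d == cause]
        means[cause] = sum(times) / len(times)
    return means
```

Under these assumptions the two group means returned by `mean_time_by_cause` should both be close to 1 / (rate_x + rate_y), reflecting the independence of T and δ. With dependent latent times this need not hold.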
Consider a system that consists of k components, each of which is subject to more than one cause of failure. Due to inadequacy in the diagnostic mechanism or reluctance to report any specific cause of failure, the exact cause of failure cannot be identified easily. In such situations, where the cause of failure is masked, test procedures restrict the cause of failure to a set of possible types containing the true failure cause. In this paper, we develop a non-parametric estimator for the bivariate survivor function of competing risk models under masked causes of failure based on the vector hazard rate. Asymptotic properties of the estimator are discussed. We also illustrate the method with a data set.
We consider repairable systems where the observed events may be of several types. It is suggested to model the observations from such systems as marked point processes, leading to a need for extending the theory of repairable systems to a competing risks setting. In this paper we consider in particular virtual age models and their extension to the case of several types of events.
A complex repairable system is subjected to corrective maintenance (CM) and condition-based preventive maintenance (PM) actions. In order to take into account both the dependency between PM and CM and the possibility of imperfect maintenance, a generalized competing risks model has been introduced in [5]. In this paper, we study the particular case in which the potential times to next PM and CM are conditionally independent given the past of the maintenance process. We address the identifiability issue and find a result similar to that of [2] for usual competing risks. We propose a realistic model with exponential risks and derive the maximum likelihood estimators of its parameters.
A bivariate competing risks model is considered for a general class of survival models. The lifetime distribution of each component is indexed by a frailty parameter. Under the assumption of conditional independence of components the correlated frailty model is considered. The explicit asymptotic formula for the mixture failure rate of a system is derived. It is proved that asymptotically, as t→∞, the remaining lifetimes of components tend to be independent in the defined sense.
The reversed hazard rate, defined as the ratio of the density to the distribution function, is of increasing importance in reliability analysis. Its connection with the mean inactivity time also stands out. Owing to the growing use of both functions, we aim to give some insight into their properties in mixtures of distributions.
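For reference, the two functions named above can be written out as follows. This is a sketch under assumed notation that is not taken from the paper: X is a nonnegative lifetime with density f and distribution function F, and the formulas hold for t with F(t) > 0. The mixture identity is the elementary two-component case with mixing weight p.

```latex
% Reversed hazard rate and mean inactivity time
r(t) = \frac{f(t)}{F(t)}, \qquad
m(t) = E[\, t - X \mid X \le t \,] = \frac{1}{F(t)} \int_0^t F(u)\, du .

% Connection between the two (differentiate m(t))
m'(t) = 1 - r(t)\, m(t) .

% Two-component mixture F = p F_1 + (1 - p) F_2
r(t) = w(t)\, r_1(t) + \bigl(1 - w(t)\bigr)\, r_2(t), \qquad
w(t) = \frac{p\, F_1(t)}{F(t)} .
```

The mixture's reversed hazard rate is thus a time-varying weighted average of the component reversed hazard rates, with weights that depend on t through the distribution functions.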
We consider coherent systems based on dependent components with arbitrary exchangeable and continuous joint distributions. Applying an extension of the Samaniego representation for the system lifetime distributions with independent components to the exchangeable model, we provide some bounds for the distributions and moments of the coherent system lifetimes. In particular, we present sharp upper and lower bounds on the distribution functions and expectations of arbitrary system lifetimes, dependent on the Samaniego signature of the system and the marginal distribution of the components. We further determine more general expectation bounds dependent on the mean and variance of the component lifetime marginal distribution, and respective refinements for restricted classes of distributions. We also consider evaluations of lifetime variances in terms of the marginal distribution and variance of a single component.
In the present article we develop some tools that facilitate the calculation of the signature of a system by a generating function approach. As an application, we establish a recurrence relation for the computation of the signature of a linear consecutive 2-out-of-n:F system and provide a formula that expresses the signature of a circular consecutive k-out-of-n:F system in terms of the linear one.
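For small systems the signature can also be checked by brute force. The sketch below is not the generating function approach of the article: it assumes i.i.d. continuous component lifetimes, so that every failure ordering is equally likely, and simply enumerates all orderings. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def signature(n, system_failed):
    """Signature of an n-component system by enumerating failure orders.

    system_failed(failed) must return True when the system is down given
    the set of failed component indices (0-based).
    """
    counts = [0] * n
    for order in permutations(range(n)):
        failed = set()
        for i, component in enumerate(order):
            failed.add(component)
            if system_failed(failed):
                counts[i] += 1  # system dies at the (i + 1)-th component failure
                break
    total = factorial(n)
    return [Fraction(c, total) for c in counts]


def linear_consecutive_2_out_of_n_F(n):
    """Structure of a linear consecutive 2-out-of-n:F system:
    the system fails as soon as two adjacent components have failed."""
    def system_failed(failed):
        return any(j in failed and j + 1 in failed for j in range(n - 1))
    return system_failed
```

For example, `signature(3, linear_consecutive_2_out_of_n_F(3))` gives (0, 2/3, 1/3). The enumeration costs n! evaluations, which is why recurrence or generating function methods are needed for larger n.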
While various forms of stochastic domination (including stochastic, hazard rate or likelihood ratio ordering) of one random variable over another have proven useful in making comparisons between systems, they share a common limitation. These modes of comparing systems induce only a partial ordering on the class of systems of interest, leaving some pairs of systems non-comparable. Comparisons via stochastic precedence (as defined in [1]) do not suffer from this limitation. In this paper, we describe how stochastic precedence may be used as a metric in comparing arbitrary systems whose components are assumed to be independent and identically distributed with common distribution F. An explicit computational formula is displayed for the relevant probability P(T1≤T2), where T1 and T2 are system lifetimes. A necessary and sufficient condition depending solely on system signatures is given for stochastic precedence between system lifetimes. Examples are given that illustrate the fact that systems whose lifetimes are not comparable by stochastic, hazard rate or likelihood ratio ordering may be definitively compared via stochastic precedence. In the final section, we focus on comparisons between systems whose signatures are symmetric.
The signature of a coherent system S is a feature of its structure function that, in the case when the lifetimes of the components of S are exchangeable, has a key role in the computation of the reliability function. We detail several aspects of this property and discuss the role that the concept of signature can have also for the case when the components' lifetimes are not exchangeable.
We analyze several aspects of a class of bivariate survival models that arise as a direct generalization of the bivariate exponential Marshall-Olkin model and that describe situations of (possibly dependent) competing risks.
In the literature several authors have proposed multivariate extensions of univariate aging notions such as IFR (increasing failure rate) and DMRL (decreasing mean residual life). Bassan and Spizzichino [2] and Bassan, Kochar and Spizzichino [1] proposed new multivariate notions when the lifetimes of the components have exchangeable joint probability distributions. These new notions are based on stochastic comparisons of the residual lifetimes of the components and on definitions and characterizations of IFR and DMRL notions in the univariate case. In this paper we consider new multivariate notions based on known characterizations (see Cao and Wang [7], Belzunce, Hu and Khaledi [4] and Belzunce, Gao, Hu and Pellerey [3]) of IFR and DMRL notions. Some properties and preservation results under mixtures are also given.
This paper reviews recent developments in Bayesian software reliability modeling. In so doing, emphasis is given to two models which can incorporate the case of reliability deterioration due to potential introduction of new bugs to the software during the development phase. Since the introduction of bugs is an unobservable process, latent variables are introduced to incorporate this characteristic into the models. The two models are based, respectively, on a hidden Markov model and a self-exciting point process with latent variables.
The domination function has played an important part in reliability theory. While most of the work in this field has been restricted to various types of network system models, many of the results can be generalized to much wider families of systems associated with matroids. Previous papers have explored the relation between undirected network systems and matroids. In this paper the main focus is on directed network systems and oriented matroids. Classical results for directed network systems include the fact that the signed domination is either +1 or −1 if the network is acyclic, and zero otherwise. It turns out that these results can be generalized to systems derived from oriented matroids. Several classes of such systems will be discussed.
We present lower and upper probabilities for reliability of k-out-of-m systems with exchangeable components. These interval probabilities are based on the nonparametric predictive inferential (NPI) approach for Bernoulli data [5]. It is assumed that test data are available on the components, and that the components to be used in the system are exchangeable with those tested. An attractive feature is the way in which data containing zero failures can be dealt with.
This article presents some results pertaining to recurrent event modeling and analysis. In particular, we consider the problem of detecting outliers and also examine the impact of an informative monitoring period in terms of loss of efficiency. Aside from the ideas and analytical results, we demonstrate these aspects through an application to the well-used air-conditioning reliability data set in [18].
The aim of this paper is to study and to compute first-order derivatives with respect to some parameter p, for some functionals of piecewise deterministic Markov processes (PDMP), in view of sensitivity analysis in dynamic reliability. Such functionals are mean values of some function of the process, cumulated on some finite interval [0,t], and their asymptotic value per unit time.