Zhou Thesis PDF
September 2016
Shuaiwei Zhou
I dedicate this thesis to my wonderful and supportive parents
Declaration
I declare that this thesis has not been submitted as an exercise for a degree at this
I agree to deposit this thesis in the University’s open access institutional repository
Shuaiwei Zhou
This thesis focuses on modelling and inference for maintenance systems for the pur-
pose of utility optimisation. Providing standardised notation throughout, we first
demonstrate the motivation for investigating the problem of modelling and inference
for maintenance systems and briefly state the problems which are to be explored. The
definitions and terminology, which are also used within the general domains of science
and engineering, have been presented in terms of statistical representation.
We propose a Bayesian method to sequentially optimise the utility of a two-phase
maintenance system using dynamic programming. In particular, the parameters
of the failure distribution for the system of interest are analysed within the Bayesian
framework. Utility-based maintenance is then studied in several modified models,
including imperfect preventive maintenance, the time value of money in maintenance,
maintenance for systems with discrete failure time distributions, and maintenance for
parallel redundant systems, each of which is illustrated with numerical examples. A
hybrid approach combining myopic and dynamic programming methods is proposed to
solve multi-phase maintenance systems.
The Bayesian dynamic programming is carried out through a gridding approach,
which addresses the difficulty arising from a nested series of maximisations and
integrations over a highly non-linear space. The increment, which is the core of the
gridding method, is studied extensively. We also utilise and modify the approach
proposed by Baker (2006) to analyse the effect of risk aversion on the variability of the
system’s cash flows.
The potential generalisation of the current models is discussed, and future work
concerning more complicated models and efficient computation methods is also
indicated.
Acknowledgements
I would like to express my utmost and sincere gratitude to my two wonderful super-
visors, Professor Simon P. Wilson and Professor Brett Houlding, for their invaluable
guidance and constant devotion to me during my PhD research. No matter how stupid
the questions I asked or how naïve I was, they were always extremely patient in explaining
and exploring potential ideas with me.
Within the discipline of statistics, I would like to acknowledge the kind help of, and
interesting discussions with, the professors in our department and beyond, in particular
Myra O’Regan, John Hasslet, Eamonn Mullins and Elizabeth Heron. My memorable
PhD journey has been thankfully shared with fellow researchers who are Jason Wyse,
Louis Aslett, Susanne Schimitz, Arnab Bhattacharya, Tiep Mai, Arthur White, Gernot
Roetzer and Angela McCourt. I have been fortunate to have wonderful friendships
with Sean O’Riordain, Cristina De Persis, Thinh Doan, Donnacha Bolger and Shane
O’Meachair. My appreciation also goes to Charles McLaughlin and Michael Kelly for
their generous help during my PhD life in Ireland.
I am also indebted to my parents, and my brother and his wife, for their incredible
encouragement and support. Xiaoyao and Xiaonan, who burst onto the scene during
my undergraduate and postgraduate time, make me proud to be a happy uncle.
The generous support of a four-year scholarship by the China Scholarship Council
and a three-year studentship by the University of Dublin made it all feasible.
Shuaiwei Zhou
Trinity College, Dublin
September 2016
Abbreviations
Contents
Abstract
Acknowledgements
Abbreviations
Chapter 1 Introduction
1.1 Background and Motivation
1.2 Structure and Main Contributions
Chapter 3 Statistical Methodology and Utility
3.1 Bayesian Modelling
3.1.1 Likelihood function
3.1.2 Prior distribution
3.1.3 Posterior analysis
3.1.4 Hierarchical Bayesian models
3.1.5 Bayesian method in maintenance
3.2 Dynamic Programming
3.2.1 An Elementary Example
3.2.2 Characteristics
3.2.3 Formalisation under Uncertainty
3.3 Utility Theory
3.3.1 Utility Functions and Probabilities
3.3.2 Expected Utility
3.3.3 Risk Aversion
3.3.4 Utility in Maintenance
Chapter 5 Sequential Maintenance Extension
5.1 Imperfect Maintenance
5.2 Time Value of Money
5.3 Maintenance in Discrete Time
5.4 Maintenance for Parallel Systems
5.5 PPM under failure time distribution assumptions
5.6 Hybrid Myopic-Dynamic Programming
5.7 Sensitivity Analysis
5.7.1 Gridding Increments
5.7.2 Cost Structure
5.7.3 Prior Sensitivity
5.7.4 Utility Function Forms
5.8 Parallel Computing
List of Tables
3.1 Number of failed students for a course with 4 sessions and 6 available demonstrators.
3.2 Dynamic Programming Example: Stage 4.
3.3 Dynamic Programming Example: Stage 3.
3.4 Dynamic Programming Example: Stage 2.
3.5 Dynamic Programming Example: Stage 1.
3.6 Properties of Utility Functions
5.1 Optimal Imperfect Preventive Maintenance (IPM) time and corresponding expected cost for chance nodes CN1 and CN22 conditioning on various PM power parameter β based on dynamic programming and myopic methods.
5.2 Optimal Perfect Preventive Maintenance (PPM) time and corresponding
expected cost rate for chance node CN1 conditioning on various time
effect parameter r based on dynamic programming. . indicates the
cost rate comparison to its previous one.
5.3 Optimal perfect preventive maintenance (PPM) time and corresponding expected cost for chance node CN1 by dynamic programming conditioning on various parameter p of a discrete Weibull failure distribution.
5.4 Optimal perfect preventive maintenance (PPM) time and corresponding expected cost rate for each chance node by dynamic programming for one-unit systems and two-unit redundant parallel systems. Bracketed figures are failure time Tf1 with respect to Tm1 and Tm2, numbers in brackets representing corresponding failure times.
5.5 Optimal Perfect Preventive Maintenance (PPM) time and corresponding expected cost rate for each chance node by dynamic programming based on Weibull and gamma failure time assumptions. Bracketed figures are failure time Tf1 with respect to Tm1 and Tm2, numbers in brackets representing corresponding failure times.
5.6 Optimal Perfect Preventive Maintenance (PPM) time and corresponding expected cost rate of chance nodes CN21 and CN22 for three-phase maintenance systems based on Hybrid Myopic-Dynamic Programming and myopic methods.
5.7 Comparison of optimal perfect preventive maintenance (PPM) time and corresponding expected cost for each chance node based on different increment parameter δ. Bracketed figures are failure time Tf1 with respect to Tm1 and Tm2, numbers in brackets representing corresponding failure times.
5.8 Optimal perfect preventive maintenance (PPM) time and corresponding expected cost for chance node CN1 conditioning on various increment parameter δ based on dynamic programming.
5.9 Optimal perfect preventive maintenance (PPM) time and corresponding expected cost for chance node CN1 conditioning on cost difference between failure and maintenance based on dynamic programming.
5.10 Optimal perfect preventive maintenance (PPM) time and corresponding expected cost for chance node CN1 conditioning on various prior mean of parameter θ based on dynamic programming.
5.11 Optimal perfect preventive maintenance (PPM) time and corresponding expected cost for chance node CN1 conditioning on various prior standard deviation of parameter θ based on dynamic programming.
5.12 Optimal perfect preventive maintenance (PPM) time and corresponding expected cost for each chance node by dynamic programming based on non-Log and Log utility functions. Bracketed figures are failure time Tf1 with respect to Tm1 and Tm2, numbers in brackets representing corresponding failure times.
List of Figures
4.12 3D scatter plot for maximal probability of Tf2 conditioning on tf1, i.e., max pTf2(tf2 | tf1), with vertical lines for each point.
4.13 Maximal probability of Tf2 conditioning on tf1, i.e., max pTf2(tf2 | tf1).
4.14 Tf2 that has maximal probability conditioning on tf1, i.e., arg max pTf2(tf2 | tf1); corresponding probabilities (rounded to 2 decimals) shown alongside dots that represent Tf2 which maximise the probabilities given a specific tf1.
4.15 Maximal probability of Tf2 and the corresponding Tf2; corresponding tf1 shown alongside dots.
4.16 Probability of Tf2 conditioning on tf1 > Tm1, i.e., pTf2(tf2 | tf1 > Tm1), where Tm1 = 0.1, ..., 6, by the gridding method.
4.17 Comparison of pTf1(tf1) (red) and pTf2(tf2 | tf1 > Tm1) (green), where Tm1 = 0.1, 0.5, 1.0, 1.1, 1.5, 2.0, 2.1, 2.5, 3.0, 3.1, 3.5, 4.0, 4.1, 4.5, 5.0, 5.1, 5.5, 5.9.
4.18 Expected utilities for two-phase systems at chance nodes CN21, CN22 and CN1 under the dynamic programming method.
4.19 Expected costs for two-phase systems at chance nodes CN21, CN22 and CN1 under the dynamic programming method.
4.20 Risk aversion parameter η on utility.
4.21 Optimal Maintenance Decision Tree for Two-Phase Systems.
4.22 Posterior probability density of θ conditioning on Tf1 (≤ 0.5) and Tf1 > 0.5 compared with prior probability density of θ.
5.5 Decision tree for three-phase system with sequential problem with shading indicating a range of possible outcomes for the preceding chance node; Box DP-1 and DP-2 show the break into two period problems.
5.6 Comparison of probabilities of Tf2 conditioning on varying Tf1 and optimal Tm1.
5.7 Posterior probability density of θ conditioning on Tf1 ≤ Tm1 (0.5) and Tf1 > Tm1 (0.5) compared with prior probability density of θ.
5.8 Expected cost rates at chance node CN21 conditioning on tf1 ≤ Tm1 (top-left: tf1 = 0.1, top-right: tf1 = 0.2, middle-left: tf1 = 0.3, middle-right: tf1 = 0.4 and bottom-left: tf1 = 0.5); and expected cost rates at chance node CN22 conditioning on tf1 > Tm1 (bottom-right) under the H-DP-M method.
5.9 Optimal Maintenance Decision Tree for Three-Phase Systems
5.10 Sensitivity analysis concerning gridding intervals: Compiling time (top-left); Optimal Maintenance Time (top-right); Expected Cost Rate (bottom-left); Expected Cost Rate per Compiling Time Unit (bottom-right).
5.11 Parallel Computing for p(tf2, tf1)
Chapter 1
Introduction
analysis is applied to use accumulating evidence to make advantageous early decisions.
In the context of system engineering, this could help save cost and even improve sys-
tem performance. The Bayesian method of sequential analysis is to make decisions that
minimise the expected value of some loss function which can be viewed as a function
of corresponding inputs and outputs, see DeGroot (1970) and Brockwell and Kadane
(2003). In this thesis, we focus the study on Bayesian sequential analysis applied to
maintenance optimisation of repairable systems.
The study of system maintenance has attracted increasing attention in recent years
because of a need from industry for increasing the reliability and availability of systems
whilst decreasing the associated costs. Percy and Kobbacy (1996) pioneered work in
preventive maintenance modelling from a Bayesian perspective. Damien et al. (2007)
analysed single item maintenance in a Bayesian semi-parametric setting, which addresses
the failure of other models to capture the true underlying relationships in
the data. However, their analysis is based on a pre-defined finite time horizon (see, for
example, Baker (2010)); in other words, the maintenance time phases are fixed in advance,
which is not practical in reality. In our work, by contrast, the maintenance time
phases, although also specified for a particular system, are random and flexible,
which suits a maintenance scheduling programme. Nonparametric methods have
also been investigated in system maintenance. Gilardoni et al. (2013) use a
power-law-process parametric method, incorporating the nonparametric maximum likelihood
estimate of an intensity function, to estimate the optimal preventive maintenance policy.
However, all these approaches fail to consider sequential maintenance, which requires
more complicated modelling and longer computation time.
Maintenance based on prognostics is a prior event analysis and action. By
incorporating prognostics into the maintenance decision making process, one can
carry out a maintenance forecast based on known characteristics as well as an
evaluation of the significant parameters of the item. With regard to maintenance objectives,
a cost-based optimisation framework is adopted in most of the literature (Van Horenbeek
et al., 2010). However, the focus should not be on costs alone, as risk preference is simply
ignored if only a cost-oriented objective is taken into account. Utility functions used
to measure risk preferences are ubiquitous in economic research but rare in the
maintenance literature; the little published work that is the exception occurs in warranty
and inventory, see Padmanabhan and Rao (1993); Keren and Pliskin (2006). In fact, the
field of maintenance and reliability
is a suitable area to apply risk-averse policies because there are numerous cash flows
occurring stochastically. A drawback of taking cost per unit time as a criterion of
optimality is that two policies might then be equally attractive, even if for one of them
the annual maintenance spend were much more variable than that for the other. What
might be seen by some as over-maintenance, in the sense that mean cost per unit time is
not minimised, could be optimal as a risk-averse policy, in which the large unscheduled
losses from failure have such a disutility that very frequent maintenance is carried out.
Clearly, a policy that minimised cost per unit time would be unsatisfactory for a main-
tenance engineer who could not convince management that periods of high loss were
an unavoidable part of an optimal long-term policy or for an enterprise that could not
survive because of short-term cash flow problems. Thus, extra maintenance activity is
an insurance policy against large losses occurring over a period.
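The argument about variability can be made concrete with a small numerical sketch. It is not taken from the thesis: the exponential utility u(c) = −exp(a·c) over a cost c, the parameter name `risk_aversion`, and the two hypothetical policies with made-up annual costs are assumptions chosen only for illustration. Both policies have the same mean cost, yet the certainty-equivalent cost of the more variable one is higher for a risk-averse decision maker.

```python
import math


def mean_cost(costs, probs):
    """Expected cost of a discrete cost distribution."""
    return sum(c * p for c, p in zip(costs, probs))


def certainty_equivalent_cost(costs, probs, risk_aversion):
    """Sure cost judged exactly as bad as the random cost.

    Assumes the exponential utility u(c) = -exp(risk_aversion * c),
    so that CE = log(E[exp(risk_aversion * C)]) / risk_aversion.
    """
    expected_exp = sum(p * math.exp(risk_aversion * c)
                       for c, p in zip(costs, probs))
    return math.log(expected_exp) / risk_aversion


# Hypothetical annual maintenance spend for two policies (illustrative numbers).
steady_costs, steady_probs = [100.0], [1.0]
variable_costs, variable_probs = [50.0, 150.0], [0.5, 0.5]
risk_aversion = 0.01  # hypothetical risk-aversion level

ce_steady = certainty_equivalent_cost(steady_costs, steady_probs, risk_aversion)
ce_variable = certainty_equivalent_cost(variable_costs, variable_probs, risk_aversion)

print("mean costs:", mean_cost(steady_costs, steady_probs),
      mean_cost(variable_costs, variable_probs))
print("certainty-equivalent costs:", ce_steady, ce_variable)
```

Under a pure cost-per-unit-time criterion the two policies would tie; under this assumed utility the steadier policy is preferred.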
Models and methodologies proposed in this thesis are primarily suited to large
industrial applications, for example, an automatic manufacturing system, a robotic process,
or a computer server for non-life-essential services, in which failure is neither
rare nor frequent, maintenance is neither cheap nor trivial, and failure is a considerable
expense, though not exorbitantly so. Hence, this approach is not suitable to apply to
maintenance of systems with very high risk aversion properties, e.g., a nuclear power
facility, an off-shore oil field, or a life support system. It is also not worthwhile applying
to trivial systems where the computational cost of performing this analysis outweighs
any savings.
• Chapter 1 introduces the research background and motivation, and briefly out-
lines the structure and main contributions of the thesis.
• Although research background and motivation is given in Chapter 1, Chapter
2 gives a detailed review on maintenance modelling and analyses as well as the
fundamental concepts in systems maintenance, from reliability measures to clas-
sical failure time distributions. We highlight the essential publications which are
highly related to the research questions in this thesis.
• For those beyond the statistical research community, Chapter 3 briefly presents
the Bayesian perspective on modelling, and continues to introduce the founda-
tional concepts of the dynamic programming method as well as utility theory,
which are the maintenance optimisation methodologies in this research.
• Chapter 4 solves the sequential maintenance problem under the policy of perfect
preventive maintenance by a dynamic programming method utilising the idea
proposed by Brockwell and Kadane (2003), whereby a grid is constructed in the
maintenance and failure time space, over which the utility functions of expected
cost per unit time are evaluated. This method has a computation time which is
linear in the number of phases in the sequential problem.
• Chapter 6 utilises and modifies the approach proposed by Baker (2006) to
investigate and analyse the effect of risk aversion on the variability of the system’s cash
flows from a certainty-equivalent point of view.
• Chapter 7 states the major conclusions and contributions of this research and
suggests future work directions.
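The gridding idea mentioned for Chapter 4 can be illustrated in a much simpler setting than the one treated in the thesis. The sketch below is an assumption-laden simplification, not the thesis method: it is single-phase and non-Bayesian, it uses the classical age-replacement cost rate (expected cost per cycle divided by expected cycle length), a Weibull lifetime with known parameters, and hypothetical costs `cost_pm` and `cost_failure`. It only shows how an expected cost per unit time can be evaluated over a grid of candidate maintenance times with increment `delta` and then minimised.

```python
import math


def weibull_survival(x, shape, scale):
    """S(x) = exp(-(x/scale)^shape) for a Weibull lifetime."""
    return math.exp(-((x / scale) ** shape))


def cost_rate_on_grid(shape, scale, cost_pm, cost_failure, delta, t_max):
    """Evaluate the expected cost per unit time at each grid point.

    For a maintenance time T the cycle ends at failure (cost_failure) or at T
    (cost_pm), whichever comes first; the expected cycle length is the
    integral of S over [0, T], accumulated here with the trapezoidal rule.
    """
    n_points = int(round(t_max / delta))
    results = []
    integral = 0.0
    previous_s = 1.0  # S(0) = 1
    for i in range(1, n_points + 1):
        t = i * delta
        s = weibull_survival(t, shape, scale)
        integral += 0.5 * (previous_s + s) * delta
        previous_s = s
        expected_cost = cost_failure * (1.0 - s) + cost_pm * s
        results.append((t, expected_cost / integral))
    return results


# Hypothetical inputs: increasing hazard (shape 2), failure ten times dearer than PM.
grid = cost_rate_on_grid(shape=2.0, scale=1.0, cost_pm=1.0,
                         cost_failure=10.0, delta=0.01, t_max=3.0)
best_time, best_rate = min(grid, key=lambda point: point[1])
print("grid-optimal maintenance time:", best_time)
print("expected cost per unit time:", best_rate)
```

A smaller `delta` gives a finer answer at the price of more evaluations, which is the trade-off behind studying the increment.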
Chapter 2
In this chapter we introduce the difference between non-repairable systems and re-
pairable systems, and classify the maintenance policies and review related modelling
methods.
discarded when they fail. Consider for example a small desk-top fan which can be
purchased for less than 10 euro at a discount shop. When such a unit fails, we would
probably discard it and buy another, because the cost of fixing it is greater than that
of purchasing a new one. Many electrical systems are now non-repairable, or they are
more expensive to repair than to replace.
A few definitions used in repairable systems are given below.
Definition 2.2 Local time Failure times of a repairable system are measured in
local time if the failure times are recorded as time since the previous failure. Failures
in local time will be denoted by T1 , T2 , . . ..
Local time is mainly used in the following work, unless explicitly stated.
There are similarities and differences between repairable and non-repairable systems.
A few issues are clarified to understand repairable system behaviour as follows.
For a non-repairable system the lifetime of the system is a random variable. As
there is no repair, the system would be discarded after its one and only failure, and if it
does not have an impact on the performance of a similar system located elsewhere, then
the assumption that different systems have lifetimes that are independent is reasonable.
Also, if many copies of the system were produced by the same manufacturing process,
then it is also reasonable to assume that the system lifetimes have the same distribution.
These two assumptions can be combined into one statement that says the lifetimes are
independent and identically distributed (IID) from some distribution having cumulative
distribution function (CDF) F (x).
Definition 2.4 Cumulative Distribution Function The cumulative distribution
function (CDF) of a random variable X is defined to be the function
F (x) = P (X ≤ x).
Since the lifetime must be nonnegative the probability distribution must have pos-
itive probability on the positive axis only. In other words, F (x) = 0 for x < 0.
Definition 2.5 Survival Function The survival function S(x), also called the re-
liability function, is the probability that a system will carry out its mission through time
x.
The survival function evaluated at x is just the probability that the failure time is
beyond time x. Thus the survival function is related to the CDF in the following way:
\[
S(x) = P(X > x) = 1 - F(x).
\]
Another important function related to, but distinct from the PDF, is the hazard
function.
\[
h(x) = \lim_{\Delta x \to 0} \frac{P(x < X \le x + \Delta x \mid X > x)}{\Delta x}. \tag{2.3}
\]
This is the limit of the probability per unit time that a unit fails (for the first and only
time) in a small interval given that it has survived to the beginning of the interval.
Compare the definition of the hazard function h(x) in (2.3) with the result for the pdf
given in (2.2). These are nearly the same, except one is a conditional probability and
the other is not.
One property of a PDF is that it must integrate to 1; that is, since we are dealing
with random variables that have all the probability on the nonnegative axis,
\[
\int_0^{\infty} f(x)\,dx = 1.
\]
The hazard is defined as the limit of a conditional probability, but it is not a conditional
probability density function. The hazard function does not need to integrate to 1, and
in fact, for most distributions we study, the hazard integrates not to 1 but to infinity
(see Cumulative Hazard Function). For a system whose hazard function is increasing,
this means that (in the limit) the probability of failure in a small interval divided by
the length of the interval is increasing with time. Thus if we take a small fixed length
of time, such as one hour, an increasing hazard would mean that the probability of
failing in this one hour, given that the system survived past the start of that hour,
increases with the age of the system. In this case we say that the system is wearing
out. Compare this definition with that of deterioration for a repairable system. We
say that a repairable system deteriorates when the times between failures tend to get
smaller, and we say that a non-repairable system is wearing out if the hazard function
is increasing. A non-repairable system with a decreasing hazard function is said to
experience burn-in. The term “deteriorate” will be reserved for repairable systems
and the term “wear out” will be reserved for non-repairable systems. Similarly, the
terms “improvement” and “burn-in” will be reserved for repairable and non-repairable
systems, respectively. Also note, for a continuous random variable the hazard function
can be defined as
\[
h(x) = \frac{f(x)}{S(x)}.
\]
Knowing any one of the pdf f (x), the cdf F (x), the survival function S(x), or the
hazard function h(x) is enough to find all of the others.
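As a minimal sketch of this equivalence, the code below starts from a hazard function alone and recovers the other three functions numerically. It relies on the standard relations H(x) = ∫₀ˣ h(t) dt, S(x) = exp(−H(x)), F(x) = 1 − S(x) and f(x) = h(x)S(x). The linear hazard h(x) = 2x is an illustrative assumption, chosen because its survival function exp(−x²) is known and can be used as a check; the function names are hypothetical.

```python
import math


def hazard(x):
    """Illustrative assumption: a linearly increasing hazard h(x) = 2x."""
    return 2.0 * x


def cumulative_hazard(x, steps=10_000):
    """H(x): integral of the hazard over [0, x], by the midpoint rule."""
    width = x / steps
    return sum(hazard((i + 0.5) * width) for i in range(steps)) * width


def survival(x):
    """S(x) = exp(-H(x))."""
    return math.exp(-cumulative_hazard(x))


def cdf(x):
    """F(x) = 1 - S(x)."""
    return 1.0 - survival(x)


def pdf(x):
    """f(x) = h(x) S(x), a rearrangement of h(x) = f(x) / S(x)."""
    return hazard(x) * survival(x)


x = 1.2
print("S(x) recovered:", survival(x), " closed form:", math.exp(-x ** 2))
print("f(x) recovered:", pdf(x), " closed form:", 2.0 * x * math.exp(-x ** 2))
```

Starting instead from the pdf or the cdf would work the same way, with differentiation or integration replacing the steps above.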
As x tends to infinity, S(x) tends to 0 and the cumulative hazard function increases
without bound, which implies that h(x) must not decrease too quickly; otherwise, H(x)
would converge.
The next section covers some of the commonly used distributions for lifetime, including
the exponential, the Weibull, and the gamma.
Exponential Distribution
and cdf
\[
F(x) = P(X \le x) = \int_0^x \lambda \exp(-\lambda t)\,dt = 1 - \exp(-\lambda x), \quad x > 0. \tag{2.4}
\]
The mean and variance of the exponential distribution are 1/λ and 1/λ², respectively. The
most distinctive feature of the exponential distribution is that it is the only continuous
distribution with the memoryless property.
In other words, if the distribution has the memoryless property, then for instance, the
probability that an old unit survives one more day will equal the probability that a
brand new unit will survive one day. The memoryless property imposes some strong
assumptions about the way units age.
Another unique feature of the exponential distribution is that it is the only contin-
uous distribution with a constant hazard function.
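Both properties are easy to check numerically. The following sketch assumes an arbitrary rate `rate = 0.3` (a hypothetical value) and uses the exponential pdf λ exp(−λx) and survival function exp(−λx); it compares the conditional survival of an aged unit with that of a new one, and evaluates the hazard f(x)/S(x) at several ages.

```python
import math


def exp_pdf(x, rate):
    """f(x) = rate * exp(-rate * x)."""
    return rate * math.exp(-rate * x)


def exp_survival(x, rate):
    """S(x) = exp(-rate * x)."""
    return math.exp(-rate * x)


def conditional_survival(age, extra, rate):
    """P(X > age + extra | X > age) = S(age + extra) / S(age)."""
    return exp_survival(age + extra, rate) / exp_survival(age, rate)


rate = 0.3  # hypothetical failure rate
for age in (0.0, 5.0, 20.0):
    one_more_unit = conditional_survival(age, 1.0, rate)
    hazard_at_age = exp_pdf(age, rate) / exp_survival(age, rate)
    print(f"age {age:5.1f}: survive one more unit = {one_more_unit:.6f}, "
          f"hazard = {hazard_at_age:.6f}")
```

Every row should show the same survival probability and the same hazard, whatever the age.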
Weibull Distribution
We discuss here the Weibull distribution for several reasons. First, it is probably the
most widely used distribution for lifetimes. Second, if repairs bring a system back
to a good-as-new state and the times between failures X1 , X2 , . . . are independent,
then the assumption that the times between failures are iid Weibull random variables
may be reasonable because Weibull is a versatile distribution that can take on the
characteristics of other types of distributions.
If X is a random variable with this survival function, then we will write X ∼ WEI(η, α),
where η and α are the shape and scale parameters, respectively.
The cdf, pdf and hazard functions are therefore given as follows:
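\[
F(x) = 1 - S(x) = 1 - \exp\left\{-\left(\frac{x}{\alpha}\right)^{\eta}\right\}, \quad x > 0 \tag{2.6}
\]
\[
f(x) = F'(x) = \frac{\eta}{\alpha}\left(\frac{x}{\alpha}\right)^{\eta-1} \exp\left\{-\left(\frac{x}{\alpha}\right)^{\eta}\right\}, \quad x > 0 \tag{2.7}
\]
\[
h(x) = \frac{f(x)}{S(x)} = \frac{\dfrac{\eta}{\alpha}\left(\dfrac{x}{\alpha}\right)^{\eta-1} \exp\left\{-\left(\dfrac{x}{\alpha}\right)^{\eta}\right\}}{\exp\left\{-\left(\dfrac{x}{\alpha}\right)^{\eta}\right\}} = \frac{\eta}{\alpha}\left(\frac{x}{\alpha}\right)^{\eta-1}, \quad x > 0. \tag{2.8}
\]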
The hazard function h is increasing when η > 1 and decreasing when η < 1. When
η = 1, the hazard function is the constant function h(x) = 1/α. Thus, the exponential
distribution is a special case of the Weibull distribution that occurs when η = 1.
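Equations (2.6) to (2.8) translate directly into code. The sketch below is illustrative only: the function names are hypothetical, the scale is set to α = 1, and the shape values 1.8, 1 and 0.6 are simply convenient examples of the three regimes η > 1, η = 1 and η < 1.

```python
import math


def weibull_survival(x, eta, alpha):
    """S(x) = exp{-(x/alpha)^eta}."""
    return math.exp(-((x / alpha) ** eta))


def weibull_cdf(x, eta, alpha):
    """Equation (2.6): F(x) = 1 - S(x)."""
    return 1.0 - weibull_survival(x, eta, alpha)


def weibull_pdf(x, eta, alpha):
    """Equation (2.7)."""
    return (eta / alpha) * (x / alpha) ** (eta - 1.0) * weibull_survival(x, eta, alpha)


def weibull_hazard(x, eta, alpha):
    """Equation (2.8): h(x) = (eta/alpha) (x/alpha)^(eta-1)."""
    return (eta / alpha) * (x / alpha) ** (eta - 1.0)


alpha = 1.0
ages = [0.5, 1.0, 2.0, 4.0]
for eta in (1.8, 1.0, 0.6):
    hazards = [round(weibull_hazard(x, eta, alpha), 4) for x in ages]
    print(f"eta = {eta}: hazard at ages {ages} -> {hazards}")
```

The printed hazards should rise with age for η = 1.8, stay at 1/α for η = 1, and fall for η = 0.6.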
[Figure 2.1: Weibull PDF, CDF, survival and hazard functions plotted against x for η = 1.8, η = 1 and η = 0.6.]
Figure 2.1 shows Weibull probability density functions, cumulative distribution
functions, survival functions and hazard functions for several values of the shape
parameter η, with the scale parameter fixed at α = 1.
The mean and variance of the Weibull can be expressed in terms of the gamma
function which is defined below.
Definition 2.12 Gamma Function For a > 0 the gamma function is defined to be
\[
\Gamma(a) = \int_0^{\infty} x^{a-1} e^{-x}\,dx.
\]
The next theorem gives the mean and variance of the Weibull distribution in terms of
the gamma function.
and
\[
V(X) = \alpha^2 \left[ \Gamma\left(1 + \frac{2}{\eta}\right) - \Gamma\left(1 + \frac{1}{\eta}\right)^2 \right] \tag{2.10}
\]
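A short sketch of how these moments can be evaluated with the gamma function from Python's standard library. The variance follows (2.10); the mean uses the standard Weibull result E(X) = αΓ(1 + 1/η), which is assumed here to be the companion of (2.10). The function names and the parameter values are illustrative.

```python
import math


def weibull_mean(eta, alpha):
    """Standard Weibull mean: alpha * Gamma(1 + 1/eta)."""
    return alpha * math.gamma(1.0 + 1.0 / eta)


def weibull_variance(eta, alpha):
    """Equation (2.10): alpha^2 [Gamma(1 + 2/eta) - Gamma(1 + 1/eta)^2]."""
    return alpha ** 2 * (math.gamma(1.0 + 2.0 / eta)
                         - math.gamma(1.0 + 1.0 / eta) ** 2)


# With eta = 1 the Weibull reduces to the exponential with rate 1/alpha,
# so the mean and variance should be alpha and alpha^2.
print(weibull_mean(1.0, 2.0), weibull_variance(1.0, 2.0))

# Illustrative shape values with alpha = 1.
for eta in (1.8, 1.0, 0.6):
    print(eta, weibull_mean(eta, 1.0), weibull_variance(eta, 1.0))
```

The η = 1 case doubles as a sanity check against the exponential mean 1/λ and variance 1/λ² with λ = 1/α.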
Gamma Distribution
The gamma distribution is another useful model for the lifetime of systems.
Definition 2.13 The pdf for the gamma distribution can be written as
\[
f(x) = \frac{x^{\eta-1}}{\theta^{\eta}\,\Gamma(\eta)} \exp(-x/\theta), \quad x > 0.
\]
We will write X ∼ GAM(η, θ) if the random variable X has this pdf, where η and θ
are the shape and scale parameters, respectively. Another useful form for the gamma
pdf is obtained by substituting 1/λ for θ; this gives
\[
f(x) = \frac{\lambda^{\eta} x^{\eta-1}}{\Gamma(\eta)} \exp(-\lambda x), \quad x > 0. \tag{2.11}
\]
The cdf and the survival function, and hence also the hazard function, cannot in
general be written in closed form. We can write the cdf as
\[
F(x) = \int_0^x \frac{\lambda^{\eta} \omega^{\eta-1}}{\Gamma(\eta)} \exp(-\lambda\omega)\,d\omega.
\]
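The hazard function is then
\[
h(x) = \frac{f(x)}{1 - F(x)} = \frac{\dfrac{\lambda^{\eta} x^{\eta-1}}{\Gamma(\eta)} \exp(-\lambda x)}{1 - \dfrac{1}{\Gamma(\eta)} \displaystyle\int_0^{\lambda x} y^{\eta-1} e^{-y}\,dy}.
\]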
This hazard function is increasing when η > 1, decreasing when η < 1, and constant
when η = 1, in which case the pdf is that of the exponential distribution.
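Because the gamma hazard has no closed form in general, it has to be evaluated numerically. The sketch below is one possible way to do so using only the standard library: the normalised integral (1/Γ(η)) ∫₀^{λx} y^{η−1} e^{−y} dy is computed from its power series x^a e^{−x} Σₙ xⁿ / Γ(a + n + 1). The function names, the number of series terms and the parameter values are assumptions for illustration; a library routine for the incomplete gamma function could be substituted.

```python
import math


def regularised_lower_gamma(a, x, terms=200):
    """(1/Gamma(a)) * integral_0^x y^(a-1) e^(-y) dy, via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / math.gamma(a + 1.0)  # n = 0 term: x^0 / Gamma(a + 1)
    total = 0.0
    for n in range(terms):
        total += term
        term *= x / (a + n + 1.0)     # uses Gamma(a+n+2) = (a+n+1) Gamma(a+n+1)
    return (x ** a) * math.exp(-x) * total


def gamma_pdf(x, eta, lam):
    """Equation (2.11)."""
    return lam ** eta * x ** (eta - 1.0) * math.exp(-lam * x) / math.gamma(eta)


def gamma_hazard(x, eta, lam):
    """h(x) = f(x) / (1 - F(x)), with F(x) the regularised integral up to lam*x."""
    return gamma_pdf(x, eta, lam) / (1.0 - regularised_lower_gamma(eta, lam * x))


ages = [0.5, 1.0, 2.0, 4.0]
for eta in (1.8, 1.0, 0.6):
    hazards = [round(gamma_hazard(x, eta, 1.0), 4) for x in ages]
    print(f"eta = {eta}: hazard at ages {ages} -> {hazards}")
```

The series is adequate for the moderate arguments used here; for very large λx a more careful method would be needed because 1 − F(x) becomes tiny.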
[Figure 2.2: gamma PDF, CDF, survival and hazard functions plotted against x for η = 1.8, η = 1 and η = 0.6.]
Figure 2.2 shows gamma probability density functions, cumulative distribution
functions, survival functions and hazard functions for several values of the shape
parameter η, with the scale parameter fixed at θ = 1.
Definition 2.14 Maintenance Management depicts all activities of the management
that determine the maintenance objectives, strategies and responsibilities, and
implementation of them by such means as maintenance planning, maintenance control,
and the improvement of maintenance activities and economics.
According to the definition above, the major steps to maintenance modelling can
be summarised as:
Availability and usability play a crucial part in a system’s performance because
breakdowns and holdups can seriously impede it. At the same time,
idle systems worsen the ratio of fixed cost to output. The reduced
output induced by system breakdowns results in less production as well as lower
profitability, which can be regarded as an inefficiency of the system. Moreover, complex
systems usually require a significant startup time after an interruption occurs. During
this period, goods that do not meet acceptable levels, e.g., scrap or goods
of minor quality, may be produced; as a result, the expected profit cannot be obtained,
since these products cannot be sold or have to be sold at reduced prices. Thus, efficient
operation of a system requires well-scheduled maintenance to avoid interruptions as
much as possible and to recover from breakdowns quickly.
For a manufacturing system, wear-out, ageing and deterioration have a negative
impact on the function of the system, with the consequence that the system
cannot fulfil its capability. Maintenance is introduced to counteract those negative
effects from an economic point of view. Maintenance actions therefore play an essential
role in sustaining and possibly improving a system’s availability, which in turn
improves the productivity of the system considered. In general, maintenance policies
and strategies are commonly categorised into three domains: Corrective Maintenance
(CM), Preventive Maintenance (PM) and Condition Based Maintenance (CBM).
Corrective maintenance is initiated when the system breaks down, which stops
the system from working and induces considerable cost. Corrective maintenance
usually takes the form of repair, restoration or replacement of failed components. This
maintenance policy is often applied to systems whose failure is not costly and does not
result in disastrous situations, and to components with a constant failure rate, e.g., when
the failure time of a component is assumed to follow an exponential distribution.
Preventive maintenance is implemented for the purpose of minimising the negative
impact of an unexpected breakdown. Generally speaking, preventive maintenance
involves less resource consumption than corrective maintenance,
and it can be built into the production plans of the system of interest. PM includes
all partial or complete overhauls, such as filter cleaning, oil charging, etc., carried
out to prevent a costly critical failure before it actually occurs. Preventive
maintenance makes sense when the failure rate
of a unit or component is increasing in time. Unlike CM, which is unplanned, preventive
maintenance can usually be properly planned and prepared. Although preventive
maintenance is incorporated in system design to prevent critical failures,
failures may still occur. As a result, it is usually suggested to combine corrective
and preventive maintenance tasks.
However, when the operation schedules and environmental variables change in practice,
excessive or unnecessary use of preventive maintenance can occur. To ensure
that preventive maintenance is carried out only when it is required, condition based
maintenance was introduced, which incorporates inspections of the system of interest at
pre-determined intervals to determine the system’s operating condition. Depending on
the outcome of an inspection, relevant maintenance tasks can be implemented. It is
worth noting that CBM is sometimes analysed as part of PM (Manzini et al.,
2010).
The preventive maintenance policies include time based PM (Roux et al., 2008), in
which PM is conducted every t units of time, and age based PM (Chen et al., 2006),
where PM is carried out every t units of operating time. There are other variants
of preventive maintenance, e.g., for non-repairable systems, group block replacement,
where a unit or component is replaced when it fails whereas the other working
components are replaced on a pre-determined schedule (Roux et al., 2008).
Condition based maintenance has received less attention probably because it is rel-
atively new compared to CM and PM. However, thanks to the fact that the inspection
is less costly, one is encouraged to implement CBM (Xiang et al., 2012). If a system
is designed to serve for a long period, an inspection monitor can be installed if it is
relatively cheap. Van Horenbeek and Pintelon (2013) proposed a prognostic maintenance
policy that combines CBM with predictions of component states to check whether
a threshold is expected to be reached before the next scheduled inspection. If it
is, the component is replaced immediately. Although it can be seen that there is an
increasing application of CBM in practice (Wang et al., 2008), it is less studied in the
literature.
In reviewing the literature, we find that limited effort has been made to compare
different maintenance policies. Xiang et al. (2012) investigated a repairable system
under preventive maintenance and condition based maintenance policies and found via
simulation that condition-based maintenance is superior to the scheduled maintenance
paradigm. Van Horenbeek and Pintelon (2013) studied the effects of five different
maintenance policies (i.e., CM, block PM, age based PM, CBM with inspection and CBM
under continuous monitoring) on one machine.
Preventive maintenance comprises all maintenance activities which are not triggered
by a system failure. Not only the mode of maintenance task (preventive maintenance
or corrective maintenance) and its associated maintenance interval impact the failure
rate, but also its level of quality (effectiveness of maintenance task). The state after a
maintenance action is performed on a component is assumed to be: perfect, imperfect,
minimal, worse or worst (Pham and Wang, 1996). See Table 2.1.
Maintenance Policy | System State | Failure Rate
Perfect maintenance | The system state is restored to be “as good as new” | Decreasing of the failure rate
Imperfect maintenance | A maintenance action that restores the system to a state somewhere between “as good as new” and “as bad as old” | Decreasing of the failure rate
Minimal maintenance | The system state is “as bad as old” | No effect on the failure rate
Worse maintenance | System is in operating state worse than just prior to the maintenance action | Increasing of the failure rate
Worst maintenance | System breaks down right after maintenance action | Increasing of the failure rate
mechanisms will remain unaffected. Hence, Lin et al. (2001) introduced the concept
of two categories of failure mechanisms, maintainable failure mechanisms and non-
maintainable failure mechanisms. Preventive maintenance will affect maintainable fail-
ure mechanisms exclusively, whereas non-maintainable mechanisms remain unaltered.
Zequeira and Berenguer (2005) stated that the maintainable and non-maintainable
failure rates are dependent and restored the system of interest to a condition between
as good as new and as bad as old via their proposed preventive maintenance actions.
In the literature, three approaches for modelling the impact of preventive maintenance
on the failure rate have been studied extensively: a failure rate model by Lie
and Chun (1986) and Nakagawa (1986, 1988), an age reduction model by
Canfield (1986) and Malik (1979), and a hybrid model by Lin et al. (2001).
In maintenance modelling, most researchers model a system as a whole unit, with-
out considering the effect of deterioration and failure on the subsystems. On the other
hand, machines were modelled as subsystems by some researchers. Van Horenbeek
and Pintelon (2013) modelled a simplified version only considering one subsystem in
a few machines and analysed the structural and stochastic dependencies. Roux et al.
(2008) evaluated the impact of three maintenance policies under an assumption that a
system has only two independent components. A common assumption in maintenance
modelling is that all the units or components of a system are identical and
independent. There are other less rigorous assumptions, some of which we will adopt
in our research:
• Costs of all relevant maintenance tasks are assumed to be known as constant and
the cost of CM is always more than that of PM.
conduct maintenance.
hazard adjustment on systems.
There are also studies by Dieulle et al. (2003), who calculate the long-run expected
cost per unit of time by considering whether the system’s state is above or below a
threshold and assuming that deterioration follows a gamma process, and by Schutz et al.
(2011), who investigate periodic and sequential preventive maintenance policies over a
finite planning horizon. However, due to the highly mathematical formalisation of their
models, they are not straightforward to apply in practice.
beginning and then cooled gradually under a fully controlled environment to obtain
desirable shapes or properties. This method has been used to solve a wide variety of
problems, such as the ones with continuous, discrete and mixed-integer variables (Rao,
2009).
Dynamic programming (DP), see Bellman (1954), is a method of solving a complex
problem by breaking it down into a series of simpler subproblems (different parts
of the original problem) and then combining the solutions to the subproblems to
obtain an overall solution to the original problem. This formulation has two
advantages. First, dynamic programming enables the optimal solution to be computed
exactly, although usually only for smaller problems: due to the curse of
dimensionality, the optimal solution to larger problems cannot be computed in
a reasonable amount of time (Powell, 2007). Second, dynamic programming
can produce theoretical optimality results which indicate the behaviour of the
optimal policy in the proposed models, for example, see Ding et al. (2002).
Under general assumptions about the state, action and parameter spaces, Rieder (1975)
considers a non-stationary Bayesian dynamic decision model which can be reduced to
a non-Markovian decision model with known transition probabilities. As a pioneering
work in Bayesian dynamic programming, it provides criteria of optimality
and the existence of Bayes policies. Nicolato and Runggaldier (1999) combine burn-in,
which is used to cope with the problem of “infant mortality” in system running
periods, with identical multi-component systems, and propose a Bayesian dynamic
programming method to make decisions on the optimal maintenance interval and best
burn-in time.
In other words, dynamic programming is an optimisation approach that transforms
a complex problem into a sequence of simpler problems; its essential characteristic is the
multistage nature of the optimisation procedure, which provides a general framework
employed to solve particular aspects of a more general formulation. In the decision
tree problem, we often call it “Roll-Back”, see Ross (1995). As in the problem we
will discuss in the next chapter, we divide the system’s running procedure into a few
stages and based on the information we learn from running the system, we model the
functional form of the system utility and then attempt to maximise it.
The problem addressed in the next chapter is associated with a two-phase sequential
problem. We consider the running of the system from a global perspective, which means
we take the whole running procedure of the system into account and make decisions based
on the optimal utility of the system. In contrast, myopic decision making means that
we optimise utility locally, based on the information obtained so far.
Although it does not in general yield the optimal utility for the system, the myopic
approach simplifies the optimisation problem and is easier to implement in practice; we
can simply regard it as a “roll-forward” method.
average cost per unit time and maximising systems availability. There are mainly
three ways to realise this aim. Firstly, several objectives can be combined into one
objective function; however, this method requires the different objectives to be
converted into a common unit. Secondly, weights can be assigned to the different
objectives based on the decision maker’s preference; although no conversion is required
in this method, the decision maker has to trade off among objectives. Thirdly, several
objectives can be solved simultaneously using multi-objective optimisation algorithms,
e.g., Oyarbide-Zubillaga et al. (2008) implement a Non-dominated Sorting Genetic
Algorithm to minimise costs as well as maximise system production profits.
The objective proposed in this thesis is utility-based maintenance. From the
maintenance engineer’s point of view, incorporating risk preferences into maintenance
modelling allows the management of systems to avoid short-term problems, such as
short-term cash flow problems, while pursuing an optimal long-term policy, such as
systems availability. For example, two policies may be equally preferred on the basis of
minimising average cost per unit time, yet one of them may well have more variable
cash flows than the other, which would result in instability.
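The point about variable cash flows can be illustrated with a minimal sketch. The two cost samples and the exponential utility u(c) = -exp(gamma * c), with risk-aversion coefficient gamma, are hypothetical choices made only for this illustration; they are not the utility function adopted later in the thesis.

```python
import math
from statistics import mean, pvariance

# Hypothetical per-period maintenance costs for two policies (illustrative numbers only).
policy_a = [10.0, 10.0, 10.0, 10.0]   # steady cash flow
policy_b = [2.0, 18.0, 4.0, 16.0]     # same average cost, more variable

def expected_utility(costs, gamma=0.1):
    """Average of the exponential utility u(c) = -exp(gamma * c) over the costs.

    gamma > 0 encodes risk aversion; this functional form is an assumption.
    """
    return mean(-math.exp(gamma * c) for c in costs)

print(mean(policy_a), mean(policy_b))            # identical average costs
print(pvariance(policy_a), pvariance(policy_b))  # policy B is more variable
print(expected_utility(policy_a) > expected_utility(policy_b))  # risk-averse agent prefers A
```

Under an average-cost criterion the two policies tie, whereas any concave utility of this kind separates them in favour of the steadier cash flow.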
Chapter 3
Bayesian modelling actually predates the frequentist methodology. It dates back to a
thought experiment conducted by Bayes (1763), who threw balls
onto a square table, and to the ‘inverse probability’ of Laplace (1774), a term later
replaced by ‘Bayesian’ in the 1950s by Fisher (Fienberg, 2006). Both Bayes and Laplace
took the uniform prior distribution: it was an outcome of the ball throwing experiment
in Bayes’ case, while it was considered an intuitive axiom, the ‘principle of
insufficient reason’, by Laplace. Yet Bayesian modelling and inference did not draw
much attention until it reappeared and was recognised in its modern form thanks to
Jeffreys (1939) and Savage (1954), among other researchers.
Briefly, the Bayesian concept combines objective probability, which has an
interpretation similar to that of probabilities in frequentism, with subjective
probability, which expresses a degree of belief from a personalistic point of view.
Treating the parameters of probability models as realisations of a random
variable is what differentiates Bayesian modelling from frequentist methodology, and
it enables one to make direct probability statements about these parameters.
Let us consider a sequence of observations x = {x1 , . . . , xn } and a corresponding
probability model that is believed to be the generating model for the data, with a
probability density function fX (x | ψ) in which the dependence on the parameter(s) ψ
is made explicit. It is worthwhile noting that ψ may in general be a vector of
parameters. In Bayesian modelling ψ is considered as the unknown realisation of a random
variable; as a result, we can consider the joint density function of the random
variables X and Ψ:
\[
f_{\Psi \mid X}(\psi \mid x) = \frac{f_X(x; \psi)\, f_\Psi(\psi)}{\int_\Omega f_X(x; \psi)\, f_\Psi(\psi)\, d\psi} \propto f_X(x; \psi)\, f_\Psi(\psi) \tag{3.2}
\]
(3.2) is referred to as the posterior distribution of parameter(s) ψ and encapsulates all
knowledge and information concerning the unknown parameters.
However, in practice the integral in the denominator of (3.2), which is the
normalising constant, often does not have an analytically tractable form. This
hindered the learning and application of Bayesian modelling for quite a long period
of time. With the rapid development of computing techniques such as Markov chain
Monte Carlo, Bayesian modelling has become easier to implement since
the 1990s.
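When ψ is one-dimensional, the normalising constant in (3.2) can also be approximated numerically on a grid. The following is a minimal sketch under stated assumptions: an exponential failure-time likelihood with rate psi, a flat prior on a bounded interval and made-up observations. The function name `grid_posterior` and all numbers are hypothetical.

```python
import math

def grid_posterior(data, grid, prior):
    """Approximate the posterior density on an equally spaced grid of psi values.

    Unnormalised posterior = likelihood * prior; the integral in the denominator
    of Bayes' theorem is replaced by a Riemann sum over the grid.
    """
    step = grid[1] - grid[0]
    n, total = len(data), sum(data)
    # Exponential likelihood (an assumption): psi^n * exp(-psi * sum(x)).
    unnorm = [psi ** n * math.exp(-psi * total) * prior(psi) for psi in grid]
    norm_const = sum(unnorm) * step
    return [u / norm_const for u in unnorm]

failure_times = [1.2, 0.7, 2.5, 1.9]            # hypothetical observations
grid = [0.01 + 0.01 * k for k in range(500)]    # psi from 0.01 to 5.00
posterior = grid_posterior(failure_times, grid, prior=lambda psi: 1.0)

step = grid[1] - grid[0]
print(sum(p * step for p in posterior))         # close to 1 by construction
post_mean = sum(psi * p * step for psi, p in zip(grid, posterior))
print(post_mean)                                # grid approximation of the posterior mean
```

With a flat prior the exact posterior here is a Gamma distribution with shape 5 and rate 6.3, so the grid mean can be checked against 5/6.3. The grid spacing governs the accuracy of the approximation.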
called the ‘likelihood principle’.
The prior distribution is the term fΨ (ψ) in the numerator of (3.2). Because the
parameters are treated as random variables in Bayesian modelling, a Bayesian is required to
specify corresponding distributions for the random variables which are independent of
the data x in (3.1). The prior distribution represents the information that a
decision maker has prior to observing any data and is usually regarded
as the most controversial part of Bayesian methodology.
The controversy arises because the choice of a prior distribution
is always a subjective decision. However, it is argued that the use of a
prior distribution adds flexibility to modelling. For example, we can choose a ‘sceptical
prior’ which assigns a lower probability to some favourable outcome;
if the posterior probability still weighs in favour of that outcome, the evidence becomes
even more compelling. Moreover, the chosen model from which the data are believed to be
generated, fX (·; ψ), is also a somewhat subjective choice.
In Bayesian modelling and inference, the prior is often chosen from some parametric
family of distributions and the parameters of the prior distribution are referred to as
hyper-parameters. There are a few common ways to choose a prior distribution for
the model specification. In general, these methods include the following: subjective
priors, objective priors, empirical priors, priors from experts and conjugate priors.
In subjective probability, a particular individual chooses or specifies a prior to
capture her or his belief about the problem under examination as well as possible. It is
worthwhile mentioning that even a very vague prior can be useful, because during
inference the belief concerning ψ is updated mathematically and rationally as
observations accumulate.
On the contrary, objective priors are chosen as a convenience to capture ‘ignorance’
about ψ a priori. The hyper-parameters of these priors are often set to give very high
variance, and the priors have good frequentist properties. Alternatively, as can be seen in
(3.2), the prior appears in both numerator and denominator, so multiplying the prior
by any constant makes no difference; as a result, some practitioners, controversially,
choose priors which do not integrate to 1. It has to be noted that one can
then no longer guarantee that the posterior is a legitimate probability distribution.
For example, if one chooses a flat prior $f_\Psi(\psi) = c \ \forall\, \psi$ ($c > 0$), then it will
lead to solutions corresponding to Fisher’s fiducial probabilities, which is improper in
our Bayesian setting. The problems with improper priors are that there is a danger of
over-interpreting them, since they are not probability densities, and that they do not
necessarily ensure a proper posterior. To address the issue that Bayesian inference is
sensitive to the parameterisation of the model, Jeffreys (1946) developed the Jeffreys
prior, which is specified as $f_\Psi(\psi) \propto J(\psi)^{1/2}$, where $J(\psi)$ is the Fisher
information with respect to ψ. The Jeffreys prior makes the posterior invariant to
re-parameterisation of the model, yet it can still be improper and then suffers from
the same problems.
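As a worked illustration, assume (purely for this example) an exponential failure-time model $f_X(x \mid \psi) = \psi e^{-\psi x}$ with rate $\psi > 0$. The Jeffreys prior then follows from the Fisher information:

\[
\log f_X(x \mid \psi) = \log \psi - \psi x, \qquad
J(\psi) = -\mathbb{E}\left[ \frac{\partial^2}{\partial \psi^2} \log f_X(X \mid \psi) \right] = \frac{1}{\psi^2}, \qquad
f_\Psi(\psi) \propto J(\psi)^{1/2} = \frac{1}{\psi}.
\]

This prior is improper, since $\int_0^\infty \psi^{-1}\, d\psi$ diverges. It is nevertheless invariant: under the re-parameterisation $\theta = 1/\psi$ (the mean lifetime) the same rule gives $f_\Theta(\theta) \propto 1/\theta$, which agrees with the change of variables $f_\Psi(1/\theta)\, |d\psi / d\theta| \propto \theta \cdot \theta^{-2}$.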
Empirical priors lie in the field of so called ‘Empirical Bayes’. The basic idea is to
learn some of the parameters of the prior from the data. Let us consider a hierarchical
Bayesian model (to be introduced as follows) with parameter Ψ and hyper-parameter
λ:
\[
f_X(x \mid \lambda) = \int f_X(x \mid \psi)\, f_\Psi(\psi \mid \lambda)\, d\psi \tag{3.3}
\]
This method has the advantage of being robust because it overcomes some limitations
of mis-specification of the prior. However, it double counts the data, which
violates the likelihood principle governing the relationship between data and hypothesis.
Specifying priors from experts is a research domain in itself, called
‘prior elicitation’. In its simplest form, this might be conducted by pooling the experts’
opinions about the parameters of the model into a credible range and applying a normal
prior with fixed γ% points for upper and lower bounds. This method requires the
parameters of the model to be well explained to the experts; otherwise, it raises
further questions if the experts have no understanding of probability theory. Garthwaite
et al. (2005) and O’Hagan et al. (2006) are relatively recent reviews of techniques to
address this issue.
If a prior distribution is multiplied by the likelihood, resulting in an expression that
is algebraically from the same family as the prior distribution, up to a normalising
constant, we call this conjugacy and the prior is called a conjugate prior distribution.
Its advantage is that the normalising constant can be written down by inspection,
without evaluating the integral in the denominator of (3.2). In particular, if the likelihood
is from the exponential family and written as:
\[
f_X(x \mid \psi) = f(x)\, g(\psi) \exp\left\{ \Phi(\psi)^T s(x) \right\}
\]
where Φ is a vector of natural parameters, s(x) is a sufficient statistic, and f, g are
positive functions of x and ψ, respectively. If the conjugate prior is taken from the
exponential family as:
\[
f_\Psi(\psi) = h(\eta, \nu)\, g(\psi)^{\eta} \exp\left\{ \Phi(\psi)^T \nu \right\}
\]
where η and ν are hyper-parameters and h is the normalising function, then the
posterior distribution for n independent data points from this exponential family is also
conjugate, with hyper-parameters $\eta + n$ and $\nu + \sum_i s(x_i)$, and has the
computationally convenient form as follows.
\[
f_\Psi(\psi \mid x_1, \ldots, x_n) = h\left( \eta + n,\ \nu + \sum_i s(x_i) \right) g(\psi)^{\eta + n} \exp\left\{ \Phi(\psi)^T \left( \nu + \sum_i s(x_i) \right) \right\}
\]
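A familiar special case may make the update rule concrete. The sketch below assumes exponential failure times with rate psi and a Gamma(shape a, rate b) prior, which corresponds to g(ψ) = ψ, Φ(ψ) = −ψ, s(x) = x, η = a − 1 and ν = b in the notation above. The function name and the numbers are hypothetical.

```python
def gamma_exponential_update(a, b, failure_times):
    """Conjugate update for exponential failure times with rate psi.

    Prior: psi ~ Gamma(shape=a, rate=b). Each observation adds 1 to the shape
    and its value to the rate, mirroring eta + n and nu + sum of s(x_i).
    """
    n = len(failure_times)
    return a + n, b + sum(failure_times)

# Hypothetical prior Gamma(2, 1) and four made-up failure times.
a_post, b_post = gamma_exponential_update(2.0, 1.0, [1.2, 0.7, 2.5, 1.9])
print(a_post, b_post)      # posterior shape and rate
print(a_post / b_post)     # posterior mean of psi
```

No integration is needed: the posterior is identified by inspection as another Gamma distribution, which is the practical appeal of conjugacy.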
In the Bayesian modelling and inference setting, one can fully conduct an analysis based
on the posterior distribution to deal with all the questions of interest. Usually, this
can include probability density function plots, summary statistics such as expectation
and variance, the modal value of the posterior, or the intervals of the highest posterior
density. When the interest lies in some functional form of the parameters of models,
it often can also be dealt with comparatively straightforwardly.
It is worthwhile noting that models can become increasingly complex in a frequentist
framework, such as when some elements of ψ are nuisance parameters. In a Bayesian
framework, this issue can be dealt with elegantly via standard probability theory.
For instance, let ψ = (δ, λ), in which δ is the parameter of interest whilst λ is a
nuisance parameter required to construct the complete model. We can obtain the
posterior distribution of δ by simply integrating out the nuisance parameter:
\[
f_\Delta(\delta \mid x) = \int_{\Omega_\lambda} f_{\Psi \mid X}(\delta, \lambda \mid x)\, d\lambda
\]
Once the posterior distribution for a parameter of a model is obtained, it can be used
further to explore other probabilities. For example, a posterior predictive distribution
can be derived to provide the predictive probability distribution by taking into account
the uncertainty in ψ for a future data point, x∗ , which is from the same data generating
process incorporating all the information:
\[
f_{X^* \mid X}(x^* \mid x) = \int_\Omega f_X(x^* \mid \psi)\, f_{\Psi \mid X}(\psi \mid x)\, d\psi \tag{3.5}
\]
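As a worked instance of (3.5), suppose (as an assumption for illustration only) that a future failure time is exponential with rate ψ and that the posterior of ψ is a Gamma distribution with shape $a$ and rate $b$. The integral then has a closed form:

\[
f_{X^* \mid X}(x^* \mid x) = \int_0^\infty \psi e^{-\psi x^*}\, \frac{b^a}{\Gamma(a)}\, \psi^{a-1} e^{-b\psi}\, d\psi
= \frac{b^a}{\Gamma(a)} \cdot \frac{\Gamma(a+1)}{(b + x^*)^{a+1}}
= \frac{a\, b^a}{(b + x^*)^{a+1}}, \qquad x^* > 0.
\]

The predictive density has a heavier tail than any single exponential density, which reflects the remaining uncertainty about ψ.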
When it comes to modelling more complex real problems in practice, the simple
specification of the likelihood and prior functions shows its limitations. Hierarchical
modelling arises to address this issue in a natural way under a Bayesian
framework.
In the previous illustration, we have stated that the observations are believed to
be from the data generating process fX (· | ψ) with parameter ψ where we can treat
ψ as a realisation of a random variable Ψ. The core of Bayesian modelling, the
posterior, presents the probability distribution for a particular realisation ψ expressing
the characteristic of the process. Let us consider multiple realisations of the parameters,
denoted as ψi , which will result in the data in different groups, e.g., xi1 , · · · , xini are
the data in group i from the parameter ψi . We can place a hyper-prior on the prior
distribution fΨ (·) under the Bayesian idea. Therefore, we take the estimation of ψi as
part of our Bayesian modelling process via hierarchical conditional probabilities, as
in the following example.
Assume we have the data x = {x11 , . . . , xmnm } where the first index indicates which
group the data are from and the second one indexes the observations within that
group. We define the following to specify the hierarchical model:
\[ f_{X \mid \Psi}(\cdot \mid \psi_i) \tag{3.7} \]
\[ f_{\Psi \mid \Lambda}(\cdot \mid \lambda) \tag{3.8} \]
\[ f_{\Lambda}(\cdot) \tag{3.9} \]
where (3.7) is the probability model of the data generating process for group i. Note
that for different hierarchies, the parametric probability functions are from the same
family, and only differ in the terms of different parameters for each group (as mentioned
previously, each ψi can be in a vector form). (3.8) is assigned as the prior distribution
for the parameters Ψ, within which the hyper-parameter λ is made explicit. Note
that the hyper-parameter λ is not specified either, but given a hyper-prior probability
distribution (3.9). This complete model includes parameters ξ = (ψ1 , . . . , ψm , λ), so
that the posterior is expressed as follows:
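\[
\begin{aligned}
f_{\Xi \mid X}(\xi \mid x) &\propto \underbrace{f_{X \mid \Xi}(x \mid \xi)}_{\text{first term}}\ \underbrace{f_{\Xi}(\xi)}_{\text{second term}} \\
&= \underbrace{f_{X}(x \mid \psi)}_{\text{first term}}\ \underbrace{f_{\Psi}(\psi \mid \lambda)\, f_{\Lambda}(\lambda)}_{\text{second term}}
\end{aligned}
\]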
The simplification of the first term to the usual likelihood is thanks to the conditional
independence of x and λ given ψ, and the joint prior in the second term decomposes
naturally because of the model formulation.
Then the interest will usually be in the posterior predictive probability distribution
of ψ, since this incorporates all the information learned about that parameter from
observing the data points:
\[
f_{\Psi^* \mid X}(\psi^* \mid x) = \int f_\Psi(\psi^* \mid \lambda)\, f_{\Lambda \mid X}(\lambda \mid x)\, d\lambda \tag{3.10}
\]
The above is simply one possible hierarchical model; models can be much
more complicated in practice. For modelling the dependencies in such sophisticated
models, one can adopt directed acyclic graphs (DAGs), which
make the identification of conditional dependencies and independencies clearer.
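The three-level structure of (3.7) to (3.9) can be read as a recipe for generating data from the top of the hierarchy downwards. The sketch below shows this under assumed distributions: a Gamma hyper-prior and exponential distributions at the other two levels. These choices, the function name and the group sizes are all hypothetical.

```python
import random

def simulate_hierarchy(group_sizes, seed=0):
    """Draw data from a three-level model, top down.

    lambda ~ Gamma(shape 2, scale 1)   hyper-prior, cf. (3.9)
    psi_i | lambda ~ Exp(rate lambda)  prior for each group, cf. (3.8)
    x_ij | psi_i ~ Exp(rate psi_i)     data model for group i, cf. (3.7)
    All three distributional choices are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    lam = rng.gammavariate(2.0, 1.0)
    psis = [rng.expovariate(lam) for _ in group_sizes]
    data = [[rng.expovariate(psi) for _ in range(n)]
            for psi, n in zip(psis, group_sizes)]
    return lam, psis, data

lam, psis, data = simulate_hierarchy([3, 5, 2])
print(len(psis), [len(group) for group in data])   # one psi per group, n_i observations each
```

Given ψi, the observations in group i do not depend on λ, which is the conditional independence used to simplify the likelihood term of the posterior.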
3.1.5 Bayesian method in maintenance
Under the non-Bayesian framework, when there is little learning from the system
operation or scarce evidence to judge the characteristics of a system or its components, it
would be questionable to carry out maintenance policies, such as corrective
maintenance (CM) and preventive maintenance, because it is difficult to judge whether the
system or its components are critical or not on a priori grounds. At the same time, applying
maintenance policies may cause a significant drift of the system reliability, and a new
maintenance policy carried out without considering Bayesian learning can
fail to capture the dynamics of the system.
Due to the uncertainty inherent in most systems, such as an unknown lifetime
distribution or a known distribution with uncertain parameters, it is necessary to
model these uncertainties to conduct reasonable maintenance policies. There have been
increasing applications of Bayesian methods in maintenance modelling, which may be
categorised as follows:
• Bayesian inference
the Bayesian method to the unknown parameters for a Weibull failure time
distribution of a sequential preventive maintenance model, which is defined in the
context of a cycle. Within a cycle, minimal repair is conducted after a failure
and the effective performing time of the system and its hazard rate are both
adjusted to model the deterioration of the system. At the end of a cycle, a
full replacement of the system is carried out. However, their methods did not
consider the statistical learning connection between cycles and thus ignored the
future possibility of system performance.
• Bayesian network
• Applications
Bayesian methods have also been applied in other domains. For example,
Durango-Cohen and Madanat (2008) used a quasi-Bayes approach to optimise the
inspection and make maintenance decisions for infrastructure facilities under
performance model uncertainty by taking a mixture of known models, of which the
mixture proportions are assumed to be random variables with probability
densities updated over time. When there are limited data and information, Zhang
and Wang (2014) used Bayesian linear methods to combine subject expert
knowledge with the available limited data to estimate the unknown parameters
of models and applied it to infrastructure assets. Their optimisation objective is
still to minimise the cost per unit time rather than to maximise the utility, which is
the optimisation objective of this thesis.
How many demonstrators should be allocated to each session of this course so that
the fewest students fail?
Table 3.1: Number of failed students for a course with 4 sessions and 6 available
demonstrators.
• stages: 1st solve section 4, 2nd solve section 3 and 4, 3rd solve section 2, 3 and
4, 4th solve section 1, 2, 3 and 4.
will fail if no demonstrator is allocated to section 4; 8 students will fail if 1 demonstrator
is allocated to section 4; 6 students will fail if 2 demonstrators are allocated to section
4; 3 students will fail if 3 demonstrators are allocated to section 4; 2 students will fail
if 4 demonstrators are allocated to section 4; 1 student will fail if 5 demonstrators are
allocated to section 4; no student will fail if 6 demonstrators are allocated to section 4.
d4 x4 = 0 x4 = 1 x4 = 2 x4 = 3 x4 = 4 x4 = 5 x4 = 6 f4 (d4 )
0 15 - - - - - - 15
1 15 8 - - - - - 8
2 15 8 6 - - - - 6
3 15 8 6 3 - - - 3
4 15 8 6 3 2 - - 2
5 15 8 6 3 2 1 - 1
6 15 8 6 3 2 1 0 0
The recursive relationship for this stage is $f_4(d_4) = \min_{x_4} \{F_4(x_4)\}$. Therefore,
$f_4(d_4)$ is the smallest value in each row.
We will now move to Stage 3: sections 3 and 4. Again, we can have 0 to 6
demonstrators available to allocate and we allocate 0 to 6 demonstrators to this section.
The recursive relationship at this stage is $f_3(d_3) = \min_{x_3} \{F_3(x_3) + f_4(d_3 - x_3)\}$. Let’s
look at the case where we have 2 available demonstrators to allocate ($d_3 = 2$) and
we choose to allocate 1 demonstrator to section 3 ($x_3 = 1$). This gives the candidate
value $F_3(1) + f_4(2 - 1) = F_3(1) + f_4(1)$. From Table 3.1, we get $F_3(1) = 16$ since
16 students will fail if 1 demonstrator is allocated to section 3, so the value is $16 + f_4(1)$.
From Table 3.2, $f_4(1) = 8$, i.e., when $d_4 = 1$, the fewest number of students that will
fail is 8, so the value is $16 + 8 = 24$, and we can write this into Table 3.3 where $d_3 = 2$
and $x_3 = 1$. The rest of Table 3.3 is filled in the same way, and $f_3(d_3)$ is then found
for each state by selecting the minimum value in each row.
d3 x3 = 0 x3 = 1 x3 = 2 x3 = 3 x3 = 4 x3 = 5 x3 = 6 f3 (d3 )
0 36 - - - - - - 36
1 29 31 - - - - - 29
2 27 24 28 - - - - 24
3 24 22 21 22 - - - 21
4 23 19 19 15 19 - - 15
5 22 18 16 13 12 17 - 12
6 21 17 15 10 10 10 16 10
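The construction of one stage table can be sketched in a few lines. Table 3.1 is not reproduced here, so the values of F3 below are back-calculated from Tables 3.2 and 3.3 (only F3(1) = 16 is quoted in the text); the helper name `stage_row` is hypothetical.

```python
# F3[x3]: students failing in section 3 when x3 demonstrators are allocated to it.
# Back-calculated from Tables 3.2 and 3.3; only F3(1) = 16 is stated in the text.
F3 = [21, 16, 13, 7, 4, 2, 1]
# f4[d4] from Table 3.2: fewest failures in section 4 with d4 demonstrators available.
f4 = [15, 8, 6, 3, 2, 1, 0]

def stage_row(d, F, f_next):
    """Row d of a stage table: F(x) + f_next(d - x) for every feasible x = 0..d."""
    return [F[x] + f_next[d - x] for x in range(d + 1)]

table3 = [stage_row(d, F3, f4) for d in range(7)]
f3 = [min(row) for row in table3]   # smallest value in each row

for d, row in enumerate(table3):
    print(d, row, f3[d])
```

Each printed row matches the corresponding row of the stage 3 table, with the row minimum giving f3(d3).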
We will now move on to stage 2, covering sections 2, 3 and 4 of this course. Since stage
3 includes sections 3 and 4, we will not need Table 3.2 until we retrieve the solution.
Stage 2 is shown in Table 3.4. The recursive relationship at this stage
is $f_2(d_2) = \min_{x_2} \{F_2(x_2) + f_3(d_2 - x_2)\}$. Let’s look at the case where we have 4
demonstrators available to allocate ($d_2 = 4$) and we choose to allocate only 1 to section
2 ($x_2 = 1$). The corresponding entry is $F_2(1) + f_3(4 - 1) = F_2(1) + f_3(3)$.
We get $F_2(1) = 20$ from Table 3.1: 20 students will fail if 1 demonstrator is allocated to
section 2. We get $f_3(3) = 21$ from Table 3.3, i.e., the smallest number of students that
will fail if we have 3 demonstrators to allocate to sections 3 and 4. So the entry is
$20 + 21 = 41$. We enter this value into Table 3.4 where $d_2 = 4$ and $x_2 = 1$, and
add the rest of the values to Table 3.4 in the same way.
d2 x2 = 0 x2 = 1 x2 = 2 x2 = 3 x2 = 4 x2 = 5 x2 = 6 f2 (d2 )
0 61 - - - - - - 61
1 54 56 - - - - - 54
2 49 49 51 - - - - 49
3 46 44 44 47 - - - 44
4 40 41 39 40 43 - - 39
5 37 35 36 35 36 40 - 35
6 35 32 30 32 31 33 38 30
Now we will look at stage 1, covering sections 1, 2, 3 and 4. We will not need Table 3.3
again until we retrieve the solution. In stage 1 (Table 3.5), no demonstrators have been
allocated before this stage, so we know
that there are exactly 6 demonstrators available. The recursive relationship for this
stage is $f_1(d_1) = \min_{x_1} \{F_1(x_1) + f_2(d_1 - x_1)\}$. Let’s look at the case where we choose
to allocate 3 demonstrators to section 1; the entry is then $F_1(3) + f_2(6 - 3) = F_1(3) + f_2(3)$.
From Table 3.1, $F_1(3) = 5$, i.e., 5 students will fail if we allocate 3 demonstrators to
section 1. From Table 3.4, we get $f_2(3) = 44$, which means that if we have 3 demonstrators
to allocate to sections 2, 3 and 4, the smallest number of students that will fail is 44,
so the entry is $5 + 44 = 49$ and we put it in Table 3.5. The rest of the row is filled in
the same way, and $f_1(6)$ is the smallest value in this row.
d1 x1 = 0 x1 = 1 x1 = 2 x1 = 3 x1 = 4 x1 = 5 x1 = 6 f1 (d1 )
6 47 46 48 49 51 55 61 46
Now let’s retrieve the solution. Looking at stage 1 (Table 3.5), we know the least
number of students that will fail: when 6 demonstrators are allocated to this course,
46 students will fail.
We will trace back through the solution to obtain the allocation of demonstrators.
We start by looking at stage 1 (Table 3.5). How many demonstrators should we allocate
to section 1? The smallest number of students that will fail is 46, which occurs when
$x_1 = 1$, i.e., we should allocate 1 demonstrator to section 1. Now we
look at stage 2 (Table 3.4): $d_2$ at stage 2 is 5, since we start with 6 demonstrators
and have allocated 1 to section 1. The smallest number of students that will fail if
we have 5 demonstrators to allocate to sections 2, 3 and 4 is 35, which occurs when
either 1 or 3 demonstrators are allocated to section 2; in other words, there are two
possible allocations that give the best solution. Let’s look at stage 3 (Table
3.3): given the first partial solution, 2 demonstrators have been allocated so we have 4
demonstrators left for sections 3 and 4, and the smallest number of students that will fail
is 15, when $x_3 = 3$; given the second partial solution, 4 demonstrators have been allocated
so 2 are available, and the smallest number of students that will fail is 24, which occurs
when $x_3 = 1$. Now let’s go forward to stage 4 (Table 3.2). Both partial solutions have
allocated 5 demonstrators to sections 1, 2 and 3, leaving only 1 demonstrator for section
4, hence the smallest number of students that will fail is 8, which occurs when $x_4 = 1$.
As a result, there are two allocation schemes, A and B, that attain
the least number of failed students, 46:
Sessions 1 2 3 4
Allocation A 1 1 3 1
Allocation B 1 3 1 1
Now we have solved this problem successfully by dynamic programming and the
characteristics of dynamic programming will be illustrated via this example.
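The whole backward recursion and trace-back can be reproduced with a short sketch. Table 3.1 is not available here, so the per-section failure counts F are back-calculated from Tables 3.2 to 3.5 (the text confirms F1(3) = 5, F2(1) = 20 and F3(1) = 16); the function names are hypothetical.

```python
from functools import lru_cache

# F[i][x]: students failing in section i + 1 when x demonstrators are allocated to it.
# Values are back-calculated from Tables 3.2 to 3.5, not copied from Table 3.1.
F = [
    [17, 11, 9, 5, 2, 1, 0],     # section 1
    [25, 20, 15, 11, 7, 4, 2],   # section 2
    [21, 16, 13, 7, 4, 2, 1],    # section 3
    [15, 8, 6, 3, 2, 1, 0],      # section 4
]
N_SECTIONS = len(F)

@lru_cache(maxsize=None)
def f(i, d):
    """Fewest failures over sections i + 1 to 4 with d demonstrators still available."""
    if i == N_SECTIONS:
        return 0
    return min(F[i][x] + f(i + 1, d - x) for x in range(d + 1))

def optimal_allocations(i, d):
    """Trace back every allocation that attains f(i, d)."""
    if i == N_SECTIONS:
        return [()]
    result = []
    for x in range(d + 1):
        if F[i][x] + f(i + 1, d - x) == f(i, d):
            result += [(x,) + rest for rest in optimal_allocations(i + 1, d - x)]
    return result

print(f(0, 6))                     # 46
print(optimal_allocations(0, 6))   # [(1, 1, 3, 1), (1, 3, 1, 1)]
```

Memoising f plays the role of the stage tables: each state (i, d) is solved once and reused, and the trace-back recovers both Allocation A and Allocation B.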
3.2.2 Characteristics
The three most important characteristics of dynamic programming are
stages, states and recursion.
Stages
The essential part of the dynamic programming method is to recognise and restructure the
optimisation problem into multiple stages and to solve only one stage subproblem
at a time, sequentially. The solution of each one-stage subproblem helps to define the
characteristics of the next one-stage problem in the sequence, though each one-stage
subproblem is solved as a normal optimisation problem.
The stages usually represent the different time periods in the problem's planning
horizon, which means the number of stages is constrained by the length of the horizon
to be analysed. The problem of determining the optimum preventive maintenance time
in this thesis is stated as a dynamic programming problem. The decision variable is
the scheduled preventive maintenance time, chosen at the beginning of each phase. The
system considered operates for a few planned phases, and the objective is to maximise
the total expected utility of system performance; there are no resource (cost budget)
constraints in our assumptions. If we can only determine the optimum preventive
maintenance time for each phase at the beginning of that phase, we can restructure the
problem into a few stages based on the number of planned phases, each of which
represents the decision regarding the optimum maintenance time at the beginning of a
phase.
However, the stages of a dynamic programme do not necessarily have time implications.
In the example illustrated in §3.2.1, the problem of allocating 6 available
demonstrators to 4 sessions of the course is restructured into 4 stages: first solve
for section 4; second, for sections 3 and 4; third, for sections 2, 3 and 4; and
fourth, for sections 1, 2, 3 and 4. The decision variable xi is the number of
demonstrators allocated to session i, chosen so as to minimise the number of students
who fail the course. It is worth noting that problems without time implications are, in
practice, comparatively difficult to restructure into stages for dynamic programming.
States
The states correspond to each stage of the optimisation problem in dynamic
programming. The states carry the information needed to fully assess the consequences
that the current decision has for future situations. In the demonstrator allocation
problem in §3.2.1, each stage has only one state variable: di, the number of
demonstrators available to be allocated at stage i. In our problem of determining
the optimum preventive maintenance time, the state variable for each phase i is whether
or not the failure time Tfi is observed before the scheduled maintenance time Tmi.
The specification of the state variables in a dynamic programming problem is
critical. However, no standard rules exist for specifying them, and doing so usually
requires one to study the problem in a somewhat creative and subtle way. Based on
practical experience, it is suggested that the states of a dynamic programming problem
be selected according to the following criteria.
• The states should reflect sufficient information for one to make decisions in the
future regardless of how the problem has arrived at the current state.
In the demonstrator allocation problem in §3.2.1, the state variable, the number
of available demonstrators di at stage i, does meet this criterion because
it does not depend on how the demonstrators were allocated prior to the current
section i.
to specify the number of states as small as possible. As a result, the limited prop-
erty of dynamic programming restricts the application of dynamic programming.
Recursive Optimisation
\[ f_n(d_n, s_n, Z_n), \tag{3.11} \]
where fn(·) is the return of the process when there are n stages to go.
The next state of the process, sn−1, has (n − 1) stages to go, and we define the
transition function tn(·) as
\[ s_{n-1} = t_n(d_n, s_n, Z_n). \tag{3.12} \]
refer to books (Davis et al., 1988; Fishburn, 1970; Varian, 1992) for a detailed theory
of decision maker’s preferences, utility functions and expected utility.
Utility theory is concerned with a decision maker’s preferences or values which can
be represented in numerically useful ways under assumptions about a decision maker’s
preferences (Fishburn, 1968). A utility theory is usually based on a decision maker’s
preference-indifference relation ≼ (read “is not preferred to”), and a set X of elements
x, y, z, · · · (interpreted as decision alternatives). If x, y and z are in X, then they are
assumed to have the following properties:
π2 be the probabilities that decision 1 and decision 2, respectively, are actually made
and their corresponding costs occur; the utility function is then written as
U (c1 , c2 , π1 , π2 ) = π1 c1 + π2 c2
If the two decisions are mutually exclusive, so that only one of them can happen,
then π2 = 1 − π1 . But we will still write out these probabilities in order to keep
symmetry.
Given this notation, we can write the utility function for decision as U (c1 , c2 , π1 , π2 ).
This is the function that represents the decision maker’s preference over each decision.
There are several classes of utility functions suitable for describing various types
of decision makers’ economic behaviour. We examine some examples of well known
classes: the quadratic, logarithmic, iso-elastic and negative exponential utility func-
tions.
\[ U(x) = ax - bx^2. \]
Since its first derivative U′(x) = a − 2bx > 0 when x < a/(2b), and its second
derivative U″(x) = −2b < 0, this is a legitimate utility function on that range.
A quadratic utility function is mainly used in the context of permanent income and
life cycle hypotheses (Bergman, 2005).
\[ U(x) = \log(x). \]
This is a legitimate utility function as its first derivative U′(x) = 1/x > 0 and second
derivative U″(x) = −1/x² < 0.
Iso-Elastic Utility Functions
Definition 3.3 A class called iso-elastic utility functions has the following form
\[
U(x) =
\begin{cases}
\dfrac{x^{1-a} - 1}{1-a} & \text{for } a > 0,\ a \neq 1; \\[1ex]
\log(x) & \text{the limiting case for } a = 1.
\end{cases}
\]
These functions have the property of iso-elasticity, which means that we get the same
utility function (up to a positive affine transformation) if the cost is scaled by some
constant k. Formally,
For all k > 0,
U (kx) = f (k)U (x) + g(k),
for some function f (k) > 0 which is independent of x and some function g(k) which is
independent of x as well, see Appendix B for proof.
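As a quick check of this property (a sketch only; the formal proof is the one referred
to in Appendix B), substituting kx into Definition 3.3 suggests explicit candidates
for f(k) and g(k):

```latex
\[
U(kx) = \frac{(kx)^{1-a} - 1}{1-a}
      = k^{1-a}\,\frac{x^{1-a} - 1}{1-a} + \frac{k^{1-a} - 1}{1-a}
      = f(k)\,U(x) + g(k),
\]
\[
f(k) = k^{1-a} > 0, \qquad g(k) = \frac{k^{1-a} - 1}{1-a} = U(k), \qquad a > 0,\ a \neq 1.
\]
% Limiting case a = 1:
\[
U(kx) = \log(kx) = \log(x) + \log(k), \qquad f(k) = 1, \quad g(k) = \log(k).
\]
```

In both cases f(k) and g(k) do not depend on x, as the property requires.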
This iso-elasticity property implies that if a given percentage cost budget is optimal
for the current level of budgets, then the same percentage cost budget allocation is
optimal for all the other levels of budgets as well.
Since the first derivative U′(x) = a exp{−ax} > 0 and the second derivative
U″(x) = −a² exp{−ax} < 0, this one is also a legitimate utility function.
The class of negative exponential utility functions has an interesting property that
it is invariant under any additive cost transformation, i.e., for any constant k,
for some function f (k) > 0 which is independent of x and some function g(k) which is
independent of x as well, see Appendix B for proof.
It is natural to weight each cost induced by a decision with the corresponding proba-
bility that it will be made. This gives us a utility function of the following form
U (c1 , c2 , π1 , π2 ) = π1 c1 + π2 c2 .
This expression is actually known as the expected value, which is simply the average
level of cost that would happen.
One form that the utility function might take is the following:
\[ U(c_1, c_2, \pi_1, \pi_2) = \pi_1 V(c_1) + \pi_2 V(c_2). \tag{3.14} \]
This means that utility can be written as a weighted sum of some function of each
cost, V (c1 ) and V (c2 ), where the weights are in fact the probabilities π1 and π2 . Thus
equation 3.14 represents the expected utility, of the pattern of cost (c1 , c2 ) induced by
the relevant decisions d1 , d2 .
We refer to a utility with the form described above as an expected utility function or
a utility function that has an expected utility property. When we say that a decision
maker’s preferences can be represented by an expected utility function, or that the
decision maker’s preferences have the expected utility property, we mean that we are
able to choose a utility function that has the additive form described in equation 3.14.
And this form also turns out to be especially convenient. It has been proved that an
expected utility function has the property of uniqueness, i.e., it is unique up to an
affine transformation, which simply means that we can apply an affine transformation
to it and obtain another expected utility function that describes the same preferences
(Varian, 1992).
The expected utility function can also be subjected to some kinds of monotonic
transformation and still have the expected utility property. A function V (U ) is a
positive affine transformation if it can be written in the form: V (U ) = αU + β where
α > 0, which indicates that it not only represents the same preferences but it also still
has the expected utility property. It is straightforward to extend a utility function to
the case of a finite number of costs induced by decisions. If cost ci is associated with
probability pi , for i = 1, 2, . . . , n, then the expected utility is
\[ EU(C) = \sum_{i=1}^{n} p_i\, U(c_i). \]
This also holds for a continuous probability distribution. If f(c) is a probability
density function on cost c, then the expected utility can be written as
\[ EU(C) = \int U(c) f(c)\, dc. \]
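The two formulae can be evaluated numerically as in the following sketch. The function
names, the midpoint quadrature and the example inputs (logarithmic utility, a two-point
cost distribution and a uniform density on [1, 3]) are assumptions chosen purely for
illustration.

```python
import math


def expected_utility_discrete(utility, costs, probs):
    """Sum of p_i * U(c_i) over a finite set of costs."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    return sum(p * utility(c) for c, p in zip(costs, probs))


def expected_utility_continuous(utility, density, lower, upper, n=10_000):
    """Midpoint-rule approximation of the integral of U(c) f(c) over [lower, upper]."""
    h = (upper - lower) / n
    total = 0.0
    for k in range(n):
        c = lower + (k + 0.5) * h
        total += utility(c) * density(c)
    return h * total


def uniform_density_1_3(c):
    """Illustrative density: uniform on [1, 3]."""
    return 0.5 if 1.0 <= c <= 3.0 else 0.0


eu_discrete = expected_utility_discrete(math.log, [1.0, 2.0], [0.3, 0.7])
eu_continuous = expected_utility_continuous(math.log, uniform_density_1_3, 1.0, 3.0)
```

For these inputs the discrete value equals 0.7 log 2 and the continuous value
approximates (3 log 3 − 2)/2.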
3.3.3 Risk Aversion
Based on the attitude to risk, we distinguish risk averse, risk neutral, and risk seeking
decision makers. Their utility functions are concave, affine, and convex, respectively.
Most decision makers are assumed to be risk averse and it is often convenient to
have a measure of risk aversion.
The coefficient of risk aversion is a special measure reflecting the character and
degree of a decision maker’s risk aversion. Intuitively, the more concave the expected
utility function is, the more risk averse the decision maker tends to be. We could
measure risk aversion by the second derivative of the utility function. However, this
definition is sensitive to changes in the utility function: if we consider any positive
multiple of the utility function, the second derivative changes but the decision maker’s
behaviour does not. If we normalise the second derivative by dividing by the first,
we get a reasonable measure known as Arrow-Pratt absolute risk aversion coefficient
(Arrow, 1965; Pratt, 1964). The most common measures are the coefficients of absolute
risk aversion (ARA) and relative risk aversion (RRA).
\[ \lambda_A(x) = -\frac{U''(x)}{U'(x)}. \tag{3.15} \]
Utility functions with a constant absolute risk aversion coefficient are called constant
absolute risk aversion (CARA) utility functions.
\[ \lambda_R(x) = -x\,\frac{U''(x)}{U'(x)} = x\,\lambda_A(x). \tag{3.16} \]
Utility functions with a constant relative risk aversion coefficient are called constant
relative risk aversion (CRRA) utility functions.
Decreasing & Increasing Risk Aversion
Definition 3.7 If the absolute risk aversion λA(x) is decreasing, then we say that
decreasing absolute risk aversion (DARA) is present, i.e., the following inequality holds,
\[ \frac{\partial \lambda_A(x)}{\partial x} < 0. \tag{3.17} \]
Definition 3.8 If the relative risk aversion λR(x) is decreasing, then we say that
decreasing relative risk aversion (DRRA) is present, i.e., the following inequality holds,
\[ \frac{\partial \lambda_R(x)}{\partial x} < 0. \tag{3.18} \]
Also, increasing relative risk aversion (IRRA) is present if ∂λR(x)/∂x > 0.
Thus, among the utility functions introduced in §3.3.1, negative exponential utility
exhibits constant absolute risk aversion (CARA) and increasing relative risk aversion
(IRRA); both the absolute and relative risk aversions of the quadratic utility function
are increasing; logarithmic and iso-elastic utility functions both exhibit decreasing
absolute risk aversion (DARA) and constant relative risk aversion (CRRA). The
corresponding results are presented in Table 3.6.
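These classifications can be checked numerically with finite differences, as in the
sketch below. The helper names, the step size and the parameter values (a = 0.5 for
the negative exponential, a = 1 and b = 0.05 for the quadratic, a = 2 for the
iso-elastic case) are illustrative assumptions, and the negative exponential is taken
in the form −exp(−ax), which matches the derivatives quoted in §3.3.1 up to an
additive constant.

```python
import math


def absolute_risk_aversion(u, x, h=1e-3):
    """Approximate -U''(x) / U'(x) with central finite differences."""
    d1 = (u(x + h) - u(x - h)) / (2.0 * h)
    d2 = (u(x + h) - 2.0 * u(x) + u(x - h)) / (h * h)
    return -d2 / d1


def relative_risk_aversion(u, x, h=1e-3):
    """Approximate x * lambda_A(x)."""
    return x * absolute_risk_aversion(u, x, h)


def neg_exponential(x):
    return -math.exp(-0.5 * x)           # assumed form, a = 0.5


def quadratic(x):
    return 1.0 * x - 0.05 * x ** 2       # a = 1, b = 0.05, increasing for x < 10


def logarithmic(x):
    return math.log(x)


def iso_elastic(x):
    return (x ** (1.0 - 2.0) - 1.0) / (1.0 - 2.0)   # a = 2
```

Evaluating the two coefficients at, say, x = 1 and x = 3 shows a constant λA for the
negative exponential, an increasing λA for the quadratic, and constant λR for the
logarithmic and iso-elastic functions.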
It is worth noting that a utility function U exhibits constant absolute risk aversion
(CARA) if the absolute risk aversion coefficient does not depend on the resource, i.e.,
λ′A(x) = 0, and that decreasing absolute risk aversion (DARA) is present if decision
makers with more resource are less absolutely risk averse than those with less resource,
i.e., λ′A(x) < 0. It is natural to assume that most decision makers have decreasing
absolute risk aversion; quadratic utility functions, for example, which exhibit
increasing absolute risk aversion, are avoided by economists because they imply
unrealistic behaviour in practice in the sense of absolute risk aversion.
The crucial thing here is the right choice of the utility function and its parameters,
reflecting in particular decision makers’ attitude to risk. Usually, the parameters enter-
ing the utility functions are estimated using some statistical methods or psychological
experiments.
Kapliński (2013) briefly considers the economic and psychological aspects of decision
making in maintenance and repair and discusses risk assessment criteria such as
expected value and maximisation of expected utility; the results suggest that
different attitudes towards risk influence the choice of decisions. For example,
a maintenance engineer and a production manager would have very different risk
preferences: the former would prefer to maintain systems frequently, whereas the latter
would prefer to keep systems running without interruption.
Baker (2010) proposes a new concept of minimising the disutility of cost per unit
time instead of cost per unit time in maintenance modelling, which provides a main-
tenance policy that is optimal under risk aversion. But this paper only advocates use
of the exponential utility function, thus it would be interesting to explore the use of a
different utility function than the exponential.
Houlding and Coolen (2011) address some of the foundational issues of adaptive
utility when utility is uncertain, seen from the perspective of a Bayesian statistician,
which generalise the traditional utility concepts of value of information and risk aver-
sion. In (Houlding and Coolen, 2012) they extend their work by combining the deci-
sion making with uncertain utility and nonparametric predictive inference, by means of
which they present the Nonparametric Predictive Utility Inference (NPUI) suggestion
as a possible strategy for the problem of utility induction in cases of extremely vague
information. Meanwhile, Houlding and Coolen (2007) examine how the possibility to
learn preferences can be of interest for decisions in the area of reliability, which offers
a generalisation of the classical Bayesian approach by adaptive utility for sequential
decision making.
Flood et al. (2010) use a Bayesian Network to model the downtime of a system
and employ the posterior distribution within a decision analysis. They give an exam-
ple by computing the expected utilities for a warranty policy and an adaptive form of
the former through simulating samples of system downtime from the posterior distribu-
tion. Taking maximising the expected utility as the objective, the optimally acceptable
downtime range of a system is found simply by using the optim function in R under a
continuous decision space.
In software reliability, as software is more frequently used, the reliability of the soft-
ware increases, which is different from the systems commonly modelled with increasing
failure rates (IFR). For example, McDaid and Wilson (2001) propose a decision the-
oretic solution to the problem of deciding the optimal length of the software testing
period by using an error detection model and a sensible utility.
Overall, the application of utility in maintenance is a relatively new area, and little
work has been done on maintenance optimisation for repairable systems based on utility
theory; it is therefore valuable to explore utility-based maintenance modelling.
Chapter 4
A system maintenance policy specifies how the maintenance activities should be sched-
uled and executed. Each maintenance action is taken to keep the repairable system
at the required operation level and it can be minimal repair, perfect maintenance or
replacement, etc.
Many models concerning maintenance describe a periodic maintenance policy, where
the maintenance frequency and times are pre-determined (Barlow and Proschan, 1965),
or fixed prior to modelling set up (Schutz et al., 2011). This policy has its advantages,
for example, it is less complicated to implement in practice if the system maintenance
is based on calendar time and as a result, it is popular for practitioners. However, this
policy also has its disadvantages due to its inflexibility. The main issue is that this
policy is not a globally optimal maintenance policy, which could result in exceptionally
high costs if the system fails to perform at the desired level due to inadequate or
late maintenance. Bayesian analysis with fixed maintenance times started
with the work of Percy and Kobbacy (1996). Damien et al. (2007) analysed single-item
maintenance in a Bayesian semi-parametric setting, which addresses the drawback
of other models that fail to capture the true underlying relationships in the data, but
still with a pre-defined time horizon. Another example is Baker (2010), who considers
failures of a system under some maintenance policy where the system may potentially
reach a regeneration point T. However, the maintenance time phases are again
pre-defined, which is not practical in reality.
Myopic maintenance modelling focuses on the next maintenance phase based on the
previous and current system status, and fails to consider the possible series of
maintenance actions in the future. As a result, this methodology is not globally
optimal either. In this chapter, maintenance time phases are initially pre-defined,
depending on the particular system, but are flexible and updated with data.
The objective of Chapter 4 is to determine the optimal maintenance schedule times
by proposing sequential maintenance models that adopt the Bayesian approach for the
random or unknown parameters of failure distributions. The Bayesian approach is
flexible when the failure distribution of the system is either unknown or contains
uncertain parameters, which is common in most practical situations.
This chapter starts with the problem setting for two-phase maintenance systems,
discusses the choice of utility functions, models two-phase systems maintenance by
stochastic dynamic programming, and utilises a gridding method to solve problems
arising from classical optimisation methods. A few numerical examples follow. Note
that “time” in this thesis is regarded as “local time” (Definition 2.2).
on if failures are observed or not. In general, our model falls into the category of
condition-based maintenance, but with scheduled maintenance times being subject to
random failure that induces change in maintenance.
Costs of the system, such as failure cost, repair cost and maintenance cost, are also
assumed to depict the properties of the system evolution. These costs are assumed
to be constant here for the purpose of simplicity though this is also not a necessary
assumption. Our aim is to find the optimal maintenance times according to the running
of the system based on the criterion of maximising the expected utility per unit time
(i.e., here the negative expected cost per unit time because the payoff is negative cost
in this problem). The utility, of course can be altered and incorporated in a more
general or specified horizon according to various contexts (either theoretical, practical,
or both).
The system considered here has two operating periods, described through a
decision tree; see Figure 4.1. At the decision nodes (represented by squares) Tm1,
Tm2, one has to decide the optimal maintenance time for the associated phase. A
phase is defined as the period between successive occurrences of a failure or a
maintenance. Chance nodes (represented by circles) describe the possible outcomes
for the system; in particular, C1 is the chance node for phase 1
and C2i (i = 1, 2) are the chance nodes for phase 2. Tfj (j = 1, 2) are the potential
failure times for each phase, while Tmi (i = 1, 2) are the optimal maintenance times
we are attempting to find. Before the system starts running, we need to determine the
maintenance time Tm1 for the first period. Once the system is running, two outcomes
are possible: a failure occurs before the maintenance time Tm1, or the system survives
to the maintenance time Tm1 (the failure time is right-censored). After the maintenance
or repair, the system enters the second running period, with the same potential
outcomes as above. All maintenance time decisions are made according to the
average cost of operating the system (the average cost per unit time, i.e., the cost
rate, is shown at the end of each branch), where Cr, Cm and Cf represent the costs
of repair, maintenance and failure of the system, respectively. The utility is a function
of cost rates.
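[Figure 4.1: decision tree. Decision node Tm1 leads to chance node C1; each phase-one
outcome leads to a decision node Tm2 and a phase-two chance node. Cost rates at the
leaves:
  Tf1 ≤ Tm1, Tf2 ≤ Tm2: 2(Cr + Cf)/(Tf1 + Tf2)
  Tf1 ≤ Tm1, Tf2 > Tm2: (Cr + Cf + Cm)/(Tf1 + Tm2)
  Tf1 > Tm1, Tf2 ≤ Tm2: (Cm + Cr + Cf)/(Tm1 + Tf2)
  Tf1 > Tm1, Tf2 > Tm2: 2Cm/(Tm1 + Tm2)]
Figure 4.1: Decision tree for two-phase system with sequential problem with shading
indicating a range of possible outcomes for the preceding chance node. At the very
right side, the formulae represent expected mean cost per unit time.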
Interest centres on the value of the unknown maintenance times Tmi (i = 1, 2) for
each phase, given observing failure or not. In a non-sequential setting, one could simply
choose how many phases the system has and then find a single maintenance time so as
to optimise the expected cost or utility per unit time of the system. This method can
be repeated for all potential phases. The sequential setting, however, allows the choice
of maintenance times to depend on the data observed at successive points, and also on
what may be learned in the future.
The cost rate CR(·) is a function of the operation time To of the
system and the corresponding cost C incurred during To, defined as
\[ CR_{T_o} = CR_{T_o}(T_o, C) = \frac{C}{T_o}. \tag{4.1} \]
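The cost rates at the four leaves of the decision tree in Figure 4.1 follow directly
from this definition. The sketch below is illustrative only: the function names and
argument order are assumptions, and each phase is taken to end at the failure time if
the failure occurs no later than the scheduled maintenance time, and at the
maintenance time otherwise.

```python
def cost_rate(cost, operation_time):
    """Cost per unit time, C / T_o."""
    return cost / operation_time


def two_phase_cost_rate(tf1, tf2, tm1, tm2, c_f, c_r, c_m):
    """Cost rate at the leaf of the two-phase decision tree reached by (tf1, tf2)."""
    total_cost = 0.0
    total_time = 0.0
    for t_fail, t_maint in ((tf1, tm1), (tf2, tm2)):
        if t_fail <= t_maint:
            # Failure observed first: failure and repair costs, phase lasts t_fail.
            total_cost += c_f + c_r
            total_time += t_fail
        else:
            # Failure time right-censored: maintenance cost, phase lasts t_maint.
            total_cost += c_m
            total_time += t_maint
    return cost_rate(total_cost, total_time)
```

For instance, with hypothetical costs `c_f=5`, `c_r=3`, `c_m=1` and both maintenance
times equal to 3, failures at times 1 and 2 give 2(Cr + Cf)/(Tf1 + Tf2) = 16/3.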
A utility function is defined as U (CRTm ) over cost rate CRTm induced by corre-
sponding maintenance time Tm . As a result, the expected utility of cost rate can be
written as follows,
\[ E(U(CR_{T_m})) = \int U(CR_{T_m})\, f(CR_{T_m})\, dCR_{T_m} \tag{4.2} \]
where f (CRTm ) is a probability density function defined on cost rate CRTm .
In this problem, the expected utility of cost rate at chance node CN1 for a one-phase
system is
utility function captures both the core issues of cost and time. Consider two mainte-
nance policies 1 and 2 which have cost rates CR1 and CR2 , with current cost budget B
which is planned for maintenance in particular. Then one tends to prefer maintenance
policy 1 if EU(B − CR1) > EU(B − CR2) for every B. Pfanzagl (1959) and Bell (1988)
have proved that the negative exponential and linear utility function families are the
only ones for which preference holds regardless of budget B.
Bell and Fishburn (2001) suggest that the following utility
should be used because its form of sum of linear and exponential has a few properties:
• It satisfies the so called “one-switch” rule, which means preference for one of
two maintenance policies is allowed to change only once as the cost budget B
increases.
• Its absolute risk aversion is decreasing, as
∂λA(x)/∂x = −αβ³ exp(βx)/(αβ + exp(βx))² < 0.
[Figure: utility U(x) plotted against x for η = 0.1, 0.3, 0.5 and 0.7.]
Myopic modelling is the most elementary modelling method for sequential maintenance.
It optimises the cost per unit time, but does not explicitly use forecast
information or any direct representation of decisions about maintenance times in the
future; in other words, it makes no explicit attempt to capture the impact of the current
maintenance time on the future. In its most basic form, a myopic policy can be given
by
\[ T_m^{\text{myopic}} = \arg\max\, U(CR_{T_m}) \tag{4.7} \]
where $T_m^{\text{myopic}}$ is the optimum maintenance time and U(·) is a utility
function of cost rate.
In the two-phase sequential maintenance problem, the optimum maintenance time
for phase one under myopic modelling is
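\[
T^{\text{myopic}}_{m_1, CN_1} = \arg\max \left\{ \int_0^{T_{m_1}}
U_{T_{m_1}}\!\left(\frac{C_f + C_r}{T_{f_1}}\right) \times f_{T_{f_1}}(t_{f_1})\, dt_{f_1}
+ U_{T_{m_1}}\!\left(\frac{C_m}{T_{m_1}}\right) \times f_{T_{f_1}}(t_{f_1} > T_{m_1}) \right\}
\tag{4.8}
\]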
Note that the function here is the expected utility of cost rate.
Similarly for chance nodes CN21 and CN22 , the optimum maintenance times under
myopic modelling can be expressed as follows, respectively:
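\[
T^{\text{myopic}}_{m_2, CN_{21}} = \arg\max \left\{ \int_0^{T_{m_2}}
U_{T_{m_2}}\!\left(\frac{C_f + C_r}{T_{f_2}}\right) \times f_{T_{f_2}}(t_{f_2} \mid t_{f_1})\, dt_{f_2}
+ U_{T_{m_2}}\!\left(\frac{C_m}{T_{m_2}}\right) \times f_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1}) \right\}
\tag{4.9}
\]
\[
T^{\text{myopic}}_{m_2, CN_{22}} = \arg\max \left\{ \int_0^{T_{m_2}}
U_{T_{m_2}}\!\left(\frac{C_f + C_r}{T_{f_2}}\right) \times
f_{T_{f_2}}\bigl(t_{f_2} \mid t_{f_1} > T^{\text{myopic}}_{m_1, CN_1}\bigr)\, dt_{f_2}
+ U_{T_{m_2}}\!\left(\frac{C_m}{T_{m_2}}\right) \times
f_{T_{f_2}}\bigl(t_{f_2} > T_{m_2} \mid t_{f_1} > T^{\text{myopic}}_{m_1, CN_1}\bigr) \right\}
\tag{4.10}
\]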
Briefly, the procedure to find the optimum maintenance times for two-phase systems
under myopic modelling is as follows:
1. Maximise the expected utility function of cost rate for phase one to find the
optimum maintenance time, $T^{*}_{m_1}$;
2. Similarly, maximise the expected utility function of cost rate for phase two to
find the optimum maintenance times, $T_{CN_{21}, m_2}$ and $T_{CN_{22}, m_2}$, given the
additional data on the outcome of the first phase.
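Step 1 of this procedure can be sketched with a simple grid search. The sketch makes
several simplifying assumptions that are not part of the model above: the Weibull shape
is treated as known (θ = 2, scale 1) rather than given a prior, the utility is the
risk-neutral U(CR) = −CR, the cost values are hypothetical, and the integral is
approximated with the midpoint rule.

```python
import math

THETA = 2.0            # assumed known Weibull shape (scale fixed at 1)
C_FAIL_REPAIR = 8.0    # hypothetical C_f + C_r
C_MAINT = 1.0          # hypothetical C_m


def weibull_pdf(t):
    return THETA * t ** (THETA - 1.0) * math.exp(-(t ** THETA))


def weibull_survival(t):
    return math.exp(-(t ** THETA))


def utility(cost_rate):
    """Assumed risk-neutral utility: negative cost rate."""
    return -cost_rate


def expected_utility(tm, n=400):
    """Expected utility of cost rate for one phase with maintenance time tm."""
    h = tm / n
    failure_part = 0.0
    for k in range(n):
        t = (k + 0.5) * h  # midpoint rule avoids evaluating at t = 0
        failure_part += utility(C_FAIL_REPAIR / t) * weibull_pdf(t)
    return h * failure_part + utility(C_MAINT / tm) * weibull_survival(tm)


def myopic_maintenance_time(grid):
    """Grid point that maximises the one-phase expected utility."""
    return max(grid, key=expected_utility)


GRID = [0.05 + 0.01 * k for k in range(296)]   # candidate times 0.05, ..., 3.00
best_tm = myopic_maintenance_time(GRID)
```

Under these particular assumptions the expected utility has a closed form whose
stationary point is 1/√14 ≈ 0.27, which gives a check on the grid search; with a prior
on θ the density in the integrand would be replaced by the marginal density of the
failure time.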
From the equations above, we can see that the maintenance times obtained are
not globally optimal, as the myopic method makes decisions based only on the current
state of knowledge, using previous information, and does not take future states into
account. To find the globally optimum maintenance times, another method is required.
where γi , di and Zi , i = 1, 2, are best possible solutions from stage i to the end, decision
variables and state variables, respectively.
As a result, we suggest using the “roll-back” method, in which we first solve for the
optimal maintenance times for phase two, i.e., Tm2 for chance nodes CN21 and CN22; we
then substitute the maximum expected utility of cost rate for phase two back into the
expected utility function of cost rate for phase one, and solve to obtain the globally
optimum maintenance time Tm1 for phase one.
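\[
T^{DP}_{m_2, CN_{21}} = \arg\max \left\{ \int_0^{T_{m_2}}
U_{T_{m_2}}\!\left(\frac{2(C_f + C_r)}{T_{f_1} + T_{f_2}}\right) \times f_{T_{f_2}}(t_{f_2} \mid t_{f_1})\, dt_{f_2}
+ U_{T_{m_2}}\!\left(\frac{C_f + C_r + C_m}{T_{f_1} + T_{m_2}}\right) \times
f_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1}) \right\}
\tag{4.13}
\]
\[
T^{DP}_{m_2, CN_{22}} = \arg\max \left\{ \int_0^{T_{m_2}}
U_{T_{m_2}}\!\left(\frac{C_m + C_f + C_r}{T_{m_1} + T_{f_2}}\right) \times
f_{T_{f_2}}(t_{f_2} \mid t_{f_1} > T_{m_1})\, dt_{f_2}
+ U_{T_{m_2}}\!\left(\frac{2C_m}{T_{m_1} + T_{m_2}}\right) \times
f_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1} > T_{m_1}) \right\}
\tag{4.14}
\]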
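Accordingly, the maximum expected utilities of cost rate for chance nodes CN21 and
CN22 are as follows, respectively:
\[
E(CN_{21}) = \int_0^{T^{DP}_{m_2, CN_{21}}}
U_{T_{m_2}}\!\left(\frac{2(C_f + C_r)}{T_{f_1} + T_{f_2}}\right) \times f_{T_{f_2}}(t_{f_2} \mid t_{f_1})\, dt_{f_2}
+ U_{T^{DP}_{m_2}}\!\left(\frac{C_f + C_r + C_m}{T_{f_1} + T^{DP}_{m_2, CN_{21}}}\right) \times
f_{T_{f_2}}\bigl(t_{f_2} > T^{DP}_{m_2, CN_{21}} \mid t_{f_1}\bigr)
\tag{4.15}
\]
\[
E(CN_{22}) = \int_0^{T^{DP}_{m_2, CN_{22}}}
U_{T_{m_2}}\!\left(\frac{C_m + C_f + C_r}{T_{m_1} + T_{f_2}}\right) \times
f_{T_{f_2}}(t_{f_2} \mid t_{f_1} > T_{m_1})\, dt_{f_2}
+ U_{T^{DP}_{m_2}}\!\left(\frac{2C_m}{T_{m_1} + T^{DP}_{m_2, CN_{22}}}\right) \times
f_{T_{f_2}}\bigl(t_{f_2} > T^{DP}_{m_2, CN_{22}} \mid t_{f_1} > T_{m_1}\bigr)
\tag{4.16}
\]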
For phase one, substitute the expected utilities of cost rate E(CN21) and E(CN22) back
into the expected utility function of cost rate for phase one, i.e.,
\[
T^{DP}_{m_1, CN_1} = \arg\max \left\{ \int_0^{T_{m_1}}
U_{T_{m_1}}\!\left(\frac{E(CN_{21})}{T_{f_1}}\right) \times f_{T_{f_1}}(t_{f_1})\, dt_{f_1}
+ U_{T_{m_1}}\!\left(\frac{E(CN_{22})}{T_{m_1}}\right) \times f_{T_{f_1}}(t_{f_1} > T_{m_1}) \right\}
\tag{4.17}
\]
Then the globally optimum maintenance time for phase one by the dynamic programming
method is $T^{DP}_{m_1}$.
Briefly, the procedure to find the optimum maintenance times for two-phase systems
under dynamic programming modelling is as follows:
1. Maximise the expected utility function of cost rate for phase two to find the
optimum maintenance times $T^{*}_{m_2}$ and the corresponding maximum expected utility
of cost rate $E(T^{*}_{m_2})$;
2. Then substitute the obtained maximum expected utility of cost rate $E(T^{*}_{m_2})$
into the expected utility function of cost rate for phase one to find the globally
optimum maintenance time $T^{DP}_{m_1}$.
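The two steps can be sketched as a backward induction on a grid. This is a simplified
illustration rather than the thesis's implementation: the Weibull shape is assumed
known (θ = 2, scale 1), so no Bayesian updating takes place between phases; the utility
is the risk-neutral U(CR) = −CR applied to the two-phase cost rates of Figure 4.1; the
costs and grids are hypothetical; and the phase-one objective is written in the
standard roll-back form (the phase-two optimum weighted by the phase-one outcome
distribution), which may differ in detail from equation 4.17.

```python
import math

THETA = 2.0            # assumed known Weibull shape (scale fixed at 1)
C_FAIL_REPAIR = 8.0    # hypothetical C_f + C_r
C_MAINT = 1.0          # hypothetical C_m
GRID = [0.15 * k for k in range(1, 21)]   # candidate maintenance times 0.15, ..., 3.0


def pdf(t):
    return THETA * t ** (THETA - 1.0) * math.exp(-(t ** THETA))


def survival(t):
    return math.exp(-(t ** THETA))


def utility(cost_rate):
    return -cost_rate


def integrate(g, upper, n):
    """Midpoint rule on (0, upper)."""
    h = upper / n
    return h * sum(g((k + 0.5) * h) for k in range(n))


def phase_two_value(elapsed, cost_so_far):
    """Step 1: best (expected utility, T_m2) given phase-one duration and cost."""
    def expected_utility(tm2):
        fail = integrate(
            lambda t: utility((cost_so_far + C_FAIL_REPAIR) / (elapsed + t)) * pdf(t),
            tm2, 50)
        keep = utility((cost_so_far + C_MAINT) / (elapsed + tm2)) * survival(tm2)
        return fail + keep
    return max((expected_utility(tm2), tm2) for tm2 in GRID)


def phase_one_value(tm1):
    """Step 2: expected utility of T_m1 with the phase-two optimum rolled back."""
    fail = integrate(lambda t: phase_two_value(t, C_FAIL_REPAIR)[0] * pdf(t), tm1, 20)
    keep = phase_two_value(tm1, C_MAINT)[0] * survival(tm1)
    return fail + keep


def solve():
    """Return (maximum expected utility, globally optimal T_m1 on the grid)."""
    return max((phase_one_value(tm1), tm1) for tm1 in GRID)
```

The grids are deliberately coarse so that the nested maximisation and integration stay
cheap; refining them trades run time for accuracy.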
For example, let failure time Tf follow a Weibull distribution with scale parameter κ
and shape parameter θ. The probability density function (pdf) of Tf is given by
\[
f_{T_f}(t_f \mid \kappa, \theta) =
\begin{cases}
\dfrac{\theta}{\kappa} \left(\dfrac{t_f}{\kappa}\right)^{\theta - 1}
\exp\left\{-\left(\dfrac{t_f}{\kappa}\right)^{\theta}\right\}
& t_f > 0,\ \kappa > 0,\ \theta > 0; \\[1ex]
0 & \text{otherwise.}
\end{cases}
\]
Its hazard function is
$h_{T_f}(t_f \mid \kappa, \theta) = \frac{\theta}{\kappa}\left(\frac{t_f}{\kappa}\right)^{\theta-1}$,
so depending on the shape parameter θ, the hazard function can be decreasing, constant
or increasing. The Weibull distribution has been widely used in practice for modelling
the failure time of systems, see Houlding and Wilson (2011); Singpurwalla and Wilson (1999).
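The density and hazard function can be written directly from these formulae. The
function names below are illustrative; the sketch only restates the definitions so
that the effect of θ on the hazard can be inspected.

```python
import math


def weibull_pdf(t, kappa, theta):
    """Weibull density with scale kappa and shape theta; zero for t <= 0."""
    if t <= 0.0:
        return 0.0
    z = t / kappa
    return (theta / kappa) * z ** (theta - 1.0) * math.exp(-(z ** theta))


def weibull_hazard(t, kappa, theta):
    """Weibull hazard (theta / kappa) * (t / kappa) ** (theta - 1) for t > 0."""
    return (theta / kappa) * (t / kappa) ** (theta - 1.0)
```

With θ < 1 the hazard decreases in t, with θ = 1 it is the constant 1/κ, and with
θ > 1 it increases.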
The hierarchical Bayesian method is applied to modelling the systems considered in this
thesis. The modelling centres on the uncertainty of the shape parameter θ in the Weibull
probability distribution. θ is assumed to follow a truncated normal distribution so
that θ can express the behaviour of the hazard function, i.e., increasing,
decreasing or constant. The hyper-parameter µ of the truncated normal distribution is
given a uniform prior because expert knowledge places it within a known range,
but no other information is available about its location. In specifying the
model, define the following:
(4.18) is the Weibull probability model with shape parameter θ and scale parameter
κ.
(4.19) is the prior distribution of the shape parameter θ, which is a normal
distribution truncated at a and b, where −∞ < a < b < ∞, with mean µ and standard
deviation σ, with the hyper-parameter made explicit as µ. Its probability density
function (pdf) f, for a ≤ θ ≤ b, is given by
\[
f(\theta \mid \mu, \sigma, a, b) =
\frac{\frac{1}{\sigma}\,\phi\!\left(\frac{\theta - \mu}{\sigma}\right)}
{\Phi\!\left(\frac{b - \mu}{\sigma}\right) - \Phi\!\left(\frac{a - \mu}{\sigma}\right)}
\tag{4.21}
\]
where $\phi(\zeta) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\zeta^2\right)$ is the
probability density function of the standard normal
distribution N(0, 1) and Φ(·) is its cumulative distribution function. See Figure 4.3
for the probability density function of θ. The left graph shows normal distributions
truncated at 1 with different sets of parameters, with mean µ = 1, 2, 3, 4 and standard
deviation σ = 1; it can be seen that θ has higher probability near the mean and
lower probability near the tail, with smaller mean µ. The right graph shows normal
distributions truncated at various points at 0.5, 1, 1.5, 2 with same mean µ = 2 and
standard deviation σ = 1; we may see that θ has higher probability near the mean
when the truncation point is closer to the mean µ.
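The truncated normal density can be evaluated with the standard library alone, as in
the following sketch. The function names are illustrative, and an upper truncation
point of b = ∞ is allowed so that the one-sided case used later in this chapter is
covered.

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_pdf(theta, mu, sigma, a, b=math.inf):
    """Density of N(mu, sigma^2) truncated to [a, b]; zero outside the interval."""
    if theta < a or theta > b:
        return 0.0
    mass = std_normal_cdf((b - mu) / sigma) - std_normal_cdf((a - mu) / sigma)
    return std_normal_pdf((theta - mu) / sigma) / (sigma * mass)
```

Dividing by the retained probability mass rescales the normal density so that it
integrates to one over [a, b].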
63
Comparison of normals truncated at 1 with various mean Comparison of normals with mean 2 and various truncation point
µ truncated at
0.8
0.8
1 0.5
2 1
3 1.5
4 2
0.6
0.6
normal normal
Density
Density
0.4
0.4
0.2
0.2
0.0
0.0
−2 0 2 4 6 −2 0 2 4 6
θ θ
Figure 4.3: Comparison of truncated normals with various mean (left) and truncating
points (right).
Crucially, µ is not specified directly, but has hyper-prior distribution (4.20), which
is a uniform distribution with parameters a0 and b0. Its probability density function
(pdf) is given by
\[
f(\mu \mid a_0, b_0) =
\begin{cases}
\dfrac{1}{b_0 - a_0} & \mu \in [a_0, b_0]; \\[1ex]
0 & \text{otherwise.}
\end{cases}
\]
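This full model involves parameters ξ = (θ, µ). The mean and variance of these
distributions can be expressed as
\[ E(T_{f_i} \mid \theta, \kappa) = \kappa\, \Gamma\!\left(1 + \frac{1}{\theta}\right) \tag{4.22} \]
\[
\operatorname{Var}(T_{f_i} \mid \theta, \kappa) = \kappa^2 \left[ \Gamma\!\left(1 + \frac{2}{\theta}\right)
- \left( \Gamma\!\left(1 + \frac{1}{\theta}\right) \right)^{2} \right] \tag{4.23}
\]
\[
E(\theta \mid a, b) = \mu +
\frac{\phi\!\left(\frac{a-\mu}{\sigma}\right) - \phi\!\left(\frac{b-\mu}{\sigma}\right)}
{\Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)}\, \sigma \tag{4.24}
\]
\[
\operatorname{Var}(\theta \mid a, b) = \sigma^2 \left[ 1 +
\frac{\frac{a-\mu}{\sigma}\phi\!\left(\frac{a-\mu}{\sigma}\right) - \frac{b-\mu}{\sigma}\phi\!\left(\frac{b-\mu}{\sigma}\right)}
{\Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)}
- \left( \frac{\phi\!\left(\frac{a-\mu}{\sigma}\right) - \phi\!\left(\frac{b-\mu}{\sigma}\right)}
{\Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)} \right)^{2} \right] \tag{4.25}
\]
\[ E(\mu) = \frac{1}{2}(a_0 + b_0) \tag{4.26} \]
\[ \operatorname{Var}(\mu) = \frac{1}{12}(b_0 - a_0)^2 \tag{4.27} \]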
Systems we discuss here are assumed to have increasing hazard when they are
running, which meets the characteristics of most industrial systems in practice though
this is not necessary for general applications.
To solve the sequential maintenance optimisation problem, a generalised form of
the stochastic dynamic programming algorithm for this specific problem is given below.
Let the failure times of the system follow a Weibull distribution with shape
parameter θ and scale parameter 1 (for simplicity, as our concern is the shape parameter
θ). We give θ a truncated normal prior N(µ, 1), truncated at 1, because
the hazard function would decrease sharply at the initial time if the shape parameter
θ were less than 1. The parameter µ of the truncated normal distribution of θ is set to
E(µ) = 2, the mean of the uniform hyper-prior of µ with a0 = 1 and
b0 = 3. Thus, the related functional forms can be written as
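\[
f(\theta \mid \mu = 2, \sigma = 1, a = 1, b = \infty) = \frac{\phi(\theta - 2)}{\Phi(1)}
= \frac{\frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{1}{2}(\theta - 2)^2\right\}}{\Phi(1)} \tag{4.28}
\]
\[
f_{T_{f_i}}(t_{f_i} \mid \theta) = \theta\, (t_{f_i})^{\theta - 1} \exp\left\{-(t_{f_i})^{\theta}\right\},
\quad i = 1, 2 \tag{4.29}
\]
where $\phi(\zeta) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2}\zeta^2\right)$ is the
probability density function of the standard normal
distribution and Φ(·) is its cumulative distribution function. The expectation and
variance of failure time Tfi, i = 1, 2, given θ are
\[ E(T_{f_i} \mid \theta, \kappa = 1) = \Gamma\!\left(1 + \frac{1}{\theta}\right) \tag{4.31} \]
\[
\operatorname{Var}(T_{f_i} \mid \theta, \kappa = 1) = \Gamma\!\left(1 + \frac{2}{\theta}\right)
- \left( \Gamma\!\left(1 + \frac{1}{\theta}\right) \right)^{2} \tag{4.32}
\]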
As Figure 4.4 shows, over the plotted range the expectation of Tf increases and the variance of Tf decreases as θ increases, which means that failure times become less dispersed and the system is less likely to fail early as θ increases.
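As a quick numerical check of (4.31) and (4.32), the following R sketch evaluates both moments on a grid of shape values. The function names `weibull_mean` and `weibull_var` and the grid `theta_grid` are illustrative choices, not part of the original analysis.

```r
# Mean and variance of a Weibull failure time with shape theta and scale kappa,
# following (4.22)-(4.23); kappa = 1 gives (4.31)-(4.32).
weibull_mean <- function(theta, kappa = 1) kappa * gamma(1 + 1 / theta)
weibull_var <- function(theta, kappa = 1) {
  kappa^2 * (gamma(1 + 2 / theta) - gamma(1 + 1 / theta)^2)
}

# Illustrative grid of shape values (an assumption made for this sketch).
theta_grid <- 2:10
moments <- data.frame(
  theta = theta_grid,
  mean  = weibull_mean(theta_grid),
  var   = weibull_var(theta_grid)
)
print(round(moments, 4))
```

On this grid the mean column rises and the variance column falls, which is the pattern described in the text.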
[Figure: Expectation (left) and Variance (right) plotted against θ, for θ from 2 to 10.]
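$$
E(\theta \mid \mu = 2, \sigma = 1, a = 1) = \mu + \sigma\,\frac{\phi\!\left(\frac{a-\mu}{\sigma}\right)}{1 - \Phi\!\left(\frac{a-\mu}{\sigma}\right)} = 2 + \frac{\phi(1)}{\Phi(1)} \tag{4.33}
$$
$$
\operatorname{Var}(\theta \mid \mu = 2, \sigma = 1, a = 1) = \sigma^{2}\left[1 - \frac{\phi\!\left(\frac{a-\mu}{\sigma}\right)}{1 - \Phi\!\left(\frac{a-\mu}{\sigma}\right)}\left(\frac{\phi\!\left(\frac{a-\mu}{\sigma}\right)}{1 - \Phi\!\left(\frac{a-\mu}{\sigma}\right)} - \frac{a-\mu}{\sigma}\right)\right] = 1 - \frac{\phi(1)}{\Phi(1)} - \left(\frac{\phi(1)}{\Phi(1)}\right)^{2} \tag{4.34}
$$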
[Figure: density plot (vertical axis: Density).]
Figure 4.6 shows that, as the truncation point a moves from 1 to 2, the expectation of θ increases and its variance decreases. The prior distribution of θ then approaches a right-skewed (mass concentrated near the truncation point, with a long right tail) and leptokurtic truncated normal distribution.
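The dependence of the prior moments on the truncation point can be reproduced with a short R sketch of (4.33) and (4.34). The function name `trunc_norm_moments` and the grid `a_grid` are illustrative; the formulas assume a normal distribution truncated from the left only (b = ∞).

```r
# Mean and variance of N(mu, sigma^2) left-truncated at a, as in (4.33)-(4.34).
trunc_norm_moments <- function(a, mu = 2, sigma = 1) {
  alpha  <- (a - mu) / sigma
  lambda <- dnorm(alpha) / (1 - pnorm(alpha))   # phi(alpha) / (1 - Phi(alpha))
  c(mean = mu + sigma * lambda,
    var  = sigma^2 * (1 - lambda * (lambda - alpha)))
}

# Illustrative truncation points between 1 and 2.
a_grid <- seq(1, 2, by = 0.25)
res <- t(sapply(a_grid, trunc_norm_moments))   # one row per truncation point
print(cbind(a = a_grid, round(res, 4)))
```

At a = 2 the truncation point coincides with µ, so the prior is a half-normal distribution shifted to start at 2.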
[Figure: Expectation (left) and Variance (right) plotted against the truncation point a, for a from 1 to 2.]
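Failure time Tf1 follows a Weibull distribution with shape parameter θ and scale parameter 1. By integrating over θ, we obtain the marginal distribution of Tf1:
$$
\begin{aligned}
f_{T_{f_1}}(t_{f_1}) &= \int_{\theta} f_{T_{f_1}}(t_{f_1} \mid \theta)\, f(\theta)\, d\theta \\
&= \int_{1}^{\infty} \frac{\theta\,(t_{f_1})^{\theta - 1}\exp\left\{-\frac{1}{2}(\theta - 2)^{2} - (t_{f_1})^{\theta}\right\}}{\sqrt{2\pi}\,\Phi(1)}\, d\theta
\end{aligned}
\tag{4.35}
$$
where Φ(1) = ½ erfc(−1/√2) is the normalising constant of the prior (4.28).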
[Figure 4.7 plot: Density against Tf1.]
Figure 4.7: Marginal density of Tf1: fTf1(tf1) = ∫ f(tf1 | θ) f(θ) dθ.
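The marginal density (4.35) has no closed form, but it can be evaluated by one-dimensional numerical integration. The sketch below uses base R's `integrate`; the helper names `weib_dens`, `prior_theta` and `marginal_tf1` are hypothetical, and the Weibull density is written on the log scale purely to keep the integrand finite for very large θ.

```r
# Weibull(shape = theta, scale = 1) density (4.29), written on the log scale so
# that very large theta gives 0 rather than NaN during numerical integration.
weib_dens <- function(t, theta) exp(log(theta) + (theta - 1) * log(t) - t^theta)

# Truncated normal prior (4.28): N(2, 1) left-truncated at 1.
prior_theta <- function(theta) dnorm(theta - 2) / pnorm(1)

# Marginal density (4.35): integrate theta over [1, Inf) for each value of t.
marginal_tf1 <- function(t) {
  sapply(t, function(ti) {
    integrate(function(theta) weib_dens(ti, theta) * prior_theta(theta),
              lower = 1, upper = Inf)$value
  })
}

t_grid <- seq(0.1, 6, by = 0.1)
dens <- marginal_tf1(t_grid)
print(round(head(cbind(t = t_grid, density = dens)), 4))
```

A midpoint-rule sum of `marginal_tf1` over a fine grid should be close to 1, which is a simple sanity check on the integration.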
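For a one-phase system, the expected utility of cost rate at decision node CN1 is
$$
E_{CN_1}\!\left(U(CR_{T_{m_1}})\right) = \int_{0}^{T_{m_1}} U_{T_{m_1}}\!\left(\frac{C_f + C_r}{t_{f_1}}\right) f_{T_{f_1}}(t_{f_1})\, dt_{f_1} + U_{T_{m_1}}\!\left(\frac{C_m}{T_{m_1}}\right) f_{T_{f_1}}(t_{f_1} > T_{m_1}) \tag{4.36}
$$
$$
= \int_{0}^{T_{m_1}}\!\!\int_{1}^{\infty} U_{T_{m_1}}\!\left(\frac{C_f + C_r}{t_{f_1}}\right) \frac{\theta\,(t_{f_1})^{\theta - 1}\exp\left\{-\frac{1}{2}(\theta - 2)^{2} - (t_{f_1})^{\theta}\right\}}{\sqrt{2\pi}\,\Phi(1)}\, d\theta\, dt_{f_1} + U_{T_{m_1}}\!\left(\frac{C_m}{T_{m_1}}\right) \int_{1}^{\infty} \exp\left\{-(T_{m_1})^{\theta}\right\} f(\theta)\, d\theta \tag{4.37}
$$
where U(·) is the exponential utility function defined in (4.6), which here represents the utility induced by the mean cost per unit time, f(θ) is the prior (4.28), and Φ(1) = ½ erfc(−1/√2). This expected utility function involves one random variable, Tf1.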
For a two-phase system, the conditional probabilities have the following forms:
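$$
\begin{aligned}
f_{T_{f_2}}(t_{f_2} \mid t_{f_1}) &= \frac{f(t_{f_1}, t_{f_2})}{f(t_{f_1})} \\
&= \frac{\int_{\theta} f(t_{f_2} \mid t_{f_1}, \theta)\, f(t_{f_1} \mid \theta)\, f(\theta)\, d\theta}{\int_{\theta} f(t_{f_1} \mid \theta)\, f(\theta)\, d\theta} \\
&= \frac{\int_{\theta} f(t_{f_2} \mid \theta)\, f(t_{f_1} \mid \theta)\, f(\theta)\, d\theta}{\int_{\theta} f(t_{f_1} \mid \theta)\, f(\theta)\, d\theta} \\
&= \frac{\int_{1}^{\infty} \theta^{2}\,(t_{f_1} t_{f_2})^{\theta - 1}\exp\left\{-\frac{1}{2}(\theta - 2)^{2} - (t_{f_1})^{\theta} - (t_{f_2})^{\theta}\right\} d\theta}{\int_{1}^{\infty} \theta\,(t_{f_1})^{\theta - 1}\exp\left\{-\frac{1}{2}(\theta - 2)^{2} - (t_{f_1})^{\theta}\right\} d\theta}
\end{aligned}
\tag{4.38}
$$
In the last step the Weibull density (4.29) and the prior (4.28) have been substituted, and the normalising constant of the prior, √(2/π) / erfc(−1/√2) = 1 / (√(2π) Φ(1)), cancels between numerator and denominator.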
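$$
\begin{aligned}
f_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1} > T_{m_1}) &= \frac{\int_{\theta} f(t_{f_2} > T_{m_2} \mid \theta)\, f(t_{f_1} > T_{m_1} \mid \theta)\, f(\theta)\, d\theta}{\int_{\theta} f(t_{f_1} > T_{m_1} \mid \theta)\, f(\theta)\, d\theta} \\
&= \frac{\int_{1}^{\infty} \dfrac{\sqrt{2/\pi}\,\exp\left\{-\frac{1}{2}(\theta - 2)^{2} - (T_{m_1})^{\theta} - (T_{m_2})^{\theta}\right\}}{\operatorname{erfc}\left(-\frac{1}{\sqrt{2}}\right)}\, d\theta}{\int_{1}^{\infty} \dfrac{\sqrt{2/\pi}\,\exp\left\{-\frac{1}{2}(\theta - 2)^{2} - (T_{m_1})^{\theta}\right\}}{\operatorname{erfc}\left(-\frac{1}{\sqrt{2}}\right)}\, d\theta} \\
&= \frac{\int_{1}^{\infty} \exp\left\{-\frac{1}{2}(\theta - 2)^{2} - (T_{m_1})^{\theta} - (T_{m_2})^{\theta}\right\} d\theta}{\int_{1}^{\infty} \exp\left\{-\frac{1}{2}(\theta - 2)^{2} - (T_{m_1})^{\theta}\right\} d\theta}
\end{aligned}
$$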
We can see that these conditional probabilities are complicated and have no closed-form expressions.
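Although they lack closed forms, these conditional quantities are ratios of one-dimensional integrals over θ and can be evaluated numerically. The following R sketch does so with `integrate`, assuming the Weibull likelihood (4.29) and the truncated normal prior (4.28); all function names are hypothetical.

```r
# Weibull(shape = theta, scale = 1) density on the log scale (avoids NaN for large theta).
weib_dens <- function(t, theta) exp(log(theta) + (theta - 1) * log(t) - t^theta)
# Truncated normal prior (4.28).
prior_theta <- function(theta) dnorm(theta - 2) / pnorm(1)
# Integral of g(theta) over [1, Inf).
int_theta <- function(g) integrate(g, lower = 1, upper = Inf)$value

# Conditional density f(tf2 | tf1): the prior-weighted joint over the marginal.
cond_dens_tf2 <- function(t2, t1) {
  den <- int_theta(function(th) weib_dens(t1, th) * prior_theta(th))
  sapply(t2, function(s) {
    int_theta(function(th) weib_dens(s, th) * weib_dens(t1, th) * prior_theta(th)) / den
  })
}

# Conditional survival P(tf2 > Tm2 | tf1 > Tm1), using S(t | theta) = exp(-t^theta).
cond_surv <- function(Tm2, Tm1) {
  num <- int_theta(function(th) exp(-Tm2^th - Tm1^th) * prior_theta(th))
  den <- int_theta(function(th) exp(-Tm1^th) * prior_theta(th))
  num / den
}

print(round(cond_dens_tf2(c(0.5, 1, 1.5), t1 = 0.5), 4))
print(round(cond_surv(Tm2 = 0.3, Tm1 = 0.5), 4))
```

Each evaluation costs one or two quadratures, so nesting these ratios inside further integrals over failure times quickly becomes expensive.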
Correspondingly, the expected utility of cost per unit time for decision nodes CN21, CN22 and CN1 can be written as
$$
E_{CN_{21}}\!\left(U(CR_{T_{m_2}})\right) = \int_{0}^{T_{m_2}} U_{T_{m_2}}\!\left(\frac{2(C_r + C_f)}{t_{f_1} + t_{f_2}}\right) f_{T_{f_2}}(t_{f_2} \mid t_{f_1})\, dt_{f_2} + U_{T_{m_2}}\!\left(\frac{C_r + C_f + C_m}{t_{f_1} + T_{m_2}}\right) f_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1})
$$
$$
E_{CN_{22}}\!\left(U(CR_{T_{m_2}})\right) = \int_{0}^{T_{m_2}} U_{T_{m_2}}\!\left(\frac{C_m + C_r + C_f}{t_{f_2} + T_{m_1}}\right) f_{T_{f_2}}(t_{f_2} \mid t_{f_1} > T_{m_1})\, dt_{f_2} + U_{T_{m_2}}\!\left(\frac{2C_m}{T_{m_1} + T_{m_2}}\right) f_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1} > T_{m_1})
$$
$$
E_{CN_1}\!\left(U(CR_{T_{m_1}})\right) = \int_{0}^{T_{m_1}} E_{CN_{21}}\!\left(U(CR_{T_{m_2}})\right) f_{T_{f_1}}(t_{f_1})\, dt_{f_1} + E_{CN_{22}}\!\left(U(CR_{T_{m_2}})\right) f_{T_{f_1}}(t_{f_1} > T_{m_1}) \tag{4.44}
$$
Substituting the marginal and conditional probabilities given above expands (4.44) into nested integrals over tf1, tf2 and θ.
[Figure: three-dimensional plot with axes Tm1, Tm2 and Tf1.]
As we can see in Figure 4.1, the number of branches of the decision tree grows as 2^n, where n is the number of phases or time periods, which raises issues of computation time. In addition, even for a simple two-phase maintenance optimisation based on dynamic programming, the optimum maintenance time for phase one depends on the subsequent optimum times of phase two, which requires solving nested series of maximisations and integrations over a highly non-linear space that have no analytical forms.
It is worth mentioning that Houlding et al. (2015) proposed a conjugate class of utility functions for sequential decision problems. However, because our utility functions are integrated over different intervals rather than over the whole real line, as that method requires, we cannot simply apply it to our modelling. Hence, a gridding method for sequential decision problems, such as that in Brockwell and Kadane (2003), is considered. One constructs an approximation to the expected cost per unit time by evaluating it at the points of a grid and storing the results for the current phase; one then goes back to the previous phase and computes its expected cost per unit time, again by evaluating at grid points. Once this step is finished, the values for the current phase are no longer needed and the related storage space can be released. In our decision-making process, however, it is necessary to keep the value of the maintenance time Tm for each phase. This process is repeated until one has found the optimal decision (here, the initial maintenance time) for the beginning time point.
To be more precise, we introduce the following notation: select lower and upper bounds b_i^l and b_i^u, with b_i^u > b_i^l, as well as a number of subdivisions n_i, for i = 1, . . . , K, where K is the number of phases. Define grid points
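For the gridding method, the formula in Equation 4.35 can be written in discrete form as:
$$
p_{T_{f_1}}(t_{f_1}) = \sum_{\theta} p_{T_{f_1}}(t_{f_1} \mid \theta)\, p(\theta)
$$
$$
E_{CN_{21}}\!\left(U(CR_{T_{m_2}})\right) = \sum_{t_{f_2} = 0}^{T_{m_2}} U_{T_{m_2}}\!\left(\frac{2(C_f + C_r)}{t_{f_1} + t_{f_2}}\right) \times p_{T_{f_2}}(t_{f_2} \mid t_{f_1}) + U_{T_{m_2}}\!\left(\frac{C_f + C_r + C_m}{t_{f_1} + T_{m_2}}\right) \times p_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1}) \tag{4.45}
$$
$$
E_{CN_{22}}\!\left(U(CR_{T_{m_2}})\right) = \sum_{t_{f_2} = 0}^{T_{m_2}} U_{T_{m_2}}\!\left(\frac{C_m + C_f + C_r}{T_{m_1} + t_{f_2}}\right) \times p_{T_{f_2}}(t_{f_2} \mid t_{f_1} > T_{m_1}) + U_{T_{m_2}}\!\left(\frac{2C_m}{T_{m_1} + T_{m_2}}\right) \times p_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1} > T_{m_1}) \tag{4.46}
$$
where
$$
p_{T_{f_2}}(t_{f_2} \mid t_{f_1}) = \frac{p(t_{f_1}, t_{f_2})}{p_{T_{f_1}}(t_{f_1})} = \frac{\sum_{\theta} p_{T_{f_2}}(t_{f_2} \mid \theta)\, p_{T_{f_1}}(t_{f_1} \mid \theta)\, p(\theta)}{\sum_{\theta} p_{T_{f_1}}(t_{f_1} \mid \theta)\, p(\theta)}
$$
$$
p_{T_{f_2}}(t_{f_2} \mid t_{f_1} > T_{m_1}) = \frac{p(t_{f_1} > T_{m_1}, t_{f_2})}{p_{T_{f_1}}(t_{f_1} > T_{m_1})} = \frac{\sum_{\theta} p_{T_{f_2}}(t_{f_2} \mid \theta)\, p_{T_{f_1}}(t_{f_1} > T_{m_1} \mid \theta)\, p(\theta)}{\sum_{\theta} p_{T_{f_1}}(t_{f_1} > T_{m_1} \mid \theta)\, p(\theta)}
$$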
For our example, the grid for θ runs from 1 to 10 with increment δθ, and the grid for tf1 and tf2 runs from 0.1 to 6.0 with increment δTf = 0.1.
[Figure 4.9 panels: PMF against θ (left) and PMF against Tf1 (right).]
Figure 4.9: Prior distribution of θ (left) and marginal distribution of Tf1 (right) by
the gridding method.
Figure 4.10 and Figure 4.11 show the probabilities of Tf2 conditional on Tf1. As tf1 varies from 0.1 to 6.0, the modes of pTf2(tf2 | tf1) first move from left to right and then move back after tf1 passes a certain threshold node, which demonstrates the dynamics of the conditional probabilities.
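The discrete quantities used by the gridding method can be assembled with a few matrix operations. The R sketch below is one possible construction, not the thesis implementation: the increment `d_theta = 0.1` is an assumption (the text does not state it), and the way each grid cell receives its Weibull probability (a difference of `pweibull` values, renormalised on the truncated grid) is also an assumed discretisation.

```r
d_theta <- 0.1                      # assumed increment for theta (not stated in the text)
d_t     <- 0.1                      # increment for failure times
theta   <- seq(1, 10, by = d_theta)
t_grid  <- seq(0.1, 6, by = d_t)

# Prior mass on the theta grid: N(2, 1) kernel on [1, 10], renormalised.
p_theta <- dnorm(theta - 2)
p_theta <- p_theta / sum(p_theta)

# Assumed discretisation: p(t | theta) is the Weibull probability of the cell
# (t - d_t, t], renormalised over the truncated grid. Rows: t, columns: theta.
lik <- outer(t_grid, theta,
             function(t, th) pweibull(t, shape = th) - pweibull(t - d_t, shape = th))
lik <- sweep(lik, 2, colSums(lik), "/")

p_tf1 <- as.vector(lik %*% p_theta)   # discrete form of (4.35)
joint <- lik %*% (p_theta * t(lik))   # p(tf1, tf2) = sum over theta of p(tf1|th) p(tf2|th) p(th)
cond  <- joint / p_tf1                # row i holds p(tf2 | tf1 = t_grid[i])

# Most probable tf2 for each observed tf1.
mode_tf2 <- t_grid[apply(cond, 1, which.max)]
print(head(data.frame(tf1 = t_grid, mode_tf2 = mode_tf2), 12))
```

Printing `mode_tf2` against `t_grid` gives a direct way to inspect how the mode of the conditional distribution moves with tf1 under these assumptions.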
[Figure 4.10 plot: Density against Tf2, one curve for each value of Tf1 from 0.1 to 6.]
Figure 4.10: Probability of Tf2 conditioning on tf1 , i.e., pTf2 (tf2 | tf1 = i), where
i = 0.1, . . . , 6, by the gridding method.
[Figure 4.11 panels: p(tf1) and p(tf2 | tf1 = i) in each panel, e.g. i = 0.1, 0.5, 1.0.]
Figure 4.11: Comparison of pTf1 (tf1 ) (red) and pTf2 (tf2 | tf1 = i) (green), where i = 0.1,
0.5, 1.0, 1.1, 1.5, 2.0, 2.1, 2.5, 3.0, 3.1, 3.5, 4.0, 4.1, 4.5, 5.0, 5.1, 5.5, 5.9.
Figure 4.13 is the projection of Figure 4.12, in which the numbers alongside the dots are the values of Tf2 that give the maximum probability conditional on the corresponding Tf1. From Figure 4.12 and Figure 4.13, we can see that the threshold node is at pTf2(tf2 = 0.9 | tf1 = 1).
[Figure 4.12 plot: max p(tf2 | tf1) against Tf1.]
Figure 4.12: 3D scatter plot for maximal probability of Tf2 conditioning on tf1 , i.e.,
max pTf2 (tf2 | tf1 ), with vertical lines for each point.
[Figure 4.13 plot: max p(tf2 | tf1) against Tf1, with the maximising Tf2 shown alongside each dot.]
Figure 4.13: Maximal probability of Tf2 conditioning on tf1 , i.e., max pTf2 (tf2 | tf1 ).
Figure 4.14 presents the Tf2 that has maximal probability conditional on tf1, i.e., arg max pTf2(tf2 | tf1); the corresponding probabilities (rounded to 2 decimals) are shown alongside the dots that represent the Tf2 maximising the probability for a specific tf1. It can be seen that a few tf2 have the same maximal probability conditional on different tf1, which is because the gridding method has limited precision. Figure 4.15 shows the tf2 that has maximal probability together with its probability value; for example, for varying Tf1, the maximum conditional probability is about 0.08 when Tf2 = 4.
[Figure 4.14 plot: arg max p(tf2 | tf1) against Tf1.]
Figure 4.14: Tf2 that has maximal probability conditioning on tf1, i.e., arg max pTf2(tf2 | tf1); corresponding probabilities (rounded to 2 decimals) shown alongside dots that represent Tf2 which maximise the probabilities given a specific tf1.
[Figure 4.15 plot: vertical axis max p(tf2 | tf1), with labels shown alongside the dots.]
Figure 4.15: Maximal probability of Tf2 and the corresponding Tf2 ; corresponding tf1
shown alongside dots.
Figure 4.16 and Figure 4.17 illustrate the dynamics of pTf2(tf2 | tf1 > Tm1), which behaves similarly to pTf2(tf2 | tf1).
[Figure 4.16 plot: Density against Tf2, one curve for each value of Tm1.]
Figure 4.16: Probability of Tf2 conditioning on tf1 > Tm1 , i.e., pTf2 (tf2 | tf1 > Tm1 ),
where Tm1 = 0.1, . . . , 6, by the gridding method.
[Figure 4.17 panels: p(tf1) and p(tf2 | tf1 > Tm1) in each panel, e.g. Tm1 = 0.1, 0.5, 1.0.]
Figure 4.17: Comparison of pTf1 (tf1 ) (red) and pTf2 (tf2 | tf1 > Tm1 ) (green), where
Tm1 = 0.1, 0.5, 1.0, 1.1, 1.5, 2.0, 2.1, 2.5, 3.0, 3.1, 3.5, 4.0, 4.1, 4.5, 5.0, 5.1, 5.5, 5.9.
In summary, these graphs demonstrate the dynamic behaviour of the conditional probabilities used to calculate the corresponding cost rates and utilities in Figure 4.1. However, because the gridding method is discrete, only an approximate presentation can be shown.
4.3.5 Pseudocode
Here a general pseudocode for the gridding method is presented as follows, regardless
of failure time distribution assumptions.
1. Pre-defined variables
(d) Generate sequences of possible failure time values for Tf1 and Tf2 and of maintenance time values for Tm1 and Tm2, and denote their lengths by lTf1, lTf2, lTm1 and lTm2.
2. Calculate the conditional probability p(tf2 | tf1)
(a) Define a matrix M of dimension lTf1 × lTf2 for the joint mass distribution of Tf1 and Tf2, where rows are possible Tf1 and columns are possible Tf2.
(b) Calculate the joint probability of Tf1 and Tf2, i.e., p(tf1, tf2), by generating failure probabilities from an arbitrary distribution Df and integrating over the parameter θ.
(c) Sum each row of the matrix p(tf1, tf2), which gives the marginal probability of Tf1, i.e., p(tf1).
(d) Obtain the conditional probability matrix p(tf2 | tf1) as p(tf1, tf2) / p(tf1).
3. Calculate the conditional probability p(tf2 > Tm2 | tf1)
(a) Define a matrix p(tf2 > Tm2 | tf1) with 0 entries, having lTf1 rows (i) and lTm2 columns (j).
(b) For each row, update the entries by the sum of the columns of p(tf2 | tf1) from j + 1 to lTm2 and obtain the joint probability of tf2 > Tm2 and tf1, i.e., p(tf2 > Tm2, tf1).
(c) Obtain the conditional probability matrix p(tf2 > Tm2 | tf1) as p(tf2 > Tm2, tf1) / p(tf1).
4. Calculate the conditional probability p(tf2 | tf1 > Tm1)
(a) Define a vector representing p(tf1 > Tm1), of which the length is lTm1.
(b) Define a matrix for the joint probability of tf1 > Tm1 and tf2, i.e., p(tf1 > Tm1, tf2), of which rows are possible Tm1 and columns are possible Tf2.
(c) For a given Tm1, sum p(tf1) over tf1 > Tm1, which gives p(tf1 > Tm1).
(d) For a given Tm1, sum p(tf1, tf2) over tf1 > Tm1 for each tf2, which gives p(tf2, tf1 > Tm1).
(e) Obtain the conditional probability matrix p(tf2 | tf1 > Tm1) as p(tf2, tf1 > Tm1) / p(tf1 > Tm1).
5. Calculate the conditional probability p(tf2 > Tm2 | tf1 > Tm1)
(a) Define a matrix p(tf2 > Tm2 | tf1 > Tm1) with 0 entries, having (lTm1 − 1) rows (i) and lTm2 columns (j).
(b) For each row, update the entries by the sum of the columns of p(tf2 | tf1 > Tm1) from j + 1 to lTm2 and obtain p(tf2 > Tm2, tf1 > Tm1).
(c) Obtain the conditional matrix p(tf2 > Tm2 | tf1 > Tm1) as p(tf2 > Tm2, tf1 > Tm1) / p(tf1 > Tm1).
6. Calculate the expected utility of cost rate (the expected cost per unit time) at chance nodes CN21 and CN22, i.e., ECN21(U(CRTm2)) and ECN22(U(CRTm2))
(a) E^Upper_CN21 is a utility matrix related to tf1 and tf2 ≤ Tm2, with lTf1 rows and lTm2 columns.
(b) E^Lower_CN21 is a utility matrix related to tf1 and tf2 > Tm2, with lTf1 rows and lTm2 columns.
(c) For each row of matrix E^Upper_CN21 (i.e., given a possible Tf1), each entry is the utility given a possible Tf2, which gives (Cf + Cr) / (tf1 + tf2) × p(tf2 | tf1).
(d) For each row of matrix E^Lower_CN21 (i.e., given a possible Tf1), each entry is the utility given a possible Tm2, which gives (Cf + Cr + Cm) / (tf1 + Tm2) × p(tf2 > Tm2 | tf1).
(e) Sum the two matrices above, i.e., E^Upper_CN21 and E^Lower_CN21, which gives the expected utility of expected cost per unit time at chance node CN21, i.e., ECN21(U(CRTm2)).
(f) Apply the same process to E^Upper_CN22 and E^Lower_CN22 with different cost structures and conditional probability matrices to obtain ECN22(U(CRTm2)).
7. Calculate the expected cost per unit time at chance node CN1, i.e., ECN1(U(CRTm1))
(a) For each row of matrix ECN21(U(CRTm2)), find the maximum utility and the corresponding Tm2, which gives the optimal Tm2 that produces maximum utility, given a Tf1.
(b) For each row of matrix ECN22(U(CRTm2)), find the maximum utility and the corresponding Tm2, which gives the optimal Tm2 that produces maximum utility, given a Tm1.
(c) Sum the maximum utilities above pairwise with their corresponding probabilities, i.e., p(tf1) and p(tf1 > Tm1), which gives ECN1(U(CRTm1)).
(d) Find the optimal maintenance time Tm1 that produces maximum utility from ECN1(U(CRTm1)).
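The steps above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the thesis implementation: the utility is taken to be U(c) = −exp(η c) (the definition in (4.6) is not reproduced in this section), the cost values and η are hypothetical, the θ increment of 0.1 and the cell-probability discretisation are assumed, and one common grid is used for failure and maintenance times. The cost structures follow (4.45) and (4.46).

```r
theta  <- seq(1, 10, by = 0.1)     # assumed theta grid
t_grid <- seq(0.1, 6, by = 0.1)    # common grid for Tf1, Tf2, Tm1, Tm2
n_t    <- length(t_grid)
Cf <- 1; Cr <- 0.5; Cm <- 0.5; eta <- 0.5        # hypothetical costs and risk aversion
U <- function(cost_rate) -exp(eta * cost_rate)   # assumed exponential utility of a cost rate

# Step 2: joint, marginal and conditional mass functions (assumed discretisation).
p_theta <- dnorm(theta - 2); p_theta <- p_theta / sum(p_theta)
lik <- outer(t_grid, theta,
             function(t, th) pweibull(t, shape = th) - pweibull(t - 0.1, shape = th))
lik   <- sweep(lik, 2, colSums(lik), "/")
joint <- lik %*% (p_theta * t(lik))              # p(tf1, tf2)
p_tf1 <- rowSums(joint)                          # p(tf1)
cond  <- joint / p_tf1                           # p(tf2 | tf1)

# Helpers: per-row sums over columns k > j, and per-row cumulative sums.
tail_sum <- function(M) t(apply(M, 1, function(r) rev(cumsum(rev(r))) - r))
cum_rows <- function(M) t(apply(M, 1, cumsum))

# Steps 3-5: survival-type conditional probabilities.
surv_f  <- tail_sum(cond)                        # p(tf2 > Tm2 | tf1)
p_surv1 <- rev(cumsum(rev(p_tf1))) - p_tf1       # p(tf1 > Tm1), indexed by Tm1
joint_s <- apply(joint, 2, function(col) rev(cumsum(rev(col))) - col)  # p(tf1 > Tm1, tf2)
keep    <- seq_len(n_t - 1)                      # drop the last Tm1, where p(tf1 > Tm1) = 0
cond_s  <- joint_s[keep, ] / p_surv1[keep]       # p(tf2 | tf1 > Tm1)
surv_s  <- tail_sum(cond_s)                      # p(tf2 > Tm2 | tf1 > Tm1)

# Step 6: expected utilities at CN21 (rows tf1) and CN22 (rows Tm1); columns Tm2.
tsum <- outer(t_grid, t_grid, "+")
E21 <- cum_rows(U(2 * (Cf + Cr) / tsum) * cond) + U((Cf + Cr + Cm) / tsum) * surv_f
E22 <- cum_rows(U((Cm + Cf + Cr) / tsum[keep, ]) * cond_s) +
  U(2 * Cm / tsum[keep, ]) * surv_s

# Step 7: optimise Tm2 on each branch, then combine the branches and optimise Tm1.
best21 <- apply(E21, 1, max)
best22 <- apply(E22, 1, max)
E1 <- cumsum(best21 * p_tf1)[keep] + best22 * p_surv1[keep]
Tm1_opt      <- t_grid[which.max(E1)]
Tm2_opt_fail <- t_grid[apply(E21, 1, which.max)]   # optimal Tm2 given an observed tf1
Tm2_opt_surv <- t_grid[apply(E22, 1, which.max)]   # optimal Tm2 given survival past Tm1
print(c(Tm1_opt = Tm1_opt, expected_utility = max(E1)))
```

Because the costs, η and the discretisation are placeholders, the printed optimum is not expected to match the values reported in the thesis tables.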
within the dynamic programming framework. However, its solution would not be
globally optimal since it ignores possible learning.
From Figure 4.18 we can see that the expected utility at chance node CN21 decreases, because the branch assumption tf1 ≤ Tm1 implies that more cost is incurred, while the expected utility at chance node CN22 increases under the branch assumption tf1 > Tm1. By looking for the maximum of the expected utility at chance node CN1, we find the corresponding maintenance time, which is the optimal maintenance time Tm1 for CN1.
[Figure 4.18 plot: Expected Utility against Tm1 for chance nodes CN21, CN22 and CN1.]
Figure 4.18: Expected utilities for two-phase systems at chance nodes CN21 , CN22
and CN1 under the dynamic programming method.
From Table 4.1 we can see that the optimal maintenance time Tm1 at chance node CN1 is 0.5 under the dynamic programming method, which is larger than the 0.3 obtained by the myopic method, while the expected utility is higher. This can be explained as follows: by considering all possible future decisions and previous information, one can reduce the cost spent on system maintenance, which in other words improves its utility. For the second phase, if no failure is observed before the maintenance at local time Tm1 = 0.5 in the first phase, then the maintenance time for the second phase should be at local time 0.3. The explanation is that early maintenance decisions can be riskier so as to gain information, whilst later decisions (here the last decisions) have little or no use for gaining extra information because the system considered here has only two phases.
Table 4.1: Optimal corrective maintenance (CM) time and corresponding expected cost for each chance node based on dynamic programming and myopic methods. Bracketed figures are the failure times Tf1 corresponding to Tm∗1 and Tm∗2.
As we can see in Figure 4.19 and Table 4.2, for very low risk aversion, in other words when the risk aversion parameter η → 0, maximising the expected exponential utility of the cost rate gives the same maintenance policy, i.e., the same maintenance times, as minimising the expected cost per unit time (cost rate).
[Figure 4.19 plot: Expected Cost against Tm1 for chance nodes CN21, CN22 and CN1.]
Figure 4.19: Expected costs for two-phase systems at chance nodes CN21 , CN22 and
CN1 under the dynamic programming method.
Table 4.2: Optimum Perfect Preventive Maintenance (PPM) time through different optimisation objectives, i.e., expected cost and expected utility, for each chance node based on the dynamic programming method. Bracketed figures are the failure times Tf1 corresponding to Tm1 and Tm2.
To investigate the impact of the risk aversion parameter η on the decision about the optimal maintenance time, η is increased gradually and the corresponding optimal maintenance time and expected utility for chance node CN1 of the two-phase repairable system are compared. As shown in Table 4.3, as the decision maker becomes more risk averse (i.e., larger η), the system should be preventively maintained earlier and the expected utility decreases accordingly. Figure 4.20 shows this decrease; in the right graph the expected utility is transformed via an exponential function exp(·).
Table 4.3: Optimal perfect preventive maintenance (PPM) time and corresponding
expected utility for chance node CN1 by dynamic programming conditioning on various
risk aversion parameter η of an exponential utility function.
[Figure 4.20 panels: Utility against η and Exponential Utility against η, for η from 0 to 1.]
According to the results obtained via the dynamic programming method, one can determine the optimal maintenance times for each phase of the two-phase system by referring to Figure 4.21. For example, one sets the maintenance time for phase 1 to 0.5; if a failure occurs at time 0.2, the maintenance time for phase 2 is chosen as 0.8 in local time, i.e., 1.0 (= 0.2 + 0.8) in global time; if no failure happens before 0.5, the maintenance time for phase 2 is 0.3 in local time, i.e., 0.8 (= 0.5 + 0.3) in global time.
[Figure 4.21: recovered label "Phase 1, Tm1 = 0.5".]
Figure 4.22 shows the prior and posterior probability distributions of parameter
θ depending on the observed failure Tf1 and conducted preventive maintenance at
Tm1 = 0.5 and indicates that the posterior probability distribution of θ is more right-
skewed and has higher mode if failure occurs earlier in the first phase.
[Figure 4.22 plot: Density against θ for p(θ), p(θ | Tf1 = 0.1), p(θ | Tf1 = 0.2), p(θ | Tf1 = 0.3), p(θ | Tf1 = 0.4), p(θ | Tf1 = 0.5) and p(θ | Tf1 > 0.5).]
Figure 4.22: Posterior probability density of θ conditioning on Tf1 (≤ 0.5) and Tf1 >
0.5 compared with prior probability density of θ.
4.4.2 Simulation
ECR_CN1: µCR
N         Simulated Expected Cost Rate SE    Variance σ² = E[(SE − µCR)²]
1000      2.430923                           1.53014e-05
5000      2.596228                           3.262069e-07
20000     2.517805                           6.777306e-08
50000     2.540612                           3.925825e-09
100000    2.548292                           4.006983e-10
As can be seen in Table 4.4, the variance of the simulated expected cost rate becomes
smaller with the increasing number of simulations, N . In other words, the expected
cost rate can be approximated via a large number of simulations.
Related R code is also attached as follows.
library(truncnorm)  # provides rtruncnorm()

# NOTE: the optimal maintenance times from the dynamic programming step
# (cn1.tm1.opt, cn21.tm2.opt, cn22.tm2.opt) and the cost parameters
# (Cf, Cr, Cm, alpha) must already be defined before running this script.

n <- 1000  # number of simulated theta values
sim.theta <- rtruncnorm(n, a = 1, b = Inf, mean = 2, sd = 1)

# generate tf1 and tf2, which follow a Weibull distribution given shape parameter theta;
# draws that round to 0 on the 0.1 grid are rejected and redrawn
tf1 <- numeric(n)
tf2 <- numeric(n)
k <- 1
while (k <= n) {
  x <- rweibull(1, shape = sim.theta[k], scale = 1)
  y <- rweibull(1, shape = sim.theta[k], scale = 1)
  # alternative failure model: rgamma(1, shape = sim.theta[k], scale = 1)
  if (round(x, 1) != 0 && round(y, 1) != 0) {
    tf1[k] <- x
    tf2[k] <- y
    k <- k + 1
  }
}

# compute simulated cost for chance node CN1
cost.cn <- numeric(n)
for (i in 1:n) {
  t1 <- round(tf1[i], 1)  # failure times rounded to the grid
  t2 <- round(tf2[i], 1)
  if (t1 <= cn1.tm1.opt) {
    # failure in phase 1: look up the optimal Tm2 for this tf1 (integer index)
    tm2 <- cn21.tm2.opt[round(t1 * 10)]
    if (t2 <= tm2) {
      cost.cn[i] <- (1 + alpha) * (Cf + Cr) / (t1 + t2)
    } else {
      cost.cn[i] <- (Cf + Cr + alpha * Cm) / (t1 + tm2)
    }
  } else {
    # no failure before Tm1: preventive maintenance in phase 1
    if (t2 <= cn22.tm2.opt) {
      cost.cn[i] <- (Cm + alpha * (Cf + Cr)) / (cn1.tm1.opt + t2)
    } else {
      cost.cn[i] <- (1 + alpha) * Cm / (cn1.tm1.opt + cn22.tm2.opt)
    }
  }
}

# compute simulated expected cost for chance node CN1
mean(cost.cn)
Chapter 5
conducted, i.e., pTf2 (tf2 | tf1 > Tm1 ) > pTf2 (tf2 | tf1 ).
According to a review by Nakagawa (2012), existing imperfect preventive maintenance models can be categorised as age reduction models, hazard rate reduction models, and hybrid models of both. In age reduction models, the virtual age of a system is reduced from t to t − δ after imperfect preventive maintenance, i.e., the hazard function changes from h(t) to h(t − δ), whereas in hazard rate reduction models the hazard function changes from h(t) to τh(t) (τ > 1) after imperfect preventive maintenance. The hybrid method combines the two, changing the hazard function from h(t) to τh(t − δ). In this section, a novel method based on manipulating the probability matrix pTf2(tf2 | tf1 > Tm1) is proposed.
Based on the original conditional failure probability matrix pTf2(tf2 | tf1) from the gridding method, the conditional failure time matrix pTf2(tf2 | tf1 > Tm1) for perfect preventive maintenance, M, is obtained as in §4.3.5. The imperfect preventive maintenance probability manipulation is then conducted on this matrix: for each row, the last β percent of the values (β is arbitrary) are set to zero and their mass is reassigned proportionally to the remaining elements, which creates a new matrix M^new. This can be regarded as a way of expressing the assumption that the failure rate increases after maintenance. For simplicity, an example of the matrix M with entries Mij = pTf2(Tf2 = t^j_f2 | tf1 > t^i_m1) is
$$
M = \begin{pmatrix}
0.943 & 0.048 & 0.006 & 0.003 \\
0.800 & 0.151 & 0.038 & 0.011 \\
0.653 & 0.232 & 0.084 & 0.031 \\
0.644 & 0.237 & 0.087 & 0.032
\end{pmatrix}
$$
where t^j_f2 is the j-th possible value of Tf2 and t^i_m1 is the i-th possible value of Tm1; each row sums to 1.
Now, for each row, the last entries 0.003, 0.011, 0.031 and 0.032 are replaced with 0 and added proportionally to the other entries of that row. For example, the entries of the first row become
$$
0.943 + 0.003 \times \frac{0.943}{0.943 + 0.048 + 0.006}, \quad
0.048 + 0.003 \times \frac{0.048}{0.943 + 0.048 + 0.006}, \quad
0.006 + 0.003 \times \frac{0.006}{0.943 + 0.048 + 0.006},
$$
which gives the first row of the matrix M^new,
$$
M^{new} = \begin{pmatrix}
0.9458 & 0.0481 & 0.0060 & 0.0000 \\
0.8089 & 0.1527 & 0.0384 & 0.0000 \\
0.6739 & 0.2394 & 0.0867 & 0.0000 \\
0.6653 & 0.2448 & 0.0899 & 0.0000
\end{pmatrix}
$$
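The reassignment can be written as a small R function. The sketch below reproduces the example matrix; the function name `reassign_tail` and its argument `n_drop` (the number of trailing columns whose mass is moved) are hypothetical, and how a given PM power β maps to a number of columns is left as an assumption.

```r
# Example matrix M: rows index Tm1, columns index Tf2, each row sums to 1.
M <- matrix(c(0.943, 0.048, 0.006, 0.003,
              0.800, 0.151, 0.038, 0.011,
              0.653, 0.232, 0.084, 0.031,
              0.644, 0.237, 0.087, 0.032),
            nrow = 4, byrow = TRUE)

# Set the last n_drop entries of every row to zero and spread their mass over
# the remaining entries in proportion to those entries.
reassign_tail <- function(M, n_drop = 1) {
  kept      <- seq_len(ncol(M) - n_drop)
  moved     <- rowSums(M[, -kept, drop = FALSE])   # mass removed from each row
  kept_mass <- rowSums(M[, kept, drop = FALSE])    # mass remaining in each row
  M_new <- M
  M_new[, kept]  <- M[, kept, drop = FALSE] * (1 + moved / kept_mass)
  M_new[, -kept] <- 0
  M_new
}

M_new <- reassign_tail(M, n_drop = 1)
print(round(M_new, 4))
```

Since each row is only rescaled, the row sums are preserved, and more probability mass sits on the early failure times than before.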
[Figure: Density against Failure Time for p(tf1), p(tf2 | tf1 > Tm1 = 0.5) (PPM) and p(tf2 | tf1 > Tm1 = 0.5) (IPM).]
As we can see in Figure 5.2, through the manipulation of the conditional probabilities, the system is more likely to fail early in the second phase if it is imperfectly preventively maintained at Tm1 = 0.5.
[Figure: Density against Failure Time for p(tf1), p(tf2 | tf1 = 0.1), p(tf2 | tf1 = 0.2), p(tf2 | tf1 = 0.3), p(tf2 | tf1 = 0.4) and p(tf2 | tf1 > Tm1 = 0.5).]
Table 5.1 compares the optimal maintenance times obtained by the dynamic programming and myopic methods under imperfect preventive maintenance modelling for various values of the preventive maintenance (PM) power β. Imperfect preventive maintenance with power β = 0 is equivalent to perfect preventive maintenance. From Table 5.1, we can see that the expected cost rate increases with the PM power β, because the system is more likely to fail under imperfect preventive maintenance than under perfect preventive maintenance. It may be generally concluded that, under the imperfect preventive maintenance policy, systems tend to be maintained later (Tm1 = 0.6 when β = 55 vs Tm1 = 0.5 when β = 0), while the expected cost per unit time increases because the system has a high probability of failing early and is less reliable. More interestingly, one should maintain systems later precisely because, once maintained, the system is less reliable under the imperfect preventive maintenance policy. For example, the expected cost rate increases by 11.65% under the imperfect preventive maintenance policy with PM power β = 55 compared with its counterpart under the perfect preventive maintenance policy (β = 0).
Table 5.1: Optimal Imperfect Preventive Maintenance (IPM) time and corresponding
expected cost for chance nodes CN1 and CN22 conditioning on various PM power
parameter β based on dynamic programming and myopic methods.
5.2 Time Value of Money
In finance, the net present value (NPV) is defined as the sum of the present values of incoming and outgoing cash flows over a period of time. Following this idea, we can treat payoff and cost as the incoming and outgoing cash flows of the maintenance system, respectively; in other words, because of discounting, a payoff or cost now is more valuable than an identical payoff or cost in the future.
Discounting of future costs is widely used in finance, where NPV is used as a criterion for assessing alternative policies. In our maintenance problem, if a cost induced by a maintenance policy can be delayed for a time δt, this part of the cost budget can be invested for that period and earn interest at rate r; alternatively, one does not need to borrow that amount of money for a period δt at rate r [footnote 2]. This time value of money effect can be modelled by discounting a cost incurred at time δt ahead by the factor exp(−rδt), which corresponds to continuous compounding of interest. So we can re-define the cost rate function as
$$
CR(t) = \frac{f(\text{cost to time } t)}{t} = \frac{\sum_{i} \exp(-r t_i)\,\text{cost}_i}{t} \tag{5.1}
$$
where ti is the time at which cost i occurs, r is the inflation rate or factor for costs and t is the system operating time. Here it is assumed that it becomes less expensive to be maintained or to fail in the future, and that the utility function meets the requirement of risk aversion in this case. In other words, one can replace the previous cost Ci (i is the phase number) in Chapter 4 by exp(−rti)Ci, where ti is the local time at which cost Ci occurs.
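A minimal R sketch of the discounted cost rate (5.1) is given below. The function name `discounted_cost_rate`, its argument names and the numerical cost values are illustrative assumptions only.

```r
# Discounted cost rate (5.1): each cost is discounted by exp(-r * t_i) and the
# total is divided by the operating time (by default the time of the last cost).
discounted_cost_rate <- function(cost, time, r, horizon = max(time)) {
  stopifnot(length(cost) == length(time), horizon > 0)
  sum(exp(-r * time) * cost) / horizon
}

# Hypothetical two-phase history: a failure at time 0.2, then preventive
# maintenance at time 1.0 (the cost values are placeholders).
Cf <- 1; Cr <- 0.5; Cm <- 0.5
undiscounted <- discounted_cost_rate(c(Cf + Cr, Cm), time = c(0.2, 1.0), r = 0)
discounted   <- discounted_cost_rate(c(Cf + Cr, Cm), time = c(0.2, 1.0), r = 0.1)
print(c(undiscounted = undiscounted, discounted = discounted))
```

With r = 0 the function reduces to the undiscounted cost rate of Chapter 4, and any r > 0 lowers the rate because later costs weigh less.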
In the maintenance decision-making problem, implementing a discount factor in the utility modelling allows the maintenance model to depict reality more accurately, because the effect of time is taken into account in the modelling procedure.
The effect of the time factor on the optimisation of perfect preventive maintenance is explored with the time effect parameter r varying between 0 and 1.
[Footnote 2] r is equivalent to the rate of return in finance.
Time Effect Parameter r    Maintenance Time    Expected Cost Rate    Cost Rate Change
r = 1.0                    0.5                 2.55                  -
r = 0.9                    0.5                 2.43                  4.96%
r = 0.8                    0.4                 2.29                  5.55%
r = 0.7                    0.4                 2.15                  6.36%
r = 0.6                    0.4                 2.00                  6.89%
r = 0.5                    0.4                 1.85                  7.50%
r = 0.4                    0.3                 1.69                  8.53%
r = 0.3                    0.3                 1.51                  10.71%
r = 0.2                    0.3                 1.31                  13.01%
r = 0.1                    0.2                 1.07                  18.25%
Table 5.2: Optimal Perfect Preventive Maintenance (PPM) time and corresponding expected cost rate for chance node CN1 conditioning on various time effect parameter r based on dynamic programming. The last column indicates the cost rate comparison to its previous one.
From Table 5.2 we can conclude generally that, as the time effect becomes more dominant (r → 0), systems tend to be maintained earlier because it would be much more expensive to conduct maintenance in the future, which meets our expectation. We may also notice that the expected cost rate decreases significantly as maintenance actions are conducted earlier. For example, when r changes from 0.9 to 0.8, maintenance is scheduled 0.1 time units earlier and the expected cost rate decreases by 5.55%; when r changes from 0.2 to 0.1, the maintenance time is again brought forward by 0.1 time units, but the expected cost rate decreases more markedly, by 18.25%. These results imply the crucial role that the time value of money plays in maintenance modelling.
The models and results presented in this section suggest that the consideration of NPV is crucial for preventive maintenance optimisation in practice, because money is usually borrowed from banks to carry out maintenance. Therefore, a gain in NPV could be achieved by deferring the maintenance cost to the future, which should be taken into account in the maintenance optimisation process.
5.3 Maintenance in Discrete Time
In the survival analysis of repairable systems' maintenance, the time to failure is not always observed in a continuous time setting. For instance, in practice the tyres of fighter aircraft are preventively replaced after about 4 to 14 flights (Nakagawa, 2012). In some situations, the lifetime of a system is recorded as the number of cycles for which it has worked, so no calendar or clock is involved; e.g., the failure time data of a toy manufacturing system may be collected each manufacturing cycle. In other cases, lifetimes are not recorded at the exact clock time but are observed monthly, seasonally or yearly, for example. Thus it is interesting and worthwhile to consider system maintenance in a discrete time setting.
Consider the time over an indefinitely long cycle n (n = 1, 2, . . .) that a single unit should be operating. A unit is replaced at cycle N (N = 1, 2, . . .) after its installation or at failure, whichever occurs first. Let {Pn}, n = 1, 2, . . ., denote the discrete failure distribution, i.e., the probability that a unit fails at cycle n. Cost (Cf + Cr) is incurred for a failed system that is replaced, and cost Cm (< Cf + Cr) is incurred for a non-failed system that is preventively maintained. Then, the expected cost rate for a one-phase system is given by
$$
C(T_m^N) = \frac{(C_f + C_r)\sum_{j=1}^{T_m^N} P_j + C_m \sum_{j=T_m^N + 1}^{\infty} P_j}{\sum_{j=1}^{T_m^N}\sum_{i=j}^{\infty} P_i} \qquad (T_m^N = 1, 2, \ldots) \tag{5.2}
$$
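Equation (5.2) translates directly into a few lines of R. In the sketch below the function name `cost_rate_discrete`, the cost values and the geometric failure distribution used as a test case are all illustrative assumptions; the geometric case is chosen only because its cost rate has a simple closed form to compare against.

```r
# Expected cost rate (5.2) when preventive maintenance is scheduled at cycle N.
# pmf[n] = P_n for n = 1, 2, ..., truncated where the remaining tail is negligible.
cost_rate_discrete <- function(N, pmf, c_fail, c_pm) {
  stopifnot(N >= 1, N <= length(pmf))
  p_fail_by_N <- sum(pmf[1:N])          # sum_{j <= N} P_j
  surv_ge <- rev(cumsum(rev(pmf)))      # surv_ge[j] = sum_{i >= j} P_i
  (c_fail * p_fail_by_N + c_pm * (1 - p_fail_by_N)) / sum(surv_ge[1:N])
}

# Hypothetical example: geometric failures with per-cycle failure probability p.
p <- 0.2
pmf_geom <- p * (1 - p)^(0:499)         # P_n for n = 1, ..., 500
rates <- sapply(1:10, cost_rate_discrete, pmf = pmf_geom, c_fail = 1.5, c_pm = 0.5)
print(round(rates, 4))
```

For this constant-hazard example the cost rate keeps falling as N grows, so preventive maintenance brings no benefit; a finite optimal N requires an increasing hazard.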
Let
$$
h_n \equiv \frac{P_n}{\sum_{j=n}^{\infty} P_j} \qquad (n = 1, 2, \ldots)
$$
be the hazard rate of the discrete Weibull distribution.
The probability mass function for four different parameter settings (p = 0.01, 0.05, 0.1, 0.5, with θ = 2) is illustrated below.
[Figure 5.3 panels: four plots of PMF against tf.]
From Figure 5.3, we can see that discrete Weibull failures are more likely to happen
for the system if p is smaller.
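The cumulative distribution on the support of Tf is
\[
P(T_f \le t_f) = 1 - (1-p)^{(t_f+1)^{\theta}}, \qquad t_f = 0, 1, 2, \ldots, \tag{5.4}
\]
the survival function is
\[
S(t_f) = P(T_f \ge t_f) = (1-p)^{(t_f)^{\theta}}, \qquad t_f = 0, 1, 2, \ldots, \tag{5.5}
\]
and the hazard function, with $P(t_f)$ denoting the probability mass at $t_f$, is
\[
h(t_f) = \frac{P(t_f)}{S(t_f)} = 1 - (1-p)^{(t_f+1)^{\theta} - (t_f)^{\theta}}, \tag{5.6}
\]
with derivative
\[
\frac{\partial h(t_f)}{\partial t_f} = -(1-p)^{-(t_f)^{\theta} + (1+t_f)^{\theta}}
\left\{ -\theta (t_f)^{-1+\theta} + \theta (1+t_f)^{-1+\theta} \right\} \log(1-p). \tag{5.7}
\]
It is easy to see that (5.7) > 0 if and only if θ > 1, which represents an
increasing hazard function.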
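The hazard behaviour described above can be checked numerically. The following is a minimal sketch, assuming that (1 − p)^(t^θ) is the probability of surviving up to cycle t (the reading under which the ratio in (5.6) holds); the function names and the values p = 0.1, θ = 2 are illustrative choices, not taken from the thesis code.

```python
import math

def survival(t, p, theta):
    """Probability of surviving up to cycle t: (1 - p) ** (t ** theta)."""
    return (1.0 - p) ** (t ** theta)

def pmf(t, p, theta):
    """Probability mass at cycle t, as the drop in survival over one cycle."""
    return survival(t, p, theta) - survival(t + 1, p, theta)

def hazard(t, p, theta):
    """Closed-form hazard 1 - (1 - p) ** ((t + 1) ** theta - t ** theta)."""
    return 1.0 - (1.0 - p) ** ((t + 1) ** theta - t ** theta)

# Illustrative (assumed) parameter values.
p, theta = 0.1, 2.0
hazards = [hazard(t, p, theta) for t in range(10)]
total_mass = sum(pmf(t, p, theta) for t in range(200))

print("hazard for theta = 2:", [round(h, 3) for h in hazards])
print("hazard for theta = 1:", [round(hazard(t, p, 1.0), 3) for t in range(5)])
print("total probability mass:", total_mass)
```

With θ = 1 the hazard stays at p for every cycle, which is the geometric (memoryless) special case; with θ > 1 it rises from cycle to cycle.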
We can apply the dynamic programming method to a two phase system following
a discrete Weibull failure time distribution with the same assumptions concerning the
other parameters in Chapter 4. Corresponding results are shown in Table 5.3.
Table 5.3: Optimal perfect preventive maintenance (PPM) time and corresponding
expected cost for chance node CN1 by dynamic programming conditioning on various
parameter p of a discrete Weibull failure distribution.
As we can see in Table 5.3, with the parameter p increasing, which means the
system is less likely to fail following a discrete Weibull distribution, the expected cost
rate is also increasing despite having the same maintenance time of 1 for all cases. This
means that under the same maintenance time schedule, one can reduce the expected
cost rate significantly if the system is more likely to have a discrete Weibull failure
time. Maintenance time for the second phase can be achieved via the same procedure
proposed in Chapter 4.
This section is an elementary exploration of discrete failure probability distributions;
one can also consider other discrete distributions in practice.
5.4 Maintenance for Parallel Systems
Systems studied so far have only one unit. Now let us consider a parallel redundant
system that consists of N (N ≥ 2) identical units and the system fails when all its units
fail. Assume each unit has a failure distribution F (t) with finite mean µ.
Suppose that a one phase system is replaced at system failure or at planned time
Tm (0 < Tm < ∞), whichever occurs first. Then, we have the expected cost per unit time
for a one phase system as
\[
C(T_m; N) = \int_0^{T_m} \frac{C_f + C_r}{t}\, dF^N(t) + \int_{T_m}^{\infty} \frac{C_m}{T_m}\, dF^N(t) \tag{5.8}
\]
\[
\phantom{C(T_m; N)} = \int_0^{T_m} \frac{C_f + C_r}{T_f} \times f^N(t_f)\, dt_f + \frac{C_m}{T_m} \times f^N(t_f > t_m) \tag{5.9}
\]
where Cf + Cr is the cost of replacement (failure cost and repair cost) at system failure,
Cm is the cost of perfect preventive maintenance at scheduled maintenance time Tm
with Cm < Cf + Cr .
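To make the effect of redundancy concrete, the one phase cost rate can be approximated on a grid. This is a rough sketch under stated assumptions: a Weibull unit lifetime with fixed shape θ = 2 and scale 1 (no prior on θ), illustrative costs Cf + Cr = 3 and Cm = 0.5, failures costed at the right end of each grid cell, and a hypothetical helper name `cost_rate`.

```python
import math

def weibull_cdf(t, theta, kappa=1.0):
    """Failure distribution F(t) of a single unit."""
    return 1.0 - math.exp(-((t / kappa) ** theta))

def cost_rate(t_m, n_units, theta=2.0, c_fail=3.0, c_pm=0.5, delta=0.01):
    """Grid approximation of the one phase cost rate for n parallel units."""
    steps = round(t_m / delta)
    total = 0.0
    for k in range(1, steps + 1):
        lo, hi = (k - 1) * delta, k * delta
        # The system fails only when all units have failed: its CDF is F(t) ** n.
        mass = weibull_cdf(hi, theta) ** n_units - weibull_cdf(lo, theta) ** n_units
        total += c_fail / hi * mass          # failure in (lo, hi], costed at hi
    survive = 1.0 - weibull_cdf(steps * delta, theta) ** n_units
    return total + c_pm / t_m * survive      # preventive maintenance at t_m

one_unit = cost_rate(0.5, 1)
two_units = cost_rate(0.5, 2)
print("one unit :", round(one_unit, 3))
print("two units:", round(two_units, 3))
```

Under these assumptions the two unit system has the lower cost rate at the same maintenance time, because an early system failure requires both units to fail.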
For example, suppose one wishes to determine the optimal preventive maintenance
time for a two unit parallel redundant system with two phases, in which the two units
are identical and follow a Weibull distribution with scale parameter 1 and shape
parameter θ, where θ has a normal prior truncated at 1 with mean 2 and standard
deviation 1. These are the same assumptions as for the two-phase maintenance
models in Chapter 4. The decision tree is the same as in Figure 4.1 (page 56).
Based on the gridding method, the expected cost rates for chance nodes CN21 and
CN22 are expressed as
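\[
E_{CN_{21}}(CR_{T_{m_2}}; N)
= \sum_{t_{f_2}=0}^{T_{m_2}} \frac{2(C_f + C_r)}{t_{f_1} + t_{f_2}} \times p^{N}_{T_{f_2}}(t_{f_2} \mid t_{f_1})
+ \frac{C_f + C_r + C_m}{t_{f_1} + T_{m_2}} \times p^{N}_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1}), \tag{5.10}
\]
\[
E_{CN_{22}}(CR_{T_{m_2}}; N)
= \sum_{t_{f_2}=0}^{T_{m_2}} \frac{C_m + C_f + C_r}{T_{m_1} + t_{f_2}} \times p^{N}_{T_{f_2}}(t_{f_2} \mid t_{f_1} > T_{m_1})
+ \frac{2C_m}{T_{m_1} + T_{m_2}} \times p^{N}_{T_{f_2}}(t_{f_2} > T_{m_2} \mid t_{f_1} > T_{m_1}), \tag{5.11}
\]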
where N is the number of parallel units of systems.
As can be seen from (5.10) and (5.11), the conditional probability matrices for the two
unit system are obtained via component-wise matrix multiplication. Thus, one can
apply the dynamic programming method to the optimal preventive maintenance
problem for two unit (and, again via component-wise multiplication, multi-unit)
redundant parallel systems.
Table 5.4: Optimal perfect preventive maintenance (PPM) time and corresponding
expected cost rate for each chance node by dynamic programming for one-unit systems
and two-unit redundant parallel systems. Bracketed figures are failure time Tf1 with
respect to Tm1 and Tm2 , numbers in brackets representing corresponding failure times.
Table 5.4 shows that the expected cost per unit time for the two unit redundant
parallel system is 0.57, which is 77.85% cheaper than that for the one unit system
(2.55). Hence, one could consider a two unit redundant system if its implementing
cost rate is less than 1.96. In other words, one can take the expected cost of 1.96 as a
cut-off to determine if a parallel redundant system should be implemented in this case.
5.5 PPM under failure time distribution assump-
tions
Here we simply compare the failure distribution assumption’s effect on system perfect
preventive maintenance optimisation.
For example, we assume the system failure time follows Weibull and gamma dis-
tribution, respectively. These two distributions have the same shape parameter θ
(denoted as θW and θG , respectively) that has a normal distribution prior with mean
2 and standard deviation 1 truncated at 1. For a legitimate comparison, it is assumed
that the failure times TfW and TfG from two distributions have the same expectation.
Formally,
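\[
\begin{aligned}
E(T_f^W) &= E(T_f^G), \\
\kappa_W \Gamma\left(1 + \frac{1}{\theta_W}\right) &= \kappa_G \theta_G, \\
E\left\{\kappa_W \Gamma\left(1 + \frac{1}{\theta_W}\right)\right\} &= E(\kappa_G \theta_G), \\
\kappa_W E\left\{\Gamma\left(1 + \frac{1}{\theta_W}\right)\right\} &= \kappa_G E(\theta_G), \\
\kappa_W &= \kappa_G \frac{E(\theta_G)}{E\left\{\Gamma\left(1 + \frac{1}{\theta_W}\right)\right\}},
\end{aligned}
\]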
where κW and κG are the scale parameters for Weibull and gamma distributions, re-
spectively.
κW is obtained through the following steps:
1. $E(\theta_G) = 2 + \frac{\phi(1)}{\Phi(1)} = 2.2876$ ((4.33), page 66);
2. Generate a large number N of random draws for θW, denoted as $(\theta_W^s)_i$, i = 1, · · · , N,
from a normal distribution truncated at 1 with mean 2 and standard deviation 1
(κG = 1), which is the same distribution as that of θG;
3. Apply the gamma function Γ(·) to each $1 + \frac{1}{(\theta_W^s)_i}$, i = 1, · · · , N;
4. Compute the mean of $\Gamma\left(1 + \frac{1}{(\theta_W^s)_i}\right)$, i = 1, · · · , N, which is 0.8993309 and
replaces $E\left\{\Gamma\left(1 + \frac{1}{\theta_W}\right)\right\}$;
5. κW is obtained as 2.543669.
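The five steps above can be sketched in a few lines of standard-library Python. The rejection sampler, the seed and the number of draws are assumptions made for illustration; a Monte Carlo run should land close to, but not exactly on, the values reported in steps 4 and 5.

```python
import math
import random

def truncated_normal(rng, mean=2.0, sd=1.0, lower=1.0):
    """Rejection sampler for a normal distribution truncated below at `lower`."""
    while True:
        draw = rng.gauss(mean, sd)
        if draw >= lower:
            return draw

def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

rng = random.Random(2016)   # arbitrary fixed seed (assumption)
n_draws = 200_000           # "a large number" of draws (assumption)
kappa_g = 1.0

# Step 1: closed-form mean of the truncated normal prior.
mean_theta_g = 2.0 + std_normal_pdf(1.0) / std_normal_cdf(1.0)

# Steps 2 to 4: average Gamma(1 + 1/theta) over draws from the same prior.
mean_gamma_term = sum(
    math.gamma(1.0 + 1.0 / truncated_normal(rng)) for _ in range(n_draws)
) / n_draws

# Step 5: scale parameter that equalises the two expected failure times.
kappa_w = kappa_g * mean_theta_g / mean_gamma_term

print("E(theta_G)            :", round(mean_theta_g, 4))
print("mean Gamma(1 + 1/theta):", round(mean_gamma_term, 4))
print("kappa_W               :", round(kappa_w, 4))
```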
We can then apply our gridding method to obtain the marginal probability of
failure time Tf from the Weibull distribution with shape parameter θW and scale parameter
κW = 2.543669 and from the gamma distribution with shape parameter θG and scale parameter
κG = 1. As shown in Figure 5.4, systems under the gamma distribution assumption
are more likely to fail earlier than those under the Weibull distribution assumption,
given the same expected failure time for the two distributions.
[Figure 5.4 plot: Density against Failure Time, with one curve for Weibull and one for gamma.]
Figure 5.4: Marginal Weibull and gamma probabilities of Tf1 , pW (tf1 ) and pG (tf1 )
with same expectation, i.e., E(TfW ) = E(TfG ).
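We can see the difference in effect by looking at the probability density functions
of the two distributions:
\[
f^{W}_{T_f}(t_f \mid \kappa_W, \theta_W)
= \frac{\theta_W}{\kappa_W}\left(\frac{t_f}{\kappa_W}\right)^{\theta_W - 1} \exp\left\{-\left(\frac{t_f}{\kappa_W}\right)^{\theta_W}\right\}
\propto (t_f)^{\theta_W - 1} \exp\left\{-\left(\frac{t_f}{\kappa_W}\right)^{\theta_W}\right\},
\]
\[
f^{G}_{T_f}(t_f \mid \kappa_G, \theta_G)
= \frac{(t_f)^{\theta_G - 1}}{(\kappa_G)^{\theta_G}\, \Gamma(\theta_G)} \exp\left(-\frac{t_f}{\kappa_G}\right)
\propto (t_f)^{\theta_G - 1} \exp\left(-\frac{t_f}{\kappa_G}\right).
\]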
By ignoring all the normalising constants, we can see that the probability density
function of the Weibull distribution drops off much more quickly for shape parameter
θW > 1 than the gamma distribution. In the case where θW = θG = 1, they both
reduce to the exponential distribution. More importantly, the hazard increases for the
Weibull distribution but tends to be a constant for the gamma distribution.
A comparison of the maintenance times and expected cost rates obtained under the
different failure distribution assumptions is presented in Table 5.5.
Table 5.5: Optimal Perfect Preventive Maintenance (PPM) time and corresponding
expected cost rate for each chance node by dynamic programming based on Weibull and
gamma failure time assumptions. Bracketed figures are failure time Tf1 with respect
to Tm1 and Tm2 , numbers in brackets representing corresponding failure times.
From Table 5.5 we can conclude that systems with a gamma failure time distribution
tend to be preventively maintained earlier (the maintenance time is Tm1 = 1 under the
gamma failure time assumption and Tm1 = 1.2 under the Weibull failure time assumption)
and are about 66.24% more expensive (an expected cost rate of 1.66 under the gamma
assumption against 1.00 under the Weibull assumption). In this sense, it is essential
to make appropriate failure time assumptions about the systems studied in practice,
because different failure time assumptions could result in different maintenance policies.
[Figure 5.5 diagram: decision tree with decision nodes Tm1, Tm2 and Tm3, chance nodes C1, C21, C22 and C31 to C34, and eight terminal cost rates ranging from 3(Cr + Cf)/(Tf1 + Tf2 + Tf3) to 3Cm/(Tm1 + Tm2 + Tm3); boxes DP-1 and DP-2 mark the two sub-problems.]
Figure 5.5: Decision tree for three-phase system with sequential problem with shading
indicating a range of possible outcomes for the preceding chance node; Box DP-1 and
DP-2 show the break into two period problems.
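In order to apply the dynamic programming method to finding the optimal maintenance
times for this three-phase system, one requires the conditional probabilities stemming
from chance nodes CN31, CN32, CN33 and CN34, which are the eight ($2^3$) conditional
probabilities below.

1. $f_{T_{f_3}}(t_{f_3} \mid t_{f_2}, t_{f_1})$
\[
\begin{aligned}
&= \frac{f(t_{f_3}, t_{f_2}, t_{f_1})}{f(t_{f_2}, t_{f_1})} \\
&= \frac{\int_\theta f(t_{f_3} \mid t_{f_2}, t_{f_1}, \theta) f(t_{f_2} \mid t_{f_1}, \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} \mid t_{f_1}, \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta} \\
&= \frac{\int_\theta f(t_{f_3} \mid \theta) f(t_{f_2} \mid \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} \mid \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}
\end{aligned}
\]

2. $f_{T_{f_3}}(t_{f_3} > T_{m_3} \mid t_{f_2}, t_{f_1})$
\[
\begin{aligned}
&= \frac{f(t_{f_3} > T_{m_3}, t_{f_2}, t_{f_1})}{f(t_{f_2}, t_{f_1})} \\
&= \frac{\int_\theta f(t_{f_3} > T_{m_3} \mid t_{f_2}, t_{f_1}, \theta) f(t_{f_2} \mid t_{f_1}, \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} \mid t_{f_1}, \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta} \\
&= \frac{\int_\theta f(t_{f_3} > T_{m_3} \mid \theta) f(t_{f_2} \mid \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} \mid \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}
\end{aligned}
\]

3. $f_{T_{f_3}}(t_{f_3} \mid t_{f_2} > T_{m_2}, t_{f_1})$
\[
\begin{aligned}
&= \frac{f(t_{f_3}, t_{f_2} > T_{m_2}, t_{f_1})}{f(t_{f_2} > T_{m_2}, t_{f_1})} \\
&= \frac{\int_\theta f(t_{f_3} \mid t_{f_2} > T_{m_2}, t_{f_1}, \theta) f(t_{f_2} > T_{m_2} \mid t_{f_1}, \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} > T_{m_2} \mid t_{f_1}, \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta} \\
&= \frac{\int_\theta f(t_{f_3} \mid \theta) f(t_{f_2} > T_{m_2} \mid \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} > T_{m_2} \mid \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}
\end{aligned}
\]

4. $f_{T_{f_3}}(t_{f_3} > T_{m_3} \mid t_{f_2} > T_{m_2}, t_{f_1})$
\[
\begin{aligned}
&= \frac{f(t_{f_3} > T_{m_3}, t_{f_2} > T_{m_2}, t_{f_1})}{f(t_{f_2} > T_{m_2}, t_{f_1})} \\
&= \frac{\int_\theta f(t_{f_3} > T_{m_3} \mid t_{f_2} > T_{m_2}, t_{f_1}, \theta) f(t_{f_2} > T_{m_2} \mid t_{f_1}, \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} > T_{m_2} \mid t_{f_1}, \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta} \\
&= \frac{\int_\theta f(t_{f_3} > T_{m_3} \mid \theta) f(t_{f_2} > T_{m_2} \mid \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} > T_{m_2} \mid \theta) f(t_{f_1} \mid \theta) f(\theta)\, d\theta}
\end{aligned}
\]

5. $f_{T_{f_3}}(t_{f_3} \mid t_{f_2}, t_{f_1} > T_{m_1})$
\[
\begin{aligned}
&= \frac{f(t_{f_3}, t_{f_2}, t_{f_1} > T_{m_1})}{f(t_{f_2}, t_{f_1} > T_{m_1})} \\
&= \frac{\int_\theta f(t_{f_3} \mid t_{f_2}, t_{f_1} > T_{m_1}, \theta) f(t_{f_2} \mid t_{f_1} > T_{m_1}, \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} \mid t_{f_1} > T_{m_1}, \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta} \\
&= \frac{\int_\theta f(t_{f_3} \mid \theta) f(t_{f_2} \mid \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} \mid \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}
\end{aligned}
\]

6. $f_{T_{f_3}}(t_{f_3} > T_{m_3} \mid t_{f_2}, t_{f_1} > T_{m_1})$
\[
\begin{aligned}
&= \frac{f(t_{f_3} > T_{m_3}, t_{f_2}, t_{f_1} > T_{m_1})}{f(t_{f_2}, t_{f_1} > T_{m_1})} \\
&= \frac{\int_\theta f(t_{f_3} > T_{m_3} \mid t_{f_2}, t_{f_1} > T_{m_1}, \theta) f(t_{f_2} \mid t_{f_1} > T_{m_1}, \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} \mid t_{f_1} > T_{m_1}, \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta} \\
&= \frac{\int_\theta f(t_{f_3} > T_{m_3} \mid \theta) f(t_{f_2} \mid \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} \mid \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}
\end{aligned}
\]

7. $f_{T_{f_3}}(t_{f_3} \mid t_{f_2} > T_{m_2}, t_{f_1} > T_{m_1})$
\[
\begin{aligned}
&= \frac{f(t_{f_3}, t_{f_2} > T_{m_2}, t_{f_1} > T_{m_1})}{f(t_{f_2} > T_{m_2}, t_{f_1} > T_{m_1})} \\
&= \frac{\int_\theta f(t_{f_3} \mid t_{f_2} > T_{m_2}, t_{f_1} > T_{m_1}, \theta) f(t_{f_2} > T_{m_2} \mid t_{f_1} > T_{m_1}, \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} > T_{m_2} \mid t_{f_1} > T_{m_1}, \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta} \\
&= \frac{\int_\theta f(t_{f_3} \mid \theta) f(t_{f_2} > T_{m_2} \mid \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} > T_{m_2} \mid \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}
\end{aligned}
\]

8. $f_{T_{f_3}}(t_{f_3} > T_{m_3} \mid t_{f_2} > T_{m_2}, t_{f_1} > T_{m_1})$
\[
\begin{aligned}
&= \frac{f(t_{f_3} > T_{m_3}, t_{f_2} > T_{m_2}, t_{f_1} > T_{m_1})}{f(t_{f_2} > T_{m_2}, t_{f_1} > T_{m_1})} \\
&= \frac{\int_\theta f(t_{f_3} > T_{m_3} \mid t_{f_2} > T_{m_2}, t_{f_1} > T_{m_1}, \theta) f(t_{f_2} > T_{m_2} \mid t_{f_1} > T_{m_1}, \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} > T_{m_2} \mid t_{f_1} > T_{m_1}, \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta} \\
&= \frac{\int_\theta f(t_{f_3} > T_{m_3} \mid \theta) f(t_{f_2} > T_{m_2} \mid \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}{\int_\theta f(t_{f_2} > T_{m_2} \mid \theta) f(t_{f_1} > T_{m_1} \mid \theta) f(\theta)\, d\theta}
\end{aligned}
\]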
The conditional probabilities above concern Tf3, and each conditions jointly on the
outcomes of the two preceding phases, e.g., pTf3(tf3 | tf2, tf1). This form makes the
computation more demanding. As a result, the gridding approach to solving the
dynamic programming problem requires a multi-dimensional matrix formulation.
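The conditional probabilities listed above are all ratios of two integrals over θ, so they can be approximated on a grid with a single helper. The sketch below is a simplified illustration rather than the thesis implementation: it assumes a Weibull lifetime with scale 1, a normal(2, 1) prior on θ truncated at 1 and cut off at 6, a time increment of 0.1, and hypothetical function names.

```python
import math

DELTA = 0.1
T_GRID = [round(k * DELTA, 10) for k in range(1, 61)]   # 0.1, ..., 6.0
THETA_GRID = [1.0 + 0.05 * k for k in range(0, 101)]    # 1.0, ..., 6.0

def prior_weight(theta):
    """Unnormalised normal(2, 1) density; the grid starts at the truncation point."""
    return math.exp(-0.5 * (theta - 2.0) ** 2)

def cell_prob(t, theta):
    """P(t - DELTA < T_f <= t | theta) for a Weibull lifetime with scale 1."""
    return math.exp(-((t - DELTA) ** theta)) - math.exp(-(t ** theta))

def survival(t, theta):
    """P(T_f > t | theta)."""
    return math.exp(-(t ** theta))

def predictive(events):
    """Sum over the theta grid of the prior times a product of likelihood terms."""
    return sum(
        prior_weight(th) * math.prod(term(th) for term in events)
        for th in THETA_GRID
    )

def cond_failure(t3, t2, t1):
    """p(t_f3 | t_f2, t_f1): failures observed in phases 1 and 2."""
    past = [lambda th: cell_prob(t2, th), lambda th: cell_prob(t1, th)]
    return predictive([lambda th: cell_prob(t3, th)] + past) / predictive(past)

def cond_survival(t_m3, t2, t1):
    """p(t_f3 > T_m3 | t_f2, t_f1): phase 3 reaches its maintenance time."""
    past = [lambda th: cell_prob(t2, th), lambda th: cell_prob(t1, th)]
    return predictive([lambda th: survival(t_m3, th)] + past) / predictive(past)

print(round(cond_failure(0.3, 0.2, 0.3), 4), round(cond_survival(0.8, 0.2, 0.3), 4))
```

The censored cases follow the same pattern: replacing a `cell_prob` term in `past` by a `survival` term gives the versions that condition on tf1 > Tm1 or tf2 > Tm2.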
This combination of the myopic method and dynamic programming can be applied
to those expanded problems with multi-phase systems’ optimal maintenance determi-
nation.
In the proposed hybrid myopic-dynamic programming method, first we solve part
DP-1 by dynamic programming and get Tm1 as 0.5.
[Figure 5.6 plot: Density against Tf2, with curves for p(tf1), p(tf2 | tf1 = 0.1), p(tf2 | tf1 = 0.2), p(tf2 | tf1 = 0.3), p(tf2 | tf1 = 0.4), p(tf2 | tf1 = 0.5) and p(tf2 | tf1 > Tm1 = 0.5).]
Figure 5.6: Comparison of probabilities of Tf2 conditioning on varying Tf1 and optimal
Tm1 .
Making use of this result, in Figure 5.6 we can see that p(tf2 | tf1 = i), i = 0.1, . . . , 0.5,
is mostly on the right and below p(tf1 ), which tells us that the system is more likely to
fail earlier if a system failure is observed.
Let us compare the prior and posterior distributions of parameter θ. By Bayes'
theorem, the posteriors of θ are p(θ | tf1 ≤ Tm1) and p(θ | tf1 > Tm1):
\[
p(\theta) = \frac{\phi(\theta - 2)}{\Phi(1)}, \tag{5.12}
\]
\[
p(\theta \mid t_{f_1} \le 0.5) = \frac{p(t_{f_1} \le 0.5 \mid \theta)\, p(\theta)}{p(t_{f_1} \le 0.5)}, \tag{5.13}
\]
\[
p(\theta \mid t_{f_1} > 0.5) = \frac{p(t_{f_1} > 0.5 \mid \theta)\, p(\theta)}{p(t_{f_1} > 0.5)}. \tag{5.14}
\]
Figure 5.7 shows the prior and posterior distributions of parameter θ given the
failure time is left censored and right censored, respectively. If a failure occurs before
the scheduled maintenance though it is unknown by how much, the posterior of θ is
updated in the red line indicating a right-skewed property and a higher mode compared
to the prior.
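The two censored updates in (5.13) and (5.14) can be reproduced on a grid of θ values. This is a minimal sketch, assuming a Weibull lifetime with scale 1 and a θ grid from 1 to 6 in steps of 0.01; the grid, the cut-off at 6 and the variable names are illustrative choices.

```python
import math

THETA_GRID = [1.0 + 0.01 * k for k in range(0, 501)]   # 1.00, ..., 6.00
T_M1 = 0.5

def prior_weight(theta):
    """Unnormalised normal(2, 1) density on the truncated support theta >= 1."""
    return math.exp(-0.5 * (theta - 2.0) ** 2)

def normalise(weights):
    total = sum(weights)
    return [w / total for w in weights]

def prob_fail_by(t, theta):
    """P(T_f1 <= t | theta) for a Weibull lifetime with scale 1."""
    return 1.0 - math.exp(-(t ** theta))

def mean(dist):
    return sum(th * w for th, w in zip(THETA_GRID, dist))

prior = normalise([prior_weight(th) for th in THETA_GRID])
# Failure before maintenance, exact time unknown: likelihood P(T_f1 <= T_m1 | theta).
post_failed = normalise(
    [prob_fail_by(T_M1, th) * w for th, w in zip(THETA_GRID, prior)]
)
# Survival to maintenance: likelihood P(T_f1 > T_m1 | theta).
post_survived = normalise(
    [(1.0 - prob_fail_by(T_M1, th)) * w for th, w in zip(THETA_GRID, prior)]
)

print("prior mean              :", round(mean(prior), 3))
print("posterior mean, failed  :", round(mean(post_failed), 3))
print("posterior mean, survived:", round(mean(post_survived), 3))
```

Because P(Tf1 ≤ 0.5 | θ) decreases in θ when the scale is 1, an early failure pulls the posterior towards smaller θ, while survival to Tm1 pushes it towards larger θ.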
[Figure 5.7 plot: Density against θ, with curves for p(θ), p(θ | Tf1 < Tm1 = 0.5) and p(θ | Tf1 > Tm1 = 0.5).]
Figure 5.7: Posterior probability density of θ conditioning on Tf1 ≤ Tm1 (0.5) and
Tf1 > Tm1 (0.5) compared with prior probability density of θ.
[Figure 5.8 plot: six panels of Expected Cost against maintenance time; five panels show Tm2 at CN21 with curves CN31, CN32 and CN21, and one panel shows Tm2 at CN22 with curves CN33, CN34 and CN22.]
Figure 5.8: Expected cost rates at chance node CN21 conditioning on tf1 ≤ Tm1 (top-
left: tf1 = 0.1, top-right: tf1 = 0.2, middle-left: tf1 = 0.3, middle-right: tf1 = 0.4 and
bottom-left: tf1 = 0.5); and expected cost rates at chance node CN22 conditioning on
tf1 > Tm1 (bottom-right) under the H-M-DP method.
Figure 5.8 shows the expected cost rates at chance nodes CN21 and CN22 for each
first-phase outcome, obtained under the H-M-DP method.
The optimal maintenance times of the three phase system for chance nodes CN21
and CN22, and the corresponding expected cost rates depending on what has been observed
in the first phase, are shown in Table 5.6.
                       H-M-DP (Three-Phase)                    DP (Two-Phase)
Chance Node  History   Maintenance Time  Expected Cost Rate    Maintenance Time  Expected Cost Rate
Table 5.6: Optimal Perfect Preventive Maintenance (PPM) time and corresponding
expected cost rate of chance nodes CN21 and CN22 for three-phase maintenance systems
based on Hybrid Myopic-Dynamic Programming and myopic methods.
From Table 5.6 we see that, because hybrid myopic-dynamic programming takes future
information about the system performance into the maintenance modelling, the
maintenance times at CN21 are earlier than at the same chance node for the
two phase system, which results in a lower expected cost rate compared with dynamic
programming for the two-phase system.
According to the results obtained via the hybrid myopic-dynamic programming
method, one can determine the optimal maintenance times for each phase of the three
phase system by referring to Figure 5.9. For example, one determines the maintenance
time for phase 1 as 0.5; if a failure occurs at time 0.2, the maintenance time for phase 2
is chosen as 0.7; if no failure happens before 0.7, the maintenance for phase 3 is decided
as 0.8.
[Figure 5.9 diagram: tree of optimal maintenance times over Phase 1, Phase 2 and Phase 3, branching on the observed failure times Tf1 and Tf2 to the chosen Tm2 and Tm3.]
5.7 Sensitivity Analysis
The impact of parameter variation on maintenance policies, i.e., sensitivity analysis,
is considered and presented in this section.
In Chapter 4, the gridding method was implemented to solve the optimisation problem
in sequential preventive maintenance models, and the increment parameter was set
up as 0.1 as default. This section is intended to analyse and understand the core of
the gridding method, i.e., the gridding increments’ impact on decisions concerning the
optimal preventive maintenance times and the relationships between them.
First, the optimal perfect preventive maintenance times based on two different
gridding increments, δ = 0.05, 0.1, are compared as an example.
As we can see in Table 5.7, under δ = 0.05 we preventively maintain systems
earlier (at 0.45 time units) than under δ = 0.1 (at 0.5 time units), because the finer
grid admits more candidate maintenance times; the former also produces a lower expected
cost rate (2.52) than the latter (2.55), which is a 1.18% reduction. Depending on the
characteristics of the system and the available budget, this could be a significant cost
reduction for a particular repairable system.
Second, the precision of the increment parameter δ is further investigated to
understand the underlying impact of gridding increments. In this case, the increment
parameter δ increases, i.e., the gridding interval becomes larger. As we can see in
Table 5.8, with the increment parameter increasing from 0.005 to 0.100, there is little
change in the optimal maintenance times, which range from 0.4 to 0.5, i.e., it
does not change our optimal decisions significantly. However, we note that the expected
cost rate also increases with the increment parameter δ. It is worth noting that the
expected cost rate is not strictly increasing in δ, e.g., it decreases from 2.5479 to
2.5475 when δ increases from 0.085 to 0.090. That is because the grid samples the
continuous distribution at different points as δ changes, which alters the accuracy of
the discrete approximation.
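The mechanics of this sensitivity study can be sketched with a much simpler model than the one behind Tables 5.7 and 5.8. The code below assumes a one phase system, a Weibull lifetime with fixed θ = 2 and scale 1 (no prior), illustrative costs Cf + Cr = 3 and Cm = 0.5, and failures costed at the right end of each grid cell; its numbers are therefore not comparable with the tables and only illustrate how the grid search and its run time depend on δ.

```python
import math
import time

def optimal_time(delta, theta=2.0, c_fail=3.0, c_pm=0.5, horizon=6.0):
    """Grid search for the maintenance time minimising a one phase cost rate."""
    steps = round(horizon / delta)
    best_time, best_cost = None, math.inf
    failure_part = 0.0
    for k in range(1, steps + 1):
        lo, hi = (k - 1) * delta, k * delta
        mass = math.exp(-(lo ** theta)) - math.exp(-(hi ** theta))
        failure_part += c_fail / hi * mass                  # failure in (lo, hi]
        cost = failure_part + c_pm / hi * math.exp(-(hi ** theta))
        if cost < best_cost:
            best_time, best_cost = hi, cost
    return best_time, best_cost

results = {}
for delta in (0.1, 0.05, 0.01):
    start = time.perf_counter()
    t_opt, cost = optimal_time(delta)
    results[delta] = (t_opt, cost, time.perf_counter() - start)

for delta, (t_opt, cost, elapsed) in results.items():
    print(f"delta={delta}: T_m={t_opt:.3f}, cost rate={cost:.4f}, seconds={elapsed:.5f}")
```

The number of candidate maintenance times grows as 1/δ in this one phase sketch; in the two phase dynamic programme each candidate requires a further nested search, which is consistent with the much steeper growth in computation time reported in Table 5.8.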
δ = 0.05 δ = 0.1
Maintenance Time Expected Cost Rate Maintenance Time Expected Cost Rate
Table 5.7: Comparison of optimal perfect preventive maintenance (PPM) time and
corresponding expected cost for each chance node based on different increment param-
eter δ. Bracketed figures are failure time Tf1 with respect to Tm1 and Tm2 , numbers in
brackets representing corresponding failure times.
Increment Parameter δ   Maintenance Time   Expected Cost Rate   Computation Time (seconds)
0.005                   0.460              2.49                 577.03
0.010                   0.460              2.49                 99.63
0.015                   0.465              2.49                 52.95
0.020                   0.460              2.50                 11.71
0.025                   0.475              2.50                 6.29
0.030                   0.450              2.51                 3.78
0.035                   0.455              2.51                 2.60
0.040                   0.480              2.51                 1.84
0.045                   0.450              2.52                 1.36
0.050                   0.450              2.52                 1.02
0.055                   0.440              2.53                 0.79
0.060                   0.480              2.53                 0.64
0.065                   0.455              2.53                 0.53
0.070                   0.490              2.54                 0.43
0.075                   0.450              2.54                 0.39
0.080                   0.480              2.54                 0.34
0.085                   0.425              2.55                 0.27
0.090                   0.450              2.55                 0.23
0.095                   0.475              2.55                 0.21
0.100                   0.500              2.55                 0.19
Table 5.8: Optimal perfect preventive maintenance (PPM) time and corresponding
expected cost for chance node CN1 conditioning on various increment parameter δ
based on dynamic programming.
respect to the increment parameter δ but not very significantly. And from the bottom-
left graph in Figure 5.10, the absolute expected cost rate increases with respect to the
increment parameter δ, while the relative expected cost, i.e., the expected cost rate per
compiling time unit, increases seemingly exponentially. As a result, decision makers in
practice should choose an increment interval that trades off accuracy against
the budget that they have.
[Figure 5.10 plot: four panels against Gridding Interval (0.025 to 0.100).]
Figure 5.10: Sensitivity analysis concerning gridding intervals: Compiling time (top-
left); Optimal Maintenance Time (top-right); Expected Cost Rate (bottom-left); Ex-
pected Cost Rate per Compiling Time Unit (bottom-right).
The choice of gridding increment depends on the practical problem and is often a
trade-off between accuracy and efficiency. For example, smaller intervals bring the
chosen maintenance time closer to the theoretical optimum, but they also add computing
time. The choice is therefore best made in cooperation among the different stakeholders
seeking the optimum design of system maintenance.
5.7.2 Cost Structure
The costs induced by system performance are one of the core issues considered in
our maintenance modelling, so the impact of the cost structure on the corresponding
systems must also be considered.
In previous studies, the perfect preventive maintenance cost Cm is assumed to be
less than that of a failure and a repair, Cf + Cr. In Table 5.9, we increase the
perfect preventive maintenance cost Cm from 0.5 to 3 with fixed failure and repair
cost; in other words, the cost difference between failure and maintenance shrinks.
As we can see, the maintenance time increases from 0.5 to 6, which is the maximum
of the assumed time range, and the expected cost rate increases as well. This means
there is less advantage in maintaining the system preventively if one has to spend
more to carry it out.
Table 5.9: Optimal perfect preventive maintenance (PPM) time and corresponding
expected cost for chance node CN1 conditioning on cost difference between failure and
maintenance based on dynamic programming.
5.7.3 Prior Sensitivity
We conduct our parameter analysis under the Bayesian framework, so in this section,
we analyse the impact of prior assumptions on the system maintenance time decision
making.
The expected value for the shape parameter of the Weibull distribution is varied from
1 to 10. As we can see in Table 5.10, as the prior expectation of the shape parameter θ
increases, the expected cost rate decreases significantly from 3.05 to 0.91,
but there is little impact on the decision about the optimal maintenance time, which
simply ranges from 0.5 to 0.6. That is because a larger expectation of θ indicates a
higher hazard (failure) rate: one can reduce the expected cost rate by applying the same
maintenance policy (here the same maintenance time) to a system with a higher possibility
of failing.
Table 5.10: Optimal perfect preventive maintenance (PPM) time and corresponding
expected cost for chance node CN1 conditioning on various prior mean of parameter θ
based on dynamic programming.
In Table 5.11, the standard deviation of the shape parameter θ does not have a
significant effect on the optimal decision about the maintenance time, which
ranges from 0.4 to 0.5, or on the expected cost rate, which ranges from 1.84 to 2.68.
SD(θprior)   Maintenance Time   Expected Cost Rate
σ=0.1        0.4                2.60
σ=0.3        0.4                2.64
σ=0.5        0.4                2.68
σ=0.7        0.5                2.65
σ=0.9        0.5                2.59
σ=1.0        0.5                2.55
σ=2.0        0.5                2.22
σ=3.0        0.5                1.98
σ=4.0        0.5                1.84
Table 5.11: Optimal perfect preventive maintenance (PPM) time and corresponding
expected cost for chance node CN1 conditioning on various prior standard deviation
of parameter θ based on dynamic programming.
The findings above help practitioners to implement our method in practice, because
a prior expectation is simpler to interpret than a prior standard deviation.
The functional form of the utility also needs to be considered, as different forms
express different degrees of risk aversion.
Let us examine the maintenance time decision under uncertainty following the
Cobb-Douglas utility function (Cobb and Douglas, 1928), written for a one phase
system for simplicity:
\[
U\left(\frac{C_f + C_r}{T_f}, \frac{C_m}{T_m}, f_{T_f}(t_f \le T_m), f_{T_f}(t_f > T_m)\right)
= \left(\frac{C_f + C_r}{T_f}\right)^{f_{T_f}(t_f \le T_m)} \left(\frac{C_m}{T_m}\right)^{f_{T_f}(t_f > T_m)}. \tag{5.15}
\]
Then we can use this monotonically transformed utility function to find the optimal
maintenance time and compare it with the expected cost rate objective function.
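Taking logarithms of (5.15) makes the link to a log utility explicit. The step below is a short worked illustration using only the symbols of (5.15); it is offered as a reading aid and is not part of the original derivation.

```latex
\[
\log U
= f_{T_f}(t_f \le T_m)\,\log\frac{C_f + C_r}{T_f}
+ f_{T_f}(t_f > T_m)\,\log\frac{C_m}{T_m}.
\]
```

Since the logarithm is strictly increasing, the maintenance time that optimises U also optimises log U, and log U is the expectation of the logarithm of the cost rate over the two outcomes (failure before Tm, or survival to Tm).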
As we can see in Table 5.12, for chance node CN1 , one is suggested to maintain
the system at local time 0.5 under log utility function and expected cost rate objective
(non-log utility function).
Table 5.12: Optimal perfect preventive maintenance (PPM) time and corresponding
expected cost for each chance node by dynamic programming based on non-Log and
Log utility functions. Bracketed figures are failure time Tf1 with respect to Tm1 and
Tm2 , numbers in brackets representing corresponding failure times.
However, the expected cost rate (0.62) for the log utility function is much less than
that (2.55) for the non-log utility function. That is because the logarithm transformation
of the expected cost rate function conveys risk aversion to the decision maker.
Under the non-log utility function the decision maker is risk neutral and would be
expected to maintain the system later than a risk-averse counterpart. However, the
maintenance time is also 0.5, which means the system is maintained earlier than it is
supposed to be; as a result, the expected cost rate is higher than that under the log
transformation.
5.8 Parallel Computing
We have seen in §5.7.1 that the increment parameter δ is an essential element of the
gridding method for optimising the maintenance time via dynamic programming. A smaller
δ, on the one hand, increases the accuracy of the gridding method, which brings one
closer to the theoretically optimal maintenance time. On the other hand, it makes the
problem computationally expensive.
Parallel computing is a method of carrying out a large number of calculations
simultaneously, based on the principle that a large, complicated problem can be divided
into smaller and simpler ones that are solved independently. A few applications of
parallel computing in maintenance have been investigated. For example, Yang et al. (2012)
build a parallel computing platform for multi-objective simulation optimisation of bridge
maintenance planning that can span tens of years, and investigate the proposed framework
through a practical case, which shows its superiority to the GA method.
Here possible ways of implementing parallel computing in further research are
discussed. For example, we need to compute the joint probability mass of Tf1 and Tf2, see
Figure 5.11.
[Figure 5.11: joint probability mass of Tf1 and Tf2, with cells such as P(Tf2 = 1, Tf1 = 1) and P(Tf2 = 2, Tf1 = 1).]
If there are only two possible values for each of Tf1 and Tf2, we can calculate the four
joint probabilities independently in four different processes and then combine them to form
the matrix for the joint probability mass.
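The four-cell example can be sketched with the standard library. Everything specific here is a labelled assumption: the two-point prior on θ, the Weibull lifetime with scale 1, and the use of a thread pool, which is chosen for portability; `ProcessPoolExecutor` from the same module is the analogous process-based choice for genuinely CPU-bound work.

```python
import math
from concurrent.futures import ThreadPoolExecutor
from itertools import product

VALUES = (1, 2)          # the two possible values of T_f1 and of T_f2
THETAS = (1.5, 2.5)      # hypothetical two-point prior support for theta
PRIOR = (0.5, 0.5)       # hypothetical prior weights

def cell_prob(t, theta):
    """P(t - 1 < T_f <= t | theta); the last value absorbs the upper tail."""
    upper = 0.0 if t == VALUES[-1] else math.exp(-(t ** theta))
    return math.exp(-((t - 1) ** theta)) - upper

def joint_cell(pair):
    """One cell P(T_f2 = t2, T_f1 = t1), integrating theta out of the product."""
    t2, t1 = pair
    return sum(
        w * cell_prob(t2, th) * cell_prob(t1, th) for th, w in zip(THETAS, PRIOR)
    )

pairs = list(product(VALUES, repeat=2))      # (1,1), (1,2), (2,1), (2,2)
with ThreadPoolExecutor(max_workers=4) as pool:
    cells = list(pool.map(joint_cell, pairs))   # each cell computed independently

joint = [cells[0:2], cells[2:4]]   # rows: T_f2 = 1, 2; columns: T_f1 = 1, 2
for row in joint:
    print([round(value, 4) for value in row])
```

Each cell depends only on its own pair of values, which is what makes the computation trivially divisible; with a fine grid the same pattern applies to rows or blocks of the matrix rather than to single cells.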
With the development of cloud computing, decision makers could share computer
processing resources and data to make better and more coherent decisions in a
comprehensive and efficient manner. Although discussion of cloud computing is beyond the
scope of this thesis, it is interesting to mention that there has been some investigation
of parallel computing in maintenance. For instance, Vert et al. (2015) review models
and algorithms of a pipeline-parallel computing process in intelligent scheduling. This
parallel computing process models the change in resource consumption of the computing
nodes, which can help decision makers to see which nodes have the highest usage in
computing. Implemented in a virtual cloud environment, it provides
better security and higher computing performance.
Chapter 6
The concept of risk aversion plays an important part in economics and finance. In
maintenance modelling, risk aversion would induce a more costly maintenance policy
but could potentially avoid huge expenditures. In this context, if one is more risk
averse, she or he would tend to maintain the system more frequently or earlier. In this
chapter, the approach proposed by Baker (2006) is used and modified to analyse the
effect of risk aversion on the variability of systems in cash flows.
\[
U = \frac{1 - \exp(-\eta x)}{\eta}, \tag{6.1}
\]
where η > 0 is the parameter measuring the risk aversion. Note that
\[
\lim_{\eta \to 0} \frac{1 - \exp(-\eta x)}{\eta} = x
\]
and its Taylor series expansion in powers of η is
\[
x - \frac{\eta}{2} x^2 + \frac{\eta^2}{6} x^3 - \frac{\eta^3}{24} x^4 + \frac{\eta^4}{120} x^5 - \frac{\eta^5}{720} x^6 + O(\eta^6).
\]
In other words, as η → 0, U → x, and to first order in η,
\[
U \simeq x - \frac{\eta}{2} x^2. \tag{6.2}
\]
There are a few advantages of using this utility function:
1. The exponential distribution has the memoryless property. The utility function
in (6.1) is defined with constant risk aversion; in other words, the amount of
resource that one is prepared to risk is not a function of the initial resource x0, and
this utility function does not depend on the resources spent on other fields either. As
a result, using a utility function with this feature avoids having to consider
one's resources and other activities.
2. To first order in the risk aversion parameter η, all utility functions on the whole
real line will be in the form given in (6.2). Therefore, using the exponential utility
function will produce more general results under lower risk aversion.
3. Most utility functions have more than one parameter, as shown in
§3.3.1 (page 44), which would add much complexity to the modelling.
As this thesis deals with costs induced by system performance, we define cost C =
−x having disutility
\[
-U = \frac{\exp(\eta C) - 1}{\eta}. \tag{6.3}
\]
The certainty equivalent is defined as a guaranteed return that one would accept,
instead of taking a chance on a higher but uncertain and risky return. In our modelling,
we interpret it as the sum of resources that one definitely gains or loses which would
have the same expected utility as the variable cash flows via the system performance.
Therefore, if the system is performing for time T, the exponential utility function can
be expressed as
\[
\frac{\exp(\eta \cdot CE \cdot T) - 1}{\eta} = \frac{E \exp(\eta C) - 1}{\eta},
\]
where CE is the certainty equivalent per unit time, or
\[
CE = \frac{\log\{E \exp(\eta C)\}}{\eta T}. \tag{6.4}
\]
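A small numerical illustration of (6.4) may help. The two-outcome cost distribution, the phase length and the function name below are hypothetical choices made for the sketch; only the formula itself comes from the text.

```python
import math

def certainty_equivalent(eta, outcomes, duration):
    """CE = log(E[exp(eta * C)]) / (eta * T) for a discrete cost distribution."""
    mgf = sum(prob * math.exp(eta * cost) for cost, prob in outcomes)
    return math.log(mgf) / (eta * duration)

# Hypothetical one phase cash flow: failure cost 3 with probability 0.2,
# preventive maintenance cost 0.5 otherwise, over a phase of length T = 0.5.
outcomes = [(3.0, 0.2), (0.5, 0.8)]
duration = 0.5
expected_rate = sum(cost * prob for cost, prob in outcomes) / duration

etas = (1e-6, 0.1, 0.5, 1.0)
ce_values = [certainty_equivalent(eta, outcomes, duration) for eta in etas]

print("expected cost rate:", expected_rate)
for eta, ce in zip(etas, ce_values):
    print(f"eta={eta}: CE={ce:.4f}")
```

As η approaches 0 the certainty equivalent approaches the expected cost per unit time, and it grows with η: a more risk-averse decision maker treats the same random cost as if it were a larger certain one.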
6.2 Risk-averse Maintenance Modelling
We consider the preventive maintenance policy for the system modelled in chapter 4,
which would be maintained perfectly at an optimum maintenance time Tm or at failure
Tf, whichever occurs first, i.e., the length of the system's ith phase is Tm or Tf,
whichever is smaller, denoted as Ti. Let the cost over the ith phase be Fi, where Fi is
a random cash flow during phase i; in other words, Fi is induced by preventive
maintenance or by failure and repair, of which the cost is Cm and Cf + Cr, respectively.
The certainty equivalent per unit time for phase i is
\[
CE_i = \frac{\log E\{\exp(\eta F_i)\}}{\eta T_i}.
\]
For simplicity, the certainty equivalent per unit time for each phase is
\[
CE = \frac{\log E\{\exp(\eta F)\}}{\eta T}, \tag{6.5}
\]
where kj is the jth cumulant of the cost for each phase. As we can see, with the risk
aversion parameter η increasing, the certainty equivalent CE increasingly weights
towards higher cumulants, such as skewness and kurtosis.
As we model sequential preventive maintenance with unfixed maintenance time,
the length of each phase Ti is not equal. As a result, equation 6.5 becomes
N
X
exp(η · CE · T ) = {log E(exp(ηF ))}i Pi = G(log E(exp(ηF )))
i=1
129
for a system with N phases, where Pi is the probability that i phases have occurred by
time T , therefore,
log {G(log E exp(ηF ))}
CE = , (6.7)
ηT
where log G(·) is the cumulant generating function for the number of phases. Cox
(1962) gives the first few cumulants when expanding equation 6.7 as T → ∞,
\[
C(T_m) = \frac{C_f + C_r + (C_m - C_f - C_r) S(T_m)}{T}, \tag{6.9}
\]
where S(·) is the survival function and Tm is the optimum maintenance time obtained
in chapter 4 by the dynamic programming approach. We use the first term in
expansion 6.8 to approximate CE.
Define I as an indicator variable, where I = 1 denotes system failure in (0, Tm ] and
I = 0 denotes that the system has survived to time Tm without failure occurring. Therefore,
the cash flow over a phase is Cf + Cr when I = 1 and Cm when I = 0. Equation 6.11 can
then be used to analyse the relationship between the risk aversion parameter η and the
corresponding certainty equivalent.
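The following is a sketch of how the leading term of the certainty equivalent can be obtained. It assumes, as the cost definitions above suggest, that the cash flow over a phase of length T is F = (Cf + Cr)I + Cm(1 − I) with P(I = 0) = S(Tm). It is a plausible reading that agrees with the numerical form in equation 6.12, not a reproduction of the thesis's own intermediate equations.

```latex
\begin{align*}
E\{\exp(\eta F)\}
  &= \{1 - S(T_m)\}\exp\{\eta (C_f + C_r)\} + S(T_m)\exp(\eta C_m) \\
  &= \exp\{\eta (C_f + C_r)\}
     \left[ 1 + S(T_m)\left\{ \exp\big(\eta (C_m - C_f - C_r)\big) - 1 \right\} \right], \\[4pt]
\frac{\log E\{\exp(\eta F)\}}{\eta T}
  &= \frac{C_f + C_r}{T}
   + \frac{\log\left[ 1 + S(T_m)\left\{ \exp\big(\eta (C_m - C_f - C_r)\big) - 1 \right\} \right]}{\eta T}.
\end{align*}
```

With Cf = 2, Cr = 1 and Cm = 0.5 the constant term is 3/T and the exponent is −2.5η.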
6.3 Numerical Examples
We consider the sequential preventive maintenance model in chapter 4, where the system
is perfectly maintained preventively at Tm or at the failure time Tf , whichever occurs
first, at costs Cm and Cf + Cr , respectively. The failure time follows a Weibull
distribution with hazard function h(t) = θt^(θ−1) and survival function S(t) = exp(−t^θ).
From Table 4.1 (page 86), the optimum maintenance time Tm for phase one is 0.5 and the
corresponding possible failure times Tf are 0.1, 0.2, 0.3, 0.4, 0.5.
Substituting Cf = 2, Cr = 1 and Cm = 0.5 into equation 6.11 gives
\begin{align*}
CE(T_m = 0.5 \mid T_f, \theta, \eta)
  &= \frac{3}{T_f} + \frac{\log\{1 + S(0.5)(\exp(-2.5\eta) - 1)\}}{\eta T_f} + \cdots \\
  &= \frac{3}{T_f} + \frac{\log\{1 + \exp(-0.5^{\theta})(\exp(-2.5\eta) - 1)\}}{\eta T_f} + \cdots \tag{6.12}
\end{align*}
As equation 6.12 shows, the certainty equivalent depends on the failure time Tf , the
shape parameter θ of the Weibull distribution (with scale parameter 1) and the risk
aversion parameter η.
We set the shape parameter θ of the Weibull distribution to 2 and analyse the
relationship between the risk aversion parameter η and the certainty equivalent CE.
Figure 6.1 shows two things: first, the certainty equivalent CE increases with increasing
risk aversion; second, the certainty equivalent CE decreases as the failure time
increases, i.e., when the system is more reliable. In other words, one would spend less
resource on a more reliable system.
[Figure 6.1: certainty equivalent CE (vertical axis) against risk aversion parameter η
(horizontal axis, 0.05 to 0.20), with one curve for each of Tf1 = 0.1, 0.2, 0.3, 0.4, 0.5.]
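The two qualitative findings can be checked numerically with a short sketch. It keeps only the leading term of equation 6.12 and assumes the standard Weibull survival function S(t) = exp(−t^θ) with unit scale. The function name `ce_leading` and the η grid are illustrative assumptions, so the values need not match those plotted in the thesis.

```r
# Leading term of the certainty equivalent in equation 6.12.
# Assumption: S(t) = exp(-t^theta), the Weibull survival function with scale 1.
ce_leading <- function(tf, eta, theta = 2, tm = 0.5,
                       Cf = 2, Cr = 1, Cm = 0.5) {
  surv <- exp(-tm^theta)
  (Cf + Cr) / tf +
    log(1 + surv * (exp((Cm - Cf - Cr) * eta) - 1)) / (eta * tf)
}

tf_grid  <- c(0.1, 0.2, 0.3, 0.4, 0.5)   # possible failure times
eta_grid <- seq(0.01, 0.2, by = 0.01)    # hypothetical risk aversion values

# rows: failure times, columns: risk aversion parameters
ce_table <- outer(tf_grid, eta_grid, ce_leading)
dimnames(ce_table) <- list(paste0("Tf=", tf_grid), paste0("eta=", eta_grid))
```

Each row of `ce_table` increases from left to right (higher risk aversion) and each column decreases from top to bottom (later failure time).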
6.4 Discussions
This chapter investigates maintenance optimisation via risk aversion from a certainty-
equivalent point of view. The aim in this section is to discuss relevant areas of interest
for further research.
• As previous studies show, utility functions with very low risk aversion all yield
the same maintenance policy, which tends towards minimising the average cost per
unit time. One might ask whether other utility functions could be applied in
similar maintenance optimisation modelling; however, it has been proved that only
the linear and exponential utility functions have properties that are reasonable to
apply in practice. Nevertheless, depending on the practical scenario, alternatives
such as nonparametric utility function modelling could be an interesting area to
investigate.
• Two findings, that minimising the average cost per unit time is almost equivalent
to maximising the expected utility of the cost rate, and that the certainty-equivalent
cost increases as risk aversion increases, raise the question of how to determine the
risk aversion parameter of the utility function. Baker (2010) suggests referring to a
plot of the certainty-equivalent cost per unit time, estimated from cost data, against
varying values of the risk aversion parameter. A certainty-equivalent cost per unit
time that is not very sensitive to risk aversion should be preferred.
Chapter 7
7.1 Conclusions
From the utility point of view, this thesis has mainly discussed maintenance problems
for repairable systems, including corrective and preventive maintenance. Under the
Bayesian framework, the parameters of the models are treated as random rather than
fixed, as they are in non-Bayesian methods. The Bayesian treatment suits most practical
situations, in which the failure distribution is either unknown or contains several
unknown parameters. In such cases, the Bayesian approach can be quite effective for
estimating these unknown parameters by assigning prior distributions to them.
Starting from a two-phase system maintenance model, we explored the system’s
perfect preventive maintenance by dynamic programming, compared it with a myopic
method, and found the dynamic programming superior in terms of optimising mainte-
nance time. When the risk aversion parameter in the utility function is very small, our
utility-based maintenance optimisation would reduce to minimising the expected cost
per unit time.
Under different failure time distributions, we explored the effect of failure time
assumptions on the optimal maintenance time, and found that a gamma failure time
assumption would result in a later maintenance time.
Considering the time effect on maintenance optimisation, we found that maintenance
tends to be done earlier as the future cost becomes more expensive. Several modified
maintenance optimisation models were also proposed, such as models for systems with
discrete failure distributions and for parallel redundant systems.
By manipulating a conditional probability matrix, we explore and compare the dif-
ference between perfect preventive maintenance and imperfect preventive maintenance
and we may generally conclude that under an imperfect preventive maintenance mod-
elling framework systems tend to be maintained later because the system would enter
into a state with higher probability to fail if imperfect preventive maintenance were
conducted.
To avoid the problem of increasing dimension and to reduce computation time,
we proposed a hybrid myopic-dynamic programming method. Using the posterior
distribution for the parameter of interest, we see that the maintenance times at CN21
under the hybrid myopic-dynamic programming method are earlier, which results in a
lower expected cost rate compared with dynamic programming for a two-phase system.
7.2 Applications
The utility-based maintenance strategies and policies proposed in this thesis are
modelled mathematically and theoretically under a number of assumptions.
Although they are not straightforward to apply in practice, they provide new
perspectives on modelling preventive maintenance with modification. The prognostic
maintenance policies derived from historical information and system evolution require
decision makers to understand thoroughly the characteristics of the system of interest
and to anticipate as many of the possible consequences as they can.
As a result, models and methodologies proposed in this thesis are more suitable for
large industrial and business purposes. The corresponding systems are supposed to be
well documented for decision makers to extract critical information and parameters for
further maintenance modelling. For example, decision makers should be able to have
access to advice from an expert if required, because expert information is essential to
justify related prior parameters to implement Bayesian dynamic programming. At the
same time, the potential future states of a system should be finite and relatively few
in number. For instance, a personal computer’s possible states could be ‘working’,
‘sleeping’, or ‘down’, which are countable states.
Because the methodologies in this thesis are sensitive to the number of system
states, it would not be reasonable to apply them to highly dynamic systems, such as
machines dealing with signal processing. Systems to which the modelling methods in
this thesis could potentially be applied include an automatic manufacturing system,
a robotic process, or a computer server for non-life-essential services. In such cases
failure is neither rare nor frequent, maintenance is neither cheap nor trivial, and
failure is a considerable expense, though not exorbitantly so. Hence, this approach is
not suitable for the maintenance of systems with very high risk aversion properties,
e.g., a nuclear power facility, an off-shore oil field, or a life support system. Nor is
it worthwhile for trivial systems, where the computational cost of performing
this analysis outweighs any savings.
In practice, few systems would meet all the requirements of the theoretical
modelling; it remains the decision makers’ prerogative to adjust relevant
parameters, define overall and critical objectives, and utilise available information to
seek practical maintenance policies appropriate to the properties of an individual
system.
7.3 Outlook
Maintenance optimisation for repairable systems is part of maintenance theory in
engineering, and it requires more attention and effort as the issues concerning it
increase.
4. The systems considered in this research are relatively simple. However, as the
complexity of systems in modern society increases, the high demand for
maintenance of complex systems requires us to explore this field, despite the
huge mathematical and computational challenges.
Appendix A
Glossary
Appendix B
Mathematical Proofs
These functions have the property of iso-elasticity, which means that we get the same
utility function (up to a positive affine transformation) if the cost is scaled by some
constant k. Formally, for all k > 0,
\[
U(kx) = f(k)\,U(x) + g(k),
\]
for some function f (k) > 0 and some function g(k), both of which are independent of x.
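First consider the case when a ≠ 1:
\begin{align*}
U(kx) &= \frac{(kx)^{1-a} - 1}{1-a} \\
      &= k^{1-a}\,\frac{x^{1-a} - 1}{1-a} + \frac{k^{1-a} - 1}{1-a} \\
      &= k^{1-a}\,U(x) + \frac{k^{1-a} - 1}{1-a}.
\end{align*}
The log function can be written as
\begin{align*}
U(kx) &= \log(kx) \\
      &= \log(k) + \log(x) \\
      &= U(x) + \log(k).
\end{align*}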
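The two identities above can be checked numerically. The sketch below does so for the power form with a ≠ 1 and for the log form; the function name `power_utility` and the values of a, k and x are illustrative assumptions.

```r
# Power (iso-elastic) utility for a != 1; the name is an illustrative assumption.
power_utility <- function(x, a) (x^(1 - a) - 1) / (1 - a)

a <- 0.5                  # hypothetical curvature parameter, a != 1
k <- 3                    # hypothetical scaling constant, k > 0
x <- c(0.5, 1, 2, 10)     # hypothetical costs

# U(kx) = f(k) U(x) + g(k) with f(k) = k^(1 - a), g(k) = (k^(1 - a) - 1) / (1 - a)
lhs <- power_utility(k * x, a)
rhs <- k^(1 - a) * power_utility(x, a) + (k^(1 - a) - 1) / (1 - a)
max_gap_power <- max(abs(lhs - rhs))

# Log utility: f(k) = 1 and g(k) = log(k)
max_gap_log <- max(abs(log(k * x) - (log(x) + log(k))))
```

Both gaps are zero up to floating-point rounding, whatever positive values are chosen for k and x.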
B.2 Negative Exponential Utility Functions
A negative exponential utility function is of the form
Since the first derivative U′(x) = a exp {−ax} > 0 and the second derivative
U″(x) = −a² exp {−ax} < 0, this one is also a legitimate utility function.
The class of negative exponential utility functions has an interesting property that
it is invariant under any cost transformation, i.e., for any constant k,
for some function f (k) > 0 which is independent of x and some function g(k) which is
independent of x as well, it can be verified as below:
\[
CE = U^{-1}(E(U(X)))
\]
If a decision maker with utility function U has a current cost budget less than
CE, she will regard the system as less reliable and tend to maintain the system earlier;
if her current cost budget is more than CE, she will regard the system as more reliable
and tend to maintain the system later; and if her current cost budget is exactly CE,
she will be indifferent between maintaining the system earlier or not.
Decision makers use utility functions to compare different decisions with each other.
In this sense, we can rescale a utility function by multiplying it by a positive constant
and/or adding any other constant, positive or negative; this is called a positive affine
transformation. Two utility functions produce the same results if they are connected
via a positive affine transformation.
Suppose we have constants α > 0 and β and a utility function U ; another
utility function V is defined via U, α and β as
\[
V(x) = \alpha U(x) + \beta.
\]
Since
\[
CE = U^{-1}(E(U(X))),
\]
we have
\[
V(CE) = \alpha U(CE) + \beta = \alpha E(U(X)) + \beta = E(V(X)),
\]
so that
\[
CE = V^{-1}(E(V(X))).
\]
Differentiating V gives
\[
V'(X) = \alpha U'(X), \qquad V''(X) = \alpha U''(X).
\]
By division, we obtain
\[
\frac{V''(X)}{V'(X)} = \frac{U''(X)}{U'(X)}.
\]
Assume that there exists such a pair of functions U and V for which
\[
\frac{V''(X)}{V'(X)} = \frac{U''(X)}{U'(X)},
\]
\[
g' = (V'(U')^{-1})',
\]
\[
V'(X) = \alpha U'(X).
\]
We now show that two utility functions can be regarded as the same if and only if the
ratios of their second derivatives to their first derivatives are the same.
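The invariance of the certainty equivalent under a positive affine transformation can be illustrated numerically. The sketch below uses an exponential utility whose derivatives match those given in Section B.2. The specific form U(x) = −exp(−ax), the helper `ce` and all numerical values are assumptions made for illustration.

```r
# Certainty equivalent CE = U^{-1}(E[U(X)]) for a discrete X.
# `ce` is a hypothetical helper; u_inv must be the inverse of u.
ce <- function(u, u_inv, x, p) u_inv(sum(p * u(x)))

a <- 0.8                                   # hypothetical risk aversion
U     <- function(x) -exp(-a * x)          # U' > 0 and U'' < 0
U_inv <- function(y) -log(-y) / a

alpha <- 2.5                               # alpha > 0
beta  <- -1                                # any real constant
V     <- function(x) alpha * U(x) + beta   # positive affine transformation
V_inv <- function(y) U_inv((y - beta) / alpha)

x <- c(0.5, 1, 3)                          # hypothetical outcomes
p <- c(0.5, 0.3, 0.2)                      # their probabilities

ce_U <- ce(U, U_inv, x, p)
ce_V <- ce(V, V_inv, x, p)
```

The two certainty equivalents coincide up to rounding, so decisions ranked by `ce_U` and by `ce_V` are the same.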
Appendix C
Computational Notes
The statistical computations in this thesis are written in R (R Core Team, 2015) and
were run on a MacBook Pro (mid 2012) with an Intel Core i7 2.9 GHz CPU.
## ---- myrcode1
# two-time-period system

# pre-defined variables
Cf <- 2      # failure cost
Cr <- 1      # repair cost
Cm <- 0.5    # maintenance cost
alpha <- 1   # time effect parameter, discount rate: <= 1

library(truncnorm)
delta <- 0.1                                       # increment
theta.poss <- seq(from = 1, to = 10, by = delta)   # possible values of theta
theta.mass <- dtruncnorm(theta.poss, a = 1, b = Inf, mean = 2, sd = 1)
theta.prob <- theta.mass / sum(theta.mass)         # probability of theta

# possible values and lengths of tf1, tf2, tm1 and tm2
tf1.poss <- seq(from = delta, to = 6, by = delta)
tf2.poss <- seq(from = delta, to = 6, by = delta)
tm1.poss <- seq(from = delta, to = 6, by = delta)
tm2.poss <- seq(from = delta, to = 6, by = delta)
tf1.n <- length(tf1.poss)
tf2.n <- length(tf2.poss)
tm1.n <- length(tm1.poss)
tm2.n <- length(tm2.poss)

# compute tf2gtf1.prob: p(tf2 | tf1)
# define a matrix for the joint mass distribution of tf1 & tf2
# rows are possible tf1 and columns are possible tf2
tf2gtf1.upper <- matrix(0, nrow = tf1.n, ncol = tf2.n)
for (i in 1:tf1.n) {
  for (j in 1:tf2.n) {
    tf2gtf1.upper[i, j] <- sum(dweibull(tf1.poss[i], shape = theta.poss, scale = 1) *
                               dweibull(tf2.poss[j], shape = theta.poss, scale = 1) *
                               theta.prob)
    # tf2gtf1.upper[i, j] <- sum(dgamma(tf1.poss[i], shape = theta.poss, scale = 1) *
    #                            dgamma(tf2.poss[j], shape = theta.poss, scale = 1) *
    #                            theta.prob)
  }
}
# joint prob of tf1 and tf2: p(tf1, tf2)
tf2gtf1.upper.prob <- tf2gtf1.upper / sum(tf2gtf1.upper)
# marginal prob of tf1: p(tf1)
tf2gtf1.lower.prob <- rowSums(tf2gtf1.upper.prob)
# conditional prob of tf2 given tf1: p(tf2 | tf1)
tf2gtf1.prob <- exp(log(tf2gtf1.upper.prob) - log(tf2gtf1.lower.prob))

# plot tf2gtf1.prob: p(tf2 | tf1)
# convert matrix tf2gtf1.prob to a data frame with possible tf2
tf2gtf1.prob.frame <- data.frame(t(tf2gtf1.prob), Tf2 = tf2.poss)
# convert the data frame from "wide" format to "long" format
library(reshape2)
tf2gtf1.prob.long <- melt(tf2gtf1.prob.frame,
                          id.vars = "Tf2",
                          variable.name = "Tf1",
                          value.name = "Density")
levels(tf2gtf1.prob.long$Tf1) <- tf1.poss  # set the variables as possible tf1
library(grDevices)
library(ggplot2)
ggplot(data = tf2gtf1.prob.long,  # tf1 from delta to 6
       aes(x = Tf2, y = Density, colour = Tf1)) +
  geom_line()
# ggplot(data = tf2gtf1.prob.long[1:1800, ],  # tf1 from delta to 3
#        aes(x = Tf2, y = Density, colour = Tf1)) +
#   geom_line()
# ggplot(data = tf2gtf1.prob.long[1801:3600, ],  # tf1 from 3 + delta to 6
#        aes(x = Tf2, y = Density, colour = Tf1)) +
#   geom_line()


max.prob.tf2 <- apply(tf2gtf1.prob, 1, max)
max.tf2 <- delta * apply(tf2gtf1.prob, 1, which.max)
# create a data frame
# highest probability, max tf2 given tf1
hpmtf2gtf1 <- data.frame(tf1.poss, max.tf2, max.prob.tf2)

# for each tf1.poss, the highest prob of tf2
ggplot(hpmtf2gtf1, aes(x = tf1.poss, y = max.prob.tf2)) +
  geom_point() +
  geom_line() +
  geom_text(aes(x = tf1.poss + 0.08, label = max.tf2), size = 2, hjust = 0) +
  xlab("Tf1") + ylab("Highest probability value w.r.t Tf2")
# for each tf1.poss, the tf2 that has the highest prob
ggplot(hpmtf2gtf1, aes(x = tf1.poss, y = max.tf2)) +
  geom_point() +
  geom_text(aes(y = max.tf2 + 0.01, label = round(max.prob.tf2, 2)),
            size = 2, vjust = 0) +
  xlab("Tf1") + ylab("Tf that has the highest probability")
# for each tf1.poss, the tf2 that has the highest prob and its prob
ggplot(hpmtf2gtf1, aes(x = max.tf2, y = max.prob.tf2)) +
  geom_point() +
  geom_text(aes(x = max.tf2 + 0.015, label = tf1.poss),
            size = 2, hjust = 0) +
  xlab("Tf2 that has the highest probability given Tf1") +
  ylab("Highest probability value w.r.t Tf2")
## scatter 3d plot for tf1.poss, max.tf2 and max.prob.tf2
# library(scatterplot3d)
# scatterplot3d(tf1.poss, max.tf2, max.prob.tf2,
#               xlab = "Tf1", ylab = "Tf2 that has highest density given Tf1",
#               zlab = "Density")

# compute centf2gtf1.prob, i.e., p(tf2 > tm2 | tf1)
centf2gtf1.prob <- matrix(0, nrow = tf1.n, ncol = tm2.n)
for (i in 1:tf1.n) {
  for (j in 1:(tm2.n - 1)) {
    centf2gtf1.prob[i, j] <- sum(tf2gtf1.prob[i, (j + 1):tm2.n])
  }
}

# compute tf2gtm1.prob, i.e., p(tf2 | tf1 > tm1)
# define a vector representing p(tf1 > tm1), of which the length is tm1.n
tf2gtm1.lower.prob <- rep(0, tm1.n)
# define a matrix for the joint prob of tf1 > tm1 and tf2, i.e., p(tf1 > tm1, tf2)
# rows are possible tm1 and columns are possible tf2
tf2gtm1.upper.prob <- matrix(0, nrow = tm1.n, ncol = tf2.n)
for (i in 1:tm1.n) {
  # tf2gtm1.lower.prob is p(tf1 > tm1):
  # for a given tm1, sum p(tf1) of which tf1 > tm1
  tf2gtm1.lower.prob[i] <- sum(tf2gtf1.lower.prob[tf1.poss > tm1.poss[i]])
  for (j in 1:tf2.n) {
    # tf2gtm1.upper.prob is the joint p(tf1 > tm1, tf2):
    # for a given tm1, sum p(tf1, tf2) of which tf1 > tm1 and corresponding tf2
    tf2gtm1.upper.prob[i, j] <- sum(tf2gtf1.upper.prob[tf1.poss > tm1.poss[i], j])
  }
}
# tf2gtm1.prob is p(tf2 | tf1 > tm1)
# use [-tm1.n] to discard the last row as it is NaN:
# p(tf2 | tf1 > tm1.poss[tm1.n]) = 0 / 0, which is NaN
tf2gtm1.prob <- exp(log(tf2gtm1.upper.prob[-tm1.n, ]) -
                    log(tf2gtm1.lower.prob[-tm1.n]))

# compute centf2gtm1, i.e., p(tf2 > tm2 | tf1 > tm1)
centf2gtm1.prob <- matrix(0, nrow = tm1.n - 1, ncol = tm2.n)
for (i in 1:(tm1.n - 1)) {
  for (j in 1:(tm2.n - 1)) {
    centf2gtm1.prob[i, j] <- sum(tf2gtm1.prob[i, (j + 1):tm2.n])
  }
}

# for preventive maintenance (PM)
# we assume that the failure rates increase after PM
# i.e., p(tf2 | tf1 > tm1) > p(tf2 | tf1)
# prop is an integer from 1 to ncol - 1:
# the number of probabilities taken out of (set to 0 in) each row
prop <- 55
prop.tf2gtm1.prob <- matrix(0, nrow = tm1.n - 1, ncol = tf2.n)
# take the sum of the last prop entries (tf2.n - prop + 1 to ncol) of
# p(tf2 | tf1 > tm1), allocate it to the first tf2.n - prop entries
# in proportion to their share of their own sum,
# and replace the last prop entries with 0.
# note: row 1 of tf2gtm1.prob is used for every i,
# so all rows of prop.tf2gtm1.prob are identical
for (i in 1:(tm1.n - 1)) {
  prop.tf2gtm1.prob[i, ] <- c(tf2gtm1.prob[1, 1:(tf2.n - prop)] +
                                sum(tf2gtm1.prob[1, (tf2.n - prop + 1):tf2.n]) *
                                tf2gtm1.prob[1, 1:(tf2.n - prop)] /
                                sum(tf2gtm1.prob[1, 1:(tf2.n - prop)]),
                              rep(0, prop))
}
# compute new p(tf2 > tm2 | tf1 > tm1)
cenprop.tf2gtm1.prob <- matrix(0, nrow = tm1.n - 1, ncol = tm2.n)
for (i in 1:(tm1.n - 1)) {
  for (j in 1:(tm2.n - 1)) {
    cenprop.tf2gtm1.prob[i, j] <- sum(prop.tf2gtm1.prob[i, (j + 1):tm2.n])
  }
}

par(mfrow = c(3, 2))
# one panel per threshold tm1; row r of tf2gtm1.prob corresponds to tm1.poss[r]
panel.rows <- c(10, 20, 30, 40, 50, 59)
panel.tm1 <- c("1", "2", "3", "4", "5", "5.9")
for (k in seq_along(panel.rows)) {
  r <- panel.rows[k]
  plot(tf1.poss, tf2gtf1.lower.prob, type = "l", col = "black", lty = 1,
       xlab = "Tf2", ylab = paste0("p(tf2|tf1>", panel.tm1[k], ")"))
  lines(tf2.poss, tf2gtm1.prob[r, ], col = "red", lty = 2)
  lines(tf2.poss, prop.tf2gtm1.prob[r, ], col = "blue", lty = 3)
  abline(v = tf2.poss[which.max(tf2gtf1.lower.prob)], col = "black", lty = 1)
  abline(v = tf2.poss[which.max(tf2gtm1.prob[r, ])], col = "red", lty = 2)
  abline(v = tf2.poss[which.max(prop.tf2gtm1.prob[r, ])], col = "blue", lty = 3)
}

# p(tf1) vs p(tf2 | tf1 > tm1* = 0.5) (CM) vs p(tf2 | tf1 > tm1* = 0.6) (PM)
par(mfrow = c(1, 1))
plot(tf2.poss, prop.tf2gtm1.prob[6, ], type = "l", col = "blue", lty = 2, lwd = 2.5,
     xlab = "Tf2", ylab = "p(tf2|tf1>tm1*)")
lines(tf2.poss, tf2gtm1.prob[5, ], col = "red", lty = 2, lwd = 2.5)
lines(tf1.poss, tf2gtf1.lower.prob, col = "black", lty = 1, lwd = 2.5)
legend(3.5, 0.30,
       c("p(tf1)", "p(tf2|tf1>tm1*=0.5) (CM)", "p(tf2|tf1>tm1*=0.6) (PM)"),
       lty = c(1, 2, 2),
       lwd = c(2.5, 2.5, 2.5),
       col = c(1, 2, 4))

######
# dynamic programming

# calculate expected cost per unit time at chance nodes CN21 and CN22,
# i.e., euc21 and euc22.

# euc21.upper is a cost matrix related to tf1 and tf2 <= tm2,
# for each row (given a possible tf1),
# each entry is the cost given a possible tm2;
# e.g., euc21.upper[1, 2] is the average cost per unit time
# when tf1 = tf1.poss[1], and tf2 = tf2.poss[1], tf2.poss[2],
# where tm2 = tm2.poss[2],
# i.e., euc21.upper[1, 2] = utility(tf1 = tf1.poss[1], tf2 = tf2.poss[1]) +
#                           utility(tf1 = tf1.poss[1], tf2 = tf2.poss[2]).
euc21.upper <- matrix(0, nrow = tf1.n, ncol = tm2.n)
for (i in 1:tf1.n) {
  euc21.upper[i, 1] <- (1 + alpha) * (Cf + Cr) /
    (tf1.poss[i] + tf2.poss[1]) *
    tf2gtf1.prob[i, 1]
  for (j in 2:tm2.n) {
    euc21.upper[i, j] <- euc21.upper[i, j - 1] +
      (1 + alpha) * (Cf + Cr) /
      (tf1.poss[i] + tf2.poss[j]) *
      tf2gtf1.prob[i, j]
  }
}

# euc21.lower is a cost matrix related to tf1 and tf2 > tm2,
# for each row (given a possible tf1),
# each entry is the cost given a possible tm2;
# e.g., euc21.lower[1, 2] is the average cost per unit time
# when tf1 = tf1.poss[1] and tm2 = tm2.poss[2],
# i.e., euc21.lower[1, 2] = utility(tf1 = tf1.poss[1], tm2 = tm2.poss[2]).
euc21.lower <- matrix(0, nrow = tf1.n, ncol = tm2.n)
for (i in 1:tf1.n) {
  for (j in 1:(tm2.n - 1)) {
    euc21.lower[i, j] <- (Cf + Cr + alpha * Cm) /
      (tf1.poss[i] + tm2.poss[j]) *
      centf2gtf1.prob[i, j]
  }
}

# add the two matrices above, i.e., euc21.upper + euc21.lower,
# to give the matrix of expected cost per unit time at chance node CN21,
# i.e., euc21,
# which is associated with given tf1 and tm2,
# e.g., euc21[1, 2] is the expected cost per unit time
# when tf1 = tf1.poss[1] and tm2 = tm2.poss[2].
euc21 <- euc21.upper + euc21.lower

# cn21.tm2 is the tm2 that minimises the cost
# cn21.min is the minimal cost
cn21.tm2 <- rep(0, tf1.n)
cn21.min <- rep(0, tf1.n)
for (i in 1:tf1.n) {
  # for a given tf1,
  # the tm2 that minimises the cost and the associated cost.
  cn21.tm2[i] <- tm2.poss[which.min(euc21[i, ])]
  cn21.min[i] <- min(euc21[i, ])
}

# euc22.upper is a cost matrix related to tm1 and tf2 <= tm2,
# for each row (given a possible tm1),
# each entry is the cost given a possible tm2;
# e.g., euc22.upper[1, 2] is the average cost per unit time
# when tm1 = tm1.poss[1], and tf2 = tf2.poss[1], tf2.poss[2],
# where tm2 = tm2.poss[2],
# i.e., euc22.upper[1, 2] = utility(tm1 = tm1.poss[1], tf2 = tf2.poss[1]) +
#                           utility(tm1 = tm1.poss[1], tf2 = tf2.poss[2]).
euc22.upper <- matrix(0, nrow = tm1.n - 1, ncol = tm2.n)
for (i in 1:(tm1.n - 1)) {
  euc22.upper[i, 1] <- (Cm + alpha * (Cf + Cr)) /
    (tm1.poss[i] + tf2.poss[1]) *
    prop.tf2gtm1.prob[i, 1]
  for (j in 2:tm2.n) {
    euc22.upper[i, j] <- euc22.upper[i, j - 1] +
      (Cm + alpha * (Cf + Cr)) / (tm1.poss[i] + tf2.poss[j]) *
      prop.tf2gtm1.prob[i, j]
  }
}

# euc22.lower is a cost matrix related to tm1 and tf2 > tm2,
# for each row (given a possible tm1),
# each entry is the cost given a possible tm2;
# e.g., euc22.lower[1, 2] is the average cost per unit time
# when tm1 = tm1.poss[1] and tm2 = tm2.poss[2],
# i.e., euc22.lower[1, 2] = utility(tm1 = tm1.poss[1], tm2 = tm2.poss[2]).
euc22.lower <- matrix(0, nrow = tm1.n - 1, ncol = tm2.n)
for (i in 1:(tm1.n - 1)) {
  for (j in 1:(tm2.n - 1)) {
    euc22.lower[i, j] <- (1 + alpha) * Cm /
      (tm1.poss[i] + tm2.poss[j]) *
      cenprop.tf2gtm1.prob[i, j]
  }
}

# add the two matrices above, i.e., euc22.upper + euc22.lower,
# to give the matrix of expected cost per unit time at chance node CN22,
# i.e., euc22,
# which is associated with given tm1 and tm2,
# e.g., euc22[1, 2] is the expected cost per unit time
# when tm1 = tm1.poss[1] and tm2 = tm2.poss[2].
euc22 <- euc22.upper + euc22.lower

# cn22.tm2 is the tm2 that minimises the cost
# cn22.min is the minimal cost; it has length tm1.n (last entry stays 0)
# so that it lines up with p(tf1 > tm1), which also has length tm1.n
cn22.tm2 <- rep(0, tm1.n - 1)
cn22.min <- rep(0, tm1.n)
for (i in 1:(tm1.n - 1)) {
  # for a given tm1,
  # the tm2 that minimises the cost and the associated cost.
  cn22.tm2[i] <- tm2.poss[which.min(euc22[i, ])]
  cn22.min[i] <- min(euc22[i, ])
}

######
# calculate expected cost per unit time at chance node CN1, i.e., cn1.

# compute p(tf1 > tm1), i.e., centf1.prob
# p(tf1) is tf2gtf1.lower.prob
centf1.prob <- rep(0, tm1.n)
for (i in 1:(tm1.n - 1)) {
  centf1.prob[i] <- sum(tf2gtf1.lower.prob[(i + 1):tm1.n])
}

# upper21 is a cost vector related to tf1 <= tm1,
# each element is the cost given a possible tm1;
# e.g., upper21[2] is the average cost per unit time
# when tf1 = tf1.poss[1], tf1.poss[2], where tm1 = tm1.poss[2],
# i.e., upper21[2] = utility(tf1 = tf1.poss[1]) + utility(tf1 = tf1.poss[2]).
upper21 <- rep(0, tm1.n)
upper21[1] <- (cn21.min * tf2gtf1.lower.prob)[1]
for (i in 2:tm1.n) {
  upper21[i] <- upper21[i - 1] + (cn21.min * tf2gtf1.lower.prob)[i]
}

# lower22 is a cost vector related to tm1,
# each element is the cost given a possible tm1;
# e.g., lower22[2] is the average cost per unit time
# when tm1 = tm1.poss[2],
# i.e., lower22[2] = utility(tm1.poss[2]).
lower22 <- cn22.min * centf1.prob

# add the two vectors, i.e., upper21 + lower22,
# to give the vector of expected cost per unit time at chance node CN1,
# i.e., cn1,
# which is associated with given tm1,
# e.g., cn1[2] is the expected cost per unit time
# when tm1 = tm1.poss[2].
cn1 <- upper21 + lower22

ucn1.frame <- data.frame(Tm1 = tm1.poss, CN21 = upper21, CN22 = lower22, CN1 = cn1)
# convert the data frame from "wide" format to "long" format
library(reshape2)
ucn1.long <- melt(ucn1.frame,
                  id.vars = "Tm1",
                  variable.name = "ChanceNodes",
                  value.name = "ExpectedCost")
levels(ucn1.long$Tm1) <- tm1.poss
ggplot(data = ucn1.long,
       aes(x = Tm1, y = ExpectedCost, colour = ChanceNodes)) +
  geom_line() +
  xlab("Tm1") + ylab("Expected Cost") +
  labs(colour = "Chance Nodes")  # the legend belongs to the colour aesthetic, not fill

# cn1.tm1 is the tm1 that minimises the cost at chance node CN1
# cn1.min is the minimal cost at chance node CN1
cn1.tm1 <- tm1.poss[which.min(cn1)]
cn1.min <- min(cn1)
cn1.tm1.opt <- cn1.tm1  # the optimal maintenance time for phase 1
cn1.opt <- cn1.min      # the minimal expected cost for the entire system

# go forward from phase 1 to phase 2,
# given the optimal maintenance time for phase 1, i.e., cn1.tm1.opt

# cn21.tm2.opt and cn21.opt are
# the optimal maintenance time and minimal expected cost for phase 2
# when tf1 <= cn1.tm1.opt.
cn21.tm2.opt <- cn21.tm2[tf1.poss <= cn1.tm1.opt]
cn21.opt <- cn21.min[tf1.poss <= cn1.tm1.opt]
# cn22.tm2.opt and cn22.opt are
# the optimal maintenance time and minimal expected cost for phase 2
# when tf1 > cn1.tm1.opt.
cn22.tm2.opt <- cn22.tm2[tm1.poss == cn1.tm1.opt]
cn22.opt <- cn22.min[tm1.poss == cn1.tm1.opt]

# print results
print(cn1.tm1.opt)
print(cn1.opt)

print(tf1.poss[tf1.poss <= cn1.tm1.opt])  # tf1 values not exceeding the optimal tm1
print(cn21.tm2.opt)
print(cn21.opt)

print(cn22.tm2.opt)
print(cn22.opt)

######
# myopic

# for chance node CN1

# mcn1.upper is a cost vector related to tf1 <= tm1,
# each element is the cost given a possible tm1;
# e.g., mcn1.upper[2] is the average cost per unit time
# when tf1 = tf1.poss[1], tf1.poss[2], where tm1 = tm1.poss[2],
# i.e., mcn1.upper[2] = utility(tf1 = tf1.poss[1]) + utility(tf1 = tf1.poss[2]).
mcn1.upper <- rep(0, tm1.n)
mcn1.upper[1] <- (((Cf + Cr) / tf1.poss) * tf2gtf1.lower.prob)[1]
for (i in 2:tm1.n) {
  mcn1.upper[i] <- mcn1.upper[i - 1] + (((Cf + Cr) / tf1.poss) * tf2gtf1.lower.prob)[i]
}
# mcn1.lower is a cost vector related to tm1,
# each element is the cost given a possible tm1;
# e.g., mcn1.lower[2] is the average cost per unit time
# when tm1 = tm1.poss[2],
# i.e., mcn1.lower[2] = utility(tm1.poss[2]).
mcn1.lower <- (Cm / tm1.poss) * centf1.prob
mcn1 <- mcn1.upper + mcn1.lower
mcn1.tm1 <- tm1.poss[which.min(mcn1)]  # the optimal tm1
mcn1.min <- min(mcn1)                  # the minimal cost for CN1

# for chance node CN21
# tf1 <= optimal tm1
mtf1.n <- length(subset(tf1.poss, tf1.poss <= mcn1.tm1))
mcn21.upper <- matrix(0, nrow = mtf1.n, ncol = tm2.n)
for (i in 1:mtf1.n) {
  mcn21.upper[i, 1] <- (1 + alpha) * (Cf + Cr) / (tf1.poss[i] + tf2.poss[1]) *
    tf2gtf1.prob[i, 1]
  for (j in 2:tm2.n) {
    mcn21.upper[i, j] <- mcn21.upper[i, j - 1] +
      (1 + alpha) * (Cf + Cr) / (tf1.poss[i] + tf2.poss[j]) *
      tf2gtf1.prob[i, j]
  }
}
# the last column of mcn21.lower is left at zero
mcn21.lower <- matrix(0, nrow = mtf1.n, ncol = tm2.n)
for (i in 1:mtf1.n) {
  for (j in 1:(tm2.n - 1)) {
    mcn21.lower[i, j] <- (Cf + Cr + alpha * Cm) / (tf1.poss[i] + tm2.poss[j]) *
      centf2gtf1.prob[i, j]
  }
}
mcn21 <- mcn21.upper + mcn21.lower
mcn21.tm2 <- rep(0, mtf1.n)
mcn21.min <- rep(0, mtf1.n)
for (i in 1:mtf1.n) {
  mcn21.tm2[i] <- tm2.poss[which.min(mcn21[i, ])]
  mcn21.min[i] <- min(mcn21[i, ])
}

# for chance node CN22
# tf1 > optimal tm1
mcn22.upper <- rep(0, tm2.n)
mcn22.upper[1] <- (Cm + alpha * (Cf + Cr)) / (mcn1.tm1 + tf2.poss[1]) *
  prop.tf2gtm1.prob[which.min(mcn1), 1]
for (i in 2:tm2.n) {
  mcn22.upper[i] <- mcn22.upper[i - 1] +
    (Cm + alpha * (Cf + Cr)) / (mcn1.tm1 + tf2.poss[i]) *
    prop.tf2gtm1.prob[which.min(mcn1), i]
}
mcn22.lower <- (1 + alpha) * Cm / (mcn1.tm1 + tm2.poss) *
  cenprop.tf2gtm1.prob[which.min(mcn1), ]
mcn22 <- mcn22.upper + mcn22.lower
mcn22.tm2 <- tm2.poss[which.min(mcn22)]
mcn22.min <- min(mcn22)

print(mcn1.tm1)
print(mcn1.min)
print(mcn21.tm2)
print(mcn21.min)
print(mcn22.tm2)
print(mcn22.min)

######
# simulation
set.seed(2014)

n <- 1000  # number of theta
# rtruncnorm() is provided by the truncnorm package
sim.theta <- rtruncnorm(n, a = 1, b = Inf, mean = 2, sd = 1)

# generate tf1 and tf2, which follow a Weibull distribution
# given shape parameter theta
tf1 <- rep(0, n)
tf2 <- rep(0, n)

k <- 1
while (k <= n) {
  x <- rweibull(1, shape = sim.theta[k], scale = 1)
  y <- rweibull(1, shape = sim.theta[k], scale = 1)
  # x <- rgamma(1, shape = sim.theta[k], scale = 1)
  # y <- rgamma(1, shape = sim.theta[k], scale = 1)
  # keep the pair only if neither time rounds to zero at one decimal place
  if ((round(x, 1) != 0) && (round(y, 1) != 0)) {
    tf1[k] <- x
    tf2[k] <- y
    k <- k + 1
  }
}

# compute simulated cost for chance node CN1
cost.cn <- rep(0, n)
for (i in 1:n) {
  tf1.i <- round(tf1[i], 1)  # simulated failure times rounded to one decimal place
  tf2.i <- round(tf2[i], 1)
  if (tf1.i <= cn1.tm1.opt) {
    # phase 1 ends in failure: look up the phase 2 policy for this tf1
    # (the index tf1.i * 10 presumes tf1.poss = 0.1, 0.2, ...;
    # round() guards against floating-point error in the product)
    tm2.i <- cn21.tm2.opt[round(tf1.i * 10)]
    if (tf2.i <= tm2.i) {
      cost.cn[i] <- (1 + alpha) * (Cf + Cr) / (tf1.i + tf2.i)
    } else {
      cost.cn[i] <- (Cf + Cr + alpha * Cm) / (tf1.i + tm2.i)
    }
  } else {
    # phase 1 ends in preventive maintenance at cn1.tm1.opt
    if (tf2.i <= cn22.tm2.opt) {
      cost.cn[i] <- (Cm + alpha * (Cf + Cr)) / (cn1.tm1.opt + tf2.i)
    } else {
      cost.cn[i] <- (1 + alpha) * Cm / (cn1.tm1.opt + cn22.tm2.opt)
    }
  }
}
# compute simulated expected cost for chance node CN1
mean(cost.cn)
pm 2tp.R
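The myopic calculation for chance node CN1 in the listing above can be illustrated on a small self-contained example. The sketch below is not part of pm 2tp.R: the costs, the time grid t.poss, and the fixed Weibull shape of 2 are hypothetical values chosen only for illustration (in the listing the shape parameter is uncertain, and the probability vectors are computed elsewhere). It shows that the explicit accumulation loop for the failure branch is a cumulative sum, and that the optimal maintenance time is read off with which.min.

```r
# Illustrative sketch only: toy inputs, not the thesis inputs.
Cf <- 5; Cr <- 2; Cm <- 1         # hypothetical failure, repair and maintenance costs
t.poss <- seq(0.1, 2, by = 0.1)   # hypothetical common grid for tf1 and tm1

# Discretised Weibull(shape = 2, scale = 1) on the grid (assumed fixed shape):
# p.fail[i] = P(t.poss[i] - 0.1 < Tf1 <= t.poss[i])
p.fail <- diff(pweibull(c(0, t.poss), shape = 2, scale = 1))
# p.surv[i] = P(Tf1 > t.poss[i]), i.e. the system survives until maintenance at t.poss[i]
p.surv <- pweibull(t.poss, shape = 2, scale = 1, lower.tail = FALSE)

# Failure branch: cost per unit time accumulated over all failure times up to tm1
upper <- cumsum((Cf + Cr) / t.poss * p.fail)
# Maintenance branch: cost per unit time if the system is maintained at tm1
lower <- Cm / t.poss * p.surv

cost <- upper + lower
tm1.opt <- t.poss[which.min(cost)]  # maintenance time with the smallest expected cost
cost.min <- min(cost)
print(tm1.opt)
print(cost.min)
```

Using cumsum avoids the separate initialisation of the first element and gives the same vector as the loop, provided the probability vector has one entry per grid point.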
References
Arrow, K. J. (1965), Aspects of the theory of risk bearing, in ‘The Theory of Risk Aversion’, Yrjö Jahnssonin Säätiö, Helsinki.
Asadzadeh, S. M. and Azadeh, A. (2014), ‘An integrated systemic model for optimization of condition-based maintenance with human error’, Reliability Engineering and System Safety 124, 117–131.
Bayes, T. (1763), ‘An essay towards solving a problem in the doctrine of chances. By the late Rev. Mr. Bayes, F. R. S. Communicated by Mr. Price, in a letter to John Canton, A. M. F. R. S.’, Philosophical Transactions 53, 370–418.
Bell, D. E. (1995), ‘A contextual uncertainty condition for behavior under risk’, Management Science 41, 1145–1150.
Celeux, C., Corset, F., Lannoy, A. and Richard, B. (2006), ‘Designing a Bayesian network for preventive maintenance from expert opinions in a rapid and reliable delay’, Reliability Engineering and System Safety 91(7), 849–856.
Chen, J., Li, K. and Lam, Y. (2010), ‘Bayesian computation for geometric process in maintenance problems’, Mathematics and Computers in Simulation 81(4), 771–781.
Chen, M. C., Hsu, C. M. and Chen, S. W. (2006), ‘Optimizing joint maintenance and stock provisioning policy for a multi-echelon spare part logistic network’, Journal of the Chinese Institute of Industrial Engineers 23(4), 289–302.
Damien, P., Galenko, A., Popova, E. and Hanson, T. (2007), ‘Bayesian semiparametric analysis for a single item maintenance optimisation’, European Journal of Operational Research 182(2), 794–805.
Davis, J., Hands, W. and Maki, U. (1988), Handbook of Economic Methodology, Edward Elgar.
Deb, K. (2005), Optimization for engineering design: Algorithms and examples, New Delhi: Prentice-Hall.
Dieulle, L., Bérenguer, C., Grall, A. and Roussignol, M. (2003), ‘Sequential condition-based maintenance scheduling for a deteriorating system’, European Journal of Operational Research 150(2), 451–461.
Ding, X., Puterman, M. and Bisi, A. (2002), ‘The censored newsvendor and the optimal acquisition of information’, Operations Research 50(3), 475–496.
Duffuaa, S., Ben-Daya, M., Al-Sultan, K. and Andijani, A. (2001), ‘A generic conceptual simulation model for maintenance systems’, Journal of Quality in Maintenance Engineering 7(3), 207–219.
Fishburn, P. C. (1970), Utility Theory for Decision Making, John Wiley & Sons, Inc.
Fisher, R. A. (1922), ‘On the mathematical foundations of theoretical statistics’, Philosophical Transactions of the Royal Society of London, Series A 222, 309–368.
Flood, B., Houlding, B., Wilson, S. P. and Vilkomir, S. (2010), ‘A probability model of system downtime with implications for optimal warranty design’, Quality and Reliability Engineering International 26(1), 83–96.
Holmberg, K., Adgar, A., Arnaiz, A., E., J., Mascolo, J. and Mekid, S. (2010), Springer, London.
Houlding, B. and Coolen, F. P. A. (2011), ‘Adaptive utility and trial aversion’, Journal of Statistical Planning and Inference 141(2), 734–747.
Houlding, B. and Wilson, S. P. (2011), ‘Consideration on the UK re-arrest hazard data analysis’, Law, Probability and Risk 10(4), 303–327.
Jeffreys, H. (1946), ‘An invariant form for the prior probability in estimation problems’, Proceedings of the Royal Society of London, Series A 186(1007), 453–461.
Kallen, M. and van Noortwijk, J. (2005), ‘Optimal maintenance decisions under imperfect inspection’, Reliability Engineering and System Safety 90(2-3), 177–185.
Kang, C. and Golay, M. (1999), ‘A Bayesian belief network-based advisory system for operational availability focused diagnosis of complex nuclear power systems’, Expert Systems with Applications 17, 21–32.
Kapliński, O. (2013), ‘The utility theory in maintenance and repair strategy’, Procedia Engineering 54, 604–614.
Keren, B. and Pliskin, J. S. (2006), ‘A benchmark solution for the risk-averse newsvendor problem’, European Journal of Operational Research 174, 1643–1650.
Kim, H., Kwon, Y. and Park, D. (2007), ‘Adaptive sequential preventive maintenance policy and Bayesian consideration’, Communications in Statistics: Theory & Methods 36(6), 1251–1269.
Laplace, P. S. (1774), ‘Mémoire sur la probabilité des causes par les évènemens’, Mémoires de Mathématique et de Physique 6, 621–656.
Lei, Y., Liu, J., Ni, J. and Lee, J. (2010), ‘Production line simulation using STPN for maintenance scheduling’, Journal of Intelligent Manufacturing 21(2), 213–221.
Li, W. and Pham, H. (2006), Statistical maintenance modelling for complex systems, in H. Pham, ed., ‘Springer Handbook of Engineering Statistics’, Springer, pp. 807–833.
Lie, C. H. and Chun, Y. H. (1986), ‘An algorithm for preventive maintenance policy’, IEEE Transactions on Reliability 35, 71–75.
Lin, D., Zuo, M. J. and Yam, R. C. M. (2000), ‘General sequential imperfect preventive maintenance models’, International Journal of Reliability Quality and Safety Engineering 7, 253–266.
Lin, D., Zuo, M. J. and Yam, R. C. M. (2001), ‘Sequential imperfect preventive maintenance models with two categories of failure modes’, Naval Research Logistics 48, 172–183.
Lindley, D. V. (1958), ‘Fiducial distributions and Bayes’ theorem’, Journal of the Royal Statistical Society, Series B 20(1), 102–107.
Manzini, R., Regattieri, A., Pham, H. and Ferrari, E. (2010), Maintenance for industrial systems, Springer, London.
McDaid, K. and Wilson, S. P. (2001), ‘Deciding how long to test software’, The Statistician 50(2), 117–134.
Nakagawa, T. and Osaki, S. (1975), ‘The discrete Weibull distribution’, IEEE Transactions on Reliability 24(5), 300–301.
Nguyen, D. G. and Murthy, D. N. G. (1981), ‘Optimal maintenance policy with imperfect preventive maintenance’, IEEE Transactions on Reliability 30(5), 496–497.
O’Hagan, A., Buck, C. E., Daneshkhah, A., Eiser, J. R., Garthwaite, P. H., Jenkinson, D. J., Oakley, J. E. and Rakow, T. (2006), Uncertain Judgements: Eliciting Experts’ Probabilities, Wiley.
Padmanabhan, V. and Rao, R. C. (1993), ‘Warranty policy and extended service contracts: theory and an application to automobiles’, Marketing Science 12, 230–247.
Pham, H. and Wang, H. (2006), Reliability and Optimal Maintenance, Springer, London.
Pratt, J. W. (1964), ‘Risk aversion in the small and in the large’, Econometrica 32(1/2), 122–136.
R Core Team (2015), R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. URL: http://www.R-project.org/
Rao, S. S. (2009), Engineering optimization theory and practice, Hoboken, N.J.: John Wiley & Sons.
Rigdon, S. E. and Basu, A. P. (2000), Statistical Methods for the Reliability of Repairable Systems, first edn, Wiley.
Roux, O., Jamali, M. A., Kadi, D. A. and Châtelet (2008), ‘Development of simulation and optimization platform to analyse maintenance policies performances for manufacturing systems’, International Journal of Computer Integrated Manufacturing 21(4), 407–414.
Savage, L. J. (1954), The Foundations of Statistical Inference, New York: John Wiley.
Schutz, J., Rezg, N. and Léger (2011), ‘Periodic and sequential preventive maintenance policies over a finite planning horizon with a dynamic failure law’, Journal of Intelligent Manufacturing 22, 523–532.
Sheu, S., Yeh, R., Lin, Y. and Juang, M. (2001), ‘A Bayesian approach to an adaptive preventive maintenance model’, Reliability Engineering and System Safety 71, 33–44.
Triki, C., Alalawin, A. and Ghiani, G. (2013), Optimizing the performance of complex maintenance systems, in ‘2013 5th International conference on modeling, simulation and applied optimization, ICMSAO 2013’, IEEE, Piscataway, NJ, Hammamet, Tunisia, pp. 1–6.
Varian, H. R. (1992), Microeconomic Analysis, third edn, W.W. Norton & Company.
Vert, N., Volkova, A., Zegzhda, D. and Kalinin, M. (2015), ‘Maintenance of sustainable operation of pipeline-parallel computing systems in the cloud environment’, Automatic Control and Computer Sciences 49(8), 713–720.
Wang, H. and Pham, H. (2006), ‘Availability and maintenance of series systems subject to imperfect repair and correlated failure and repair’, European Journal of Operational Research 174(3), 1706–1722.
Wang, L., Chu, J. and Mao, W. (2008), ‘An optimum condition-based replacement and spare provisioning policy based on Markov chains’, Journal of Quality in Maintenance Engineering 14(4), 387–401.
Xiang, Y., Cassady, C. R. and Pohl, E. A. (2012), ‘Optimal maintenance policies for systems subject to a Markovian operating environment’, Computers and Industrial Engineering 62(1), 190–197.
Yang, I., Hsieh, Y. and Kung, L. (2012), ‘Parallel computing platform for multiobjective simulation optimization of bridge maintenance planning’, Journal of Construction Engineering and Management 138(2), 215–226.
Zequeira, R. I. and Berenguer, C. (2005), ‘Periodic imperfect preventive maintenance with two categories of competing failure modes’, Reliability Engineering and System Safety 91, 460–468.