Rel Code



for Windows, Version 9.02

Reliability and Replacement Analysis Software.

Nicholas A.J. Hastings

MIM Professor of Maintenance Engineering
Queensland University of Technology

Published by:
Albany Interactive Pty Ltd
22 Honeywood Court
Samford, Queensland
Australia 4520

Tel/Ans: 07 3289 1066

Fax: 07 3289 1077
RELCODE copyright  1974-2000 N.A.J.Hastings.
ABN 33 006 643 965

Title and Contents i


1. INTRODUCTION ..................................................... 1-1

1.1 Aim of RELCODE ..................................................... 1-1
1.2 Methods .................................................................. 1-1
1.3 Database.................................................................. 1-2
1.4 Introduction to Reliability Concepts ................................. 1-2
1.5 The Normal Distribution............................................... 1-3
1.6 The Weibull Distribution .............................................. 1-4
1.7 The Negative Exponential Distribution .............................. 1-4
1.8 The Three-Parameter Weibull Distribution ......................... 1-4
1.9 Bi-Weibull Distribution ................................................ 1-5

2. INSTALLING AND RUNNING RELCODE .................... 2-1

2.1 Installing RELCODE................................................... 2-1
2.2 Running RELCODE.................................................... 2-1
2.3 Initial Screens ........................................................... 2-1
2.4 Stopping RELCODE ................................................... 2-3

3. DATA AND DATA ENTRY......................................... 3-1

3.1 The Data ................................................................. 3-1
3.2 Failures and Suspensions .............................................. 3-1
3.3 Data for RELCODE.................................................... 3-2
3.4 Database.................................................................. 3-3
3.5 Creating Data for a New item - Header Data....................... 3-3
3.6 Event Data Entry (Failure and Suspension Data) .................. 3-4
3.6.1 Entering a Data Record for an Item ........................ 3-4
3.6.2 Amending Event Data ........................................ 3-5
3.6.3 Deleting an Event Record.................................... 3-5
3.6.4 Adding Event Records........................................ 3-5
3.6.5 Age Data Range ............................................... 3-6
3.7 Print the Data............................................................ 3-6
3.8 Data in Earlier Versions of RELCODE ............................. 3-6

4. RELIABILITY ANALYSIS ......................................... 4-1

4.1 Analysis Menu .......................................................... 4-1
4.2 Distribution Fitting ..................................................... 4-1
4.3 Go With Preferred Model ............................................. 4-2
4.4 Distribution Fitting - Results .......................................... 4-2
Data Summary ................................................. 4-2
Fitted Parameters .............................................. 4-2
Goodness of Fit Test.......................................... 4-2
Model Accuracy Test ......................................... 4-3
4.5 Save Parameters......................................................... 4-4
4.6 Conclusion Regarding Parameters ................................... 4-4
4.7 Details of Models Available........................................... 4-4
4.8 Relative Model Quality ................................................ 4-5

5. RELIABILITY GRAPHS............................................ 5-1
5.1 Graphs Menu ............................................................ 5-1
5.2 Reliability Function..................................................... 5-1
5.3 Plot Data on Weibull Scale ............................................ 5-2
5.4 Cumulative Distribution Function .................................... 5-3
5.5 Probability Density Function.......................................... 5-4
5.6 Hazard Function ........................................................ 5-4


6.1 Background .............................................................. 6-1
6.2 Failure Replacement and Age-Based Preventive Replacement... 6-1
6.3 Component Replacement Policies .................................... 6-2
6.4 Is Age Based Preventive Replacement Worthwhile? .............. 6-2
6.5 Cost Considerations in Replacement policies ....................... 6-3
6.6 Starting Replacement Analysis........................................ 6-4
6.7 Replacement Costs Data Entry........................................ 6-5
6.8 Replacement Analysis Menu .......................................... 6-6
6.9 Graph of Costs Versus Replacement Age ........................... 6-6
6.10 Calculate Cheapest Replacement Age ............................... 6-7
6.10.1 Currency Symbol.............................................. 6-7
6.11 Specified Preventive Replacement Age.............................. 6-8
6.12 Spare Parts Requirements - Age Based Replacement ............ 6-10
6.13 Conclusion .............................................................. 6-11

7. BLOCK REPLACEMENT POLICIES ........................... 7-1

7.1 Block Replacement Policy Definition................................ 7-1
7.2 Graph of Costs versus Block Replacement Interval ............... 7-2
7.3 Cheapest Block Replacement Interval ............................... 7-2
7.4 Specified Block Replacement Interval ............................... 7-3
7.5 Spare Parts Requirements - Block Replacement ................... 7-4
7.6 Life Distribution Function Tabulations .............................. 7-6


8.1 Confidence Limits ...................................................... 8-1
8.2 Getting Confidence Limits with RELCODE........................ 8-1
8.3 Confidence Limits Graph .............................................. 8-2
8.4 Mean Time Between Failures (MTBF) .............................. 8-3
8.5 Getting the MTBF and Confidence Limits .......................... 8-3


9.1 Inspection Intervals for Hidden Failures ........................... 9-1
9.2 Condition Monitoring Intervals ...................................... 9-3
9.3 Display Distribution Parameters...................................... 9-5
9.4 Previous Analysis Summary .......................................... 9-6

10. IMPORTING AND EXPORTING DATA....................... 10-1

10.1 Importing Data ......................................................... 10-1
10.2 Creating Importable ASCII Files .................................... 10-4
10.2.1 RELCODE Windows Data Standard Format ............ 10-4

10.2.2 Creating Data…using a Spreadsheet....................... 10-4
10.3 Exporting Data ......................................................... 10-5
10.4 Exporting Results to File or Clipboard ............................. 10-6


11.1 General ................................................................ 11-1
11.2 Definitions .............................................................. 11-1
11.2.1 Reliability as a Function of Operating Life............... 11-1
11.2.2 Reliability of one-Shot Devices ............................ 11-2
11.2.3 Failure.......................................................... 11-2
11.2.4 Reliability as Probability .................................... 11-2
11.2.5 Mean Time to Failure (MTTF) ............................ 11-2
11.2.6 Mean Time Between Failures (MTBF) ................... 11-3
11.3 Phases of Failure....................................................... 11-3
11.4 Bath Tub Curve ........................................................ 11-3
11.5 Other Failure Rate Patterns .......................................... 11-4
11.6 Importance of the Failure Rate Pattern ............................. 11-5
11.7 Failure Probability Density Function, F(T) ........................ 11-6
11.8 Reliability Function, R(T) ............................................ 11-7
11.9 Distribution Function, F(T) .......................................... 11-8
11.10 Relationship between Probability Density Function F(T) and
Distribution Functions F(T)................................. 11-8
11.11 Hazard Function, H(T) ............................................... 11-8
11.12 Conclusion .............................................................. 11-9

12. LIFE DISTRIBUTIONS ............................................ 12-1

12.1 Introduction ............................................................. 12-1
12.2 Negative Exponential Distribution .................................. 12-1
12.3 Weibull Distribution ................................................... 12-3
12.4 Weibull Graphs - Hazard Function.................................. 12-5
12.5 Weibull Graphs - Probability Density Function ................... 12-6
12.6 Weibull Graphs - Reliability Function .............................. 12-7
12.7 The Three Parameter Weibull Distribution ........................ 12-8
12.8 Bi-Weibull Distribution ............................................... 12-9
12.9 Derivation of Bi-Weibull Hastings Distribution ................. 12-10
12.10 Discussion............................................................. 12-12


13.1 Good as New and Bad as Old ........................................ 13-1
13.2 Failures and Suspensions ............................................. 13-1
13.3 Basic Logic of the Analysis .......................................... 13-2
13.4 Weibull Probability Paper ............................................ 13-3
13.5 Weibull Plotting ........................................................ 13-3
13.5.1 Example........................................................ 13-2
13.5.2 Order Number ................................................ 13-3
13.5.3 Cumulative Probability Estimator and Median Rank ... 13-4
13.6 Weibull Plot – Results................................................. 13-5
13.6.1 Characteristic Life (ETA, η) ............................... 13-5
13.6.2 Shape Parameter (BETA, β)................................ 13-5

13.7 Random Failures ....................................................... 13-6
13.7.1 Random Failures and the
Negative Exponential Distribution .................................. 13-6
13.8 Estimation of the MTBF .............................................. 13-7
13.8.1 Service Hours ................................................. 13-7
13.8.2 Point Estimate of the MTBF................................ 13-7
13.9 Confidence Limits – General Concepts............................. 13-7
13.9.1 One Sided and Two Sided Confidence Limits ........... 13-7
13.10 Confidence Limits for the MTBF ................................... 13-7
13.11 Confidence Limits – Procedure...................................... 13-8
13.12 Confidence Limits Example .......................................... 13-8
Confidence Limits Table…………………………………………….13-9
13.13 Exercise – Confidence Limits for the MTBF...................... 13-9


14.1 Suspended Items ....................................................... 14-1
14.2 Suspended Items – Data............................................... 14-1
14.3 Suspended Items – Formula .......................................... 14-2
14.4 Suspended Items – Example of Calculation........................ 14-2
14.5 Suspended Items – RELCODE Analysis ........................... 14-3
14.6 Bi-Weibull Distribution Example.................................... 14-5
14.7 Distribution Models and Fitting Methods Used by RELCODE ...... 14-10
14.7.1 Model 1. Weibull 2 Parameter fitted by
Linear Regression.................................................... 14-11
14.7.2 Model 2. Weibull 2 Parameter fitted by
Maximum Likelihood...................................... 14-11
14.7.3 Model Accuracy ............................................ 14-11
14.7.4 Goodness of Fit............................................. 14-12
14.7.5 Goodness of Fit Test – Examples ........................ 14-12
14.7.6 Model 3. Weibull 2 Parameter fitted by
Maximising the Model Accuracy ........................ 14-13
14.7.7 Model 4. Weibull 3 Parameter fitted by
Linear Regression .......................................... 14-14
14.7.8 Model 5. Weibull 3 Parameter fitted
Maximum Model Accuracy............................... 14-14
14.7.9 Model 6. Bi-Weibull Distribution fitted by
Maximum Model Accuracy............................... 14-14
14.8 Confidence Limits for Reliability – Theory...................... 14-14
14.9 Example of Calculation for Confidence Limits ................. 14-15
14.10 Conclusion ............................................................ 14-16


– ANALYTICAL METHODS ..................................... 15-1
15.1 Introduction ............................................................ 15-1
15.2 Replacement Only On Failure (ROOF)............................. 15-1
15.3 Age-Based Preventive Replacement Policy ........................ 15-1
15.4 Cost Minimization with Age-Based Preventive Replacement ... 15-3
15.4.1 Truncated Mean Life ........................................ 15-3
15.5 Spare Parts for Age Based Policy ................................... 15-5

15.6 Block Preventive Replacement Policy .............................. 15-5


16.1 No Failure Data ........................................................ 16-1
16.2 No Failure Data Screen ............................................... 16-1
16.3 The Three Point Estimate Method ................................... 16-2
16.4 Entering the Three Point Estimate into RELCODE ................ 16-3
16.5 Results .................................................................. 16-3
16.6 Theory of the Three Point Estimate Method ....................... 16-5
16.7 Conclusion .............................................................. 16-5




A Computer Software Package

Reliability and Replacement Analysis

1.1 Aim of RELCODE
The aim of RELCODE is to deal with several problems in the field of reliability and maintenance:

• The statistical analysis of reliability data.

• The determination of minimum cost replacement policies for components or assemblies.

• The estimation of spare parts requirements.

• The estimation of inspection intervals.

1.2 Methods
RELCODE uses well documented statistical and analytical methods which are widely accepted by
reliability and maintenance engineers. These include:

• Weibull Analysis - this is the classic reliability analysis technique

• Many recent extensions of Weibull analysis which take advantage of the power of the modern
computer

• Age and block replacement policy analysis

• Calculation of inspection intervals for hidden failures and for condition monitoring, as used for
example in Reliability Centered Maintenance

These techniques enable you to:

• Make a scientific appraisal of your reliability data

• Compare item reliability with specifications

• Compare the reliability of competing items

• Evaluate appropriate maintenance, safety and environmental policies

• Evaluate developments intended to improve reliability

• Generate tables and graphs that can be incorporated in word processor systems for reports

1. Introduction 1-1
RELCODE is thus a valuable tool for the reliability or maintenance professional.

1.3 Database
RELCODE incorporates a database (using Microsoft Access) which is designed to hold the
information required to carry out reliability and replacement analysis on your data. This includes
data on age at failure and on known successful performance of items which have run without
failure. It also has provision for entering cost and usage data and for recording the current
reliability parameters and replacement policy for your items. As you enter your data it is
automatically stored in this database and is then easily retrieved for analysis or amendment using
familiar Windows methods. There are also options for importing and exporting data in formats
compatible with spreadsheets, and for importing data from earlier versions of RELCODE.

1.4 Introduction to Reliability Concepts

Later chapters of this manual provide a tutorial course in reliability and replacement analysis. In this
chapter we provide a brief introduction to the main concepts as a starting point for using RELCODE.

The failure of equipment can usually be described by one or more of the following types of failure
pattern:

Decreasing Failure Rate, associated with EARLY LIFE failures,

Constant Failure Rate, associated with RANDOM failures,

Increasing Failure Rate, associated with WEAROUT failures.

The WEIBULL distribution can identify any one of the above failure patterns, i.e. Decreasing,
Constant or Increasing failure rate, but if more than one pattern is present it produces an averaging of
the several patterns.

A BI-WEIBULL distribution is based on combining two Weibull distributions. It can represent a
combination of two failure patterns. This allows for the situation where Early Life, Random or
Wearout failures occur in combination. The use of the Bi-Weibull distribution is important in
allowing more accurate assessments of reliability and more cost-effective component replacement
policies to be established than when the standard Weibull distribution is used alone.

By the use of these models, RELCODE enables relevant failure patterns to be determined from the
user's data. The mean life of the item can be estimated, as well as the reliability to any age, and
confidence limits for reliability.

Identifying the type of failure rate pattern helps you to determine the root cause as to why items are
failing. EARLY LIFE failure is normally associated with manufacture or installation defects. Items
which have survived the early life failure or burn-in period are more reliable than the average new
item. For improvements in reliability, the manufacture or installation problems must be eliminated,
or a period of accelerated aging introduced in order to eliminate early failures in service.

RANDOM failures can indicate a cause of failure which is external to the item itself, and which
causes a stress in excess of the design strength. An example is a nail in a tire or a brick through a
window. RANDOM failures also occur in complex and electronic equipment where there are a very
large number of possible causes of failure.

WEAROUT failures are due to wear or fatigue and occur in many mechanical components such as
gears or valves or in perishable items such as seals and filters. They may be sudden in nature or may
take the form of a slow degradation which eventually reaches a level which is formally defined as a
failure. The ability to detect the onset of wearout when superimposed on Early Life or Random
failures is a feature of the Bi-Weibull distribution.

For the maintenance engineer, major concerns are the setting of preventive maintenance intervals
and the forecasting of spare parts requirements. RELCODE assists the user in evaluating the
following three replacement policy options:

Replace Only On Failure

Preventive Replacement Based On Individual Component Age
Preventive Replacement of Components in a Block

Using standard techniques widely accepted by maintenance engineers, RELCODE determines
whether preventive replacement is cost effective. If so, it determines the cheapest preventive
replacement policy, and the financial savings achieved by the use of this policy. The average number
of spare parts required by the various policies, for levels of total component utilization specified by
the user, is also calculated.

1.5 The Normal Distribution

Users of statistics will be familiar with the Normal distribution, illustrated in Figure 1.1. This
distribution is important in many areas of statistics, but because it always has the same shape it is not
widely used in reliability analysis.

Figure 1.1. The Normal Probability Density Function

1.6 The Weibull Distribution
A more flexible distribution is the Weibull distribution, which has a constant which allows it to
assume a variety of shapes. This constant is called "Beta" (β) and is termed the "Shape Parameter".
Figure 1.2 illustrates the form that the Weibull takes for a variety of Beta values.

Figure 1.2.

All probability distributions have their own equation. The Weibull's is:

$$f(t) = \frac{\beta}{\eta}\left(\frac{t}{\eta}\right)^{\beta-1}\exp\left[-\left(\frac{t}{\eta}\right)^{\beta}\right]$$

It is not really important that we fully understand this equation, since RELCODE does all the
necessary calculations. The parameter eta (η) is called the characteristic life and is defined as the time
before which there is a 63.2% risk of failure. The 63.2% figure arises from the mathematics of the
Weibull equation, as (1 - (1/e)) where e is the base of natural logarithms. This is the value of the
Weibull cumulative distribution function when t = η. The distribution shown in Figure 1.2 is the
Two-Parameter Weibull, the two parameters being β and η.
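
The 63.2% property can be checked numerically. The following is a minimal Python sketch, not part of RELCODE; the function names and the value chosen for eta are assumptions made for illustration only. It evaluates the two-parameter Weibull density and cumulative distribution function and shows that the cumulative probability at t = eta is 1 - 1/e for any shape parameter beta.

```python
import math

def weibull_pdf(t, beta, eta):
    """Two-parameter Weibull probability density, evaluated for t > 0."""
    if t <= 0:
        return 0.0  # the density is only evaluated for positive ages here
    return (beta / eta) * (t / eta) ** (beta - 1) * math.exp(-((t / eta) ** beta))

def weibull_cdf(t, beta, eta):
    """Probability that an item fails before age t."""
    if t <= 0:
        return 0.0
    return 1.0 - math.exp(-((t / eta) ** beta))

# Hypothetical characteristic life, chosen only for illustration.
eta = 5000.0
for beta in (0.5, 1.0, 2.0, 3.5):
    # The same value, about 0.6321, is printed for every beta.
    print(beta, round(weibull_cdf(eta, beta, eta), 4))
```

Because (t/eta) equals 1 when t = eta, the exponent is -1 whatever the value of beta, which is why the characteristic life does not depend on the shape of the distribution.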

1.7 The Negative Exponential Distribution

The Negative Exponential Distribution is a special case of the Weibull distribution in which the
shape parameter β = 1. In this case the failure rate is a constant 1/η.
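
The link between the shape parameter and the failure rate pattern can be illustrated with the standard Weibull hazard function, h(t) = (beta/eta)(t/eta)^(beta-1). The Python sketch below is an illustration only, not RELCODE code; the function name, ages and parameter values are assumptions.

```python
def weibull_hazard(t, beta, eta):
    """Weibull hazard (instantaneous failure rate) at age t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1)

# Hypothetical characteristic life and ages, chosen only for illustration.
eta = 1000.0
ages = (100.0, 500.0, 2000.0)

for beta, pattern in ((0.5, "decreasing"), (1.0, "constant"), (3.0, "increasing")):
    rates = [weibull_hazard(t, beta, eta) for t in ages]
    print(pattern, rates)
```

With beta = 1 every age gives the same rate, 1/eta, which is the Negative Exponential case. A beta below 1 gives a rate that falls with age (early life failures), and a beta above 1 gives a rate that rises with age (wearout).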

1.8 The Three-Parameter Weibull Distribution

The Weibull Distribution can be made even more flexible by the introduction of a third parameter
which is called the "Minimum Life" and is usually denoted by the symbol gamma (γ). Graphically
the Three Parameter Weibull looks like Figure 1.3, if beta is approximately 2. The minimum life (which
is sometimes called a "Location Parameter" or a "Guarantee Life") is the time before which it is
assumed there is zero risk of failure.
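
The effect of the minimum life can be sketched in Python as follows. This is an illustration under stated assumptions, not RELCODE's own code: it uses the common textbook parameterization in which age is shifted by gamma and scaled by eta, which may differ from the form RELCODE uses internally, and the function name and numbers are hypothetical.

```python
import math

def weibull3_reliability(t, beta, eta, gamma):
    """Probability of surviving to age t under a three-parameter Weibull.

    Assumed form: R(t) = exp(-((t - gamma) / eta) ** beta) for t > gamma.
    """
    if t <= gamma:
        return 1.0  # no risk of failure before the minimum life
    return math.exp(-(((t - gamma) / eta) ** beta))

# Hypothetical values: minimum life 500 hours, beta approximately 2.
beta, eta, gamma = 2.0, 3000.0, 500.0
for age in (250.0, 500.0, 2000.0, 3500.0):
    print(age, round(weibull3_reliability(age, beta, eta, gamma), 4))
```

Under this assumed form the reliability stays at 1 up to age gamma and then falls away in the same manner as a two-parameter Weibull whose clock starts at gamma.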

Figure 1.3. Three Parameter Weibull Distribution

1.9 Bi-Weibull Distributions

Reliability analysis sometimes requires a distribution with even more flexibility than the two or three
parameter Weibull. Authors such as Kao (1959), Clark (1991) and Hastings and Ang (1995) have
proposed distributions based on combining two (or possibly more) Weibull models in various ways.
Distributions of this type are referred to as bi-Weibull distributions. Bi-Weibull distributions can be
formed by adding Weibull probability density functions, multiplying reliability functions, adding hazard
functions, or in other ways, and can involve varying numbers of parameters.

In RELCODE we make use of a specific type of bi-Weibull distribution, described in Hastings and Ang
(1995) as the Hastings distribution. This is formed by adding together a Weibull Two Parameter and a
Weibull Three Parameter Hazard Function, as detailed in Chapter 12. This provides a distribution with
five parameters, which can represent combinations of any two failure patterns, that is, any two of the
burn-in, random and wearout patterns. This can mean, for example, burn-in followed by wearout, or random
followed by another random with a higher failure rate. This provides an extremely flexible distribution
model which can fit the vast majority of observed failure patterns. If the data do not require the full five
parameters, it will revert to a Weibull distribution. Here we shall refer to this distribution as the
bi-Weibull distribution, unless it is necessary to distinguish between this particular form of bi-Weibull and
others, when we shall refer to it as the Hastings distribution, or the Hastings bi-Weibull distribution.
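
The idea of adding two hazard functions can be sketched in Python. This is only an illustration of the general principle, not the RELCODE implementation, and the exact parameterization is given in Chapter 12. The parameter names (eta1, beta1, gamma, eta2, beta2), the shifted form assumed for the three-parameter hazard, and the numerical values are all assumptions. Because hazards add, the cumulative hazards add, and the reliability is the exponential of minus their sum.

```python
import math

def biweibull_hazard(t, eta1, beta1, gamma, eta2, beta2):
    """Sum of a two-parameter and a three-parameter Weibull hazard, for t > 0."""
    h = (beta1 / eta1) * (t / eta1) ** (beta1 - 1)
    if t > gamma:
        # The second pattern only contributes after the minimum life gamma.
        h += (beta2 / eta2) * ((t - gamma) / eta2) ** (beta2 - 1)
    return h

def biweibull_reliability(t, eta1, beta1, gamma, eta2, beta2):
    """Reliability implied by the summed hazards."""
    cumulative = (t / eta1) ** beta1
    if t > gamma:
        cumulative += ((t - gamma) / eta2) ** beta2
    return math.exp(-cumulative)

# Hypothetical example: random failures throughout, wearout starting at 2000 hours.
params = dict(eta1=10000.0, beta1=1.0, gamma=2000.0, eta2=3000.0, beta2=3.0)
for age in (1000.0, 3000.0, 5000.0):
    print(age, biweibull_hazard(age, **params), biweibull_reliability(age, **params))
```

In this example the hazard is constant up to 2000 hours and rises afterwards, which is the "random followed by wearout" combination that a single Weibull would average out.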


2.1 Installing RELCODE

RELCODE runs under Windows 95 or later. To install RELCODE proceed as follows:

a. Start Windows.

b. Insert the CD provided in your CD drive.

c. RELCODE will normally automatically start to install, but if it does not then click Start, Run.
At the Command prompt type X:\SETUP where X is the letter reference of your CD drive and
press the Enter key or click the OK button.

d. Follow the prompts to install RELCODE.

2.2 Running RELCODE

Click Start, Programs. Locate the Reliability Analysis program group and click it, or locate and
then double click the RELCODE icon.

2.3 Initial Screens

When we run RELCODE the following screen appears:

2. Installing and Running 2-1

Figure 2.1. Start Screen

Click the Start button to go to the Item Header Screen shown in Figure 2.2.

Figure 2.2. Item Header Screen

You are now ready to enter data and carry out analysis, as discussed in subsequent chapters.

2.4 Stopping RELCODE

To exit from RELCODE click the Close button on the bottom right of the current screen. This will
take you back to the preceding screen. Continue until you reach the Start screen (Figure 2.1),
where you click the Exit button to exit RELCODE.



3.1 The Data

RELCODE is concerned with the analysis of the reliability of items which, when they fail, are
replaced by new items, or are repaired to a good-as-new standard. We shall take as an example a
drive belt on a crane hoist motor. The first thing to emphasize is that the data relating to the
component you are analyzing should be taken from records pertaining to the same kind of belt
working on the same kind of motor - in other words it should be as homogeneous as possible.

Let us assume that you have data which shows the total hours run by the crane at the time when drive
belt replacements occurred. To analyze the reliability of the belts, we need to know the age of the
individual belts when they were replaced, and whether the replacements were due to the belt failing or
for some other reason. We also need to know the ages of surviving belts which at the time of the
analysis have not failed but are still running. In other words, we need data on both the failures and
the successful performance of the components in question.

You should extract the data from your records into a worksheet in the style of Figure 3.1:

Figure 3.1 Drive Belt Data for Crane 1

Odometer Reading at Event (Hours)    Hours Between Events    F or S

 2634                                2634                    F
 9002                                6368                    S
13791                                4789                    F
17331                                3540 (Surviving)        S

3.2 Failures and Suspensions

The "F" or "S" column must be explained. Firstly note that “F” stands for Failure and “S” stands for
Suspension, whilst failures and suspensions are both referred to as “Events”.

"F" means that the component was replaced as a result of failure - it was a "Failure Replacement".
The concept of “Failure” can include situations where the belt actually fails in service, and situations
where it is deemed to have deteriorated to a point where it is prudent to replace it. The latter situation is
a “Near Failure Condition” replacement in which the condition is such that the belt can be regarded as
having failed. The definition of a failure event is at the discretion of the user, in that you could choose to
analyze failures of only one particular type, or you could regard all sources of failure as “failures” for a
particular study. The key thing is that it is important to use a consistent definition for failure events in
any given data set.

"S" refers to "Suspensions". Suspensions or suspended items are items for which we have successful
performance data, but which have not failed nor reached a near failure condition. One source of
suspensions can be Preventive Replacements carried out not as a result of failure, or even near failure

3. Data and Data Entry 3-1

conditions but perhaps because the crane was in the shop and it was an accepted practice to replace the
belt while other work was being done.

But there are other conditions where you would record "Suspensions", for example, if the crane were
taken out of service, or sold, or the engine were replaced with a new (or reconditioned) one, the old
drive belt being removed with the old engine. Another source of suspended items is items currently
running which have not failed. Thus, the last entry in Figure 3.1 is the current odometer reading for the
crane, and gives one more suspension "reading" on a time when the drive belt was still working. These
"suspensions" are an integral part of the RELCODE formula, and in fact allow the calculation of much
more accurate reliability and replacement solutions than would be possible if only failure data were used.

Returning to your worksheet, continue organizing your data in the same way as in Figure 3.1 for other
similar drive belts. As you can see, the age at each event is obtained by subtracting the previous
odometer reading from the reading at that event.
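
The subtraction described above can be sketched in a few lines of Python, using the readings from Figure 3.1. This is an illustration only and not part of RELCODE; the function name is hypothetical, and it assumes that a new belt is fitted at every recorded event, so that each belt starts at age zero.

```python
def event_ages(readings, start=0):
    """Convert cumulative odometer readings into belt ages at each event.

    readings: list of (odometer_reading, "F" or "S") in increasing order.
    start: odometer reading when the first belt was fitted (assumed 0 here).
    """
    ages = []
    previous = start
    for reading, kind in readings:
        ages.append((reading - previous, kind))
        previous = reading  # assumes a new belt starts at this reading
    return ages

# Odometer readings and event types taken from Figure 3.1.
crane_1 = [(2634, "F"), (9002, "S"), (13791, "F"), (17331, "S")]
print(event_ages(crane_1))
# [(2634, 'F'), (6368, 'S'), (4789, 'F'), (3540, 'S')]
```

The last pair is the belt still running at the time of the analysis, which is recorded as a suspension.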

3.3 Data For RELCODE

RELCODE analyses data relating to the failure or suspension of components or equipment. Up to 1008
entry records are allowed for a given type of item. Failure data about a component is quite simply the
time it took for it to fail or the distance it traveled before failure etc. (depending on the units involved).
Suspension data relates to those items which were withdrawn from use for some reason other than
failure, or continue in use without failure. For each component the following information is entered:

• The age of the item when it failed or was suspended (i.e. after how many hours or kilometers etc.).

• Whether it was a failure or suspension.

• If several failures occurred at the same age then we can enter a frequency. Similarly, if several
suspensions occurred at the same age we can enter a frequency. The default frequency for any entry
is 1. If both failures and suspensions occurred at the same age, make two separate entries, one for
failures and one for suspensions.

Figure 3.2 shows an example of data suitable for input into RELCODE. This data represents more
extensive drive belt reliability data than that shown in Figure 3.1. RELCODE will accept the data in any
sequence by age. RELCODE sorts the data by age during the course of data entry.

Figure 3.2. Example of RELCODE Data

Item Reference: Drive Belt
Age Unit: Hours

Age     F or S    Frequency

2634    F         1
1584    F         1
3540    S         1
4136    F         2
5579    F         1
4789    F         1
6368    S         3
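
The structure of this data, and the sorting by age that RELCODE performs, can be mimicked in a short Python sketch. This is an illustration only, not RELCODE's internal representation; the function name and the tuple layout (age, event type, frequency) are assumptions.

```python
def summarize(records):
    """Sort event records by age and count failures and suspensions.

    records: list of (age, kind, frequency) with kind "F" or "S".
    """
    for age, kind, frequency in records:
        if kind not in ("F", "S") or frequency < 1 or age < 0:
            raise ValueError(f"invalid record: {(age, kind, frequency)}")
    ordered = sorted(records, key=lambda record: record[0])
    failures = sum(freq for _, kind, freq in ordered if kind == "F")
    suspensions = sum(freq for _, kind, freq in ordered if kind == "S")
    return ordered, failures, suspensions

# Records from Figure 3.2, in the order in which they were entered.
drive_belt = [
    (2634, "F", 1), (1584, "F", 1), (3540, "S", 1), (4136, "F", 2),
    (5579, "F", 1), (4789, "F", 1), (6368, "S", 3),
]
ordered, failures, suspensions = summarize(drive_belt)
print([age for age, _, _ in ordered])  # [1584, 2634, 3540, 4136, 4789, 5579, 6368]
print(failures, suspensions)           # 6 4
```

The frequency column means that the seven records in Figure 3.2 represent ten items in total: six failures and four suspensions.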

3.4 Database

Data entered by the user into RELCODE is saved on a database nominated by the user and referred to as
a RELCODE database. A RELCODE database is a Microsoft Access database with filename extension
.MDB. RELCODE databases have a given structure and an appropriate database is supplied with the
RELCODE software package. Options also exist for importing data into RELCODE from files
generated by spreadsheets, or from files created by earlier versions of RELCODE. Details of this are
given in the chapter on Importing Data.

You can create a new RELCODE database by copying an existing one. For example, you can copy the
original database RELDAT97.MDB supplied with RELCODE. This contains a few demo items which
can optionally be deleted later. The new database can be opened by using the Change Database option
on the Item Header Screen.

3.5 Creating Data for a New Item - Header Data

The data for any item consists of Header Data and Event Data. To create data for a new item we
initially create the Header Data. To do this, click the Add Item button in the footer of the Item
Header Screen, shown in Figure 3.3. The fields in the bottom section of the screen will clear so
that new data can be entered.

Figure 3.3 Item Header Screen

Enter the Item Reference, up to 70 characters long, and the age unit, up to 10 characters long.
The age unit is the unit in which age at failure or suspension is measured, for example kilometers,
hours, cycles or miles. The name of the unit is purely descriptive, and the units appear in appropriate
positions in the output. In our example the units are
hours. Other parameters such as an Item Description and various cost data can also be entered at this
time, or can be added later. Click the Ok button to save the Item in your database. To enter failure and
suspension data for this item, click the Event Data Entry button on the right hand side of the screen. The
Event Data Screen will then appear. This is illustrated in Figure 3.4.

3.6 Event Data Entry (Failure and Suspension Data)

If you are at the Item Header Screen, click the Event Data Entry button on the right hand side of the
screen. The Event Data Screen will then appear. This is illustrated in Figure 3.4.

Each event record contains the fields:

Record Number (maintained by RELCODE)

Failure (F) or Suspension (S)

Age at failure or suspension

Frequency (the number of failures or suspensions at that age)

The Record Numbers are created and maintained automatically. Each event record can contain data
indicating the occurrence of one or more failures at a stated age, or one or more suspensions at a stated
age.

3.6.1 Entering a Data Record for an Item

When a new item header entry is created, the table of event data, which appears in the grid on
the left of the Event Data screen, is initially empty.

New event records are created by entering data at the panel on the right of the Event Data screen. To
add a record, click the New Record Button and then enter data in the fields below it:

Enter F for Failure or S for Suspension

Enter the Age of failure or suspension
Enter the frequency, that is the number of failures or suspensions at that age.

Then click the Save button and the data will be written into the grid on the left of the screen, or if you
decide not to save the data you have just entered, click Cancel. Data can be entered in any sequence by
age and will be automatically sorted when you exit this screen. The default entries are F for failure and 1
for frequency.

Figure 3.4. - Event Data Screen


Records can also be added at the keyboard, which is convenient when a number of records are to be
added in succession. Initially, click the New Record button. Press Enter to move forward to the
Failure/Suspension field. If the event is a Failure, just press Enter and the default entry F will appear. If
it is a suspension enter S and press Enter. Then key in the age and press Enter to move forward to the
Frequency field. If the frequency is 1, just press Enter and the default value will appear. If the
frequency is greater than 1, key it in and press Enter. At the Save Button, press Enter again and the
record will be added to the grid. The focus will move to the New Record button. Press Enter if you
wish to add another record.

3.6.2 Amending Event Data

To amend a data record we first select it by clicking it in the grid on the left of the screen. The entry
will appear in the fields in the Add or Amend Data panel on the right of the screen. We then amend the
entry in the panel. Clicking Save will put the amended record back in the grid.

3.6.3 Deleting an Event Record

To delete an event record select it in the grid by clicking on the dark section on the left of the record,
and then press the Delete key. Deleting records may leave gaps in the sequence of record numbers, but
RELCODE will renumber the records when you Close the screen.

3.6.4 Adding Event Records

Please refer to Section 3.6.1

3.6.5 Age Data Range

The age data can have up to six digits in front of the decimal point and are displayed by the computer
with two digits after the decimal point. In some applications it may be necessary to scale the data. For
example, if observations went into millions of kilometers, we would need to change the age units to
hundreds or thousands of kilometers.

Age values of less than 0.01 are not accepted. If failures do occur at zero age, enter a small age value to
represent them.

3.7 Print the Data

To print the current data, click the Print Data button. Figure 3.5 shows the result for the current data.

Figure 3.5 Data Printout

Pump Reliability Analysis

Item Ref: Drive Belt
Age Unit: Hours
User: Hastings: RELCODE User Manual
Record Number   Age    Event Type   Frequency

1               1584   F            1
2               2634   F            1
3               3540   S            1
4               4136   F            2
5               4789   F            1
6               5579   F            1
7               6368   S            3
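As a rough illustration of the record structure described in this chapter, the Figure 3.5 data can be held as a list of records and summarised in a few lines of Python. This is only a sketch: the class name EventRecord and its field names are our own choices and are not part of RELCODE or its database.

```python
from dataclasses import dataclass


@dataclass
class EventRecord:
    age: float          # age at failure or suspension, in the item's age unit
    event: str          # "F" for failure, "S" for suspension
    frequency: int = 1  # number of events at that age (default 1, as in RELCODE)


# Records from Figure 3.5 (Drive Belt, hours), deliberately entered out of order.
records = [
    EventRecord(4136, "F", 2),
    EventRecord(1584, "F"),
    EventRecord(6368, "S", 3),
    EventRecord(2634, "F"),
    EventRecord(3540, "S"),
    EventRecord(5579, "F"),
    EventRecord(4789, "F"),
]

# Sort by age, mirroring the automatic sort on leaving the Event Data screen.
records.sort(key=lambda r: r.age)

failures = sum(r.frequency for r in records if r.event == "F")
suspensions = sum(r.frequency for r in records if r.event == "S")
total = failures + suspensions

for number, r in enumerate(records, start=1):
    print(number, r.age, r.event, r.frequency)
print("failures:", failures, "suspensions:", suspensions, "total:", total)
```

Because frequencies are counted, the seven records represent ten events in all.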

3.8 Data in Earlier Versions of RELCODE

In earlier versions of RELCODE there were two types of data, Ungrouped and Grouped. Ungrouped
data consisted of ages at failure or suspension of individual items. Grouped data consisted of numbers
of failures and suspensions which occurred in specified age ranges. The Windows version of
RELCODE has only one type of data, which is in the style shown in Figure 3.2. This is similar to the
previous Ungrouped style except that frequencies can now be entered. The ability to enter frequencies
means that the old Grouped style is no longer required, since frequencies can be used optionally to
achieve the equivalent of any grouping that the user may wish to make. Conversion of data from the old
RELCODE formats is discussed in the “Import and Export Data” Chapter.



4.1 Analysis Menu

Once we have entered our data we can analyze it. To start the analysis we click the Analysis Menu
button on either the Item Header Screen or the Event Data Screen. The Analysis Menu Screen shown in
Figure 4.1 then appears.

Figure 4.1 Analysis Menu Screen

4.2 Distribution Fitting

To fit a distribution to our data we click the button labeled “Model Optimization - Fit Distribution”,
shown in Figure 4.1. RELCODE then fits a range of distributions using a variety of techniques, and
produces results which, for the data of Chapter 3, are as shown in Figure 4.2. The various distributions
and fitting methods are referred to as Models 1 to 6. RELCODE also recommends a preferred model at
the bottom of the screen (above the command buttons). In Figure 4.2 we see that the preferred model is
a Weibull 2 Parameter distribution, with parameters obtained by Maximum Model Accuracy (Model 3).

Figure 4.2 Distribution Summary

4. Reliability Analysis 4-1

4.3 Go With Preferred Model

RELCODE analyzes the results of distribution fitting and recommends a preferred model. The user
can accept this model by clicking the button marked “Go With Preferred Model” or can select any
model by clicking the corresponding “Use Model” button. The selected model is then used in
subsequent analysis. If we are happy to accept the recommended model we can now move on to
consider the results obtained from that model. These are shown in Figure 4.3. The features of the
alternative models, and how RELCODE selects the preferred model, are discussed in Sections 4.7 and
4.8 of this chapter, and in Chapter 14.

4.4 Distribution Fitting - Results

In the Drive Belts example the preferred model is a Two Parameter Weibull distribution fitted by
Maximum Model Accuracy. Clicking the “Go With Preferred Model” button at the Distribution
Summary screen brings up the results shown in Figure 4.3.

Data Summary. The first part of Figure 4.3 summarises our data, giving the numbers of failures and
suspensions, and the total of failures and suspensions.

Fitted Parameters. The second part shows the parameters of the fitted distribution. In this case
these are a Shape Parameter (BETA) of 2.17 and a Scale Parameter (ETA) of 5827.32. Also shown is
the value of the Mean Life (or MTBF) which is 5160.70. This is the average life which can be
expected from the drive belts, as calculated from the distribution model. Also, we get the
Characteristic Life which is the expected age to which 36.8% of items will survive and the standard
deviation which is a measure of the spread of the failure ages.
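The relationship between the fitted parameters and the summary values can be checked with a short Python sketch. It assumes the standard two parameter Weibull formulas (Mean Life = ETA x Gamma(1 + 1/BETA), and reliability exp(-1) at age ETA); the function name weibull_summary is illustrative and is not a RELCODE routine.

```python
import math


def weibull_summary(beta, eta):
    """Return (mean life, standard deviation, characteristic life)."""
    g1 = math.gamma(1.0 + 1.0 / beta)
    g2 = math.gamma(1.0 + 2.0 / beta)
    mean_life = eta * g1
    std_dev = eta * math.sqrt(g2 - g1 * g1)
    characteristic_life = eta  # age to which exp(-1), about 36.8%, of items survive
    return mean_life, std_dev, characteristic_life


beta, eta = 2.17, 5827.32  # rounded values quoted in the text
mean_life, std_dev, char_life = weibull_summary(beta, eta)

# Reliability at the characteristic life.
survival_at_eta = math.exp(-((char_life / eta) ** beta))

print("mean life:", round(mean_life, 1))
print("standard deviation:", round(std_dev, 1))
print("survival at characteristic life:", round(survival_at_eta, 3))
```

With the rounded parameters shown above, the computed Mean Life agrees with the quoted 5160.70 to within a fraction of an hour.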

Goodness of Fit Test. The third part of Figure 4.3 addresses the question of whether the fitted
distribution provides a good fit, in a statistical sense, to the data. The test used is the Hastings-Ang
Model Accuracy test, described in Ang and Hastings (1994). This article also discusses the merits of
various goodness of fit tests. The Hastings-Ang test is preferred because it is applicable to data
which can contain suspended items. Other tests, notably the Kolmogorov-Smirnov test, can give
faulty results when suspended items are present.

Figure 4.3 Fitted Weibull Parameters

Model Accuracy Test. The "Goodness of Fit" section shows the Model Accuracy which is defined by

Model Accuracy = 1 - Root Mean Square Probability Error, as a percentage.

A Model Accuracy of 100% means that all the points lie exactly on the fitted line, i.e. a perfect fit. In
this case the Model Accuracy is 99.46%. This means that the average (root mean square) distance of the
points from the line is 0.0054 or 0.54% on a linear probability scale.
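The definition of Model Accuracy can be written out as a small Python function. This is a sketch only: the observed and fitted probabilities below are invented, and the way RELCODE assigns a plotted probability to each data point (particularly when suspensions are present) is not described in this section, so it is left outside the function.

```python
import math


def model_accuracy(observed, fitted):
    """Model Accuracy (%) = (1 - root mean square probability error) x 100."""
    if len(observed) != len(fitted) or not observed:
        raise ValueError("need two equal-length, non-empty sequences")
    mean_square = sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed)
    return (1.0 - math.sqrt(mean_square)) * 100.0


# Hypothetical plotted probabilities and the fitted curve's values at the same ages.
observed = [0.10, 0.25, 0.45, 0.70]
fitted = [0.11, 0.24, 0.46, 0.69]

accuracy = model_accuracy(observed, fitted)
print(round(accuracy, 2))
```

Here every point is 0.01 from the curve, so the root mean square error is 0.01 and the Model Accuracy is 99%.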

Figure 4.3 shows the critical values of model accuracy given by the Hastings-Ang test, at various
confidence levels. The critical value at any given level of confidence is the value of model accuracy
which would cause us to reject the model with that level of confidence. The critical value at the 80%
Confidence Level is 90.96%. If the observed Model Accuracy is less than 90.96%, we can say with 80%
confidence that the model given in Figure 4.3 does not fit the data. In this example the observed Model
Accuracy is 99.46%, and we do not reject the hypothesis that the model fits the data. Corresponding
statements can be made at the 90%, 95% and 99% confidence levels using the values shown.

As a qualitative measure, Model Accuracies are graded as follows in relation to the critical values. Note
that the critical values decrease as the confidence levels increase.

Model Accuracy > critical value for 80% confidence = GOOD

Between critical values for 80% and 99% confidence = MODERATE
Less than critical value for 99% confidence = POOR

Thus, in the example, the Model Accuracy is Good. We conclude that the Weibull distribution, with the
parameters shown, provides a good fit to the data in this example. RELCODE incorporates extended
versions of the Model Accuracy tables beyond those in Ang and Hastings (1994), covering the use of
the root mean square probability error and the two and three parameter Weibull and the Hastings
bi-Weibull distributions.
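The grading rule given above can be expressed as a short Python function. This is a sketch under stated assumptions: the function name is ours, the treatment of values exactly equal to a critical value is our choice, and the 99% critical value used in the example is a made-up stand-in because only the 80% value (90.96%) is quoted in the text.

```python
def grade_model_accuracy(accuracy, critical_80, critical_99):
    """Grade a Model Accuracy (%) against the 80% and 99% critical values."""
    if critical_99 > critical_80:
        raise ValueError("critical values should decrease as confidence increases")
    if accuracy > critical_80:
        return "GOOD"
    if accuracy < critical_99:
        return "POOR"
    return "MODERATE"


# 90.96 is the 80% critical value quoted in the text.
# 85.0 is a hypothetical 99% critical value used only for illustration.
grade = grade_model_accuracy(99.46, critical_80=90.96, critical_99=85.0)
print(grade)
```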

4.5 Save Parameters

RELCODE can save the values of the fitted parameters on its reliability database. To save the
current parameters, click the Save Parameters button at the bottom of the Weibull Parameters Screen,
Figure 4.3, (or bi-Weibull Parameters Screen if this distribution is the preferred model). If we do not
save the current distribution parameters they will be lost when we exit from the given item, though of
course they can be re-computed later. If we do save the parameters, these values will be read in from
the database and become the current parameter values the next time the item is accessed. To carry
out Replacement Analysis it is necessary to have current parameter values.

4.6 Conclusion Regarding Parameters

We have seen how RELCODE will:

• Fit a range of distribution models to our data,

• Indicate a preferred model and allow us to select this model or any of the other models if we wish,

• Calculate and display a range of parameters for the selected model,

• Carry out a statistical goodness of fit test on the selected model.

• Save the current distribution parameters

Thus we determine an appropriate distribution model for our data, an indication of whether the model
is statistically valid, and also the parameters of our model, which may already contain valuable
information. In the case where the result is a two parameter Weibull distribution we can infer the
shape of the distribution and the corresponding failure rate from the Shape Parameter, beta. If beta is
less than 1 this indicates burn-in failures, if it is equal to or close to 1 it indicates random failures and
if it is greater than 1, as in this case, it indicates wearout failures. The Mean Life or MTBF is also
shown among the fitted parameters.

Thus we conclude that for the given data we have wearout failures and a Mean Life estimated at
5160 hours.
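The interpretation of the Shape Parameter can be summarised in a few lines of Python. This is only a sketch: the text does not say how close to 1 counts as "close", so the tolerance of 0.1 below is an arbitrary assumption, and the function name is illustrative.

```python
def failure_pattern(beta, tolerance=0.1):
    """Classify a Weibull shape parameter into the three patterns in the text."""
    if abs(beta - 1.0) <= tolerance:  # "equal to or close to 1" (tolerance assumed)
        return "random failures"
    if beta < 1.0:
        return "burn-in failures"
    return "wearout failures"


pattern = failure_pattern(2.17)  # shape parameter from the Drive Belts example
print(pattern)
```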

Further insight into the reliability of our items can be obtained from the graphs which are discussed
in the next chapter. To view these graphs, click the Graphs Menu button at the bottom of the
parameters screen, Figure 4.3.

4.7 Details of Models Available

Model 1 fits the two parameter Weibull by transforming the data to Weibull probability scales and then
using linear regression to fit a straight line. This is equivalent to the probability paper method, widely
used for manual calculations.
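The idea behind Model 1 can be illustrated with a Python sketch for the simplest case of a complete sample (failures only, each with frequency 1). Several things here are assumptions rather than a description of RELCODE: the plotting positions use Benard's median rank approximation, the regression is of the transformed probability on ln(age), and suspensions, which need adjusted ranks, are not handled.

```python
import math


def fit_weibull_by_regression(ages):
    """Least-squares straight line on Weibull probability scales (complete sample)."""
    ages = sorted(ages)
    n = len(ages)
    xs, ys = [], []
    for i, t in enumerate(ages, start=1):
        f = (i - 0.3) / (n + 0.4)                # Benard's median rank (assumed)
        xs.append(math.log(t))                   # horizontal scale: ln(age)
        ys.append(math.log(-math.log(1.0 - f)))  # vertical scale: ln(ln(1/(1-F)))
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta = sxy / sxx                             # slope of the line is the shape parameter
    intercept = y_bar - beta * x_bar
    eta = math.exp(-intercept / beta)            # the line crosses zero at age = eta
    return beta, eta


# Synthetic ages placed exactly on a Weibull line with beta = 2 and eta = 1000,
# so the regression should recover those two values.
n = 8
ages = [1000.0 * (-math.log(1.0 - (i - 0.3) / (n + 0.4))) ** 0.5 for i in range(1, n + 1)]

beta, eta = fit_weibull_by_regression(ages)
print(round(beta, 3), round(eta, 1))
```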


Model 2 uses the Maximum Likelihood method for fitting the two parameter Weibull distribution. The
Maximum Likelihood method is regarded as statistically superior to the Linear Regression method, but
is susceptible to being influenced by outliers.
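For readers who want to see what a Maximum Likelihood fit involves, the following Python sketch solves the usual likelihood equation for a two parameter Weibull with suspensions treated as right-censored observations. It is not RELCODE's code: the bisection search, the bracket for beta and the function names are our own assumptions.

```python
import math


def fit_weibull_mle(failure_ages, suspension_ages):
    """Maximum likelihood fit of a two parameter Weibull with suspensions."""
    all_ages = list(failure_ages) + list(suspension_ages)
    r = len(failure_ages)                  # number of failures
    scale = max(all_ages)                  # rescale ages to avoid overflow in t ** beta
    fail = [t / scale for t in failure_ages]
    every = [t / scale for t in all_ages]
    mean_log_fail = sum(math.log(t) for t in fail) / r

    def score(beta):
        # The maximum likelihood estimate of beta is the zero of this function.
        num = sum(t ** beta * math.log(t) for t in every)
        den = sum(t ** beta for t in every)
        return num / den - 1.0 / beta - mean_log_fail

    low, high = 0.01, 50.0                 # assumed search bracket for beta
    for _ in range(200):                   # bisection; score() increases with beta
        mid = 0.5 * (low + high)
        if score(mid) > 0.0:
            high = mid
        else:
            low = mid
    beta = 0.5 * (low + high)
    eta = scale * (sum(t ** beta for t in every) / r) ** (1.0 / beta)
    return beta, eta


# Drive Belt data from Figure 3.5, with frequencies expanded into individual ages.
failures = [1584, 2634, 4136, 4136, 4789, 5579]
suspensions = [3540, 6368, 6368, 6368]

beta, eta = fit_weibull_mle(failures, suspensions)
print(round(beta, 2), round(eta, 1))
```

The values produced need not match the parameters in Figure 4.3, which come from the Model Accuracy method (Model 3) rather than from Maximum Likelihood.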

Model 4 fits the three parameter Weibull by subtracting a minimum life value, transforming the data to
Weibull probability scales and then using linear regression to fit a straight line. This is equivalent to the
probability paper method for the three parameter Weibull distribution. The best value of the minimum
life is found by a search process.

Model Accuracy. Models 3, 5 and 6 fit the data using the Model Accuracy method. This method
searches for distribution parameters such that the root mean square distance of the observed points
from the fitted line is as small as possible, on linear scales of age and reliability. By contrast, the
Linear Regression method (despite its name) uses the highly non-linear Weibull probability paper
scales. The Linear Regression method can be unduly influenced by points at very low or very high
probability levels where the scale of the graph is very spread out. The Model Accuracy method
applies uniform scaling in seeking a curve which minimises the average (root mean square) distance
of the points from the line. The Model Accuracy method is therefore recommended, but the Linear
Regression models are available because they emulate the widely used graph paper methods.

Model 3 fits the two parameter Weibull distribution using the Model Accuracy method.

Model 5 fits the three parameter Weibull distribution using the Model Accuracy method.

Model 6 fits the bi-Weibull distribution using the Model Accuracy method.

4.8 Relative Model Quality

RELCODE selects the preferred model on the basis of Relative Model Quality. This uses the Model
Accuracy for the model, but does not simply choose the model with the highest accuracy. This is
because a model with more parameters can be expected to give higher accuracy than a model with
fewer parameters even when no real improvement exists. To allow for this, a Parameter Factor is
introduced. The Parameter Factor is an allowance for the number of parameters estimated. This
factor has been determined by simulation, and is the average improvement in model accuracy
obtained when fitting the same data with 3 or 5 parameters as opposed to 2 parameters.

RELCODE selects the preferred model by calculating the relative model quality as follows:

Relative Model Quality = Model Accuracy for current model
                         - Model Accuracy for 2 parameter Weibull fitted by linear regression
                         - Parameter Factor

The 2 parameter Weibull model obtained by linear regression (Model 1) is taken as a reference point
and will always have a relative model quality of zero. The model with the highest relative model
quality is the preferred model. In the event of a tie, the lowest numbered model is chosen.
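The selection rule can be sketched in Python as follows. Apart from the 99.46% Model Accuracy of Model 3, every number in the example is invented for illustration; in particular the Parameter Factor values are hypothetical (RELCODE's values were obtained by simulation and are not given here), and a factor of zero for the two parameter models is our inference from the statement that Model 1 always has a quality of zero.

```python
def preferred_model(accuracies, parameter_factors):
    """Return the preferred model number and the quality of every model."""
    reference = accuracies[1]  # Model 1: two parameter Weibull by linear regression
    qualities = {
        m: accuracies[m] - reference - parameter_factors[m] for m in accuracies
    }
    # The highest quality wins; a tie goes to the lowest numbered model.
    best = min(qualities, key=lambda m: (-qualities[m], m))
    return best, qualities


# Hypothetical Model Accuracies (%) for Models 1 to 6.
accuracies = {1: 98.9, 2: 98.7, 3: 99.46, 4: 99.1, 5: 99.5, 6: 99.6}
# Hypothetical Parameter Factors: 2, 2, 2, 3, 3 and 5 parameter models.
parameter_factors = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.3, 5: 0.3, 6: 0.8}

best, qualities = preferred_model(accuracies, parameter_factors)
print("preferred model:", best)
```

In this made-up example Model 6 has the highest raw accuracy, but once the Parameter Factor is deducted Model 3 has the highest Relative Model Quality.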



5.1 Graphs Menu

RELCODE will present your reliability data in a number of graphical forms. Click the Graphs
Menu button on the bottom of the Parameters Screen (Figure 4.3) and the Graphs Menu shown in
Figure 5.1 will appear. Five different graphs can then be obtained by clicking the option buttons
on the Graphs Menu.

Figure 5.1 Graphs Menu

5.2 Reliability Function

The first option on the Graphs Menu is the Reliability Function. This shows a plot of reliability against
age for the distribution fitted to our data. The data itself is shown and also the distribution parameters.
Figure 5.2 shows an example. The data in this example are as in Chapter 4.

5. Reliability Graphs 5-1

Figure 5.2 Reliability Function

The Reliability Function is a plot of Reliability, shown on the vertical scale as a percentage ranging
from 100% to 0%, against Age. From the graph we can read off the reliability at any selected age, or the
age at which we have any selected reliability value. For example, we can find the age at which the
reliability level is 90% (known as the B10 Life). To find this, enter the graph from the left at the 90%
level. The corresponding age is approximately 2100 hours, and this is the B10 Life for the Drive Belts.
To return to the Graphs Menu, click Other Graphs. To return to the distribution parameters screen click
Close.
5.3 Plot data on Weibull Scale

This option will generate a plot of our data on Weibull probability scales. These are scales which are
mathematically devised so that a Weibull cumulative distribution function will appear as a straight line.
Prior to the advent of computer based analysis, a manual Weibull Plot was the best practical method of
fitting a Weibull distribution.

The Weibull plot has the component age, t, on a logarithmic horizontal scale. The vertical scale is the
cumulative probability of failure, F(t), transformed to ln(ln(1/(1-F(t)))). The data are plotted on these
scales and a straight line is drawn through them to represent a Weibull distribution. The equation of the
line corresponds to the parameters of the currently selected distribution model.


Figure 5.3 Weibull Plot

5.4 Cumulative Distribution Function

This option gives a graph of the Cumulative Distribution Function on linear scales. This is the
same function as in the Weibull plot, but is on linear scales of Cumulative Probability of Failure
against Age. Figure 5.4 shows the plot.

Figure 5.4 Cumulative Probability of Failure


5.5 Probability Density Function

The next graph menu option is the Probability Density Function shown in Figure 5.5. The
probability density function has the property that the area under the curve between any two ages
gives the probability that a new item will fail in that age range. Note that the vertical scale is
failures per 10000 hours. The horizontal and vertical scaling are carried out automatically by
RELCODE.
Figure 5.5. Probability Density Function

5.6 Hazard Function

The next option on the Graphs Menu is the Hazard Function. This shows the failure rate among
items which have survived to any given age. Figure 5.6 illustrates this.
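For a two parameter Weibull distribution the hazard function has a simple closed form, which the Python sketch below evaluates at a few ages. It assumes the standard formula h(t) = (BETA/ETA) x (t/ETA)^(BETA - 1); the function name and the ages chosen are illustrative.

```python
def weibull_hazard(t, beta, eta):
    """Failure rate among items that have survived to age t."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)


beta, eta = 2.17, 5827.32
ages = (1000.0, 2000.0, 4000.0)
rates = [weibull_hazard(t, beta, eta) for t in ages]

for t, rate in zip(ages, rates):
    print(int(t), "hours:", round(rate * 10000, 3), "failures per 10000 hours")
```

Because BETA is greater than 1 in the Drive Belts example, the failure rate rises with age, which is the wearout pattern.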


Figure 5.6 Hazard Function

This concludes the reliability distribution graphs. To return to the Graphs Menu, click Other Graphs.
To return to the distribution parameters screen click Close.



6.1 Background

A common problem for maintenance managers is to determine a policy to adopt in regard to the
replacement of components which do, or may, fail. The appropriate policy will depend on such factors
as:

• The reliability of the component as a function of operating life, and in particular whether wearout
is present

• Whether a practical method of condition monitoring exists for the component, how effective it is in
predicting failure and what options it may provide for component replacement

• The costs arising if we need to replace a component at an inconvenient time as the result of actual
failure, or the detection of a near-failure condition

• The costs associated with replacement of the component before failure on an age basis, at a
convenient time, e.g. at a routine maintenance time.

• Safety considerations

• Availability of replacement parts

• Other possible maintenance actions such as overhaul or cannibalisation and the extent to which
these restore the component to "as good as new" condition.

6.2 Failure Replacement and Age-Based Preventive Replacement

In our analysis of replacement policies in this chapter we shall consider two types of situations in which
component replacements occur. We shall refer to these as Failure Replacement and Age Based
Preventive Replacement. The definitions of these are as follows:

(a) Failure Replacement A failure replacement is a replacement which occurs following the failure
of a component in service, or following the identification of an unfavorable condition which
leads us to promptly replace the component, within a short time of the condition being detected.
We could refer to this as a “failure or near-failure condition” replacement, however, the term
“failure” replacement will be used.

(b) Age Based Preventive Replacement An age based preventive replacement is replacement of a
component which has not failed, but which has reached an age deemed appropriate for
preventive replacement. An example where this occurs is with “lifed” components in aircraft.

6.3 Component Replacement Policies

6. Age-Based Replacement Analysis 6-1

RELCODE analyses three types of component replacement policy.

(a) Replace Only On Failure

Under this policy, only failure replacements are carried out and there is no preventive
replacement. This includes near-failure-condition replacements.

(b) Age-Based Preventive Replacement

Under this policy, replacements occur for items which reach a certain specified age, and failure
replacements occur if a component fails or reaches an identified near-failure-condition before the
specified age. Determination of the "specified age" is the key part of the policy.

(c) Block Replacement

Under this policy, replacements occur for all the components under consideration at regular
intervals of elapsed calendar time. Failure replacements occur for items which fail (or reach an
identified near-failure-condition) between block replacements. The determination of the time
between block replacements is the key part of the policy. This policy is considered in detail in
Chapter 7.

Note that under both the Age-Based and Block Replacement Policies, some failure
replacements will occur.

6.4 Is Age Based Preventive Replacement Worthwhile?

Age Based preventive replacement can only be worthwhile if two conditions hold:

(a) The failure rate of the components is increasing, or will increase before another age based
preventive replacement opportunity occurs

(b) The cost of failure replacement is greater than the cost of age based preventive replacement.

Thus age based preventive replacement is not appropriate if the failure rate (hazard function) is
decreasing (Burn-In Failures) or constant (Random Failures), because the new replacement item will not
be any more reliable than the one it replaces. It is important to analyse data to determine whether
wearout is present before jumping to the conclusion that age based preventive replacement is
worthwhile.
Even if wearout occurs, the choice of policy will depend also on the cost of age based preventive
replacement being less than the cost of failure replacement. Age based preventive replacement policies
result in loss of useful life of the components which are removed before failure. For preventive
replacement to be worthwhile, this loss must be more than compensated by cost savings resulting from
fewer failure replacements. This can only occur if failure replacements are expensive when compared
to preventive replacements. The determination of an optimal (i.e. minimum cost) policy will depend on
a trade off between these factors.

The cost of making an age based preventive replacement is usually less than the cost of failure
replacement. This is because we can arrange for age based preventive replacements to be made at a
prearranged time so as to avoid loss of production. Also, if age based preventive replacement of a given
type of component is carried out as part of a routine service or overhaul, the repair cost tends to be
reduced as the replacement can be done as part of the other work.


Condition Monitoring

Where a good condition monitoring technique is applicable, this tends to work against age based
preventive replacement. A good condition monitoring technique will have the following characteristics:

• Cheap and technically easy to install and use

• Highly effective at predicting imminent failure conditions

• Not prone to giving false alarms, that is, indicating imminent failure when in fact the component
will last for a considerable time

• Gives a consistent indication of the Delay Time, that is the time between potential failure being
indicated and actual failure occurring.

If a good condition monitoring system is available, the cost of Failure Replacement (which includes
near-failure-condition replacements) will be reduced. For example, if a bearing can be monitored
regularly and a condition (such as a vibration level) can be identified which accurately predicts when a
failure will occur within a few days, then we can make a replacement on the day after this condition is
detected. This may not be as cheap as an age based preventive replacement, but may be cheaper than
an actual in service failure.

RELCODE will analyse your data to help you to determine:

• whether preventive replacement (age or block) is worthwhile;

• the optimal (minimum cost) age-based preventive replacement policy;

• the optimal (minimum cost) block replacement policy;

• the long run average cost for these optimal policies;

• the long run average cost for any age-based replacement policy specified by the user;

• the long run average cost for any block replacement policy specified by the user;

• the long run average cost for a policy of replacement only on failure.

• spare parts requirement under the various policies

6.5 Cost Considerations in Replacement Policies

In the previous section we saw that the question of the costs of failure replacement and preventive
replacement must be addressed if we are to determine the best replacement policy. Before considering
how to determine an optimal policy, we shall look in more detail at the cost factors involved.

Factors in the cost of the replacement component include:

a) The cost of the component itself, as purchased from the supplier, including taxes where relevant;


b) The cost of condition monitoring, where used

c) Charges such as freight, packing, handling;

d) Inventory carrying costs, e.g. cost of capital tied up, warehouse costs, insurance;

e) Exceptionally, costs may be lowered if available spares are excessive in quantity for some reason,
e.g. by cannibalisation

f) The cost of lost production, lost business or substitute service

g) Cost of secondary damage which may be caused when the component under consideration fails

Of the cost factors just outlined, item (f), the cost of lost production, etc., may be the most difficult to
estimate. The extent of lost production could vary considerably depending on just when a replacement
occurs. In practice we may need to make a management judgement on the figure we use here. An
advantage of RELCODE is that it is easy to carry out a "what if" analysis with different cost figures and
see what effect this would have on the replacement policy.

For the age based policy it will be necessary to keep records of the ages of particular components. For
the block policy, and the replace only on failure policy, the ages of particular components are not
required.
6.6 Starting Replacement Analysis

Before starting replacement analysis we must first have entered our event data and determined the
life distribution parameters.

Also, replacement analysis requires some additional data (e.g. costs) which can either be entered at
the Item Header Screen or in response to prompts in the course of the Replacement Analysis. In
this introduction we shall enter this data in response to prompts in the course of the analysis, and
conclude by illustrating this data at the Item Header Screen.

To start replacement analysis, click the Replacement Analysis button on either the Analysis Menu
Screen (Figure 4.1) or the Weibull or Bi-Weibull Parameter Screen (Figure 4.3). When we select
the Replacement Analysis Menu, if we have not previously entered our Cost of Failure and Cost of
Preventive Action data, we will be prompted for them, as shown in Figure 6.1. The theoretical
background to the data requirements and calculation of minimum cost replacement policies is given in
Chapter 15.

6.7 Replacement Costs Data Entry

The first step in using the Replacement Analysis is to enter the cost of a failure replacement and the
cost of a preventive replacement. When we select the Replacement Analysis Menu, if we have not
previously entered these costs, we will be prompted for them, as shown in Figure 6.1.

The values entered in Figure 6.1 are a Failure Replacement cost of $1000 and Preventive
Replacement cost of $110.


Having entered these costs we click OK and go to the Replacement Analysis Menu and can select any
of the analyses shown in Figure 6.2.

Figure 6.1. Replacement Costs Data Entry

6.8 Replacement Analysis Menu

The Replacement Analysis Menu is shown in Figure 6.2.

Figure 6.2. Replacement Analysis Menu


6.9 Graph of Costs versus Replacement Age

The first option on the Replacement Analysis Menu (Figure 6.2) is a Graph of Costs versus
Replacement Age. This is shown in Figure 6.3. The graph shows how the cost per age unit varies with
the preventive replacement age. The vertical axis is the average cost per unit time and the horizontal
axis is the preventive replacement age. We see from Figure 6.3 that the cost per unit time falls initially
quite steeply as the preventive replacement age increases. There is then a flat region around the optimal
preventive replacement age, and finally the cost per unit time rises again.

From the graph we can find the range of ages for which an Age-Based Preventive Replacement Policy
would result in lower costs than the policy of Replacement Only On Failure. The lowest point on this
graph is at the optimal preventive replacement age. There is an interval around this value where the
costs do not vary much. In a case where there is no turning point on the graph, then the results indicate
that there should not be a "preventive replacement" policy but a "replacement only on failure" policy.

The dotted horizontal line is drawn at the cost of a policy of replacement only on failure. This gives a
visual indication of the relative savings (if any) from a preventive replacement policy. The vertical
scale will be automatically adjusted to show reasonable numeric values. We can see from the graph
that the minimum costs occur with preventive replacement at about 2000 hours, when the cost is
about $10 per 100 hours. The panel shows the exact values.

Figure 6.3. Graph of Cost versus Preventive Replacement Age


6.10 Calculate Cheapest Replacement Age

This option calculates the cheapest replacement age. It compares the cost per age unit for replacement
only on failure with the cost of a policy of replacing individual components when they reach a certain
age, known as the preventive replacement age. The preventive replacement age is varied using a
search procedure to find the value which minimizes costs. RELCODE then checks whether the
resulting costs are lower than those associated with a policy of replacement only on failure. The
results are in the form shown in Figure 6.4. If a policy of replacement only on failure is the cheapest
then the output will show this.

In the present example, the cheapest age-based replacement policy is preventive replacement at 2080
hours. The cost is then $0.0995 per hour, or $9.95 per 100 hours.
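The kind of search described above can be illustrated with a Python sketch. It assumes the widely used age-replacement cost model, in which the long run cost per unit time is the expected cost of one replacement cycle divided by the expected length of the cycle; RELCODE's own formulation is given in Chapter 15 and may differ in detail. The function names, the 10 hour search grid and the trapezoidal integration are our own illustrative choices.

```python
import math


def reliability(t, beta, eta):
    return math.exp(-((t / eta) ** beta))


def cost_per_hour(age, beta, eta, cost_failure, cost_preventive, steps=500):
    """Long run cost per unit time for preventive replacement at the given age."""
    h = age / steps
    # Expected life per cycle: integral of R(t) from 0 to age (trapezoidal rule).
    expected_life = 0.5 * (1.0 + reliability(age, beta, eta))
    expected_life += sum(reliability(i * h, beta, eta) for i in range(1, steps))
    expected_life *= h
    r = reliability(age, beta, eta)  # proportion reaching the preventive age
    expected_cost = cost_preventive * r + cost_failure * (1.0 - r)
    return expected_cost / expected_life


beta, eta = 2.17, 5827.32            # Drive Belt parameters from Chapter 4
cost_f, cost_p = 1000.0, 110.0       # costs entered in Figure 6.1

# Replacement only on failure: one failure cost per mean life.
cost_rof = cost_f / (eta * math.gamma(1.0 + 1.0 / beta))

# Simple grid search in 10 hour steps (an assumption, not RELCODE's procedure).
candidates = range(100, 10001, 10)
best_age = min(candidates, key=lambda a: cost_per_hour(a, beta, eta, cost_f, cost_p))
best_cost = cost_per_hour(best_age, beta, eta, cost_f, cost_p)
cost_at_2080 = cost_per_hour(2080, beta, eta, cost_f, cost_p)

print("replace only on failure:", round(cost_rof, 4), "per hour")
print("cheapest age:", best_age, "hours at", round(best_cost, 4), "per hour")
```

Under these assumptions the cost at 2080 hours comes out close to the $0.0995 per hour quoted above, and the cost curve is very flat near the minimum, so the exact age found depends on the search step.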

The results given by RELCODE for this and other replacement ages are rounded to multiples of a
scale factor which is automatically calculated. The scale factor is the single column increment used in
Figure 6.3 and similar graphs.

If we want to know the Cheapest Age-Based Replacement Policy for a range of replacement costs, we
can go back to the Item Header Screen and alter the costs as often as we wish.

6.10.1 Currency Symbol

In this example the currency symbol is the $. The currency symbol can be changed (one character
only) for any given database on the Item Header Screen.


Figure 6.4. Cheapest Age-Based Replacement Policy

6.11 Specified Preventive Replacement Age

The third option at the Planned Replacement Menu (Figure 6.2) calculates the average cost per unit time
for any user specified value of the preventive replacement age. Initially we are asked to enter the
specified preventive replacement age, as shown in Figure 6.5. In the example, the optimal preventive
replacement age, as shown in Figure 6.4, is 2080 hours. This is an odd amount and it is more likely that
we would want to specify the preventive replacement age as a round figure, reasonably close to the
optimal value. For example, we may choose 2000 hours as the specified replacement age. We therefore
enter 2000 as the specified age in Figure 6.5.


Figure 6.5. Entering the User Specified Preventive Replacement Age

When we click OK, RELCODE calculates the costs for this preventive replacement age and also the
proportion of preventive replacements and the proportion of failure replacements. The results are
shown in Figure 6.6.

By returning to this option, we can compare the costs for two or more preventive replacement ages
and see how significant the difference in cost is. The advantage of being able to do this is that, for
example, we may carry out a major service at 2500 hours and it would be more convenient to replace
the drive belts at 2500 hours rather than at 2080 or 2000 hours. We can compare the various
policies in terms of both cost and the proportion of failure replacements.


Figure 6.6. Results for Preventive Replacement at a Specified Age

6.12 Spare Parts Requirements - Age Based Replacement

The aim of this analysis is to estimate the number of replacement parts that will be required to cover
both failure replacements and preventive replacements for a given annual component utilization.

The calculation is based on an assumption of steady state average conditions and in practice it may be
prudent to carry more spare parts as a safety stock and as an allowance for transient effects.

For example, suppose that each crane has two similar drive belts, and that we operate a fleet of 15
such cranes. This means that the number of components “at risk” will be 2 x 15 = 30. Suppose also
that the cranes have an average utilization of 2500 hours per year. The analysis will calculate the
average number of replacement components needed when preventive replacement occurs at a specified
age.

We are asked to enter the specified preventive replacement age along with the number of components
at risk and the annual component utilization. The total annual requirement for replacement
components under the current replacement policy is then calculated. Figure 6.7 shows the data entry
prompts.

Figure 6.8 shows the results of the calculations. We see that, for our current data, and for a
preventive replacement age of 2000 hours, the steady state average annual requirement for spare parts
will total 38.85, or 39 in round figures, of which 35.26 (35 if rounded) will be preventive
replacements and 3.60 (4 if rounded) will be failure replacements. This therefore gives us also a
figure for the expected number of in-service failures per annum under this policy.


The results also show the number of replacements per annum under a policy of replacement only on
failure, the result being 14.53 (15 if rounded). For this policy, all the failures will be in-service
failures. Thus we can see how our preventive replacement policy has reduced the annual average
number of in-service failures from approximately 15 to 4. Note, however, that some in-service
failures must still be expected to occur.
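
The steady state arithmetic behind these figures can be sketched as follows. This is a minimal illustration, assuming a two-parameter Weibull life distribution and the usual renewal argument (one replacement per cycle, with the mean cycle length equal to the truncated mean life). The fleet size, utilization and replacement age are those of the crane example, and BETA = 2.17 is the Drive Belt shape value quoted in Section 9.3, but ETA is a hypothetical value, so the output is not expected to reproduce Figure 6.8 exactly.

```python
import math

def weibull_reliability(t, beta, eta):
    """R(t) for a two-parameter Weibull distribution."""
    return math.exp(-((t / eta) ** beta))

def truncated_mean_life(age, beta, eta, steps=2000):
    """Integral of R(t) from 0 to age (mean time in service per cycle)."""
    h = age / steps
    total = 0.5 * (1.0 + weibull_reliability(age, beta, eta))
    for k in range(1, steps):
        total += weibull_reliability(k * h, beta, eta)
    return total * h

def annual_spares_age_policy(age, beta, eta, components_at_risk, annual_utilization):
    """Return (total, preventive, failure) expected replacements per year."""
    component_time = components_at_risk * annual_utilization
    total = component_time / truncated_mean_life(age, beta, eta)
    survive = weibull_reliability(age, beta, eta)  # share replaced preventively
    return total, total * survive, total * (1.0 - survive)

def annual_spares_failure_only(beta, eta, components_at_risk, annual_utilization):
    """Expected replacements per year when replacing only on failure."""
    mean_life = eta * math.gamma(1.0 + 1.0 / beta)
    return components_at_risk * annual_utilization / mean_life

BETA = 2.17            # Drive Belt shape value quoted in Section 9.3
ETA = 5800.0           # hypothetical characteristic life in hours
COMPONENTS = 2 * 15    # two belts on each of 15 cranes
UTILIZATION = 2500.0   # hours per year

total, preventive, failure = annual_spares_age_policy(2000.0, BETA, ETA, COMPONENTS, UTILIZATION)
failure_only = annual_spares_failure_only(BETA, ETA, COMPONENTS, UTILIZATION)

print(f"age policy: total {total:.2f}, preventive {preventive:.2f}, failure {failure:.2f}")
print(f"failure only: {failure_only:.2f}")
```

The sketch shows the same trade-off as the text: the preventive policy consumes more parts in total, but far fewer of them are in-service failures.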

Figure 6.7 Spare Parts Utilization Data Entry

6.13 Conclusion

In this chapter we have seen how RELCODE helps us to analyse our options in relation to
preventive replacement of components, and in particular, how we can:

• Graphically display the relationship between cost and preventive replacement age (Figure 6.3)

• Calculate the minimum cost age based replacement policy (Figure 6.4)
• Calculate costs for any preventive replacement age which we specify (Figure 6.6)

• Calculate annual steady state average replacement parts requirements for any selected policy,
and for the policy of replace only on failure (Figure 6.8).

We can save the cost and replacement age parameters which we have entered by clicking the Save
Parameters button on the results screens, such as Figure 6.8. The parameters are then saved in the
database and also appear on the Item Header Screen as shown in Figure 6.9.

Figure 6.8 Spare Parts Annual Requirement for Age Replacement Policy


Figure 6.9 Item Header Screen Showing Replacement Policy Parameters



7.1 Block Replacement Policy Definition

Under a block replacement policy, all components are replaced simultaneously - in a block, at certain
intervals of time. Items which fail in between the block replacement times are replaced when they fail
(or reach an identified near-failure-condition), these being failure replacements. At the time of block
replacement, all items are replaced including those which have been subject to failure replacement. We
refer to the time between block replacements as the block replacement interval.

The block replacements are preventive replacements. The cost of a block preventive replacement may
differ from that of an age based preventive replacement. Usually a block replacement will be cheaper
(per component replaced) because there are economies of scale in doing many replacements at the same
time.

In our example, let the cost of block preventive replacement be $60 per item. We set this value at the
Item Header Screen, by selecting the relevant item, amending the Preventive Action Cost field (bottom
left part of screen) and clicking the OK button. Figure 7.1 shows the result. At this stage the other data
are the same as in Chapter 6; in particular, the cost of failure replacement is still $1000.

Block replacement policies can be analyzed using the “Block Based Replacement Policies” options on
the Replacement Analysis Menu shown in Figure 6.2.

Figure 7.1 Item Header Screen with Amended Preventive Action Cost

7.2 Graph of Costs versus Block Replacement Interval

7. Block Replacement Policies 7-1

When we click the “Graph of Costs versus Replacement Interval” button on the Replacement Analysis
Menu (Figure 6.2, Block Based Replacement Policies section), RELCODE calculates and displays a
graph of Cost versus Block Replacement Interval. The result is shown in Figure 7.2.

Figure 7.2 Graph of Costs versus Block Replacement Interval

7.3 Cheapest Block Replacement Interval

This option will calculate the cost for a policy of replacement only on failure and the cost of block
replacement. If block replacement is cheaper, then the cheapest block replacement interval is found.
The results are in the form shown in Figure 7.3.
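
A rough sketch of this calculation is given below. It uses the standard block replacement model, in which the cost per unit time for an interval T is (block cost + failure cost x H(T)) / T, where H(T) is the cumulative renewals function. The renewal function is approximated on a grid by a simple discretized renewal equation. This is an assumption about method, not RELCODE's own algorithm. The costs ($60 and $1000) are those of the example and BETA = 2.17 is the Drive Belt shape value quoted in Section 9.3, but ETA is hypothetical, so the result will not match Figure 7.3 exactly.

```python
import math

def weibull_cdf(t, beta, eta):
    """Cumulative probability of failure F(t) for a two-parameter Weibull."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def cumulative_renewals(max_time, beta, eta, steps):
    """Approximate H(k*h), the expected number of failure replacements by
    time k*h, for k = 0..steps, where h = max_time / steps."""
    h = max_time / steps
    F = [weibull_cdf(k * h, beta, eta) for k in range(steps + 1)]
    H = [0.0] * (steps + 1)
    for k in range(1, steps + 1):
        # First failure falls in ((j-1)h, jh]; further renewals then
        # accumulate over the remaining (k-j)h.
        H[k] = F[k] + sum(H[k - j] * (F[j] - F[j - 1]) for j in range(1, k))
    return H

def block_cost_rates(max_interval, beta, eta, cost_failure, cost_block, steps=600):
    """List of (interval, cost per unit time per component) on the grid."""
    h = max_interval / steps
    H = cumulative_renewals(max_interval, beta, eta, steps)
    return [(k * h, (cost_block + cost_failure * H[k]) / (k * h))
            for k in range(1, steps + 1)]

BETA = 2.17            # Drive Belt shape value quoted in Section 9.3
ETA = 5800.0           # hypothetical characteristic life in hours
COST_FAILURE = 1000.0
COST_BLOCK = 60.0

rates = block_cost_rates(6000.0, BETA, ETA, COST_FAILURE, COST_BLOCK)
best_interval, best_cost = min(rates, key=lambda pair: pair[1])
failure_only_cost = COST_FAILURE / (ETA * math.gamma(1.0 + 1.0 / BETA))

print(f"cheapest block interval {best_interval:.0f}, cost per unit time {best_cost:.4f}")
print(f"failure-only cost per unit time {failure_only_cost:.4f}")
```

The grid step of 10 hours limits the precision of the answer. A finer grid gives a closer approximation at the price of a longer run time, since the renewal calculation grows with the square of the number of steps.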

Figure 7.3 Cheapest Block Replacement Interval


7.4 Specified Block Replacement Interval

This option calculates the average cost per unit time for any value of the block replacement interval
entered by the user. In this case RELCODE first prompts for the specified block replacement interval.
In the example the optimal block replacement interval was 1522.32 hours, as shown in Figures 7.2 and
7.3. In practice we may wish to pick a round number close to this value, such as 1500 hours. We enter
this in response to the prompt as shown in Figure 7.4.

Figure 7.4 Entering a Specified Block Replacement Interval


When we click the OK button on Figure 7.4, RELCODE calculates the cost per unit time for the
specified block replacement policy interval and displays the result as shown in Figure 7.5.

Figure 7.5 Results for Block Replacement at a Specified Interval.

7.5 Spare Parts Requirements - Block Replacement

The aim of this analysis is to estimate the number of replacement parts that will be required to cover
both failure replacements and preventive replacements for a given annual component utilization. The
calculation is based on an assumption of steady state average conditions and in practice it may be
prudent to carry more spare parts as a safety stock and as an allowance for transient effects.

The analysis is similar to that for the age-based policy described in Section 6.12. When we click the
“Spare Parts Requirements - Block Policy” button at Figure 6.2, we are asked to enter the specified
block replacement interval along with the number of components at risk and the annual utilization per
component. The total number of components needed per annum is then calculated. Figure 7.6 shows the data
entry prompts. Figure 7.7 shows the result.

Figure 7.6 Block Policy Spare Parts - Data Entry


Figure 7.7 Spare Parts Annual Requirements for Block Policy

7.6 Life Distribution Function Tabulations

In addition to providing solutions for the Age and Block replacement models, which are standard
replacement policy models well documented in maintenance literature, RELCODE provides a table
of values of the functions used in calculating these solutions. These tables can be useful if the user
has a problem which is a variation on the standard models, but is based on similar functions. To
see the tabulations, click the “Life Distribution Function Tabulations” button on the Replacement
Analysis Menu, Figure 6.2.
The table of results can be sent to a file and used as input by users wishing to build their own
analysis routines based on the relevant distribution.
The functions tabulated are:
• Failure Probability Density Function
• Reliability Function
• Truncated Mean Life
• Cumulative Renewals


Figure 7.9 Life Distribution Function Tabulations



8.1 Confidence Limits

When we analyze reliability, as we have done in Chapters 4 and 5, the points placed on the graphs in
Figures 5.2, 5.3 and 5.4 are placed at “median rank” positions on the reliability, or cumulative
probability of failure scale. The median rank position is generally accepted as the best position for a
single representative point which can be estimated from the available data.

The median rank points are such that we are 50% confident that the reliability is greater than the
median rank value, and equally 50% confident that it is less. Statistical theory allows us to place
other points on the reliability graph, which
correspond to percentages other than 50%. For example, at a given failure age, we can calculate a
probability such that we are 95% confident that the reliability is greater than that value. This is a 95%
lower confidence limit for the reliability at the corresponding age.
Confidence limits are values such that we are confident at some stated level (e.g. 95% confidence) that the value
taken by a variable, if a trial is repeated, will lie in a certain range. The “certain range” depends on the numbers
involved, and whether we are talking about “upper”, “lower” or “two-sided” confidence limits.

The larger the sample and the more failures we have observed, the tighter the confidence limits will be.
Conversely, if we have a small sample or very few failures, the confidence limits will be wide. We cannot make
any absolute statement about probabilistic reliability, only that we have a certain level of confidence of a certain
level of reliability, e.g. 90% confidence of 90% reliability.

8.2 Getting Confidence Limits with RELCODE

We shall continue with the example for which the data is given in Figure 3.2. For this data, the
points plotted in Figure 5.2 are the median rank or best point estimates of the reliability against
age. At the Analysis Menu, Figure 4.1, click the “Confidence Limits for Reliability” button. The
screen shown in Figure 8.1 will appear.

In Figure 8.1, look at the table in the centre, and in particular at the columns headed “Age
(Hours)” and “Median Rank”. These are the values of Age and Reliability at which the points are
plotted in Figure 5.2. For example, the second row of the table in Figure 8.1 has the values:

Age (Hours) = 2634

Median Rank = 0.84

and this corresponds to the second point from the left in Figure 5.2. Note that Figure 5.2 works in
percentages, whereas Figure 8.1 works in probabilities, that is 0 to 1, so that 0.84 in Figure 8.1
corresponds to 84% in Figure 5.2.

Figure 8.1 gives confidence limits for the reliability at all the failure ages in our data. For
example, in row 2 of the table, under the heading “Lower One Sided” in the 95% column is the
value 0.61. This means that we are 95% confident that the reliability to 2634 hours is greater than
0.61. Under the heading “Upper One Sided” we see the value 0.97. This means we are 95%
confident that the reliability to 2634 hours is less than 0.97. Values in the other columns are
confidence limits at 99% and 90% levels.

The values in the table in Figure 8.1 are also known as “Ranks”. The Median Ranks are the
central values which are exceeded with probability 50%. Ranks can also be tabulated in terms of
the cumulative probability of failure, so that the value 0.61 which appears in Figure 8.1 as the
lower one sided 95% confidence limit for two failures out of ten items at risk, corresponds to a
95% rank value of 0.39, or 39%, for the cumulative probability of failure.

We refer to these confidence limits as distribution free, because no assumptions about the form of
the life distribution are required to calculate them.
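
For readers who wish to check rank values by hand, the sketch below computes them from the binomial distribution, which is the usual textbook definition of a rank. It is an illustration only and makes no claim about how RELCODE computes its table. It also handles only the simple case where the failure has order number i out of n items. Failures which occur after a suspension need an adjusted order number, which is not covered here.

```python
import math

def binomial_tail(n, i, p):
    """Probability that at least i of n items have failed, when each item
    has cumulative probability of failure p."""
    return sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
               for k in range(i, n + 1))

def rank(i, n, level):
    """Rank at the given level for the i-th failure out of n items: the
    cumulative probability of failure p at which binomial_tail equals level.
    Solved by bisection, since the tail probability increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if binomial_tail(n, i, mid) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Second failure out of ten items at risk, as in row 2 of Figure 8.1.
median_rank = rank(2, 10, 0.50)
rank_95 = rank(2, 10, 0.95)

print(f"median rank, cumulative failure: {median_rank:.4f}")
print(f"median rank, reliability:        {1.0 - median_rank:.4f}")
print(f"95% rank, cumulative failure:    {rank_95:.4f}")
print(f"lower 95% limit for reliability: {1.0 - rank_95:.4f}")
```

Rounded to two decimal places, the reliability values agree with the Median Rank of 0.84 and the lower one sided 95% limit of 0.61 quoted above, and the 95% rank agrees with the value of 0.39.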

Figure 8.1 Distribution Free Confidence Limits for Reliability

8.3 Confidence Limits Graph

Click the “Graph” button on the Confidence Limits screen, Figure 8.1, and a graph of the results
will appear. This is illustrated in Figure 8.2. The Median Rank Points appear as circles, as in
Figure 5.2, and the upper and lower one sided 95% confidence limits are shown as small dashes.
This gives us a graphical indication of the spread of values that could occur. Taking a two sided
viewpoint, we are 90% confident that the reliability versus age will lie between the small dashed
points plotted.

8-2 8. Confidence Limits

Figure 8.2 Confidence Limits Graph

8.4 Mean Time Between Failures (MTBF)

The MTBF, or Mean Time Between Failures, is a concept which is widely used in reliability

In the case of an item which has a constant failure rate (random failures), the MTBF is the mean
life. The MTBF also arises from a situation where components are replaced on failure, and in time
a steady state is reached regardless of the failure distribution function. The MTBF is then literally
the Mean Time Between Failures. Again the MTBF will be the mean life of the components.

8.5 Getting the MTBF and Confidence Limits

RELCODE will calculate a point (or “best”) estimate for the MTBF for our current data and will
also calculate confidence limits. To do this from the Analysis Menu, Figure 4.1, click the
“Confidence Limits for Reliability” button. The screen shown in Figure 8.1 will appear. Now
click the “Confidence Limits for the MTBF” button at the bottom left of the screen, Figure 8.1.

It is important to note that this analysis is only valid for items subject to random failures, so that we
should only use this analysis if our reliability analysis has yielded a two parameter Weibull
distribution with a BETA value close to 1. By “close to 1”, we might regard values in the range
0.8 to 1.4 as being satisfactory, although accuracy decreases as we move away from the value
BETA = 1.


Figure 8.3 Switch Failure Data

Switch Operations to Failure

A 1980
B 760
C 120
D 210
E 2170
F 3800
G 700
H 1350
I 1100
J 380

Since the data entered in Chapter 3 has not yielded a Random Failure pattern, we shall use other
data in this example. This is the Switch Failure Data as shown in Figure 8.3.

To analyse the data of Figure 8.3 we first enter it into RELCODE, then select “Analyse Data” and
“Model Optimization - Fit Distribution”. The preferred model is a two parameter Weibull with a
Beta value of 1.01. This is well within the range which we can regard as “Random”. The
parameters are as shown in Figure 8.4.

Figure 8.4 Preferred Distribution Parameters for the Switch Data

Returning to the Analysis Menu, we select “Confidence Limits for Reliability” and then
“Confidence Limits for the MTBF”. This yields the result shown in Figure 8.5.


Figure 8.5 Confidence Limits for the MTBF

From Figure 8.5 we see that the point estimate of the MTBF under the assumption of random failures is
1257 operations. Figure 8.5 gives a two sided 90% confidence interval for the MTBF, that is values
such that we are 90% confident that the MTBF lies between the lower and upper values given. This also
means that we are 95% confident that the MTBF lies above the lower value, and 95% confident that it
lies below the upper value.

Also, the value of the lower confidence limit depends on whether or not the trial ends on a failure. In the
Switch example, all the switches run to failure, so the trial does end on a failure. We see therefore that
the results are:

Lower Confidence Limit for the MTBF = 800.38 operations

Upper Confidence Limit for the MTBF = 2316.84 operations

Hence we can state that we are 90% confident that the value of the MTBF lies between these values. In
reliability studies we are usually concerned mainly with the lower confidence limit. For example, there
may be a contract requirement to demonstrate that the MTBF exceeds a certain value with 95%
confidence. In this case we can say that the MTBF of the switches exceeds 800 operations with 95%
confidence.
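
The figures for the Switch data can be checked with the sketch below. It uses the standard chi-square limits for the MTBF under the assumption of a constant failure rate. That RELCODE uses this same method is an assumption, although the results come out close to the values quoted from Figure 8.5. The chi-square quantile is found by bisection on the closed form of the distribution function, which is valid here because the degrees of freedom are always even.

```python
import math

def chi2_cdf_even(x, df):
    """Chi-square distribution function for an even number of degrees of freedom."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j          # term is now half**j / j!
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(prob, df):
    """Value x such that chi2_cdf_even(x, df) equals prob, by bisection."""
    lo, hi = 0.0, 10.0 * df + 100.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def mtbf_limits(total_time, failures, confidence=0.90, ends_on_failure=True):
    """Return (point estimate, lower limit, upper limit) for the MTBF,
    as a two sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    # A trial which stops without a failure uses two extra degrees of
    # freedom for the lower limit.
    lower_df = 2 * failures if ends_on_failure else 2 * failures + 2
    lower = 2.0 * total_time / chi2_quantile_even(1.0 - alpha / 2.0, lower_df)
    upper = 2.0 * total_time / chi2_quantile_even(alpha / 2.0, 2 * failures)
    return total_time / failures, lower, upper

# Switch Failure Data from Figure 8.3 (operations to failure, switches A to J).
SWITCH_OPERATIONS = [1980, 760, 120, 210, 2170, 3800, 700, 1350, 1100, 380]

point, lower, upper = mtbf_limits(sum(SWITCH_OPERATIONS), len(SWITCH_OPERATIONS))
print(f"MTBF {point:.0f}, lower {lower:.2f}, upper {upper:.2f}")
```

Setting ends_on_failure=False shows how the lower limit changes when the trial is stopped at a fixed time instead of at the last failure.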


9.1 Inspection Intervals for Hidden Failures
The failure or deterioration of an item is not always immediately apparent. Failures which can go
undetected are called hidden failures. Hidden failures can occur in protective devices such as safety
valves or limit switches, where no actual system failure will occur until the protective action of the
faulty device is required. To reduce the probability of a system failure the protective device can be
inspected at intervals. The aim of such inspections is to detect hidden failures, and the frequency of
inspection is set so that the proportion of time that the protective device is in a failed state is kept to a
low level. Typically, this level can correspond to an availability level of 99%, that is, the device is
non-operational, on average, for 1% of the time. We wish to determine an inspection interval which
will achieve this. A formula relating the inspection interval, the MTBF and the availability can be
derived as follows:

Let I = Inspection Interval

M = MTBF of device
A = Availability

The inspection interval is assumed to be short relative to the MTBF. On average the item fails at its
MTBF, half way through an inspection interval. Hence:
A = M / (M + I/2)
I = 2 * M * ((1 - A)/A)

This formula is used by RELCODE to give a suggested inspection interval. Initially we obtain a
distribution model using RELCODE, either by data analysis or by direct entry of parameters using the
“No Data” option. This gives an estimated value for the Mean Time Between Failures (MTBF) of
the device. At the Analysis Menu, click the Inspection Interval button to obtain a suggested value for
the inspection interval.
Figure 9.1 shows an example. The data here is from the Hydraulic Seal example of Chapter 16. This
has an MTBF of 120 months. For a 99% availability, the formula in this section gives:
I = 2 * M * ((1 - A)/A) = 2 * 120 * ((1 - .99) / .99) = 2.42 months.
This is the value shown. The value is not rounded by RELCODE and we will usually use judgement
in deciding a rounded value, for example 2 or 3 months in this case. A shorter interval will
correspond to a higher availability. As we see from Figure 9.1, RELCODE also shows the intervals
for 98% and 99.5% availability, and a graph of Availability against Inspection Interval, an example of
which is shown in Figure 9.2.
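
The formula is simple enough to check directly. The short sketch below evaluates it for the three availability levels mentioned above, using the MTBF of 120 months from the example. The function names are illustrative and are not part of RELCODE.

```python
def inspection_interval(mtbf, availability):
    """Suggested inspection interval: I = 2 * M * ((1 - A) / A)."""
    return 2.0 * mtbf * (1.0 - availability) / availability

def availability_for_interval(mtbf, interval):
    """Inverse relationship: A = M / (M + I/2)."""
    return mtbf / (mtbf + interval / 2.0)

MTBF_MONTHS = 120.0

for target in (0.98, 0.99, 0.995):
    months = inspection_interval(MTBF_MONTHS, target)
    print(f"availability {target:.1%}: inspect every {months:.2f} months")
```

The second function can be used the other way round, for example to see what availability results from rounding the interval to 2 or 3 months.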

Figure 9.1 Inspection Intervals – Hidden Failures

9. Inspection Intervals and Other Features 9-1

Note that this analysis gives a single average value of the inspection interval over the life of the item.
A more flexible approach is to vary the inspection interval with the failure rate. However, we are unlikely to
want to vary the inspection interval continuously, even if the failure rate varies continuously. The
approach suggested is to set the life distribution parameters in RELCODE to correspond to average
values relevant to the age range for which we require an inspection interval. For example, if during
the main operating life of the Hydraulic Seal its failure rate was approximately constant with MTBF
of 500 months, we would set BETA=1 and MTBF=500 as our model parameters at the “No Data”
screen and then obtain suggested inspection intervals for this model. Later, if our seals are entering
the wearout phase, we could vary the model parameters and obtain a new inspection interval.


Figure 9.2 Inspection Intervals for Hidden Failures

9.2 Condition Monitoring Intervals

RELCODE can be used to estimate suitable intervals for condition monitoring. In condition
monitoring, the condition of an item deteriorates gradually in such a way that we can eventually
detect, using a suitable condition monitoring method, that a failure is imminent. The time from when
we can first detect that a failure is imminent to the time when an actual failure occurs is referred to as
the PF Interval. PF stands for Potential to actual Failure. The alternative term Delay Time is also
used. The PF Interval is a random variable.

Theory. Let R(t) be the survival function of the PF Interval and I be the condition monitoring
interval. If a potential failure condition emerges at time x measured from the last inspection, then the
probability that the item survives to the next inspection is
R(I-x). We can reasonably assume that the rate of emergence of potential failure conditions does not
vary significantly over the inspection interval. Also, the potential failure condition will emerge in
some interval.

The probability, p, of the condition being detected is therefore given by the average value of the
survival function R(t) over the interval:
$$p = \frac{1}{I} \int_{0}^{I} R(t)\,dt \qquad (9.1)$$

Note that the integral of the reliability function is the Truncated Mean Life function which appears in
a number of other analyses.
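
Equation 9.1 can be evaluated numerically as in the sketch below, which also solves for the interval that gives a required probability of detection. It assumes that the PF Interval follows a two-parameter Weibull distribution. The mean of 90 days matches the example discussed below, but the shape value BETA = 2 is hypothetical, so the intervals produced are not expected to match those shown by RELCODE.

```python
import math

def pf_survival(t, beta, eta):
    """Survival function R(t) of the PF Interval (two-parameter Weibull)."""
    return math.exp(-((t / eta) ** beta))

def detection_probability(interval, beta, eta, steps=1000):
    """Equation 9.1: the average of R(t) over one monitoring interval,
    evaluated by the trapezoidal rule."""
    h = interval / steps
    total = 0.5 * (pf_survival(0.0, beta, eta) + pf_survival(interval, beta, eta))
    for k in range(1, steps):
        total += pf_survival(k * h, beta, eta)
    return total * h / interval

def interval_for_detection(target, beta, eta):
    """Interval giving the target probability of detection, by bisection.
    The probability of detection falls as the interval grows."""
    lo, hi = 1e-9, 50.0 * eta
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if detection_probability(mid, beta, eta) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

BETA = 2.0                                        # hypothetical shape value
MEAN_PF_DAYS = 90.0
ETA = MEAN_PF_DAYS / math.gamma(1.0 + 1.0 / BETA)

intervals = {p: interval_for_detection(p, BETA, ETA) for p in (0.80, 0.90, 0.95)}
for p, days in intervals.items():
    print(f"detection probability {p:.0%}: monitor every {days:.1f} days")
```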

To use RELCODE to estimate the condition monitoring interval, we need to make RELCODE model
the PF Interval. Having set this distribution either by fitting a distribution to data or by using the “No
Data” option to set the parameters, we can then use the Condition Monitoring Interval option on the
Analysis Menu. The result is illustrated in Figure 9.3. This example is based on the item “PF
Interval Example” which has a mean life of 90 days and a gradual wearout pattern. Figure 9.3 shows
that for an 80% probability of detection we require an inspection interval of 48 days, or about 7 weeks.
Other intervals are shown for 90% and 95% detection probabilities. A graph of the result is also
available as shown in Figure 9.4. The graph shows the relationship between inspection interval and
probability of detection for the given PF interval distribution.

Figure 9.3 Condition Monitoring Intervals.

Figure 9.4 Graph of Probability of Detection against Condition Monitoring Interval.


9.3 Display Distribution Parameters

In addition to fitting distribution parameters to our data, RELCODE allows us to enter any values of
the parameters that we wish. It will then carry out a goodness of fit test for those parameters in
relation to the current data. We can also proceed to the Replacement Analysis options with these
parameters.

The Display Distribution Parameters option is on the Analysis Menu, see Figure 4.1. When we click
the Display Distribution Parameters button, the screen shown in Figure 9.5 appears. The screen is
labeled, Display or Amend Distribution Parameters. This is because, in addition to displaying the
existing values of the distribution parameters, we can actually change the values of the parameters,
should we wish to do so. Note that the example in this chapter relates to the Drive Belt data, so if
you are continuing on from Chapter 8 you will need to return to the Item Header Screen and select the
Drive Belt data in order to get the numeric results shown in Figure 9.5.

Figure 9.5 Display or Amend Distribution Parameters Screen, prior to changes

If we have previously fitted a distribution, the screen will show the current parameter values. We can
amend these values if we wish. If we then press the OK button, a goodness of fit test will be carried
out, relating the amended parameter values to the current data.
For example, we can use this facility to test whether a negative exponential distribution would fit the
existing data. To do this, we change the shape parameter BETA from its value of 2.17, to the value
1.0 and press the OK button. The result is shown in Figure 9.6.


Figure 9.6. Display or Amend Distribution Parameters Screen - After Amending the Shape
Parameter to 1 and Clicking OK.

Figure 9.6 shows that the negative exponential distribution is rejected with 99% confidence as a fit to
the current data. If we wish to retain the amended values of the distribution parameters, we click the
Save Parameters button. Otherwise the amended parameters will not be saved.

9.4 Previous Analysis Summary

The Previous Analysis Summary is a screen which shows a summary of the results of the analysis
which we have carried out on the current item. To display this screen, go to the Analysis Menu. This
can be reached by clicking the Analysis Menu button on either the Item Header Screen or the Event
Data Screen. At the Analysis Menu Screen, click the Previous Analysis Summary Button. The
Previous Analysis Summary Screen, shown in Figure 9.7, will then appear.


Figure 9.7 Previous Analysis Summary Screen

The Previous Analysis Summary summarizes the results of the analysis for the current item. It can be
printed, saved to a file, or sent to the clipboard from where it can be incorporated into other
documents.


10.1 Importing Data
RELCODE for Windows data is held in a Microsoft Access database. RELCODE will import data into
this database from suitably formatted ASCII files. The formats used for importing data are ones
which are generated when exporting data from either the Windows or DOS versions of RELCODE.
This enables:
• Transfer of data into RELCODE from any source via ASCII files with these same formats.
These files can be created using a spreadsheet.
• Windows RELCODE users to transfer data between databases by exporting it from one
database and importing it into another or the same database.
• DOS RELCODE users to transfer data to the Windows version of RELCODE without rekeying
their data.

Data exported from Windows RELCODE is in a single file format described in Section 10.3 of this
chapter. Windows RELCODE will also import data from files in this format.

Users of DOS RELCODE can have data files in several formats. DOS users wishing to transfer
data to Windows RELCODE must send their data to a file in one of the DOS RELCODE formats
which appears in Table 10.1 below. Details of these formats are given in the DOS RELCODE
Users Manual.

Table 10.1 File Types for Importing Data

1. RELCODE for Windows Data files
2. DOS RELCODE Standard Format Ungrouped data files, (usually named *.rud)
3. DOS RELCODE Standard Format Grouped data files (*.rgd)
4. DOS RELCODE TXT Format Ungrouped data files (*.txt)
5. DOS RELCODE CSV Format Ungrouped data files (*.csv)

The following procedure can then be used to read a suitably formatted file into RELCODE for
Windows:

1. Click the Import Data button on the Item Header Screen (Figure 10.1)
2. Select the file type and filename (Figure 10.2) and click the OK button. In the example the
selected file is called manual.rud. The data format in the file must correspond to the selected
file type.
3. The data will be imported to the RELCODE Access database and will appear on screen.
Figures 10.3 and 10.4 show the data which has been added from file manual.rud.

10. Importing and Exporting Data 10-1

Figure 10.1 Item Header Screen - Import Data Button

Figure 10.2 Selecting the Import File Type and Filename


Figure 10.3 Item Header Screen with Imported Data

Figure 10.4 Event Data Screen with Imported Data


10.2 Creating Importable ASCII Files

RELCODE will automatically create files in the formats which Windows RELCODE can read.
However, we may have data from other sources which we wish to import into Windows RELCODE. To
import our data we put it into the format described in this section. First we describe this format, then we
discuss creating the file from a spreadsheet.

10.2.1 RELCODE Windows Data Standard Format

The layout of the data file in RELCODE Windows Standard Format is as follows:

Row 1: Title
Row 2: Ageunit, Failure Replacement Cost, Preventive Action Cost
Row 3: Age, Event type (F=failure or S=suspension), Frequency

Subsequent rows are similar to row 3. In rows 2 and 3, commas are used to separate the variables.
The Title is up to 70 characters long and the Ageunit is up to 11 characters long. An example is
given in Figure 10.5

Figure 10.5 RELCODE Windows Standard Data Format - a Comma Delimited ASCII file.
Drive Belt

10.2.2 Creating Data in RELCODE Windows Data Format using a Spreadsheet

Files can be created in RELCODE Windows Data format using a spreadsheet, such as Excel. Figure
10.6 shows an example. The data is output from Excel using Comma Delimited (also known as Comma
Separated Variable .CSV) format. The resulting file will be similar to that shown in Figure 10.5.

Figure 10.6 An Excel Spreadsheet showing RELCODE Data. Output the data in .CSV Format
to obtain a file similar in format to Figure 10.5


     A            B       C
1    Drive Belt
2    Hours        1000    60
3    1584         F       1
4    2634         F       1
5    3540         S       1
6    4136         F       2
7    4789         F       1
8    5579         F       1
9    6368         S       3

In Figure 10.6, the left hand column represents the row numbers and the top row represents the
column letters and these are not part of the data.
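
As an alternative to a spreadsheet, a file in this format can be written or checked by a short program. The sketch below is an illustration based on the row layout given in Section 10.2.1 and the data of Figure 10.6. The function names are hypothetical and are not part of RELCODE, and the handling of trailing commas on the title row is an assumption about what a spreadsheet export may produce.

```python
def parse_relcode_windows(text):
    """Read RELCODE Windows Standard Format text into a dictionary."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    # Row 1: Title. A spreadsheet export may pad it with trailing commas.
    title = lines[0].rstrip(",").strip()
    # Row 2: Ageunit, Failure Replacement Cost, Preventive Action Cost.
    age_unit, failure_cost, preventive_cost = (f.strip() for f in lines[1].split(","))
    # Row 3 onwards: Age, Event type (F or S), Frequency.
    events = []
    for line in lines[2:]:
        age, event_type, frequency = (f.strip() for f in line.split(","))
        events.append((float(age), event_type.upper(), int(frequency)))
    return {
        "title": title,
        "age_unit": age_unit,
        "failure_cost": float(failure_cost),
        "preventive_cost": float(preventive_cost),
        "events": events,
    }

def format_relcode_windows(item):
    """Write a dictionary back out as comma delimited text."""
    rows = [item["title"],
            f'{item["age_unit"]},{item["failure_cost"]:g},{item["preventive_cost"]:g}']
    rows += [f"{age:g},{kind},{freq}" for age, kind, freq in item["events"]]
    return "\n".join(rows) + "\n"

# The data of Figure 10.6, one spreadsheet row per line.
SAMPLE = (
    "Drive Belt\n"
    "Hours,1000,60\n"
    "1584,F,1\n"
    "2634,F,1\n"
    "3540,S,1\n"
    "4136,F,2\n"
    "4789,F,1\n"
    "5579,F,1\n"
    "6368,S,3\n"
)

item = parse_relcode_windows(SAMPLE)
failures = sum(freq for _, kind, freq in item["events"] if kind == "F")
suspensions = sum(freq for _, kind, freq in item["events"] if kind == "S")
print(item["title"], item["age_unit"], failures, "failures,", suspensions, "suspensions")
```

Counting the frequencies in this way is a quick check that a file describes the intended number of items before it is imported.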

10.3 Exporting Data

Data can be exported from RELCODE to an ASCII file. To do this proceed as follows:

1. Select the relevant Item at the Item Header Screen (Figure 10.1)
2. Click the Export Data button. A pop up Export Data screen will appear as shown in Figure
10.7.
3. Select or enter the name of the file to which you wish to send the data and click OK.

Figure 10.7 Export Data Screen


The resulting data will be sent to the nominated file in the format described in Section 10.2.1 and
illustrated in Figure 10.5. Unlike DOS RELCODE, there is only one export format for Windows
RELCODE.

10.4 Exporting Results to File or Clipboard

10.4.1 Sending Results to a File

Reports can be saved to files from several points in RELCODE. This is done by clicking the Save
Report to File button on screens where it appears. The most comprehensive results are obtained
from the Previous Analysis Summary Screen, shown in Figure 9.7. To retain the various
parameters it is necessary to click the Save Parameters button at various screens as the analysis
proceeds. The most recently saved parameters will appear on the Previous Analysis Summary
Screen.

When we click the Save Results to File button we will be prompted for a file name. The results
will then be written to the file in a text format. This file can then be printed, or read by another
program as devised by the user.

10.4.2 Report to Clipboard

Reports and graphs can also be sent to the clipboard by clicking the Report to Clipboard or Copy
Graph to Clipboard button at relevant screens. The contents of the clipboard can then be pasted
into documents of the user's choosing.


10.5 Automatically Reading Data from a File

Some non-standard versions of RELCODE have an automatic file reading feature which allows the
user to specify a file from which data will be automatically read when the Item Header screen is
first loaded. This section applies only to those installations where the automatic read feature has
been specifically provided.

The file to be read can be in either RELCODE for Windows Data Format or DOS RELCODE
Standard Ungrouped Data Format. The RELCODE for Windows Data Format is defined in
Section 10.2 of this chapter. The DOS RELCODE Standard Ungrouped Data Format is defined
in the DOS RELCODE User Manual.

To activate automatic reading the user runs RELCODE and at the Start Screen (Figure 2.1) clicks
the (special feature) Set Automatic Read button. The Set Automatic Read Screen shown in Figure
10.8 then appears.

10.5.1 Auto-Read File

The file from which data is to be automatically read is referred to as the Auto-Read file. To use
the Auto-Read feature, at the Set Automatic Read Screen the user enters the following:

• The name and full path of the Auto-Read file

• A Yes/No option indicating whether Auto-Read is to be active

• An option button indicating the format of the Auto-Read file

When the user clicks the OK button, this information is stored in the local file DBNAME.TXT
(which also includes the name of the current or default Access database used by RELCODE).

When Auto-Read is active and the Item Header Screen (Figure 2.2) is first loaded, RELCODE will
attempt to read the Auto-Read file. If successful, the data in this file will be imported into the
database and will be selected as the current item. To view the event data, click the Event Data
Entry button. The data can then be analysed as normal.


Figure 10.8 Set Automatic Read Screen


11. Introduction to Reliability Statistics

11.1 General

The ultimate goal in the development of systems is to achieve such a high level of reliability that
maintenance is no longer needed. Whilst this goal may be some way off, steps toward it benefit from
the measurement of reliability. In this and following chapters we shall introduce the concepts of
statistical reliability analysis as a basis for measuring reliability, understanding different reliability
patterns and selecting appropriate maintenance policies and reliability improvement approaches.

The statistical analysis of reliability data enables us to:

• Measure reliability as a basis for system acceptance, quality assurance and continuous improvement
• Identify appropriate preventive maintenance or replacement policies
• Make comparisons between competing designs, versions, products
• Establish mean life and other parameters for spare parts planning
• Establish failure rate patterns as an aid to identifying the root cause of failure

The purposes of this chapter are:

• to introduce basic reliability terms

• to introduce the ideas of burn-in, random and wearout failure patterns

• to introduce the following statistical distribution functions used in reliability analysis and to
provide an understanding of their meaning:
hazard function (failure rate)
failure probability density function
reliability function
distribution function

• to show graphs illustrating the various functions, as a basis for understanding reliability data

11.2 Definitions

11.2.1 Reliability as a Function of Operating Life

In broad terms, an item is said to be reliable if it is likely to carry out its designed function without
failure, and to continue doing this over a substantial period of time.

The following definition of reliability is given in BS4778.

Reliability: The ability of an item to perform a required function under stated conditions for a
stated period of time.


The ‘stated period of time’, or the operating life, may be measured in terms of calendar time,
operating hours, kilometres or miles run, cycles of operation or any other appropriate unit.

11.2.2 Reliability of One-Shot Devices

Some items - for example a rocket - are used only once and in that case the reliability is the
probability of successful performance on the one occasion.

11.2.3 Failure
The definition of reliability just given implies that we are able to distinguish between a failure on the
one hand and successful performance on the other. In some cases, the onset of failure is a clear cut
event, for example, a metal filament light globe either works or it does not. However, in many cases,
it will be necessary to carefully define what constitutes a failure.

For example, a catalyst used in a chemical reaction can deteriorate gradually, and if a given level of
catalytic action were important, in a certain application, it would be necessary to define this in
specifying when the catalyst had “failed”. Comparisons of the reliability of systems can only be fairly
made if “failure” is defined in the same way for each type of item. The following definition of failure
is given in BS4778.

Failure: The termination of the ability of an item to perform a required function.

From the foregoing discussion we see that the "required function" must be defined clearly in relation
to any specific application if consistent measurement of reliability is to be achieved.

11.2.4 Reliability as Probability

The definition of reliability given in Section 11.2.1 is conceptual, in that it does not indicate how
reliability might be measured. The measurement of reliability depends on the use of statistics, and on
the definition of reliability as a probability. We, therefore, introduce the following definition for the
reliability of an item which functions over an operating life.

Reliability: The probability that an item will perform a required function under stated
conditions for a stated duration of operating life.

Using standard statistical concepts, we regard the time to failure as a random variable, which we shall
denote by the symbol T. Let the variable t be the measure of the operating life in appropriate units;
for example, hours, kilometres, cycles. Then the reliability to age t is the probability that an item
survives to age t without failure.

Reliability = Probability (T>t) 11.1

11.2.5 Mean Time to Failure (MTTF)

If we have a number of similar items, and we observe how long each one operates before it fails,
then we can take an average of the observed ages at failure. This figure will give an estimate of the
Mean Time to Failure (MTTF) of the items. The Mean Time to Failure is the average value of
operating life at which failure occurs.
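The estimate described above, together with a simple estimate of the reliability to a given age, can be sketched as follows. The failure ages are hypothetical, and the sketch assumes that every item has been run to failure (no suspensions). The function names are illustrative only.

```python
def mean_time_to_failure(failure_ages):
    """Average of the observed ages at failure."""
    return sum(failure_ages) / len(failure_ages)


def empirical_reliability(failure_ages, t):
    """Fraction of items that survived beyond age t, an estimate of Probability(T > t)."""
    survivors = sum(1 for age in failure_ages if age > t)
    return survivors / len(failure_ages)


# Hypothetical ages at failure, in operating hours, for five similar items.
ages = [120.0, 340.0, 410.0, 560.0, 570.0]

mttf = mean_time_to_failure(ages)                    # 2000 / 5 = 400 hours
reliability_400 = empirical_reliability(ages, 400.0)  # 3 of 5 items survive past 400 hours
print(mttf, reliability_400)
```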


11.2.6 Mean Time Between Failures (MTBF)
Another term which is often used in connection with reliability is the Mean Time Between Failures or
MTBF. If we have a number of similar components and when a component fails we replace it with a
similar new one, and if over a period of time we note the number of component-hours of service
achieved and the corresponding total number of failures, then the ratio of the component-hours of
service to the number of failures will be the Mean Time Between Failures or MTBF. In the long run,
the MTBF and the MTTF will have the same value, and the two terms are often used interchangeably.
The MTTF or MTBF is the most important single measure of reliability, but by itself it does not give
any indication of the type of failure pattern; that is, whether failures occur randomly, as a result of
wearout, due to burn-in, or as some combination of these modes.
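The MTBF calculation described above is a simple ratio, sketched below with hypothetical figures: ten component positions each giving 5000 hours of service, with 25 failures in total.

```python
def mean_time_between_failures(component_hours, number_of_failures):
    """Ratio of component-hours of service to the number of failures."""
    if number_of_failures <= 0:
        raise ValueError("at least one failure is needed to estimate the MTBF")
    return component_hours / number_of_failures


# Hypothetical service record.
component_hours = 10 * 5000.0
mtbf = mean_time_between_failures(component_hours, 25)
print(mtbf)
```

Note that the same MTBF of 2000 hours could arise from burn-in, random or wearout failures, which is why the failure pattern must be examined separately.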

11.3 Phases of Failure

Many items exhibit one or more of three phases of failure, known as Burn In, Random and Wearout
Failures. These are summarised below.


Phase      Equipment                                Human Analogy

Burn In    Resulting from manufacture defects,      Infant Mortality.
           faulty installation or setup.            Approximately the same probability of
                                                    death in age range 0 - 1 as in range 1 - 40.

Random     Load in excess of design strength.       Accidents, Random Illness;
           Nail in tyre; applies to complex and     for example, a traffic accident
           especially electronic equipment.         or incurable illness.

Wearout    Mechanical wear, corrosion, fatigue;     Old Age.
           performance drift below specification.   Three score years and ten.

11.4 Bath Tub Curve

The Bath Tub Curve is a schematic plot of the failure rate or hazard function for an item which
exhibits all three phases of failure. Figure 11-1 illustrates this.


Figure 11-1 - Bath Tub Curve (Hazard Function or Failure Rate), showing the Burn In, Random and
Wearout phases

The failure rate which appears on the vertical axis of the Bath Tub Curve is the probability, per
unit of operating life, that an item fails in the next small interval, given that it has survived so far.
This quantity is also referred to as the instantaneous failure rate, hazard function or, in the case of
human life, as the force of mortality.

From the Bath Tub Curve we see that the failure rate is high during the Burn In phase, is relatively
low and constant during the Random phase, and then increases again in the Wearout phase.

In practice, items may not exhibit all these phases of failure. Manufacturers may artificially age
items to eliminate Burn In failures (referred to as Stress Screening), and the onset of Wearout may lie
outside the normal range of operating life. Thus the Random failure phase is often regarded as the
most important, and the failure rate is regarded as being roughly constant over the operating life of
equipment. However, it would be unwise to dismiss Burn In and Wearout failures too lightly.

11.5 Other Failure Rate Patterns

Studies by the US Federal Aviation Authority found that six different failure rate patterns occurred
in the items studied. These are shown in Figure 11-2.

Pattern A is the bath tub curve discussed in the previous section. Pattern B is constant failure rate
followed by wearout. This may occur if early life failures are eliminated by stress screening.

Pattern C is gradually increasing failure rate. This is typical of items which are subject to
corrosion or chemical wear.

Pattern D is an initially increasing failure rate followed by a constant failure rate. Here new items
are resistant to excess stress, but after a while this resistance is lost and random failures occur.

Pattern E is constant failure rate, that is random failures only. This pattern is common. It usually
indicates failure causes which are external to the item itself, such as a metal object breaking a pump
vane or a nail in a tyre.


Pattern F, decreasing and then constant failure rate is also common. It combines burn-in failures
with later random failures.

Figure 11-2 - Six Failure Rate Patterns

11.6 Importance of the Failure Rate Pattern

Identification of the failure pattern is an important factor in maintenance decision making.

It helps identify the root cause of failure. There is a tendency to assume that wearout causes most
failures, but in fact, burn-in and random failures are more common. The identification of the
failure pattern or patterns will give a useful indicator of how to find, and hence eliminate the
physical cause of failure.

Burn-in failures are a sign of defective manufacture, installation, set up or maintenance. When
they are present, attention should focus on checking new or recently refurbished items for correct
assembly, etc.

Random failures, as already indicated, occur typically due to sudden external stresses in excess of
installed strength. This will include misuse or misadventure failures.

Wearout in fact has several patterns. One is characterised by gradually increasing failure rate,
typical of corrosion, dirt build up, chemical or erosive wear. The second is a sudden, sharp
increase in failure rate, typical of fatigue failure or the conventional wear of rubbing or other
mechanical action.

Whilst the above mechanisms often occur in association with the failure patterns indicated, they are
only a broad guide and exceptions may occur in practice.

Also, the patterns shown in Figure 11-2 do not constitute an exhaustive list. For example, we
sometimes have a constant failure rate followed by another (usually higher) constant failure rate.


11.7 Failure Probability Density Function, f(t)

So far we have introduced some basic concepts of statistical reliability, particularly the various failure
rate patterns. Now we shall look more formally at a range of functions and equations used in
reliability analysis.

The failure probability density function (p.d.f.) is a plot which is such that the area under the curve
between any two ages is equal to the probability that a new item fails in the given age interval. This
differs from the failure rate curve in which the probability of failure is conditional on the item having
survived to the current age. Figure 11-3 shows a schematic failure p.d.f. exhibiting all three failure
phases. Note that in the failure p.d.f. the curve falls to zero at the right hand end, whereas the Bath
Tub Curve, Figure 11-1, continues to rise.

To illustrate the difference between the two curves we can use the human analogy. For an 85 year old
person the Bath Tub Curve will show the probability of death before age 86, which is relatively high,
then for an 86 year old person the probability of death before age 87, which is higher still, and so on.
By contrast the failure p.d.f. will show the probability that a newly born baby will die at age 85,
which is low, followed by the probability that a newly born baby will die at age 86, which is lower
still, and so on.

Figure 11-3 - Failure Probability Density Function

For the probability density function the area under the curve between any two ages t1, t2, gives the
probability that a new item will fail in that age interval. The total area under the curve adds up to 1,
because every item is certain to fail at some time.

Probability of failure in interval $t_1$ to $t_2$ = $\int_{t_1}^{t_2} f(t)\,dt$     11.2

$\int_0^{\infty} f(t)\,dt = 1$     11.3
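The two area properties of the failure p.d.f. can be checked numerically. The sketch below uses a hypothetical density with a constant failure rate of 0.01 per unit of operating life, chosen only because its exact areas are easy to write down. The integration routine is a plain midpoint rule and the upper limit of 3000 stands in for infinity.

```python
import math


def integrate(func, lower, upper, steps=10000):
    """Midpoint-rule approximation of the area under func between lower and upper."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width


def example_pdf(t, rate=0.01):
    """Hypothetical failure p.d.f. used only for this illustration."""
    return rate * math.exp(-rate * t)


# Probability that a new item fails between ages 50 and 150.
p_interval = integrate(example_pdf, 50.0, 150.0)
exact_interval = math.exp(-0.5) - math.exp(-1.5)

# Total area under the curve; 3000 is far enough out to stand in for infinity here.
total_area = integrate(example_pdf, 0.0, 3000.0)

print(round(p_interval, 4), round(total_area, 4))
```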


11.8 Reliability Function, R(t)

The reliability function corresponds to the probability that an item survives to any given age.

Let T denote the time to failure, a random variable

t denote age

For an item which starts to operate at age t = 0, the reliability function is the probability that failure
does not occur in the interval 0 to t. We denote this by R(t)

R(t) = Probability(T>t) 11.4

Figure 11-4 schematically illustrates a reliability function R(t) and also the cumulative probability of
failure or distribution function F(t). In Figure 11-4 the vertical scale represents the reliability, or
probability of survival, expressed as a percentage. The horizontal scale represents the age of the item.

The reliability function is related to the failure probability density function by the fact that the
reliability to age t is 1 minus the area under the failure probability density function up to age t:

$R(t) = 1 - \int_0^t f(u)\,du$     11.5

Figure 11-4 - Reliability Function and Cumulative Probability of Failure or Distribution Function F(t)

11.9 Distribution Function, F(t)

The distribution function (d.f.) (or cumulative distribution function c.d.f., or cumulative probability of
failure) is the probability of failure at or before age t.


Let F(t) denote the distribution function

Then F(t) = Probability (T ≤ t) 11.6

F(t) = 1 - R(t) 11.7

F(t) + R(t) = 1 11.8

Figure 11-4 schematically illustrates this function, F(t). It also shows the complementary nature of the
reliability and distribution functions, corresponding to equation 11.8.

11.10 Relationship between Probability Density Function f(t) and Distribution Function F(t)

The probability density function (p.d.f.) of the time to failure is a function of age, such that the
area under the curve between any two age values gives the probability that a new item will fail in
that age interval. This has already been illustrated schematically in Figure 11-3. The probability
density function, denoted f(t), is the differential coefficient of the distribution function F(t).

Thus we have the following equations:

f(t) = dF(t) / dt 11.9

$F(t) = \int_0^t f(u)\,du$     11.10

Probability of failure in t to t + δt ≈ f(t).δt 11.11
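These relationships can be illustrated numerically. The sketch below again assumes a hypothetical distribution with a constant failure rate of 0.01. It compares a numerical derivative of F(t) with f(t), and compares the exact probability of failure in a short interval with the approximation f(t).δt.

```python
import math

RATE = 0.01  # hypothetical constant failure rate, for illustration only


def distribution_function(t):
    """F(t) for the hypothetical distribution."""
    return 1.0 - math.exp(-RATE * t)


def density(t):
    """f(t) for the hypothetical distribution."""
    return RATE * math.exp(-RATE * t)


def numerical_derivative(func, t, step=1e-4):
    """Central-difference estimate of d func / dt at t."""
    return (func(t + step) - func(t - step)) / (2.0 * step)


t, delta_t = 100.0, 0.5

slope_of_F = numerical_derivative(distribution_function, t)
exact_probability = distribution_function(t + delta_t) - distribution_function(t)
approx_probability = density(t) * delta_t

print(slope_of_F, density(t))
print(exact_probability, approx_probability)
```

The approximation improves as δt is made smaller, which is the sense in which f(t) is the differential coefficient of F(t).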

11.11 Hazard Function, h(t)

The hazard function h(t) is a function such that the probability that an item which has survived to
age t fails in the small interval t to t + δt is h(t)δt. This is the function, known loosely as the
"failure rate", which is represented in the Bath Tub Curve, in Figure 11-1, and in Figure 11-2.

The hazard function can be related to the reliability function R(t) and the probability density
function f(t) as follows. The probability of failure in the interval t to t + δt is f(t).δt and is also
R(t).h(t).δt .

Thus f(t).δt = R(t).h(t).δt 11.12

f(t) = R(t) h(t) 11.13

h(t)= f(t)/R(t) = f(t)/(1 - F(t)) 11.14

We can derive a general relationship between the reliability and the hazard function from the
preceding equations.

From equations 11.7 and 11.9 we have:


f(t) = d F(t)/dt = -d R(t)/dt 11.15

From 11.13 and 11.15 we get:

-d R(t)/dt = R(t) h(t) 11.16

Separating variables and integrating we get:

-d R(t)/R(t) = h(t) dt 11.17

$R(t) = \exp\left[-\int_0^t h(u)\,du\right]$     11.18
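The final result of this derivation, that the reliability is the exponential of minus the area under the hazard function, can be checked with a short calculation. The sketch below assumes a hypothetical hazard function which rises linearly with age, h(t) = 0.0002 t, for which the area up to age t is 0.0001 t squared. The function names are illustrative only.

```python
import math


def reliability_from_hazard(hazard, t, steps=10000):
    """R(t) = exp(-area under the hazard function from 0 to t), by the midpoint rule."""
    width = t / steps
    cumulative_hazard = sum(hazard((i + 0.5) * width) for i in range(steps)) * width
    return math.exp(-cumulative_hazard)


def linear_hazard(t, slope=0.0002):
    """Hypothetical hazard function that increases linearly with age."""
    return slope * t


age = 100.0
r_numeric = reliability_from_hazard(linear_hazard, age)
r_exact = math.exp(-0.0002 * age ** 2 / 2.0)  # closed form for this hazard

# The density then follows from f(t) = R(t) h(t).
f_at_age = r_exact * linear_hazard(age)

print(r_numeric, r_exact, f_at_age)
```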

11.12 Conclusion
In this chapter we have introduced the basic terms used in the statistical analysis of reliability data
and the statistical functions, hazard function, failure p.d.f., reliability function and cumulative
distribution function (cdf or df). In the next chapter we shall introduce particular forms of these
functions which are widely used in reliability analysis.


12. Life Distributions

12.1 Introduction
Decisions about aspects of reliability and maintenance depend to a significant extent on an assessment
of when items will fail. We shall rarely know exactly when failure will occur, and the best scientific
assessment will normally involve statistically fitting a distribution to failure data.

There are several reasons for using standard distribution models and standard procedures for
reliability analysis. These are:

(a) The desire for objectivity. Using a standard technique allows us to treat data from varied
sources in similar style and in this way assists with engineering judgement across a broad
spectrum of applications.

(b) The need for automating data analysis. The existence of a standard procedure means that this
procedure can be followed by technical staff and that it can also be computerised, leading to
efficient treatment of data and the extraction of useful information in a cost effective way.

(c) The merits of the techniques have been established in many studies, and lead to directly useful
information such as the values of the distribution parameters.

In this chapter we shall introduce three distribution models:

• Negative exponential distribution

• Weibull distribution
• Bi-Weibull distribution

12.2 Negative Exponential Distribution

The negative exponential distribution corresponds to the case of constant failure rate. This is
Pattern E in Figure 11-2 and is also the middle part of the Bath Tub in Figure 11-1. The failure
rate is usually denoted by the Greek letter λ (lambda).

The equations for the various reliability functions then are as follows, expressed in terms of the
parameter, λ.
F(t) = 1 - exp [-λt] 12.1
R(t) = exp [-λt] 12.2
f(t) = λexp [-λt] 12.3
h(t) = λ 12.4
Mean Life = 1/λ = MTBF 12.6
MTBF = Mean Time Between Failures
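The negative exponential equations translate directly into code. The following is a minimal sketch with a hypothetical failure rate of 0.002 failures per hour; the function names are illustrative.

```python
import math


def exp_distribution(t, lam):
    """F(t) = 1 - exp[-lambda t]"""
    return 1.0 - math.exp(-lam * t)


def exp_reliability(t, lam):
    """R(t) = exp[-lambda t]"""
    return math.exp(-lam * t)


def exp_density(t, lam):
    """f(t) = lambda exp[-lambda t]"""
    return lam * math.exp(-lam * t)


def exp_mean_life(lam):
    """Mean Life = 1 / lambda = MTBF"""
    return 1.0 / lam


lam = 0.002                     # hypothetical failure rate, failures per hour
mtbf = exp_mean_life(lam)       # 500 hours
r_at_mtbf = exp_reliability(mtbf, lam)
hazard = exp_density(200.0, lam) / exp_reliability(200.0, lam)  # h = f / R

print(mtbf, r_at_mtbf, hazard)
```

Note that the reliability at an age equal to the MTBF is exp(-1), or about 36.8%, so rather less than half of the items survive to the mean life.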

The negative exponential distribution is a special case of the Weibull distribution (discussed in the
next section), with Weibull Parameters β = 1, and η = 1/λ.

Figure 12-1 shows the negative exponential failure rate (a constant). The graph was produced by a
program which generates Weibull plots and is the special case where BETA = 1.


Figure 12-1 - Negative Exponential Failure Rate is Constant (Random Failures). Also a special
case of Weibull Distribution with BETA = 1.0.

Figure 12-2 - Negative Exponential Probability Density Function (pdf)

Figure 12-2 shows the negative exponential probability density function, corresponding to equation 12.3.

Figure 12-3 shows the negative exponential distribution cumulative distribution function,
corresponding to equation 12.1.


Figure 12-3 - Negative exponential cumulative distribution function

12.3 Weibull Distribution


The Weibull Distribution (pioneered by Swedish researcher Waloddi Weibull in the 1950s) can
represent any one phase of failure, that is, Burn In, Random or Wearout, depending on the Shape
Parameter of the distribution. It cannot represent the existence of all three failure phases (or even two
phases) for the same item. Nevertheless it is found that the Weibull distribution provides a good
statistical model for many practical purposes, and is usually superior to other models with the same
number of parameters.

The equations of the Weibull distribution contain a parameter called BETA, denoted by the Greek
letter β, which is known as the shape parameter. The shape of the Weibull probability density
function and other functions is different for different values of BETA.

When the value of BETA is less than 1, the Weibull distribution represents a pattern of Burn-In
failures. For BETA equal to 1 the Weibull distribution reduces to the negative exponential
distribution. For BETA greater than 1, the Weibull distribution represents wearout failures. The
larger the value of BETA, the more pronounced is the wearout effect. BETA values in the range 1.5
to 2.5 may indicate some blend of random and wearout failures. This relationship between the BETA
value and the phase of failure is summarised in Table 12.1.

Table 12.1. Weibull Shape Parameter and Failure Phase Represented

Shape Parameter Beta, β      Failure Phase

<1                           Burn In
1                            Random
>1                           Wearout

The Weibull distribution also has a scale parameter ETA (η), known as the Characteristic Life. This
parameter is related to the mean of the distribution and corresponds to the age by which 63.2% of
items have failed.

There is also a three parameter version of the Weibull distribution which we shall consider in a later section.

The equations of the Weibull distribution, based on shape parameter β (BETA) and Characteristic
Life η (ETA) are as shown in Table 12.2 below.


Range                                     $0 \le t \le +\infty$
Shape Parameter                           BETA ($\beta$)
Scale Parameter                           ETA ($\eta$), also known as Characteristic Life
Cumulative distribution function          $F(t) = 1 - \exp[-(t/\eta)^{\beta}]$
Probability density function              $f(t) = (\beta t^{\beta-1}/\eta^{\beta}) \exp[-(t/\eta)^{\beta}]$
Inverse distribution function             $G(\alpha) = \eta\{\log[1/(1-\alpha)]\}^{1/\beta}$
(of probability $\alpha$)
Survival function                         $S(t) = \exp[-(t/\eta)^{\beta}]$
Inverse survival function                 $Z(\alpha) = \eta[\log(1/\alpha)]^{1/\beta}$
(of probability $\alpha$)
Hazard function (failure rate)            $h(t) = \beta t^{\beta-1}/\eta^{\beta}$
Cumulative hazard function                $H(t) = (t/\eta)^{\beta}$
Mean ($\Gamma$ is Gamma Function)         $\mu = \eta\,\Gamma[(\beta+1)/\beta]$
Variance                                  $\sigma^2 = \eta^2(\Gamma[(\beta+2)/\beta] - \{\Gamma[(\beta+1)/\beta]\}^2)$
Mode                                      $\eta(1 - 1/\beta)^{1/\beta}$, $\beta \ge 1$
                                          $0$, $\beta \le 1$
Coefficient of Variation ($\sigma/\mu$)   $\left\{\Gamma[(\beta+2)/\beta] / \{\Gamma[(\beta+1)/\beta]\}^2 - 1\right\}^{1/2}$

Characteristic Life
The Characteristic Life ETA (η) has the property that when t = η, then the cumulative distribution
function takes the value: 1 - exp (-1) = 0.632 for every β.
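The main Weibull equations can be written as short functions, as in the sketch below. The parameter values (ETA = 1000, and several values of BETA) are hypothetical, and the function names are illustrative. The sketch confirms the Characteristic Life property for several values of BETA and evaluates the mean using the Gamma function.

```python
import math


def weibull_cdf(t, eta, beta):
    """F(t) = 1 - exp[-(t/eta)^beta]"""
    return 1.0 - math.exp(-((t / eta) ** beta))


def weibull_pdf(t, eta, beta):
    """f(t) = (beta t^(beta-1) / eta^beta) exp[-(t/eta)^beta], for t > 0"""
    return (beta * t ** (beta - 1) / eta ** beta) * math.exp(-((t / eta) ** beta))


def weibull_hazard(t, eta, beta):
    """h(t) = beta t^(beta-1) / eta^beta, for t > 0"""
    return beta * t ** (beta - 1) / eta ** beta


def weibull_mean(eta, beta):
    """mu = eta Gamma[(beta + 1) / beta]"""
    return eta * math.gamma((beta + 1.0) / beta)


def weibull_inverse_cdf(alpha, eta, beta):
    """G(alpha) = eta {log[1 / (1 - alpha)]}^(1/beta)"""
    return eta * math.log(1.0 / (1.0 - alpha)) ** (1.0 / beta)


eta = 1000.0  # hypothetical Characteristic Life
for beta in (0.75, 1.0, 2.0, 3.3):
    # F(eta) is 1 - exp(-1), about 0.632, whatever the value of beta.
    print(beta, round(weibull_cdf(eta, eta, beta), 3), round(weibull_mean(eta, beta), 1))
```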

Random Number Generation

Random numbers of the Weibull random variable W: η, β can be generated using the relationship

$W : \eta, \beta \sim \eta(-\log R)^{1/\beta}$

where R is a uniform random variable on the range 0 to 1.
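A minimal sketch of this random number generation method is given below, using hypothetical parameter values ETA = 1000 and BETA = 2. The seed is fixed only so that the run is repeatable. The uniform variable is taken as 1 minus the generator output so that it is never exactly zero, which would make the logarithm undefined.

```python
import math
import random


def weibull_random(eta, beta, rng):
    """One Weibull random number: eta (-log R)^(1/beta), with R uniform on (0, 1]."""
    r = 1.0 - rng.random()  # rng.random() lies in [0, 1), so r lies in (0, 1]
    return eta * (-math.log(r)) ** (1.0 / beta)


rng = random.Random(12345)          # fixed seed, for repeatability only
eta, beta = 1000.0, 2.0             # hypothetical parameter values
samples = [weibull_random(eta, beta, rng) for _ in range(20000)]

# About 63.2% of the simulated lives should fall at or below the Characteristic Life.
fraction_by_eta = sum(1 for s in samples if s <= eta) / len(samples)
sample_mean = sum(samples) / len(samples)
print(round(fraction_by_eta, 3), round(sample_mean, 1))
```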

12.4 Weibull Graphs - Hazard Function

Graphs illustrating various functions of the Weibull distribution for different values of the shape
parameter BETA (β) are given in this section and subsequent sections. These graphs were created by
the RELCODE software package.

First, in Figure 12-4, we illustrate the Hazard Function or failure rate h(t), as this shows clearly the
relationship between the Weibull shape parameter and the phases of failure represented in the Bath
Tub curve (Figure 11-1).

The vertical scale of the hazard function graph is measured in failures per unit of operating life and
the graph represents the instantaneous failure rate at any age.

The horizontal scale for this and other graphs in this section is the Operating Life or age in
appropriate units.

Figure 12-4 includes the Weibull Hazard Function for BETA = 0.75. As BETA is less than 1, the
hazard function or failure rate decreases with age, corresponding to Burn-In failures.

We have noted previously that for BETA = 1.0, the Weibull distribution reduces to the negative
exponential distribution, so the hazard function for BETA = 1 is a constant and in fact, η (ETA) is
the conventional Mean Time Between Failures or MTBF in this case.

For BETA = 2.0 the hazard function increases at a constant rate and corresponds to Pattern C in
Figure 11-2. This represents what may be termed "gradual wearout" or possibly a blend of random
and wearout failures, in contrast to the case where BETA = 3.3. For values of BETA above 2.0, the
gradient of the hazard function increases with age, representing a stronger or more marked wearout
effect than for lower values of BETA.
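The behaviour described above can be seen by evaluating the Weibull hazard function at a few ages. The sketch below assumes a hypothetical Characteristic Life of ETA = 100 and uses the BETA values discussed in this section.

```python
def weibull_hazard(t, eta, beta):
    """h(t) = (beta / eta) (t / eta)^(beta - 1), for t > 0"""
    return (beta / eta) * (t / eta) ** (beta - 1)


eta = 100.0                       # hypothetical Characteristic Life
ages = [25.0, 50.0, 100.0, 200.0]

hazard_table = {}
for beta in (0.75, 1.0, 2.0, 3.3):
    hazard_table[beta] = [weibull_hazard(t, eta, beta) for t in ages]
    print(beta, [round(h, 5) for h in hazard_table[beta]])
```

For BETA = 0.75 the values fall with age, for BETA = 1.0 they are all equal to 1/ETA, for BETA = 2.0 they double each time the age doubles, and for BETA = 3.3 they rise increasingly steeply.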


Figure 12-4 - Weibull Hazard Function for Various Values of Beta.

12.5 Weibull Graphs - Probability Density Function

Graphs of the Weibull probability density function f(t) for BETA values of 0.5, 2.0 and 3.3 are shown
in Figure 12-5.

The vertical scale of the probability density function graph is measured in Failures per Unit of
Operating Life. The simplest way to interpret this graph is by recalling that the area under the curve
between any two values of Operating Life, is the probability that a new item will fail in that age range.

For BETA < 1, represented in Figure 12-5 by the curve for BETA = 0.5, the Weibull p.d.f. is heavily
weighted towards low ages and goes to infinity at age zero. This represents the Burn-In failure phase
with a high initial failure probability density which then decreases.

For BETA = 1, recall that the Weibull distribution reduces to the Negative Exponential distribution
represented by Figure 12-2. The value of the failure probability density at age zero is 1/η, and the pdf
then decreases with age.

For BETA greater than 1 we illustrate two cases in Figure 12-5. Firstly, for BETA = 2.0, the
gradient at the origin is initially positive and decreases gradually. This corresponds to the gradual
wearout or combined random/wearout situation. Secondly, for BETA = 3.3, the Weibull distribution
takes on a bell shape which is very similar to the Normal distribution. These figures illustrate the
versatility of the Weibull distribution in representing a family of probability density functions which
includes the Negative Exponential, a shape comparable to the Normal, and a range of other shapes
representative of different failure patterns. It is this flexibility which has made the Weibull
distribution popular with reliability engineers.


Figure 12-5 - Weibull Probability Density Function (pdf) f(t) for
Various Values of Beta

12.6 Weibull Graphs - Reliability Function

Graphs of the Weibull reliability function R(t) for BETA values of 0.75, 1.0, 2.0 and 3.3 are shown
in Figure 12-6.

The vertical scale of reliability function graphs is Probability expressed as a percentage and the
horizontal scale is Operating Life. The graph shows the probability that a new item will survive to the
corresponding age.

For BETA < 1, the Weibull reliability function falls steeply at first and then flattens out. The
reliability will approach zero as the age tends to infinity, but with BETA < 1, this approach is very
gradual. The reliability will have the value 36.79% when the age is equal to ETA, whatever the value
of BETA.


Figure 12-6 - Weibull Reliability Functions for Various Values of Beta.

For BETA = 1, the Weibull distribution reduces to the Negative Exponential distribution. The curve
falls quite steeply at first, though not as steeply as for BETA <1. After the age value ETA, the curve
asymptotically approaches the zero level at a faster rate than for BETA <1.

For BETA = 2.0, the curve remains high initially and then falls at a fairly steady rate, finally
approaching the zero value asymptotically. This is the gradual wearout case.

For BETA = 3.3, the Weibull reliability function remains close to the 100% level initially and then
falls sharply, indicating strong wearout. The zero level is approached more rapidly than in the
previous cases and is reached within the range of the graph (within the accuracy of the plot).

12.7 The Three Parameter Weibull Distribution

Further flexibility can be introduced into the Weibull distribution by adding a third parameter which is
a location parameter and is usually denoted by the symbol gamma (γ). The probability of failure is
zero for t<γ and then follows a Weibull distribution with origin at age γ. Gamma is often referred to
as the Minimum Life parameter and can be used in conjunction with any values of BETA and ETA.
Figure 12-7 illustrates a Weibull p.d.f. with GAMMA = 12 and BETA = 2. We see that the failure
probability density is zero between 0 and 12 and then follows the usual Weibull pattern for BETA =
2. Figure 12-8 shows the corresponding c.d.f.

The three parameter Weibull distribution may give a better fit to given failure data than the two
parameter distribution. From a mathematical viewpoint, giving the distribution an extra parameter
allows it more flexibility leading to a closer (or at least as close) fit to any given data. Although
gamma is usually called the "Minimum Life" this does not guarantee that no failures will occur below
this value in the future.
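The three parameter Weibull functions can be sketched as follows. GAMMA = 12 and BETA = 2 are taken from the example above; the value ETA = 20 is a hypothetical choice made for this illustration, and the function names are illustrative. At exactly t = GAMMA the density is returned as zero, which is a convention of this sketch.

```python
import math


def weibull3_cdf(t, gamma, eta, beta):
    """Probability of failure by age t; zero up to the Minimum Life gamma."""
    if t <= gamma:
        return 0.0
    return 1.0 - math.exp(-(((t - gamma) / eta) ** beta))


def weibull3_pdf(t, gamma, eta, beta):
    """Failure probability density; zero up to the Minimum Life gamma."""
    if t <= gamma:
        return 0.0
    z = (t - gamma) / eta
    return (beta / eta) * z ** (beta - 1) * math.exp(-(z ** beta))


gamma, eta, beta = 12.0, 20.0, 2.0   # eta is a hypothetical value

for age in (5.0, 12.0, 22.0, 32.0, 52.0):
    print(age, round(weibull3_pdf(age, gamma, eta, beta), 5),
          round(weibull3_cdf(age, gamma, eta, beta), 4))
```

With a Minimum Life, 63.2% of items have failed by age GAMMA + ETA rather than by age ETA.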


Figure 12-7 - Three Parameter Weibull Probability Density Function with Minimum Life =
12 units

Figure 12-8 - Three Parameter Weibull Cumulative Distribution Function with Minimum
Life = 12 Units

12.8 Bi-Weibull Distributions

The Weibull distribution is considerably more flexible than the negative exponential and normal
distributions. Nevertheless, when considered in relation to the failure patterns of Figure 11-2, it
only covers two, patterns C and E. One particular limitation of the Weibull distribution is that it
does not cover patterns B and F which are quite common in practice. To further extend the
flexibility of our model we make use of a bi-Weibull distribution. A bi-Weibull distribution is
formed by combining two Weibull distributions. Several versions of bi-Weibull distribution have
been proposed by different authors, see references to Chapter 1. These versions differ in the way
in which the two Weibull distributions are combined, in the number of parameters used, the range
of the parameters specified and so on. The version used in RELCODE was described by Hastings
and Ang in Reference 3 of Chapter 1, and may be referred to simply as the bi-Weibull distribution,
or a Hastings distribution, or Hastings bi-Weibull distribution if we wish to distinguish it from
other bi-Weibull variants.

12.9 Derivation of Hastings bi-Weibull Distribution

Hastings bi-Weibull distribution is derived by adding two Weibull hazard functions.

The first of these hazard functions is a two parameter Weibull hazard function with the equation:

$h(t) = \lambda\theta(\lambda t)^{\theta-1}$     (12.7)

In equation 12.7, t is the component age, h(t) is the hazard function at age t, λ is the reciprocal of
a scale parameter and θ is a shape parameter. The case where θ = 1 corresponds to a constant
failure rate λ.

The second hazard function is a three parameter Weibull hazard function, which becomes operative
for t > γ. The equation is:

$h(t) = \left(\frac{\beta}{\eta}\right)\left(\frac{t-\gamma}{\eta}\right)^{\beta-1}$     (12.8)

In equation 12.8, β, η and γ are respectively shape, scale and location parameters, as in the three
parameter Weibull distribution.

Adding the two hazard functions gives Hastings bi-Weibull distribution, for which the hazard and
reliability equations are:


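$h(t) = \lambda\theta(\lambda t)^{\theta-1}$,    $0 < t < \gamma$    (12.9)

$h(t) = \lambda\theta(\lambda t)^{\theta-1} + \frac{\beta}{\eta}\left(\frac{t-\gamma}{\eta}\right)^{\beta-1}$,    $t \ge \gamma$    (12.10)

$R(t) = e^{-(\lambda t)^{\theta}}$,    $0 < t < \gamma$    (12.11)

$R(t) = e^{-[(\lambda t)^{\theta} + ((t-\gamma)/\eta)^{\beta}]}$,    $t \ge \gamma$    (12.12)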

In equations 12.7 to 12.12, θ is not confined to values less than or equal to 1, and β is not confined
to values greater than 1, although the values do often conform to these ranges in practice.

The following rules apply:

γ ≥ 0, η > 0, β > 0, λ ≥ 0, θ ≥ 0.

If λ = 0 then θ = 0. Hastings bi-Weibull distribution reduces to a Weibull distribution if λ = θ = 0,
in which case the conventional Weibull parameters γ, η, β are used.

Figures 12-9, 12-10 and 12-11 show respectively the Hastings bi-Weibull hazard function,
probability density function and reliability function for the following parameter values:
λ = 0.01, θ = 0.6, γ = 40, η = 40, β = 3.0.

This example corresponds to a combination of burn-in and wearout failures. The range of shapes
which can be taken by the Hastings bi-Weibull distribution is large. Any combination of two
Weibull failure rate patterns can be accommodated, for example, burn-in plus wearout, random
plus wearout, burn-in plus random, random plus another random starting later. β is not required to
be greater than 1, nor θ less than 1. In practice, the ability of the Hastings distribution to detect the
onset of wearout is one of its main advantages.
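
As a minimal sketch (not RELCODE's own code), equations 12.9 to 12.12 can be evaluated in Python as shown here. The function names are illustrative assumptions; the parameter values are those quoted for Figures 12-9 to 12-11.

```python
import math

def biweibull_hazard(t, lam, theta, gamma, eta, beta):
    """Hazard h(t) of equations 12.9 and 12.10, for t > 0."""
    # First term: two parameter Weibull hazard (absent when lam = theta = 0).
    h = lam * theta * (lam * t) ** (theta - 1) if lam > 0 else 0.0
    # Second term: three parameter Weibull hazard, operative from age gamma.
    if t >= gamma:
        x = (t - gamma) / eta
        if x == 0 and beta < 1:
            return math.inf  # the hazard is unbounded at t = gamma when beta < 1
        h += (beta / eta) * x ** (beta - 1)
    return h

def biweibull_reliability(t, lam, theta, gamma, eta, beta):
    """Reliability R(t) of equations 12.11 and 12.12, for t > 0."""
    exponent = (lam * t) ** theta if lam > 0 else 0.0
    if t >= gamma:
        exponent += ((t - gamma) / eta) ** beta
    return math.exp(-exponent)

# Parameter values used for Figures 12-9 to 12-11.
params = dict(lam=0.01, theta=0.6, gamma=40, eta=40, beta=3.0)
for age in (10, 30, 45, 80):
    print(age, biweibull_hazard(age, **params), biweibull_reliability(age, **params))
```

With these values the hazard falls over the early ages (burn-in) and rises once the age exceeds γ = 40 (wearout).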

Figure 12-9 - Hastings Bi-Weibull Hazard Function

Figure 12-10 - Hastings Bi-Weibull Probability Density Function

Figure 12-11 - Hastings Bi-Weibull Reliability Function

12.10 Discussion
The negative exponential, Weibull and bi-Weibull distributions form a family of distributions of
gradually increasing complexity.

The negative exponential is the simplest and deals only with the constant failure rate, pattern E of
Figure 11-2.

The Weibull includes the negative exponential as a special case and extends the range of models to
include Pattern C and Pattern D as an approximate three parameter Weibull. It will also provide a
solution for the strictly decreasing failure rate pattern and for the strong wearout case which do not
occur in Figure 11-2 but do exist in practice. However, a negative aspect is that where a “double”
pattern exists, Weibull fitting can tend to obscure this, since it is bound to average out the two.

The bi-Weibull distribution includes the Weibull as a special case, but allows two failure phases so
that patterns B and F are now covered. Also, as Figure 12-9 shows, the Hastings bi-Weibull
distribution can provide a fair approximation for the Bath Tub, Pattern A. Thus the whole range of
patterns is effectively covered.

13. Statistical Methods and Applications I

13.1 Good As New and Bad As Old

Reliability analysis of the type which we are considering here is applicable to items which reach a
point where they are deemed to have failed. At failure, the item is removed from service, and may
possibly be replaced by a new item or by a “good-as-new” item (which may in fact be the original
item restored to good-as-new condition). Each new or good-as-new item becomes a separate item
in the analysis.

In a simple case like a metal filament light globe, failure is a clear cut event. In other cases, an
item might deteriorate to a point where it is no longer usable, and then be taken out of service
without experiencing actual failure to perform. An example of this would be a tyre which wears to
a point where its use is no longer legal.

In some cases an item may be repaired in the course of its life. For example, a tyre may have a
puncture which is repaired and the item then continues in service. Some punctures, however, may
be so serious that the tyre is not repaired but is discarded. Assuming that we are interested in how
long tyres last before they are replaced, then any event which requires replacement of the tyre is a
failure. This includes both severe punctures which require the tyre to be discarded and normal
wear which reaches the legal limit. A “normal” puncture, which is repaired so that the tyre
continues in use, is not a failure in this case. These repairs are known as “bad as old” repairs,
since the tyre will continue to work, but its condition of wear will still reflect its age.

The definition of “failure” is essentially up to the analyst and any logically consistent definition
may be applied, provided that the results take due account of the definition used.

Pursuing the tyre example, in an application where the kilometres between wheel changes were
of interest (perhaps because the vehicle operated in a remote location where puncture repair
facilities were not available), normal punctures may be regarded as failures for purposes of
analysis. We shall assume that the analyst has defined failure in a way suited to his purposes.

When replacement occurs and data on the new item forms part of our analysis, we assume that the
new item is similar to the original item when new. This may be because the new item actually is
new, or because it is “as good as new”.

There may be situations where repairs leave an item in a condition which is neither “as bad as old”
nor “as good as new”, but we shall not consider these.

13.2 Failures and Suspensions

In practice, reliability data usually consists of ages at failure for some items, together with data
indicating that some other items have run successfully to known ages without failure. These latter
items are known as suspensions or suspended items (also as censored items).

Where suspensions occur, it is essential to take them into account if valid reliability analysis is to be
carried out. However, for the present we shall consider situations where all items run to failure,
leaving the analysis of the suspended item case until later.

13.3 Basic Logic of the Analysis
Given our reliability data, we are interested in finding a statistically valid life distribution model
which will enable us to make soundly based decisions regarding reliability and maintenance.

The basic logic of the analytical process is:

1. Assume a model type, e.g., one of:

negative exponential

2. Estimate the parameters for the assumed model type from the data, yielding a fitted model.

3. Statistically test the hypothesis that the data could have arisen at random from the fitted
model. This gives a measure of the goodness-of-fit of the model.

4. Repeat with other model types and use a statistical test of model quality to decide which
model is most appropriate.

5. Use the preferred model as an aid to determining appropriate replacement policies or for
other management decisions.

The RELCODE computer software package automates this process.

13.4 Weibull Probability Paper

Prior to the widespread availability of computers, manual methods of data analysis were needed.
For reliability analysis, the technique of Weibull plotting using Weibull probability paper was

The plot is of the cumulative probability of failure. A special probability paper is used, the scales
of which are modified so that any Weibull distribution function appears as a straight line on the
graph paper. The vertical axis represents the cumulative percentage failures and the horizontal axis
represents the age at failure. The horizontal scale is logarithmic, whilst the vertical scale
represents $\log[\log[1/(1-\alpha)]]$ for probability α. This transformation converts the Weibull
distribution function into a straight line.
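
The reason the transformation gives a straight line can be written out in a few steps. This is a sketch which assumes the two parameter Weibull distribution function in its usual form, with scale η and shape β, and uses natural logarithms (another base only changes the scale of the axis):

```latex
\begin{align}
F(t) &= 1 - \exp\left[-(t/\eta)^{\beta}\right] \\
\frac{1}{1 - F(t)} &= \exp\left[(t/\eta)^{\beta}\right] \\
\ln \ln \frac{1}{1 - F(t)} &= \beta \ln t - \beta \ln \eta
\end{align}
```

Plotted against ln t, the left hand side is therefore a straight line with gradient β. At t = η the distribution function is 1 - 1/e, or about 63.2%, whatever the value of β.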

13.5 Weibull Plotting

13.5.1 Example
As an example of Weibull plotting, consider the data in Table 13.1, which relates to a sample of 10
switches. The switches are labeled A, B, C, and so on, and the number of operating cycles to failure
observed for each switch is shown in the table.

Table 13-1 Switch Failures - Basic Data
Switch   Operations to Failure
A        1980
B        760
C        120
D        210
E        2170
F        3800
G        700
H        1350
I        1100
J        380

The Weibull probability paper analysis method can be summarized as follows:

1 Sort the data by increasing age at failure, determining the order number of each failure.
2 Estimate the cumulative probability of failure at each failure age using the median rank
formula, equation 13.2.
3 Plot the cumulative probability of failure against age on Weibull paper.
4 Determine the Weibull parameters by fitting a straight line to the data on the probability
paper.

13.5.2 Order Number

The first step in the analysis is to sort the failure data into the sequence of age at failure. Table 13.2
shows this. The Order Number, i, denotes the i-th failure in sequence. The number of items of data
is denoted by N, in this case N = 10.

Table 13-2 Switch Failures Showing Failure Order Number
Order Number i Switch Operations to Failure
1 C 120
2 D 210
3 J 380
4 G 700
5 B 760
6 I 1100
7 H 1350
8 A 1980
9 E 2170
10 F 3800

13.5.3 Cumulative Probability Estimator and Median Rank

The Weibull plot is a plot of the cumulative probability of failure against age. Our data can be
considered as a sample from an underlying population. The most widely used cumulative probability
estimator is:

Cumulative probability estimator = (i - 0.3)/(N + 0.4) 13.2

Equation 13.2 is Benard's formula. It estimates what is known as the Median Rank of the cumulative
probability of failure. This formula is used by RELCODE.

To illustrate the application of equation 13.2, consider the somewhat extreme case where we had only
one failure (a sample of 1). We would not expect the age of this failure to represent the age by which
100% of items in the underlying population would fail. It would be more realistic to regard the single
age at failure as representing the age by which 50% of the underlying population would fail. Benard's
formula for i = 1 and N = 1 gives a probability level of

$\frac{i - 0.3}{N + 0.4} = \frac{1 - 0.3}{1 + 0.4} = \frac{0.7}{1.4} = 0.5$

that is 50%, which is intuitively reasonable.

The next step in the analysis is to extend Table 13.2 to show the Cumulative Probability of Failure
Estimator as given by equation 13.2. This is shown in Table 13.3. As an example of the
calculation, consider the tenth failure, for which i = 10. We have:

$p = \frac{i - 0.3}{N + 0.4} = \frac{10 - 0.3}{10 + 0.4} = \frac{9.7}{10.4} = 0.933$

Table 13-3 Switch Failures Showing Median Rank (N = 10)
Order Number i   Operations to Failure t   Median Rank (i - 0.3)/(N + 0.4)
1                120                       0.067
2                210                       0.163
3                380                       0.260
4                700                       0.356
5                760                       0.452
6                1100                      0.548
7                1350                      0.644
8                1980                      0.740
9                2170                      0.837
10               3800                      0.933
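
The median ranks in Table 13-3 can be reproduced with a few lines of Python. This is an illustrative sketch only; the function name median_rank is an assumption, not part of RELCODE.

```python
def median_rank(i, n):
    """Benard's approximation to the median rank, equation 13.2."""
    return (i - 0.3) / (n + 0.4)

# Switch data from Table 13-1 (operations to failure), in the order A to J.
operations_to_failure = [1980, 760, 120, 210, 2170, 3800, 700, 1350, 1100, 380]

ordered = sorted(operations_to_failure)
n = len(ordered)
table = [(i, t, round(median_rank(i, n), 3)) for i, t in enumerate(ordered, start=1)]

for i, t, rank in table:
    print(f"{i:>2}  {t:>5}  {rank:.3f}")
```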

13.6 Weibull Plot - Results

RELCODE will perform a Weibull plot, plotting the Median Rank (expressed as a percentage) against
the corresponding number of operations to failure, t. RELCODE then estimates the Weibull
parameters ETA (η) and BETA (β). Figure 13-2 shows this.

13.6.1 Characteristic Life (ETA, η)

The straight line fitted to the data in Figure 13-2 represents a Weibull distribution function. The ETA
value (Characteristic Life) is found by noting where the line cuts the horizontal dotted line at the
probability value of 63.2%. A vertical is then dropped to the Age at Failure scale and this gives the
value of ETA. In Figure 13-2 we have ETA = 1344.

13.6.2 Shape Parameter (BETA, β)

The Shape Parameter is found from the gradient of the fitted line. In this case we get:

β = 1.01
η = 1344 Operations

Since β is close to 1, we deduce that the switches are subject to random failures and that the MTBF is
1334 operations. This concludes the Weibull analysis for these switches.
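
The straight line fit can be imitated numerically. The sketch below assumes an ordinary least squares fit of ln ln[1/(1 - p)] on ln t, where p is the median rank; this is one common convention and may differ in detail from the fit performed by RELCODE, so the estimates are close to, but not identical with, the values quoted above. The function name is an assumption.

```python
import math

def fit_weibull_by_rank_regression(failure_ages):
    """Estimate (beta, eta) by a least squares line on Weibull probability scales."""
    ordered = sorted(failure_ages)
    n = len(ordered)
    # Horizontal scale: log of age. Vertical scale: log log 1/(1 - median rank).
    xs = [math.log(t) for t in ordered]
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                       # gradient of the fitted line
    intercept = y_mean - beta * x_mean     # equals -beta * ln(eta)
    eta = math.exp(-intercept / beta)      # age at which the line reaches 63.2%
    return beta, eta

switches = [1980, 760, 120, 210, 2170, 3800, 700, 1350, 1100, 380]
beta, eta = fit_weibull_by_rank_regression(switches)
print(f"beta = {beta:.2f}, eta = {eta:.0f} operations")
```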

Figure 13-2 - Weibull Plot for Switches

13.7 Random Failures

This section focuses specifically on the situation where the failure rate is constant. This is the random
failure phase, where neither burn-in nor wearout failures are evident.

Constant (or approximately constant) failure rates arise frequently - and are often assumed to occur
without being really verified. Random failures arise particularly with:

• failures arising from some erratic external cause, e.g. metal object damaging a pump vane;
• electronic equipment;
• very complex equipment, or aggregations of equipment with some components replaced;
• as an approximation for any other hazard function over a short interval.

The aim of this analysis is to show how to estimate reliability in the case of random failures, in
particular:

• to show how to estimate the Mean Time Between Failures (MTBF)
• to introduce the idea of confidence limits
• to show how to calculate confidence limits for the MTBF

13.7.1 Random failures and the Negative Exponential Distribution

The random failure rate case corresponds to a Weibull distribution with β = 1. The hazard function
then reduces to a constant h(t) = 1/η = λ, a constant referred to as the failure rate. This distribution
is most commonly referred to as the Negative Exponential Distribution, details of which are given in
Chapter 12. The MTBF is the reciprocal of the failure rate.

13.8 Estimation of the MTBF

13.8.1 Service Hours

When one machine operates for one hour it provides one service-hour.

13.8.2 Point Estimate of the MTBF

An unbiased estimate of the MTBF (assuming random failures) is given by the mean number of
service-hours per failure. If n failures occur in T service hours we have

Estimated MTBF = T/n 13.3

If m items are observed and ti is the observed operating time of the i-th item, the total service-hours,
T, is found by adding up the operating time of the items. In equation form this is expressed by:

T = Σ ti 13.4

It is immaterial whether any specific item fails, is suspended or replaced, since the failure rate for all
items is assumed to be constant and independent of age.
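
As a small sketch of equations 13.3 and 13.4, the function below adds up the operating time of every item, failed or not, and divides by the number of failures. The function name and the five operating times are hypothetical values chosen only for illustration.

```python
def mtbf_point_estimate(operating_times, n_failures):
    """Point estimate of the MTBF: total service-hours divided by failures."""
    if n_failures <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    total_service_hours = sum(operating_times)   # equation 13.4
    return total_service_hours / n_failures      # equation 13.3

# Hypothetical data: five items observed, of which two failed.
# The operating time of the three unfailed items still counts towards T.
hours = [120, 300, 450, 80, 50]
print(mtbf_point_estimate(hours, n_failures=2))
```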

13.9 Confidence Limits - General Concepts

The estimate of the MTBF given by equation 13.3 is known as a "point estimate". This means that if
we had to pick our single best value for the MTBF we should pick the value given by equation 13.3.
The true value of the MTBF may be greater or less than that given by equation 13.3.

Confidence limits are values such that we are confident at some stated level (e.g. 90% confidence)
that the true value will not be greater than an upper limit, or less than a lower limit.

13.9.1 One Sided and Two Sided Confidence Limits

Confidence limits can be one or two sided.

A one sided 90% lower confidence limit for the MTBF would be a value such that we are 90%
confident that the true MTBF exceeds the value given.
A one sided 90% upper confidence limit for the MTBF would be a value such that we are 90%
confident that the true MTBF is less than the value given.

A two sided 90% confidence interval is such that we are 90% confident that the true value lies
between the upper and lower boundaries of the interval. In this case, there is a 5% chance of the
value being below the lower limit and a 5% chance of it being above the upper limit.

In reliability analysis we are usually interested in lower one sided confidence limits.

13.10 Confidence Limits for the MTBF

Confidence limits for the MTBF can be calculated manually with the aid of Table 13.4. This is based
on equations developed by Epstein (1960). Table 13.4 shows the ratio of the lower and upper
confidence limits to the MTBF at various confidence levels and for various numbers of failures.

The value of the lower limits depends on whether testing terminated at a failure, or whether testing
was terminated after some elapsed time, but not specifically at a failure.

There is a special case where we have zero failures. In this case, no upper limit can be specified,
but the lower limit is the elapsed service-hours, T, multiplied by the value in the Table 13.4 for n
= 0.

13.11 Confidence Limits - Procedure

If a number of items are observed and it is found that at the time of the n-th failure, the total
service hours provided is T then, as seen earlier in equation 13.3, we make a point estimate for the
MTBF:

MTBF = T/n 13.5

Confidence limits for the MTBF are then found using Table 13.4. If testing terminated at the n-th
failure, we enter the table at the corresponding number of failures n, and use the “End on Fail”
columns for the lower limit. We select the column according to the confidence limit we are
seeking, that is, 99%, 95% or 90%, lower or upper. If testing terminated after a certain time, but
not specifically at a failure, then we use the “End on Time” columns.

The value in the table is then multiplied by the point estimate of the MTBF (equation 13.5) to give
the required confidence limit.

13.12 Confidence Limits Example

Ten items are tested to failure and the total service-hours obtained is 100. Estimate the MTBF and
give a lower one sided 95% confidence limit for the MTBF.
The solution is as follows:

T = 100, n = 10

Point Estimate of MTBF = T/n = 100/10 = 10 hours

For the lower 95% confidence limit, enter Table 13.4 at

Failures, n = 10
Lower Conf Limit = 95%
Table entry = 0.637 (end on failure case)

Hence Lower 95% Confidence Limit = 0.637 x 10

= 6.37 hours

           Lower - End on Failure   Lower - End on Time      Upper
Failures   99%    95%    90%        99%    95%    90%        90%     95%     99%
0*         -      -      -          .217   .334   .434       -       -       -
1          .217   .334   .434       .151   .211   .257       9.479   19.41   99.50
2          .301   .422   .514       .238   .318   .376       3.759   5.626   13.46
3          .357   .476   .564       .299   .387   .449       2.722   3.670   6.661
4          .398   .531   .599       .345   .437   .500       2.292   2.927   4.860
5          .431   .546   .626       .381   .476   .539       2.055   2.538   3.909
6          .458   .571   .647       .412   .507   .570       1.904   2.296   3.360
7          .480   .591   .665       .437   .532   .595       1.797   2.131   3.004
8          .500   .608   .680       .460   .554   .616       1.718   2.010   2.753
9          .517   .624   .693       .479   .573   .634       1.657   1.917   2.566
10         .532   .637   .704       .496   .590   .649       1.607   1.843   2.421
11         .546   .649   .714       .512   .604   .663       1.567   1.783   2.306
12         .558   .659   .723       .526   .617   .675       1.533   1.733   2.211
13         .570   .669   .731       .539   .629   .686       1.504   1.691   2.131
14         .580   .677   .738       .550   .640   .696       1.478   1.654   2.064
15         .589   .685   .745                                1.456   1.622   2.006
20         .628   .717   .772                                1.377   1.509   1.805
25         .657   .741   .792                                1.327   1.438   1.683
30         .679   .759   .806                                1.291   1.389   1.601
35         .697   .773   .818                                1.265   1.353   1.540
40         .712   .785   .828                                1.245   1.325   1.494
45         .725   .795   .837                                1.228   1.302   1.457
50         .736   .804   .844                                1.214   1.283   1.427
Table 13-4 One Sided Confidence Limits as a Multiple of the MTBF. *For zero failures, lower limit as a
multiple of the service time.
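
For numbers of failures or confidence levels not listed, multipliers of this kind can be computed from the chi-square distribution. The sketch below assumes the standard chi-square relationships for exponentially distributed lives (2n degrees of freedom when testing ends on a failure, 2n + 2 when it ends on time); the manual itself only cites Epstein (1960), so treat this as an assumption about how the table was built. With these relationships the sketch agrees with the n = 10 row of the table. All function names are illustrative, and only the Python standard library is used.

```python
import math

def chi2_cdf_even(x, dof):
    """Chi-square distribution function for an even number of degrees of freedom."""
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, dof // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total

def chi2_quantile_even(p, dof):
    """Value x such that chi2_cdf_even(x, dof) = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < p:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def lower_factor(n, confidence, end_on_failure=True):
    """Lower one sided limit as a multiple of the point estimate T/n (n >= 1)."""
    dof = 2 * n if end_on_failure else 2 * n + 2
    return 2.0 * n / chi2_quantile_even(confidence, dof)

def upper_factor(n, confidence):
    """Upper one sided limit as a multiple of the point estimate T/n (n >= 1)."""
    return 2.0 * n / chi2_quantile_even(1.0 - confidence, 2 * n)

def zero_failure_lower_factor(confidence):
    """Lower limit as a multiple of the service time T when no failures occur."""
    return 2.0 / chi2_quantile_even(confidence, 2)

print(round(lower_factor(10, 0.95), 3))
print(round(lower_factor(10, 0.95, end_on_failure=False), 3))
print(round(upper_factor(10, 0.95), 3))
```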

13.13 Exercise - Confidence Limits for the MTBF

Data relating to manpack radios shows the following. Estimate the MTBF and give a two sided 90%
confidence interval.
Utilisation Hours   Number of Sets   Number of Failures
0 - 50              100              4
50 - 100            200              4
100 - 150           200              3
150 - 200           100              3


Mid-point   Sets   Service Hrs. (= Mid-point x Sets)   Failures
25          100    2,500                               4
75          200    15,000                              4
125         200    25,000                              3
175         100    17,500                              3
                   ________                            _______
                   T = 60,000                          n = 14

MTBF = T/n = 60000/14 = 4286 hours

The question asks for a two sided 90% confidence interval. This corresponds to finding the lower and upper
one sided 95% confidence limits.

Lower Limit. Table 13.4 entry = 0.640 (end on time case)

Lower Confidence Limit = 0.640 x 4286 hours = 2743 hours

Upper Limit. Table 13.4 entry = 1.654

Upper Confidence Limit = 1.654 x 4286 hours = 7089 hours

Result. MTBF Point Estimate = 4286 hours, 90% two sided interval is 2743 to 7089 hours.
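
The arithmetic of this exercise can be checked with a short script. It assumes, as the solution does, that each set in a utilisation band has run for the mid-point of that band, and it takes the two multipliers directly from Table 13.4 for n = 14. The variable names are illustrative.

```python
# (band start, band end, number of sets, number of failures)
bands = [(0, 50, 100, 4), (50, 100, 200, 4), (100, 150, 200, 3), (150, 200, 100, 3)]

# Service-hours per band = mid-point of the band x number of sets.
total_service_hours = sum((start + end) / 2 * sets for start, end, sets, _ in bands)
failures = sum(f for _, _, _, f in bands)

mtbf = total_service_hours / failures

# Table 13.4 entries for n = 14 at the one sided 95% level.
lower_limit = 0.640 * mtbf   # lower, end on time
upper_limit = 1.654 * mtbf   # upper

print(round(mtbf), round(lower_limit), round(upper_limit))
```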

14. Statistical Methods and Applications II

14.1 Suspended Items

In practice, we often wish to analyse data where some items have failed and some have not. To
obtain accurate reliability information it is essential to take proper account of the successful
performance of the unfailed items. Unfailed items are called "Suspensions" or "Suspended Items."
The term censored items is also used.

Suspended items may arise for a number of reasons. One case is where these items are continuing in
service and simply have not yet failed.

Another possible source of suspended items is preventive replacement. If a preventive replacement
policy is in place, some items will be replaced without having failed and these will be suspended items.
Other conditions where you would record suspensions would be if the parent item were removed
without the component under study having failed. This may occur due to a relocation or replacement
of the equipment, or because of a traffic accident for example.

A procedure for dealing with suspended items is illustrated in the following example:

14.2 Suspended Items - Data

An example of reliability data which includes both failures and suspended items is shown in Table
14.1. This example relates to diesel engines in earthmoving plant. Failure is defined as a situation
where an engine is replaced by a new or overhauled (good as new) engine.

Table 14-1 Reliability Data with Suspended Items - Diesel Engines

Event Number, e   Hours Run   Status       Failure Number, i
1                 3895        Failure      1
2                 4733        Suspension
3                 7886        Suspension
4                 9063        Failure      2
5                 10030       Failure      3
6                 12123       Suspension

In fitting a distribution to this data, it is necessary to calculate an estimate of the cumulative
probability of failure. In doing this we must take account of the suspended items. If they were
ignored, the reliability would be seriously underestimated.

The method used to allow for suspended items is a refinement of the Age Sensitive Method described
in Hastings and Bartlett (forthcoming 1997). The Age Sensitive Method is an improvement on the
method described by Herd (1960) and Johnson (1964). The use of the Hastings-Bartlett Age
Sensitive Method improves the accuracy of RELCODE in estimating reliability from your data,
relative to the Herd-Johnson method.

14.3 Suspended Items - Formula

In the case where there are no suspended items we made use of the failure order-number, i, and the
total number of items, N, in estimating the cumulative probability of failure, using equation 13.2.
With suspended items we make use of a modified order-number, mi, which allows for the
suspended items. The modified order number is calculated using formulae given in this section.
The following symbols will be used:

i = failure order-number
j = suspension order-number
e = event order-number
N = total number of events
ei = event-number of failure i
ej = event number of suspension j
mi = modified order-number of failure i
S(i) = set of suspensions occurring at or after failure i-1 and before failure i.
This set may be empty.
fi = age at which failure i occurs
sj = age at which suspension j occurs
m*i = N + 1 - mi
e*i = N + 1 - ei
αj = the proportion of the current inter-failure interval which has elapsed
when suspension j occurs.
f0 = m0 = e0 = 0

For suspensions j in the set S(i), αj is defined by:

$\alpha_j = (s_j - f_{i-1})/(f_i - f_{i-1})$    14.1

The formula for the modified order-number is:

$m_i - m_{i-1} = m^*_{i-1}\left[1 - \frac{e^*_i}{e^*_{i-1}} \prod_{j \in S(i)} \frac{e^*_j + 1 - \alpha_j}{e^*_j - \alpha_j}\right]$    14.2

In equation 14.2, the product is taken over suspensions in the set S(i). If this set is empty the
product term has value 1, and the equation reduces to:

$m_i - m_{i-1} = m^*_{i-1} / e^*_{i-1}$    14.3

14.4 Suspended Items - Example of Calculation

We start by calculating the modified order numbers using equations 14.1, 14.2 and 14.3. Once all the
modified order numbers have been calculated we calculate the median rank of the cumulative
probability of each failure using the formula:

Median Rank = (mi - 0.3)/(N + 0.4) 14.4

The calculation is illustrated by the following example, based on the data in Table 14.1. Consider
Event 1 in Table 14.1. As this first event is a failure, we apply equation 14.3 and get

m1 = 1 14.5

Events 2 and 3 are suspensions. Using equation 14.1 we get:

α2 = (4733 - 3895)/(9063 - 3895) = 0.162

α3 = (7886 - 3895)/(9063 - 3895) = 0.772

Event 4 is the second failure, so from equation 14.2 we get

m2 - m1 = 6(1-(3/6) x ((6 - 0.162)/(5 - 0.162)) x ((5 - 0.772)/(4 - 0.772)))

= 1.259

m2 = 2.259

Event 5 is a failure and there are no suspensions in between events 4 and 5 so we use equation 14.3,
m3 - m2 = (6 + 1 - 2.259)/(6 + 1 - 4) = 1.58

m3 = 2.259 + 1.58 = 3.839

We have now calculated all the modified order-numbers. We then use equation 14.4 to calculate the
median ranks. Table 14.2 summarizes the results. Once the median ranks have been calculated, the
usual probability plotting technique can be applied.
Table 14-2 Modified Order Numbers and Median Ranks
Event Order   Hours Run   Status       Failure     Modified Order   Median Rank
Number e                               Number i    Number mi        (mi - 0.3)/(N + 0.4)
1             3895        Failure      1           1                11%
2             4733        Suspension
3             7886        Suspension
4             9063        Failure      2           2.259            31%
5             10030       Failure      3           3.839            55%
6             12123       Suspension
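
The calculation of Section 14.4 can be sketched in Python as follows. This is an illustrative implementation of equations 14.1 to 14.4 as stated in this chapter, not RELCODE's own code; the function name and the data layout (a list of age and failure-flag pairs sorted by age) are assumptions. Because the hand calculation uses rounded values of α, the results may differ from Table 14-2 in the third decimal place.

```python
def modified_order_numbers(events):
    """events: (age, is_failure) pairs sorted by age.
    Returns (age, modified order-number, median rank) for each failure."""
    n_total = len(events)
    results = []
    prev_m, prev_e, prev_f = 0.0, 0, 0.0   # m0 = e0 = f0 = 0
    pending = []                           # suspensions since the previous failure
    for e, (age, is_failure) in enumerate(events, start=1):
        if not is_failure:
            pending.append((e, age))
            continue
        m_star_prev = n_total + 1 - prev_m
        e_star_prev = n_total + 1 - prev_e
        e_star = n_total + 1 - e
        span = age - prev_f
        product = 1.0                      # stays 1 when the set S(i) is empty
        for e_j, s_j in pending:
            # Equation 14.1; alpha is taken as 0 for tied ages (an assumption).
            alpha = (s_j - prev_f) / span if span > 0 else 0.0
            e_star_j = n_total + 1 - e_j
            product *= (e_star_j + 1 - alpha) / (e_star_j - alpha)
        # Equation 14.2 (reduces to equation 14.3 when product = 1).
        m = prev_m + m_star_prev * (1 - (e_star / e_star_prev) * product)
        rank = (m - 0.3) / (n_total + 0.4)  # equation 14.4
        results.append((age, m, rank))
        prev_m, prev_e, prev_f, pending = m, e, age, []
    return results

# Diesel engine data of Table 14-1.
events = [(3895, True), (4733, False), (7886, False),
          (9063, True), (10030, True), (12123, False)]
for age, m, rank in modified_order_numbers(events):
    print(f"{age:>6} hours  m = {m:.3f}  median rank = {rank:.0%}")
```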

14.5 Suspended Items - RELCODE Analysis

RELCODE uses the analytical method just described to get the modified order numbers and median
ranks which are then used in fitting distribution models. The Confidence Limits Table in RELCODE,
shown at Figure 14.1, shows the values of the modified order numbers, which correspond to those in
Table 14.2. Figure 14.1 also shows the median ranks for the reliability and these values are one
minus the values in Table 14.2. Fitting a distribution model using RELCODE yields the following:

Shape Parameter, 2.41
Characteristic Life, 12,010 hours
Mean Life 10,648 hours

The corresponding Weibull plot is shown in Figure 14.2.
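
As a consistency check on these figures, the mean of a two parameter Weibull distribution is η Γ(1 + 1/β). The sketch below assumes that the fitted model is a two parameter Weibull (no minimum life); under that assumption the quoted shape parameter and characteristic life give a mean life close to the quoted 10,648 hours. The function name is illustrative.

```python
import math

def weibull_mean_life(eta, beta):
    """Mean of a two parameter Weibull distribution with scale eta and shape beta."""
    return eta * math.gamma(1.0 + 1.0 / beta)

mean_life = weibull_mean_life(eta=12010, beta=2.41)
print(round(mean_life))
```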

Figure 14.1 Confidence Interval Screen - Showing the Modified Order-Numbers and
Median Ranks.

Figure 14-2 Weibull Plot for Diesel Engines Example

14.6 Bi-Weibull Distribution Example

This example relates to a component called an Oscillating Axle Bush and illustrates an application
of the bi-Weibull distribution. Table 14.3 shows the data in the format printed by RELCODE. In
Table 14.3 the data have been sorted by age. In column 3, F = Failure and S = Suspension. In
all there are 11 failures and 14 suspensions.

Table 14-3 Oscillating Axle Bushes Data


Age Unit: OP HRS
User: RELCODE User Manual
Record Number Age Event Type Frequency

1 290 S 1
2 334 S 1
3 452 F 1
4 695 F 1
5 769 F 1
6 1668 F 1
7 2150 S 1
8 2210 S 1
9 2252 S 1
10 2467 S 1
11 2607 S 1
12 2662 F 1
13 3212 S 1
14 3260 F 1
15 3576 F 1
16 3820 S 1
17 3852 S 1
18 3984 S 1
19 4011 S 1
20 4203 S 1
21 4454 S 1
22 4636 F 1
23 4818 F 1
24 5041 F 1
25 5134 F 1

We enter the data into RELCODE in the usual way and at the Analysis Menu select Model
Optimization - Fit Distribution. RELCODE fits a range of models and recommends a preferred
model. The results of fitting the various models to the data of Table 14.3 are shown in Figure
14.3. This procedure was introduced in Chapter 4. In this case the preferred model is Model 6,
the bi-Weibull Distribution.

Figure 14-3 Distribution Fitting Summary

RELCODE selects a preferred model on the basis of relative model quality which is defined in
Section 4.8. In Figure 14.3 we see that the relative model quality for the bi-Weibull distribution is
noticeably higher than for the other distribution models. The fitted parameters of the bi-Weibull
are shown in Figure 14.4.

Figure 14.4 Bi-Weibull Parameters and Goodness of Fit Test.

From Figure 14.4 we see that the hypothesis that the bi-Weibull distribution fits this data is not
rejected. Hence the bi-Weibull is a good model in this case. From the parameter values we see
that THETA has a value of 1.13, which is close to one. Thus there are approximately random
failures initially. The second part of the bi-Weibull cuts in at a GAMMA value of 4446 hours, and
has a BETA value of 2.76, indicating wearout. Thus we see that we have a combination of random
failures and wearout failures, with the onset of wearout at about 4500 hours.

Not all examples give such a clear cut interpretation as this one, but the bi-Weibull is generally
very valuable in identifying multiple failure rate patterns.

A reliability plot showing the data and the fitted distribution is given in Figure 14.5. We can see
that the reliability falls relatively slowly at first and then falls sharply at about 4500 hours. This
corresponds to the situation which we have identified from the parameters shown in Figure 14.4.

Figure 14.5 Bi-Weibull Reliability Plot

Figure 14.6 shows the bi-Weibull hazard function for this example, with the sharp rise in the
failure rate starting at about 4500 hours.

Figure 14.6 Bi-Weibull Hazard Function

Figure 14.7 shows the bi-Weibull plot on Weibull Probability Scales. The change in slope between
the random and wearout phases is clearly apparent.

Figure 14.8 shows the plot generated when we fit the 2 parameter Weibull to the Oscillating Axle
Bush data. From a simple manual plot we might regard this model as satisfactory. This would
mean that we would miss the sharp increase in failure rate that occurs at about 4500 hours. We
could therefore be over optimistic in assessing the reliability of this component and fail to recognise
the need for preventive replacement and/or design review.

This concludes the bi-Weibull example. The example will, however, be referred to further in our
discussion of the various models and fitting techniques later in this chapter.

Figure 14.7 Bi-Weibull Plot on Weibull Probability Scales.

Figure 14.8 Weibull 2 parameter distribution fitted to Axle Bush data.

14.7 Distribution Models and Fitting Methods Used by RELCODE

The models and fitting methods used by RELCODE were introduced in Chapter 4. Here we give
some additional details regarding these. For all models except model 2, we calculate the median
ranks of the failures, allowing for suspensions as described in Section 14.3. We then fit the model
to these data points in the ways detailed here.

14.7.1 Model 1. Weibull 2 Parameter fitted by Linear Regression

A two parameter Weibull distribution is fitted by transforming the data to Weibull probability scales
and fitting a straight line by linear regression. This is the equivalent of the manual probability
paper method.

This method was regarded as adequate before the advent of more advanced computer based
techniques. It suffers from the problem that the Weibull probability paper transformation is highly
non-linear. This means that the distance of a point from the fitted line represents different
probabilities, depending on where you are on the paper. A point one centimetre from a line near
the bottom left of the paper has a probability error of about 0.1%, whereas a point one centimetre
from a line near the centre of the paper has a probability error of about 10%. This distortion can
cause the fitted distribution to be statistically rejected, even though parameter values can be found
(by other methods) which give a good fit.

14.7.2 Model 2. Weibull 2 Parameter fitted by Maximum Likelihood

The maximum likelihood method is based on the concept that if a failure occurs at age t, then the
likelihood of this event, for a given underlying distribution model, is given by the value of the
probability density function for that distribution at age t, f(t). Formulas for fitting the two
parameter Weibull distribution by maximum likelihood are given by Nelson (1981), pages 340-341.

Maximum likelihood is a technique which is regarded as statistically superior to the linear
regression method of Model 1. However, it can have difficulties. For example, consider fitting a
3 parameter Weibull distribution to given data. A 3 parameter Weibull with β < 1 has an infinite
value of the pdf at age γ, so if γ coincides with the first failure age we have an arbitrarily large
value of the likelihood. More generally, the method is susceptible to being influenced by outliers.
This is a disadvantage with engineering data which is often not well behaved.
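
As a sketch of what a maximum likelihood fit involves, the code below solves the usual textbook likelihood equations for the two parameter Weibull with failures and suspensions. We have not checked these line by line against Nelson's pages, and RELCODE's own routine is not described in this manual, so treat the function name, the search bracket for BETA and the sample data as assumptions.

```python
import math

def weibull_mle(failures, suspensions=()):
    """Two parameter Weibull fit by maximum likelihood; returns (beta, eta)."""
    scale = max(list(failures) + list(suspensions))
    fails = [t / scale for t in failures]            # rescale so powers cannot overflow
    everything = fails + [t / scale for t in suspensions]
    r = len(fails)
    mean_log_fail = sum(math.log(t) for t in fails) / r

    def score(beta):
        # The maximum likelihood estimate of beta is the zero of this function.
        s0 = sum(t ** beta for t in everything)
        s1 = sum(t ** beta * math.log(t) for t in everything)
        return s1 / s0 - 1.0 / beta - mean_log_fail

    lo, hi = 0.01, 50.0                               # assumed bracket for beta
    for _ in range(200):                              # bisection; score rises with beta
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)
    eta = scale * (sum(t ** beta for t in everything) / r) ** (1.0 / beta)
    return beta, eta

# Hypothetical data: four failures and one item still running at 5000 hours.
beta, eta = weibull_mle([1500.0, 2600.0, 3400.0, 4700.0], suspensions=[5000.0])
print(f"beta = {beta:.2f}, eta = {eta:.0f}")
```

Note that the median ranks play no part here, which is why Model 2 is the exception mentioned at the start of this section.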

14.7.3 Model Accuracy

Recognising the distorting effect of probability paper, Ang and Hastings (1994) used the
probability error as a basis for fitting distributions, and introduced the term Model Accuracy to
represent the value of a coefficient of conformance based on probability error. The probability
error is the difference between the reliability level of an observed data point ri and corresponding
model value, vi of the reliability at the same age. The root mean square probability error (RMSPE)
is given by:

$$\mathrm{RMSPE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (r_i - v_i)^2} \qquad (14.9)$$

where n is the number of failures.

The Model Accuracy (MA%), expressed as a percentage, is then given by


MA% = 100 x (1 - RMSPE)     (14.10)

If all points lie on the line, model accuracy is 100%.

Ang and Hastings (1994) used the mean absolute probability error rather than the root mean square
probability error. The change to the root mean square probability error has been made to reflect
the fact that errors are likely to be normally distributed.
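
Equations 14.9 and 14.10 translate directly into code. The sketch below assumes a Weibull model and no suspensions, so that the observed reliability of failure i is one minus its Benard median rank; with suspensions the adjusted ranks of Section 14.3 would be needed instead. The function names are ours.

```python
import math

def weibull_reliability(t, beta, eta, gamma=0.0):
    """Reliability R(t) of a Weibull model; R = 1 for ages up to gamma."""
    if t <= gamma:
        return 1.0
    return math.exp(-(((t - gamma) / eta) ** beta))

def model_accuracy(failure_ages, beta, eta, gamma=0.0):
    """Return (RMSPE, MA%) following equations 14.9 and 14.10, assuming no suspensions."""
    ages = sorted(failure_ages)
    n = len(ages)
    # Observed reliability r_i = 1 - Benard median rank of failure i.
    observed = [1.0 - (i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    modelled = [weibull_reliability(t, beta, eta, gamma) for t in ages]
    rmspe = math.sqrt(sum((r - v) ** 2 for r, v in zip(observed, modelled)) / n)
    return rmspe, 100.0 * (1.0 - rmspe)

# Hypothetical failure ages and trial parameter values.
rmspe, ma = model_accuracy([1200, 2100, 2900, 3600, 4400, 5300], beta=2.0, eta=3800.0)
print(f"RMSPE = {rmspe:.4f}, MA% = {ma:.2f}")
```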

14.7.4 Goodness of Fit

To decide whether a distribution model is statistically valid we carry out a test of goodness of fit.
For the case where all items fail, the Kolmogorov-Smirnov (KS) type of test is applicable. A
version specifically suited to the Weibull distribution is described by D’Agostino and Stephens

However, the presence of suspended items invalidates KS type tests. If there are quite a few
suspended items occurring between any pair of failures, the KS test may reject a model which is, in
fact, quite accurate. For this reason, Ang and Hastings (1994) developed the Model Accuracy
Test, which is based on model accuracy statistics, and which is applicable with or without
suspended items.

The “Goodness of Fit Test” is a typical statistical test. We have:

1. A hypothesis (Ho) that the data fits the model.

2. An observed value of a test statistic, for example, the Model Accuracy statistic given by
equations 14.9, 14.10.

3. A confidence level at which we wish to test the hypothesis.

4. A critical value for the test statistic at the given confidence level.

5. We reject the hypothesis (Ho) at a given confidence level if the observed value of the test
statistic is less than the critical value.

14.7.5 Goodness of Fit Test - Examples

Acceptable Model. Figure 14.4 shows the goodness-of-fit test results for the Oscillating Axle
Bush data for the bi-Weibull model. Figure 14.4 gives critical values for the Model Accuracy test
at several confidence levels. The observed Model Accuracy value is 97.12%. This is greater than
the critical values for the Model Accuracy test at all confidence levels. Hence we conclude that the
hypothesis that the bi-Weibull distribution obtained by RELCODE fits this data is not rejected.

Rejected Model. Figure 14.9 shows the goodness of fit test for the 2 parameter Weibull model
fitted by linear regression, for the Axle Bush data. This corresponds to the Weibull Plot shown in
Figure 14.8. In this case the hypothesis that the distribution fits the data is rejected. RELCODE
gives us two indications that this is not a good model for this data, firstly by recommending the bi-
Weibull model in this case, secondly by rejecting the model in the goodness of fit test. If we
simply used the Weibull plot and did not apply a suitable goodness of fit test we might have
accepted the 2 parameter Weibull result and concluded that we had a beta value of 1.5 indicative of


very gradual wearout. In fact, the goodness of fit test rejects that model but accepts the bi-Weibull
model, as we have already seen.

Figure 14.9 Parameters and Goodness of Fit test for the 2 Parameter Weibull Model for the
Oscillating Axle Bush

14.7.6 Model 3. Weibull 2 Parameter fitted by Maximising the Model Accuracy

The distortion introduced by the Weibull paper, discussed in Section 14.7.1 - Model 1, leads to the
need for a fitting method which avoids these distortions. This is achieved by using a computer
search technique to find parameter values which minimise the root mean square probability error.
Equivalently, we are maximising the model accuracy as defined by equations 14.9, 14.10.

In Table 14.5 we see that Model 3 gives the results:

ETA = 4880 op hrs
BETA = 2.76
Accuracy = 90.83%     (14.12)

This suggests a much stronger wearout pattern than that found by Model 1. In the linear regression
method, greater weight was given to the early failures because of the scale distortion of the Weibull
paper. The accuracy of Model 3 is greater than Model 1, but still is not high enough to be
statistically acceptable.
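
The idea of searching for the parameter values which minimise the root mean square probability error can be illustrated as follows. RELCODE's actual search routine is not described in this manual; the repeated grid search, the search ranges and the data below are our assumptions, and suspensions are ignored.

```python
import math

def rmspe(ages, beta, eta):
    """Root mean square probability error of a 2 parameter Weibull (equation 14.9)."""
    n = len(ages)
    total = 0.0
    for i, t in enumerate(sorted(ages), start=1):
        observed = 1.0 - (i - 0.3) / (n + 0.4)      # Benard median rank, no suspensions
        modelled = math.exp(-((t / eta) ** beta))   # Weibull reliability at age t
        total += (observed - modelled) ** 2
    return math.sqrt(total / n)

def fit_by_max_accuracy(ages, steps=60, rounds=6):
    """Repeated grid search over (beta, eta), shrinking the grid around the best point."""
    b_lo, b_hi = 0.2, 10.0                          # assumed search range for beta
    e_lo, e_hi = 0.1 * min(ages), 10.0 * max(ages)  # assumed search range for eta
    best = None
    for _ in range(rounds):
        for i in range(steps + 1):
            beta = b_lo + (b_hi - b_lo) * i / steps
            for j in range(steps + 1):
                eta = e_lo + (e_hi - e_lo) * j / steps
                err = rmspe(ages, beta, eta)
                if best is None or err < best[0]:
                    best = (err, beta, eta)
        _, b, e = best
        b_span, e_span = (b_hi - b_lo) / steps, (e_hi - e_lo) / steps
        b_lo, b_hi = max(1e-3, b - 2 * b_span), b + 2 * b_span
        e_lo, e_hi = max(1e-9, e - 2 * e_span), e + 2 * e_span
    err, beta, eta = best
    return beta, eta, 100.0 * (1.0 - err)

# Hypothetical failure ages (operating hours).
beta, eta, ma = fit_by_max_accuracy([1200, 2100, 2900, 3600, 4400, 5300])
print(f"beta = {beta:.2f}, eta = {eta:.0f}, accuracy = {ma:.2f}%")
```

Here every point contributes according to its probability error, so early failures receive no extra weight from the paper scales.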


14.7.7 Model 4. Weibull 3 Parameter fitted by Linear Regression
Model 4 seeks to fit a 3 parameter Weibull distribution. This is done by first fitting a 2 parameter
Weibull (using linear regression). Then a search technique is used to try to find a GAMMA value
which will give a better fit. The criterion used for deciding whether a fit is “better” is the model
accuracy, but linear regression is used in fitting the distribution for any given GAMMA value.
This is analogous to using probability paper to carry out a 3 parameter Weibull fit. The GAMMA
value cannot exceed the first failure age, otherwise the linear regression method fails.

In this example, this model does not find a 3 parameter Weibull which improves on the 2 parameter
result of Model 1.

14.7.8 Model 5. Weibull 3 Parameter fitted by Maximum Model Accuracy

Model 5 is similar to Model 4, except that for any given GAMMA a search technique is used to
find the parameter values which maximise model accuracy. GAMMA is not restricted to values
below the smallest failure age. In this example no improvement is found on Model 3.

14.7.9 Model 6. Bi-Weibull Distribution fitted by Maximum Model Accuracy

RELCODE fits the bi-Weibull distribution by means of a computer search routine.

14.8 Confidence Limits for Reliability - Theory

The derivation of confidence limits for reliability is based on regarding the reliability data to any given
age as a trial in which some number of successes, x, occurs out of n items at risk. We then apply the
theory of Bernoulli trials and the Binomial distribution (or an equivalent Beta Distribution) to
determine confidence limits for the reliability.

Consider a situation where an item undergoes a trial and is either a success or a failure. Suppose that
success occurs with probability p and failure with probability 1-p. The trial is then known as a
Bernoulli trial. If n independent Bernoulli trials are carried out, each with the same probability of
success, p, then the probability of exactly x successes occurring is

$${}^{n}C_{x}\, p^{x} (1-p)^{n-x} \qquad (14.17)$$

This is the general term of the expansion of the binomial expression

$$(p + (1-p))^{n} \qquad (14.18)$$

The number of successes in n independent Bernoulli trials each with the same probability of success,
p, is a binomial random variable which we denote B:n,p.

A binomial random variable B:n,p has probability function f(x) given by

$$f(x) = \mathrm{Prob}(B = x) = {}^{n}C_{x}\, p^{x} (1-p)^{n-x} \qquad (14.19)$$

where x is an integer 0 ≤ x ≤ n.


The mean of the random variable is np and the variance is np(1-p).
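
A short numerical check of the binomial probability function, its mean and its variance is given below. The values of n and p are hypothetical.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n Bernoulli trials (equation 14.19)."""
    return comb(n, x) * p ** x * (1.0 - p) ** (n - x)

n, p = 10, 0.9   # hypothetical: 10 items, each with reliability 0.9
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * q for x, q in enumerate(probs))
variance = sum((x - mean) ** 2 * q for x, q in enumerate(probs))
print(round(sum(probs), 6), round(mean, 6), round(variance, 6))   # 1.0 9.0 0.9
```

The printed mean and variance agree with np = 9 and np(1-p) = 0.9.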

Estimation of Reliability

The reliability of an item in a situation corresponding to a Bernoulli trial is its probability of success.
Suppose we test n items and observe x successes and wish to make a statement about the reliability of
the items. We assume that the trials are Bernoulli with the same chance of success each time. The
sampling distribution of the number of successes is then a binomial distribution. The sample mean
x/n is an unbiased estimator of the probability of success, p,

$$\hat{p} = x/n \qquad (14.20)$$

Confidence Limits for the reliability

The lowest value of p, say, pL, for which there is probability (1 - α) of getting more than x successes
in n trials is the lower one sided confidence limit for p at the 100 α % level.

Prob (B:n, pL > x) = 1 - α 14.21

The highest value of p, say pU, for which there is probability α of getting more than x successes in n
trials is the upper 100 α % confidence limit for p, given by

Prob (B:n, pU > x) = α 14.22

For given x, n, α we can obtain pL and pU from equations 14.21 and 14.22. RELCODE gives these
confidence limits.

14.9 Example of Calculation of Confidence Limits

As an example of the calculation of the confidence limits, consider the case where N = 2.

The binomial terms for this case are, if p is the probability of success:

Number of Successes    Probability

0                      $(1 - p)^2$
1                      $2p(1 - p)$
2                      $p^2$

At the time of the first failure we have 1 success and so, for the 5% level, the lower confidence limit
is such that the probability of more than one success is 0.05 (that is, 1 - α = 0.05 in equation 14.21).

Lower Limit given by $p^2 = 0.05$

$p = 0.22361$

For the upper limit, the probability of more than one success is 0.95 (that is, α = 0.95 in equation 14.22).

Upper Limit given by $p^2 = 0.95$

$p = 0.97468$


These correspond to the values given by RELCODE. RELCODE rounds to two decimal places. The
RELCODE results are shown in Figure 14.10

At the time of the second failure we have zero successes so the lower limit is given by

$2p(1 - p) + p^2 = 0.05$, giving $p = 0.02532$

and the upper limit is given by

$2p(1 - p) + p^2 = 0.95$, giving $p = 0.77639$
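
For larger N the limits cannot be written down by hand, but the same conditions can be solved numerically. The sketch below finds p such that the probability of more than x successes equals a target value, using bisection, and reproduces the four values of this example. How RELCODE itself solves these equations is not stated here, and the function names are ours. The method requires x to be less than n.

```python
from math import comb

def prob_more_than(x, n, p):
    """Prob(B:n,p > x): probability of more than x successes in n trials."""
    return sum(comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(x + 1, n + 1))

def solve_limit(x, n, target, tol=1e-12):
    """Find p with Prob(B:n,p > x) = target by bisection (the probability rises with p)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_more_than(x, n, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for x in (1, 0):   # successes remaining after the first and second failure, n = 2
    lower = solve_limit(x, 2, 0.05)
    upper = solve_limit(x, 2, 0.95)
    print(x, round(lower, 5), round(upper, 5))
# prints: 1 0.22361 0.97468
#         0 0.02532 0.77639
```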

Figure 14.10 Confidence Limits Example

14.10 Conclusion
RELCODE provides a range of model fitting and testing options which goes well beyond the basic
Weibull plotting technique.

In particular, methods are available which:

• Fit the 2 and 3 parameter Weibull distribution by traditional and advanced methods
• Fit the bi-Weibull distribution, which extends the range of models into the group found in
Reliability Centered Maintenance studies, allowing combinations of burn-in, random and
wearout failures.
• Test the statistical goodness of fit of the fitted models and recommend a preferred model on the
basis of model quality.
• Make improved allowance for suspended items
• Calculate confidence limits for the MTBF
• Calculate confidence limits for the reliability by age
The above analytical techniques are of value as a basis for measuring reliability and in indicating
the failure rate pattern of items. The failure rate pattern is useful as a diagnostic tool and as an
indication of the appropriate maintenance policy.


15.1 Introduction

The background to planned replacement analysis for components is given in Chapter 6 for age
based replacement and in Chapter 7 for block replacement. In this chapter the underlying
mathematics for these models is described. This includes the “replace only on failure” strategy
which is considered first.

15.2 Replacement Only On Failure (ROOF)

The simplest form of replacement policy is to replace only on failure. That is, we carry out failure
replacements only, and do not do any preventive replacements.

If failure replacements only are carried out the average cost per unit time in the long run will be given by:

Average cost per unit time = (Cost of Failure Replacement) / (Mean Life of Components)     (15.1)

We introduce the following symbols

GROOF = Average cost per unit time for replacement only on failure
µ = Mean Life of components (with no preventive replacement)
Cf = Cost of Failure Replacement

Then equation 15.1 can be written as

GROOF = Cf/µ     (15.2)

15.3 Age-Based Preventive Replacement Policy

In an age-based preventive replacement policy, items are replaced under the following rules:

A "preventive replacement age" denoted tp, is set as part of the policy. If an item fails before age tp a
failure replacement is made.

If an item survives to age tp, a preventive replacement is made at age tp.

To implement this policy in practice we need to record when each replacement occurs so that the age
of every component is known. The saving from preventive replacement may depend on the
preventive replacement occurring at a convenient time, e.g. at the next routine service. At the time of
such a service, any component which had reached (or exceeded) its preventive replacement age would
be replaced.

Items which fail before the preventive replacement age still require failure replacement.

15. Planned Replacement Analytical Methods 15-1

The situation is illustrated in Figure 15-1 which shows a cumulative probability of failure function and
the Preventive Replacement Age, tp. The preventive replacement age is represented as 3000
kilometers. The probability of failure replacement is F(tp), which in Figure 15.1 is about 22%.

Figure 15-1 - Age-Based Preventive Replacement Policy. Cumulative Distribution Function



(Horizontal axis: Age, with the preventive replacement age tp marked.)

Figure 15-2 shows in a schematic form a typical sequence of events under an Age-Based Preventive
Replacement Policy. Starting with a new component, in Figure 15-2, this first component survives to
age tp, when it is replaced on a preventive basis. This is indicated by the symbol P in the figure.
The second component fails before age tp, so a Failure Replacement occurs, indicated by F. The
sequence of replacements continues, with preventive replacement occurring whenever a component
survives to age tp, and failure replacement occurring otherwise.

Figure 15-2 - Schematic Sequence of Events for an Age-Based Preventive Replacement Policy

P = Preventive Replacement
F = Failure Replacement
tp = Preventive Replacement Age

Sequence of Events
tp <tp tp tp <tp


15.4 Cost Minimization with Age-Based Preventive Replacement

The cheapest Age-Based Preventive Replacement Policy is the one which has the lowest long run cost
per unit time. This cost is derived by determining the average replacement cost per component and
the average life per component, and then dividing the average cost by the average life. The average
cost per component is given by:
Average cost per component = (Cost of Failure Replacement x Probability of Failure Replacement)
                           + (Cost of Preventive Replacement x Probability of Preventive Replacement)

Using the following symbols we can write this as an equation:

Cf = Cost of Failure Replacement

F(tp) = Probability of Failure Replacement
Cp = Cost of Preventive Replacement
1 - F(tp) = Probability of Preventive Replacement

Average cost per component = Cf F(tp) + Cp [1 - F(tp)]     (15.3)

15.4.1 Truncated Mean Life

The average life per component under an age based replacement policy is called the truncated mean
life (TML). It is calculated by taking into account the facts that some components (a proportion 1 -
F(tp)) will remain in use until age tp, but some will fail at various ages between 0 and tp. This latter
group are items which are replaced on failure.

The truncated mean life is given by:

Truncated Mean Life = (Preventive Replacement Age x Probability of Preventive Replacement)
                    + (Mean age at failure replacement for items which fail, weighted by the probability of failure)

In symbols, and denoting the failure probability density function by f(t), the truncated mean life is given by:

$$\text{Truncated Mean Life} = t_p [1 - F(t_p)] + \int_0^{t_p} t\, f(t)\, dt \qquad (15.4)$$


Using the integration by parts formula, equation 15.4 can also be expressed as shown in equation
15.5, and it is this version of the equation which is used in RELCODE.

$$\text{Truncated Mean Life} = \int_0^{t_p} [1 - F(t)]\, dt \qquad (15.5)$$
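
The integration by parts step can be written out as follows, using f(t) = dF/dt and F(0) = 0. This is only a worked restatement of how equation 15.4 leads to equation 15.5.

```latex
\begin{align*}
\int_0^{t_p} t\, f(t)\, dt
  &= \bigl[\, t\, F(t) \,\bigr]_0^{t_p} - \int_0^{t_p} F(t)\, dt
   = t_p F(t_p) - \int_0^{t_p} F(t)\, dt \\
t_p\,[1 - F(t_p)] + \int_0^{t_p} t\, f(t)\, dt
  &= t_p - \int_0^{t_p} F(t)\, dt
   = \int_0^{t_p} [1 - F(t)]\, dt
\end{align*}
```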


The average cost per unit time is given by

Average cost per unit time = (Average cost per component) / (Truncated mean life of component)     (15.6)


The average cost per unit time for an age-based preventive replacement policy will be denoted G. For
any given preventive replacement age tp the formula for G is derived using equations 15.3 and 15.5 in
equation 15.6 to give equation 15.7

$$G = \frac{C_f F(t_p) + C_p [1 - F(t_p)]}{\int_0^{t_p} [1 - F(t)]\, dt} \qquad (15.7)$$

The minimum cost policy is found by evaluating G for a range of values of tp, and choosing the value
which gives a minimum.

The cost G will vary with the preventive replacement age in the way shown in Figure 15.3. The cost
per unit time, G, will have a minimum value G*A which will occur at the optimal preventive
replacement age t*p. RELCODE will find t*p, searching in increments of width equivalent to the
horizontal space occupied by one character on a screen 80 characters wide.

The asymptotic value of G as tp increases is GROOF, given by equation 15.2. If the cost of failure
replacement is only slightly greater than the cost of preventive replacement, or the wearout effect is
only slight (BETA only slightly greater than one) then the minimum in Figure 15.3 will be very
shallow. In such a case, for all practical purposes the optimal policy is to replace only on failure, and
RELCODE will indicate this.
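
The search for the minimum cost age can be sketched as below, assuming a two parameter Weibull life distribution and a simple trapezoidal rule for the integral in equation 15.5. The 80 point grid loosely mirrors the 80 character screen width mentioned above, but the code, the function names and the cost and parameter values are illustrative assumptions, not RELCODE's implementation.

```python
import math

def weibull_cdf(t, beta, eta):
    """Cumulative probability of failure F(t) for a 2 parameter Weibull."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def truncated_mean_life(tp, beta, eta, steps=2000):
    """Equation 15.5: integral of [1 - F(t)] from 0 to tp, by the trapezoidal rule."""
    h = tp / steps
    total = 0.5 * (1.0 + (1.0 - weibull_cdf(tp, beta, eta)))
    total += sum(1.0 - weibull_cdf(k * h, beta, eta) for k in range(1, steps))
    return total * h

def cost_rate(tp, cf, cp, beta, eta):
    """Equation 15.7: average cost per unit time, G, for preventive replacement age tp."""
    f_tp = weibull_cdf(tp, beta, eta)
    return (cf * f_tp + cp * (1.0 - f_tp)) / truncated_mean_life(tp, beta, eta)

def best_replacement_age(cf, cp, beta, eta, t_max, points=80):
    """Evaluate G on an evenly spaced grid of ages and return (tp, G) at the minimum."""
    grid = [t_max * k / points for k in range(1, points + 1)]
    return min(((tp, cost_rate(tp, cf, cp, beta, eta)) for tp in grid),
               key=lambda pair: pair[1])

# Hypothetical inputs: BETA = 3, ETA = 1000 hours, Cf = 1000, Cp = 100.
tp_star, g_star = best_replacement_age(1000.0, 100.0, 3.0, 1000.0, t_max=3000.0)
g_roof = 1000.0 / (1000.0 * math.gamma(1.0 + 1.0 / 3.0))   # equation 15.2, Cf / mean life
print(f"tp* = {tp_star:.0f}, G* = {g_star:.4f}, G_ROOF = {g_roof:.4f}")
```

Comparing G* with G_ROOF shows how much is saved relative to replacement only on failure; when the two are nearly equal the minimum is shallow and preventive replacement is hardly worthwhile.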

Figure 15-3 Variation of Cost with Preventive Replacement Age for Age Based Preventive Replacement Policy

15.5 Spare Parts for Age Based Policy


The average number of replacements which will occur over a given total component utilization is
given by the total component utilization divided by the average life per component:

Number of Replacements = (Total Component Utilization) / (Truncated Mean Life of Component)     (15.8)

The truncated mean life of a component for a given age based preventive replacement policy is given
by equation 15.5. RELCODE uses equation 15.5 in conjunction with equation 15.8 to determine the
number of spare or replacement parts which will (on average) be required.

The average proportion of failure replacements will be F(tp) and this is used to estimate the number of
failure replacements and preventive replacements for a given replacement policy.
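
A minimal sketch of the spare parts estimate follows, again assuming a two parameter Weibull model and numerical integration for the truncated mean life. The fleet size, utilization and parameter values are hypothetical, and the function names are ours.

```python
import math

def weibull_cdf(t, beta, eta):
    """Cumulative probability of failure F(t) for a 2 parameter Weibull."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def truncated_mean_life(tp, beta, eta, steps=2000):
    """Equation 15.5 evaluated by the trapezoidal rule."""
    h = tp / steps
    inner = sum(1.0 - weibull_cdf(k * h, beta, eta) for k in range(1, steps))
    return h * (0.5 * (1.0 + 1.0 - weibull_cdf(tp, beta, eta)) + inner)

def spares_estimate(total_utilization, tp, beta, eta):
    """Return (all replacements, failure replacements, preventive replacements)."""
    total = total_utilization / truncated_mean_life(tp, beta, eta)   # equation 15.8
    failures = total * weibull_cdf(tp, beta, eta)                    # proportion F(tp)
    return total, failures, total - failures

# Hypothetical fleet: 6 items x 50,000 km per year, replaced preventively at 30,000 km,
# with an assumed Weibull model (BETA = 2.5, ETA = 60,000 km).
total, failures, preventive = spares_estimate(6 * 50000.0, 30000.0, 2.5, 60000.0)
print(f"{total:.1f} replacements per year, of which {failures:.1f} are failures")
```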

15.6 Block Preventive Replacement Policy

In a block preventive replacement policy, all components are replaced simultaneously, in a block, at
fixed intervals of time (or operating life). Items which fail in between the block replacement times are
replaced when they fail, these being failure replacements. At the time of block replacement, all items
are replaced including those which have been subject to failure replacement. We refer to the time
between block replacement as the block replacement interval.
The block replacements are preventive replacements. The cost of a block preventive replacement may
differ from that of an age based preventive replacement. Usually a block replacement will be cheaper
(per component replaced) because there are economies of scale in doing many replacements at the
same time.

The interval between block replacements may be expressed in terms of calendar time, or in terms of
operating hours where this is more appropriate. In the latter case all components would be assumed to
operate concurrently. An example of this type of policy is light bulbs in a street, where a block policy
would involve replacing all the light bulbs in a single pass at certain intervals of time. In addition,
individual light bulbs which failed in between the block replacement intervals would be replaced on failure.

Figure 15-5 shows a typical time interval in a block replacement policy in schematic form. There are
10 lamps in a street and initially all the light bulbs are new. As time goes by individual light bulbs fail
and are replaced. These are failure replacements. In Figure 15-6 the first bulb to fail is in Lamp
Number 4. Later failures occur in other lamps, and the bulb in Lamp 4 in fact fails again before the
block replacement time interval, xp, is reached. At time xp, all the light bulbs currently in use are
replaced, regardless of age.

We then have a situation identical to the starting position in that all the light bulbs are new. Thus,
subsequent time intervals will see, in a statistical sense, a repeat of the original pattern, although, of
course, the timing and number of individual failures will depend on chance.

In the block replacement policy, the aim is to choose the value of the interval xp which minimises the average cost per unit time.

Figure 15-13 - Block Preventive Replacement Policy in Schematic Form : Light Bulbs in Street Lamps.


F = Failure Replacements
Xp= Block Replacement Interval

Lamp Number
1 _____________________________________

2 __________________________F__________

3 _____________________________________

4 _____F__________________________F____

5 _____________________________________

6 _____________________________________

7 _____________________________________

8 ___________________________F_________

9 ________________F____________________

10 _____________________________________
0 Time Axis xp

To determine the average replacement cost per unit time we need an expression R(x) defined as the
mean number of failure replacements per component in time x. The average cost per unit time is then
g(xp), given by

$$g(x_p) = \frac{C_f R(x_p) + C_p}{x_p} \qquad (15.12)$$

For the case where no preventive replacements are made the cost is given by

g(∞) = Cf/µ     (15.13)

where µ is the mean life of a component.

For a discrete life distribution model in which fi is the probability that a currently new component will
fail in age interval i the renewal function Ri can be derived as follows. Let ri be the mean number of
renewals per component in the ith time interval. Then ri is given by

$$r_1 = f_1$$
$$r_2 = f_2 + r_1 f_1$$
$$r_n = f_n + \sum_{i=1}^{n-1} r_i f_{n-i}, \quad n = 2, 3, 4, \ldots \qquad (15.14)$$

Rn is given by
R1 = r1
Rn = Rn-1 + rn, n = 2, 3, 4, ... 15.15
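
The recursion can be sketched in code as follows. We take the sum in equation 15.14 to run over r_i f_(n-i), which agrees with the r2 line above, and we take equation 15.12 as (Cf R(xp) + Cp) / xp with xp measured in whole age intervals. Both readings, and the function names, are our assumptions.

```python
def renewal_function(f, n_max):
    """Return [R_1, ..., R_n_max] from interval failure probabilities f = [f_1, f_2, ...]."""
    def f_at(i):                       # f_i, taken as 0 beyond the supplied list
        return f[i - 1] if i <= len(f) else 0.0
    r = []                             # r[n - 1] holds r_n
    for n in range(1, n_max + 1):
        r_n = f_at(n) + sum(r[i - 1] * f_at(n - i) for i in range(1, n))
        r.append(r_n)
    cumulative, total = [], 0.0
    for r_n in r:
        total += r_n
        cumulative.append(total)       # equation 15.15
    return cumulative

def block_cost_rate(f, cf, cp, n):
    """Cost per age interval when the block interval xp equals n age intervals."""
    return (cf * renewal_function(f, n)[-1] + cp) / n

# Hypothetical discrete life distribution: half fail in interval 1, half in interval 2.
print(renewal_function([0.5, 0.5], 3))   # [0.5, 1.25, 1.875]
```

Evaluating block_cost_rate for n = 1, 2, 3, ... and taking the smallest value gives the block interval with the lowest cost per unit time.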


16.1 No Failure Data
Sometimes we wish to use the analysis provided by RELCODE but do not have actual failure data available.
We may, however, be able to make a judgement regarding the reliability of our components. In this chapter
we shall present two methods which enable us to make use of RELCODE under these circumstances. In the
first method we use a RELCODE screen which is specifically designed to assist with the "No Failure Data"
case. In the second method we create three artificial data points which provide an estimated distribution

16.2 No Failure Data Screen

We consider first the use of the "No Failure Data" screen. To analyse an item for which we have no failure
data we first go to the Item Header screen and add the item to the data base as described in Chapter 3, Section
3.5. As an example we have added an item called Hydraulic Seal for which the age unit is Months. On the
Item Header screen there is a command button on the right hand side labelled "No Failure Data". Click this
button to go to the No Failure Data screen shown in Figure 16.1.

To use the No Failure Data screen, you enter estimated information about the item. The entries are as follows:

1. You select a failure pattern from a choice of Random Failures, Gradual Wearout or Steep Wearout.

2. You enter an estimated value for the mean life of the item.

3. Optionally you can enter an age of onset of failures, which is applicable in cases where you consider that
there would be an initial period in which there would be a negligible probability of failures occurring.
The default value of the age of onset of failures is zero.

Figure 16.1 No Failure Data Screen

16. Analysis Without Failure Data 16-1

Once you have made these entries you can close the screen and proceed with the RELCODE analyses for
preventive replacement and inspection interval for the item. Analyses which require failure data (e.g.
Distribution Fitting) will not be available.
In the example we have selected a Steep Wearout failure pattern, a Mean Life of 120 months and an age of
onset of failures of 72 months.

The user does not need to be concerned with Weibull analysis in this case, but as a matter of information,
RELCODE will use the entries at this screen to establish an equivalent Weibull distribution model. The
Random Failure pattern converts to a Beta value (shape parameter) of 1, Gradual Wearout to a Beta value of 2
and Steep Wearout to a Beta value of 3.5. The age of onset of failures provides a Gamma value (location
parameter). The Eta value (characteristic life) is calculated by RELCODE from the Mean Life and the other
parameter values.
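
The manual does not give the formula RELCODE uses for Eta. A natural reading, sketched below as an assumption, is that the mean of a three parameter Weibull is Gamma + Eta x Γ(1 + 1/Beta), where Γ is the gamma function, so that Eta follows from the entered Mean Life. The dictionary and function names are hypothetical.

```python
import math

# Beta values stated in the text for each failure pattern.
SHAPE_BY_PATTERN = {"Random Failures": 1.0, "Gradual Wearout": 2.0, "Steep Wearout": 3.5}

def equivalent_weibull(pattern, mean_life, onset_age=0.0):
    """Return (beta, eta, gamma), assuming mean = gamma + eta * Gamma(1 + 1/beta)."""
    beta = SHAPE_BY_PATTERN[pattern]
    gamma = onset_age
    eta = (mean_life - gamma) / math.gamma(1.0 + 1.0 / beta)
    return beta, eta, gamma

# The Hydraulic Seal example: Steep Wearout, Mean Life 120 months, onset at 72 months.
beta, eta, gamma = equivalent_weibull("Steep Wearout", 120.0, 72.0)
print(f"beta = {beta}, eta = {eta:.1f} months, gamma = {gamma} months")
```

The mean life must exceed the age of onset of failures for this calculation to give a positive Eta.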

16.3 The Three Point Estimate Method

A second approach to the situation where we have no failures involves making a “three point estimate”.
Specifically we estimate the ages, t80, t50, t20 to which the component has 80%, 50% and 20% reliability.

We then enter these ages as though they were failures, and this enables RELCODE to fit a distribution. If we also
have data regarding the cost of failure replacement and the cost of preventive replacement we can then proceed to
solve the replacement problem.

As an example, consider the following. A diaphragm valve in a slurry pump is subject to failure. No data
records are available, but the three point estimate shown in Figure 16.2 has been made. These estimates are as
follows. Firstly we estimate the age at which we consider that 80% of the diaphragms will still be surviving. In
this case we estimate this as 60 months (5 years). Thus, we expect that 20% of the diaphragms will last less than
5 years, and 80% will last longer than 5 years.

Secondly, we estimate the age by which we expect that 50% will have failed. In this case we estimate this at 72
months (6 years). Thus we expect that half the diaphragms in these pumps will last for less than 6 years and that
half of them will last for longer than 6 years. Thirdly, we estimate the age by which we expect that 80% will
have failed. In this case we estimate this age as 90 months (7.5 years). Thus we expect that 80% will have failed
before they are 7.5 years old, whilst 20% will last for longer than 7.5 years. We also estimate the cost of failure
replacement as $1000 and the cost of preventive replacement as $200. These estimates are shown in Figure 16.2,
along with the number of diaphragms in the population (16) and the average annual utilization per component,
which in this case is 12 months, since the pumps are all used all the time.

Figure 16-2 - Slurry Pump Diaphragm

Three Point Estimate and Cost Data

Reliability    Estimated Age (Months)
80%            60
50%            72
20%            90

Cost of Failure Replacement        $1000
Cost of Preventive Replacement     $200
Number of Diaphragms installed     16
Annual utilization (Months)        12


16.4 Entering the Three Point Estimate into RELCODE
To use the Three Point Estimate method we add the item to the database using the Item Header Screen and then
enter the three data points as failures at the Event Data Screen. Figure 16.3 shows the Event Data Screen for the
example. We then analyse the data in the usual way.

Figure 16-3 - Entering Data for the Three Point Estimate

16.5 Results
We shall not show all the screens for the example, but only the main results. Figure 16.4 shows the reliability
plot obtained when RELCODE uses the three point estimate data from Figure 16.3. The three point estimate
corresponds to the three points shown on the graph, which we see are at (or very close to) the 80%, 50% and
20% reliability levels. Thus RELCODE has determined a Weibull distribution which corresponds closely to our
reliability estimates. In this case the result is a three parameter Weibull.

We can then proceed to carry out a replacement policy analysis in the usual way. The age based replacement
policy cost graph is shown in Figure 16.5. The optimal solution is to replace the diaphragm at approximately
50 months. We can continue with other analyses in the usual way.

The three point estimate method has allowed us to estimate the distribution function of the components, and we
can then proceed with any of the RELCODE analyses using that distribution. If we subsequently get real data
for this component we can replace the three point estimate data by the real data and repeat the analysis. If we
get more information, but not actual data, we can change our three point estimate if we wish. It is advisable to
note, say in the Item Memo field on the Item Header Screen, when we are using an estimate and not real data.

Figure 16-4 - Reliability Plot for the Slurry Pump Diaphragm.


Figure 16.5- Graph of Cost versus Preventive Replacement Age

16.6 Theory of the Three Point Estimate Method.

The three point estimate method relies on the fact that if we have three failures, then on a reliability plot,
these will be plotted at approximately the 80%, 50% and 20% reliability levels.


The cumulative probability estimator used in RELCODE is Benard’s formula, given in Chapter 13, equation
13.2. For failure i out of a total of N failures the estimator is:

Cumulative probability estimator, P = (i - 0.3)/(N + 0.4)

The corresponding reliability values R, expressed as a percentage, are:

R% = 100 * (1 - P)

Thus if N = 3 we get the following values

i R%

1 79.4
2 50.0
3 20.6

Within plotting accuracy, we see that the points plotted will therefore be at approximately the 80%, 50% and
20% reliability levels. Thus, by estimating the ages corresponding to these reliability levels we provide data
enabling RELCODE to fit a distribution which suits our estimates. It is possible that the data may give a poor
fit to the best Weibull model found, in which case the method is inappropriate and should not be relied on.
RELCODE will give an indication of this in the Goodness of Fit Test.
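
The values in the table above can be checked with a few lines of code. The function name is ours; the formula is Benard's formula as quoted above.

```python
def benard_reliability_percent(n):
    """Plotted reliability R% for each of n failures, using Benard's formula."""
    return [100.0 * (1.0 - (i - 0.3) / (n + 0.4)) for i in range(1, n + 1)]

print([round(r, 1) for r in benard_reliability_percent(3)])   # [79.4, 50.0, 20.6]
```

The same function shows why three points are convenient: with N = 3 the plotted levels fall close to round figures that are easy to estimate.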

16.7 Conclusion

The methods given in this chapter let us analyse situations even though no data is available. We can do this either
by using the "No Failure Data" screen, or by using the Three Point Estimate method. In either case, we estimate
a suitable life distribution, and then use RELCODE in the usual way to derive preventive replacement ages and
other results.


17. Exercises

Some Reliability and Replacement Problems to be solved using RELCODE

Exercise 1 - Bearing

Heavy duty bearings in a steel forging plant have failed after the following numbers of weeks of

Age at Failure (Weeks)

Also, the bearing which is currently in the forge has run for 24 weeks without failing.

1. Use RELCODE to select a Weibull life distribution model, estimate the parameters and the
mean life.
2. The cost of Preventive Replacement is $100 and the cost of Failure Replacement is $1000.
Determine the optimal replacement policy and the corresponding cost per week.
3. In practice, the forge has a major service every four weeks. Preventive replacement of the
bearing can be carried out as part of this maintenance activity.
a. At what age (a multiple of four weeks) should the bearing be replaced, to minimise costs.
b. If there is a safety argument for keeping the number of in service failures as low as
possible within reasonable costs, what should the replacement policy be?
Support your conclusions by giving the costs for some alternative policies.
4. There are four similar forging plants and each works for 50 weeks per year. Estimate the
number of replacement parts required per year if the policy is preventive replacement at age
8 weeks. How many failure replacements will occur per year (steady state average) under
this policy?

Exercise 2 - Hydraulic Systems

Records from two heavy duty dump trucks show that seal failures in the hydraulic systems occurred at
the following odometer readings (kilometers, from new)

Truck 1 Truck 2
51220 45380
68060 103510

At present, the odometer readings are

Truck 1 Truck 2
105680 132720

1. Prepare reliability data in a form suitable for analysis by RELCODE.

17. Exercises 17-1

2. For the RELCODE preferred life distribution model, determine the model parameters and
the mean life.

3. For the RELCODE preferred model and for the two parameter Weibull model, examine the
Reliability Function and the Weibull Probability Paper plots, and the Goodness of Fit Test
results. Which model do you consider to be the most appropriate and why?

4. What type of failure pattern(s) is/are indicated (EARLY LIFE, RANDOM, WEAROUT)?

5. The Preventive Replacement Cost is $100 and the Failure Replacement Cost is $1000.
Determine the optimal preventive replacement age, and the cost under this policy, and the
saving of this policy when compared with a policy of replacement only on failure.

6. Preventive replacement can only be carried out at odometer readings which are multiples of
5,000 kms. Select an appropriate preventive replacement age. What is the cost
($/kilometer) for this policy? How does this compare with the cost for the optimal policy?

7. If the company has a fleet of 6 similar dump trucks, each of which averages 50,000
kilometers per year, estimate the number of seal replacements which will be needed per
year, under an appropriate replacement policy.

8. If 6 dump trucks average 50,000 kilometers per year, estimate the average number of in-
service seal failures which will occur per year, given that the policy is to replace seals on a
preventive basis at 30,000 kilometers.

Exercise 3 - Sugar Centrifuge Cloth

The cloth filter on a sugar centrifuge is currently replaced on a preventive basis if a suitable
opportunity occurs and the cloth has been in use for at least 20 hours. The cloth is also replaced on
failure. The following data are available for the most recent six cloth replacements.
Cloth   Age in Hours   Comment

1       20             Did not fail
2       7              Failed
3       3              Failed
4       12             Failed
5       20             Did not fail
6       9              Failed

1. Use RELCODE to analyse the failures and estimate the following parameters:

Shape Parameter BETA
Scale Parameter ETA
Mean Life

2. Is the current replacement policy correct? What policy do you recommend?

3. The company has three centrifuges which each run an average of 400 hours per month. Estimate
the number of replacement cloths required per month under the existing and recommended
replacement policies.


Exercise 4 - Bus Engines
A metropolitan transport company operates a large fleet of similar buses. The following table shows
the operating life in kilometers at which engine failures necessitating replacement have occurred, and
also shows the number of engines currently running which have completed the corresponding numbers
of kilometers without failure. (Note: these data have been grouped from actual failure and suspension
ages. The kilometer values shown are the midpoints of the kilometer ranges 0 - 50,000,
50,001 - 100,000 and 100,001 - 150,000. In practice we could work with the individual failure and
suspension ages, but grouping reduces the amount of data entry. The professional version of
RELCODE handles up to 1008 data records, so grouping is rarely necessary; this example uses
grouped data for purposes of illustration.)

Kilometers Failures Survivors

25,000 2 35
75,000 8 27
125,000 33 122

1. Use RELCODE to determine a suitable life distribution model and estimate the parameters of the
distribution and the resulting model accuracy. Use this distribution to answer the remaining parts
of this question.

2. Estimate the 90% reliable life.

3. What proportion of engines would you expect to fail by 200,000 km?

Exercise 5 - Insulator

A new type of insulator for high voltage electric power lines is being trialled.
An analysis of six insulators operating in an accelerated test environment, designed to simulate
coastline conditions, shows that the insulators fail to operate to specification, or survive, after the
following numbers of months.

Insulator Months Condition

1 28 Failed
2 15 Failed
3 30 Surviving
4 12 Surviving
5 26 Failed
6 21 Failed

The design called for 95% confidence of 90% reliability over an 18 month period in this trial. Do the
data indicate that this criterion has been met?

Exercise 6 - Maintainability

An aircraft maintenance check takes the following times in minutes to complete on six successive


49, 59, 43, 65, 55, 58

Find a suitable distribution to fit this data and estimate the maintainability, given a maintenance time
constraint of 60 minutes. (Note. The maintainability is the probability that the maintenance is
completed within the maintenance time constraint.)



Solutions to Exercises

Exercise 1 - Bearing

1. A three parameter Weibull model is selected, with GAMMA = 6.08 weeks, BETA = 1.17,
ETA = 13.28 weeks, Mean Life = 18.66 weeks.

2. Preventive replacement at 6.08 weeks, cost $16.45 per week.

3. Age (weeks)    Cost ($/week)    % Failure Replacements

   4              25.00            0
   8              23.93            10

a. Replace at 8 weeks
b. Replace at 4 weeks. The preferred model indicates zero in-service failures in this case,
but it is possible in practice that some failures could occur, as the model is statistical.

4. 23.01 replacement parts per year, including 2.44 failure replacements per year.

Exercise 2 - Hydraulic Systems

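1. Ages (km) at failure are 51220, 16840, 45380, 58130. Suspensions occur at 37620, 29210.
These are obtained, for each truck, by subtracting each odometer reading from the next higher
reading. The first failure age is the reading itself, and each suspension age is the current
reading minus the reading at the last failure.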
2. The RELCODE preferred model is a Weibull 2 parameter distribution obtained by
maximising model accuracy. The parameters are:
BETA = 5.92, ETA = 53,507, Mean Life = 49,603 kilometers.
3. The Weibull 2 parameter model fitted by linear regression fails the goodness of fit test at the
80% confidence level. The graphs show that the max. model accuracy method gets a good
fit to the later points. Possibly the data would be better modelled by a bi-Weibull
distribution, but there is insufficient data for a bi-Weibull analysis.
4. Wearout. The failure at 16,840 kilometers was possibly an early life failure.
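5. Optimal preventive replacement age 28,243 km. Cost 0.0043 $/km. Saving 0.0159 $/km
compared with replacement only on failure.
6. Preventive replacement at 30,000 km. Cost 0.0043 $/km, the same as the optimal policy to
four decimal places.
7. Utilization = 6 x 50,000 = 300,000 km/yr.
For preventive replacement at 30,000 km, spares = 10 per year.
8. 0.32 in-service failures per year.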
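
The figures in this solution can be cross-checked outside RELCODE. The sketch below is an independent illustration in Python, not RELCODE's own algorithm. It rebuilds the failure and suspension ages from the odometer readings, then applies the standard age-replacement cost-rate formula (expected cost per cycle divided by expected cycle length) to the two-parameter Weibull quoted in part 2. The function names and the trapezoidal integration step count are assumptions made for this example.

```python
import math

# Odometer readings (km) at each seal failure, and the current readings.
failure_readings = {"Truck 1": [51220, 68060], "Truck 2": [45380, 103510]}
current_reading = {"Truck 1": 105680, "Truck 2": 132720}

failures, suspensions = [], []
for truck, readings in failure_readings.items():
    previous = 0
    for reading in readings:
        failures.append(reading - previous)      # seal age at failure
        previous = reading
    # The seal now in service has not failed: its age is a suspension.
    suspensions.append(current_reading[truck] - previous)

BETA, ETA = 5.92, 53507.0        # fitted parameters quoted in part 2
COST_PREVENTIVE, COST_FAILURE = 100.0, 1000.0


def reliability(t):
    """Two-parameter Weibull reliability R(t)."""
    return math.exp(-((t / ETA) ** BETA))


def mean_cycle_length(age, steps=2000):
    """Expected km between replacements: integral of R(t) over 0..age."""
    h = age / steps
    inner = sum(reliability(i * h) for i in range(1, steps))
    return h * (0.5 * (reliability(0.0) + reliability(age)) + inner)


def cost_rate(age):
    """Expected $/km when replacing at `age` or on earlier failure."""
    r = reliability(age)
    cycle_cost = COST_PREVENTIVE * r + COST_FAILURE * (1.0 - r)
    return cycle_cost / mean_cycle_length(age)


mean_life = ETA * math.gamma(1.0 + 1.0 / BETA)
failure_only_rate = COST_FAILURE / mean_life      # replace only on failure
saving = failure_only_rate - cost_rate(28243)

# Part 6: cheapest age among multiples of 5,000 km.
best_multiple = min(range(5000, 100001, 5000), key=cost_rate)

# Parts 7 and 8: fleet of 6 trucks at 50,000 km per year each.
fleet_km_per_year = 6 * 50000
spares_per_year = fleet_km_per_year / mean_cycle_length(30000)
failures_per_year = spares_per_year * (1.0 - reliability(30000))

print(failures, suspensions)
print(round(mean_life), round(cost_rate(28243), 4), round(saving, 4))
print(best_multiple, round(spares_per_year, 1), round(failures_per_year, 2))
```

Because the parameters are quoted to limited precision, small differences from the RELCODE output in the last digit are to be expected.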

Exercise 3 - Sugar Centrifuge Cloth

1. BETA = 1.03, ETA = 16.04 hrs., Mean Life = 15.85 hrs.
2. No. Replace Only on Failure. More cautiously, extend preventive replacement age to, say,
30 hours, and check if any wearout.
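3. 104 cloths per month under the existing policy, 76 under the recommended policy
(replacement only on failure).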
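
Part 3 can be checked by hand from the fitted parameters. The sketch below is an independent Python illustration, not RELCODE output. It assumes that under the existing policy every cloth is replaced at 20 hours if it has not already failed (ignoring any wait for a suitable opportunity), so that monthly usage is the total running hours divided by the mean time between replacements. The function names are illustrative.

```python
import math

BETA, ETA = 1.03, 16.04          # fitted Weibull parameters from part 1 (hours)
HOURS_PER_MONTH = 3 * 400        # three centrifuges, 400 hours each


def reliability(t):
    """Two-parameter Weibull reliability R(t)."""
    return math.exp(-((t / ETA) ** BETA))


def mean_time_between_replacements(age, steps=4000):
    """Integral of R(t) over 0..age by the trapezoidal rule."""
    h = age / steps
    inner = sum(reliability(i * h) for i in range(1, steps))
    return h * (0.5 * (reliability(0.0) + reliability(age)) + inner)


mean_life = ETA * math.gamma(1.0 + 1.0 / BETA)

# Existing policy: replace at 20 hours or on earlier failure.
cloths_existing = HOURS_PER_MONTH / mean_time_between_replacements(20.0)
# Recommended policy: replace only on failure.
cloths_failure_only = HOURS_PER_MONTH / mean_life

print(round(mean_life, 2), round(cloths_existing), round(cloths_failure_only))
```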

Exercise 4 - Bus Engines

1. GAMMA = 8,123 km, ETA = 176,314 km,
BETA = 3.05, Model Accuracy = 99.76%
Three parameter Weibull, fitted by maximum model accuracy.


2. 92,400 km. This is obtained from the Reliability Function graph, or from the Reliability
column in the Life Distribution Function Tabulations, under the Replacement Analysis Menu
(by interpolation).

3. 72.58%. From the Reliability column in the Life Distribution Function Tabulations, under the
Replacement Analysis Menu (interpolating), taking the proportion failed as 1 minus the reliability.
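
As an alternative to interpolating in the tabulations, both answers can be computed directly from the fitted parameters. The sketch below assumes the usual three-parameter Weibull form, R(t) = exp(-((t - GAMMA)/ETA)^BETA) for t greater than GAMMA. The function names are illustrative and are not part of RELCODE.

```python
import math

GAMMA, ETA, BETA = 8123.0, 176314.0, 3.05   # km, km, dimensionless


def reliability(t):
    """Three-parameter Weibull reliability; no failures occur before GAMMA."""
    if t <= GAMMA:
        return 1.0
    return math.exp(-(((t - GAMMA) / ETA) ** BETA))


def reliable_life(r):
    """Age at which reliability has fallen to r (inverse of reliability)."""
    return GAMMA + ETA * (-math.log(r)) ** (1.0 / BETA)


life_90 = reliable_life(0.90)
failed_by_200k = 1.0 - reliability(200000.0)

print(f"90% reliable life: {life_90:,.0f} km")
print(f"Proportion failed by 200,000 km: {failed_by_200k:.2%}")
```

The closed-form values agree with the interpolated answers above to within the precision of the quoted parameters.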

Exercise 5 - Insulator

No. Use: Analysis Menu, Confidence Limits for Reliability, Graph. Interpolating between the lower
blue bars would indicate a lower 95% confidence limit of about 50% at 18 months. This is below the
required level of 90%, so the criterion has not been met.

Exercise 6 - Maintainability

Enter the maintenance times as “failures”. Fitted distribution is Weibull 2 parameter with BETA =
7.54, ETA = 58.24.

The maintainability with a maintenance time constraint of 60 minutes is given by the probability that
maintenance is completed in less than or equal to 60 minutes. This is obtained from the Cumulative
Distribution Function graph, or by subtracting from 1 the value in the Reliability column in the Life
Distribution Function Tabulations, under the Replacement Analysis Menu. To reach this menu in the
full system you will need to enter values for the replacement costs. If replacement analysis is not
being used, enter any arbitrary costs, e.g. 1000, 100. The Reliability value in the “Life Distribution
Function Tabulations” Table at 60 minutes is 0.2859. The maintainability is therefore:

M = 1 - 0.2859 = 0.7141 = 71.41%
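
The same value can be obtained from the fitted parameters without the tabulation. This is a minimal Python sketch assuming the two-parameter Weibull form R(t) = exp(-(t/ETA)^BETA); the variable names are illustrative.

```python
import math

BETA, ETA = 7.54, 58.24      # fitted Weibull parameters (ETA in minutes)
TIME_LIMIT = 60.0            # maintenance time constraint, minutes

# With completion times entered as "failures", the reliability at the time
# limit is the probability that the check is still unfinished at 60 minutes.
reliability_at_limit = math.exp(-((TIME_LIMIT / ETA) ** BETA))
maintainability = 1.0 - reliability_at_limit

print(f"R(60) = {reliability_at_limit:.4f}, M = {maintainability:.4f}")
```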


