Upkeep of ALARP Demonstration and Management of Change: Key Messages
Key messages
A demonstration of ALARP is not a one-off exercise but continues throughout the lifecycle.
The management system should specify accountabilities, roles and responsibilities and
competence for those undertaking this activity.
It is recommended that contingencies are set out for Reasonably Foreseeable outages
and changes of status, to avoid system shutdowns or product recalls.
9.2 Creeping change monitoring and HAZID (CCHAZID)

Creeping Change is the accumulation of small changes which often go unnoticed, but which can ultimately add up to a significant change. Because, by their nature, they are gradual, unseen, and not planned, creeping changes can be difficult to monitor. The status of any product or installation may change with time, whether due to wear and tear, corrosion, UV deterioration, changes to management, people, training schemes and numerous other causes. These changes should be identified and managed, whether by audit, review, inspection and/or the monitoring of leading indicators. These will need to be defined at the engineering stage and implemented as part of the Lifecycle Criteria (Section 7).

A significant number of accidents happen due to these types of changes, as can be seen from these examples:

Example 9.1
Some major accidents due to creeping change
UK: Aberfan, Nimrod, Herald of Free Enterprise, Marchioness, Windscale, Kings Cross.
Worldwide:

The CCHAZID (14) is a variant of the standard HAZID technique (Section 4), with a similar structure and process but different guidewords, to help identify new or increased risks occurring over time.

The primary guidewords in this case could relate to:
• A whole site, location, or organisation.
• A defined product.
• Activities, modules, or systems/functions.
• Items of equipment.
• RRMs.
• Safety Critical equipment.

The secondary guidewords could relate to:
• Ageing (including degradation and obsolescence).
• New knowledge, technologies, standards, or legislation.
• Data acquisition, e.g., trends in leading indicators.
• Change of use, additional uses, process changes.
• Hazardous materials and environmental changes.
• Equipment or infrastructure changes (e.g., electrical, mechanical, instrumentation, structural and process).
• Proximity changes (equipment, activities).
• Management/ownership changes.
• Workforce change, loss of skills, changes to training, revised procedures.
• Operational Risk Assessments (ORAs) and Management of Changes (MoCs).
• Workforce, organisational and culture changes.
• Results from audits and reviews.
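Purely as an illustration (the CCHAZID method itself is defined in reference (14)), primary and secondary guidewords can be paired to generate review prompts. The sketch below uses abbreviated, hypothetical subsets of the lists above:

```python
# Illustrative sketch only: pairing CCHAZID primary and secondary guidewords
# to generate review prompts. The guideword subsets below are abbreviated
# examples; the CCHAZID reference (14) defines the actual method.
from itertools import product

PRIMARY = ["Whole site", "Defined product", "Items of equipment", "Safety Critical equipment"]
SECONDARY = ["Ageing", "New knowledge or standards", "Change of use", "Proximity changes"]

def cchazid_prompts(primary, secondary):
    """Yield one review prompt per (primary, secondary) guideword pair."""
    for p, s in product(primary, secondary):
        yield f"Considering '{p}': has any creeping change arisen from '{s}'?"

for prompt in cchazid_prompts(PRIMARY, SECONDARY):
    print(prompt)
```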
This may be included in the demonstration, or safety report, as guidance for owners, operators, or users on how to respond. This could take the form of a contingency matrix, as shown in Matrix 4. Green would show that an activity, operation, or use of a piece of equipment would not be affected by the failure, or the risk increase would be negligible. Amber would indicate that temporary measures would be necessary, as indicated by the rule reference in the cell. A red cell would prohibit the use of the equipment item, activity, or operation under the specified set of conditions.
Matrix 4: Example contingency matrix (one row per SCE, one column per condition; cells coded Red, Green, or Amber with a rule reference, e.g. Rule C).
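A minimal sketch of how such a contingency matrix could be held and queried in software is shown below; the SCE names, conditions and rule references are hypothetical placeholders, not taken from Matrix 4.

```python
# Minimal sketch of a Red/Amber/Green contingency matrix lookup.
# SCE names, conditions and rule references are hypothetical placeholders.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Cell:
    status: str                 # "GREEN", "AMBER" or "RED"
    rule: Optional[str] = None  # rule reference for temporary measures (AMBER only)

# Rows are Safety Critical Elements (SCEs); columns are failure/outage conditions.
MATRIX = {
    ("SCE #2", "Fire pump A unavailable"): Cell("GREEN"),
    ("SCE #2", "Gas detection degraded"):  Cell("RED"),
    ("SCE #4", "Gas detection degraded"):  Cell("AMBER", rule="Rule C"),
}

def respond(sce: str, condition: str) -> str:
    cell = MATRIX.get((sce, condition), Cell("RED"))  # default to most restrictive
    if cell.status == "GREEN":
        return "Continue: operation unaffected or risk increase negligible."
    if cell.status == "AMBER":
        return f"Continue only with temporary measures per {cell.rule}."
    return "Prohibited under these conditions."

print(respond("SCE #4", "Gas detection degraded"))
```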
The common pitfalls in risk management often relate to assumptions about predicting risk and UK law, as follows:

Assumption 1
Risks need to be tolerable
There is no legal requirement in the UK for risks to be tolerable, nor is there any relaxation of obligations if the risks are Broadly Acceptable. The requirement to reduce risks to ALARP is not linked to the level of risk. A lot of regulator and industry guidance implies that tolerability is a key criterion, if not a legal requirement, thereby shifting the emphasis from qualitative to quantitative analysis. This has led to the development of unsound predictive methodologies, which might create the perception that it is a legal requirement.

Assumption 2
Risks need to be quantified
Attributing a numerical value to a risk cannot demonstrate that it is ALARP. Quantification may be appropriate for the purposes of justifying Risk Transfer or demonstrating Gross Disproportion, provided it is based on Robust Statistics, avoids the errors in Appendices A & B, and complies with the RSS Guidance (3).

Assumption 3
Risks need to be ranked or profiled
As for Assumption 2, the ranking of risks does not satisfy any UK legislation, as all risks must be reduced to ALARP. However, it is necessary for the analysis to be Proportionate (Sections 3.1 & 3.2), which can only be based on tangible, and preferably quantifiable, criteria.

Assumption 4
Gross Disproportion requires CBA
Gross Disproportion may be demonstrated qualitatively, and this may be the only effective means of doing so, unless there are Robust Statistics on which to base the CBA (Appendix A7).

Assumption 5
Numerical risk estimates can be meaningful
Expert witnesses, consultants and professionals have made errors of many orders of magnitude when judging risk (Appendices A & B). The error potential is almost unlimited, as illustrated in many major accidents where unquantifiable or unexpected variables have turned out to be the most critical factors (Appendices A & B). Risk cannot be sense checked, unlike many scientific or engineering problems where errors may be obvious within a single order of magnitude.

Assumption 6
Numerical risk comparisons are more valid than absolute risk estimates
Errors in risk calculations equally affect the comparisons between two risks if the error is common to both. However, if the errors in each option are different, it is possible that the comparison error is larger than either of the absolute ones, as illustrated in the sketch following Assumption 9.

Assumption 7
The best RRM can be selected using quantitative methods
As for Assumption 6, unless Robust Statistical data is available for each RRM, assumptions will need to be made to determine their risks. This almost inevitably creates an invalid comparison, and the exercise becomes one of comparing assumptions, which may well be hidden. It is therefore better to make that logic transparent in a rational argument for each case.

Assumption 8
Sensitivity checks can validate QRA models
A sensitivity check changes one or more variables in the model or its input data to observe changes in the model's output. If this is to be used to validate the model, the assumption is that the user will have reference points to compare it with, but the models are inherently hypothetical, so this would not be possible. It cannot reveal errors in the input data and is only useful as a QA method to identify obvious coding errors in the software, such as output decreasing when the input increases.

Assumption 9
RRMs/barriers are independent
Many accidents occur because RRMs that were assumed to be independent had common mode failures that were not foreseen (Sections 3.1, 6.10, 6.11 and Appendices A & B5).
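To illustrate the point made under Assumption 6, the following sketch uses purely hypothetical risk figures (not drawn from any referenced study) to show how option-specific errors can distort a comparison more than either absolute estimate:

```python
# Illustrative numbers only: how differing errors distort a risk comparison.
true_a, true_b = 1e-4, 2e-4          # hypothetical "true" annual risks; B is twice A

# Case 1: a common error (both overestimated 10x) cancels in the comparison.
est_a, est_b = true_a * 10, true_b * 10
print(est_b / est_a)                  # 2.0 -> the ratio is still correct

# Case 2: option-specific errors (A overestimated 10x, B underestimated 10x).
est_a, est_b = true_a * 10, true_b / 10
print(est_b / est_a)                  # 0.02 -> the comparison is wrong by a factor
                                      # of 100, larger than either 10x absolute error
```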
1. Baker, J. The Report of the BP U.S. Refineries Independent Safety Review Panel. 2007.
2. Health and Safety Executive. Reducing Risks, Protecting People (R2P2). 2001. ISBN 0 7176 2151 0.
3. Royal Statistical Society. Practitioner Guides Nos. 1 to 4: Guidance for Judges, Lawyers, Forensic Scientists
and Expert Witnesses. s.l. : Royal Statistical Society, 2009 to 2014.
4. Perrow, C. Normal Accidents: Living with High Risk Technologies. 1984. ISBN 9780691004129.
5. Health and Safety Executive. HSG65, Managing for Health and Safety, 3rd Edition. [Online] 2013.
https://www.hse.gov.uk/pubns/books/HSG65.htm.
6. The Chartered Institute of Ergonomics & Human Factors (CIEHF). Human Factors in Highly Automated
Systems. s.l. : CIEHF. White Paper.
7. Leveson, N. Engineering a Safer World: Systems Thinking Applied to Safety. 2011. ISBN: 9780471846802.
8. Leveson, N., Thomas, J. STPA Handbook. [Online] 2018.
https://psas.scripts.mit.edu/home/get_file.php?name=STPA_handbook.pdf.
9. Kurt Lauridsen, Igor Kozine, Frank Markert, Aniello Amendola, Michalis Christou, Monica Fiori. Assessment
of Uncertainties in Risk Analysis of Chemical Establishments. Roskilde : Risø National Laboratory, 2002.
Risø-R-1344(EN).
10. Reason, J. Managing the Risks of Organizational Accidents. Aldershot, UK : Ashgate, 1997.
11. Health and Safety Executive. A Review of Layers of Protection Analysis (LOPA) analyses of overfill of fuel
storage tanks, RR716. [Online] 2009. https://www.hse.gov.uk/research/rrhtm/rr716.htm.
12. Tinsley C. H., Dillon R. L., Madsen P. M. How to Avoid Catastrophe. [Online] 2011. https://hbr.org/2011/04/
how-to-avoid-catastrophe.
13. The Assurance Working Group. The GSN Community Standard . [Online] 2018. https://scsc.uk/r141B:1?t=1.
14. Richard J. Goff and Justin Holroyd, UK Health and Safety Laboratory. Development of a Creeping Change
HAZID Methodology. IChemE. [Online] 2017. https://www.icheme.org/media/11897/paper-61.pdf.
15. Health and Safety Executive. Good Practice and pitfalls in risk assessment, RR151. 2003.
16. —. HSG238, Out of control: Why control systems go wrong and how to prevent failure. 2003.
17. Confidential Enquiry into Sudden Death in Infancy (CESDI). Sudden Unexpected Deaths in Infancy. s.l. : BMJ.
18. Hill, Pr. R. Cot Death or Murder? - Weighing the Probabilities. s.l. : Salford University, 2002.
19. Derksen, T. The Fabrication of Facts: The Lure of the Incredible Coincidence. s.l. : Neuroreport, 2009.
20. Kahneman, Daniel. Thinking Fast and Slow. 2011.
21. Tetlock, P. E. and Gardner, D. Superforecasting: The Art and Science of Prediction. 2015.
22. Robson, D. The Intelligence Trap: Why Smart People Do Stupid Things and How to Make Wiser Decisions.
s.l. : Hodder & Stoughton, 2019.
23. Cox Jr., L.A. What’s Wrong with Risk Matrices? 2008.
24. Thomas, P., Bratvold, RB, Eric Bickel JR. The Risk of Using Risk Matrices. 2013.
25. Miller, K. Quantifying Risk and How It All Goes Wrong. s.l. : IChemE, 2018.
26. Health and Safety Executive. RR672 Offshore hydrocarbon releases 2001 to 2008. s.l. : HSE Books, 2008.
27. Aschwanden, C. You Can't Trust What You Read About Nutrition. FiveThirtyEight. [Online] 2016.
FiveThirtyEight.com.
28. Reason, Pr. James. The Human Contribution.
29. Thomas, J. 2020 MIT STAMP Workshop. 2020.
Key messages
Measured risk is a statistic, an average, which may not be indicative of the risks in a unique system and may not provide any indication of why the risk exists. Robust Statistics are only possible for mass-produced items tested under identical conditions for sufficient time.
Calculated risks must be based on logic or Robust Statistics that are modified in
accordance with Bayesian theory. This is rarely possible for engineered or sociotechnical
systems (Appendix B).
Predicted risk is an epistemic belief, a knowledge-related opinion, but accident history shows that the unknowns can be as important as the knowns, if not more so. It may be undermined by multiple cognitive biases, conflicts of interest, and errors due to poor understanding of probability theory and counter-intuitive mathematical relationships.
Predicted risk is not real; it is an opinion, generally unique to an individual, which may be subject to almost unlimited errors that can be extraordinarily difficult to identify, even by experts.
The risks associated with engineered and sociotechnical systems are nuanced with
potentially chaotic aspects. Predictions, based on experience, appearance, or comparison
with similar systems, may therefore be highly deceptive. ‘Expert judgement’ or ‘sound
engineering judgement’ can only relate to Foreseeability, not to risk quantification.
Accidents only happen because someone thinks the risk is low, so dismissing low
risks is illogical.
This appendix deals with the legal definition of risk and the feasibility of measuring, calculating, or predicting it for the purposes of demonstrating whether it has been reduced to ALARP.

A1 Legal definition of risk and risk assessment

The Management of Health and Safety at Work Regulations 1999 do not specifically define risk assessment, but they do say that 'every employer shall make a suitable and sufficient assessment of the risks' to all persons affected. A court case is normally conducted based on rational arguments, which must therefore be the preferred assessment approach, where feasible.
If a dataset can be divided into separate groups, it is not ergodic. An item of mass-produced equipment may fail due to reasons inherent in its design or the way it is operated. Failures due to design could be ergodic if they are random and the user has no control over them. However, failures due to application or the way the equipment was operated may vary greatly, so they are not. If 99 users operate it in one application and one uses it in a situation where it fails, then the data could give a 1% failure rate, which would be far too high for the first application and seriously underestimate the second. Non-ergodic data produces averages that greatly underestimate the effect of the most serious causes, because they are diluted across a large population for which they do not apply. Without detailed causal information it may be impossible to tell whether the data is ergodic and therefore a realistic representation of the relevant risk.

This limits Robust Statistics to subjects with a large amount of data, such as mass-produced items operating under identical conditions. Although major accidents of all types are frequent enough to be measured statistically on a global basis, this will be of little relevance to measuring the risks of a unique engineered system. It may be possible to measure failure rates for some components or RRMs, such as fire protection systems, but these could vary greatly for various reasons, such as design, application, operating conditions, natural or Hazardous environments, or maintenance. In practice, reliability data taken from public sources or manufacturers may not be collected for the same reasons as data from a controlled study and may therefore have limited value (Appendix B1).

Nevertheless, accident investigations rarely, if ever, state the causes to have been random failures of equipment items, because there are always deeper reasons or the cause was due to poor training, communications, ergonomics, design, management systems and so on. HSG238 (16) shows that most accidents involving some form of control error or loss were caused by poor design or specification, and none of these factors can be put down to probability. Measured risk therefore has little application for engineered systems.

A3 Risk calculation (Bayesian methods)

The Bayesian method converts background data (known as prior probabilities or base rates) into posterior probabilities reflective of a more specific condition, e.g., converting the cancer rates in the general population (the prior) to those for smokers only (the posterior). In this way it can convert a large dataset into sub-sets, but only if the conditional probabilities and populations of the subsets are known. Taking the car accident example above, the prior probability (the total accident numbers) can be granularized into contextual probabilities, such as car or road type, driver gender or age, as shown in this example:

Example A3.1
Bayesian calculation for the probability of an accident given a male driver

Bayes Formula:

Pr(Accident | Male) = Pr(Male | Accident) x Pr(Accident) / Pr(Male Driver)

Where Pr(Accident | Male) = the probability of an accident given that the driver is male.
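As an illustration of the formula in Example A3.1, the following sketch uses hypothetical input figures (they are not taken from the guidance or from real accident data) to compute the posterior probability:

```python
# Worked illustration of the Bayes formula in Example A3.1.
# The input probabilities below are hypothetical placeholders, not real accident data.
pr_accident = 0.01            # Pr(Accident): prior probability of an accident
pr_male = 0.60                # Pr(Male Driver): proportion of drivers who are male
pr_male_given_accident = 0.75 # Pr(Male | Accident): proportion of accidents involving male drivers

# Pr(Accident | Male) = Pr(Male | Accident) x Pr(Accident) / Pr(Male Driver)
pr_accident_given_male = pr_male_given_accident * pr_accident / pr_male
print(f"Pr(Accident | Male) = {pr_accident_given_male:.4f}")  # 0.0125
```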
A prediction is therefore an opinion, based on the knowledge held by the person making the judgement, so it should always include a caveat stating what that knowledge is. There can be no right or wrong answer. Phrases such as 'sound engineering judgement' or 'expert judgement' are often used in this context, but their meaning implies:
• that the expert has acquired all reasonably practicable knowledge about the subject before making such a judgement, and
• that such knowledge would be sufficient to make a meaningful and worthwhile risk prediction.

Neither of these may be true.

Example A4.2
Risk as an epistemic concept

What is the probability that a patient has disease X?
Doctor A does not know the patient, so she quotes the population statistics, which are 1/10,000.
Doctor B knows that the patient is male and in his 70s, so she quotes the statistics for that group, which are 1/1,000.
Doctor C has seen the patient's test results, which are positive, but they are only 90% reliable, so she states 9/10.

Who is correct?
Answers:
1) They are all correct, according to the knowledge they hold.
2) They are all incorrect, because none of them checked with the haematologist who has analysed the blood sample and confirmed that the patient has disease X.

NB. Doctor C made another mistake that is easily missed. Her intuition said 9/10, based on the test reliability of 90%, but the true probability is 1/1,111 because we would expect 10% of all people tested to be false positives (1,000) and 0.9 to be true positives. Risk evaluations are often counter-intuitive, so judgements can result in enormous errors and should never be trusted (see also Appendix B6 to show how large these errors can be).

In Example A4.2, the doctors were all given the same question, but their answers were based on statistics, not prediction, and nevertheless varied by four orders of magnitude. Predictions regarding engineered and sociotechnical systems may have similar variations, as illustrated in Example A4.3.

In terms of probability, these systems are nuanced, with potentially chaotic aspects, where a small change to a minor detail can often dramatically change the risk picture. This is evident from most major accidents, which conform to the principle of multiple Causes and are typically dictated by several of the following characteristics:
• Complex interfaces and interdependencies between sub-systems, e.g., equipment and activities, such as mechanical, electronic, software, management systems, procedures, permits, competence, maintenance, operations etc.
• External influences, such as political, financial, conflicts of interest etc.
• Quality of leadership, cultures, policies, and communications.
• Design error.
• Human factors.
• Environmental factors (natural or man-made) and deterioration.

Any one of these factors can become dominant, or combinations of them could change the probability from negligible to highly likely, or even certain. This is too complex to model probabilistically or to make mental estimates of their likelihood. The Chernobyl disaster illustrates these principles quite well, as shown in Example A4.4.

Chernobyl is one example of these sensitivities, but similar conclusions could be drawn for virtually all major accidents. The level of detail required would be impracticable, making any meaningful predictions totally unrealistic.

This illustrates how the unknowns can become more important than the knowns and, even if they are known, it may be impossible to evaluate or quantify them in any meaningful way.

Engineers can sense check most engineering calculations within an order of magnitude (e.g., a pump or a bridge that is either ten times too large or small would be apparent to a competent engineer).
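By contrast, the figure quoted in the NB of Example A4.2 is easy to get wrong intuitively but can be reproduced with a short calculation. The sketch below assumes, as the example does, a 1/10,000 base rate; reading '90% reliable' as 90% sensitivity with a 10% false-positive rate is an assumption:

```python
# Reproduces the NB arithmetic in Example A4.2, assuming a 1/10,000 population
# base rate and a test read as 90% sensitivity with a 10% false-positive rate.
population = 10_000
base_rate = 1 / 10_000
sensitivity = 0.9
false_positive_rate = 0.1

true_positives = population * base_rate * sensitivity                 # 0.9 people
false_positives = population * (1 - base_rate) * false_positive_rate  # ~1,000 people

p_disease_given_positive = true_positives / (true_positives + false_positives)
print(p_disease_given_positive)        # ~0.0009
print(1 / p_disease_given_positive)    # ~1,112, i.e. roughly the 1/1,111 quoted in the example
```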