Introduction To Risk and Failures
D.H. Stamatis
Introduction to
Risk and Failures
Tools and Methodologies
D.H. Stamatis
This book contains information obtained from authentic and highly regarded sources. Reasonable
efforts have been made to publish reliable data and information, but the author and publisher cannot
assume responsibility for the validity of all materials or the consequences of their use. The authors and
publishers have attempted to trace the copyright holders of all material reproduced in this publication
and apologize to copyright holders if permission to publish in this form has not been obtained. If any
copyright material has not been acknowledged please write and let us know so we may rectify in any
future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information
storage or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, please access
www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222
Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that
provides licenses and registration for a variety of users. For organizations that have been granted a
photocopy license by the CCC, a separate system of payment has been arranged.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Jeanna
1. Risk
  General Definition
  Other Definitions
    Economic Risks
    Health Risks
    Health, Safety, and Environment (HSE) Risks
    Information Technology (IT) and Information Security Risks
    Insurance Risks
    Business and Management Risks
    Human Services Risks
    High Reliability Organizations (HROs)
    Security Risks
    Societal Risks
    Human Factors Risks
  Risk Assessment and Analysis
  Quantitative Analysis
  Fear as Intuitive Risk Assessment
  Audit Risk
  Other Considerations
  Risk versus Uncertainty
  Risk Attitude, Appetite, and Tolerance
  Risk as Vector Quantity
  Disaster Prevention and Mitigation
  Scenario Analysis
  Notes
  References
  Selected Bibliography
2. Approaches to Risk
  Zero Mind-Set
  ALARP
    U.K. Statistics: Work Accidents Involving Young People between 1996 and 2001
5. HAZOP Analysis
  Overview
  Definitions
  Process
    Minimum Requirements
    Defining Risk
    Trigger Events
    Use of Analysis
  HAZOP Process
    Definition
    Preparation
    Examination
    Documentation and Follow-Up
  Detailed Analysis
    Sequence of Examination
    Deviations from Design Intent
    Details of Study Procedure
  Effectiveness Factors
  Team
    Team Leader (Chairperson)
    Engineers
  Description of Process
    Relevant Guidewords
    Point of Reference Concept
    Screening for Causes of Deviations
    Consequences and Safeguards
    Deriving Recommendations (Closure)
    Conditions Conducive to Brainstorming
    Meeting Records
    Meeting Questions
    Follow-Up
    Computer HAZOP (CHAZOP)
      Advantages and Disadvantages
    Human Factors HAZOP
  Report
    Study Title Page
    Table of Contents
Acronym Meaning
ACOP Approved Code of Practice
ACTS Advisory Committee on Toxic Substances
ALARA As Low as Reasonably Achievable
ALARP As Low as Reasonably Practicable
CBA Cost Benefit Analysis
CD Consultative Document
CEN Comité Européen de Normalisation
CENELEC Comité Européen de Normalisation Electrotechnique
CLAW Control of Lead at Work Regulations
COSHH Control of Substances Hazardous to Health Regulations
CPF Cost of Preventing a Fatality
CSF Critical Safety Function
EC European Communities
E/E/PE Electrical, Electronic or Programmable Electronic
EU European Union
FMRI Final Mishap Risk Index
HSC Health and Safety Commission
HSE Health and Safety Executive
the HSW The Health and Safety at Work, etc. Act
ICRP International Commission on Radiological Protection
IEC International Electrotechnical Commission
IMRI Initial Mishap Risk Index
ISO International Organization for Standardization
MEL Maximum Exposure Limit
MHSWR Management of Health and Safety at Work Regulations
NOAEL No Observed Adverse Effect Level
OEL Occupational Exposure Limit
OES Occupational Exposure Standard
PHA Process Hazard Analysis
PHL Preliminary Hazard List
P&ID Process and Instrumentation Diagrams (Note: In the chemical industry, this acronym
sometimes means Pipe and Instrumentation Diagrams)
PPE Personal Protective Equipment
QRA Quantitative Risk Assessment
RBMK Reaktor Bolshoy Moshchnosti Kanalnyy (high-power channel-type reactor)
SFAIRP So Far As Is Reasonably Practicable
SSR System Safety Requirement
TLM Top Level Mishap
TOR Tolerability of Risk
VPF Value for Preventing a Fatality
WATCH Working Group on the Assessment of Toxic Chemicals
Preface
It has been said many times by many individuals that risk is everywhere. We can never avoid it. It is present in whatever we do. Obviously, we must try to understand the risks we face and minimize them if possible. This book is in fact an extension of my first edition published in 1995 and a second in 2003 on failure mode and effects analysis (FMEA) in which I discussed the benefits of prevention based on an up-front analysis of failures.
As time passed, I noticed that, whereas FMEA is a powerful tool to forecast failures of designs and processes, a missing link involving safety issues, catastrophic events, and their consequences had to be covered. The second edition briefly mentioned HAZOP analysis but did not expand on the methodology. In this book, I focus on risk and HAZOP as they relate to major catastrophic events and safety issues. Specifically, I address processes and implementation and explain the fundamentals of using risk methodology in any organization to evaluate major safety and/or catastrophic problems. A classical and typical view of risk is shown in Figure P.1.
The significance of Figure P.1 is that the risk is emphasized and indeed becomes more serious as both individual and societal risks become evident. In fact, the hidden and untold significance is that implicitly the figure also represents a level of uncertainty as shown in Figure P.2. Both risk and uncertainty in the final analysis may be viewed and analyzed from the following five perspectives (Callaghan and Walker 2001). In some cases, one factor may be predominant, but combinations of factors often must be identified and evaluated. The five perspectives are as follows:
Individual concerns: how individuals see the risk from a particular hazard affecting them, their families, and the things they value. While they may be prepared to engage voluntarily in activities that often involve high risks, as a rule they are far less tolerant of risks imposed on them and over which they have little control unless they see the risks as negligible. Moreover, while they may be willing to live with a risk that they do not regard as negligible that secures them or society certain benefits, they would want the risk levels low and clearly controlled.
Societal concerns: the risks or threats from hazards that impact society and, if realized, may produce adverse repercussions for the institutions responsible for putting in place the provisions and arrangements for protecting people through legislation. These concerns are often associated with hazards that give rise to risks that, if materialized, could provoke a socio-political response, for example, events causing widespread or large-scale consequences or multiple fatalities. Typical examples relate to nuclear power generation, transportation accidents, or the genetic modification of organisms. Societal concerns arising from multiple fatalities in a single event are known as societal risks. Societal risk is therefore a subset of societal concern.
FIGURE P.1
Typical view of risk, showing the tolerable region and the broadly acceptable region. (Source: www.HSE.gov.uk and public sector information published by the U.K. Health and Safety Executive and licensed under the Open Government Licence v1.0)
FIGURE P.2
Relationship between risk and uncertainty. Labels recoverable from the figure: likelihood increasingly uncertain; towards ignorance; emphasis on consequences, e.g., if serious/irreversible or need to address societal concerns. (Source: www.HSE.gov.uk and public sector information published by the U.K. Health and Safety Executive and licensed under the Open Government Licence v1.0)
Complexity in government regulations: regulations that affect intrastate, interstate, and international commerce. Throughout the long history of legislation introduced to eliminate or minimize risks, the first areas to be regulated have always been the most obvious, often requiring little scientific insight for identifying problems and possible solutions. For example, it was not difficult to realize that controlling airborne dust would reduce the risk of silicosis in miners and that making it mandatory to guard moving parts of machinery would prevent workers from being killed or maimed. In short, dramatic progress toward tackling such problems could be (and was) made without unduly taxing existing scientific knowledge or the state of available technology. However, as the most obvious risks have been tackled, new and less visible hazards have emerged and gained prominence. Typical examples include hazards arising from biotechnology and processes that emit gases that contribute to global warming.
Patterns of employment defined by changing demographics present some
challenges. The regulatory environment must cope with the increasing trend
of industries to outsource work (and the attendant risks), resulting in changes
in patterns of employment and in the fragmentation of large companies into
autonomous organizations working closely together. Dramatic increases in
self-employment and home working have been noted; small and medium-sized
firms are now major forces in creating jobs. Moreover, many monolithic
organizations have split into separate companies, for example, railways now
operate as separate companies responsible for operating the tracks, rolling
stock, and networks.
Polarization of approaches between large and small firms as a result
of the patterns of employment. Some of these changes have blurred legal
responsibilities for occupational health and safety, traditionally placed on
those who created the risks and were best situated to control them. In certain
industries, it has become difficult to determine who may be in that position.
While case law clarified some situations, the fact remains that in many
sectors it is very difficult to coordinate the adoption of measures to control risks.
Many more players are involved, and some have little access to expertise.
Chapter 1 of this book serves as an introduction to risk and provides several definitions relevant to a number of industries. A distinction is also made between risk and uncertainty. Chapter 2 discusses approaches to risk and the zero mind-set philosophy. In conjunction with the concept of zero mind-set, the ALARP principle for determining what risk is and what its effects are is also discussed. This chapter also addresses the major, serious, and minor categories of risks. Chapter 3 covers 18 risk methodologies dealing with analysis, failures, safety, and hazards.
Chapter 4 is about preliminary hazard analysis and explains how to
evaluate a hazard in the early stages of a design. Chapter 5 covers hazard
and operability (HAZOP) studies. It begins with an overview of HAZOP
and provides key definitions. It also provides a detailed discussion of the
study process, its effectiveness, and the team required to perform the study.
It concludes with a full description of the process and report preparation.
Chapter 6 focuses on fault tree analysis (FTA) and discusses the general
rules of construction and the need for a top-to-bottom approach for defining
failures and how they relate to HAZOP. Chapter 7 provides 14 additional risk
analysis methodologies for handling HAZOP.
Chapter 8 is titled “Teams and Team Mechanics” and provides a rationale for utilizing teams in performing HAZOP analyses. It also defines what is necessary for a team to be effective, qualifications of team members, consensus, team process checks, problem solving, and logistical issues concerning meetings.
Chapter 9 discusses job hazard analysis and OSHA regulations and how
they effect and affect risks in work environments. Chapter 10 is titled “Hazard
Communication Based on CFR 910.1200” and covers a typical automotive
hazard communication program. Specifically, it addresses the individuals
involved, their responsibilities, appropriate training, and the importance of
safe use instructions, chemical materials lists, and material safety data sheets.
Appendix A provides sample checklists for devising a safety plan and a
facility location plan and guidelines of the Australian Health Administration.
Appendix B details a HAZOP project.
Bibliography
Callaghan, B. and T. Walker. (2001). Reducing Risks: Protecting People: Decision-Making
Process. Norwich, U.K.: Crown Publications.
http://www.hse.gov.uk/risk/theory/r2p2.pdf
Acknowledgments
I thank the State of New South Wales, through the Department of Planning, for giving me permission to use Figures 1, 2, 3, and 4 and pages vi, 7, 25-31, and 33 of Hazardous Industry Planning Advisory Paper No. 8 (January 2008 and 2011). In this book, they appear as Figures I.1 and 5.3 and Appendix B. The HAZOP Guidelines are from www.planning.nsw.gov.au
I thank Elsevier Publishing for granting permission through the Copyright
Clearance Center for using material from Chapter 3 of Sutton’s 2010 book
titled Process Risk and Reliability Management.
Thanks also to the editors for a superb job on the layout and improvements
to the original manuscript. Your efforts made this book more readable and
certainly more functional to follow.
I thank all my clients and friends who provided me with insights many
times in the application of risk analysis in the area of hazards, including
safety and environmental issues.
Finally, the biggest thank you goes to my chief editor and critic—my wife,
Carla. She has been very supportive during the entire project, pulling me out
of lethargic moods and encouraging me to continue writing. Without her,
this book would never have been finished.
Author
Dean H. Stamatis, PhD, ASQC Fellow, CQE, CMfgE, MSSBB, ISO 9000
Lead Assessor (graduate), is the president of Contemporary Consultants
Co. in Southgate, Michigan. He is a specialist in management consulting,
organizational development, and quality science. He has taught project
management, operations management, logistics, mathematical modeling,
economics, management, and statistics at both graduate and undergraduate
levels at Central Michigan University, University of Michigan, ANHUI
University (Bengbu, China), University of Phoenix, and Florida Institute
of Technology.
With over 30 years of experience in management, quality training, and
consulting, Dr. Stamatis has served numerous private sector industries,
including steel, automotive, general manufacturing, tooling, electronics,
plastics, food, maritime, defense, pharmaceutical, chemical, printing,
healthcare, and medical device industries.
He has consulted for such companies as Ford Motor Co., Federal Mogul,
GKN, Siemens, Bosch, Sun Microsystems, Hewlett Packard, GM Hydromatic,
Motorola, IBM, Dell, Texas Instruments, Sandoz, Dawn Foods, Dow Corning
Wright, British Petroleum, Bronx North Central Hospital, Mill Print, St. Claire
Hospital, Tokheim, Jabill, Koyoto, SONY, ICM/Krebsoge, Progressive Insurance,
B. F. Goodrich, and ORMET, to name just a few.
Dr. Stamatis has created, presented, and implemented quality programs
with a focus on total quality management, statistical process control (both
normal and short run), design of experiments (both classical and Taguchi),
Six Sigma (DMAIC and DFSS), quality function deployment, failure mode
and effects analysis (FMEA), value engineering, supplier certification,
audits, reliability and maintainability, cost of quality, quality planning, ISO
9000, QS-9000, ISO/TS 16949, and TE 9000 series. He has created, presented,
and implemented programs on project management, strategic planning,
teams, self-directed teams, facilitation, leadership, benchmarking, and
customer service.
Dr. Stamatis is a certified quality engineer through the American Society
of Quality Control, a certified manufacturing engineer through the Society of
Manufacturing Engineers, a certified master black belt through IABLS, and
is a graduate of BSI’s ISO 9000 lead assessor training program.
He has written over 70 articles, presented many speeches, and participated in
national and international conferences on quality. He is a contributing author to
several books and the sole author of 42 books. His consulting extends across the
United States, South East Asia, Japan, China, India, Australia, Africa, and Europe.
In addition, he has performed more than 100 automotive-related audits, 25
pre-assessment ISO 9000 audits, and helped several companies attain certification,
Introduction
FIGURE I.1
Preliminary HAZOP. Labels recoverable from the figure: pre-approval, development application stage (preliminary hazard analysis); post-approval, design stage (hazard and operability study, marked as this guideline; final hazard analysis; fire safety study; emergency plan) and construction/commissioning stage (construction safety study). (Source: HIPAP 8, New South Wales, Australia. With permission.)
Four points related to these six stages are worth emphasizing. First, one may get the impression that each stage is distinct and independent. In reality, the boundaries between them are not clear-cut and often overlap. Generally, we collect information or perspectives while progressing from one stage to another. This process forces us to look back at and evaluate earlier stages; thereby, we continually improve the process in an iterative and dynamic mode.
Second, in all stages consensus is of primary concern and therefore designated team members must actively participate in discussions and the decision-making process. It is possible that consensus may not be reached, in which case multiple meetings must be held to allow the team to reach an amicable resolution.
FIGURE I.2
Selection of HAZOP process.
FIGURE I.3
Node selection. Equipment tags recoverable from the figure: relief valve PSV-101, vent, V-101, liquid product, LI-100, LRC-101, FRC-101, P-101A (steam), P-101B (electric), RM-12, T-100, liquid in. (Source: Ian Sutton, http://www.stb07.com/process-safety-management/hazop.html. With permission.)
Third, it is imperative to recognize that a team has no organized or standardized format to follow at any stage because the process under evaluation is not fixed at this point. It is dynamic and subject to change based on information gained formally and informally.
Finally, at every stage, the team must be careful to incorporate lessons learned from previous applications with a major caveat: be careful of past actions because regulations and/or circumstances may have imposed changes that make earlier actions inappropriate or inapplicable to the new process. Furthermore, some circumstances may require that actions are taken quickly due to emergencies or other unforeseen events. Obviously, this occurs because risk always involves uncertainty and any process under consideration may be more complex than anticipated because sufficient and applicable data are not available.
Uncertainty is a state of knowledge in which, although the factors influencing the issue are identified, the likelihood of any adverse effects, or the effects themselves, cannot be described precisely. Uncertainty has many manifestations, and they affect the approach to its handling. Key items of concern are as follows (Callaghan and Walker 2001):
After we have gathered all the necessary information, we are ready for the review intended to determine the most appropriate option for managing the risks. Success depends to a large extent on ensuring, as far as possible, that interested parties are content with the process for reaching decisions and, ideally, also with the decisions themselves. For example, they should be satisfied with (1) the way uncertainty has been addressed and the plausibility of the assumptions made; and (2) how other relevant factors such as economic, technological, and political considerations have been integrated in the decision-making process.
Meeting these conditions is not always easy, particularly when parties
have opposing opinions based on differences in fundamental values or
concentrate on a single issue. Nevertheless, we tackle the first condition by
• Take reasonable care of their own health and safety and the health and safety of other persons who may be affected by acts or omissions at work.
• Cooperate with their employers as necessary to enable employers to comply with their statutory health and safety responsibilities.
Finally, our process for ensuring that risks are properly managed would
not be complete without procedures to review our decisions after a suitable
interval to establish:
• Whether the actions taken to ensure that the risks are adequately controlled produced the intended result.
• Whether decisions previously reached need to be modified and, if so, how; for example, because levels of protection that were considered to be good practices may no longer be regarded as such as a result of new knowledge, advances in technology, or changes in the level of societal concerns.
• The appropriateness of the information gathered in the first two stages of the decision-making process to assist decisions for action, i.e., the methodologies used for the risk assessment and the cost–benefit analysis (if prepared) or assumptions made.
• Whether improved knowledge and data would have led to better decisions.
• What lessons could be learned to guide future regulatory decisions, improve the decision-making process, and create greater trust among regulators, operators, and those affected by or having an interest in the risk problem.
References
Callaghan, B. and T. Walker. (2001). Reducing Risks: Protecting People: Decision-Making
Process. Norwich, U.K.: Crown Publications.
http://www.hse.gov.uk/risk/theory/r2p2.pdf
HIPAP 8. (January 2011). HAZOP Guidelines. Sydney: State of New South Wales
Department of Planning.
Selected Bibliography
Cowan, N. (2005). Risk Analysis and Evaluation, 2nd ed., Kent, U.K.: Institute of
Financial Services.
1
Risk
General Definition
Risk is everywhere, in everything we do. In all areas of life, risk is something that we must manage, whether we direct a major organization or simply cross a road. When describing risk, however, it is convenient to consider that risk practitioners operate in specific practice areas.
In general terms, risk is the potential that a chosen action or activity, including the choice of inaction, will lead to a loss or undesirable outcome. This, of course, implies that a choice that will influence the outcome exists or existed. Potential losses may also be called risks. Almost any human endeavor carries some risk, but some activities are much riskier than others.
The ISO 31000 (2009)/ISO Guide 73:2002 definition of risk (see Note 1 at the end of this chapter) is the “effect of uncertainty on objectives.” In this definition, uncertainties include events (that may or may not happen) and uncertainties caused by ambiguity or a lack of information. The definition also covers both negative and positive impacts on objectives. Although many definitions of risk exist in common usage, the predominant one is the ISO definition, developed by an international committee representing over 30 countries and based on the inputs of several thousand subject matter experts (SMEs).
Although many of us use the word risk with different connotations, the
Oxford English Dictionary cites the earliest use of the word in English (spelled
as risque) in 1621 and the risk spelling from 1655. The dictionary defines risk
as (exposure to) the possibility of loss, injury, or other adverse or unwelcome
circumstance; a chance or situation involving such a possibility (Dodge 2003).
We have come a long way in our understanding of the modern meaning of risk. In fact, for the sociologist Luhmann (1996), risk is a neologism that appeared during the transition from traditional to modern society. Franklin (2001 p. 274) informs us that, “in the Middle Ages, the term risicum was used in highly specific contexts, above all sea trade and its ensuing legal problems of loss and damage.” In the common languages of the 16th century, rischio and riezgo were used. The term was introduced to continental Europe through interaction with Middle Eastern and North African Arab traders. In the English language, risk appeared only in the 17th century, and “seemed to be imported from continental Europe.” When the terminology of risk took ground, it replaced the older notion of “good and bad fortune.” Luhmann (1996), in trying to explain this transition, wrote, “Perhaps, this was simply a loss of plausibility of the old rhetorics of Fortuna as an allegorical figure of religious content and of prudentia as a (noble) virtue in the emerging commercial society.”
As the studies of risk proliferated in many sectors of the economy, scenario
analysis became a primary methodology for dealing with risk. Scenario anal-
ysis matured during Cold War confrontations between the United States
and the Soviet Union. It became widespread in insurance circles in the 1970s
when major oil tanker disasters forced a more comprehensive examination of
marine risk issues, a concern later reinforced by the 1989 Exxon Valdez incident
off the coast of Alaska. The scientific approach to risk entered finance in the 1960s with the
advent of the capital asset pricing model and became increasingly impor-
tant in the 1980s when financial derivatives proliferated. It reached general
business in the 1990s when the power of personal computing allowed for
widespread data collection and number crunching. Governments use sce-
nario analysis, for example, to set standards for environmental regulation.
The U.S. Environmental Protection Agency utilizes pathway analysis.
Other Definitions
The many inconsistent and ambiguous meanings attached to risk led to wide-
spread confusion and the development of various approaches to risk man-
agement in different fields (Hubbard 2009). For example, Jones (2005) sees
risk as relating to the probability of uncertain future events and describes
the probable frequency and probable magnitude of future loss. In computer
science, this definition is used by The Open Group (see Note 2).
OHSAS 18001:2007 (Occupational Health & Safety Advisory Services)
defines risk as the product of the probability of a hazard resulting in an
adverse event and the severity of the event. In information security, risk
is defined as “the potential that a given threat will exploit vulnerabilities
of an asset or group of assets and thereby cause harm” to an organization
(ISO/IEC 27005:2008).
Financial risk is often defined as the unexpected variability or volatility
of returns and thus includes worse-than-expected and better-than-expected
returns. References to negative risk should be understood as applying equally to
positive impacts or opportunities (that is, “loss” should be read as “loss or gain”)
unless the context precludes this interpretation. The related threat and hazard terms are often used
to mean an event that could cause harm.
Economic Risks
Economic risks can be manifested in lower returns or higher expenditures
than expected. The causes can be many, for instance, a hike in the prices of
raw materials, the lapsing of deadlines for construction of a new operating
facility, disruptions in a production process, emergence of a serious com-
petitor in a market, the loss of key personnel, a change of political regime, or
natural disaster (Galasyuk and Galasyuk 2007). Reference class forecasting
was developed to eliminate or reduce economic risk (Flyvbjerg 2008).
It is worth noting that from a societal standpoint losses carry much more
weight than gains: according to recent research, governmental bodies will do
whatever is required to avoid losing or falling to an inferior
position (Nichols 2000 p. 4).
Health Risks
Risks to personal health may be reduced by primary prevention actions that
decrease early causes of illness or by secondary prevention actions after
measured clinical signs or symptoms are recognized as risk factors. Tertiary
prevention reduces the negative impact of an already established disease by
restoring function and reducing disease-related complications.
Ethical medical practice requires careful discussion of risk factors with
individual patients to obtain informed consent for secondary and tertiary
prevention efforts. Public health efforts in primary prevention require edu-
cation of a population at risk. In both cases, careful communication about
risk factors, likely outcomes, and certainties must distinguish causal events
that must be decreased from associated events that may be merely conse-
quences rather than causes. In epidemiology, Rychetnik et al. (2004) reported
that the lifetime risk of an effect is the cumulative incidence, also called incidence
proportion, over an entire lifetime (see Note 3).
Insurance Risks
Insurance is a risk treatment option that involves risk sharing. It can be con-
sidered a form of contingent capital and is akin to purchasing an option at a
small premium to acquire protection from a potential large loss. Insurance
companies assume pools of risks, usually for profit, and may specialize in
areas such as market risk, credit risk, operational risk, interest rate risk, mor-
tality risk, and longevity risk (Carson et al. 2008).
Given the risk for an event, we can determine the total risk by calculating the
sum of the individual class risks. For example, in the nuclear industry,
consequence is often measured in terms of off-site radiological release, often
banded into five or six decade-wide bands.
The risks are evaluated using fault tree and event tree techniques (see
Chapters 6 and 7). Where these risks are low, they are normally considered
broadly acceptable. A higher level of risk (typically 10 to 100 times what is
considered broadly acceptable) must be justified against the costs of reduc-
ing the risk further and the possible benefits that make it tolerable. These
risks are described as tolerable if as low as reasonably practicable (ALARP). Risks
beyond this level are classified as intolerable.
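The three bands described above can be sketched as a simple classifier. This is only an illustration, not a regulatory method: the name `classify_risk`, the example limit of 1e-6 per year, and the default factor of 100 (the upper end of the "10 to 100 times" range quoted above) are assumptions chosen for the example.

```python
from enum import Enum


class RiskBand(Enum):
    BROADLY_ACCEPTABLE = "broadly acceptable"
    TOLERABLE_IF_ALARP = "tolerable if ALARP"
    INTOLERABLE = "intolerable"


def classify_risk(annual_risk, broadly_acceptable_limit, tolerability_factor=100.0):
    """Place a risk estimate into one of the three bands.

    broadly_acceptable_limit and tolerability_factor are hypothetical inputs;
    real limits are set by the relevant regulator.
    """
    if annual_risk < 0 or broadly_acceptable_limit <= 0 or tolerability_factor < 1:
        raise ValueError("risk must be >= 0, limit > 0, factor >= 1")
    if annual_risk <= broadly_acceptable_limit:
        return RiskBand.BROADLY_ACCEPTABLE
    if annual_risk <= broadly_acceptable_limit * tolerability_factor:
        # Tolerable only if further reduction is not reasonably practicable.
        return RiskBand.TOLERABLE_IF_ALARP
    return RiskBand.INTOLERABLE


# Illustrative limit of one in a million per year (an assumption).
for estimate in (5e-7, 5e-5, 1e-3):
    print(estimate, classify_risk(estimate, 1e-6).value)
```

A risk that lands in the middle band is not automatically accepted; the classification only signals that a cost and benefit justification is needed.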
The broadly acceptable level of risk has been considered by regulatory
bodies in various countries. The rationale for such acceptability was demonstrated
by F. R. Farmer, who showed that certain risks are acceptable to individuals
even though the activities considered present definable risks. He
illustrated this with hill walking and similar activities. The results of his
findings were presented as the now famous Farmer curve of acceptable prob-
ability of an event versus its consequence (Ayyub 2003 p. 101).
The technique as a whole is usually known as probabilistic risk assess-
ment (PRA) or probabilistic safety assessment (PSA). The WASH-1400 report,
also known as The Reactor Safety Study, is an example of this practice. The
report was produced in 1975 for the Nuclear Regulatory Commission by a
committee of specialists. However, due to heavy criticism that raised many
questions regarding its assumptions, methodology, calculations, peer review
procedures, and objectivity, the report was declared obsolete and replaced
by the State-of-the-Art Reactor Consequence Analyses.
Security Risks
Security risk management involves protection of assets from harm caused by
deliberate acts. A more detailed definition by Talbot and Jakeman (2009) is:
“A security risk is any event that could result in the compromise of organiza-
tional assets, the unauthorized use, loss, damage, disclosure or modification
of organizational assets for the profit, personal interest or political interests
of individuals, groups or other entities constitutes a compromise of the asset,
and includes the risk of harm to people. Compromise of organizational assets
may adversely affect the enterprise, its business units and their clients. As
such, consideration of security risk is a vital component of risk management.”
Table 1.1 lists the risk-related sections from ISO/IEC Guide 73:2002 and 2009.
Societal Risks
In a peer-reviewed study of risk in public works projects in 20 nations on
5 continents, Flyvbjerg et al. (2002, 2005) documented high risks for such
ventures based on both costs and demands. Actual costs of projects were
typically higher than estimated costs; cost overruns of 50% were common, and
overruns above 100% were not uncommon. Actual demand was often lower
than estimated; demand shortfalls of 25% were common, and shortfalls of 50% were not uncommon.
TABLE 1.1
ISO/IEC 27001 Clauses Related to Risk
2002 Clause    Topic              2009 Clause
3.9            Residual risk      3.8.1.6
3.10           Risk acceptance    3.7.1.6
3.11           Risk analysis      3.6.1
3.12           Risk assessment    3.4.1
3.13           Risk evaluation    3.7.1
3.14           Risk management    2.1; 3.1
3.15           Risk treatment     3.8.1
Due to such cost and demand risks, cost-benefit analyses of public
works projects have proven highly uncertain. The main causes of cost and
demand risks were found to be optimism bias and strategic misrepresenta-
tion. Measures identified to mitigate such risks include better governance
through incentive alignment and the use of reference class forecasting
(Flyvbjerg 2004).
listening had the effect of narrowing attention such that the frame was
ignored. This is a practical way of manipulating regional cortical activation
to affect risky decisions, especially because directed tapping or listening is
easily done.
Quantitative Analysis
Because risk carries so many meanings, several formal methods are used to
assess or measure it. Some of the quantitative definitions of risk are grounded
in statistical theory and lead naturally to statistical estimates, and some are
more subjective. For example, human decision making is a critical factor in
many situations. Even when statistical estimates are available, in many cases
risk is associated with rare types of failures and data may be sparse.
Often the probability of a negative event is estimated by using the fre-
quency of past similar events (surrogate data) or by event tree methods, but
probabilities for rare failures may be difficult to estimate if an event tree can-
not be formulated. This makes risk assessment difficult in hazardous indus-
tries, for example, nuclear energy, where the frequency of failures is low but
the harmful consequences of failure are numerous and severe.
Statistical methods may also require the use of a cost function, which in
turn may require the calculation of the cost of loss of a human life. This is a
difficult problem. One approach is to ask what people are willing to pay to
insure against death (Landsburg 2003) or radiological release (such as large
quantities of radioactive iodine). Because the answers depend strongly on the
circumstances, it is not clear that this approach is effective. In statistics, the notion of
\[ \text{Total risk} = R_1 + R_2 + R_3 + \dots + R_n \]
where $R_1, R_2, R_3, \dots, R_n$ represent the individual event risks. For example, if performing
activity X has a probability of 0.01 of suffering an accident of type A, with a
loss of 1000, and a probability of 0.000001 of suffering an accident of type B,
with a loss of 2,000,000, the total risk is a loss of 12 based on a loss of 10 from
an accident of type A plus 2 from an accident of type B.
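The worked example above can be reproduced in a few lines. This is a minimal sketch that treats each event risk as probability times loss, as the example does; the function name `total_risk` and the list-of-pairs layout are assumptions made for illustration.

```python
def total_risk(events):
    """Sum the event risks, each taken as probability * loss.

    events: iterable of (probability, loss) pairs.
    """
    return sum(probability * loss for probability, loss in events)


# Activity X from the text: accident types A and B.
activity_x = [
    (0.01, 1000),           # type A: event risk of about 10
    (0.000001, 2_000_000),  # type B: event risk of about 2
]

print(round(total_risk(activity_x), 6))
```

Note that the rare, severe accident of type B contributes a sixth of the total even though it is ten thousand times less likely than type A.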
One of the first major uses of this concept was the planning of the Delta
Works flood protection program in the Netherlands in 1953 with the aid of
mathematician David van Dantzig (Wolman 2008). The kind of risk analysis
pioneered for that project has become common today in fields like nuclear
power, aerospace, and the chemical industry (Note 9).
varies from what analysts deem rational. Risk in this case is the degree of
uncertainty associated with a return on an asset. Recognizing and respect-
ing the irrational influences on human decision making may do much to
reduce disasters caused by naive risk assessments that presume rationality
but in fact merely fuse many shared biases.
Audit Risk
The audit risk model expresses the risk that an auditor will provide an inap-
propriate opinion of a commercial entity’s financial statements. It can be
analytically expressed as:
AR = IR × CR × DR
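The model can be exercised numerically. The sketch below assumes the usual reading of the symbols, which the surrounding text does not spell out: AR is audit risk, IR is inherent risk, CR is control risk, and DR is detection risk, each expressed as a probability between 0 and 1. The function names and the sample figures are hypothetical.

```python
def audit_risk(inherent_risk, control_risk, detection_risk):
    """AR = IR * CR * DR, with each factor a probability in [0, 1]."""
    for name, value in (
        ("inherent_risk", inherent_risk),
        ("control_risk", control_risk),
        ("detection_risk", detection_risk),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    return inherent_risk * control_risk * detection_risk


def required_detection_risk(target_audit_risk, inherent_risk, control_risk):
    """Rearrange the model to DR = AR / (IR * CR), capped at 1."""
    if inherent_risk <= 0 or control_risk <= 0:
        raise ValueError("inherent and control risk must be positive")
    return min(1.0, target_audit_risk / (inherent_risk * control_risk))


# Hypothetical engagement: high inherent risk, moderate control risk.
print(audit_risk(0.8, 0.5, 0.1))
print(required_detection_risk(0.05, 0.8, 0.5))
```

The rearranged form reflects how the model is typically used in planning: the auditor assesses inherent and control risk, fixes an acceptable audit risk, and solves for the detection risk the audit procedures must achieve.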
Other Considerations
Another consideration in risk management is that risks are future prob-
lems that can be treated, rather than current ones that must be immedi-
ately addressed.
uncertainty, or risk proper, as we shall use the term, is so far different from
an unmeasurable one that it is not in effect an uncertainty at all. We … accord-
ingly restrict the term uncertainty to cases of the non-quantitative type.” Thus,
for Knight, uncertainty is immeasurable, not possible to calculate, while in
his view risk is measurable.
Another distinction between risk and uncertainty was proposed by
Hubbard (2007 p. 46, 2009 p. 39). Gertner (2003) and Lerner et al. (2000) sug-
gested a similar distinction:
In this sense, Hubbard uses the terms so that one may have uncertainty
without risk but not risk without uncertainty. We can be uncertain about
the winner of a contest, but unless we have some personal stake in the
outcome, we face no risk. If we bet money on the outcome of the contest,
then we incur a risk. In both cases, more than one outcome is possible. The
measure of uncertainty refers only to the probabilities assigned to outcomes,
while the measure of risk requires both probabilities for outcomes and losses
quantified for outcomes.
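The distinction can be made concrete with the contest example. In this sketch, uncertainty is described by outcome probabilities alone, while risk additionally needs a loss attached to each outcome. Summarizing risk as expected loss is one simple choice made here for illustration; the contestant names, probabilities, and the 50-unit bet are all hypothetical.

```python
def is_uncertain(outcome_probs):
    """Uncertainty exists when more than one outcome is possible."""
    return sum(1 for p in outcome_probs.values() if p > 0) > 1


def expected_loss(outcome_probs, loss_by_outcome):
    """Risk summarized as probability-weighted loss.

    Outcomes missing from loss_by_outcome carry no loss.
    """
    return sum(
        p * loss_by_outcome.get(outcome, 0.0)
        for outcome, p in outcome_probs.items()
    )


# Probabilities only: this is all that uncertainty requires.
contest = {"runner_a": 0.6, "runner_b": 0.4}

spectator_losses = {}                # no stake in the outcome
bettor_losses = {"runner_b": 50.0}   # bet 50 on runner_a, lost if runner_b wins

print(is_uncertain(contest))
print(expected_loss(contest, spectator_losses))
print(expected_loss(contest, bettor_losses))
```

Both people face the same uncertainty, but only the bettor has a nonzero expected loss, which matches the claim that uncertainty can exist without risk.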
lottery ticket is a very risky investment with a high chance of no return and
a small chance of a very high return. In contrast, putting money in a bank
at a defined rate of interest is a risk-averse action that yields a guaranteed
small gain and precludes other investments with possibly higher gains. The
possibility of getting no return on an investment is also known as the rate of
ruin (Note 10).
Scenario Analysis
Scenario analysis is a process of analyzing possible future events by con-
sidering alternative possible outcomes (sometimes called alternative worlds).
Thus, scenario analysis, which is a method of projection, does not try to
show one exact picture of the future. Instead, it presents alternative future
developments. Consequently, a scope of possible future outcomes is observ-
able. Both the outcomes and the development paths leading to the outcomes
are observable. In contrast to prognoses, scenario analysis does not involve
extrapolation of the past or reliance on historical data; it does not expect past
observations to remain valid in the future. Instead, it tries to consider pos-
sible developments and turning points that may be connected to the past. In
short, several scenarios are demonstrated to show possible future outcomes.
The method is useful for generating optimistic, pessimistic, and most
likely scenarios. Experience has shown that about three scenarios are most
appropriate for further discussion and selection. More scenarios could make
the analysis unclear (Aaker 2001 p. 108; Bea and Haas 2005 p. 279, 287).
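A three-scenario comparison of the kind described above might be laid out as follows. This is a sketch under stated assumptions: the `Scenario` fields, the profit formula, and every number are invented for illustration and are not drawn from the cited sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    unit_sales: int
    unit_price: float
    unit_cost: float


def profit(scenario, fixed_cost):
    """Outcome of one plan under one scenario (hypothetical model)."""
    margin = scenario.unit_price - scenario.unit_cost
    return scenario.unit_sales * margin - fixed_cost


# Pessimistic, most likely, and optimistic views of the same plan.
scenarios = [
    Scenario("pessimistic", unit_sales=8_000, unit_price=9.0, unit_cost=6.0),
    Scenario("most likely", unit_sales=10_000, unit_price=10.0, unit_cost=6.0),
    Scenario("optimistic", unit_sales=12_000, unit_price=11.0, unit_cost=5.5),
]

FIXED_COST = 30_000.0  # assumed for the example

outcomes = {s.name: profit(s, FIXED_COST) for s in scenarios}
for name, value in outcomes.items():
    print(f"{name:12s} {value:10.2f}")
print("spread:", max(outcomes.values()) - min(outcomes.values()))
```

No probabilities are attached to the scenarios; the output is a range of outcomes and the assumptions that produce each one, which is what the method is meant to expose.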
Of course, the analysis is designed to allow improved decision making by
allowing consideration of outcomes and their implications. Scenario analysis
can also be used to illuminate “wild cards.” For example, analysis of the pos-
sibility that the earth will be struck by a large celestial object (meteor) sug-
gests that while the probability is low, the damage inflicted will be so high
that the event is much more important (threatening) than the low probability
in any one year would suggest. However, this possibility is usually disre-
garded by organizations using scenario analysis to develop a strategic plan
since it has such overarching repercussions. (Special note: The meteor that
hit Russia on February 15, 2013, may change this assessment and reasonable
scenarios may develop for the future.)
In politics or geopolitics, scenario analysis involves modeling the possible
alternative paths of a social or political environment and possibly diplomatic
and war risks. For example, in the recent Iraq War, the Pentagon certainly had
to model alternative possibilities that might arise in the war situation and
had to position materiel and troops accordingly.
While there is value in weighting hypotheses and branching potential
outcomes from them, reliance on scenario analysis without reporting some
parameters of measurement accuracy (standard errors, confidence intervals
of estimates, metadata, standardization and coding, weighting for non-
response, error in reportage, sample design, case counts, etc.) is a poor sec-
ond to traditional prediction. Especially for complex problems, factors and
assumptions do not correlate in lockstep fashion. A specific sensitivity that
is undefined may call an entire study into question.
It is faulty logic to think, when arbitrating results, that a better hypothesis
will obviate the need for empiricism. In this respect, scenario analysis tries
to defer statistical laws (e.g., Chebyshev’s inequality) because the decision
rules occur outside a constrained setting. Outcomes are not permitted
to just happen; rather, they are forced to conform to arbitrary hypotheses
ex post, and therefore, there is no basis on which to place expected values. In
truth, there are no ex ante expected values, only hypotheses; and one is left
wondering about the roles of modeling and data decision. In short, compari-
sons of scenarios with outcomes are biased by not deferring to the data; this
may be convenient, but it is indefensible.
We must emphasize here that scenario analysis is no substitute for complete
and factual exposure of survey error in many studies (chemical, nuclear,
automotive, aerospace, economic, and others). In traditional prediction, given
the data used to model the problem, an analyst using a reasoned specification
and technique can state within a certain percentage of statistical error
the likelihood that a coefficient will fall within a certain numerical bound.
This exactitude need not come at the expense of disaggregated statements
of hypotheses. For example, the WhatIf package for the R software environment
(Stoll et al. 2006) has been developed for causal inference and to evaluate
Notes
Note 1
ISO 31000 is intended to be a family of standards relating to risk management
codified by the International Organization for Standardization (ISO). The
purpose of ISO 31000:2009 is to provide principles and generic guidelines
on risk management. ISO 31000 seeks to provide a universally recognized
paradigm for practitioners and companies employing risk management pro-
cesses to replace the myriad existing standards, methodologies, and para-
digms that differ among industries, subject matters, and regions. Currently,
the ISO 31000 family is expected to include:
Note 2
The Open Group is a vendor and technology-neutral industry consortium,
currently with over 400 member organizations. It was formed in 1996 when
X/Open merged with the Open Software Foundation. Services provided
include strategy, management, innovation and research, standards, certifica-
tion, and test development.
Note 3
Cumulative incidence or incidence proportion is a measure of frequency, as in
epidemiology, where it is a measure of disease frequency over a period of
\[ CI(t) = 1 - e^{-R(t) \cdot D} \]
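The formula can be evaluated directly. The sketch below assumes that R(t) is an incidence rate that stays constant over the period and that D is the duration of the period in matching time units; the definitions are not visible in the surrounding text, so this reading and the sample figures are assumptions.

```python
import math


def cumulative_incidence(rate, duration):
    """CI = 1 - exp(-rate * duration).

    rate: assumed constant incidence rate (events per person per unit time).
    duration: length of the period in the same time units.
    """
    if rate < 0 or duration < 0:
        raise ValueError("rate and duration must be non-negative")
    return 1.0 - math.exp(-rate * duration)


# Hypothetical disease: 0.02 cases per person-year, followed for 10 years.
print(round(cumulative_incidence(0.02, 10), 4))
```

For small values of rate times duration the result is close to the product itself, which is why the simple product is often used as an approximation for rare diseases.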
Note 4
Risk aversion is a concept in psychology, economics, and finance, based on the
behavior of humans (especially consumers and investors) who, when exposed to
uncertainty, attempt to reduce it. Risk aversion is the reluctance
of a person to accept a bargain with an uncertain payoff rather than
another bargain with a more certain but possibly lower expected payoff. For
example, a risk-averse investor might choose to put his or her money into a
bank account with a low guaranteed interest rate rather than into a stock that
may bring high returns but also involves a chance of losing value.
Note 5
Framing in the social sciences refers to a set of concepts and theoretical per-
spectives on how individuals, groups, and societies organize, perceive, and
communicate about reality. Framing is commonly used in risk analysis,
media studies, sociology, psychology, and political science.
Note 6
Risk assessment is a step in a risk management procedure. Risk assessment is
the determination of qualitative or quantitative value of risk related to a con-
crete situation and a recognized threat (also called hazard). Quantitative risk
assessment requires calculations of two components of risk (R): the magnitude
of the potential loss (L) and the probability (p) that the loss will occur. In all
types of engineering of complex systems, sophisticated risk assessments are
often made in the areas of safety engineering and reliability engineering in
relation to threats to life, environment, or machine functioning. The nuclear,
aerospace, oil, rail, and military industries have a long history of dealing with
risk assessment. Also, medical, hospital, and food industries control risks
and perform risk assessments continually. Methods for assessment of risk
may differ among industries and whether the assessment involves financial
decisions or environmental, ecological, or public health risks.
Risk management is the identification, assessment, and prioritization of risks
(defined in ISO 31000 as the effect of uncertainty on objectives, whether positive
or negative) followed by a coordinated economical application of resources to
minimize, monitor, and control the probabilities and/or impacts of unfortu-
nate events (Hubbard 2009 p. 46) or maximize the realization of opportuni-
ties. Risks can arise from uncertainty in financial markets, project failures
(at any phase in design, development, production, or sustainment life cycles),
legal liabilities, credit risks, accidents, natural causes and disasters, deliberate
attacks from adversaries, or events of uncertain or unpredictable root cause.
Several risk management standards have been developed by various orga-
nizations, including the Project Management Institute, the National Institute
of Standards and Technology (NIST), actuarial societies, and the International
Organization for Standardization (ISO) that created ISO/IEC Guide 73:2009
(2009) and ISO/DIS 31000 (2009). Methods, definitions, and goals vary widely
according to whether the risk management method is in the context of proj-
ect management, security, engineering, industrial processes, financial port-
folios, actuarial assessments, or public health and safety. The strategies to
manage risk typically include transferring the risk to another party, avoiding
the risk, reducing its negative effect or probability, or even accepting some or
all of the potential or actual consequences.
Certain aspects of many risk management standards have come under criti-
cism for having no provision for measurable improvement even if the confi-
dence in estimates and decisions seems to increase (ISO/DIS 31000: 2009).
Note 7
The CCTA risk analysis and management method (CRAMM) was created
in 1987 by the Central Computing and Telecommunications Agency (CCTA)
of the United Kingdom government. CRAMM is currently on its fifth ver-
sion. Each of its three stages is supported by objective questionnaires and
guidelines. The first two stages identify and analyze the risks to a system.
The third stage recommends how these risks should be managed. The three
stages are as follows:
Stage 2. Assessment of the risks to the proposed system and the require-
ments for security by:
• Identifying and assessing the types and levels of threats that may
affect the system.
• Assessing the extent of the system’s vulnerabilities to the identi-
fied threats.
• Combining threat and vulnerability assessments with asset values
to calculate measures of risks.
Stage 3. Identification and selection of countermeasures commensurate
with the measures of risks calculated in Stage 2. CRAMM contains a
very large library consisting of over 3000 detailed countermeasures
organized into over 70 logical groupings.
Note 8
The Méthode Harmonisée d’Analyse de Risques (Method for Harmonized
Analysis of Risks or MEHARI) was developed and distributed by CLUSIF
(a group of French information security professionals). Since 1995, MEHARI
has been provided to information security personnel (ISO practitioners, risk managers,
chief information officers, etc.) to enable them to evaluate and manage
the risks attached to scenarios. MEHARI is derived from previous stan-
dards (ISO/IEC 13335) and steadily evolved to provide compliance to the
newer ISO/IEC 27001-02 and ISO/IEC 27005 standards. MEHARI generally
involves the analysis of the security stakes and a preliminary classification
of the IS entities according to three basic security criteria (confidentiality,
integrity, availability). The typical steps are:
MEHARI complies by design with ISO 13335 to manage risks. This method
can thus take part in a stage of the information security management system
(ISMS) model of ISO 27001 by:
Note 9
The Delta Works is a series of construction projects in the southwest of the
Netherlands intended to protect a large area of land around the Rhine–
Meuse–Scheldt delta from the sea. The works consist of dams, sluices, locks,
levees, and storm surge barriers. The aim of the project was to shorten the
Dutch coastline, thus reducing the number of dikes that had to be raised.
Along with Zuiderzee Works, Delta Works has been declared one of the Seven
Wonders of the Modern World by the American Society of Civil Engineers.
Note 10
Rate of ruin is the probability that a trading stake will “go bust,” based on a
dollar equivalent standard deviation, a winner-to-loser ratio, and a dollar
trading stake. The calculation utilizes the natural log, and the result is a per-
centage probability.
Note 11
Prospect theory is a behavioral economic concept that describes decisions
between alternatives that involve risk where the probabilities of outcomes
are known. The theory says that people make decisions based on the poten-
tial value of losses and gains rather than the final outcome, and they evaluate
the losses and gains using certain heuristics. The model is descriptive: it
tries to model real-life choices rather than optimal decisions. The paper titled
“Prospect Theory: An Analysis of Decision under Risk” has been called a
“seminal paper in behavioral economics” (Shafir and LeBoeuf 2002).
Cumulative prospect theory (CPT) is a model for descriptive decisions under
risk that was introduced by Tversky and Kahneman (1992). It is a further
development and variant of prospect theory. The difference between this
version and the original prospect theory is that weighting is applied to the
cumulative probability distribution function, as in rank-dependent expected
utility theory but not applied to the probabilities of individual outcomes.
The main modification to prospect theory is that, as in rank-dependent
expected utility theory, cumulative probabilities are transformed rather than
individual probabilities. This leads to the overweighting of extreme events
that occur with small probability rather than to overweighting of all small
probability events. The modification helps avoid a violation of first-order sto-
chastic dominance and makes the generalization to arbitrary outcome distri-
butions easier. CPT is therefore on theoretical grounds an improvement over
prospect theory.
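The transformation of cumulative rather than individual probabilities can be illustrated in code. This sketch uses the functional forms and median parameter estimates commonly quoted from Tversky and Kahneman (1992): a power value function with exponent 0.88, loss aversion of 2.25, and weighting parameters of 0.61 for gains and 0.69 for losses. None of these figures appear in this text, so treat them, and the function names, as assumptions.

```python
def weight(p, gamma):
    """Inverse-S probability weighting function."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def value(x, alpha=0.88, beta=0.88, loss_aversion=2.25):
    """Value of a gain or loss relative to the reference point 0."""
    if x >= 0:
        return x ** alpha
    return -loss_aversion * (-x) ** beta


def cpt_value(prospect, gamma_gain=0.61, gamma_loss=0.69):
    """prospect: list of (outcome, probability) pairs summing to 1."""
    # Rank gains from best to worst and losses from worst to best, so the
    # running total is always the probability of more extreme outcomes.
    gains = sorted((op for op in prospect if op[0] >= 0),
                   key=lambda op: op[0], reverse=True)
    losses = sorted((op for op in prospect if op[0] < 0),
                    key=lambda op: op[0])
    total = 0.0
    for ranked, gamma in ((gains, gamma_gain), (losses, gamma_loss)):
        more_extreme = 0.0
        for outcome, prob in ranked:
            # Decision weight is a difference of transformed cumulative
            # probabilities, not a transform of prob itself.
            decision_weight = (weight(more_extreme + prob, gamma)
                               - weight(more_extreme, gamma))
            total += decision_weight * value(outcome)
            more_extreme += prob
    return total


lottery = [(1000, 0.01), (0, 0.99)]   # hypothetical long shot
sure_thing = [(10, 1.0)]              # same expected value, no uncertainty
coin_flip = [(100, 0.5), (-100, 0.5)]

print(cpt_value(lottery) > cpt_value(sure_thing))
print(cpt_value(coin_flip) < 0)
```

With these parameters the long shot is valued above its sure equivalent because the extreme 1% outcome is overweighted, and the fair coin flip has negative value because losses loom larger than gains.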
References
44 USC §3542(b)(1). http://www.law.cornell.edu/uscode/text/44/3542.
Aaker, D. (2001). Strategic Market Management. New York: John Wiley & Sons.
Ayyub, B. (2003). Risk Analysis in Engineering and Economics. New York: Chapman &
Hall/CRC.
Bea, F. and J. Haas (2005). Strategisches Management. Stuttgart: Lucius & Lucius.
Bouyer, J., D. Hémon, S. Cordier et al. (2009). Épidemiologie principes et méthodes quan-
titatives. Paris: Lavoisier.
Carson, J., E. Elyasiani, and I. Mansur. (2008). Market risk, interest rate risk, and inter-
dependencies in insurer stock returns: a system GARCH model. Journal of Risk
and Insurance, 75, 873–891.
Chang, R., J. Schaperow, T. Ghosh, J. Barr, C. Tinkler and M. Stutzke. (January 2012).
State-of-the-Art Reactor Consequence Analyses (SOARCA) Report. NRC publica-
tions in the NUREG series, NRC regulations, and Title 10, Energy, in the Code
of Federal Regulations. Washington, DC: The Superintendent of Documents U.S.
Government Printing Office Mail Stop SSOP.
Cortada, J. (2003). The Digital Hand: How Computers Changed the Work of American
Manufacturing, Transportation, and Retail Industries. New York: Oxford University
Press.
Cortada, J. (2005). The Digital Hand, Volume II: How Computers Changed the Work of
American Financial, Telecommunications, Media, and Entertainment Industries. New
York: Oxford University Press.
Cortada, J. (2007). The Digital Hand, Volume III: How Computers Changed the Work of
American Public Sector Industries. New York: Oxford University Press, p. 496.
de Becker, G. (1997). The Gift of Fear: Survival Signals That Protect Us from Violence.
Boston: Little, Brown.
Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. Oxford: Oxford
University Press.
Drake, R. (2004). Selective potentiation of proximal processes: neurobiological mechanisms
for spread of activation. Medical Science Monitor, 10, 231–234.
Flyvbjerg, B. (2008). Curbing optimism bias and strategic misrepresentation in plan-
ning: reference class forecasting in practice. European Planning Studies, 16, 3–21.
Flyvbjerg, B. (2006). From Nobel Prize to project management: getting risks right.
Project Management Journal, 37, 5–15.
Flyvbjerg, B. (2004). Procedures for Dealing with Optimism Bias in Transport Planning:
Guidance Document. British Department for Transport.
http://flyvbjerg.plan.aau.dk/0406DfT-UK%20OptBiasASPUBL.pdf
Flyvbjerg, B., B. Holm, and S. Buhl. (2005). How (in)accurate are demand forecasts in
public works projects? Journal of the American Planning Association, 71, 131–146.
http://flyvbjerg.plan.aau.dk/Traffic91PRINTJAPA.pdf
Flyvbjerg, B., B. Holm, and S. Buhl. (2002). Underestimating costs in public works
projects: error or lie? Journal of the American Planning Association, 68, 279–295.
http://flyvbjerg.plan.aau.dk/JAPAASPUBLISHED.pdf
Franklin, J. (2001). The Science of Conjecture: Evidence and Probability before Pascal.
Baltimore: Johns Hopkins University Press.
Rychetnik, L., P. Hawe, E. Waters et al. (2004). A glossary for evidence based public
health. Journal of Epidemiology and Community Health, 58, 538–545.
Sanderson, H. and J. Lewis. (2011). A Practical Guide to Delivering Personalisation: Person-
Centered Practice in Health and Social Care. London: Jessica Kingsley Publishers.
Schatz, J., S. Craft, M. Koby et al. (2004). Asymmetries in visual-spatial processing
following childhood stroke. Neuropsychology, 18, 340–352.
Schatz, J., S. Craft, M. Koby et al. (2004a). On the role of response conflicts and stim-
ulus position for hemispheric differences in global/local processing: an ERP
study. Neuropsychologia, 42, 1805–1813.
Shafir, E. and R. LeBoeuf. (2002). Rationality. Annual Review of Psychology, 53, 491–517.
Stoll, H., G. King, and L. Zeng. (2006). Whatif: software for evaluating counterfactu-
als. Journal of Statistical Software, 15, 1–19. http://www.jstatsoft.org/
Talbot, J. and M. Jakeman. (2009). Security Risk Management Body of Knowledge. New
York: John Wiley & Sons.
Tversky, A. and D. Kahneman (1992). Advances in prospect theory: cumulative repre-
sentation of uncertainty. Journal of Risk and Uncertainty, 5, 297–323.
Tversky, A. and D. Kahneman. (1981). The framing of decisions and the psychology
of choice. Science 211 (4481), 453–458.
Wolman, D. (2008). Before the levees break: a plan to save the Netherlands. Wired
Magazine. P. 3, Dec. 22.
Selected Bibliography
Bernstein, P. (1998). Against the Gods. New York: John Wiley & Sons.
Clemens, P. and T. Pfitzer. (2006). Risk assessment and control. Professional Safety, 51,
41–44.
Department of the Army. (2006). Composite Risk Management FM 5-19 (FM 100-14).
Washington.
Fahey, L. and R. Randall. (1997). Learning from the Future: Competitive Foresight
Scenarios. New York: John Wiley & Sons.
Flyvbjerg, B., N. Bruzelius, and W. Rothengatter. (2003). Megaprojects and Risk: An
Anatomy of Ambition. Cambridge: Cambridge University Press.
Franklin, J. (2001). The Science of Conjecture: Evidence and Probability before Pascal.
Baltimore: Johns Hopkins University Press.
Gardner, D. (2008). Risk: The Science and Politics of Fear. New York: Random House.
Heldman, K. (2005). Project Manager’s Spotlight on Risk Management. San Francisco:
Jossey-Bass.
Hillson, D. (2007). Practical Project Risk Management: The Atom Methodology. Vienna,
VA: Management Concepts.
Holton, G. (2004). Defining risk. Financial Analysts Journal, 60, 19–25.
http://www.riskexpertise.com/papers/risk.pdf
Hopkin, P. (2012). Fundamentals of Risk Management, 2nd ed. London: Kogan-Page.
Kendrick, T. (2003). Identifying and Managing Project Risk: Essential Tools for Failure-
Proofing Your Project. New York: American Management Association.
Linneman, R. and J. Kennell. (1977). Shirt-sleeve approach to long-range plans.
Harvard Business Review, 55, 141.
Metzner-Szigeth, A. (2009). Contradictory approaches? on realism and constructiv-
ism in the social sciences research on risk, technology, and the environment.
Futures, 41, 156–170.
Porteous, B. and P. Tapadar. (2005). Economic Capital and Financial Risk Management for
Financial Services Firms and Conglomerates. New York: Palgrave Macmillan.
Proske, D. (2008). Catalogue of Risks: Natural, Technical, Social, and Health Risks. New
York: Springer.
Rescher, N. (1983). A Philosophical Introduction to the Theory of Risk Evaluation and
Measurement. Lanham, MD: University Press of America.
Schwartz, P. (1996). The Art of the Long View: Paths to Strategic Insight for Yourself and
Your Company. New York: Random House.
2
Approaches to Risk
Perhaps when one approaches risk in any situation, there is a profound need
to differentiate disaster and mitigation. Risk is the probability that a hazard
will turn into a disaster. Vulnerability and hazards taken separately are not
dangerous, but if they come together, they become a risk or, in other words,
the probability that a disaster will happen.
The last chapter indicates that probability allows risks to be reduced or
managed. If we are careful about how we treat the environment and under-
stand our weaknesses and vulnerabilities to existing hazards, we can take
measures to ensure that hazards do not turn into disasters. Risk manage-
ment helps us prevent disasters; it also helps us practice what is known as
sustainable development. Development is sustainable when people can make
good livings and be healthy and happy without damaging the environment
or other people over the long term. As noted in Chapter 1, an individual can
make a living for a while by chopping down trees and selling the wood, but
the practice is not sustainable if he or she does not plant more trees than are
cut down. The result will be no trees and no means to make a living.
We also touched on prevention and mitigation—actions we can take to
ensure that a disaster does not happen or causes the least possible damage
if it does. We cannot prevent most natural phenomena, but we can reduce
the damage caused. For example, we can decrease earthquake damage by
building stronger houses on solid ground. One way to address very complex
risk issues is to evaluate them via scenario analysis, which is a process of ana-
lyzing possible future events by considering alternative possible outcomes
(sometimes called alternative worlds). We also address this in the last chapter.
Zero Mind-Set
When one deals with risk, the concerns about failures, accidents, and haz-
ards require discussions by all parties concerned. Two fundamental con-
cerns should be covered in order for discussions to be fruitful in the sense of
eliminating or reducing failures, accidents, and hazards:
Both issues are components of the risk planning (RP) activity that focuses
on minimizing and/or eliminating all failures, accidents, and hazards. In
essence, these points must be followed if the focus is on improving an
organization's processes, operations, and equipment utilization.
To satisfy the philosophy of eliminating or reducing problems, a zero
mind-set should be promoted and supported. Important elements of the mind-
set are the understanding and implementation of the principles of ALARP
based on trying to minimize all risks as far as practicable (and below the
defined acceptable levels) after having assessed foreseen failure modes, con-
sequences, and possible risk-reducing actions. ALARP generally is used to
minimize both the probability of an undesired event and the consequences if
it occurs. In practice, ALARP means that all personnel participating in prep-
aration and execution of operations should actively seek to minimize risk as
far as practicable through preventive operational planning and selecting safe
solutions and robust designs. ALARP is considered a mind-set. Risk-reducing
actions should be based on subjective cost–benefit assessments. Examples of
such actions are the installation of critical low-cost components such as pad
eyes and lifting gear, familiarization and hazard awareness training for per-
sonnel, and limiting numbers of personnel in potentially hazardous areas
where wires are under tension.
While the basic philosophies of all safety and hazard programs are simple
and easy to implement in any organization, the following cornerstones will
make any hazard and safety program effective and improvement-driven:
• Shared vision
• Cultural alignment
• Focus on incident control
• Upstream systems definitions
• Feedback
• Maintenance of safe attitude, awareness, action, and accountability
(the four A’s)
• Cultural change
• Commitments by all, especially management
ALARP
In beginning a discussion about risk, invariably we start by explaining the
concepts of as low as reasonably practicable (ALARP) and so far as is reasonably
practicable (SFAIRP). Both terms have essentially the same meaning and the
concept of reasonably practicable is at their cores. They both imply the need
to weigh a risk against the trouble, time, and money needed to control it.
Thus, ALARP describes the level to which we expect to see workplace
risks controlled.
The use of the reasonably practicable principle allows us to set goals
instead of prescriptive practices for the process owners. This flexibility is a
great advantage, but it has drawbacks too. Deciding whether a risk is ALARP
We briefly defined ALARP and SFAIRP but have not discussed them in
terms of practicality. Let us look at reasonably practicable as it applies to both
terms. SFAIRP is most often used in the Health and Safety at Work Act of
1974 and other regulations. ALARP is used more commonly by risk special-
ists and process owners. In most situations, ALARP is cited. However, in the
health, safety, and environmental area, the consensus is that the terms are
interchangeable with the exception that the correct phrase must be used in
formal legal documents.
What is the correct legal phrase? The definition given by Britain's Court of Appeal
in the case of Edwards v. National Coal Board [1949] 1 All ER 743 is:
In essence, the court ensured that a risk reduced to ALARP met the standard
of weighing the risk against the sacrifice needed to further reduce it. The deci-
sion is weighted in favor of health and safety because the presumption is
that the process owners should implement the risk reduction measure. To
avoid the need for this sacrifice, the process owners must be able to show that
it would be grossly disproportionate to the benefits of risk reduction that would
be achieved. Thus, the process is not one of balancing the costs and benefits
of measures but rather of adopting measures except where they are ruled
out because they involve grossly disproportionate sacrifices. Two extreme
examples are (1) spending $1 million to prevent five staff members from suf-
fering bruised knees is obviously grossly disproportionate; and (2) spending
$1 million to prevent a major explosion capable of killing 150 people is obvi-
ously proportionate.
In reality, many decisions about risk and the controls to achieve ALARP
are not so obvious because many factors come into play, such as ongoing
costs set against remote chances of one-off events and the routine expenses
and supervision time required to ensure that employees wear earplugs set
against a chance of developing hearing losses in the future. ALARP requires
judgment and no simple formula for computing it exists. The calculations may
be very complicated. To facilitate the determination, the following checklist
serves as a guideline:
• Assess compliance with the law and the use of good practice in each
individual situation.
• Reduce risks by defining and understanding ALARP policy and
guidance.
• Identify the principles and guidelines to assist the team in deciding
whether process owners reduced risk ALARP.
• Identify health, safety, and environmental principles for a cost–
benefit analysis (CBA) in support of ALARP decisions.
• Develop a cost–benefit analysis (CBA) checklist.
• Review ALARP in a cursory approach.
A risk, on the other hand, is the likelihood that a hazard will actually cause
adverse effects and involves a measure of the effect. Risk is a two-part concept,
and both parts are required for a practitioner to make sense of it. Likelihoods
can be expressed as probabilities (one in a thousand), frequencies (1000 cases
per year), or qualitatively (negligible, significant, etc.). The effect can be
described in many ways. For example, we know that accidents happen in
all areas of life. The following statistics are for the United Kingdom and the
United States. The sources are http://www.hse.gov.uk/education/statistics.htm
and http://www.osha.gov/oshstats/commonstats.html, respectively.
This represents a slight increase from the 4,551 work fatalities in 2009 and
the second lowest annual total since the fatal injury census was first con-
ducted in 1992. In the construction field, the following fatalities in 2011 are
worth noting:
Note that the 10 most frequently cited violations of OSHA standards in 2011
could have been minimized or even eliminated if proper analysis was per-
formed. The standards in question are:
$$\frac{\text{Sacrifice}}{\text{Risk}} > 1 \times \text{DF}, \quad \text{or} \quad \frac{\text{Costs}}{\text{Benefits}} > 1 \times \text{DF}$$
As we can see from the formula above, the focus of a CBA is to ensure that all
the appropriate costs have been included and challenge the costs that seem
out of the ordinary and excessive. Therefore, it would be proper to include
the costs of installation, operation, training, additional maintenance, and the
business losses that would follow a plant shutdown undertaken solely for
the purpose of eliminating or minimizing a hazard. In fact, all claimed costs
must be incurred by the process owner. Costs incurred by other parties such
as members of the public should not be counted.
On the other hand, sacrifice implies non-recoverable cost. If a measure
implies lost production, only the lost production during the delay can be
counted. Conversely, if lost production is actually deferred production (if the
life of the plant is based on operating time rather than calendar time), it
should only take account of interest on the lost production plus allowance
for operational costs during the implementation time and potential increase
in operational costs at the end of life. For example, oil or gas remaining in
a field while work is carried out on a platform should not be counted as
lost production.
If the lost production costs strongly influence a decision not to imple-
ment, the process owner should show that phasing or scheduling the work
to coincide with planned downtimes, for example, for maintenance, would
not change the balance. The costs considered should be only those necessary
and sufficient for the purpose of implementing the risk reduction measure
(no “gold plating” or deluxe items).
Ongoing production losses as a result of the measure (slowed produc-
tion or increased maintenance) can be counted. Any savings resulting from
the measure such as reduced operational costs, avoidance of damage, and
reinstatement costs should be offset against the above costs. These are not
considered safety benefits but are counted as savings because they reduce
the overall cost of implementing a measure. Finally, the costs claimed should
relate only to the measure being implemented for safety. Translation into
monetary costs is often uncertain and that is why all costs must be justified.
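The cost rules above, together with the disproportion test in the earlier formula, can be sketched in Python. This is a minimal illustration rather than a prescribed procedure: the function names, the cost categories, and all dollar figures are hypothetical, and the choice of DF is left to the analyst.

```python
def net_cost(claimed_costs, savings):
    """Sum the claimed cost items and offset any savings from the measure."""
    return sum(claimed_costs.values()) - sum(savings.values())


def is_grossly_disproportionate(cost, benefit, df):
    """Return True when cost / benefit exceeds 1 x DF (the test in the formula)."""
    if benefit <= 0:
        raise ValueError("benefit must be positive")
    return cost / benefit > 1 * df


# Hypothetical figures, in dollars over the life of the plant.
claimed_costs = {"installation": 40_000, "training": 5_000, "maintenance": 15_000}
savings = {"reduced_operating_costs": 10_000}

cost = net_cost(claimed_costs, savings)
ruled_out = is_grossly_disproportionate(cost, benefit=20_000, df=2)
print(cost, ruled_out)
```

With these made-up numbers the net cost is 50,000 and the ratio of cost to benefit is 2.5, which exceeds a DF of 2, so the measure could be ruled out. Costs borne by third parties are simply left out of `claimed_costs`, in line with the text.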
Now that we have looked at cost issues, we can consider the benefits.
The focus is to ensure that all benefits of implementing a health and safety
improvement measure are included and that the benefits associated with
the measure are not underestimated. The benefits should include all reduc-
tions in risks to members of the public, workers, and the wider community.
Benefits can be classified so that prevention is assured in typical areas such
as (1) fatalities, (2) major and minor injuries, (3) ill health, and (4) environ-
mental damage (control of major accident hazards or COMAH).
Benefits can also include avoidance of deployment of emergency services
and avoidance of countermeasures such as evacuation and post-accident
decontamination if appropriate. The cash valuations of preventing health
and safety effects on people are presented in Table 2.1. The value costs
are estimates. It is very important to note that all benefits of reducing an
injury type should be included. If a risk reduction measure is identified for
one type of accident but reduces other risks such as health problems, all
benefits should be counted. Because of this convolution, the responsible par-
ties may need to treat reinstatement costs as benefits rather than offsetting
them against costs. This would be the case if a plant being reinstated were
safety-related plant, for example, one that treats hazardous wastes. This can
TABLE 2.1
Typical Cash Valuation for Cost–Benefit Analysis
Type of Injury | Explanation | Value (Cost in $)
Fatality | Loss of life | 2,000,000
Injury, permanently incapacitating | Moderate to severe pain for 1 to 4 weeks; pain gradually reducing but may recur during some activities; some permanent restrictions to leisure and possibly work activities | 250,000
Injury, serious | Slight to moderate pain for 2 to 7 days followed by pain or discomfort for several weeks; some restrictions on work and leisure activities for several weeks to months; return to normal health after 3 to 4 months without permanent disability | 50,000
Injury, slight | Minor cuts and bruises; quick and complete recovery | 500
Illness, permanently incapacitating | Same as for injury | 250,000
Illness, other ill health | Absence exceeding 1 week; no permanent health consequences | 3,000 + 150 per day for absence
Illness, minor | Absence up to 1 week; no permanent health consequences | 1,000
Although these issues are not ones that health, safety, and environment
would require a responsible process owner to consider, they can often play a
significant part in any decision about investing in new and safer technology.
A strong warning to the reader is necessary. Just because the verbalization
of the assumptions and uncertainties seems easy and practical, in practice
the assumptions may be much more involved than discussed here. In such
situations, the team should focus on aspects of risk analysis from several
perspectives so that the outcome of a sound CBA may become one of sev-
eral considerations involved in the judgment that a risk has been reduced
ALARP. For example, in policy work and in operational work dealing with
many hazards, you may also need to consider how the public feels about
the risk. In such cases, a CBA should detail societal concerns in the areas of
reducing risks and protecting people.
Risk Leverage
Risk leverage analysis (RLA) is commonly included in preparation of a CBA.
RLA measures the relative costs and benefits of performing various candi-
date risk resolution activities. The equation to calculate risk leverage is:
$$\text{Risk leverage} = \frac{RE_{\text{before}} - RE_{\text{after}}}{\text{Risk resolution cost}}$$
where leverage is a rule for risk resolution that reduces risk by decreasing
the risk exposure (RE). Risk resolution cost is the cost of implementing the
risk action plan. The concept of leverage helps determine actions with the
highest paybacks. Generally, major risk leverage exists in the early phases of
all design and/or development projects. Of course, the intent is to identify as
many risky items as early as possible to reduce rework costs and to minimize
expensive fixes as the design and/or development moves into later phases.
The risk exposure preceding a specific activity is the probability of a cost
overrun multiplied by the consequence of the overrun. At project completion,
100% of the cost overrun is the prior risk exposure. Assuming 75 cents on the
dollar for a reasonable claim, only 25% of the cost overrun remains at risk
after the damage claim activity. If the cost to prepare the entitlement (legal
basis for the claim) and quantum (value of the claim) is $100,000 (3 experts
for 6 months), the risk leverage would be 7.5 to 1 for a $1 million overrun, and
75 to 1 for a $10 million overrun (Hall 1998 p. 115). To calculate risk leverage,
we use the following formula:
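The damage claim figures quoted above can be checked against the risk leverage equation given earlier in this section with a short Python sketch. The function names are illustrative only. Following the text, the prior exposure at project completion is taken as 100% of the overrun (probability 1.0), and 25% of the overrun is assumed to remain at risk after the claim.

```python
def risk_exposure(probability, consequence):
    """Risk exposure (RE): probability of the loss times its consequence."""
    return probability * consequence


def risk_leverage(re_before, re_after, resolution_cost):
    """(RE before - RE after) divided by the cost of the risk resolution."""
    if resolution_cost <= 0:
        raise ValueError("resolution_cost must be positive")
    return (re_before - re_after) / resolution_cost


CLAIM_PREPARATION_COST = 100_000  # entitlement and quantum, from the text
REMAINING_AT_RISK = 0.25          # 75 cents on the dollar is recovered

for overrun in (1_000_000, 10_000_000):
    re_before = risk_exposure(1.0, overrun)
    re_after = risk_exposure(1.0, REMAINING_AT_RISK * overrun)
    leverage = risk_leverage(re_before, re_after, CLAIM_PREPARATION_COST)
    print(f"${overrun:,} overrun: leverage {leverage:g} to 1")
```

The two cases reproduce the 7.5 to 1 and 75 to 1 ratios cited from Hall (1998).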
Accidents are events that happen without warning and planning. Hazards
are situations or events that threaten life, health, property, or environment.
Failures, accidents, and hazards may all be defined as deviations from
expected outcomes. The fact that they are deviations from goals or targets
requires actions to resolve the deviations.
ALARP Fallacies
Over the years, many myths have developed around ALARP. Many of
these myths have spread over many applications in a variety of industries
and organizations. The four most important ones are identified below
(www.hse.gov.uk).
Ensuring that risks are reduced ALARP means that we have to raise
standards continually—It is part of health, safety, community, and environ-
ment philosophy that we seek continual improvements in health and safety
standards. That philosophy is widely shared in several countries and has
produced excellent records wherever it has been applied. As a case in point,
Britain has one of the best records for occupational health and safety in the
world. However, for any country to achieve similar results, it must encour-
age improvements in a responsible way. Deciding whether an activity is safe
enough (risk is reduced ALARP) is a separate exercise from seeking con-
tinual improvements in standards. Of course, as technology develops, new
and better methods of risk control become available.
Process owners should review what is available from time to time and
consider whether they need to implement new controls. That does not mean
that the best risk controls available are always reasonably practicable. Only if
the cost of implementing these new methods of control is not grossly dispro-
portionate to the reduction in risk they achieve is their implementation rea-
sonably practicable. For that reason, we accept that it may not be reasonably
practicable to upgrade an older plant and equipment to modern standards.
However, other measures such as partial upgrades or alternative measures
may be required to reduce the risk ALARP.
The determination of what is ALARP will also be affected by changes
in knowledge about the size or nature of the risk presented by a hazard. If
sound evidence indicates that a hazard presents significantly greater risks
than previously thought, of course we should press for stronger controls to
handle the new situation. However, if the evidence shows the hazard presents
significantly fewer risks than previously thought, we should accept a
relaxation in control if the new arrangements ensure the risks are ALARP.
If few employers have adopted a high standard of risk control, the stan-
dard is ALARP—Some organizations implement standards of risk control
that are more stringent than good practice for a number of reasons, such
as meeting corporate social responsibility goals, striving to be the best, or
reaching an agreement with staff to provide additional controls. It does not
follow that these risk control standards are reasonably practicable simply
because a few organizations adopted them. Until such practices are
evaluated and recognized by the proper government authorities,
an organization should not seek to enforce them at policy or operational
level. It is also acceptable for a process owner to relax from a self-imposed
higher standard to one accepted as ALARP, for example, simply meeting the
requirements of a relevant approved code of practice (ACOP).
Ensuring that risks are reduced ALARP means that we can insist on
all possible risk controls—ALARP does not mean that every measure that
could possibly be taken (however theoretical) to reduce risk must be taken.
Sometimes a risk can be controlled by more than one method. These con-
trols can be considered barriers that prevent a risk from being realized.
Companies are tempted to require more and more of these protective barri-
ers to reduce risks as much as possible. Typical approaches may be controls
such as limit switches, horns, light signals, and mistake proofing (“belt and
braces” approaches). However, remember that ALARP means that a barrier
can be required only if its introduction does not involve grossly dispropor-
tionate cost.
Ensuring that risks are reduced ALARP means that there will be no acci-
dents or ill health—ALARP does not represent zero risk. We must expect a
risk arising from a hazard to be realized on occasion and harm to occur even
if the risk is ALARP. This is an uncomfortable thought, but it is inescapable.
Of course we should strive to make sure that process owners reduce and
maintain the risks ALARP, and we should never be complacent. However,
we must accept the reality that risk from an activity can never be entirely
eliminated unless the activity is stopped. This relates to the issue of risk tol-
erability and explains why risk assessments feed into contingency planning.
Example
A simple method for coarse screening of measures is shown in Table 2.2.
This puts the costs and benefits into a common format of dollars per year for
the lifetime of a plant. Assume a distillery plant utilizing a process in which
an explosion could lead to:
• 30 fatalities
• 50 permanent injuries
• 200 serious injuries
• 500 slight injuries
Further assume that the rate of occurrence of this explosion has been analyzed
to be about 1 × 10⁻⁶ per year, or 1 in 1,000,000 annually. The plant has
an estimated lifetime of 30 years. What we want to know is how much could
the company reasonably spend to eliminate (reduce to zero) the risk from
an explosion? If the risk of explosion were eliminated, the benefits can be
assessed as shown in Table 2.2. The $3,390 represents the estimated bene-
fit of eliminating a major accidental explosion at the plant on the basis of
avoidance of casualties. The example did not include discounting or consider
inflation. Also, for a measure to be recognized as not reasonably practicable,
the cost must be grossly disproportionate to the benefits. This is taken into
account by the disproportion factor (DF). In this case, the DF indicates the
consequences of such explosions are great. A DF exceeding 10 is unlikely.
In our example, it might be reasonably practicable to spend about $33,900
TABLE 2.2
Screening Measures
Injuries | Number of Incidents | Value | Rate of Explosion | Years | Dollars
Fatalities | 30 | × 3,000,000 | × 1 × 10⁻⁶ | × 30 | 2,700
Permanent injuries | 50 | × 250,000 | × 1 × 10⁻⁶ | × 30 | 375
Serious injuries | 200 | × 50,000 | × 1 × 10⁻⁶ | × 30 | 300
Slight injuries | 500 | × 1,000 | × 1 × 10⁻⁶ | × 30 | 15
Total benefits | | | | | 3,390
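The arithmetic in Table 2.2 can be reproduced with a few lines of Python. This is a sketch under the same simplifications as the example (no discounting, no inflation); the variable and function names are illustrative, and the valuations are those printed in Table 2.2.

```python
RATE_PER_YEAR = 1e-6        # estimated explosion rate, from the example
PLANT_LIFETIME_YEARS = 30   # estimated plant lifetime, from the example

# injury type -> (number of casualties, valuation per casualty in dollars)
casualties = {
    "fatalities": (30, 3_000_000),
    "permanent injuries": (50, 250_000),
    "serious injuries": (200, 50_000),
    "slight injuries": (500, 1_000),
}


def lifetime_benefit(count, value, rate=RATE_PER_YEAR, years=PLANT_LIFETIME_YEARS):
    """Expected casualty cost avoided over the plant lifetime."""
    return count * value * rate * years


benefits = {kind: lifetime_benefit(n, v) for kind, (n, v) in casualties.items()}
total_benefit = sum(benefits.values())


def max_justifiable_spend(benefit, df):
    """Spend above benefit x DF would be grossly disproportionate."""
    return benefit * df


print(f"Total benefit: ${total_benefit:,.0f}")
print(f"Upper spend at DF = 10: ${max_justifiable_spend(total_benefit, 10):,.0f}")
```

The total comes to about $3,390, and with a disproportion factor of 10 the upper figure is about $33,900, matching the values in the text.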
Differentiating Risks
Although risk is everywhere, those who deal with it must differentiate the
levels for all tasks under investigation. Financial issues play a role in the speci-
ficity of a category, but also depend on the industry and magnitude of a
project. The amounts cited here are only examples. Generally, risks are rated
into three categories:
Major Risks
1. Employee fatalities or serious injuries during work on or off corpo-
rate premises
2. Contractor fatalities or serious injuries during work on or off corpo-
rate premises
3. Explosion, fire, or other acute incident resulting in significant dam-
age to corporate or third-party property and/or third-party injuries
4. Third-party fatalities in incidents involving corporate vehicles
5. Major transportation accidents involving corporate products (e.g.,
vehicle rollover or product release)
6. Noteworthy product contamination (e.g., contaminated oxygen
intended for medical use)
7. Property damage or business interruption likely to cost $1 million
or more
Serious Risks
1. Lost time injuries or severe injuries without permanent disabilities
2. On-site material release contained with assistance
3. Off-site release with only minor detrimental effects
4. Statutory offense
5. Financial loss between $10,000 and $1 million
6. Media attention garnering local coverage
Minor Risks
1. First aid or medical attention required
2. On-site material release immediately contained
3. Financial loss between $1,000 and $10,000
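For the financial criterion alone, the three categories can be expressed as a simple threshold check. The sketch below uses the example dollar amounts from the lists above; the function name, the treatment of the exact boundary values, and the handling of losses under $1,000 (which the lists do not classify) are assumptions made for illustration.

```python
def financial_risk_category(loss):
    """Map an estimated financial loss in dollars to a risk category.

    Returns None for losses below the lowest example threshold,
    because the text does not assign those to any category.
    """
    if loss < 0:
        raise ValueError("loss must not be negative")
    if loss >= 1_000_000:
        return "major"
    if loss >= 10_000:
        return "serious"
    if loss >= 1_000:
        return "minor"
    return None


for amount in (2_500_000, 75_000, 4_000, 200):
    print(amount, financial_risk_category(amount))
```

In practice the financial test is only one of several criteria in each list, so an event would take the highest category that any of its criteria triggers.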
Risk Priorities
Risk categories are very significant because they are components of a risk assess-
ment. The three risk categories are generic, intended to explain risk in general.
However, most corporations for simplicity use the following priorities (P’s):
P1—Serious risk
P2—High risk
P3—Medium risk
P4—Low risk
When each risk has been assigned an appropriate reduction measure, the
next step is to prioritize the actions. The system for prioritizing and setting
target dates for hazards and risks varies based on type of organization, but a
typical priority system may be:
A priority rating should correspond to the level of risk; the greater the risk,
the higher the priority. Other factors influencing a decision may require thor-
ough reviews and documentation. Only executive management can approve
extensions to P1 and P2 non-conformances. Some non-conformances are too
large for corrective actions to be completed within the scheduled timelines.
Extensions can be granted if a documented corrective action plan (CAP)
showing amended timelines for completion is developed.
Priority may be defined in simpler terms for convenience when a project is
not large or critical in nature. The alternative priority may be set as (1) high,
(2) medium, (3) low, or (4) of no consequence. In addition to these qualitative
measurements, it is very common in HAZOP analysis to use a risk assess-
ment matrix (RAM) as shown in Table 2.3. Table 2.4 lists category definitions
adopted from the Department of the Army.
TABLE 2.3
Risk Assessment Matrix
Columns give the hazard probability; rows give the category definition.
Category | Frequent | Likely | Occasional | Seldom | Unlikely
Catastrophic | E | E | H | H | M
Critical | E | H | H | M | L
Marginal | H | M | M | L | L
Negligible | M | L | L | L | L
Source: Department of the Army (2006). Composite Risk Management FM 5-19 (FM 100-14). Washington, p. 8.
Note: E = extremely high. H = high. M = moderate. L = low.
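The matrix in Table 2.3 lends itself to a direct lookup. The following Python sketch encodes the table as printed; the names `RAM`, `PROBABILITIES`, and `risk_level` are illustrative and do not come from the Army manual.

```python
# Column order of Table 2.3 (hazard probability, left to right).
PROBABILITIES = ("frequent", "likely", "occasional", "seldom", "unlikely")

# One string per category row; each character is the cell for that column.
RAM = {
    "catastrophic": "EEHHM",
    "critical": "EHHML",
    "marginal": "HMMLL",
    "negligible": "MLLLL",
}

LEVELS = {"E": "extremely high", "H": "high", "M": "moderate", "L": "low"}


def risk_level(category, probability):
    """Look up the risk level for a category and a hazard probability."""
    row = RAM[category.lower()]
    column = PROBABILITIES.index(probability.lower())
    return LEVELS[row[column]]


print(risk_level("Critical", "Seldom"))
print(risk_level("Marginal", "Frequent"))
```

An unknown category raises `KeyError` and an unknown probability raises `ValueError`, which is usually preferable to returning a silent default in a safety context.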
TABLE 2.4
Categories of Risk Assessment Matrix
Category | Description
Frequent | Occurs very often, known to happen regularly
Likely | Occurs several times, common occurrence
Occasional | Occurs sporadically, not uncommon
Seldom | Remotely possible, could occur at some time
Unlikely | Probably will not occur, but not impossible
Catastrophic | Complete shutdown of project; death or permanent total disability; loss of major equipment; major property or facility damage; severe environmental damage
Critical | Severe downgrade of project status; permanent partial or temporary total disability; extensive major damage to equipment or systems; significant damage to property or environment
Marginal | Downgrade of project goals; minor damage to equipment, systems, property, or environment; lost days due to injury or illness; minor damage to property or environment
Negligible | Little or no adverse impact on project; first aid or minor medical treatment; slight equipment or system damage, but fully functional or serviceable; little or no property or environmental damage
Reference
Hall, E. (1998). Managing Risk. New York: Addison Wesley.
http://www.hse.gov.uk/risk/theory/alarpglance.htm
Selected Bibliography
Cagno, E., F. Caron, and M. Mancini. (2002). Risk analysis in plant commissioning: the
multilevel HAZOP. Reliability Engineering and System Safety, 77, 309–323.
LaDuke, P. (2013). Bleeding money: how much does safety really cost? Fabricating and
Metal Working, Feb., 18–19.
Linneman, R. and J. Kennell. (1977). Shirt-sleeve approach to long-range plans.
Harvard Business Review, 55, 141.
Schwartz, P. (1996). The Art of the Long View: Paths to Strategic Insight for Yourself and
Your Company. New York: Random House.
Topping, M. (2001). The role of occupational exposure limits in the control of workplace
exposure to chemicals. Occupational and Environmental Medicine, 58, 138–144.
3
Types of Risk Methodologies
There are many techniques to evaluate risk. In fact, the methodologies and
specific tools vary as much as the organizations that use them. In this chapter,
we will introduce some of the most common ones. Risk analysis methodolo-
gies fundamentally fall into three categories:
1. Qualitative methodologies
a. Preliminary risk analysis
b. Hazard and operability (HAZOP) studies
c. Failure mode and effects analysis (FMEA); failure mode and
effects criticality analysis (FMECA)
2. Tree-based techniques
a. Fault tree analysis
b. Event tree analysis
c. Cause–consequence analysis
d. Management oversight risk tree
e. Safety management organization review
3. Techniques for dynamic systems
a. Go method
b. Digraph or fault graph
c. Markov modeling
d. Dynamic event logic analytical methodology
e. Dynamic event tree analysis
Qualitative Methodologies
Preliminary Risk Analysis
Preliminary risk or hazard analysis is a qualitative technique involving a
disciplined analysis of event sequences that could transform a potential
hazard into an accident. The possible undesirable events are identified first
and then analyzed separately. For each undesirable event or hazard, possible
improvements or preventive measures are formulated.
The results provide a basis for determining which categories of hazards
should be examined more closely and which analysis methods are most suit-
able. Such an analysis may also prove valuable in a working environment
where activities lack safety measures that can be readily identified. With the
aid of a frequency or consequence diagram, the identified hazards can then
be ranked according to risk level, allowing measures to prevent accidents to
be prioritized.
TABLE 3.1
Initial FMEA Documentation
Project: Component: Page:
Component description: Date:
Drawing No.
Equipment Affected
Number Failure Mode Detection Method Identification Effects Comments
FIGURE 3.1
Typical flow for generating FMEA.
TABLE 3.2
Typical HAZOP/FMEA Worksheet
FMEA Title: FMEA No.
Project Title: Control No. Issue:
System:
Subsystem:
Component:
SEV= OCC= DET= RPN=
The rationale for this activity is to make sure that the major items of con-
cern are identified and the deviations are accounted for in the analysis of
risk. When the FMEA is extended by a criticality analysis, the technique is
then called failure mode and effects criticality analysis (FMECA). Table 3.3
is a typical combination worksheet. FMEA has gained wide acceptance by
the automotive, aerospace, military, and many service industries. In fact, the
technique has been adapted in other forms such as misuse mode and effects
analysis and failure mode analysis.
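The worksheet fields SEV, OCC, DET, and RPN in Table 3.2 and the P × C = R column in Table 3.3 can be illustrated with a short Python sketch. Two assumptions are made here that this section does not state: the risk priority number is computed by the common convention RPN = severity × occurrence × detection, and each rating is taken on a 1 to 10 scale. The class, the function, and the failure modes are hypothetical examples.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # SEV, assumed 1 (minor) to 10 (worst)
    occurrence: int  # OCC, assumed 1 (rare) to 10 (frequent)
    detection: int   # DET, assumed 1 (easily detected) to 10 (undetectable)

    @property
    def rpn(self):
        """Risk priority number under the assumed S x O x D convention."""
        return self.severity * self.occurrence * self.detection


def fmeca_risk_value(probability, consequence):
    """Risk value R = P x C, as in the FMECA column of Table 3.3."""
    return probability * consequence


modes = [
    FailureMode("seal leak", severity=7, occurrence=4, detection=3),
    FailureMode("sensor drift", severity=5, occurrence=6, detection=8),
]

# Rank failure modes so the highest RPN is addressed first.
ranked = sorted(modes, key=lambda mode: mode.rpn, reverse=True)
for mode in ranked:
    print(mode.name, mode.rpn)
```

Ranking by RPN is only a screening aid; a failure mode with a very high severity rating may deserve attention even when its RPN is modest.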
Advantage
FMEA involves a systematic review of a process. Its detailed methodology
allows an item-by-item assessment of an operation.
Disadvantages
FMEA may be time consuming and expensive. Complex processes will
require investigation of many items, each involving examination of a com-
plex series of failure modes.
FMEA is not effective for identifying hazards due to more than one failure.
It is difficult to combine the effects of multiple failure modes of various items
to identify combined hazards.
It may be difficult to identify every item that should be investigated for
failure modes. Newer equipment may not be well documented, and some
failure modes may be missed.
FMEA requires large amounts of data. The plant or operation needs to
be well established before the technique can be performed, and the various
failure rates for each item must be known.
General Comments
The three techniques just discussed require the employment of hardware-
familiar personnel only. FMEA tends to be more labor intensive because the
failure of each component of a system must be considered. A point to note is
that these qualitative techniques can be used in both the design and opera-
tional stages of a system.
In fact, all the techniques discussed have seen wide use in nuclear
power and chemical processing plants. Furthermore, FMEA, one of the most
documented methods, has been used in the automotive, aerospace, medical
device, electronics, communication, and many other industries to improve the
reliability of processes and products. Figure 3.1 illustrates a typical flow.
Preliminary risk analysis has been applied to safety examination in indus-
tries and on offshore platforms. HAZOP is commonly used in the chemical
industry to obtain detailed failure and effect data from studies of piping and
instrumentation layouts and process and instrumentation layouts.
TABLE 3.3
FMEA and FMECA Worksheet
System:
Subsystem:
Component:
Columns: Activity No. | Equipment or Component Name | Function | Identification No. | Failure Mode | Intermediate Failure Effect | End Failure Effect | Failure Detection | Alternatives or Redundancies | Risk Value P×C=R (for FMECA only) | Comments
Tree-Based Techniques
Fault Tree Analysis (FTA)
A fault tree is a logical diagram that shows the relationship of system failure,
i.e., a specific undesirable event in a system, and failures of the components
of the system. It is a technique based on deductive logic. An undesirable
event is first defined and the causal relationships of the failures leading to
the event are identified. A fault tree can be used in qualitative or quantita-
tive risk analysis. The difference is that the qualitative fault tree has a looser
structure and does not require the same rigorous logic needed for a formal
fault tree.
In essence, fault tree diagrams represent the logical relationships between
subsystem and component failures and how they combine to cause system
failures. The top of a fault tree represents a system event of interest and
is connected by logical gates to component failures known as basic events.
After creating the diagram, failure and repair data are assigned to all system
components. Analysis is performed to calculate the reliability and availabil-
ity parameters of the system and identify critical components. Chapter 6 cov-
ers fault tree analysis.
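As a rough illustration of how logical gates combine basic events in a quantitative fault tree, the Python sketch below evaluates a small tree. It assumes the basic events are statistically independent, and the tree structure and probabilities are hypothetical values chosen for the example.

```python
from math import prod


def and_gate(probabilities):
    """Output occurs only if every input event occurs (independent events)."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output occurs if at least one input event occurs (independent events)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical tree: the top event (loss of cooling) occurs if power is lost
# OR if both redundant pumps fail.
p_power_loss = 0.01
p_pump_a_fails = 0.05
p_pump_b_fails = 0.05

p_both_pumps_fail = and_gate([p_pump_a_fails, p_pump_b_fails])
p_top_event = or_gate([p_power_loss, p_both_pumps_fail])
```

The redundant pumps contribute far less to the top event than the single power supply, which is the kind of insight used to identify critical components.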
General Comments
The tree-based methods are mainly used to determine conditions that lead
to undesired events. In fact, ETA and FTA have been widely used to quantify
the probabilities of accidents and other undesired events leading to the loss
of life or economic losses in probabilistic risk assessment. However, their use
is confined to static logic modeling of accident scenarios. Using ETA and FTA
to analyze hardware failures and human errors does not allow explicit mod-
eling of the conditions affecting human behavior. This affects the assessed
level of dependency of events. Techniques such as human cognitive reliability
have emerged to address these deficiencies of FTA in modeling such responses,
and more will be developed in the future.
This method allows cycles and feedback loops that make it attractive for
dynamic systems.
In practice, Markov models are used as sources of basic event data. In addi-
tion, they also may be analyzed independently of a fault tree analysis. In
essence, Markov modeling is a classical modeling technique used for assessing
the time-dependent behavior of dynamic systems. In Markov chain
processes, transitions between states are assumed to occur only at discrete
points in time. In a discrete-state Markov process, by contrast, transitions
between states are allowed to occur at any point in time. For process systems,
the discrete states can be defined in terms of ranges of process variables and
component status.
This methodology incorporates time explicitly and can be extended to
cover situations in which problem parameters are time dependent. The
dP/dt = MP(t)
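To make the state equation concrete, the sketch below integrates dP/dt = MP(t) for a single repairable component with two states (up and down). Several things here are assumptions made for illustration: M is taken to be the transition-rate matrix whose columns sum to zero, the failure and repair rates are invented, and a fourth-order Runge-Kutta integrator is used simply because it needs no external library.

```python
import math


def two_state_matrix(failure_rate, repair_rate):
    """Transition-rate matrix M for states [up, down]; each column sums to zero."""
    return [[-failure_rate, repair_rate],
            [failure_rate, -repair_rate]]


def derivative(M, P):
    """Right-hand side of dP/dt = M P(t)."""
    n = len(P)
    return [sum(M[i][j] * P[j] for j in range(n)) for i in range(n)]


def integrate(M, P0, t_end, steps=1000):
    """Integrate dP/dt = M P from 0 to t_end with classical 4th-order Runge-Kutta."""
    h = t_end / steps
    P = list(P0)
    for _ in range(steps):
        k1 = derivative(M, P)
        k2 = derivative(M, [p + 0.5 * h * k for p, k in zip(P, k1)])
        k3 = derivative(M, [p + 0.5 * h * k for p, k in zip(P, k2)])
        k4 = derivative(M, [p + h * k for p, k in zip(P, k3)])
        P = [p + h * (a + 2 * b + 2 * c + d) / 6
             for p, a, b, c, d in zip(P, k1, k2, k3, k4)]
    return P


def availability_exact(failure_rate, repair_rate, t):
    """Closed-form probability of the up state, starting in the up state at t = 0."""
    total = failure_rate + repair_rate
    return repair_rate / total + failure_rate / total * math.exp(-total * t)


failure_rate, repair_rate = 0.01, 0.5  # per hour, illustrative values only
M = two_state_matrix(failure_rate, repair_rate)
P = integrate(M, [1.0, 0.0], t_end=10.0)  # P[0] = up, P[1] = down
```

For this two-state case the numerical result can be compared with the closed-form availability, which is a useful sanity check before moving to larger state spaces where no closed form is available.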
General Comments
The techniques discussed above address the deficiencies found in FTA and
ETA methodologies by analyzing dynamic scenarios. However, the tech-
niques have limitations. The digraph and GO techniques model system
behaviors and deal to a limited extent with changes in model structures over
time. Markov modeling requires the explicit identification of possible system
states and the transitions among them. However, it is difficult to envision a
complete set of possible states prior to scenario development.
DYLAM and DETAM can solve the problem by defining implicit state
transitions. The drawbacks to these implicit techniques are implementation-
oriented. Computer resources are required for analysis of the large tree struc-
tures generated by DYLAM and DETAM. Another issue is that the implicit
methodologies may require considerable analyst effort to gather data and
construct models.
These 13 risk analysis techniques address some of the fundamental qual-
itative methodologies although they lack the ability to (1) account for the
dependencies among events, and (2) effectively identify potential hazards
and failures within a system. The tree-based techniques addressed this defi-
ciency by considering the dependencies among events. The probability of
occurrence of an undesired event can be quantified based on operational
data. However, no one (to our knowledge) has yet attempted to quantify the
undesired top event on a MORT tree.
Current research has utilized DYLAM and DETAM to study accident scenar-
ios by treating time, process variables, system behaviors, and operator actions
through an integrated framework. These techniques address the problem of
less-than-adequate modeling of conditions affecting control system actions
and operator behavior (e.g., behavior of plant process variables, decisions by
operating personnel) when using FTA and ETA. However, two drawbacks for
these techniques are the needs for extensive computer resources and large
data collections. The development of more efficient algorithms and more
powerful computer processors allows these methods to be applied more widely.
Traditional Methodologies
The methods described above are powerful ways of addressing risk and
HAZOP issues. However, other less demanding approaches may also be
used to identify these risks and the more typical and traditional ones are
described below.
What-If Method
The what-if method (the term is hyphenated and the question mark is omit-
ted in the OSHA regulation) is the least structured hazard analysis tech-
nique and requires the least time. A what-if analysis is conducted by a team
of experienced analysts, engineers, and operations experts whose knowledge
and experience equip them to identify scenarios in such a
way that hazards may be eliminated or minimized. It is important to note
that the team is relatively unstructured. The success of the analysis depends
on the (1) knowledge, (2) thinking processes, (3) experiences, and (4) attitudes
of the team members. This loosely defined approach allows the team mem-
bers to be creative and to expand ideas for appropriate resolutions. In other
words, the loose structure allows out-of-the-box thinking.
However, despite the lack of structure, all team members must prepare
appropriately and thoroughly before they meet for discussion. This prepara-
tion permits and requires all members to participate actively and understand
the issues at hand. Without this preparation, the team will end up discussing
important issues rather than evaluating the findings of the team members
in such a way that a resolution through consensus is agreed upon. Typical
issues that can be discussed during a review include:
1. Divide the facility into nodes, similar to a HAZOP, except that the
nodes are typically larger and more loosely defined. An example of
node separation is shown in Figure I.3.
2. Organize the analysis by major items of equipment like an FMEA
and then discuss the types of failure modes for each item.
Let us examine each approach separately. We begin with the node analysis
and follow with guidelines for utilities, batch processes, operating proce-
dures, and equipment layout.
Nodal analyses are usually organized around major sections of a process
such as a distillation column or a launching system. Team members ask
what-if questions such as: What if there is high pressure? What if the opera-
tor forgets to do this? What if there is an external fire in this area? Using this
approach, many of the individuals on the team will probably instinctively
follow the HAZOP guideword approach. Consequently, a what-if analysis of
this type may take the form of a faster-than-normal HAZOP. However, the
scribe of the team will not need to take notes on every deviation guideword.
Only meaningful discussions should be recorded. What-if discussions tend
to jump from node to node more than normal in a HAZOP analysis, thus
placing greater pressure on the leader and scribe to achieve results and arrive
at relevant conclusions. Some what-if questions for a nodal analysis are:
• Pressure vessels
• Pumps
• Compressors
• Distillation columns
• Absorbers
• Storage tanks
• Vents
• Flares
• Piping systems
The concept is to use the what-if questions to deal with issues such as leaks
and over-pressure related to specific equipment types. In the case of utility
systems, the analysis of steam headers and instrument air systems can be
difficult because the locations of nodal boundaries are not always clear. A
discussion that starts in one area can roam far and extend almost across an
entire facility. It is very important to recognize that utility systems involve
large numbers of process interfaces, any of which may leak.
Sometimes a leak will be from a utility into a process, and sometimes
from a process into a utility. The source of a problem can be difficult to detect
in either situation. To optimize detection, one way of analyzing a system is
for the team leader and scribe to note potential interface problems as they are
discussed during the process analysis. These notes can then be discussed by
the group when the utilities are analyzed.
In the case of batch processes, hazard analysis methodologies were devel-
oped initially for large continuous operations such as petrochemical plants
and refineries. However, as we learned more about the methodology, we
applied the HAZOP principles to smaller organizations, especially those
that utilize batch processes in pharmaceutical production and food process-
ing. In some cases, we apply the batch process principles to continuous oper-
ations with internal batch operations such as truck loading and unloading.
The operation of a batch process is dynamic and time is a variable.
Therefore, the analysis of a batch process is more complex than analyzing
a steady-state operation. One way of handling this additional complexity
is to systematically work through the operating procedures using a what-if
approach in which deviation guidewords prompt questions. For example,
if an instruction is to add 200 liters of water to V‑200, the team might ask:
Question Guideword(s)
What if the vessel is over-filled? High level
What if the liquid is not water? Contamination
What if fewer than 200 liters are available? Low flow
What if V-200 is over-pressured? High pressure
What if the water is added too soon? High flow
What if the water is added too late? Low flow
What if the step is omitted altogether? Low flow
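The question and guideword pairs for this step can be recorded in a simple structure so the scribe can see which deviations have been covered. The Python sketch below is only one possible way to do this; the variable and function names are assumptions made for the example.

```python
from collections import defaultdict

# Question/guideword pairs taken from the worksheet for "add 200 liters of water to V-200"
WHAT_IF_ROWS = [
    ("What if the vessel is over-filled?", "High level"),
    ("What if the liquid is not water?", "Contamination"),
    ("What if fewer than 200 liters are available?", "Low flow"),
    ("What if V-200 is over-pressured?", "High pressure"),
    ("What if the water is added too soon?", "High flow"),
    ("What if the water is added too late?", "Low flow"),
    ("What if the step is omitted altogether?", "Low flow"),
]


def group_by_guideword(rows):
    """Collect the what-if questions raised under each guideword."""
    grouped = defaultdict(list)
    for question, guideword in rows:
        grouped[guideword].append(question)
    return dict(grouped)


grouped = group_by_guideword(WHAT_IF_ROWS)
```

Grouping the rows this way shows at a glance that three of the seven questions for this step fall under the low flow guideword.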
After the discussion for this step is complete, the team can analyze the
next step of the operating procedure (OP). Other issues for discussion are
(1) whether a step is done early; (2) whether a step is done late; and (3) whether
a step was omitted. OPs represent another way of looking at hazards and
evaluating them appropriately. OPs are sometimes called standard operat-
ing procedures (SOPs). They identify specific procedures needed to achieve
a particular task. It is common in hazard analysis to evaluate procedures for
completeness and accuracy. If the procedures are not complete or accurate a
hazard may result, especially if the procedure is not reflected in the totality
of the process system. A what-if approach is an effective method of conduct-
ing such an analysis. The team works through each step of the procedure by
asking a series of what-if questions:
Checklist
The checklist method uses a set of prepared questions to stimulate thinking,
often in the form of a what-if discussion. The questions are developed by
experts experienced with hazards analysis and the design, operation, and
maintenance of process facilities. Checklists are never all-inclusive because
no one can predict all options and hazards. As a result of this limitation, no
hazard analysis method can claim to be foolproof or to
foresee all hazards. Although this is a major limitation of the method,
TABLE 3.4
Topics for Generating Checklist Questions
Checklist Question Topic: Items of Concern
Equipment: Pumps; Compressors; Pressure vessels; Storage tanks; Piping; Valves
Utilities: Steam (various pressure levels); Cooling water; Refrigerated water; Process or service water; Instrument air; Service air; Boiler feed water; Nitrogen; Other utility gases; Fuel gas; Natural gas; Electrical power
Pressure relief: Relief valves; Rupture disks; Flares and flare headers
Control loops: Emergency loops
Emergency systems: Fire water; Firefighting equipment; External fire equipment; Runaway reaction prevention
Human factors: Operating procedures; Training
Chemicals: Toxicity; Flammability; Corrosivity
Instruments and controls: Local instruments; Board-mounted instruments; Distributed control system (DCS)
TABLE 3.5
Sample Chemical Storage Checklist
Company
Facility
Location
Persons interviewed Name: Title: Date:
Notes
Yes, No, or
# Question Not Applicable Notes
1 Are chemicals separated according to the
following categories?
Solvents, including flammable and combustible
liquids, and halogenated hydrocarbons
Inorganic mineral acids (nitric, sulfuric,
hydrochloric, and acetic acids); bases (sodium
hydroxide, ammonium hydroxide)
Oxidizers
Poisons
Explosives or unstable reactives
2 Are caps and lids on all chemical containers
tightly closed to prevent evaporation of
contents?
3 Are material safety data sheets (MSDSs)
provided for all chemicals at the facility?
4 Are hazardous chemicals purchased in the
smallest quantities possible?
5 Are the MSDSs readily accessible?
6 Is a Hazardous Materials Team in place?
7 Are all chemicals properly logged in on receipt?
8 Is a list of chemicals on hand maintained at all
times?
storage. The top section provides information as to how the checklist will be
used. The company, facility, and location are identified. If some of the infor-
mation for the checklist answers comes from discussions and interviews
with personnel at the site, their names are noted. The titles of all documents
reviewed are also entered in the top section. The bottom section lists the
questions. The answer choices are yes, no, or not applicable. Discussions and
background data may be entered in the Notes column.
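The checklist layout described above (a header section followed by questions answered yes, no, or not applicable) maps naturally onto a small data structure. The Python sketch below is a hypothetical representation: the class and field names, the company details, and the recorded answers are assumptions made for illustration, and only the question wording comes from Table 3.5.

```python
from dataclasses import dataclass, field
from enum import Enum


class Answer(Enum):
    YES = "Yes"
    NO = "No"
    NOT_APPLICABLE = "Not applicable"


@dataclass
class ChecklistItem:
    number: int
    question: str
    answer: Answer
    notes: str = ""


@dataclass
class Checklist:
    # Top section: where and how the checklist was used
    company: str
    facility: str
    location: str
    persons_interviewed: list = field(default_factory=list)
    documents_reviewed: list = field(default_factory=list)
    # Bottom section: the questions
    items: list = field(default_factory=list)

    def open_findings(self):
        """Return the questions answered 'No', which need follow-up."""
        return [item for item in self.items if item.answer is Answer.NO]


checklist = Checklist(
    company="Example Co.",          # hypothetical header values
    facility="Warehouse 2",
    location="Unit B",
    items=[
        ChecklistItem(3, "Are MSDSs provided for all chemicals at the facility?", Answer.YES),
        ChecklistItem(5, "Are the MSDSs readily accessible?", Answer.NO, "Binder kept in locked office"),
        ChecklistItem(6, "Is a Hazardous Materials Team in place?", Answer.NOT_APPLICABLE),
        ChecklistItem(8, "Is a list of chemicals on hand maintained at all times?", Answer.NO),
    ],
)
findings = checklist.open_findings()
```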
Indexing
Comparative risk levels can be evaluated using indexing methods. Each
design is scored on a variety of factors contributing to overall risk. For
example, a design that uses highly toxic chemicals will score negative
points, whereas a facility located away from populated areas receives posi-
tive points. Credit is also provided for the use of control and mitigation mea-
sures. Three commonly used indexing methods are:
1. The Dow Fire and Explosion Index described in Dow’s Fire &
Explosion Index Hazard Classification Guide, 7th ed. (1994). American
Institute of Chemical Engineers.
2. The Dow Chemical Exposure Index described in Dow’s Chemical
Exposure Index Guide. (1998). American Institute of Chemical Engineers.
3. The Pipeline Risk Management Index. In Muhlbauer, W. Pipeline
Risk Management Manual, 3rd ed. (2003) Maryland Heights, MO. Gulf
Professional Publishing.
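The general idea of index scoring (negative points for risk factors, positive points for favorable factors and for control and mitigation measures) can be sketched as follows. This is a generic illustration and does not reproduce the Dow or pipeline indices listed above; the factor names and point values are invented for the example.

```python
def net_index(credits, penalties):
    """Net score for one design: credits add points, penalties subtract them."""
    return sum(credits.values()) - sum(penalties.values())


# Hypothetical designs and point values
design_a = {
    "credits": {"remote location": 10, "automatic shutdown system": 15},
    "penalties": {"highly toxic inventory": 30},
}
design_b = {
    "credits": {"remote location": 10},
    "penalties": {"low-toxicity inventory": 5},
}

score_a = net_index(**design_a)
score_b = net_index(**design_b)
preferred = "B" if score_b > score_a else "A"
```

Because the scores are only comparative, the useful output is the ranking of the designs rather than the absolute numbers.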
FIGURE 3.2
Interconnectivity among four operating units (Blocks 1 through 4). (Source: Ian Sutton. http://www.stb07.com/process-safety-management/process-hazards-analysis.html. With permission.)
chain reaction that caused many other offshore platforms in the complex to
shut down in sequence. In the end, many millions of dollars of production
were lost, and the company was lucky no safety or environmental incident
occurred. Because management and the technical staff had not conducted an
interface hazards analysis, they did not understand the interactions of vari-
ous operating units.
Another example of interface operations concerns truck operations. Many
process facilities use trucks from third-party companies to deliver chemicals
and export products and waste streams. It is generally a good idea to invite a
representative of a trucking company to a pertinent process hazards analysis.
That way both parties can be assured that the chances of mishaps are small.
The process facility, for example, can evaluate the procedures to ensure that
delivered chemicals are what they should be; the trucking company repre-
sentative can check for the possibility of reverse flow of process chemicals
onto his company’s trucks.
Sutton also reports that no established methodology exists for analyzing
system connectivity—for conducting what is in effect an IHA. Such a system
can be viewed as a collection of black boxes; each black box represents an
operating unit, each of which has been thoroughly analyzed individually.
Furthermore, Sutton (2010) has shown (Figure 3.2) a system consisting of
four operating units, each of which can be connected to all the others in some
manner, except that there is no link between Block 2 and Block 4. The arrows
that point two ways indicate that connectivity problems can flow in either
direction. For a system containing N blocks there are N(N – 1)/2 pairs of
blocks, so the total number of potential connections is 2 × 3 × N(N – 1)/2.
The 2 represents a two-way connection. The 3 represents the three types of
connections (process fluids, instrument signals, and people noted below).
In the case of Figure 3.2, N = 4 gives six pairs, so the total of potential
interfaces is 2 × 3 × 6 = 36 (30 if the missing connection between Blocks 2
and 4 is excluded).
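The interface count can be checked by enumerating the block pairs directly. The sketch below is a minimal illustration; the function name and the way a missing link is passed in are assumptions made for this example.

```python
from itertools import combinations

CONNECTION_TYPES = ("process fluids", "instrument signals", "people")
DIRECTIONS = 2  # problems can flow either way across a link


def potential_interfaces(blocks, missing_links=()):
    """Count potential interfaces: directions x connection types x linked pairs."""
    missing = {frozenset(link) for link in missing_links}
    linked_pairs = [pair for pair in combinations(blocks, 2)
                    if frozenset(pair) not in missing]
    return DIRECTIONS * len(CONNECTION_TYPES) * len(linked_pairs)


blocks = [1, 2, 3, 4]
fully_connected = potential_interfaces(blocks)
without_2_to_4 = potential_interfaces(blocks, missing_links=[(2, 4)])
```

With four blocks this reproduces the totals of 36 and 30 quoted for Figure 3.2, and it shows how quickly the number of interfaces grows as operating units are added.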
An interface hazards analysis (IHA) normally covers three areas:
One way of conducting the analysis is with the what-if approach. A hazards
analysis team can use a flowchart of the overall process to ask what-if ques-
tions such as:
How do we know?
What is the consequence?
Are the safeguards adequate?
What is the effect of an upset on other units?
It is important not to draw too sharp a line between the methods. Indeed,
the more experience a person gains in conducting and leading hazard anal-
yses, the more the techniques seem to merge with one another. No single
method is inherently better than any of the others. They all have their
benefits and specific uses. A very good discussion of interfaces appears in Sutton
(2010), Chapters 3 and 4 (and at www.stb07.com/process-safety-management/process-hazards-analysis.html).
References
Ericson, C. II. (2005). Hazard Analysis Techniques for System Safety. New York: John
Wiley & Sons, Inc.
http://www.stb07.com/process-safety-management/process-hazards-analysis.html
Paté-Cornell, E. (1993). Risk analysis and risk management for offshore platforms:
lessons learned from the Piper Alpha accident. Journal of Offshore Mechanics and
Arctic Engineering, 115, 179–190.
Stamatis, D. (2003). Failure Mode and Effect Analysis (FMEA): From Theory to Execution.
Milwaukee, WI: Quality Press.
Sutton, I. (2010). Process Risk and Reliability Management. New York: Elsevier.
4
Preliminary Hazard Analysis (PHA)
The material in this chapter should help a design team perform a prelimi-
nary hazard analysis (PHA). PHA is a design tool that helps engineers
identify and deal with hazards in the initial stages of design. Performing a
PHA allows engineers and management to better recognize and correct the
hazards associated with designs for plants, units, and/or equipment. For an
overview of a PHA, see Figure 4.1. Specifically, a PHA is a qualitative analy-
sis performed in the earliest stages of design primarily to:
1. Identify all potential hazards and accidental events that may lead to
an accident
2. Rank the identified accidental events according to their severity
3. Identify required hazard controls and follow-up actions
4. Formulate appropriate measures to deal with hazards
1. Eliminate hazard
2. Control hazard with design methods
3. Incorporate safety devices to control hazard
4. Provide warning devices if hazard materializes
5. Provide procedures and training for operators
Other approaches may also be used to evaluate hazards. The most com-
mon ones are (1) rapid risk ranking and (2) hazard identification (HAZID).
Although a PHA is conducted very early in the design, the subsequent ben-
efits warrant the effort. The three major benefits are:
FIGURE 4.1
PHA overview.
• Hazardous components
• Safety-related interfaces among various system elements, includ-
ing software
• Environmental constraints, including operating environments
• Operating conditions, testing, maintenance, built-in-tests, diagnos-
tics, and emergency procedures
• Facilities, real property, installed equipment, support equipment,
and training
• Safety-related equipment, safeguards, and possible alternate
approaches
• Malfunctions to systems, subsystems, or software
TABLE 4.1
Typical Severity and Probability Classifications
Hazard Severity Classification: Description
Catastrophic: Causes multiple injuries, fatalities, or loss of facility
Critical: May cause severe injury, severe occupational illness, or major property damage
Marginal: May cause minor injury, minor occupational illness resulting in lost workdays, or minor property damage
Negligible: May not affect safety or health of personnel, but violates a safety or health standard
Accident Probability Classification: Description
Probable: Likely to occur immediately or within a short time
Reasonably probable: Probably will occur
Remote: Possibly may occur
Extremely remote: Unlikely to occur
The process of conducting a PHA is very simple and utilizes five steps:
As important as these steps are, they are meaningless unless they are associ-
ated with severity and probability data for each event. Table 4.1 depicts com-
mon associations of severity and probability.
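Once each accidental event carries a severity class and a probability class from Table 4.1, ranking the events is mechanical. The Python sketch below orders events by severity first and uses probability to break ties; that tie-break rule and the example events are assumptions made for illustration, since the table itself does not prescribe how the two classifications are combined.

```python
from enum import IntEnum
from typing import NamedTuple


class Severity(IntEnum):
    NEGLIGIBLE = 1
    MARGINAL = 2
    CRITICAL = 3
    CATASTROPHIC = 4


class Probability(IntEnum):
    EXTREMELY_REMOTE = 1
    REMOTE = 2
    REASONABLY_PROBABLE = 3
    PROBABLE = 4


class AccidentalEvent(NamedTuple):
    description: str
    severity: Severity
    probability: Probability


def rank_events(events):
    """Sort events from worst to least severe; probability breaks ties (assumed rule)."""
    return sorted(events, key=lambda e: (e.severity, e.probability), reverse=True)


# Hypothetical PHA entries
events = [
    AccidentalEvent("Hose whip during transfer", Severity.MARGINAL, Probability.PROBABLE),
    AccidentalEvent("Vessel rupture", Severity.CATASTROPHIC, Probability.REMOTE),
    AccidentalEvent("Relief valve set too high", Severity.CRITICAL, Probability.REASONABLY_PROBABLE),
    AccidentalEvent("Toxic release to control room", Severity.CATASTROPHIC, Probability.EXTREMELY_REMOTE),
]
ranked_events = rank_events(events)
```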
The simplicity and qualitative approach of PHA make it difficult to determine
what kind of hazards should be evaluated under its rubric. The following
list of common sources may be helpful in making that determination:
Obviously, the list is not exhaustive, but it should help readers determine
areas that present potential hazards. Table 4.2 lists additional categories that
may aid in developing a checklist and finding additional sources of hazards.
Table 4.3 is a PHA worksheet based on Hammer (1989, p. 555). Table 4.4
presents a preliminary hazard matrix to help identify potential failures
(Vincoli 1993, p. 68).
Information may be presented in many ways. In addition to the PHA worksheet,
a team may use the format in Table 4.5, which is convenient and easy to use,
especially for documenting brainstorming activities. After a PHA is complete,
a team should consider at least four post-PHA design activities:
TABLE 4.2
Typical Source for PHA Checklist
System Operation:
Evaluator: Date:
Category: Items
Electrical: Shock; Burn; Overheating; Ignition of combustibles; Inadvertent activation; Power outage; Distribution feedback; Unsafe failure to operate; Explosion, electrical (electrostatic); Explosion, electrical (arcing)
Mechanical: Sharp edges or points; Rotating equipment; Reciprocating equipment; Pinch points; Lifting weights; Stability and toppling potential; Ejected parts and fragments; Crushing surfaces
Pneumatic and hydraulic pressures: Overpressurization; Pipe, vessel, or duct rupture; Implosion; Mislocated relief device; Dynamic pressure loading; Improperly set relief pressure; Back flow; Cross flow; Hydraulic ram; Inadvertent release; Miscalibrated relief device; Blown objects; Pipe or hose whip; Blast
Acceleration, deceleration, gravity: Inadvertent motion; Loose object translation; Impacts; Falling objects; Fragments or missiles; Sloshing liquids; Slips and trips; Falls
Temperature extremes: Heat source or sink; Hot or cold surface burns; Pressure elevation; Confined gas or liquid; Elevated flammability; Elevated volatility; Elevated reactivity; Freezing; Humidity or moisture; Reduced reliability; Altered structural properties (e.g., embrittlement)
Ionizing radiation: Alpha; Beta; Neutron; Gamma; X-ray
Fire and flammability factors present: Fuel; Ignition source; Oxidizer; Propellant
Nonionizing radiation: Laser; Infrared; Microwave; Ultraviolet
Explosives and effects: Mass fire; Blast overpressure; Thrown fragments; Seismic ground wave; Meteorological reinforcement
Explosive initiators: Heat; Friction; Impact or shock; Vibration; Electrostatic discharge; Chemical contamination; Lightning; Stray welding sparks
Explosive sensitizers: Heat or cold; Vibration; Impact or shock; Low humidity; Chemical contamination
Explosive conditions: Explosive propellant; Explosive gas; Explosive liquid; Explosive vapor; Explosive dust
Materials arising from leaks and spills: Liquids or cryogens; Gases or vapors; Irritating dusts; Radiation sources; Flammable; Toxic; Reactive; Corrosive; Slippery; Odorous; Pathogenic; Asphyxiating; Flooding; Run-off; Vapor propagation
Chemical and water contamination: System cross connection; Leaks and spills; Vessel, pipe, or conduit rupture; Backflow or siphon effect
Physiological: Temperature extremes; Nuisance dust or odor; Barometric pressure extreme; Fatigue; Lifted weights; Noise; Vibration (Raynaud's syndrome); Mutagens; Asphyxiants; Allergens; Pathogens; Radiation; Cryogens; Carcinogens; Teratogens; Toxins; Irritants
Human factors: Operator error; Inadvertent operation; Failure to operate; Early or late operation; Out-of-sequence operation; Right operation, wrong control; Operated too long; Operated too briefly
Ergonomic: Fatigue; Inaccessibility; Inadequate or no kill switches; Glare; Inadequate control or readout differentiation; Inappropriate control or readout location; Faulty or inadequate control readout labeling; Faulty workstation design; Inadequate or improper illumination
Control systems: Power outage; Electromagnetic or electrostatic interference; Moisture; Sneak circuit; Sneak software; Lightning strike; Grounding failure; Inadvertent activation
Unannunciated utility outage: Electricity; Steam; Heating or cooling; Ventilation; Air conditioning; Compressed air or gas; Lubrication drains and sumps; Fuel; Exhaust
Common causes: Utility outage; Moisture and humidity; Temperature extremes; Seismic disturbance or impact; Vibration; Flooding; Dust and dirt; Faulty calibration; Fire; Single-operator coupling; Location; Radiation; Wear; Maintenance error; Vermin, varmints, mud daubers
Contingencies (emergency responses by systems or operators to unusual events): Hard shutdown or failure; Freeze; Fire; Windstorm; Hailstorm; Utility outage; Flood; Earthquake; Snow and ice load
Mission phasing: Transport; Delivery; Installation; Calibration; Checkout; Shakedown; Activation; Standard start; Emergency start; Normal operation; Load change; Coupling and uncoupling; Stressed operation; Standard shutdown; Emergency shutdown; Diagnosis and troubleshooting; Maintenance
Note: No hazards checklist should be considered complete. Every list should be enlarged as
experience and specific applications require. This list intentionally contains redundant
entries because redundancy may lead to a different point of view in an open discussion.
TABLE 4.3 (PHA worksheet) column headings: Number | Hazard | Causes | Effects | Mode | IMRI | Recommended Action | FMRI | Comments | Responsible Individual | Status
TABLE 4.4
Preliminary Hazard Matrix
Potential failure areas (columns): Structural | Procedural | Electrical | Mechanical | Pressure | Leakage/Spill
Hazard groups (rows): Collision/mechanical damage; Loss of habitable atmosphere; Corrosion; Contamination; Electric shock; Fire; Pathological impact; Psychological impact; Temperature extreme; Radiation; Explosion
System Operator:
Evaluator: Date:
TABLE 4.5
Typical PHA Brainstorming Record
Columns: Project Component | Incident Type | Scenario | Proposed Treatment Measure | Likelihood | Consequence | Risk
TABLE 4.6
Typical PHA Report
Title: Report no.:
Equipment or system: Report date:
Close-out date: Authorized signature:
Authorizing individual:
Description of hazard and accident that may result:
Events and conditions that may contribute to hazard or accident:
Possible means to eliminate or control hazard or accident effects:
Estimated probability of accident occurrence:
Current condition with control:
• Frequent
• Reasonably probable
• Occasional
• Remote
• Extremely improbable
Explain choice:
Means of verifying adequacy of control or applicable safety requirements:
Organization or person to take action:
Status of action already taken or to be taken:
A typical PHA will consider any hazards associated with a design during
the earliest stages of the design process. Two concerns are:
The intent is to design a safe product. A PHA will help ensure that:
The next step is establishing initial (or revised, if applicable) design and
procedural requirements to eliminate or control the hazards. Post-PHA activities
involve establishing procedures to ensure that hazard elimination or control
measures are effectively incorporated into a design.
A hazard report (see Table 4.6) may be created for each new hazard iden-
tified during the design process. The report may be used to track a hazard
through the design process to ensure that appropriate measures are incor-
porated into the design to eliminate or adequately control the hazard. Of
course, the ability of the design to eliminate or control every identified haz-
ard must be verified by test results. The report may be signed off only after a
design effectively eliminates or adequately controls the hazard.
PHA Limitations
A PHA will only be as effective as the design team’s ability to recognize haz-
ards. If a hazard is not recognized, the PHA will be no help in minimizing it.
A PHA does not effectively account for interactions among hazards.
References
Hammer, W. (1989). Occupational Safety Management and Engineering, 4th ed. Englewood
Cliffs, NJ: Prentice Hall.
Vincoli, J. (1993). Basic Guide to System Safety. New York: Van Nostrand Reinhold.
Selected Bibliography
Hoxie, W. (2003). Preconstruction risk assessments. Professional Safety, 48, 50–53.
Smith, K. and D. Whittle. (2001). Six steps to effectively update and revalidate PHAs.
Chemical Engineering Progress, 97, 70–77.
5
HAZOP Analysis
Overview
Hazard and operability (HAZOP) analysis is a structured and systematic
technique for system examination and risk management. It is often used to
identify potential hazards in a system and operability problems likely to
lead to non-conforming products. HAZOP is based on a theory that assumes
risk events are caused by deviations from design or operating intentions.
Identification of such deviations is facilitated by using guidewords as a sys-
tematic list of deviation perspectives. This approach is a unique feature of
the HAZOP methodology that helps stimulate the imaginations of team
members when exploring potential deviations. As a risk assessment tool,
HAZOP is often described as:
• A brainstorming technique
• A qualitative risk assessment tool
• An inductive risk assessment tool; a bottom-up risk identification
approach in which success relies on the ability of subject matter
experts (SMEs) to predict deviations based on past experiences and
their general expertise
(not replace or repeat) the guidance available from IEC International Standard
61882. This discussion will focus on the basic principles of HAZOP and fol-
low with a detailed analysis of the principles and the execution of a typi-
cal analysis.
Definitions
HAZOP methodology requires understanding of the following basic
definitions:
The above are generally considered default definitions. Readers should under-
stand that different industries and organizations may utilize additional defini-
tions and/or different interpretations. For example, the following definitions
may be found in the marine industry:
Process
Minimum Requirements
Performance of a HAZOP in any environment requires the following six
items at a minimum:
Defining Risk
We already discussed the formal definition of risk and its derivatives in
Chapter 1. However, when developing a HAZOP, we must keep in mind that
risk is the product of probability of occurrence and consequence. Therefore, a
HAZOP team must be able to differentiate and define both the consequence
and probability categories.
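Because risk is treated here as the product of probability and consequence, a team can rank its categories numerically and multiply the ranks. The sketch below is illustrative only. The five probability labels are borrowed from the PHA hazard report form in the previous chapter; the consequence labels, the 1 to 5 rankings, and the function name `risk_score` are assumptions made for this example and are not values given in this book.

```python
# Assumed 1-5 ranking of the probability categories (labels from the PHA form).
PROBABILITY_RANK = {
    "Extremely improbable": 1,
    "Remote": 2,
    "Occasional": 3,
    "Reasonably probable": 4,
    "Frequent": 5,
}

# Hypothetical consequence categories; a real HAZOP team defines its own.
CONSEQUENCE_RANK = {
    "Negligible": 1,
    "Minor": 2,
    "Moderate": 3,
    "Major": 4,
    "Catastrophic": 5,
}


def risk_score(probability: str, consequence: str) -> int:
    """Return risk as the product of the probability rank and the consequence rank."""
    return PROBABILITY_RANK[probability] * CONSEQUENCE_RANK[consequence]
```

With these assumed rankings, `risk_score("Occasional", "Major")` evaluates to 3 × 4 = 12. The point of the sketch is that both category sets must be defined before any score can be computed.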
TABLE 5.1
A Simple Evaluation Method to Risk
Personnel
Exposure Delay Value Robustness Suitability
Qualification Replacement time Structural Type of operation Type of operation
and experience and cost strength Previous Previous
of personnel Repair possibilities and experience experience
Organization Number of interfaces robustness Installation Installation
Required and contractors or Operation ability ability
presence subcontractors method Equipment used Equipment used
Shift Project development Novelty Margins, Margins,
arrangements period and robustness, robustness,
Deputy and Existing field feasibility condition, condition,
back-up infrastructure maintenance maintenance
arrangements Infrastructure as Previous Previous
Overall project applicable experience experience
particulars Handled object
Trigger Events
Typical events that call for HAZOP analysis are:
• Human failure
• Equipment, instrument, or component failure
• Supply failure
Use of Analysis
There are two schools of thought as to when a HAZOP should be performed
and how. The first involves a series of activities that include:
The second approach uses HAZOP as a design tool early in the design
stage, depending on project needs and timeframe. The concept engineering
or basic engineering stage may be delayed if this study is included at
that stage. In my experience, a HAZOP study is done after all the required
documents are available, but it is always better to identify hazards early in
the design stage. HAZOP is performed in two stages as shown in Figure 5.1
and Figure 5.2. The first stage takes place after finalization of the process,
that is, after the first P&ID is issued. The second stage occurs during detail
engineering and should involve all package suppliers.
In the final analysis, a HAZOP is best suited for assessing hazards in facili-
ties, equipment, and processes and is capable of assessing systems from mul-
tiple perspectives. The assessment is based on the principle of prevention.
For example, in a design situation—an early activity—the focus is on:
FIGURE 5.1
P&ID of feed section of process. (Labels: restriction plate; line to upstream plant, used in start-up only.)
FIGURE 5.2
Revised P&ID of feed section of process. (Labels: restriction plate; Hi and Lo LA; PG; LI; FIC; line to upstream plant, used in start-up only.)
• Brainstorming methodology
• Systematic and comprehensive methodology (logical approach)
• A simpler and more intuitive method than other common risk management tools
• An interface of HAZOP with other risk management tools such as HACCP
HAZOP Process
All processes targeted for prevention, including HAZOP analysis, have their
own approaches for developing solutions to eliminate or minimize hazards
from workplaces. The typical HAZOP is conducted in four steps as summa-
rized in Table 5.2.
Definition
This phase typically begins with preliminary identification of risk assess-
ment team members. HAZOP is intended to be a cross-functional team effort
and relies on subject matter experts (SMEs) from various disciplines with
appropriate skills and experience who display intuition and good judgment.
TABLE 5.2
HAZOP Steps
Step                          Specific Function
Definition                    Define scope and objectives
                              Define responsibilities
                              Select team
Preparation                   Plan study
                              Collect data
                              Agree on style of recording
                              Estimate time
                              Arrange schedule
Examination                   Divide system into parts
                              Select part and define design intent
                              Identify deviations by using guidewords on each element
                              Identify consequences and causes
                              Identify whether a significant problem exists
                              Identify protection, detection, and indicating mechanisms
                              Identify possible remedial or mitigating measures (optional)
                              Agree on actions
                              Repeat for each element and then each part
Documentation and follow-up   Record examination
                              Sign off documentation
                              Produce report of study
                              Follow up to ensure actions are implemented
                              Restudy any parts of system if necessary
                              Produce final output report
SMEs should be carefully chosen to cover all relevant experiences and func-
tions. All meetings should be conducted in a positive atmosphere, and all
members should be able to contribute their ideas without fear of retaliation.
During this phase, the risk assessment team must carefully identify and
agree on the scope in order to focus their efforts. They must define the proj-
ect boundaries, key interfaces, and important assumptions that determine
the direction of the assessment.
Preparation
This phase typically includes:
Examination
This phase begins with identification of all elements, parts, or steps of the
system or process to be examined. A process flow diagram is a good tool for
this purpose. It allows physical systems to be broken down into smaller parts
as necessary. Processes may be broken down into discrete steps or phases.
Similar parts or steps may be grouped to facilitate assessment.
• Regulatory requirements
• Need for more explicit risk rating or prioritization (e.g., rating deviation probabilities, severities, and/or detection)
• Company documentation policies
• Needs for traceability or audit readiness
• Other factors
Finally, before the detailed HAZOP takes place, the process (or pipeline, if
relevant) and instrumentation diagram (P&ID) must be developed during
the last stage of process design. A P&ID is a schematic of functional rela-
tionships of piping, instrumentation, and equipment components, without
which a process cannot be designed adequately. A P&ID:
Detailed Analysis
The last section provided a general overview of HAZOP. This section will
develop the process with more detail. A HAZOP examination systematically
questions every part of a process or operation to discover qualitatively how
deviations from normal operation can occur and whether further protective
measures, altered operating procedures, or design changes are required.
Figure 5.3 shows the HAZOP procedure in a flowchart characterization.
The examination procedure uses a full description of the process and will
almost invariably include a P&ID or equivalent. It systematically questions
FIGURE 5.3
HAZOP procedure flow. (Source: HIPAP 8. 2011. With permission)
[Flowchart, illustrated for the MORE FLOW deviation: select line; is the deviation hazardous or does it prevent efficient operation? (if no, consider other causes of MORE FLOW); will the operator know that there is MORE FLOW? (if no, consider and specify mechanisms for identification of the deviation); what change in plant or methods will prevent the deviation, make it less likely, or protect against the consequences? (otherwise consider other changes or agree to accept the hazard); agree change(s) and who is responsible for action.]
FIGURE 5.4
Operation with deviations. (Normal operation shown with Deviations 1 through 4.)
every part of the process to determine how deviations from the intent of the
design can occur and determine whether they will give rise to hazards.
The questioning is sequentially focused around guidewords that are derived
from a team discussion using Q&A or other investigative techniques. The
guidewords ensure that the questions posed to test the integrity of each part
of the design will explore every conceivable way in which operation could
deviate from the design intent.
Some of the causes may be so unlikely that the derived consequences will
be rejected as not meaningful. Some of the consequences may be trivial and
need to be considered no further. However, some deviations have conceiv-
able causes and potentially serious consequences (see Figure 5.4). The poten-
tial problems are then noted for remedial action. The immediate solution to
a problem may not be obvious and may need further consideration by a team
member or perhaps a specialist. All decisions must be recorded.
Traditionally, recorders were designated to take notes. However, in the age
of technology, software may be used to assist in recording HAZOP proceed-
ings but should not be considered as a replacement for an experienced chair-
person and secretary. The main advantage of the software approach is its
systematic thoroughness in failure identification. The method may be used
at the design stage when plant alterations or extensions are planned or may
be applied to an existing facility.
Sequence of Examination
The logical sequence in conducting a HAZOP is determining:
Typically, a member of the team outlines the purpose of a chosen line of the
process and how it is expected to operate. The various guidewords such as
MORE are selected in turn. The team then considers what issues could cause
the deviation.
The next step is considering the results of a deviation, for example, the
creation of a hazardous situation or operational difficulties. When the con-
sidered events are credible and the effects significant, existing safeguards
should be evaluated and a decision made as to what additional measures
could be required to eliminate the identified cause. A more detailed reliabil-
ity analysis such as risk or consequence quantification may be required to
determine whether the frequency or outcome of an event is serious enough
to justify major design changes.
FIGURE 5.5
Cooling water facility, showing fan cooler, heat exchanger, and pump. (Source: Lihou, M. http://www.lihoutech.com/hzp1frm.htm)
intent that a HAZOP study is directed. The deviation term now becomes eas-
ier to understand. A deviation or departure from the design intent in the
case of the cooling facility would be a cessation of circulation or the water at
a too-high initial temperature. Note the difference between a deviation and
its cause. In this case, failure of the pump would be a cause, not a deviation.
1. The process designer briefly outlines the broad purpose of the sec-
tion of the design under study and displays the P&ID (or equivalent)
where it can be readily seen by all team members.
2. General questions about the scope and intent of the design are discussed.
3. The relevant part for study is selected, usually one in which a major
material flow enters that section of the plant. The part or item is
highlighted on the P&ID with dotted lines using a transparent pale
colored felt pen (see Figure I.3 in the Introduction to this book).
4. The process designer explains in detail the purpose, design features,
operating conditions, fittings, instrumentation, protective systems,
and details of the processes immediately upstream and downstream
if relevant.
5. Any general questions about the part or item are discussed.
6. The detailed line-by-line study commences. The discussion leader
reviews the guidewords chosen as relevant. Each guideword such
as HIGH FLOW identifies a deviation from normal operating condi-
tions to prompt discussion of the possible causes and effects of flow
at an undesirably high rate. If, in the opinion of the study team, the
combination of the consequences and the likelihood of occurrence
are sufficient to warrant action, the combination is regarded as
a problem and recorded as such. If the existing safeguards are
deemed sufficient, no further action is required. For major risk areas,
the need for action may be assessed quantitatively using techniques
such as hazard analysis (HAZAN) or reliability analysis. For less
critical risks, assessment is usually based on experience and judg-
ment. The person responsible for defining the corrective action is
also nominated.
7. The main aim of the meeting is to find problems needing solutions
rather than finding solutions. The group should not be tied down
by trying to resolve problems. It is better to proceed with the study,
deferring consideration of the unsolved problems to a later date.
8. When a guideword requires no more consideration, the chairperson
refers the team to the next guideword.
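The decision described in step 6 can be summarized as a small sketch. The Python below is an illustration under stated assumptions, not a prescribed procedure: the names `ScreeningResult` and `screen_deviation` are hypothetical, and the three boolean inputs simply stand for judgments reached by the study team. The code records the outcome of those judgments; it does not make them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScreeningResult:
    deviation: str
    is_problem: bool
    assessment: Optional[str]    # how the need for action is to be assessed
    action_owner: Optional[str]  # person nominated to define the corrective action


def screen_deviation(
    deviation: str,
    warrants_action: bool,
    safeguards_sufficient: bool,
    major_risk: bool,
    action_owner: Optional[str] = None,
) -> ScreeningResult:
    """Record the team's step 6 decision for one deviation."""
    if not warrants_action or safeguards_sufficient:
        # Consequences and likelihood do not warrant action, or the existing
        # safeguards are deemed sufficient: no further action is required.
        return ScreeningResult(deviation, False, None, None)
    if action_owner is None:
        raise ValueError("a recorded problem needs a nominated action owner")
    assessment = (
        "quantitative (e.g., HAZAN or reliability analysis)"
        if major_risk
        else "experience and judgment"
    )
    return ScreeningResult(deviation, True, assessment, action_owner)
```

For example, `screen_deviation("HIGH FLOW", True, False, True, "process engineer")` returns a recorded problem flagged for quantitative assessment, while the same deviation with sufficient safeguards returns no problem. Finding a solution is deliberately left out, in line with step 7.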
Effectiveness Factors
The effectiveness of a HAZOP will depend on several things, including:
The key elements of a HAZOP are (1) the team; (2) a full description of the
process to be examined; (3) relevant guidewords; (4) conditions conducive to
brainstorming; (5) recording of meetings; and (6) a follow-up plan.
Team
The HAZOP team will typically consist of five to nine people. A team should
have an odd number of members to eliminate possible ties. Team members
should have a range of relevant skills to ensure all aspects of the plant
and its operations are covered. Engineering disciplines, management, and
plant operating staff should be represented. This will prevent possible events
from being overlooked through lack of expertise and awareness. It is essen-
tial that the chairperson is experienced in HAZOP techniques to ensure that
the team follows the procedure without diverging or taking shortcuts.
Engineers
The engineering experts assigned to the HAZOP may include any combi-
nation of project engineers, machinery engineers, instrument engineers,
electrical engineers, mechanical engineers, safety engineers, quality assurance
engineers, maintenance engineers or technicians, and corrosion and materi-
als engineers. These individuals should provide expertise in their respective
disciplines as it applies to the process under study. They are also responsible
for attending the initial hazard analysis meeting. They must be available to
the team as required with the understanding that the team leader will give
adequate advance notice to the experts when possible. The experts must pro-
vide documentation of all existing safeguards and procedures. A HAZOP
team assigned to consider a new chemical plant could include:
Description of Process
A full description of the process is needed to guide the HAZOP team. This
presupposes that a good understanding of the process exists. It also presup-
poses that the appropriate and applicable individuals form the team. In the
case of conventional chemical plants, detailed P&IDs should be available. At
least one member of the HAZOP team should be familiar with the applicable
diagrams and all instrumentation they represent. If a plant is very complex
or large, it may be split into a number of smaller units (nodes) to be analyzed
at separate HAZOP meetings.
In addition to P&IDs, physical or computer-generated models of the plant
or photographs of similar existing plants may also be utilized. The addi-
tional documents greatly help the team visualize potential incidents, espe-
cially those caused by human error. An inspection by the HAZOP team of
a similar plant already in operation before commencement of the HAZOP
would be highly beneficial. If a similar plant is in operation, the team should
review past incidents. Key information that may be required during the
HAZOP should also be readily available and should include:
• Layout drawings
• Hazardous area drawings
• Material safety data sheets
• Relevant codes or standards
• Operating manual for existing plant
• Outline operating procedures for new plant
When carrying out a HAZOP on a facility for which traditional P&IDs are
not appropriate, it may be more suitable to use alternative visualization and
diagrammatic techniques such as plan and section drawings, layout draw-
ings, or photographs. A decision as to the medium to be used should be
made well before the HAZOP commences.
In batch processes, additional complexities are introduced into the
HAZOP because of the time-dependent nature of the system components. It
is strongly recommended that the chairperson be knowledgeable about the
process at hand and also experienced in batch process HAZOPs.
Relevant Guidewords
A set of guidewords relevant to the operation should be chosen, studied, and
systematically applied to all parts of the operation. This may entail applica-
tion of the guidewords to each process line within a P&ID or by following
each stage of an operation from start to finish. Table 5.4 shows examples of
guidewords and variations. The choice of suitable guidewords will strongly
impact the success of a HAZOP in detecting design faults and operabil-
ity problems.
In addition to reviewing and analyzing normal operations, a HAZOP
should also consider conditions during plant start-up, shut-down, and all
applicable modifications. Human response time and the possibility of inap-
propriate action by an operator or supervisor should also be considered. If
such errors are possible, it is suggested that a mistake-proofing methodology
be designed and implemented to avoid human errors.
TABLE 5.4
Typical Guidewords
Guideword    Meaning                    Comments
NO           Complete negation of       For example, no forward flow when there
             intention                  should be
MORE         Quantitative increase      More of a relevant physical property than
                                        there should be (e.g., high flow,
                                        temperature, pressure, viscosity) or of
                                        actions (heat and chemical reaction)
LESS         Quantitative decrease      Less of a relevant physical property, etc.
AS WELL AS   Qualitative increase       All design and operating intentions achieved
                                        together with some addition (e.g.,
                                        impurities, extra phase)
PART OF      Qualitative decrease       Only some intentions achieved
REVERSE      Opposite of intention      Reverse flow or chemical reaction (e.g.,
                                        inject acid instead of alkali in pH control)
OTHER THAN   Complete substitution or   No part of original intention; different
             miscellaneous              result achieved; start-up, shutdown,
                                        alternative mode of operation, catalyst
                                        change, corrosion, etc.
The team should use care when listing safeguards. Hazard analysis requires
evaluating the consequences of failures of engineering and administrative
controls, so the team must carefully determine whether the listed items are
genuine safeguards. In addition, the team should consider realistic multiple
failures and simultaneous events when determining whether a safeguard
will actually function as planned in the event of an occurrence.
Meeting Records
One approach to record keeping is to record only key findings (reporting by
exception). A second technique is recording all issues. Experience has shown
TABLE 5.5
HAZOP Meeting Record
Project:                    Node:                    Page:
Description:                                         Date:
Drawing/Revision No.:
Meeting Questions
In any meeting dealing with risk and/or hazards, many questions arise. The
following questions must be answered to ensure a complete discussion and
substantial corrective action for the risks and/or hazards identified:
Follow-Up
The fact that a HAZOP analysis is conducted to eliminate or minimize haz-
ards cannot be overstated. Arriving at a solution is only a partial answer to
a concern. The other part is to make sure the solution is workable within
the time constraints and ensure proper follow-up to demonstrate that the
integrity of the solution is what it should be over the short and the long terms
defined by the team.
Report
At the completion of a HAZOP analysis, a full report must be issued. The
minimum requirements for a report are explained below.
study. The title page should also show the type of operation (proposed or
existing) and its location. The title sheet should specify who authorized the
report and the date it was authorized. The chairperson and organization she
or he represents should also be noted.
Table of Contents
A table of contents must be included at the beginning of the report. It should
list report sections or contents and also list figures, tables, and appendices.
Aim
The report must provide sufficient information about each element and
adequate cross references. Any section read alone or in sequence with other
sections should allow an assessment of the adequacy of the HAZOP study.
Guidewords
The guidewords used to identify possible deviations of operations must be
listed. Furthermore, explanations of all specialized words that apply to the
facility should also be given. See Table 5.6.
Scope of Report
This section should briefly describe the aims and purpose of the study. For
example, is the study intended to satisfy conditions of development consent
or as a company initiative as a component of a safety upgrade? Does the
study cover a new development or a modification or extension of an existing
facility? Reference must be made to other relevant safety-related studies
completed or under preparation.
Description of Facility
This section should provide an overview of the site, plant, and materi-
als stored and used there. If such information is already available in an
TABLE 5.6
Guidewords and Parameters
Guideword                 Parameter
No, Reverse, More, Less   Flow
More, Less                Pressure
More, Less                Temperature
More, Less, No            Level
More, Less                Phase
More, Less                Composition
Other                     Start-up
                          Shutdown
                          Commissioning
                          Relief and blow-down
                          Draining
                          Venting
                          Isolation
                          Purging
                          Sample points
                          Instruments
                          Maintenance access
                          Construction materials
                          Static electricity
Note: Table 5.4 explains these guidewords and parameters.
Team Members
All HAZOP participants and their affiliations and positions should be noted.
Their responsibilities, qualifications, and relevant experience should also be
shown. The chairperson and the secretary of the group should be identified.
The dates of the meetings and their duration should be listed. If some mem-
bers did not attend all meetings, the extent of their participation should be
indicated. Special visitors and occasional members must be listed in a simi-
lar manner and the reasons for their attendance detailed. For example, spe-
cialist instrumentation engineers and consultants may be required to attend
certain sessions to overcome specific design problems.
Methodology
The general approach must be briefly outlined. Any changes to the accepted
standard method for a HAZOP must be detailed and explained.
Overview
This section must outline the conditions and situations likely to result in a
potentially hazardous outcome considered in the HAZOP (following line-
by-line analysis) for each P&ID or section, including overview issues, such as:
• Start-up procedures
• Emergency shutdown procedures
• Alarm and instrumentation trip testing
• Pre-commissioning operator training
• Plant protection systems
• Service failure
• Breakdowns
• Effluents (gas, liquid, solid)
• Noise
Any issues raised and considered necessary for review outside the HAZOP
must be detailed. See Table 5.4 for some typical guidewords.
Findings
This section highlights items that are potentially hazardous to plant per-
sonnel, the public, or the environment or have the potential to jeopardize
plant operability. The section should include a clear statement of commit-
ment to modify the design or operational procedures in accordance with the
required actions and a timetable for implementation. Justifications for not
taking certain actions should also be explained. The current status of the
recommended actions at the time of the report should also be noted along
with the names and titles of persons responsible for their implementation.
Review
In a very broad sense, a HAZOP review is meant to identify actions required
to alleviate or remove potential hazards or operability problems revealed by
the study. Proper recording and reporting of the HAZOP review discussion
is an integral part of a HAZOP review. The scope of the HAZOP review must
be clearly stated in the info pack document (see below). As a guideline, items
such as positioning of safety showers, valve accessibility, and handling of process
and waste chemicals are not included in a HAZOP review. The intent is to
establish guidelines for a HAZOP review.
Input Documents
The following input documents are required for a HAZOP review:
controls, and automation) and (2) experience applying highly structured sys-
tematic HAZOP techniques.
The team leader (or chairperson or facilitator) of the HAZOP team must be
selected for his or her ability to effectively lead the review and should have
sufficient status to present the review recommendations to the proper level
of authority. Ideally, the leader must be independent of the project.
For proper recording of the review discussion points, a secretary should be
part of the HAZOP review team. He or she should be technically qualified
to understand the review discussion and understand the HAZOP technique.
The secretary should be preferably from a process discipline and understand
the jargon used by the team.
Other team members for the HAZOP review must be selected on the basis
of the positive contributions they can make based on their special knowl-
edge and abilities. The participation of an operations and/or commissioning
(launching) specialist for a similar plant or unit is strongly recommended. A
typical team composition for a HAZOP review can be as follows:
Preparation
The HAZOP review info pack is a required component of the preparation
for the review discussion. Additionally, the participants in the review have
the duty to study the info pack in detail before the meeting. The team leader
may suggest changes to the info pack to improve the team’s understanding
of the scope of the review.
Part of the preparation involves organizing the location, date, start time,
and duration of the HAZOP review. Preparation is important and must start
early enough to allow the team members to read the material and be ready
to discuss. It is imperative that the info pack and meeting schedule be dis-
tributed to both full- and part-time members so that they can arrange their
schedules. The responsibility for planning and dissemination of this infor-
mation is with the project engineer.
The meeting location for the review must be spacious and well ventilated
to enable the participants to be comfortable for long sessions. The meeting
schedule must include periodic breaks with refreshments because of the
intense nature of the discussions that require tremendous concentration. To
make a review more effective, the meeting room must have:
Methodology
The entire process, plant, or unit is subdivided into manageable sections
(nodes) for ease of understanding (see Figure I.3 in the Introduction to this
book). Indications of nodes on P&IDs and their descriptions should be pro-
vided during the preparation stage and included in the info pack to save
time during the review discussion.
The next step involves using a fixed set of terms (guidewords) for each
process parameter (flow, temperature, pressure, etc.) in the selected node
to identify a potential hazard or operability problem. The combination of
guidewords and process parameters should reveal deviations in a process.
For example, the NO guideword combined with the FLOW process parameter
reveals a NO-FLOW deviation. Table 5.6 shows typical guidewords and
parameters that may be used in a review process. The following steps are
recommended for conducting a systematic HAZOP review:
• List node numbers on the HAZOP worksheet and use forms such as
Table 5.3 and Table 5.5 or a format specific to your project to list the
following details and steps for the node:
    • A brief description of the node in the HAZOP worksheet.
    • Process data such as operating pressure, operating temperature,
      design pressure, and design temperature for the node.
    • P&ID tag number covering the selected node. If more than one
      P&ID is required, provide all tag numbers.
    • Enter the first parameter in the appropriate column, for example,
      start with FLOW.
    • Enter the guideword against the parameter in the next column,
      for example, start with NO.
    • Identify the deviation. The deviation based on the two above
      items now is “No Flow.”
    • Identify one cause for “No Flow” based on the process as depicted
      in the P&ID.
    • Outline the consequences (both upstream and downstream).
    • Identify the protection or safeguards available.
    • Provide the recommendations/actions to eliminate/mitigate
      the deviation.
    • Identify a second cause for “No Flow” and repeat the steps after
      the cause identification as above until the team agrees that no
      more reasonable causes of “No Flow” can be identified.
    • Repeat the entire procedure until all the deviations associated
      with the identified node have been considered.
    • After completion of an identified and numbered node, move to
      another node and repeat the entire procedure as described above.
• Team members will apply their knowledge and experience to each
deviation to identify possible causes (within scope of the HAZOP
study) to establish the credibility of the event. The associated consequences
will be considered to determine whether they significantly
impact the hazards and operability of the plant or unit. Usually, there
will be more than one cause for each deviation and the consequences
may vary among causes.
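The worksheet procedure above is essentially a nested loop: for each node, each parameter is paired with each applicable guideword to name a deviation, and every cause of that deviation gets its own consequences, safeguards, and recommendations. The following Python sketch illustrates that structure under stated assumptions. The class and function names (`CauseEntry`, `DeviationEntry`, `NodeWorksheet`, `build_worksheet`) are hypothetical, and only a subset of the guideword and parameter pairs from Table 5.6 is included.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Guidewords applicable to each parameter, following Table 5.6
# (only three parameters are shown to keep the sketch short).
GUIDEWORDS_BY_PARAMETER: Dict[str, List[str]] = {
    "Flow": ["No", "Reverse", "More", "Less"],
    "Pressure": ["More", "Less"],
    "Temperature": ["More", "Less"],
}


@dataclass
class CauseEntry:
    """One cause of a deviation, with the details recorded against it."""
    cause: str
    consequences: str
    safeguards: str
    recommendation: str


@dataclass
class DeviationEntry:
    parameter: str
    guideword: str
    causes: List[CauseEntry] = field(default_factory=list)

    @property
    def deviation(self) -> str:
        # Guideword plus parameter, e.g. "No" + "Flow" -> "No Flow".
        return f"{self.guideword} {self.parameter}"


@dataclass
class NodeWorksheet:
    node: str
    description: str
    entries: List[DeviationEntry] = field(default_factory=list)


def build_worksheet(node: str, description: str) -> NodeWorksheet:
    """Create one empty worksheet row per parameter/guideword pair for a node."""
    entries = [
        DeviationEntry(parameter, guideword)
        for parameter, guidewords in GUIDEWORDS_BY_PARAMETER.items()
        for guideword in guidewords
    ]
    return NodeWorksheet(node, description, entries)
```

A team would then fill in causes one at a time, for example by appending a `CauseEntry` to the "No Flow" row, and repeat until no more reasonable causes can be identified. Keeping causes as a list under each deviation reflects the point that consequences may vary among causes.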
Recommendations
The results of a HAZOP review meeting should yield (1) suggested actions
to prevent or mitigate deviations; (2) requests for additional information; and
(3) recommendations for a quantitative risk assessment (QRA).
Typically, the results from the study will be produced in the form of an action
list (corrective action report or CAR) and generate an individual action response
form for each action point noted in the review meeting. The responsible indi-
viduals and required completion dates for each action item will be noted. The
list will be subjected to periodic reviews to assess progress until the action items
have been implemented and a HAZOP close-out report issued. In summary, all
recommendations are intended to eliminate hazards or minimize them.
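The action list and its periodic review can be sketched as a simple data structure. The Python below is illustrative only; the names `ActionItem`, `open_actions`, `overdue_actions`, and `ready_for_close_out` are assumptions for this example rather than part of any standard HAZOP tool. It captures the rule that the close-out report is issued only after every action item has been implemented.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class ActionItem:
    """One action point from the review meeting (field names are illustrative)."""
    number: int
    description: str
    responsible: str   # individual responsible for the action
    due: date          # required completion date
    implemented: bool = False


def open_actions(actions: List[ActionItem]) -> List[ActionItem]:
    """Actions still awaiting implementation."""
    return [a for a in actions if not a.implemented]


def overdue_actions(actions: List[ActionItem], today: date) -> List[ActionItem]:
    """Open actions whose required completion date has passed."""
    return [a for a in open_actions(actions) if a.due < today]


def ready_for_close_out(actions: List[ActionItem]) -> bool:
    """The close-out report can be issued only when no action remains open."""
    return not open_actions(actions)
```

During each periodic review, a coordinator could call `overdue_actions` with the review date to see which items need escalation, and `ready_for_close_out` to decide whether the HAZOP close-out report can be issued.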
Success Factors
All process improvement techniques involve fundamentals to keep in mind
to achieve success. HAZOP is no different. In fact, the literature contains
many recommendations and approaches, for example, Kletz (1999) and others.
Crawley et al. (2000) devised a list of suggestions. We present them here not
as a definitive approach but rather as guidelines. We chose their approach
because it is simple and breaks down the effectiveness components in the
order in which a HAZOP is conducted.
HAZOP also presents a number of pitfalls that must be addressed and
eliminated before, throughout, and after the process. Listed below are com-
mon pitfalls that may affect the quality and value of a study.
Before Study
A study must be initiated by a person who has authority to implement the
results. If the person has no authority and the actions are not implemented, a
study wastes time and money. The design analyzed must be well developed
and firm; the operations or facilities examined should not be under develop-
ment. A study cannot be carried out on a partly developed design because
subsequent changes will undermine the results. Drawings must be accurate
and depict what was studied. A study is worthless if the drawings are inac-
curate or incomplete. Delay should be minimized because viable options will
decrease. A skilled and suitably experienced team leader should be chosen.
He or she must be given a clear scope, objectives, and terms of reference by the
initiator, including delivery date and report distribution. If this is not done,
a study may be incomplete and fail to fulfill the requirements of the initiator.
No study should be expected to yield project decisions and the design
team should not adopt the approach of letting the HAZOP study decide
what should be done. The study group must be balanced and well chosen
to combine knowledge and experience. A study group drawn entirely from
the project team will not be capable of a critical and creative design review.
Equally, a group that has no operations background may lack objectivity. The
group must be given adequate notice of the study so that they can carry out
their preparations, including review and analysis of the P&IDs. The extent to
which problems are evaluated, ranked, and solved should be defined.
Throughout Study
Perhaps the most important factor in the success of a HAZOP is that it must
form part of an overall safety management system. Another vital factor is
the unequivocal support of senior management. The following issues are
also important:
• The team must be motivated and have adequate time and resources
to complete the examination.
• The boundaries of the study must be clearly analyzed. Changes of
one item may impact other items involving different processes or an
operation upstream or downstream. If the potential impact is not
perceived correctly, the boundaries may be incorrect.
• The boundaries of a study of a modification are equally complex.
A change in a reactor temperature may affect the by-product spec-
trum, thus producing a greater impact than the immediate modifica-
tion. A clear description, design intention, and design envelope must
be assigned to every section or stage examined.
• The study requires creative thought processes. If a study becomes
a mechanistic process or fatigue sets in, the study must be halted and
restarted when the team is refreshed.
• Each action must be relevant, clearly defined, and described with-
out ambiguity.
• If the person assigned to follow up the action has missed a meeting,
the results of a misunderstanding could be wasted time and effort.
• The study must accept a flexible approach to actions. Not all actions
are centered on hardware changes; procedural changes may be
more effective.
• The study team members must be aware that some problems ranked
and identified during the study may be caused by human factors
and may not require hardware changes.
• Potential pitfalls must be treated individually when planning routes
around branched systems such as recycling lines, junctions, vents,
and drains.
After Study
Every planned action must be analyzed and described accurately. Many of
the actions raised will require no further change, but all must be designated
for action or no action. Those that require changes should be subjected to a
management of change process (that may require a HAZOP of the change)
and put on a tracking schedule.
Revisions
A risk management plan (RMP) is intended to describe, communicate, and
document activities and processes necessary to manage the risks involved in
the planned operations through all project phases. All processes and activi-
ties deemed necessary to manage risks during the operations should be
reflected in the plan. The RMP should define and allocate responsibilities and
serve as a tool for monitoring the status of the risk management process. The
document should be established early, define responsibilities, and be main-
tained continuously to reflect the project status at various stages. Obviously,
the complexity of a project will dictate the volume and detail of the informa-
tion and discussion about low- and high-risk events. Some of the activities
required for producing the RMP may be controlled through a checklist such as:
It is imperative that the RMP assign responsibility for the specific activities cited
on the checklist. All responsibilities should fall to management or another
predefined responsible party. If several contractors or subcontractors are
involved, the plan may be structured hierarchically, with each contractor
executing a contract for his area of responsibility. Contractor and subcon-
tractor RMPs should be based on plans and systems already in place. A first
revision of the RMP is recommended as soon as the need for a change is real-
ized. Further revisions should reflect the various project phases and stages.
Table 5.7 lists revisions and recommendations.
TABLE 5.7
Revisions and Recommendations
Point of Revision | Purpose of Issue
Project definition | Define responsibilities for risk management process and initial activities.
Completion of overall assessment | Communicate potential risk categories for planned operations after risks are identified and reducing activities are defined and/or when a contractor is nominated. Define details of all required risk identification and risk reducing activities, allocate responsibilities, schedule activities, and monitor status.
Project completion | Document completion dates of planned activities. Number of revisions and versions should reflect complexity and criticality of planned operations and number of contractors.
References
Crawley, F., M. Preston and B. Tyler. (2000). HAZOP Guide to the Best Practice.
Warwickshire. Chemical Industries Association.
Det Norske Veritas. (2003). Risk management in marine and subsea operations. Høvik,
Norway. http://www.srcf.ucam.org/polarauvguide/auvs/caseStudies/
DNV-RP-H101a.pdf
HIPAP:8 (2008). HAZOP Guidelines. Sydney. New South Wales Department of
Planning. Hazardous Industry Planning Advisory Paper 8.
Kletz, T. (1999). Hazop and Hazan, 4th ed. Manchester. Institution for Chemical
Engineers.
Lihou, M. http://www.lihoutech.com/hzp1frm.htm
McGraw Hill (2003). Dictionary of Scientific and Technical Terms. 6th ed. New York:
Author.
Pitblado, R.M., L. Bellamy, and T. Geyer (1989). Safety assessment of computer-
controlled process plants. EFCE International Symposium on Loss Prevention
and Safety Promotion in the Process Industries, Oslo.
Selected Bibliography
American Institute of Chemical Engineers. (1994). Guidelines for Preventing Human
Error in Process Safety. New York: Center for Chemical Process Safety.
American Institute of Chemical Engineers. (1985). Guidelines for Hazard Evaluation.
New York: Center for Chemical Process Safety.
Andow, P. (1991). Guidance on HAZOP Procedures for Computer- Controlled Plants.
London: U.K. Health and Safety Executive Contract Research Report 26.
Barton, J. and R. Rogers. (1997). Chemical Reaction Hazards, 2nd ed. London: IChemE.
Bullock, B.C. (1974). The Development and Application of Quantitative Risk Criteria for
Chemical Processes. Fifth Chemical Process Hazard Symposium, Manchester, UK.
Elmendorf, M. (1996). Introduction to process hazard analysis. Journal of Environmental
Law and Practice, 4, 36–56.
Geronsin, R. (2001). Job hazard assessment: a comprehensive approach. Professional
Safety, 46, 23–30.
Gibson, S. B. (1974). Reliability engineering applied to the safety of new projects.
Chemical Engineering, 306, 105.
http://paulthorn.co.uk/healthandsafety/Risk%20Management/HAZOP%20Guide
%20to%20best%20practice-%20.pdf
Kletz, T. (1972). Specifying and designing protective systems. Loss Prevention, 6, 15.
Kletz, T. (1986). Hazop and Hazan. London: IChemE.
Kletz, T. (1995). Computer Control and Human Error. London: IChemE.
Kletz, T. (1998). Process Plants: Handbook for Inherently Safer Design, 2nd ed. London:
Taylor & Francis.
Knowlton, R. (1981). Introduction to Hazard and Operability Studies: A Guideword
Approach. Vancouver: Chemetics International
Overview
Fault tree analysis (FTA) is another technique of reliability and safety analy-
sis and one of many symbolic analytical logic methods applied in operations
and system reliability research. Of course, other techniques exist, such as
reliability block diagrams (RBDs) shown later in this chapter, but FTA is more
convenient and easier to use to evaluate safety and reliability issues.
Bell Telephone Laboratories developed the FTA concept in 1961 for use
by the U.S. Air Force with its Minuteman system. FTA was later adopted
and utilized extensively by the Boeing Company and is now used widely in
many fields and industries.
FTA is a deductive analytical technique generally used for reliability and
safety analyses of complex dynamic systems. It provides an objective basis
for analysis and justification for changes and additions (Blanchard 1986).
Fault tree diagrams (negative analytical trees) are logic block diagrams that
display the state of a system (top event) in terms of its components (basic
events); see Figure 6.1.
Like RBDs, fault tree diagrams are graphic design techniques that provide
alternatives to RBD methods. A fault tree is built from the top down and
utilizes events rather than blocks. It reveals a graphic model of the path-
ways within a system that can lead to foreseeable, undesirable loss events or
failures. The pathways interconnect contributory events and conditions and
utilize standard logic (AND, OR) symbols. The basic constructs of a fault tree
diagram are gates and events; the events have the same meanings as blocks
in RBDs; the gates represent conditions (Stamatis 2003).
As used today, the FTA model logically and graphically represents vari-
ous combinations of faulty and normal events in a system that may lead to
the top undesired event. It uses a tree to show the cause‑and‑effect relation-
ships of a single, undesired event or failure and various contributing causes.
The tree shows the logical branches from the single failure at the top to the
root causes at the bottom (Figure 6.2) that may be analyzed further with an
FMEA. Standard logic symbols shown in Tables 6.1 and 6.2 are used. After
the tree has been constructed and the root causes identified, the corrective
Event labels shown: L.T. failure; H.T. failure
FIGURE 6.1
Typical partial engine FTA diagram.
Top event: Electrical fire in motor circuit
Contributing events: High temperatures generated in wiring (ignition source); Wire insulation (fuel present); Air (oxygen present)
FIGURE 6.2
Relationship of FTA and FMEA.
TABLE 6.1
Typical FTA Symbols
Event | Symbol
Basic event | Circle
Undeveloped event | Diamond
Conditional event | Oval
Trigger event | House
Resultant event | Rectangle
TABLE 6.2
FTA Logic Symbols
Benefits
• Helps depict an analysis.
• Helps identify the reliabilities of higher-level assemblies or systems.
• Determines the probability of occurrence for each root cause.
• Provides documented evidence of compliance with safety requirements.
• Assesses the impacts of design changes and alternatives.
• Provides options for qualitative and quantitative system reliabil-
ity analyses.
• Allows analysts to concentrate on one system failure at a time.
• Provides insights into system behavior.
• Isolates critical safety failures.
• Identifies ways that a failure can lead to an accident.
After the appropriate logic gates, symbols, and event descriptions have been
developed, the next level of complexity of FTA involves the calculation of
the probabilities of occurrence of the top-level events. To perform the calcu-
lation, the probabilities of occurrence values for the lowest-level events are
required. The probability equation is:
And
Or
Events and probabilities shown: P(1) = 0.001; P(2) = 0.002; P(3) = 0.01
FIGURE 6.3
FTA depiction of parallel system.
Blocks shown: 1 and 2
FIGURE 6.4
Typical block diagram.
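The standard gate calculations, under the assumption that the input events are statistically independent, can be sketched as follows. An AND gate multiplies the input probabilities; an OR gate takes the complement of the product of the complements. The function names are illustrative, and the two input values are borrowed from the event labels of Figure 6.3 purely as sample numbers, without assuming how that figure connects them.

```python
from math import prod


def and_gate(probabilities):
    """Output occurs only if all independent inputs occur."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output occurs if at least one independent input occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Sample input probabilities (illustrative use of two values from Figure 6.3).
p1, p2 = 0.001, 0.002
p_and = and_gate([p1, p2])
p_or = or_gate([p1, p2])
```

For small probabilities the OR result is close to the simple sum of the inputs, which is why the sum is often used as a quick upper bound.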
References
Blanchard, B. (1986). Logistics Engineering and Management, 3rd ed. Englewood Cliffs,
NJ: Prentice Hall.
Stamatis, D. (2003). Failure Mode and Effect Analysis: FMEA from Theory to Execution.
Milwaukee, WI: Quality Press.
Selected Bibliography
Henley, E. and H. Kumamoto. (1981). Reliability Engineering and Risk Assessment. New
York: Prentice Hall.
http://www.fault-tree.net/papers/clemens-event-tree.pdf
Kececioglu, D. (1991). Reliability Engineering Handbook, Vols. 1–2. Englewood Cliffs,
NJ: Prentice Hall.
Motorola Corporation. (1992). Reliability and Quality Handbook. Phoenix, AZ: Motorola
Semiconductor Products Sector.
Omdahl, T.P., Ed. (1988). Reliability, Availability, and Maintainability Dictionary.
Milwaukee, WI: Quality Press.
Stamatis, D.H. (2003). Six Sigma and Beyond: Design of Experiments. Boca Raton, FL:
St. Lucie Press.
7
Other Risk and HAZOP
Analysis Methodologies
We already noted that risks and HAZOP issues may be analyzed in a num-
ber of ways, depending on the industry and scope of a project. This chapter
focuses on available methods that are often overlooked for many reasons,
including doubt about their effectiveness because of their simplicity.
Process Flowchart
Flowcharts are easy-to-understand diagrams showing how steps in a process
fit together. This makes them useful tools for communicating how processes
work, and clearly documenting how a specific job is done. Furthermore, the
act of mapping a process in a flowchart format helps clarify the understand-
ing of the process and helps reveal where the process can be improved. A
flowchart can therefore be used in HAZOP to:
Activity or operations O
Inspection □
Flow or movement →
Delay D
Inventory storage ∇
Lamp plugged in? No: Plug in lamp. Yes: Bulb burned out?
Bulb burned out? Yes: Replace bulb. No: Repair lamp.
FIGURE 7.1
Simple flowchart.
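The decision logic of the simple lamp flowchart can also be written as a short function, which shows how each decision box becomes a branch. The function and parameter names are illustrative only.

```python
def troubleshoot_lamp(plugged_in: bool, bulb_burned_out: bool) -> str:
    """Return the action reached by following the lamp flowchart."""
    if not plugged_in:          # decision: lamp plugged in?
        return "Plug in lamp"
    if bulb_burned_out:         # decision: bulb burned out?
        return "Replace bulb"
    return "Repair lamp"        # neither simple cause applies


action = troubleshoot_lamp(plugged_in=True, bulb_burned_out=False)
```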
Elements shown: Functions #1, #2, and #3 joined by OR gates and by AND gates
FIGURE 7.2
Logic depiction used in functional diagrams.
Block diagrams are not intended to illustrate all the functional relationships
that must be considered in a HAZOP or FMEA. The diagrams should be as
simple and explicit as possible. Figure 7.3 is a block and logic diagram.
System-level diagrams are generated for components or large systems
comprising several assemblies or subsystems. Detail-level diagrams are gen-
erated to define the logical flow and interrelationships of individual com-
ponents and/or tasks. Reliability-level diagrams generally are used at the
system level to illustrate the dependence or independence of the systems or
components contributing to specific functions (General Motors 1988). They
also are used to support predictions of successful functioning for specified
operating or usage periods.
Generally, it is assumed that a component has only two possible states:
operational or faulty. To successfully apply the technique, the operation of
a process must be described in detail. The description should contain state-
ments of (1) functions to be performed; (2) performance parameters and pos-
sible limits; and (3) environmental and operating conditions.
A process is then divided into blocks that can be further divided into sepa-
rate reliability block diagrams if required. If possible, each block should be
independent of the other blocks and contain no redundancies. The system
definition is then used to organize the block diagram. The output of one
block is used as the input to the next. If no redundancy is present, the result-
ing reliability block diagram will be linear, indicating that the failure of any
block will cause the entire process to fail. If redundancies exist within a pro-
cess, blocks can be drawn in parallel, indicating that despite the failure of
one of these blocks, a path for the process to work is still available.
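The series and parallel arrangements described above can be quantified with a short sketch, assuming the blocks fail independently and each has only two states. The helper names and the reliability values are hypothetical.

```python
from math import prod


def series(reliabilities):
    """Linear diagram: every block must work, so reliabilities multiply."""
    return prod(reliabilities)


def parallel(reliabilities):
    """Redundant blocks: the group fails only if every block fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical two-block process with no redundancy.
r_linear = series([0.99, 0.95])

# Same process with the second block duplicated in parallel.
r_redundant = series([0.99, parallel([0.95, 0.95])])
```

Adding the redundant block raises the process reliability because a working path remains after a single failure in the duplicated stage.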
Abbreviations/Notes:
“And” Gate: Parallel Function
“Or” Gate: Alternate Function
FIGURE 7.3
Block diagram with designated boundary line.
Stamatis (2003 pp. 55–58) provides examples of block diagrams, logic dia-
grams, schematic diagrams, functional diagrams, and layout diagrams.
Control Plan
A control plan is a written summary of quality planning actions for a spe-
cific process, product, and/or service. The plan lists all process parameters
and design characteristics considered important to customer satisfaction and
requiring specific quality planning actions (Chrysler 1986; Ford 1992, 2000;
General Motors 1988; ISO/TS 16949; AIAG 2001). A control plan describes the
actions and reactions required to ensure that a process is maintained in a
state of statistical control agreed upon by company and supplier.
Remember that FMEA identifies critical characteristics and therefore
serves as the starting point for a control plan. A control plan cannot trigger
the FMEA. A process flow diagram dictates the flow of the process. Stamatis
(2003 pp. 60–61) provides an example of a control plan. A typical control plan
may include:
The feasibility analysis uses product design and process FMEA as its pri-
mary tools (Ford 1992, 2000).
Task Analysis
After a system has been defined and described, the specific tasks that must
be performed are analyzed. Kirwan (1992) defines task analysis as a system-
atic method for analyzing a task based on its goals, operations, and plans.
A task is a set of operations or actions required to achieve a set goal. The
goal represents the required outcome of the actions, the operation involves
various stages required to implement the goal, and plans are methods and
conditions under which the stages are performed. In essence, a task analy-
sis defines:
Task analysis also studies the human activities involved in performing tasks
and asks questions such as:
• Creating plan: The methods and conditions under which the various
stages of the analysis are performed should be defined.
• Analyzing plan: The plan created should be fully analyzed to identify
hazards to the equipment, operators, and environment and should
note any lack of controls and protection measures. Possible deviations
and their likelihoods should also be examined.
• Modifying plan: Modifications to the plan should improve work
methods and safety and minimize deviations. The plan should also
recommend possible actions if deviations occur. The plan along with
a description of the appropriate methods and conditions can be pre-
sented to management or other authority.
The first two stages are performed during task analysis and the results can
be incorporated into the final stages of the HRA. The accuracy of the val-
ues produced by HRA is unknown and the results should be regarded only
as estimates.
A HAZID analysis involves six phases, each of which is distinct and requires
specific tasks to be performed.
Phase 1: Planning
• State the objectives of the risk assessment.
• Describe the activity to be evaluated.
• Confirm the scope (what will be included and excluded).
• Select team to perform the evaluation and assign responsibilities.
• Nominate a team leader.
• Identify how the results of the assessment will be communicated
and to whom.
TABLE 7.1
Typical Hazards Outside Envelope of Process Equipment
Parameter | Guidewords
Hydrocarbon hazards | Layout; Over- and under-temperature; Loss of containment; Flammable materials; Fire protection; Overpressure; Different composition; Ignition sources; Gas detection; Safety controls
Equipment or plant failure | Integrity; Material dissimilarities; Installation; Failure modes
Utility systems | Capacity; Failure
Operation and control | Different modes; Normal shutdown; Commissioning and start-up; Emergency shut-down
Maintenance | Preparation; Reinstatement; Execution; Related tasks
Natural environment | Extreme weather; Seismic activity
Transport | Vessels; Road vehicles; Chemicals; Noise; Toxics
Health hazards | Exposure; Working conditions; Radiation; Chemicals; Noise; Toxic materials
Damage to environment | Discharges to air; Pollution control; Discharges to sea; Waste management
HSE management | Command and control; Emergency response; Interfaces; Training; Communications; Security; Staffing; Supervision
• Checklist: The items listed depend on the industry and the process
to be examined. Note that no checklist is ever all-inclusive. The team
should be prepared to add or delete items as applicable to the spe-
cific study. Examples of checklist items are:
• Electrical hazards
• Environmental concerns
• Human factors
• Mechanical hazards
• Process hazards
• Quality issues
• Radiation
These seven principles are closely related to FMEA in the sense that they
try to predict potential hazards. The difference is that FMEA focuses on the
severity of a failure, then on occurrence, and finally on detection. HACCP
focuses on hazards at critical points and then on controls. The techniques
differ widely. From the author’s view, FMEA is more powerful than HACCP.
In conjunction with HACCP, a reliability centered maintenance or remote
condition monitoring (RCM2) evaluation is often performed. Again, FMEA
is used to predict failures at the design and/or process level and devise
action plans to avoid failures at those levels. RCM2 is a cost-effective life
cycle asset management strategy. They are related but focus on different
areas. FMEA evaluates failure issues; RCM2 is concerned with costs. One
may use FMEA data in RCM2 but the reverse will not work.
Audits
An effective audit includes a review of the relevant documentation and pro-
cess safety data, inspection of the facilities, and interviews with all levels
of plant personnel. By using an audit procedure and checklist developed in
the preplanning stage, an audit team can systematically analyze compliance
with standards and any other relevant corporate policies. For example, the
audit team may review all aspects of a training program as part of its assign-
ment. It will review written materials for adequacy of content, frequency of
training, and effectiveness of training in meeting goals and objectives fitting
the relevant standards.
Through interviews, the team can determine employees’ knowledge and
awareness of safety procedures, duties, rules, and emergency response
assignments. During the inspection, the team can observe work practices,
including safety and health precautions. This approach enables the team to
identify deficiencies and determine where corrective actions or improve-
ments are necessary.
FIGURE 7.4
Overview of ETA.
An initiating event (IE) branches through successive success/fail nodes to Outcomes 1 through 5 (accident scenarios).
FIGURE 7.5
Generic ETA showing primary and secondary trees.
for most risk assessment applications but is most effective for modeling acci-
dents in which multiple safeguards are in place as protective features. ETA
is highly effective in determining how various initiating events can cause
accidents. A visual overview is shown in Figure 7.4. ETA gives an analyst the
ability to handle large-scale problems and utilize success logic. The event
tree model may be created independently of the fault tree model or may use
fault tree analysis gate results as sources of event tree probabilities.
It is important to note that ETA can handle both primary and secondary
event trees, multiple branches, and multiple consequence categories; see
Figure 7.5. Some of the possibilities of ETA are:
As flexible as the ETA is, it has other unique features worth mentioning.
They are based on the ease of application such as identifying:
• Full minimal cut set analysis allowing full handling of success states
• Sensitivity analysis allowing the automatic variation of event failure
and repair data within specified limits
• Range of event failure and repair models, including fixed rate, dor-
mant, sequential, stand-by, time-at-risk, binomial, Poisson, and ini-
tiator failure models
• Risk importance analysis identifying major contributors to risk
• Basic events that may be linked to Markov analysis
• Comprehensive risk calculation
An event tree begins with an initiating event and progresses through com-
ponent failures and concludes with outcomes. The flow is shown in Table 7.3.
The consequences of an event may follow a series of possible paths. Each path
is assigned a probability of occurrence and the probabilities of various pos-
sible outcomes can be calculated. A typical flow for a fire example is shown
in Table 7.4. As the arrows in the cells indicate, a concern is identified in the
TABLE 7.3
Flow of ETA
Initiating Event | Critical Events: Event 1 | Event 2 | Event 3 | Outcomes
TABLE 7.4
Flow of ETA in Application Format
Fire Start | Fire Detected | Fire Alarm Start | Sprinkler System Start | Consequence Results
Initiating event: probability P1E; first branch succeeds with P1S = 1 - P1F or fails with P1F.
Second branch: Success (P2S) or Fail (P2F). Third branch: Success (P3S) or Fail (P3F).
Outcome A: PA = (P1E)(P1S)(P2S)(P3S)
Outcome B: PB = (P1E)(P1S)(P2S)(P3F)
Outcome C: PC = (P1E)(P1S)(P2F)(P3S)
Outcome D: PD = (P1E)(P1S)(P2F)(P3F)
Outcome E: PE = (P1E)(P1F)
Failure probabilities P1F, P2F, and P3F are shown as inputs beneath the tree.
FIGURE 7.6
Generic ETA associated with FTA and probabilities.
left-most cell. The team progressively follows the details of the event until
the results are identified in the extreme right cell. Generally the flow follows
a binary functional diagram. Figure 7.6 shows probabilities associated with
ETA and FTA. In developing the ETA, one may want to use FTA and asso-
ciated probabilities for each event identified. A typical generic format may
look like Figure 7.6.
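The outcome probabilities of Figure 7.6 can be evaluated with a few lines of code. This is a sketch under stated assumptions: the figure gives P1S = 1 - P1F, and the same complement relation is assumed here for the second and third branches; the function name and the numerical inputs are hypothetical, and in practice the failure probabilities might come from fault trees.

```python
def outcome_probabilities(p1e, p1f, p2f, p3f):
    """Multiply the probabilities along each path of the generic event tree."""
    # Assumption: each success probability is the complement of the failure one.
    p1s, p2s, p3s = 1.0 - p1f, 1.0 - p2f, 1.0 - p3f
    return {
        "A": p1e * p1s * p2s * p3s,
        "B": p1e * p1s * p2s * p3f,
        "C": p1e * p1s * p2f * p3s,
        "D": p1e * p1s * p2f * p3f,
        "E": p1e * p1f,
    }


# Hypothetical inputs.
outcomes = outcome_probabilities(p1e=0.1, p1f=0.01, p2f=0.05, p3f=0.2)
total = sum(outcomes.values())
```

Because the five paths are mutually exclusive and exhaustive, their probabilities add up to the probability of the initiating event, which is a useful check on any hand calculation.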
Characteristics
ETA models a range of possible accidents resulting from an initiating event or
category of initiating events. A typical ETA is shown in Figure 7.7. In essence,
ETA is a risk assessment technique based on success and failure that effec-
tively accounts for timing, dependence, and domino effects among various
accident contributors that are too cumbersome to model in fault trees. The
assessment is performed primarily by an individual working with subject
matter experts through interviews and field inspections with the intent to
generate at least:
An initiation event branches repeatedly into success and failure paths.
FIGURE 7.7
Typical ETA showing individual events of success and failure.
Process
The ETA process follows seven steps:
1. Define the system or area of interest. Specify and clearly define the
boundaries of the system or area for which ETA will be performed.
2. Identify the initiating events of interest. Conduct a screening
level risk assessment to identify the events of interest or categories
of events to be addressed. Categories include such events as vessel
groundings, collisions, fires, explosions, toxic releases, and so on.
3. Identify lines of assurance and physical phenomena. Identify the
safeguards (lines of assurance) that will help mitigate the consequences of
the initiating event. Lines of assurance include engineered systems and
human actions. Physical phenomena include ignitions and meteoro-
logical conditions that may affect the outcome of an initiating event.
4. Define accident scenarios. For each initiating event, define the sce-
narios that can occur. Do not be afraid to identify events that may be
considered “wild” or “out of the box.”
5. Analyze accident sequence outcomes. Determine the appropriate
frequency and consequences that characterize each specific outcome
on the tree.
6. Summarize results. Event tree analysis can generate numerous
accident sequences that must be evaluated. Summarizing the results in a
separate table or chart will help organize the data for evaluation.
7. Use the results in decision making. Evaluate the recommendations
from the analysis and the benefits they are intended to achieve. Benefits
can include improved safety and environmental performance, cost
savings, or additional production. Determine implementation criteria
and plans. The ETA results may serve as bases for deciding whether to
perform additional analyses on a selected subset of accident scenarios.
Keep in mind that ETA is used to determine the path from an initiating event
to various consequences and reveal the expected frequency of each conse-
quence. Pipe breaks, alarms that fail to activate, and human errors of omis-
sion and commission are events that can produce insignificant or catastrophic
consequences. The event tree models these initiators and consequences and
determines their frequencies. In summary, ETA is one more way to evaluate
hazards. It is a bottom-up inductive analytical technique that is applicable
to automated and human-operated systems and to decision-making and/or
management systems.
Specifically, ETA can explore system responses to initiating challenges and
opportunities for pursuing successes and assessing failures; see Figure 7.7.
Furthermore, ETA is closely related to other techniques, especially FTA and
FMEA; see Figure 7.8.
Typical challenges that may be analyzed using the ETA are ignitions
of stored combustibles, epidemic outbreaks, utility system failures, tech-
nology needs, business competition, and others. ETA is simply a credible
system of analyzing operating permutations that lead to a success or fail-
ure. This, of course, is the classic Bernoulli model. Binary branching will
reveal unrecoverable failures and undefeatable successes leading to final
outcomes. Of course, after the ETA is formulated, other analyses such as
FTA may be necessary to determine the probability of an initiating event
or condition; see Figures 7.6 through 7.8.
A binary event tree starts from initiator i, with numbered branch nodes ending in success or failure; selected failure paths are developed as fault trees.
FIGURE 7.8
ETA and FTA relationship.
Example
Clemens (1990) presents a simple example of an ETA dealing with an anti-
flooding system (Figure 7.9). Figure 7.10 is a reliability diagram of the system
and Figure 7.11 shows its reliability and associated probabilities. A subgrade
compartment containing important control equipment is protected against
flooding by the system shown. Rising floodwaters will close float switch S,
powering pump P from an uninterruptible power supply. A klaxon (horn)
K sounds to alert operators to perform manual bailing B should pump P
fail. Pumping or bailing will dewater the compartment effectively. We will
assume flooding has commenced and analyze responses of the dewatering
system. The assumptions for this system are:
FIGURE 7.9
Anti-flooding system. [Diagram not reproduced; labeled components are pump P,
klaxon K, and bailing B.]
FIGURE 7.10
Reliability diagram of flooding system. [Diagram not reproduced; labeled blocks
are float switch S, pump P, klaxon K, and bailing B.]
Path sets: S/P; S/K/B
Cut sets: S; P/K; P/B
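As a rough cross-check of Figure 7.10, the path sets and cut sets of the
anti-flooding system can be enumerated by brute force. The Python sketch below
assumes the logic described in the example: the compartment is dewatered when
float switch S works and either pump P works or both klaxon K and bailing B
work. The function and variable names are illustrative and are not taken from
Clemens (1990).

```python
from itertools import combinations

COMPONENTS = ("S", "P", "K", "B")  # float switch, pump, klaxon, bailing


def system_works(up):
    """Assumed structure function; `up` is the set of working components."""
    return "S" in up and ("P" in up or {"K", "B"} <= up)


def minimal_sets(predicate):
    """Return the minimal subsets of COMPONENTS that satisfy `predicate`."""
    found = []
    for size in range(1, len(COMPONENTS) + 1):
        for combo in combinations(COMPONENTS, size):
            candidate = set(combo)
            # A superset of a set already found cannot be minimal.
            if any(previous <= candidate for previous in found):
                continue
            if predicate(candidate):
                found.append(candidate)
    return found


# Path set: the system works when only these components work.
path_sets = minimal_sets(system_works)

# Cut set: the system fails when only these components fail.
cut_sets = minimal_sets(
    lambda down: not system_works(set(COMPONENTS) - down)
)

print("Path sets:", ["/".join(sorted(s)) for s in path_sets])
print("Cut sets:", ["/".join(sorted(s)) for s in cut_sets])
```

Under these assumptions the enumeration yields the path sets S/P and S/K/B and
the cut sets S, P/K, and P/B, which agree with the sets listed with Figure 7.10.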
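FIGURE 7.11
ETA reliability diagram with associated probabilities. [Diagram not reproduced.
P_S, P_P, P_K, and P_B are the failure probabilities of the float switch, pump,
klaxon, and bailing. Branch probabilities are in parentheses and path
probabilities are in brackets.]
Water rises (1.0)
  Float switch fails (P_S): [P_S], Failure
  Float switch succeeds (1 - P_S)
    Pump fails (P_P): [P_P - P_P P_S]
      Klaxon fails (P_K): [P_K P_P - P_K P_P P_S], Failure
      Klaxon succeeds (1 - P_K): [P_P - P_P P_S - P_K P_P + P_K P_P P_S]
        Bailing fails (P_B): [P_B P_P - P_B P_P P_S - P_B P_K P_P
          + P_B P_K P_P P_S], Failure
        Bailing succeeds: [P_P - P_P P_S - P_K P_P + P_K P_P P_S - P_B P_P
          + P_B P_P P_S + P_B P_K P_P - P_B P_K P_P P_S], Success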
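The path probabilities in Figure 7.11 can be checked numerically. The Python
sketch below assumes that P_S, P_P, P_K, and P_B are independent failure
probabilities for the float switch, pump, klaxon, and bailing, and it multiplies
the branch probabilities along each path of the event tree. The function name
and the sample value of 0.1 are illustrative assumptions, not data from the
source.

```python
def flooding_event_tree(p_s, p_p, p_k, p_b):
    """Return end-state probabilities for the anti-flooding event tree.

    Each argument is a failure probability; independence is assumed.
    """
    switch_fails = p_s                      # [P_S]
    switch_ok = 1.0 - p_s

    pump_ok = switch_ok * (1.0 - p_p)       # pump dewaters the compartment
    pump_fails = switch_ok * p_p            # [P_P - P_P P_S]

    klaxon_fails = pump_fails * p_k         # operators are never alerted
    klaxon_ok = pump_fails * (1.0 - p_k)

    bailing_fails = klaxon_ok * p_b
    bailing_ok = klaxon_ok * (1.0 - p_b)    # manual bailing succeeds

    return {
        "success": pump_ok + bailing_ok,
        "failure": switch_fails + klaxon_fails + bailing_fails,
        "paths": {
            "switch fails": switch_fails,
            "pump succeeds": pump_ok,
            "klaxon fails": klaxon_fails,
            "bailing fails": bailing_fails,
            "bailing succeeds": bailing_ok,
        },
    }


result = flooding_event_tree(0.1, 0.1, 0.1, 0.1)
print(f"P(success) = {result['success']:.4f}")
print(f"P(failure) = {result['failure']:.4f}")
```

With every failure probability set to 0.1, the success paths sum to about 0.883
and the failure paths to about 0.117. The two totals add to 1, as they must when
the tree covers every outcome.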
References
AIAG, Ed. (2000). ISO Technical Specification 16949: Quality Management Systems.
Daimler Chrysler Corporation, Ford Motor Company, and General Motors
Corporation. Southfield, MI: Author.
AIAG. (2001). Potential Failure Mode and Effect Analysis, 3rd ed. Daimler Chrysler
Corporation, Ford Motor Company, and General Motors Corporation.
Southfield, MI: Author.
Bass, L. (1986). Product Liability: Design and Manufacturing Defects. Colorado Springs,
CO: Shepard/McGraw Hill.
Chrysler Motors. (1986). Design Feasibility and Reliability Assurance in FMEA. Highland
Park, MI: Author.
Clemens, P. (1990). Event Tree Analysis, 2nd ed. Sverdrup. http://www.fault-tree.net/
papers/clemens-event-tree.pdf
Dhillon, B. (1986). Human Reliability with Human Factors. Oxford, U.K.: Pergamon Press.
Dougherty, E., Jr. and J. Fragola. (1988). Human Reliability Analysis: A Systems Engineering
Approach with Nuclear Power Plant Applications. New York: John Wiley & Sons.
Ford Motor Company (1992). FMEA Handbook. Dearborn, MI: Author.
Ford Motor Company (2000). FMEA Handbook with Robustness Linkages. Dearborn, MI:
Author.
General Motors Corporation (1988). FMEA Reference Manual. Detroit, MI: Author.
Kirwan, B. and L. Ainsworth, Eds. (1992). A Guide to Task Analysis. New York: Taylor
& Francis.
Motorola Corporation (1992). Reliability and Quality Handbook. Phoenix, AZ: Author.
Omdahl, T. P., Ed. (1988). Reliability, Availability, and Maintainability Dictionary.
Milwaukee, WI: Quality Press.
Stamatis, D.H. (2003). Failure Mode and Effect Analysis (FMEA) from Theory to Execution,
2nd ed. Milwaukee, WI: Quality Press.
Swain, A. and H. Guttman. (1983). Handbook of Human Reliability Analysis with
Emphasis on Nuclear Plant Applications. U.S. Nuclear Regulatory Commission,
Report NUREG-CR-1278.
Selected Bibliography
Andrews, J. and S. Dunnett. (2000). Event tree analysis using binary decision dia-
grams. IEEE Transactions on Reliability, 49, 230–238.
Center for Chemical Process Safety. (2008). Guidelines for Hazard Evaluation Procedures,
3rd ed. New York: John Wiley & Sons.
Ericson, C. (2005). Hazard Analysis Techniques for System Safety. New York: John Wiley
& Sons.
Gibson, S. (1974). Reliability engineering applied to the safety of new projects.
Chemical Engineering, 306, 105.
Gould, J., M. Glossop, and A. Ioannides. (2000). Review of Hazard Identification
Techniques. Health and Safety Laboratory. HSL/2005/58
Henley, E. and H. Kumamoto. (1981). Reliability Engineering and Risk Assessment. New
York: Prentice Hall.
Henley, E. and H. Kumamoto. (1996). Probabilistic Risk Assessment and Management for
Engineers and Scientists, 2nd ed. New York: IEEE Press.
Kaplan, S. and B. Garrick. (1981). On the quantitative definition of risk. Risk Analysis,
1, 11–37.
Kletz, T. (1972). Specifying and designing protective system. Loss Prevention, 6, 15.
Lees, F. (2001). Loss Prevention in the Process Industries, 2nd ed., Vols. 1–3. Maryland
Heights, MO: Elsevier-Butterworth-Heinemann.
Papazoglou, I. (1998). Functional block diagrams and automated construction of
event trees. Reliability Engineering and System Safety, 61, 185–214.
Stamatis, D. H. (2003). Six Sigma and Beyond: Design for Six Sigma. Boca Raton, FL:
St. Lucie Press.
Stamatis, D.H. (2003). Six Sigma and Beyond: Design of Experiments. Boca Raton, FL:
St. Lucie Press.
8
Teams and Team Mechanics
This chapter covers the basic aspects of teams and how team actions affect
both HAZOP and FMEA results. The information in this chapter does not
represent an exhaustive examination of teams but does cover issues related
to both methodologies. To achieve best results, a risk analysis (HAZOP,
FMEA, FTA, etc.) must be written by a team. This is because a risk analysis
should act as a catalyst to stimulate interchanges of ideas among the groups
affected (Stamatis 1991). A typical view is shown in Figure 8.1.
A single engineer or other individual cannot perform a risk analysis. A
team should consist of five to nine people (preferably five). All team members
must have some knowledge of team behavior, the task at hand, the problem
to be discussed, and direct or indirect ownership of the problem. Above all,
they must be willing to contribute. Team members must be cross‑functional
and represent varied disciplines. Furthermore, whenever possible, custom-
ers and/or suppliers should participate as ad hoc members.
FIGURE 8.1
Team overview. [Diagram not reproduced. Labels include: Why teams?; What
outcomes are expected?; appreciation of cross-functional and multidiscipline
aspects of the organization; deadbeats confronted (peer pressure); pride in
product; customer awareness; respect for diversity; company product; employee
enthusiasm; how-tos; active learning; intense work with people from other
backgrounds; forced leadership.]
FIGURE 8.2
Team performance factors. [Diagram not reproduced; labels are Organization,
Team, and Individual.]
1. Organization
a. Philosophy
b. Rewards
c. Expectations
d. Norms
2. Team
a. Meeting management
b. Roles and responsibility
c. Conflict management
d. Operating procedures
e. Mission statement
3. Individual members
a. Self‑awareness
b. Appreciation of individual differences
c. Empathy
d. Caring
• Work
• Task complexity
• Productivity and quality advantages
• Work system stability
• People
• Rising expectations
• Affiliation needs
• Increased cognitive ability
• Specific time-related concerns
• Future directions
• Survival in a global market
All teams, regardless of application, must be familiar with problem-solving steps:
HAZOP Team
An effective hazard incident investigation and analysis program generally
has two major components: technical and human. The technical side of the
investigation is where most of the literature focuses, particularly on root
cause analysis. However, the human aspects of incident investigation do not
receive the same degree of attention. An effective investigator understands
how people think and behave. He or she must be able to communicate with
a wide range of team members and management levels. Chapter 10 covers
effective communication processes.
Technicians
Most hazard incidents involve front-line technicians (operators and main-
tenance workers), some of whom may have been injured or emotionally
shaken. These people will often feel defensive and upset and may feel guilty
if any colleagues were injured or died.
Technicians often may not understand what caused an incident and worry
that they will be blamed. An effective investigator encourages these
front-line technicians to be open and candid, primarily by letting them talk
without interruption. Unfortunately, many investigators, even those with years
of experience, are quick to interrupt a technician’s narrative flow with
questions, war stories, or snap judgments about the event. An investigator should
also clearly state that the goal of the investigation is to learn what happened,
not to apportion blame or show how smart the investigator is.
Mid-Level Managers
Most investigations find that changes are needed at the facility’s mid-level
management systems. Examples of such changes include an increased
Senior Managers
Many investigators find that technicians are candid and open and mid-level
managers are generally willing to honestly address the need for improvements
to systems. What investigators sometimes learn, however, is that senior man-
agers can be resistant to the findings and implications of an investigation.
The findings may indicate that systemic changes to management systems are
required. Senior managers in charge of such systems may become very defensive
about holding onto their ideas. An effective investigator will know how
to communicate with senior managers and obtain their buy-in. This ability
is critical because senior managers provide the funding needed to implement
the investigation’s recommendations.
An additional concern about the involvement of senior managers is that
they are usually strong personalities; they may try to take over an investiga-
tion and direct it to meet their own opinions, goals, and agendas. A strong
investigator is able to resist these efforts.
A HAZOP team will typically consist of five to nine people. Team members
should be cross-functional and possess a range of relevant skills to ensure all
aspects of the plant and its operations are covered. All relevant engineering
disciplines, management, and plant operating staff should be represented.
This will help prevent possible events from being overlooked through lack
of expertise and awareness.
A fundamental difference between regular and HAZOP teams is that a
regular team operates under the direction of a leader who may be selected
by members and can facilitate meetings. A HAZOP team operates under
the direction of a chairperson who must be knowledgeable about HAZOP
techniques and has experience in conducting HAZOPs. This will ensure that
the team follows the procedure without diverging or taking shortcuts. If a
HAZOP is required as a condition of development consent, the name of the
chairperson is usually submitted to the regulators and/or other authority that
approves the commencement of the exercise.
Consensus
Consensus is a collective decision reached through active participation by
all members who have personal ownership in the decision. It requires all
To recognize consensus, the team members must answer yes to four questions:
Errors will occur if a team continues to meet without a process check. Errors
may be prevented by testing, training, and auditing. Some of the most com-
mon errors are:
• Results of misunderstandings
• Inadequate information
• Incomplete data because form is too difficult to complete
• Incomplete or biased data caused by fear
• Failure to use existing data
Member who talks too much: If a discussion turns into a dialogue between
the leader and an overly talkative individual, the other members will lose
interest. Even if the talkative individual has something of value to say, the
team leader should not let him or her monopolize the discussion. Tactful
approaches must be used to divert the discussion to others. If the leader
knows that one member of the team likes to dominate, he or she should pose
questions to the group without looking at the talkative individual or ignore
the individual’s responses.
The leader of an FMEA or HAZOP team should want all members to participate.
If one member dominates the discussion because he or she has
more experience or education, the leader should utilize that individual as a
resource and a coach for other team members.
A talkative person may simply be trying to make an impression to satisfy
his or her own ego. The only way to handle such an individual is to advise
him or her in advance that the group disapproves of such behavior and con-
tinue exerting team pressure as needed.
A talkative person may interfere by taking too much time to express his
or her ideas. This is a very sensitive area because the person participates as
expected but also annoys the rest of the team. The leader must handle this
situation delicately. If the situation is mishandled, the talkative participant
will lose self-confidence and ultimately withdraw. Usually, it is better to tol-
erate a certain amount of this behavior rather than discourage the individual
too much.
Another case is the talkative person who starts a private conversation with
a neighbor. A leader can eliminate this problem by asking a direct question
of those involved in the conversation or by keeping the team large enough to
generate a variety of ideas but small enough to discourage cliques.
A team usually consists of five to nine persons; a five-person team is the
most common.
Member who talks too little: Members may not want to participate, may
feel out of place, or may not understand the problem discussed. It is the
responsibility of the leader to actively draw this individual into the
discussion by direct questions at meetings or by attempting to motivate the
individual outside the team environment with statements such as “We need your
input,” “We value your contribution,” or “You were selected for the team
because of your experience and knowledge.”
Member who strays from agenda: This problem is common in a team
environment (especially early in team development) where individuals want
to talk about their own agendas instead of the issues facing the team. It is
the responsibility of the leader to bring the discussion back to the meeting
agenda and/or outline. On rare occasions the leader may want to involve the
whole team by asking whether the off-agenda item should be “taken up right
now” or “sent to the ‘parking lot’ for future discussion.”
Problem Solving
This section will not discuss problem-solving tools and methods. It focuses
on helping teams to understand the mechanics and rationale for pursuing
methods to eliminate and/or reduce problems. Detailed descriptions of tools
may be found in basic Statistical Process Control (SPC) books, statistical lit-
erature, and/or organizational development sources. For a good review of
the process, readers may want to see Stamatis (2002, 2003).
For most people, the prediction or onset of a problem indicates a need for a
change in behavior. When an individual or a team is actually or potentially
in trouble, a unique set of strategies is required to trigger at least a temporary
change in behavior—a new course of action. Without a deliberate strategy
for pursuing a new course of action, revised behaviors may make a situa-
tion worse.
Problems are often not clear to the individuals who experience them. It is
difficult to isolate a problem and its related components. Even if this is pos-
sible, the selection and implementation of a solution involves some physical
or psychological risk. Familiar patterns of behavior are safe. In a problem sit-
uation, a person is torn between the need to change and the desire to main-
tain the old patterns. This conflict produces strong emotions and anxieties
that affect the cognitive processes required to make workable decisions. If a
problem is sufficiently severe, cognitive paralysis may result (Pfeiffer 1991).
People and teams who are in trouble need useful tools to help them under-
stand their problem situations, decide their courses of action, and manage
the new directions chosen. The components of a generic model of problem
identification and solving are:
conflicts that surround most human problems. Furthermore, they are of little
use in attacking problems that people consider intangible. The techniques
may be useful for evaluating an alternative business plan or buying a new
washing machine, but offer little help in interpersonal problems. Therefore,
this section discusses methods of incorporating human issues into the
problem-solving process.
Meeting Planning
Before a team actively works on a project, some preliminary steps must be
followed. The first step is planning the meeting. Bradford (1976); Nicoll (1981);
Schindler‑Rainman, Lippitt, and Cole (1988); and Stamatis (1991) identified
the following concerns.
People: Meeting participants may differ in values, attitudes, experience,
sex, age, and education. All these differences must be considered
in planning a meeting.
Purpose: The purpose, objective, and goal of the meeting must be under-
stood by all participants and by management.
Atmosphere: The comfort of participants contributes to the effectiveness
of a meeting. It is imperative that the meeting planner consider the climate
and atmosphere.
Place and space: Planners must consider (1) access to meeting space; (2) size
of space; (3) acoustics, lighting, and temperature control; (4) costs; (5) equip-
ment required; and (6) available parking.
Costs: Cost is of paramount importance. The preparation of a risk analy-
sis through PHA, HAZID, HAZOP, FMEA, FTA, etc., is a lengthy process.
Another consideration is that the system, design, process, and service per-
sonnel involved in the project may be in different and distant places.
Time considerations: How long will this activity take? Is an alternate
schedule available? Can the participants be spared for this task? Without
evaluating time constraints and recognizing that a meeting may be pro-
longed for an unexpected reason, the agenda items and objectives may suffer.
Pre- and post-meeting work: The amount of work resulting from a meet-
ing is related directly to the amount of planning that preceded the meeting.
Lengthy and complex tasks may require major portions of work to be per-
formed outside the meeting and only reviewed by participants at the meeting.
Plans, program, and agenda: An agenda is essential and no meeting can
proceed without one. A detailed program or agenda distributed to all partic-
ipants ensures effectiveness and prevents surprises. An agenda should cover
all objectives of the meeting.
• Struck by
• Struck against
• Caught between
• Contact with
• Contacted by
• Caught in
• Caught on
• Fall from same level
• Fall to lower level
• Overexertion
• Exposure
After basic hazards are identified, the team should consider the categories
to which the hazards in question belong. The major categories are:
At this stage, the team should have a good understanding of the hazard
and have developed ideas about resolving it. The team has two options:
(1) eliminate the risk or (2) reduce it. In both cases, the team has three choices
of corrective actions:
1. State and restate the initial question until everyone agrees on the
issue to be discussed.
2. Solicit participants’ honest opinions at the outset.
3. Think of opinions as hypotheses; test them instead of arguing
over them.
4. Plan a method of testing opinions against reality by reviewing the
issue and the goal.
5. Establish a rule that additional information revealed at a meeting
must be relevant to agenda topics.
6. Encourage disagreements and differences of opinions.
7. Do not judge others’ opinions hastily. Learn to appreciate diverse
points of view.
8. Encourage members’ commitments to resolving the issue when-
ever possible.
9. Compromise as needed.
10. Ask whether a decision is necessary. Remember that choosing to do
nothing is a legitimate choice.
11. Construct a process for feedback to determine whether a decision
was successful.
Awareness of these traps can help a meeting facilitator avoid them. Constructive
confrontation is an effective technique for dealing with many disruptive and
dysfunctional meeting behaviors. A meeting leader who chooses to confront
must discuss only behavior, not the participant. More desirable behaviors
should be suggested in a direct but caring way. Jones (1980) suggests two
approaches to dealing with disruptive meeting participants. The first requires
the meeting leader to communicate directly with the disruptive person by:
• Turning his or her questions into statements, thus forcing the person
to take responsibility for his or her opinion.
• Refusing to engage in a debate. By noting that debates have winners
and losers, the leader should promote a win–win outcome.
• Suggesting that the leader and disruptive person swap roles to dem-
onstrate the effect of the disruptive person on the group.
• Using active listening techniques to mirror a participant’s feelings,
for example, “You seem upset today, especially when I disagree with
you.” We have two ears and only one mouth. Therefore, listen twice
as hard as you speak.
• Agreeing with the person’s need to be heard and supported.
The second approach suggested by Jones treats the other meeting partici-
pants as allies against the disruptive person:
teams for discussion, less assertive members often become more willing to
participate. A small team is not as likely to wander off the subject as a large
team. Because fewer people compete for attention in a small team, members
feel a stronger sense of commitment. Finally, small teams can defuse aggressive
members’ tendencies to dominate discussions.
Meeting leaders will find that their meetings will become more interest-
ing, lively, and balanced if they follow the guidelines presented in this sec-
tion. The core points to remember are that all meeting participants must
be treated equally; honesty must be the norm; and all opinions must be
respected (Stamatis 1992).
References
Allmendinger, G. (1990). Performance measurement: impact on competitive
performance. Technology, December, 10–13.
Bradford, L.P. (1976). Making Meetings Work: A Guide for Leaders and Group Members.
San Diego, CA: University Associates.
Jones, J. E. (1980). Dealing with disruptive individuals in meetings. In Pfeiffer, J. S.
and E. Jones, Eds., The 1980 Annual Handbook for Group Facilitators. San Diego,
CA: University Associates.
Mosvick, R. K., and R. B. Nelson. (1987). We’ve Got to Start Meeting Like This! A Guide to
Successful Business Meeting Management. Glenview, IL: Scott, Foresman.
Nicoll, D. R. (1981). Meeting management. In Pfeiffer, J. S. and E. Jones, Eds., The 1981
Annual Handbook for Group Facilitators. San Diego, CA: University Associates.
Pfeiffer, J. W., Ed. (1991). Theories and Models in Applied Behavioral Science: Management
Leadership, Vols. 2–3. San Diego, CA: University Associates.
Schindler‑Rainman, E., R. Lippitt, and J. Cole. (1988). Taking Your Meetings out of the
Doldrums. San Diego, CA: University Associates.
Stamatis, D. H. (2003). Six Sigma and Beyond: Statistical Process Control. Boca Raton, FL:
St. Lucie Press.
Stamatis, D. H. (2002). Six Sigma and Beyond: Problem Solving and Basic Mathematics.
Boca Raton, FL: St. Lucie Press.
Stamatis, D. H. (1987). Conflict: you’ve got to accentuate the positive. Personnel,
December, 47–50.
Stamatis, D. H. (1991). Team Building Training Manual. Southgate, MI: Contemporary
Consultants.
Stamatis, D. H. (1992). Leadership Training Manual. Southgate, MI: Contemporary
Consultants.
9
OSHA Job Hazard Analysis
TABLE 9.1
Specific Hazards by Categories

Type: Electrical
Description: Electrical hazard is any use of electrical power that results in
electrical overheating or arcing to the point of combustion or ignition of
flammables or electrical component damage. It may also involve moving or
rubbing wool, nylon, other synthetic fibers, or flowing liquids that can
generate static electricity. An excess or deficiency of electrons on a material
surface discharges (sparks) to the ground, leading to the ignition of
flammables, damage to electronics, and damage to the human nervous system. A
typical electrical hazard is a critical equipment failure resulting from a
power loss that causes ergonomic harm (strains and sprains) due to overexertion
or repetitive motion.
Ergonomics may affect and effect system design, process design, procedures, and
equipment. The focus in ergonomic design is to eliminate and/or minimize
situations leading to human errors, for example, designing a switch that
clearly indicates on and off conditions.

Type: Mechanical
Description: Mechanical hazards occur when devices fail because they operate
beyond capacity or are inadequately maintained. Mechanical issues may affect
skin, muscle, or other body parts exposed to impact, crushing, cutting,
tearing, shearing and may be viewed as equipment failures. Other failures are
noise levels exceeding 85 dBA (8-hour time weighted average) that lead to
hearing damage or inability to communicate safety-critical information;
ionizing radiation (alpha, beta, gamma, and neutral particles and x-rays that
cause damage of cellular components); non-ionizing radiation (visible,
ultraviolet, infrared, and microwaves) that injure tissues by thermal or
photochemical means.
Other examples are soil collapses in trenches or excavations resulting from
improper or inadequate shoring. Soil type is critical in determining hazard
likelihood. Fall conditions (slippery floors, poor housekeeping, uneven
surfaces) lead to slips and trips. Fire, heat, and extreme temperatures can
cause burns and other organ damage. A fire requires a heat source, fuel, and
oxygen. Vibration can damage nerve endings. Vibration or material fatigue can
cause a safety-critical failure. Examples are abraded slings and ropes and
weakened hoses and belts.
Other mechanical hazards are falling objects and projectiles that can cause
injury or death by striking a body part, for example, a screwdriver that slips;
temperatures that result in heat stress, extreme exhaustion, or metabolic
slow-down (hypothermia); poor visibility (inadequate lighting or obstructed
vision) that results in an error or accident. Weather is an obvious mechanical
hazard.

Type: Chemical
Description: A chemical hazard involves absorption of a toxic material through
the skin, lungs, or bloodstream that causes illness or death.
The amount of exposure is critical in determining hazardous effects. MSDSs and
OSHA 1910.1000 provide chemical hazard data.
Some chemicals combust when exposed to a heat ignition source. Typically, the
lower a chemical’s flash point and boiling point, the more flammable it is.
Check MSDSs for flammability information.
Corrosive chemicals such as acids and bases cause skin damage and may destroy
surrounding materials.
A chemical alone or a chemical reaction may also cause an explosion (a sudden
and violent release of a large amount of gas and/or energy due to a significant
pressure difference), for example, the rupture of a boiler or compressed gas
cylinder. An explosion may be triggered by contact with an exposed electric
conductor or a short circuit, for example, a metal ladder that touches power
lines. The common domestic 60 Hz alternating current can stop a heart.
A job hazard analysis can be conducted on many jobs but certain types of
jobs have high priorities:
Traditionally, there are five ways to evaluate a starting point for a JHA.
Involve your employees: It is very important to involve employees in the
hazard analysis process. They have a unique understanding of the job and
their knowledge is invaluable for finding hazards. Involving employees
will help minimize oversights, ensure a quality analysis, and obtain worker
buy-in to solutions because they will share ownership in their safety and
health program.
Review your accident history: Review with your employees the work site’s
history of accidents and occupational illnesses that needed treatment, losses
that required repairs or replacements, and near-misses (incidents that did
not cause accidents or losses but could have). These events are indicators that
existing hazard controls may not be adequate and deserve more scrutiny.
Conduct a preliminary job review: Talk with your employees about the
hazards in their work areas. Brainstorm with them for ideas to eliminate or
control the hazards. If any hazards pose immediate dangers to an employ-
ee’s life or health, take immediate action. Any problems that can be corrected
easily must be corrected as soon as possible. Do not wait for completion of
a job hazard analysis. This will demonstrate your commitment to safety
and health and enable you to focus on the hazards and jobs that need more
study because of their complexity. Evaluate types of controls for hazards
determined to present unacceptable risks. Some typical hazard controls are
shown in Table 9.2.
Information obtained from a job hazard analysis is useless unless hazard
control measures recommended in the analysis are implemented. Managers
should recognize that not all hazard controls are equal. Some are more effec-
tive than others at reducing risks. The order of precedence and effectiveness
of hazard control is typically as outlined in Table 9.2.
The selection of one hazard control method over another higher in the con-
trol precedence may be appropriate for providing interim protection until a
hazard is abated permanently. In reality, if a hazard cannot be eliminated
entirely, the adopted control measures will likely be a combination of all
three types of control measures instituted simultaneously.
TABLE 9.2
Hazard Categories and Controls

Control category: Engineering
Examples:
• Elimination or minimization of hazard by designing facility, equipment,
  or process to remove the hazard, or substituting processes, equipment,
  materials, or other factors to lessen it
• Enclosure of hazard (enclosed cabs, enclosures for noisy equipment) or
  other means
• Isolation of hazard with interlocks, machine guards, blast shields, welding
  curtains, or other means
• Removal or redirection of hazard, e.g., improved local and exhaust
  ventilation

Control category: Administrative
Examples:
• Written operating procedures, work permits, and safe work practices
• Limits on exposure times (usually to control temperature extremes and
  ergonomic hazards)
• Monitoring uses of highly hazardous materials
• Installation of alarms, signs, and warnings
• Implementation of buddy system
• Train employees to find and avoid hazards

Control category: Personal protective equipment (PPE)
Examples:
• Respirators, hearing protection, protective clothing, safety glasses, and
  hard hats; PPE is acceptable as a control method (1) if engineering
  controls are not feasible or do not totally eliminate a hazard; (2) while
  engineering controls are being developed; (3) if safe work practices do
  not provide sufficient protection; and (4) during emergencies when
  engineering controls may not be feasible
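The order of precedence in Table 9.2 can be expressed as a simple sort. The
Python sketch below assumes that the three categories rank in the order the
table lists them (engineering first, PPE last). The names ControlCategory and
by_precedence are hypothetical, and the candidate controls are drawn from the
examples in the table.

```python
from enum import IntEnum


class ControlCategory(IntEnum):
    """Control categories in the order of precedence given in Table 9.2."""
    ENGINEERING = 1
    ADMINISTRATIVE = 2
    PPE = 3


def by_precedence(controls):
    """Sort (category, description) pairs, highest-precedence category first.

    sorted() is stable, so controls in the same category keep their
    original relative order.
    """
    return sorted(controls, key=lambda control: control[0])


# Candidate controls, listed in no particular order.
candidates = [
    (ControlCategory.PPE, "Hearing protection"),
    (ControlCategory.ADMINISTRATIVE, "Limits on exposure times"),
    (ControlCategory.ENGINEERING, "Enclosure for noisy equipment"),
]

ranked = by_precedence(candidates)
for category, description in ranked:
    print(f"{category.name}: {description}")
```

A ranking of this kind only orders the options. As the text notes, a lower
ranked control may still be chosen for interim protection, and several
categories are often applied together.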
List, rank, and set priorities for hazardous jobs: List jobs with hazards
that present unacceptable risks starting with those most likely to occur and
those presenting the most severe consequences. These jobs should be your
first priority for analysis.
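One way to rank hazardous jobs is to score each job on likelihood and severity
and sort on the result. The Python sketch below assumes 1 to 5 scales and a
score equal to likelihood times severity. Both are common conventions but
neither is prescribed by the text, and the job names and ratings are
hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    severity: int    # assumed scale: 1 (minor) to 5 (catastrophic)

    @property
    def risk_score(self) -> int:
        """Assumed scoring rule: likelihood multiplied by severity."""
        return self.likelihood * self.severity


def prioritize(jobs):
    """Return jobs ordered from highest to lowest risk score.

    Ties are broken in favor of the more severe consequence.
    """
    return sorted(
        jobs,
        key=lambda job: (job.risk_score, job.severity),
        reverse=True,
    )


# Hypothetical jobs and ratings, for illustration only.
jobs = [
    Job("Stocking shelves", likelihood=3, severity=1),
    Job("Clearing snags on a running machine", likelihood=4, severity=5),
    Job("Trench excavation", likelihood=2, severity=5),
]

for job in prioritize(jobs):
    print(f"{job.risk_score:>2}  {job.name}")
```

The ratings themselves still require the judgment of the team and the
employees who perform the work; the sort only makes the resulting priorities
explicit.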
Outline the steps or tasks: Most jobs can be broken down into tasks or
steps. When beginning a job hazard analysis, watch an employee perform the job
and list each step he or she takes. Record enough information to describe
each action without excessive detail. Avoid making the breakdown of
steps so detailed that it becomes unnecessarily long or broad and does not
describe basic steps. It may be valuable to obtain input from other work-
ers who perform the same job. After the observation step, review the steps
with the employee to make sure no steps are omitted. Inform the employee
that you are evaluating the job and not his or her performance. Include the
employee in all phases of the analysis from reviewing the job steps to dis-
cussing uncontrolled hazards and recommended solutions. In conducting a
job hazard analysis, it may be helpful to photograph or videotape the worker
performing the job. These visual records can be helpful in conducting a more
detailed analysis of the work.
After the five steps have been addressed, job hazard analysis becomes an
exercise in detective work to determine:
Table 9.3 is a typical hazard analysis form that helps organize the pertinent
and relevant information. Rarely does a hazard arise from a single cause
and produce a single effect. Usually many factors line up in a certain way to
create a hazard.
TABLE 9.3
Hazard Analysis Form
Job title
Job location
Analysis date
Task description
Hazard description
Consequences
Hazard controls
Rationale or additional comments
Here is a simple example of a hazard scenario in a metal shop (environment).
While clearing a snag (trigger), a worker’s hand comes into contact
(exposure) with a rotating pulley that pulls his hand into the machine and
quickly severs his fingers (consequences). In a job hazard analysis, the above
items are considered:
• What can go wrong? The worker’s hand may come into contact with a
rotating object that “catches” the hand and pulls it into the machine.
• What are the consequences? The worker may be severely injured
and lose hands and fingers.
• How could it happen? The accident resulted when the worker tried
to clear a snag during operation or as a maintenance activity while
the pulley rotated during operation. Obviously, this hazard scenario
could not occur if the pulley was not rotating.
• What are other contributing factors? This hazard occurs quickly; the
worker has no opportunity to recover or prevent injury once his
hand comes into contact with the pulley. This is an important factor,
because it helps determine the severity and likelihood of an accident
and the selection of appropriate hazard controls. Unfortunately,
experience has shown that training is not very effective in hazard
control when triggering events happen quickly because humans can
react only so quickly.
• How likely is it that the hazard will occur? This answer requires
some judgment. If previous incidents or near-misses occurred, the
likelihood of recurrence is high. If the pulley is exposed and easily
accessible, that raises another consideration. In the example, the
likelihood that the hazard will occur is high because no mechanical
guard prevents the contact of hand and pulley and the operation is
performed while the machine is running.
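The elements of the hazard scenario above (environment, exposure, trigger, consequence, and contributing factors) can be captured in a small record so that no element is left out when a scenario is written up. This is a minimal sketch; the class name, field names, and the completeness check are assumptions for illustration, not an OSHA form.

```python
from dataclasses import dataclass, field


@dataclass
class HazardScenario:
    environment: str   # where the hazard arises
    exposure: str      # who or what comes into contact with the hazard
    trigger: str       # what precipitates the hazard
    consequence: str   # outcome if the hazard is realized
    contributing_factors: list = field(default_factory=list)

    def missing_elements(self):
        """Names of required elements that were left blank."""
        required = ("environment", "exposure", "trigger", "consequence")
        return [name for name in required if not getattr(self, name).strip()]


# The metal shop example from the text.
pulley = HazardScenario(
    environment="metal shop",
    exposure="worker's hand contacts a rotating pulley",
    trigger="clearing a snag while the machine is running",
    consequence="hand is pulled into the machine and fingers are severed",
    contributing_factors=[
        "hazard occurs too quickly for the worker to react",
        "no mechanical guard between hand and pulley",
    ],
)

incomplete = HazardScenario("metal shop", "", "clearing a snag", "")

print(pulley.missing_elements())
print(incomplete.missing_elements())
```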
The steps in this example allow us to organize hazard analysis activities. The
next example shows how a JHA can identify existing or potential hazards
for each basic step involved in grinding iron castings. The job involves three
steps. Table 9.4 shows the analysis.
Step 1: Reach into metal box to right of machine, grasp casting, and
carry to wheel.
Step 2: Push casting against wheel to grind off burr.
Step 3: Place finished casting in box to left of machine.
After reviewing the list of hazards with the employee, consider what con-
trol methods will eliminate or reduce them. The most effective controls are
engineering modifications that change a machine or work environment to
prevent employee exposure to the hazard (a mistake-proof approach). The
more reliable a hazard control is and the less likely it is to be circumvented,
the better. If this is not feasible, administrative controls may be appropriate, for example,
TABLE 9.4
Hazard Analysis of Grinding Castings
Job Location: Metal Shop
Name of Analyst: Stacey Robinson
Date: 1/5/13
Step 1
Task description: Worker reaches into metal box to right of machine, grasps
15-pound casting; carries it to grinding wheel; grinds 20 to 30 castings per hour
Hazard description: Worker could drop casting onto his foot; casting size and
weight and height could seriously injure foot or toes
Hazard controls: (1) Worker removes castings from box and places them on table
next to grinder; (2) wears steel-toe shoes with arch protection; (3) wears
protective gloves that allow better grip; (4) uses device to pick up castings
Step 2
Task description: Worker reaches into metal box to right of machine, grasps
15-pound casting; carries it to grinding wheel; grinds 20 to 30 castings per hour
Hazard description: Castings have sharp burrs and edges that can cause severe
lacerations
Hazard controls: (1) Worker uses clamp or other device to pick up castings;
(2) wears cut-resistant gloves that allow good grip and fit tightly to minimize
the chance of being caught in grinding wheel
Step 3
Task description: Worker reaches into metal box to right of machine, grasps
15-pound casting; carries it to grinding wheel; grinds 20 to 30 castings per hour
Hazard description: Reaching, twisting, and lifting 15-pound castings from floor
level could strain muscles of lower back
Hazard controls: (1) Move castings from floor level and place them closer to work
zone to minimize lifting; ideally place them at waist height or on adjustable
platform or pallet; (2) train workers not to twist while lifting and reconfigure
work stations to minimize twisting during lifts
Note: Use similar form for each job step.
failure to follow proper job procedures leads to a close call, discuss the situ-
ation with all employees who perform the job and remind them of proper
procedures. Whenever a JHA is revised, it is important to train all employees
affected by the changes in the methods, procedures, or protective measures.
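The order of preference for controls described earlier in this chapter (engineering controls first, then administrative controls, with PPE where those are not feasible or sufficient) can be sketched as a sort over candidate controls. The enum, the function name, and the classification of each Table 9.4 control into a type are assumptions made for illustration.

```python
from enum import IntEnum


class ControlType(IntEnum):
    ENGINEERING = 1      # change the machine or work environment
    ADMINISTRATIVE = 2   # change how the work is done (procedures, training)
    PPE = 3              # protect the worker when the hazard remains


def order_controls(controls):
    """Sort (description, ControlType) pairs, most preferred type first."""
    return sorted(controls, key=lambda pair: pair[1])


# Controls taken from Table 9.4; the type assigned to each is an assumption.
candidate_controls = [
    ("Wear steel-toe shoes with arch protection", ControlType.PPE),
    ("Train workers not to twist while lifting", ControlType.ADMINISTRATIVE),
    ("Place castings at waist height on an adjustable platform",
     ControlType.ENGINEERING),
]

ordered = order_controls(candidate_controls)
for description, kind in ordered:
    print(f"{kind.name:<15}{description}")
```

In practice several control types are often combined; the ordering only indicates which option to consider first.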
In recent years, hazard analysis has become important to many organizations. An
analysis should be appropriate, applicable, and accurate, especially if the pro-
cesses involved are complex. When complexity is an issue, help is available from
professional consultants, the organization’s insurance carrier, and the local fire
department. OSHA offers assistance and consultation services through its
regional and area offices (contact numbers may be found at www.osha.gov).
Organizations must understand that despite the availability of outside
help, their employees must play a role in identifying and correcting hazards
because employees are in the workplace every day and are most likely
to encounter the hazards. New circumstances and a recombination of existing
circumstances may cause old hazards to reappear and new hazards to
emerge. Management and employees must be ready, willing, and able to
implement whatever hazard elimination or control measures a professional
consultant recommends.
It is worth noting that OSHA can provide extensive help through a vari-
ety of safety and health programs, plans, workplace consultations, volun-
tary protection programs, strategic partnerships, training, education, and
more. In addition to helping employers identify and correct specific hazards,
OSHA’s consultation service provides free on-site assistance in developing
and implementing effective workplace safety and health management sys-
tems focused on preventing worker injuries and illnesses. The comprehen-
sive assistance provided by OSHA includes a worksite hazard survey and
appraisals of all aspects of existing safety and health management measures.
OSHA helps employers develop and implement effective safety and health
management systems. Employers also may receive training and education
services and limited assistance away from their worksites.
OSHA awards grants through its Susan Harwood Training Grant Program
to nonprofit organizations to provide safety and health training and educa-
tion in the workplace. The grants focus on educating workers and employers
in small businesses (fewer than 250 employees) and training workers and
employers on new standards for high-risk activities and hazards. Grants are
awarded for 1 year and may be renewed for a second year, based on satisfactory
results. OSHA expects each organization awarded a grant to develop a train-
ing program that addresses a safety and health topic named by OSHA, recruits
workers and managers for training, and conducts the training. Grantees
are also expected to follow up with people who were trained to learn what
changes were made to reduce the hazards in their workplaces as a result of the
training. Each year OSHA holds a national competition. Details appear in the
Federal Register and on the Internet (www.osha-slc.gov/Training/sharwood/
sharwood.html). The OSHA Office of Training and Education is at 1555 Times
Drive, Des Plaines, IL 60018, (847) 297-4810.
Reference
http://www.osha.gov/Publications/osha3071.pdf
Selected Bibliography
Bancroft, K. (2002). Job hazard analysis for unsafe acts. Occupational Health and Safety,
71, 206–215.
Clemens, P. and T. Pfitzer. (2006). Risk assessment and control. Professional Safety, 51,
41–44.
Geronsin, R. (2001). Job hazard assessment: a comprehensive approach. Professional
Safety, 46, 23–30.
Morris, J. and J. Wachs. (2003). Implementing a job hazard analysis program. AAOHN
Journal, 51, 187–193.
Occupational Safety and Health Administration. (2002). Job Hazard Analysis.
Publication 3071. Washington, DC: Author.
Rozenfeld, O., R. Sacks, Y. Rosenfeld et al. (2010). Construction job safety analysis.
Safety Science, 48, 491–498.
Swartz, G. (2002). Job hazard analysis. Professional Safety, 47, 27–33.
U.S. Army Corps of Engineers (2008). Safety and Health Requirements Manual
(EM 385-1-1). Washington, DC: Government Printing Office.
U.S. Department of the Army (2010). Army Safety Program (385-10). Washington, DC.
U.S. Department of the Army (2006). Composite Risk Management (FM 100-14).
Washington, DC.
10
Hazard Communication Based
on Standard 29 CFR 1910.1200
Members
HMCC members are workers who may come from engineering, medical,
health and safety, production, or research areas. If an operation is unionized,
a health and safety technician and a union industrial hygiene representative
will also be members. The HMCC also obtains advice from experts in safety,
industrial hygiene, firefighting, toxicology, and medicine.
Responsibilities
The major task of the HMCC is to ensure the success of the hazard
communication program. The committee is responsible for (1) developing a
written hazard communication program specific to the facility and (2) approving
all chemical materials and processes. The committee uses MSDSs to decide
whether new chemicals are safe. If the HMCC does not approve a chemical,
it may not be used.
Employee Training
An important part of a hazard communication program is employee train-
ing. Employees learn:
Additional training may be required if (1) new physical and/or health hazards
for which employees have not been previously trained are introduced;
(2) existing chemicals are used in a new way; (3) employees are assigned
new jobs that involve chemicals for which they have not been trained; (4) an
employee performs a non-routine (rarely performed) task; or (5) new hazard
information about a chemical material becomes available. A typical safe use
training program covering chemicals may include the following:
• Program overview
• Understanding hazards
• Detecting and evaluating hazards
• Controlling hazards
• Safe use category system
• Halogenated solvents
• Solvents with flashpoints below 100°F
The first five topics generally constitute an overview of the hazard
communication program. They cover various ways chemicals can be hazardous
and how hazards can be controlled and explain the safe use category system.
All employees must receive training on the first five topics as a minimum.
The next 20 items provide information on each safe use category, includ-
ing explanations of the categories, possible hazards, and safe use. Employees
receive training on the categories specific to the chemicals to which they may
be exposed. For the training to be successful, all employees must participate.
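The training rule described above (every employee receives the first five topics, plus the safe use categories for the chemicals to which he or she may be exposed) and the retraining conditions can be sketched as follows. The function names and the short trigger labels are hypothetical; only the topic names and the five conditions come from the text.

```python
CORE_TOPICS = [
    "Program overview",
    "Understanding hazards",
    "Detecting and evaluating hazards",
    "Controlling hazards",
    "Safe use category system",
]

# Short labels (assumed) for the five conditions listed in the text.
RETRAINING_TRIGGERS = {
    "new_hazard_introduced",
    "chemical_used_in_new_way",
    "new_job_with_untrained_chemicals",
    "non_routine_task",
    "new_hazard_information",
}


def required_training(exposure_categories):
    """Core topics for everyone, then each relevant safe use category once."""
    topics = list(CORE_TOPICS)
    for category in exposure_categories:
        if category not in topics:
            topics.append(category)
    return topics


def needs_additional_training(events):
    """True if any reported event matches one of the retraining conditions."""
    return bool(RETRAINING_TRIGGERS & set(events))


print(required_training(["Halogenated solvents"]))
print(needs_additional_training(["non_routine_task"]))
```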
Employee Access
The third element of a hazard communication program is employee access.
All employees who work with or have potential exposure to chemicals have
the right to access to MSDSs, safe use instructions, chemical materials lists,
and the written hazard communication program.
Any employee working with a potential hazard has the right to review
SUIs at all times while in the work area. He or she does not need a written
request to review and discuss the information in MSDSs, chemical materials
lists, or the written hazard communication program. He or she may make a
written request for copies of these documents. The area supervisor or union
representative should know where to find this information.
Information Sources
The final element of a hazard communication program is the availability of
various forms of information to employees who work with or near chemicals:
labels, SUIs, chemical materials lists, and MSDSs.
Labels
All chemical containers must be labeled. Bags, barrels, bottles, boxes, cans,
cylinders, drums, reaction vessels, and storage tanks are all containers.
Storage tanks include tank trucks and railcars. Labels are used to identify a
chemical and provide appropriate hazard warnings. Every label must clearly
state the material identity (name or number by which the chemical is known
in the facility). The identity noted on the label should match the product
name on the MSDS and list of chemical materials.
Every label must include a warning describing the health and physical
effects of overexposure to the chemical and cover effects on specific organs.
The warning may consist of or combine words, pictures, and symbols to con-
vey the required information.
If an operation transfers small quantities of chemicals from large drums to
smaller containers, employees must use dedicated transfer containers with
permanent labels. However, if a worker uses the chemical from the large con-
tainer immediately (during the current shift), it is not necessary to have a ded-
icated transfer container or permanent label. In certain types of operations,
dedicated containers are always required and the time of use does not matter.
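The labeling rules above can be expressed as simple checks: a label needs a material identity that matches the MSDS product name and a hazard warning, and a transfer container needs a permanent label unless the chemical is used during the current shift. This is a sketch under stated assumptions: the class and function names, the case-insensitive name comparison, and the product name "Solvent 42" are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Label:
    identity: str        # name or number by which the chemical is known in the facility
    hazard_warning: str  # health and physical effects of overexposure


def label_problems(label, msds_product_name):
    """List the ways a label falls short of the rules in the text."""
    problems = []
    identity = label.identity.strip()
    if not identity:
        problems.append("missing material identity")
    elif identity.lower() != msds_product_name.strip().lower():
        # Case-insensitive comparison is an assumption of this sketch.
        problems.append("identity does not match MSDS product name")
    if not label.hazard_warning.strip():
        problems.append("missing hazard warning")
    return problems


def needs_dedicated_container(used_during_current_shift, always_required=False):
    """Immediate use is exempt unless the operation always requires
    dedicated, permanently labeled transfer containers."""
    if always_required:
        return True
    return not used_during_current_shift


good = Label("Solvent 42", "Overexposure may cause dizziness. Skin irritant.")
bad = Label("Solvent 24", "")

print(label_problems(good, "solvent 42"))
print(label_problems(bad, "Solvent 42"))
print(needs_dedicated_container(used_during_current_shift=True))
```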
Under certain conditions, alternative labeling on chemical containers is
appropriate. Signs, placards, process sheets, and/or batch tickets may be used
in place of labels on individual stationary process containers. Alternative
labels must contain the required information and be attached so that they
cannot be removed easily. Table 10.1 is a sample label.
TABLE 10.1
Chemical Container Label
Item Description Identification
1 Product name or material identity
2 If solvent, flash point > 100°F
Safe use categorya
3 The < symbol means less than; > means greater than
4 WARNING! Overexposure may result in central nervous system (CNS)
effects, including headache, dizziness, nausea, unconsciousness,
death. Skin irritant. Possible liver and kidney effects
5 Check appropriate lines below:
◻ Do not use in confined space without appropriate personal
protective equipment (PPE)
◻ Flammable
6 Health hazards:
◻ Harmful if inhaled or swallowed
◻ Harmful if absorbed through skin
◻ Cancer-suspect agent
7 Specific chemicals with additional health hazards. See safe use
instructions
Legend:
1. Names or numbers of chemical materials as used in facility
2. Group of chemicals (safe use category) to which material belongs
3. The < symbol means less than; > means greater than
4. Effects of overexposure
5. Special precautions
6. How material enters the body and whether it contains a cancer-suspect agent
7. Ingredients that need special precautions for use
a Employees have the right and are encouraged to review MSDSs and SUIs to obtain
Certain SUIs will include attachments that list specific chemical ingredients
that need special precautions or are used widely. Table 10.2 is a typical SUI
showing the points mentioned.
TABLE 10.2
Typical SUI Form
Chemical Risk Safe Use Instruction Creation Date:
Management Safe Use Category (SUC):
Print Date:
Page ___ of ___
Location
Department:
Work area:
Process:
Occupation:
TABLE 10.3
Generic Material Safety Data Sheet
HMCS Identification SUC:
Product Name: Supplier:
Manufacturer: Original date:
Revision date:
Effective date:
Print date:
Page ___ of ___
Product and company identification (1):
Emergency overview:
Additional comments:
Additional comments:
Personal precautions:
Cleanup methods:
Neutralization:
Safe-handling advice:
Storage
Engineering measures:
Exposure limits:
Limit values for all chemicals
Conditions to avoid:
Additional comments:
Practical experiences:
Health effects:
Classification of ingredients
Carcinogenicity:
General remarks:
Additional comments:
Hazard waste:
Additional comments:
Other information (16)
Additional comments:
Changes:
References
29 CFR 1910.1200. Hazard Communication.
https://www.osha.gov/pls/oshaweb/owadisp.show_document?p_table=STANDARDS&p_id=10100
U.S. Department of Labor. OSHA. Occupational Safety & Health Administration.
Washington. www.OSHA.gov
Appendix A: Checklists
• Management commitment
• Documented safety philosophy
• Safety goals and objectives
• In-house safety committee
• Line responsibility for safety
• Supportive safety staff
• Rules and procedures
• Audits
• Safety communications
• Safety training
• Accident investigation
• Motivation
TABLE A.1
Safety Plan Checklist
Columns: Page; Element; The Safety Plan Should Describe
Page 1. Scope of work
  Nature of the work being performed
Page 3. Organizational policies and procedures
  Application of organizational safety-related policies and procedures for
  work being performed
Page 3. Hydrogen and fuel cell experience
  How previous organization experience with hydrogen, fuel cells, and related
  work applies to project
Page 4. Identification of safety vulnerabilities (ISV)
  ISV methodology applied to project: FMEA, what-if, HAZOP, checklist, FTA,
  ETA, PRA, or other method
  Designation of leader and steward of ISV methodology
  Significant accident scenarios identified
  Significant vulnerabilities identified
  Safety-critical equipment determined
  Storage and handling of hazardous materials and ignition sources
  Explosion hazards and material interactions
  Leakage and accumulation detection
  Hydrogen handling systems (supplies, storage, distribution, volumes,
  pressures, estimated use rates)
Page 4. Risk reduction plan
  Prevention and mitigation measures for significant vulnerabilities
Page 4. Operating procedures
  Operational procedures applicable to location and performance of work
  including sample handling and transport
  Operating steps that must be written to show critical variables, and
  acceptable ranges and responses to deviations
Page 5. Equipment and mechanical integrity
  Initial testing and commissioning
  Preventative maintenance plan
  Calibration of sensors
  Test and inspection frequency and documentation
Page 6. Management of change procedures
  System and/or procedures for reviewing proposed changes of materials,
  technology, equipment, procedures, staffing, and facility operation to
  determine effects on safety vulnerabilities
Page 6. Project safety documentation
  Communication of required safety information that must be available to all
  project participants. Safety information includes ISV documentation,
  procedures, handbooks, standards, other references, and safety review reports
Page 7. Employee training
  Requirement for initial and refresher general safety training
  Initial and refresher hydrogen-specific and hazardous material training
  System allowing organization stewards to participate in training
  participation and confirm their understanding
Based on the answers to the basic questions above, a checklist may be used
as a guide for an AHA evaluation and reveal safety deficiencies. Critical
questions for a checklist are:
TABLE A.2
Facility Location Checklist
Area of Concern or Question Response Recommendations
Spaces between Process Components
1. Have adequate provisions been made for
relieving explosions in process equipment?
2. Are operating units and machines within units
spaced to minimize potential damage from
fires or explosions in adjacent areas?
3. Are there safe exit routes from each unit?
4. Has equipment been adequately spaced and
located to safely permit anticipated
maintenance (pulling heat exchanger bundles,
dumping catalyst, crane lifting) and hot work?
5. Are vessels containing highly hazardous
chemicals located sufficiently far apart? If not,
what hazards are introduced?
6. Is there adequate access for emergency
vehicles such as fire trucks?
7. Can adjacent equipment or facilities withstand
the overpressure generated by explosions?
8. Can adjacent equipment and facilities such as
support structures withstand flame
impingement?
Locations of Machine Shops, Welding Shops, Electrical Substations, Roads, Rail Spurs, and
Other Likely Ignition Sources
1. Are likely ignition sources (maintenance
shops, roads, rail spurs) located away from
release points for volatile liquids and vapors?
2. Are process sewers located away from likely
sources of ignition?
Unit Layout
1. Are large inventories or release points of
highly hazardous chemicals located away from
vehicular traffic within the plant?
2. Could specific siting hazards arise from
external forces such as high winds, earth
movement, outside utility failures, flooding,
natural fires, and fog?
3. Do emergency vehicles such as fire trucks have
clear access? Are access roads free from
blockage from trains and highway congestion?
4. Are access roads engineered to avoid sharp
curves? Are traffic signs provided?
Electrical Classification
1. Is there an electrical classification document?
2. Does the document appear correct and
complete?
3. Has the document been revised recently?
4. Have significant changes made since system
construction been explained in the electrical
classification document:
• New materials added?
• New sources of flammable gases or vapors?
• New low points (sumps or trenches) at
grade?
• Areas that have been enclosed since the
system was constructed?
5. Are the designs and maintenance programs for
ventilation systems adequate? Are ventilation
systems properly maintained, and are alarms and
interlocks on these systems periodically function
checked?
• Regular maintenance to check functioning
of natural ventilation systems?
• Technical bases for design changes to
ventilation system?
• Ventilation systems verified adequate for
new gas or vapor loads?
6. Will safeguards alert operators when a
ventilation system fails?
7. Are controls adequate to ensure that
electrically qualified equipment is replaced
with equipment of equal or higher
classification?
8. Are physical boundaries in place between
electrically classified areas? If not:
• Are boundaries marked?
• Do workers understand the boundaries of
electrically classified areas and their
importance?
9. Are Division 1 areas necessary?
Contingency Planning
1. What expansion or modification plans does the
facility have?
2. Can the unit be built and maintained without
transporting heavy items above operating
equipment and piping?
3. Are calculations, charts, and other documents
available to verify that facility location has
been considered in the layout of the unit? Do
these documents show that consideration has
been given to:
• Normal direction and velocity of wind?
• Atmospheric dispersion of gases and
vapors?
• Estimated radiant heat density created by a
fire?
• Estimated over-pressure?
4. Are appropriate security safeguards (fences
and guard stations) in place?
5. Are gates located away from public roads so
that the largest trucks can move completely off
the roadway while waiting for gates to be
opened?
6. Where applicable, are safeguards in place to
protect high structures against low-flying
aircraft?
7. Are adequate safeguards in place to protect
employees against exposure to excessive
noise? Do safeguards consider the cumulative
effects of machines located in close proximity?
8. Is adequate emergency lighting provided? Is
adequate back-up power available for this
lighting?
9. Are procedures in place to restrict non-
essential or untrained personnel from entering
areas deemed hazardous?
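A checklist such as Table A.2 (area of concern, question, response, recommendation) can be kept as structured records so that unanswered or unfavorable items are easy to list. The sketch below is illustrative: the class and function names are hypothetical, the responses are invented for the example, and it assumes each question is phrased so that "yes" is the favorable answer, which does not hold for every question in the table (for example, "Are Division 1 areas necessary?").

```python
from dataclasses import dataclass


@dataclass
class ChecklistItem:
    area: str
    question: str
    response: str = ""        # "yes", "no", or "" if not yet answered
    recommendation: str = ""


def items_needing_action(items):
    """Items answered anything other than 'yes', including unanswered ones."""
    return [item for item in items if item.response.strip().lower() != "yes"]


# Questions are from Table A.2; responses and recommendations are invented.
checklist = [
    ChecklistItem("Spaces between Process Components",
                  "Are there safe exit routes from each unit?", "yes"),
    ChecklistItem("Electrical Classification",
                  "Is there an electrical classification document?", "no",
                  "Prepare and issue the document"),
    ChecklistItem("Contingency Planning",
                  "Is adequate emergency lighting provided?"),
]

open_items = items_needing_action(checklist)
for item in open_items:
    print(f"[{item.area}] {item.question} -> {item.response or 'unanswered'}")
```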
References
http://www.hse.gov.uk/risk/theory/r2p2.pdf
Health and Safety at Work Act of 1974. SI 1974/1439. London. Her Majesty’s Stationery
Office.
Appendix B: HAZOP Analysis Example
Title Page
Title: Hazard and Operability Study (HAZOP) Report, DOP Refineries Ltd.,
Proposed Project: Distillation Unit at Refinery
Address: 15668 Irene Street, Town, State, U.S. 49614
Chaired by: S. Robinson
Authorized by: J. Mitchell, General Manager of DOP Refineries Ltd.
Authorization date: January 12, 2014
Contents
Glossary and abbreviations
Summary
HAZOP study
Description of facility
HAZOP team members
HAZOP methodology
Guidewords
Plant overview
Analysis of main findings
Action arising from HAZOP
Minutes
Figure B.1: P&ID
Figure B.2: Revised P&ID
Summary
DOP Refineries Ltd. proposes to construct a refinery for the recovery of kero-
sene from the waste kerosene solvent returned from auto engine repairers.
An environmental impact statement (EIS) and preliminary hazard analysis
(PHA) were submitted prior to the approval of the development application
(DA). The consent conditions for the DA required that the following study
reports be submitted for approval:
• Construction safety
• Fire safety
• HAZOP
• Final hazard analysis
• Transport
• Emergency plan
• Safety management system
The first two studies have been completed and submitted for approval. This
report is the third. XYZ Consultants were retained by DOP to provide an
independent HAZOP chairman and assist in the preparation of this HAZOP
study report.
The prime objective of this HAZOP study was to systematically examine
the proposed design and identify construction issues, hazards, and/or
potential operational problems that may be avoided by (mostly minor)
redesign or suitable operating procedures before the design is hardened and
allows no changes. Selected lines and items in the P&ID were examined by
applying appropriate guidewords. The credible unfavorable and potentially
hazardous situations and subsequent consequences were evaluated and/or
estimated. Measures to eliminate or minimize the undesirable consequences
are recommended. The results of the step-by-step procedure and the rec-
ommendations were entered in the HAZOP log book. The main HAZOP
recommendations are:
HAZOP Study
Chapter 5 discussed HAZOP analysis in detail and identified a typical
format for a report such as this. However, for expediency, fine details are not
included in this example. A technical description of the plant, guidewords,
and other necessary details are explained briefly to enable readers to follow
the meeting minutes and/or log sheets.
Facility Description
Figure B.1 is a P&ID (Drawing DOP 001, Rev. 1) of the plant. The main items
include a distillation column H3, gas-fired hot oil furnace H1, product reboiler
H2, condenser C1, and associated pumps, controls, and piping. The contaminated
waste kerosene is fed to H3 under gravity from a holding tank (not
shown). In-flow is controlled by flow control valve V0 preset at the desired
flow rate. The closed hot oil system uses a heating fluid heated in H1 and
circulated through H2 by pump P1. The waste kerosene is boiled in H2 (shell
and tube heat exchanger).
A temperature indicator and controller TIC on H3 controls the piped natural
gas feed valve V1 to the burner in H1 to maintain a set temperature in
H3. The residues in H3 are maintained at the required level by pump P3 and
valve V12, which is controlled by the level indicator and controller LIC.
FIGURE B.1
Drawing DOP 001, Rev. 1. (P&ID; labels recovered from the drawing: Water In,
Water Out, Vent, L7, L8, Distillation Column H3, T1, LIC.)
The kerosene vapors in H3 are condensed in C1, a water-cooled shell and tube
heat exchanger. A vent is provided to release non-condensables. Level in refined
product receiver T1 is maintained by LIC and V10. Product pump P2 transfers
product to holding tank (not shown) for distribution to customers by tanker.
HAZOP Methodology
Selected lines and plant items in the P&ID were examined in turn, starting
from L5. [Not all lines and items are covered here to conserve space.]
Items were recorded in the minutes generally by exception; only key
issues likely to pose significant consequences were recorded; see Table B.1.
However, items 2 and 3 on minute sheet 1 were included for the purpose
of illustration.
Guidewords such as HIGH FLOW (listed below and in minute pages __–__)
were applied from a set of guideword cards maintained in a ring binder.
Likely causes applicable to each guideword were entered in the second col-
umn and credible consequences in the third. The fourth column was for
recording existing design or operational safeguards. [No safeguards were
found in this simple case.] Where the consequences were likely to present
potential hazards or losses involving financial and time impacts, possible
changes to the system to eliminate or minimize the consequences were con-
sidered and recommendations made. [For simple cases, the recommenda-
tion number (Rec #) was noted in the fifth column and the change explained
briefly in the sixth column. If several options were presented or further eval-
uation was required, the recommendations were recorded accordingly.]
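The six-column log described above (guideword, cause, consequence, safeguard, recommendation number, recommendation) maps naturally onto a small record type. The sketch below uses three entries from Table B.1; the class name, field names, and helper function are hypothetical and are not part of any HAZOP standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HazopEntry:
    guideword: str
    cause: str
    consequence: str
    safeguard: str = ""             # existing design or operational safeguard
    rec_no: Optional[int] = None    # None when no recommendation was made
    recommendation: str = ""


def recommendations(entries):
    """Map recommendation number to text for entries that carry one."""
    return {e.rec_no: e.recommendation for e in entries if e.rec_no is not None}


log = [
    HazopEntry("High Flow", "Flow controller fault",
               "Level in column rises, then temperature falls",
               rec_no=1, recommendation="Independent high-flow alarm on L5"),
    HazopEntry("Low Flow", "Product feed pump failure",
               "Temperature rise and drop in liquid level in column"),
    HazopEntry("High Level", "Level controller fault",
               "Flooding of L6 and reboiler stops operation",
               rec_no=2, recommendation="High-level alarm independent of LIC"),
]

recs = recommendations(log)
for number in sorted(recs):
    print(number, recs[number])
```

Keeping the recommendation number optional mirrors the practice of recording by exception: entries judged "not a problem" stay in the log without generating an action.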
Plant Overview
This example covers only the operating mode. In a full HAZOP, where
start-up and shutdown procedures are analyzed, more changes may be rec-
ommended. The issues to be evaluated further before design changes are
made include:
TABLE B.1
HAZOP Log Report
Project: Product Distillation Unit - Waste Oil. Kerosene Exchange Washing
Node: C1
Page 1 of 2
Description: Condenser, water cooled
Date: September 28, 2012
Drawing No. DOP 001, Rev. 2
Columns: Guideword; Cause; Consequence; Safeguard; Rec #; Recommendation; Indiv Action
1. High Flow
  Cause: Flow controller fault
  Consequence: Level in column rises, then temperature falls; product
  reboiler will attempt to maintain temperature in column until reboiler
  capacity is reached, after which liquid level will rise and flood line LS;
  column stops operating
  Rec #: 1
  Recommendation: Independent high-flow alarm on L5
  Indiv Action: KW
2. Low Flow
  Cause: 1. Product feed pump failure; 2. Jammed isolating valve
  Consequence: Temperature rise and drop in liquid level in column;
  over-heating; reboiler can handle this; TIC will control gas and air feed to
  furnace H1; not a problem
  Indiv Action: KW
3. Zero Flow
  Cause: As above
  Consequence: As above
  Indiv Action: KW
4. High Level
  Cause: Level controller fault
  Consequence: Flooding of L6 and reboiler stops operation
  Rec #: 2
  Recommendation: High-level alarm independent of level controller LIC; alarm
  level below L6
  Indiv Action: KW
5. Low Level
  Cause: Level controller malfunction or low flow
  Consequence: Not a problem (as for no flow)
  Rec #: 3
  Recommendation: Low-level alarm
  Indiv Action: KW
6. High Pressure
  Cause: Water failure in condenser
  Consequence: Condenser vent acts as relief device; no adverse effect
  Rec #: 4
  Recommendation: Pressure indicator on column; high-pressure alarm and trip
  on gas/air control valve V1
  Indiv Action: KW
7. High Temperature
  Cause: Loss of feed
  Consequence: No adverse effect
  Rec #: 5
  Recommendation: High- and low-temperature alarm on TIC; additional
  high-temperature alarm
  Indiv Action: KW
TABLE B.1 (Continued)
HAZOP Log Report
12. Low Flow
  Cause: P1 fails
  Consequence: Loss of heat to H2; TIC will call for further opening of V1,
  increasing temperature in H1
  Rec #: 10
  Recommendation: Install flow sensor/indicator or alarm to trip furnace via
  V1 or other
  Indiv Action: KW
13. High Pressure
  Cause: Heating; expansion of hot oil
  Consequence: Burst pipes, etc.
  Rec #: 11
  Recommendation: Surge tank in oil system; evaluate location of tank: on L3
  (at pump suction) or L2; check dead leg and moisture condensation in oil
  Indiv Action: CS
14. High Temperature
  Cause: 1. High product load on H3 causing high flame in H1; 2. TIC on H3
  failed, V1 failed to open; 3. H2 partly blocked or heat transfer poor
  Consequence: High temperature in furnace
  Rec #: 12
  Recommendation: Pyrometer in furnace to alarm or trip gas supply
  Indiv Action: CS
15. Contamination (water in oil)
  Cause: Water from atmosphere through vent
  Consequence: Water turns to steam and explodes
  Rec #: 13
  Recommendation: Locate surge tank in hot system; avoid dead legs; steam
  vents at high points in pipe system; nitrogen connection on vent
  Indiv Action: CS
The study results are detailed in the minutes on pages __–__. The recommendations
arising from the study are:
Consequence and/or risk analysis was considered necessary for the issues
raised by Recommendations 6, 7, 9, 11, and 13. Detailed analysis subsequent
to the HAZOP indicated that:
[Figure: piping and instrumentation diagram (P&ID) of the distillation unit, DRW No. DOP 001, Rev. 2. Drawing note: consider options a or b.]
Legend:
LIC = Level Indicator/Controller
TIC = Temperature Indicator/Controller
FIC = Flow Indicator/Controller
FI = Flow Indicator
PI = Pressure Indicator
TAH = Temperature Alarm High
TAL = Temperature Alarm Low
FAH = Flow Alarm High
FAL = Flow Alarm Low
LAH = Level Alarm High
LAL = Level Alarm Low
PAH = Pressure Alarm High
FIGURE B.2
Drawing DOP 001, Rev. 2.
Selected Bibliography
Aksorn, T. and B. H. W. Hadikusumo (2008). Measuring the effectiveness of safety
programmes in the Thai construction industry. Construction Management and
Economics, 26, 409–421.
Andow, P. (1991). Guidance on HAZOP Procedures for Computer-Controlled Plants. HSE
Contract Research Report 26/1991. London: Her Majesty’s Stationery Office.
Balemans, A. (1974). Checklist guide lines for safe design of process plants. First
International Loss Prevention Symposium. London.
Bancroft, K. (2002). Job hazard analysis for unsafe acts. Occupational Health and Safety,
71, 206–215.
Beach, L. (1992). Image Theory: Decision Making in Personal and Organizational Contexts.
New York: John Wiley & Sons.
BS5760:2 (1992). Guide to the Assessment of Reliability.
BS5760:5 (1992). Guide to Failure Modes, Effects, and Criticality Analysis (FMEA and
FMECA).
BS5760:7 (1992). Guide to Fault Tree Analysis.
BS5760:9 (1992). Reliability of Systems, Equipment, and Components. Part 9: Guide
to Block Diagram Technique.
Chastain, J. and J. Jenson (1997). Conduct better maintenance and operability studies.
Chemical Engineering Progress, 93, 49–53.
Clemens, P. and T. Pfitzer (2006). Risk assessment and control. Professional Safety, 51,
41–44.
Elmendorf, M. (1996). Introduction to process hazard analysis. Journal of Environmental
Law and Practice, 4, 36–56.
Fairley, R. (1985). Software Engineering Concepts. New York: McGraw-Hill.
Garvey, P. (2008). Analytical Methods for Risk Management: A Systems Engineering
Perspective. Boca Raton, FL: Taylor & Francis.
Garvey, P. (2000). Probability Methods for Cost Uncertainty Analysis: A Systems
Engineering Perspective. Boca Raton, FL: Taylor & Francis.
Geronsin, R. (2001). Job hazard assessment: a comprehensive approach. Professional
Safety, 46, 23–30.
Giebe, K. (2013). Zero injuries: how safety and productivity find common ground.
Fabricating and Metal Working, Sept., 54–55.
Gould, J., M. Glossop, and A. Ioannides (2000). Review of Hazard Identification
Techniques. Health and Safety Laboratory. HSL/2005/58.
Greenberg, H. and J. Cramer (1992). Risk Assessment and Risk Management for the
Chemical Process Industry. New York: Van Nostrand Reinhold.
Haddad, S. (2011). HIPAP 6: Hazard Analysis. Sydney: New South Wales Department
of Planning.
Hammer, W. (1989). Occupational Safety Management and Engineering, 4th ed.
Englewood Cliffs, NJ: Prentice Hall.
Hoxie, W. (2003). Preconstruction risk assessments. Professional Safety, 48, 50–53.
http://www.hse.gov.uk/risk/theory/r2p2.pdf
http://www.stb07.com/process-safety-management/process-hazards-analysis.html
Risk is everywhere, in everything we do. Realizing this fact, we all must try
to understand this “risk” and if possible to minimize it. This book expands the
conversation beyond failure mode and effects analysis (FMEA) techniques.
While FMEA is indeed a powerful tool to forecast failures for both design and
processes, it is missing methods for considering safety issues, catastrophic
events, and their consequences. Focusing on risk, safety, and HAZOP as they
relate to major catastrophic events, Introduction to Risk and Failures:
Tools and Methodologies addresses the process and implementation as well
as understanding the fundamentals of using a risk methodology in a given
organization for evaluating major safety and/or catastrophic problems.
The book identifies and evaluates five perspectives through which risk and
uncertainty can be viewed and analyzed: individual and societal concerns,
complexity in government regulations, patterns of employment, and polarization
of approaches between large and small organizations. In addition to explaining
what risk is and exploring how it should be understood, the author makes a
distinction between risk and uncertainty. He elucidates more than 20 specific
methodologies and/or tools to evaluate risk in a manner that is practical and
proactive but not heavy on theory. He also includes samples of checklists and
demonstrates the flow of analysis for any type of hazard.
Written by an expert with more than 30 years of experience, the book provides
from-the-trenches examples that demonstrate the theory in action. It introduces
methodologies such as ETA, FTA, and others which traditionally have been used
specifically in reliability endeavors and details how they can be used in risk
assessment. Highly practical, it shows you how to minimize or eliminate risks
and failures for any given project or in any given work environment.
K23013
ISBN: 978-1-4822-3479-4
an informa business
www.crcpress.com
6000 Broken Sound Parkway, NW, Suite 300, Boca Raton, FL 33487
711 Third Avenue, New York, NY 10017
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN, UK