Licensed to Enger, Tehya/PDA: Copying and Distribution Prohibited.

LICENSED TO JOSE CASTELLA

Root Cause
Investigations for CAPA:
Clear and Simple

James L. Vesper

PDA
Bethesda, MD, USA

DHI Publishing, LLC

River Grove, IL, USA

10 9 8 7 6 5 4 3 2 1

ISBN: 978-1-942911-50-0
Copyright © 2020 by James L. Vesper
All rights reserved.

This book is protected by copyright. No part of it may be reproduced,
stored in a retrieval system, or transmitted in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise, without
written permission from the publisher. Printed in the United States of
America.

Where a product trademark, registration mark, or other protected mark
is made in the text, ownership of the mark remains with the lawful
owner of the mark. No claim, intentional or otherwise, is made by
reference to any such marks in the book. Websites cited are current at the
time of publication. The author has made every effort to provide accurate
citations. If there are any omissions, please contact the publisher.

While every effort has been made by the publisher and the author to
ensure the accuracy of the information expressed in this book, the
organization accepts no responsibility for errors or omissions. The views
expressed in this book are those of the author and may not represent those of
either Davis Healthcare International or the PDA, its officers, or directors.

This book is printed on sustainable resource paper approved by the Forest Stewardship
Council. The printer, Gasch Printing, is a member of the Green Press Initiative and all
paper used is from SFI (Sustainable Forest Initiative) certified mills.

PDA Global Headquarters
Bethesda Towers, Suite 600
4350 East-West Highway
Bethesda, MD 20814
United States
www.pda.org/bookstore
001-301-986-0293

Davis Healthcare International Publishing, LLC
2636 West Street
River Grove, IL 60171
United States
www.DHIBooks.com

CONTENTS

FOREWORD
ACKNOWLEDGMENTS
INTRODUCTION
ABOUT THE AUTHOR

1 WHY INVESTIGATIONS AND CORRECTIVE ACTIONS MATTER
A different industry
What other industries have done
High reliability organizations
What can we learn from others and apply to what we do?
Conclusion
References

2 REGULATORY REQUIREMENTS AND EXPECTATIONS
Differences in expectations between medical devices and drugs
What regulators have been finding
GMP expectations
References

3 ROLES AND RESPONSIBILITIES
Competencies and competency-based training
Specific competencies for those involved in investigations
Developing competencies
Who “owns” the problem?
How big should the team be?
What makes for a successful team?
The value of a team
How to be a good facilitator if you are leading an investigation
Report writers
Conclusion
References

4 THE BIG PICTURE: INVESTIGATIONS AND CORRECTIVE ACTIONS
The 14-step process
Conclusion
References

5 THE INITIAL DISCOVERY OF AN EVENT
Psychological safety
Direct observation
Big data and data mining
So what does this all mean?
Conclusion
References

6 APPLYING RISK-BASED THINKING TO QUALITY EVENTS AND DEVIATIONS
The ICH Q9 process
QRM and risk-based thinking
Conclusion
References

7 MODELS USED IN DESCRIBING INCIDENTS
Single-event model
Chain-of-events models or domino theory
Hierarchical models
Factorial model
Individual and human factors
Conclusion
References

8 HUMAN ERRORS AND HUMAN FACTORS
Classifications
The person or the “system”?
Why such a high proportion of so-called human errors?
What about a “blameless” culture?
Five principles of human performance
Models and tools to identify causes that result in human error
Conclusion
Valuable resources
References

9 METHODS AND TOOLS USED WHEN CONDUCTING INVESTIGATIONS
Why use methods and tools?
Specific methods and tools
So what tool should be used? Tool selection guidance
Conclusion
References

10 INTERVIEWS
Interviews compared to interrogations
Fear
An interesting case study of how our memories can warp
How memories are created—and recreated
Ways to obtain the most accurate recounting of an incident
The cognitive interview process
Conclusion
References

11 IMMEDIATE ACTIONS AND CORRECTIONS
Immediate actions
Corrections
Conclusion

12 CORRECTIVE ACTIONS AND PREVENTIVE ACTIONS
Linking corrective actions to the causes
Change control and risk assessment
Looking ahead to an effectiveness check
The range of corrective action options
Corrective actions specific to causes categorized as “human error”
Defining key terms
Where do qualification and validation fit into corrective actions?
When you cannot prevent, try to manage
When the root causes cannot be found
Short term vs. long term
Residual risks of corrective actions
Conclusion
References

13 PROCEDURES: CAUSES OF PROBLEMS AND POTENTIAL CORRECTIVE ACTIONS
Procedures as a cause and a contributor to unwanted events
The biggest writing challenge: appropriate level of detail
The information ecosystem
Do we need a procedure for this?
What should a procedure look like?
Revising a procedure as a corrective action
Checklists
Conclusion
References

14 TRAINING AS A CORRECTIVE ACTION
Training as part of a system
Tacit and explicit knowledge
Instructional methods—ways to present knowledge and skills
Assessment and evaluation of the learning
Conclusion
References

15 CORRECTIVE ACTION EVALUATION AND EFFECTIVENESS CHECKS
Formative and summative evaluation
Timing and methods for effectiveness checks
Documenting the effectiveness checks
Evaluation and effectiveness checks related to training and performance
A caution
Conclusion
References

16 WRITING THE REPORT
General considerations of an investigation report
Conclusion
References

17 REVIEW AND APPROVAL OF THE INVESTIGATION AND REPORT
Stated requirements
Minimizing personal preferences
Giving feedback
Receiving feedback
Including the basis of the reviewers’ and approvers’ signatures
Churning metrics
Conclusion
References

18 COMMUNICATION
Who sees what?
Methods for incident communication
Communicating potential risks
Conclusion
References

19 LEARNING FROM SUCCESSES AND FAILURES
“Fail fast, fail often” (but fail safely)
Characteristics of organizations that learn from mistakes
What about a “blameless” culture?
After-action reviews
The role of leadership
Conclusion
References

20 MANAGEMENT RESPONSIBILITIES
What can leadership do?
Investigations and quality culture
Conclusion
References

APPENDIX 1: DEFINITIONS

APPENDIX 2: INCIDENT INVESTIGATOR’S WORKSHEET

INDEX

FOREWORD

Investigations are an essential part of the regulated healthcare
product manufacturing Quality System—few industry professionals
would disagree with that. However, a point can be made that
improving the performance of root cause investigations and the
outcome of corrective actions is even more important in today’s
biopharmaceutical environment. Today, proper planning and
implementation of effective investigation programs are not only
required to control and comply with GMPs, but are a means to
improve manufacturing processes. The objective of an investigation
is not merely to perform the investigation, but to improve the
reliability of our manufacturing operations, the ultimate objective
being increased quality and availability of those regulated healthcare
products.

This is a time of significant and unprecedented opportunity
and challenge in our industry. Exciting new therapies are being
developed. Global health authorities are becoming more active and
influential in the preparation of process control guidance. Innovative
manufacturing, contamination control, and data acquisition
approaches are available, and the promises of new technologies
and modalities are on the horizon. Technologies such as continuous
manufacturing, automation, and manufacturing intelligence change
the role of personnel and place a new understanding on human
performance. Standardized, smaller, less complex manufacturing
spaces, process analytical technology (PAT), rapid process
monitoring and testing, and manufacturing intelligence data
gathering are promoting higher levels of manufacturing efficiency.

In parallel, external influences are affecting the business
aspects of our industry. Drug and healthcare product shortages are
shifting our public health emphasis to product supply availability,
affordability, and sustainability. This leads to a more mature
business understanding of the positive correlation between quality
process and higher yields, increased productivity, reduced failures,
and business profit. This emphasis on business aspects of our
industry should result in increased appreciation for and investment
in innovative technologies and willingness to design and implement
new manufacturing strategies.

With this change in emphasis come corresponding opportunities
and challenges. There is opportunity for more innovative use of
technology for manufacturing, improvement of well-designed and
scientifically based process control strategies, and replacement
of standard approaches to process control and quality assurance.
Along with these changes comes the recognition that manufacturing
processes must be more reliable and process control approaches
must be aligned with product quality benefit.

Improvement means challenging the status quo. It comes
from asking critical science- and risk-based why and why not
questions. It is essential that those responsible for the operation,
control, and evaluation of healthcare products and manufacturing
processes employ critical thinking to effectively and efficiently
meet related healthcare strategies. There is a growing realization
that traditional product testing, process monitoring, validation,
and Quality System controls may not be the most effective means
to ensure product quality. Ineffective requirements and approaches
take limited resources from more important activities and in that
way can compromise public health. This is a time to understand
that risk- and science-based principles, encouraged by regulators to
justify the appropriateness of existing process control actions, can
be used to design and gain approval for alternative, new approaches
and strategies, thus providing a vehicle for innovation and process
improvement.

The objectives of failure investigations and corrective actions
should not only be to identify deviations and determine the
impact on product quality, but also must be to gain insight from
the evaluation of failures, prevent future failures, reduce impact of
failures when they do occur, improve the manufacturing process,
and increase the assurance of product quality. Better understanding
of process limitations and failures provides an opportunity to better
understand the variables inherent in processes, and through that
understanding, provide better means to evaluate and improve those
processes.

This book incorporates three of the most essential process
control objectives: investigation, root cause determination, and
correction. These themes are at the heart of process knowledge
acquisition, understanding, and control. Investigation achieves the
objective of gathering knowledge and converting that knowledge
to understanding. It is the process of learning and analyzing
information. Root cause analysis takes what has been learned and
uses it to meet the objective of judging or determining the reason
for the failure. Correction addresses the impact of the failure and
commensurate means to prevent its reoccurrence. Meeting these
objectives achieves the goal of manufacturing process improvement.
Achieving this goal is ever more essential in today’s business and
healthcare environment.

As members of the global healthcare product community,
our objective is the health and welfare of patients. It is through
understanding of patient needs that new products are developed. It
is through identification of the required quality attributes of those
products that processes are designed. It is through recognition
of the variables of those processes that control strategies are
employed. It is through the utilization of innovative technologies
that processes are improved. Awareness of patient needs, product
requirements, process variables, and available technology will
result in addressing today’s and tomorrow’s healthcare challenges
more effectively.

Books such as Root Cause Investigations for CAPA: Clear and Simple
provide a path for reaching that level of awareness and thus help
our industry achieve our healthcare product and patient welfare
objective. This volume captures information and presents insight
discovered and reinforced through workshops, research, and
practical application. It includes chapters on investigation and CAPA
fundamental elements, regulatory expectations, human behavior
impact, risk-based decision making, models, methods, training,
management, follow-up, and evaluation of results. Understanding
how to properly plan and perform investigations, how to decide
on effective means to address the outcome of such investigations,
and how to use the knowledge gained from the experience are key
to process reliability and improvement. That makes this book so
important and the approaches presented in it so valuable. Enjoy the
book. Learn from the journey.

Hal Baseman

June 2020

ACKNOWLEDGMENTS

Thanks to the participants in workshops and also the clients who
have shared in many mutual learning experiences and helped me to
become a better instructor and writer.

Thanks to my colleagues at ValSource, particularly Hal
Baseman, Mike Long, and Igor Gorsky. A special callout to Amanda
McFarland for her wit and many ideas as we have worked together.

Thanks to Umit Kartoglu and Tom Reeves for their
encouragement and counsel on this and other projects over the past
decade.

And a special thank you (again) to Gray Brown for being patient
and very supportive as I worked on yet another writing project.

INTRODUCTION

One of the most common questions a writer has is, “How do I start?”
In writing this book my question was, “How do I stop?” Every time
I went online to find an article or check a reference, I would see
something else and think, “I need to add that!” If I kept that up, I
would still be writing and discovering and adding and on and on.
(As it is, I am way overdue in getting this to the publisher.)

So I had to stop. What this means is that this is not going to be a
complete, totally comprehensive volume on how to do every type of
deviation, out of specification, excursion, or failure investigation that
could be encountered in the pharma/biopharma industry. Instead,
it tries to emphasize some of the principles and considerations that
apply widely to many types of investigations you might need to
perform.

This book is based on public and in-house workshops that I
have conducted across the US, Canada, and Europe. Each workshop
has taught me something new—a better way to explain a concept,
terrific examples from participants, or a different point of view. That
is one of the best aspects of being an instructor or consultant—you
keep learning things.

The chapters are arranged to first provide some high-level
information and then go into the investigation and CAPA (corrective
and preventive actions) activities. Here is a quick overview of what
you will find:

Chapter 1, Why Investigations and Corrective Actions Matter,
provides a rationale as to why understanding the reason for
deviations—and finding ways to prevent their recurrence—is
important. We will look beyond pharma and biopharma to see what
governments and other industries are doing and why.

Chapter 2, Regulatory Requirements and Expectations, looks at
the good manufacturing practices and relevant guidances from the
US, Europe, Canada, the World Health Organization (WHO), and
the International Council for Harmonisation (ICH). The intent is
that we should be doing good investigations for patient and quality
reasons (discussed in the first chapter), rather than merely because
health authorities require them. Much of this chapter comes from the
5th edition of GMP in Practice that Tim Sandle and I co-wrote.

Chapter 3, Roles and Responsibilities, identifies who
should be involved in different aspects of investigations and
the competencies these people should have. Since most large,
complicated investigations involve more than a single investigator,
characteristics of successful teams are presented as well as the role
of the facilitator—a role important not just in failure investigations
but in formal risk assessments as well.

Chapter 4, The Big Picture: Investigations and Corrective
Actions, summarizes the 13 key activities involved in doing an
investigation, from identifying a problem through to the effectiveness
check that gives you (and health authority inspectors) confidence
that your investigation and action plans were effective. Most of the
chapters that follow provide more details and rationales on how to
accomplish these steps.

Chapter 5, The Initial Discovery of the Event, considers how
someone recognizes that there is a problem in the first place. How
is that done? How does someone determine that an event or a result
is different? Mental models and experience play major roles in this.

Chapter 6, Applying Risk-based Thinking to Quality Events
and Deviations, considers the role that risk-based thinking has in
determining how to address an unwanted event (the triage process).
The underlying concept is that some investigations require more
rigor than others; a risk-based approach is what is used to determine
this. The ICH Q9 Quality Risk Management model is presented and
described in terms of how industry has been using QRM since the ICH
guideline was released in 2005.

Chapter 7, Models Used in Describing Incidents, presents ways
to describe an unwanted event so as to better tell the story of what
happened, why, and what it means. The models discussed here are
often used in investigations performed in other industries. Parts of
this chapter come from Risk Assessment and Risk Management that I
wrote in 2006.

Chapter 8, Human Errors and Human Factors, is the chapter
that you need to read if nothing else in this book. The message is
human error is not a root cause. Say it out loud this time: Human error
is not a root cause. If you take away just this message from the book,
I think I will have been successful. We will look at why human error
is a result rather than a cause and at several different models used to
determine true, valid root causes.

Chapter 9, Methods and Tools Used When Conducting
Investigations, discusses what is really at the heart of root cause
investigations. It presents a toolkit that goes well beyond the
fishbone diagrams and the five whys that are almost universally
(over-) used in pharma. Some of the tools are obviously simple ways
to arrange the information that is being collected; others are more
complex but allow one to easily tell the story of the unwanted event
and its causes.

Chapter 10, Interviews, describes why getting statements as soon
as possible from those who may have witnessed or been involved
in a situation is critical. A research-based method for conducting
the interviews developed by psychologists, the Cognitive Interview
Process, is outlined, along with specific questions relevant to
pharma.

Chapter 11, Immediate Actions and Corrections, looks at
what might be done when a situation is first identified to “stop
the bleeding.” Corrections, those actions taken to fix the thing(s)
affected by the event, are also discussed.

Chapter 12, Corrective Actions and Preventive Actions, defines
what a true corrective action is and provides a number of examples
of how to prevent a recurrence of the unwanted event.

Chapter 13, Procedures: Causes of Problems and Potential
Corrective Actions, considers procedures from two different points
of view—being part of the problem and being part of a solution.
“Revising the procedure” is one of the two most frequent corrective
actions. Unfortunately, it often involves bolting something onto the
procedure which makes it even more difficult to use. Might there be
alternatives to this?

Chapter 14, Training as a Corrective Action, examines the other
most frequently used corrective action: training. Why? Because
training has become the default action if a true root cause isn’t found
or if “human error” is assigned. I am a trainer; I love training and
learning. But training will be the most expensive, most unreliable
solution to your problem if the real root cause is not a lack of
knowledge or skills.

Chapter 15, Corrective Action Evaluation and Effectiveness
Checks, describes how you and health authorities can have confidence
that you have found the root and proximal causes and taken the
appropriate actions to control the process. There are a number of
ways that this can be done at different points of implementing the
corrective actions.

Chapter 16, Writing the Report, discusses documenting the
event, the investigation, the scope and impact of the incident, and
all the actions taken as part of knowledge management described
in the ICH guideline Q10, Pharmaceutical Quality System. The extent of the
documentation depends on the significance of the event.

Chapter 17, Review and Approval of the Investigation and
Report, describes the intent of review and approval along with
the importance of having consistency in this process so as to avoid
needless “churning” of the report between the writers, reviewers,
and approvers. A checklist of things to consider is provided.

Chapter 18, Communication, discusses the various types
of communication that occur throughout the “lifecycle” of the
unwanted event. Communication is often coupled with an
escalation procedure that defines the speed of the messaging and
who—company leadership as well as regulators—is informed about
different situations.

Chapter 19, Learning from Successes and Failures, looks at ways
of extracting some value from an unwanted happening. Studies have
shown that organizations that learn from their mistakes have certain
characteristics. One of the most used models, after-action reviews,
is presented.

Chapter 20, Management Responsibilities, considers what
management can do to create, support, and sustain a culture that is
focused on first preventing unwanted events and then, should one
occur, preventing a recurrence. This chapter looks at the roles that
leadership must play—activities that are aligned with a culture of
quality.

Appendices include a glossary of terms and a data collection
tool.

As I finish this, the novel coronavirus and COVID-19 are moving
across the globe, having a devastating and as yet untold impact
on individuals, families, communities, and economies. Without
any natural immunity, people are looking to the pharmaceutical
industry to create antiviral medications and vaccines to treat and
prevent the often fatal infections. It is an opportunity for the industry
to reclaim the respect and reputation for helping people that had
been significantly tarnished in recent years. In these exceedingly
challenging times, it is essential that we learn as much as we possibly
can from the quality events, deviations, and failures that we will
encounter so that we can meet the public health needs of our world.

James Vesper
Provincetown, MA
June 2020

ABOUT THE AUTHOR

James L. Vesper designs and develops instructional courses and
workshops for the pharmaceutical and biopharma industries and
has more than 40 years’ experience in the pharmaceutical industry.
Before joining ValSource in 2017, he established the firm LearningPlus,
Inc., of which he is currently president. Dr. Vesper worked 11 years
at Eli Lilly and Company. His first assignment was as corporate
industrial hygienist, followed by three years in Corporate Quality
Assurance. His last assignment at Lilly was Project Leader of
GMP (Good Manufacturing Practice) Education and Instruction,
establishing the department and its mission.

Since 1991, Dr. Vesper has been creating innovative instructional
products for the pharmaceutical and healthcare industries and has
presented papers and workshops at numerous conferences around
the world. In 2001 he was awarded the PDA’s Agalloco Award for
Excellence in Training. Dr. Vesper served as executive producer of
pharma courses at LearnWright until 2006. He has trained health
authority inspectors at US FDA, China FDA, and PIC/S QRM Expert
Circles (Los Angeles, London, and Taiwan).

As an author, Dr. Vesper has written five other books and
multiple chapters and technical articles for peer-reviewed journals.
He also contributed to a training guideline for the World Health
Organization (WHO). He has worked with the WHO as a consultant
and as advisor to WHO’s Vaccine Quality/Global Learning
Opportunities, designing and presenting workshops.

At ValSource, Dr. Vesper works with other ValSource consultants,
leveraging their expertise to create specialized learning solutions for
clients in the areas of GMP, quality systems, contamination control,
and quality risk management.

Dr. Vesper earned a BS in Biology from Wheaton (Illinois)
College, an MPH from the University of Michigan School of Public
Health, and a PhD in Education from Murdoch University in Perth,
Western Australia. His doctoral research, conducted as part of a
larger project with the WHO, is titled Developing Expertise of
those Handling Time- and Temperature-Sensitive Pharmaceutical
Products Using ELearning: A Design Research Study. His dissertation
received awards from Murdoch University and the Association for
Talent Development (ATD). He is a member of ATD, PDA, and ISPE.
Dr. Vesper also serves on several industry and health organization
advisory boards.

James L. Vesper, PhD, MPH

Mobile: +1 585 230 1145
Email: jvesper@valsource.com

The author with Bizot (Photo by Graham Brown)


WHY INVESTIGATIONS AND CORRECTIVE ACTIONS MATTER

Things happen:

During a routine visual inspection of the retained samples of a liquid
injectable product, laboratory analysts observed thin flakes of glass
suspended in the liquid. Millions of vials of this and other similar
products were on the market.

Multiple consumer complaints in the summer stated that a product
had a bad smell. “Moldy” and “musty” were the terms frequently
used in describing the odor.

A pharmacist found an insect in a bottle of tablets; a regulatory agency
received multiple consumer complaints of insects, spiders, and insect
parts in containers produced by the firm.

A health authority inspector questioned the reliability and
trustworthiness of data generated in a QC laboratory. Subsequent
actions resulted in barriers to exporting the product into major
markets.

Two supervisors argued about which way to turn a valve. One
supervisor prevailed, turning the valve and inadvertently discharging
the contents of a production tank. Nearly $1 million of a biological
product was lost in less than one minute.

The list of problems, mistakes, breakdowns, and complications in
the pharmaceutical and biopharmaceutical industry is endless.
Anyone who has worked in our industry has their own mental register
of favorite horror stories or nightmares they have experienced.

While compliance with regulatory expectations is definitely
needed (see Chapter 2), there are a number of other reasons attention
is being paid both in and beyond our industry to unwanted events
and ways to avoid recurrences.

A DIFFERENT INDUSTRY
The pharmaceutical industry of today is significantly different
than the one of the 1970s–80s. That period was prior to the current
significant economic pressures placed on name brand manufacturers,
in part by competition from generic firms. Additionally, the supply
chain structure was very different from what we see today. Then,
if a batch of product failed due to a manufacturing breakdown, it
resulted in some discomfort but not too much agony. Yes, the failure
might have cost many thousands of dollars, but another lot of the
active pharmaceutical ingredient (API) could be made (or found)
and a replacement batch squeezed into the production schedule.
Batch rejection rates of 10–25 percent were not uncommon.

These days, with inventory levels kept intentionally low, there
might not be an extra lot of API in the warehouse or, if the
API manufacturer is on the other side of the world, it may not be able
to resupply in an expedient manner. With the higher reliance on contract
manufacturing organizations (CMOs), changing the schedule and
“fitting something in” are not easily accomplished. Biotech APIs
can take far longer to produce than many chemical syntheses. These
factors can contribute to drug availability problems, an issue that
peaked in 2011 with 251 shortages occurring in the US (FDA, 2015).

Thinking about the issue more broadly, we can reasonably
make the statement that, on the whole, many pharma and
biopharma firms do not aggressively work to understand what
caused or contributed to an unwanted event or outcome. Nor does
the industry find innovative ways to prevent these problems from
recurring; observations from regulatory agencies support this.
There are at least three underlying factors that contribute to this
situation. First, there is a time pressure to quickly launch a new
product or extension, which forces firms to accept a “good enough”
process. In other words, it might not be optimal, but it will work
and allow the product to be marketed and capture market share
before a competitor’s product arrives on the scene. Coupled with
this is the second factor—the regulatory hurdles in making changes
(or “variations”) once the marketing authorization, product, and
process have been approved or licensed. The time pressures and
regulatory hurdles often force the company to live with a process
that is not the best because of the time, cost, inventory complexities,
and regulatory uncertainties of making incremental improvements
that would be natural in other industries. Firms are willing to live
with a sometimes difficult and temperamental process. Finally, we
have trapped ourselves into believing that this is the way it is—
we just need to live with these “facts” as a cost/burden of doing
business.

WHAT OTHER INDUSTRIES HAVE DONE

While there are certainly unique aspects to pharma and biopharma,
we sometimes use “being special” as blinders to practices that have
benefited other organizations that have their own distinct challenges.
Other critical, high-risk industries like nuclear power generation
and transportation have conducted research to investigate causes
and contributors to problems and then put into practice solutions to
correct or prevent them. Looking at some of these examples can be
useful as we consider approaches that might have applicability in
the production of healthcare products.

Automobile and highway safety

According to US government data, 41,717 persons died in 1998
as the result of motor vehicle crashes. Since then, the number of
fatalities has decreased, with 36,560 deaths in 2018 (NHTSA-DOT,
2019). The decrease is more dramatic when considering miles driven:
In 1997, there were 1.65 motor vehicle crash deaths per 100 million
miles driven; in 2017 the ratio had fallen to 1.17.
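The comparison in the preceding paragraph can be checked with a short calculation. The sketch below uses only the figures quoted in the text; the function name percent_decrease is an illustrative choice and does not come from any cited source. Note that, as quoted, the death counts are for 1998 and 2018 while the per-mile rates are for 1997 and 2017.

```python
def percent_decrease(earlier: float, later: float) -> float:
    """Return the percentage drop from `earlier` to `later`."""
    return (earlier - later) / earlier * 100.0


# Figures as quoted in the text (NHTSA-DOT, 2019).
deaths_drop = percent_decrease(41_717, 36_560)  # total deaths, 1998 vs 2018
rate_drop = percent_decrease(1.65, 1.17)        # deaths per 100 million miles, 1997 vs 2017

print(f"Total deaths fell by about {deaths_drop:.1f}%")
print(f"Deaths per 100 million miles fell by about {rate_drop:.1f}%")
```

On these figures, the absolute number of deaths fell by roughly 12 percent, while the rate per mile driven fell by roughly 29 percent.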


Not to disparage the work of the automobile industry in
designing safer cars (IIHS-HLDI, 2015a, 2015b), but some credit needs
to be given to public health practitioners who changed the point of
view of accident investigations from one of simply describing the
accident to looking for its etiology or causative factors (Haddon,
1968). The descriptive approach is aligned with a single factor or
“act of God,” often used in describing how an accident occurred
(see Chapter 7), which simply states that an accident occurred
resulting in the death of the driver, rather than giving details on
how it happened and what specifically contributed to the fatality.
Knowing those specific details allows one to better understand
how to take actions that could prevent future accidents and protect
the vehicles’ occupants. Haddon, for example, created a matrix
to examine the components (driver, passenger, vehicle, highway,
etc.) of an accident in terms of three phases: pre-crash, crash, and
post-crash (Table 1). This different perspective takes a more holistic
view, identifying ways to prevent the accident in the first place (e.g.,
better roads, more traffic rule enforcement) and reducing the impact
(e.g., breakaway highway signs, mandatory motorcycle helmet use)
(O’Neill et al., 2002). By investigating the larger system of things
that are involved or contribute to accidents, better corrective and
preventive measures can be put in place.

Table 1: Haddon Matrix used to better understand the factors that
contribute to and cause an accident (Haddon, 1968)

Phase        Vehicle   Driver   Passenger   Highway   Regulations
Pre-crash
Crash
Post-crash
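For readers who want to work with the matrix electronically, one possible representation is a nested dictionary keyed by phase and component. The sketch below is an illustration only: the phase and component names come from Table 1, the example factors are those mentioned in the text, and the cells they are placed in, along with the helper names empty_matrix, add_factor, and empty_cells, are assumptions made for this example rather than Haddon's own assignments.

```python
PHASES = ("Pre-crash", "Crash", "Post-crash")
COMPONENTS = ("Vehicle", "Driver", "Passenger", "Highway", "Regulations")


def empty_matrix() -> dict:
    """Build a blank matrix: one list of factors per (phase, component) cell."""
    return {phase: {comp: [] for comp in COMPONENTS} for phase in PHASES}


def add_factor(matrix: dict, phase: str, component: str, factor: str) -> None:
    """Record a contributing factor or countermeasure in one cell."""
    if phase not in matrix or component not in matrix[phase]:
        raise KeyError(f"Unknown cell: {phase!r} / {component!r}")
    matrix[phase][component].append(factor)


def empty_cells(matrix: dict) -> list:
    """List the cells nobody has examined yet, as (phase, component) pairs."""
    return [(p, c) for p in PHASES for c in COMPONENTS if not matrix[p][c]]


matrix = empty_matrix()
# Cell placement is an assumption made for illustration.
add_factor(matrix, "Pre-crash", "Highway", "better roads")
add_factor(matrix, "Pre-crash", "Regulations", "more traffic rule enforcement")
add_factor(matrix, "Crash", "Highway", "breakaway highway signs")
add_factor(matrix, "Crash", "Regulations", "mandatory motorcycle helmet use")

print(f"{len(empty_cells(matrix))} of 15 cells still have no entries")
```

Listing the empty cells is a simple prompt for the investigator: each one is a part of the larger system that has not yet been considered.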

Another factor in reducing highway fatalities is having the
farsightedness, will, and determination to do so. In 1997, the Swedish
parliament enacted a “Vision Zero” plan with the goal of eliminating
all traffic-related fatalities, challenging the status quo of simply
accepting that there will always be traffic accidents and deaths.
Actions related to highway design and construction, more visible
pedestrian crossings, and enforcement have all contributed to a
dramatic decline in injuries and deaths (Economist, 2014).

Commercial aviation safety

Commercial airlines have investigated and implemented corrective
actions making air travel several orders of magnitude safer than
traveling in a car. Industry data from 2017, which was the safest year
for air travel with 44 deaths, showed that there was, on average, one
fatal accident for every 16 million flights (Reuters, 2018).

After the crash of Asiana Airlines Flight 214 while landing in San
Francisco on 6 July 2013, the word “miracle” was used by reporters;
307 passengers and crew were on board; there were three fatalities.
Other commentators with an aviation background challenged that
characterization saying that air travel is safer because of the work
by professionals designing better planes, writing more effective
procedures, improving the use of simulator training, flight crews
communicating and working together more skillfully, and a host of
other factors (Brown, 2013).

In the past 25 years, there has been an increasing amount of work
in understanding not just the role of people in an unwanted event or
accident, but in finding more effective approaches to determine what
went wrong (or had unintended consequences) and ways to control
and mitigate the problem. Rather than blaming people for causing
mistakes, work is being done to minimize and manage conditions
that contribute to the errors and to improve ways to detect and catch
the errors before they have an impact.

Healthcare
In 1999, the US Institute of Medicine (IOM) released its report
To Err is Human: Building a Safer Health System, which said that between
44,000 and 98,000 people die each year in the US due to preventable
medical mistakes. Total annual financial costs were estimated to
range from $17 to $29 billion, with added healthcare costs being half
of that amount, not to mention the physical pain and psychological
stress affecting injured individuals and surviving families (NAS,
1999). A more recent study has shown that the original estimates
were low: 210,000 to 400,000 patients each year are believed to have
experienced a preventable harm that contributed to their deaths
(James, 2013).

The IOM report identified four key categories of errors
(diagnostic, treatment, preventive, and other) that were due in large
part to “faulty systems, processes, and conditions that lead people
to make mistakes or fail to prevent them” (NAS, 1999, p. 2). The
report provided a number of recommendations including learning
from errors and voluntarily reporting them, raising performance
standards and expectations, and implementing comprehensive
safety systems at the healthcare delivery level.

While the US healthcare delivery system is enormously
complex and decentralized, there have been slow but significant
improvements. For example, “patient safety improved, led by a 17%
reduction in rates of hospital-acquired conditions between 2010 and
2013, with 1.3 million fewer harms to patients, an estimated 50,000
lives saved, and $12 billion in cost savings” (AHRQ, 2015, p. 2).
In New York state, the rate of various hospital-acquired infections
decreased 25–52 percent between 2015 and 2018 (NYSDOH, 2019).
Additionally, data from reporting hospitals showed a significant
increase in the number of hospitals that scored highly in a variety
of quality metrics, including reductions in situations that could
negatively affect patient outcomes and patient safety (AHA, 2018).

The reasons for these improvements include forced
transparency (hospitals and care facilities must report certain
quality indicators like infection and complication rates) as well as
incentives for good results and disincentives for poor results. For
example, in the US, many private insurance carriers and Medicare
now use a “pay for performance” model that gives a financial bonus
to healthcare providers if they meet or exceed certain patient-
focused outcomes or performance measures. Conversely, if a
patient acquires an infection while having surgery and must remain
hospitalized for extra days, insurance companies or Medicare will
not pay for the additional care; the hospital is thus penalized for
the mistake. These new payment models are having a significant
positive impact on the quality of patient care and desirable outcomes.

HIGH RELIABILITY ORGANIZATIONS

High reliability organizations (HROs) are those that have a high
level of risk and complexity but avoid catastrophic outcomes
because of diligent efforts in designing and implementing systems
and behaviors. Examples of HROs that have been well studied
include aircraft carriers, nuclear power plants, and air traffic control
operations. In contrast to HROs are organizations that emphasize
efficiency (Weick et al., 1999). Weick and his colleagues identified
that HROs have a “mindfulness” that allows them to discover and
manage unwanted events, often when only weak signals or just a
hint of a problem are available; this gives rise to the organization’s
reliability. They described five processes that contribute to
mindfulness:

• Preoccupation with failure: Full-blown failures do not occur
often in an HRO because the organization looks to understand,
predict, and prevent unwanted events. If an unwanted event
does occur, efforts are made to contain (or mitigate) the problem.
When a failure or a “near miss” occurs, the HRO tries to learn
as much as possible from that situation and make adjustments
for the future. In an HRO, finding and self-reporting errors are
commended.

• Reluctance to overly simplify interpretations: Simple is good
but simplistic is not. HROs have to match the complexities
of the environment that they are in with their own unique
complexities. They understand that there are issues unknown
to them that will undoubtedly appear and they must be ready
to address them using diverse teams and resources. A team
member who is skeptical and questions and double-checks
contributes to reliability.

• Sensitivity to operations: Situational awareness is the ability
to identify early on when a problem is beginning to appear and
also detect changes in the environment that could potentially
have an impact on what is happening. This requires having a
robust mental model of the process and constantly refreshing it
as things change.

• Commitment to resilience: HROs are constantly expecting
the unexpected. While they have prepared based on their
knowledge of the system and processes involved, they realize
previously unrecognized problems can arise, possibly because
of subtle external changes. They are ready for the unexpected.

• Empowered individuals: HROs usually have hierarchies, but
when confronted with a significant issue, hierarchies are reduced
and decision makers defer to those with expertise. Individuals
who are at the lowest levels of the hierarchy have the ability to
take certain actions, like stopping a process or activity. Others
with knowledge and different points of view can come in and
share their perspectives and expertise.

With this brief description of HROs, there are several points to
make that are germane to root cause investigations and corrective
actions. First, the preoccupation with failure is achieved because
efforts have been made to prevent as many failure modes as possible.
That is to say that HROs have aggressively addressed the causes of
problems so there are fewer incidents that occur; problems are an
anomaly, not the norm. Second, people involved with HROs have
a strong mental model or picture in their mind of what a correct
or normal operation is, and, if something is not consistent with
that mental model, they take immediate action. Third, those with
technical knowledge on a topic are listened to; in times of trouble,
technical competence trumps organizational status.

WHAT CAN WE LEARN FROM OTHERS AND APPLY TO WHAT WE DO?
We’ve examined four different types of industries or organizations
that have made strides in understanding specific and generalized
problems that can occur and have put in place solutions to prevent or
to reduce the impact of unwanted outcomes. From those examples,
we can propose several attributes of an investigation and corrective
action program important to the success of such efforts in pharma
and biopharma.

First, an emphasis on investigations and corrective actions
needs to be a priority for the organization. Having a mandate to
make improvements can be internal—e.g., from the president or
CEO of a company—but it could also be externally imposed, such as
from a governmental body or agency. Everyone needs to know that
finding ways to truly prevent unwanted events is not just important
but is critical.

Second, successful solutions are holistic; that is, they consider
the broader human, contextual, environmental, and systems aspects
of the issue. Fewer people are dying in car accidents not just because
car designs have improved but because of changes in highways,
traffic enforcement, and societal pressures on improper actions (e.g.,
driving while intoxicated, texting while driving).

Third, everyone has a role to play, from those who observe
a potential problem, to subject matter experts who share their
knowledge and insights, to management that supplies resources.

Fourth, incentives/disincentives can encourage/discourage
behaviors. Incentives and disincentives can both work in positive
and negative ways. If people are encouraged to self-report problems
but there is a punitive action when they do so, they will learn quickly
that self-reporting is probably not going to be in their best interest.
Conversely, if people are positively recognized for self-reporting,
treated fairly, and their experience is used to prevent a future
problem, others will notice and be more willing to step forward in
difficult circumstances.

Finally, as the proportion of system-, process-, and
equipment-related problems decreases, there will be an increase in problems
attributed to “human error.” To resolve these types of problems and
prevent recurrences, there needs to be a thorough understanding of
the processes, practices, and human factors involved. Control and
mitigation strategies need to be established that account for the fact
that people are not perfect and sometimes fail.


CONCLUSION
We have seen that there are reasons for investigating problems
and implementing corrective actions beyond simply wanting to
achieve regulatory compliance. Positive health outcomes, patient/
user safety, customer (stakeholder) satisfaction, and public health
are some examples; long-term cost reduction and other economic
benefits are others. Accomplishing these desirable outcomes
requires efficient, knowledge-generating investigations that point
to effective corrective actions, topics that will be covered in the
remaining chapters.

REFERENCES
AHA (2018) Trendwatch – Aligning efforts to improve quality.
American Hospital Association.
https://www.aha.org/system/files/2018-10/AHA_TrendWatch_Report_Quality_Healthcare_v31_pages.pdf.
Accessed 2 Mar 2020.

AHRQ (2015) 2014 National Healthcare Quality and Disparities
Report Patient Safety Chartbook. Rockville, MD: U.S. Agency for
Healthcare Research and Quality.

Brown, D.P. (2013) A few thoughts on Asiana Airlines flight 214 crash
at SFO. AirlineReporter, 8 Jul 2013.
http://www.airlinereporter.com/2013/07/a-few-thoughts-on-asiana-airlines-flight-214-crash-at-sfo/.
Accessed 19 Oct 2015.

Economist (2014) Why Sweden has so few road deaths. Economist,
26 Feb 2014.

FDA (2015) Drug shortages infographic.
http://www.fda.gov/Drugs/DrugSafety/DrugShortages/ucm441579.htm.
Accessed 15 Oct 2015.

Haddon, W.J. (1968) The changing approach to the epidemiology,
prevention, and amelioration of trauma: The transition to
approaches etiologically rather than descriptively based.
American Journal of Public Health, 58(8):1431–1438.

IIHS-HLDI (2015a) Saving lives: Improved vehicle designs bring
down death rates. Insurance Institute for Highway Safety/
Highway Loss Data Institute, Status Report 50:1, 29 Jan 2015.
http://www.iihs.org/iihs/sr/statusreport/article/50/1/1.
Accessed 15 Oct 2015.

IIHS-HLDI (2015b) Stopping power: IIHS rates 19 new models for
front crash prevention. Insurance Institute for Highway Safety/
Highway Loss Data Institute, Status Report 50, 2–7, 26 Aug 2015.

James, J.T. (2013) A new, evidence-based estimate of patient harms
associated with hospital care. Journal of Patient Safety, 9(3):122–128.

Joint Commission (2014) America’s hospitals: Improving quality and
safety. The Joint Commission’s annual report 2014. Oakbrook
Terrace, IL: Joint Commission.

Leveson, N.G. (2011) Engineering a Safer World: Systems Thinking
Applied to Safety. Cambridge, MA: MIT Press.

NAS (1999) Announcement of Report: To Err is Human. Washington,
DC: National Academy of Sciences.

NHTSA-DOT (2019) Fatality analysis reporting system (FARS)
encyclopedia. https://www-fars.nhtsa.dot.gov/Main/.
Accessed 19 Oct 2015.

NYSDOH (2019) Hospital-Acquired Infections: New York State 2013.
Albany, NY: New York State Department of Health.

O’Neill, B. and Mohan, D. (2002) Reducing motor vehicle crash
deaths and injuries in newly motorising countries. British
Medical Journal, 324(7346):1142–1145.

Rankin, W. (2007) MEDA Investigation Process. AeroMagazine, Qtr
2.07.

Reuters (2018) 2017 safest ever on record for commercial passenger
air travel: groups.
https://www.reuters.com/article/us-aviation-safety/2017-safest-year-on-record-for-commercial-passenger-air-travel-groups-idUSKBN1EQ17L.
Accessed 3 Feb 2020.

Savage, I. (2013) Comparing the fatality risks in United States
transportation across modes and over time. Research in
Transportation Economics, 43(1):9–22.

Vesper, J.L. (2006) Risk Assessment and Risk Management in the
Pharmaceutical Industry – Clear and Simple. Bethesda, MD: PDA/
DHI.

Weick, K.E., Sutcliffe, K.M., Obstfeld, D. (1999) “Organizing for high
reliability: Processes of collective mindfulness.” In Sutton, R.S.
and Staw, B.M. (Eds.), Research in Organizational Behavior, Vol.
21, pp. 81–123. Stamford, CT: Jai Press.


REGULATORY REQUIREMENTS AND EXPECTATIONS
(Portions of this chapter are from Chapter 6, “Discrepancy
Observations and Investigations,” in GMP in Practice, 5th Edition, 2018,
cowritten by James Vesper and Tim Sandle.)

Even in the best pharmaceutical companies, problems occur:
something goes wrong or a lab instrument fails. Good manufacturing
practices (GMPs) require that people be alert and able to recognize
a problem, investigate why the problem occurred, evaluate how
the problem affects other drug products, and identify actions to be
taken to prevent the problem from recurring.

This chapter presents six high-level expectations that a sample
of health authorities have related to investigations and corrective
actions. Specific sections of their regulations or requirements are
provided as well.

Having confidence that a step or process works consistently
reduces risk and provides other positive business and supply
benefits. One benefit of an investigation is that it adds to the
understanding of the process and product—you now have
information about what can cause or contribute to a problem or
unwanted event. This information may need to be shared with those
involved in development or technology transfer.

Investigations must be done according to a procedure; the
resulting report must be reviewed and approved by the quality unit
before the batch (or batches) involved can be released.

Rejected batches also must be investigated to understand what
went wrong in order to prevent future losses and rejections. Any
changes proposed in reacting to a problem or that are intended
to prevent problems in the future must go through the change
management process.

Most quality auditors and national authority inspectors would
agree that identifying the direct cause of a discrepancy as “human
error” is not adequate, as there is not enough information given
to be able to take action against that cause. Calling something
“human error” is as unsatisfying and imprecise as only using the
term “equipment failure” and not delving into the issue further.
If people were involved in the discrepancy, the investigators need
to understand what really contributed to the mistake, the so-
called human factors. Was the procedure unclear or was there poor
communication between people working in different departments
or on different shifts? As will be seen in Chapter 7, some pharma
firms are using accident models that are used in the aviation and
nuclear power industries as ways to provide better structure to their
investigations.

Investigation reports are important from an auditing and
inspection point of view as well. Quality auditors and national
authority inspectors request problem investigation reports so they
can better understand the firm’s problem-solving processes and
abilities and also to determine how much risk the firm’s products
pose to those using them. There are numerous examples in which an
inspection team finds investigations inadequate and the firm recalls
products from the marketplace.

DIFFERENCES IN EXPECTATIONS BETWEEN MEDICAL DEVICES AND DRUGS
When comparing the requirements and expectations between
medical devices and pharma/biopharma products, there are some
subtle differences. The current regulations for medical devices
are rooted in the International Organization for Standardization
(ISO) quality standards such as ISO-9000 and ISO-13485:2016
(Medical devices — Quality management systems — Requirements for

regulatory purposes) and in guidance originally published by the
Global Harmonization Task Force (GHTF). (The GHTF became the
International Medical Device Regulators Forum or IMDRF.)

One of the key differences is that the medical device sector sees
corrective and preventive action (CAPA) as the methods of collecting,
analyzing, and evaluating information and then taking corrective
actions to remedy a problem and preventive actions on anticipated
issues. Also, process and product improvements are part of this
integrated quality system element. Regulators also expect device
firms to be regularly mining their data from or regarding customer
feedback, manufacturing issues, audits, suppliers, and the like. This
is specifically called out in ISO-13485:2016:
The organization shall document procedures to determine, collect and
analyse appropriate data to demonstrate the suitability, adequacy and
effectiveness of the quality management system. The procedures shall
include determination of appropriate methods, including statistical
techniques and the extent of their use (p. 24).

In pharma and biopharma, we tend to have more independent,
stand-alone quality system elements that connect. For example,
there are complaint and pharmacovigilance systems that might
trigger an investigation that could result in a corrective action. Also,
there is not a stated regulatory requirement that the same sort of
data analysis occur.

WHAT REGULATORS HAVE BEEN FINDING

In their periodic reports on issues related to noncompliance with drug
GMPs, health authorities frequently cite inadequate investigations.

For example, the second most frequent observation made by
US FDA inspectors visiting drug manufacturing facilities in fiscal
year 2019 (1 October 2018 to 30 September 2019) was inadequate
investigations of discrepancies (Unger, 2019).

In their report that presented 2016 inspection data, the United
Kingdom’s Medicines and Healthcare products Regulatory Agency
(MHRA) provided examples of some of their specific findings related
to root cause investigations and CAPAs (MHRA, 2016):


Deficiencies related to incident investigations and corrective and
preventive action (CAPA) implementation:

• Deviation reports did not contain sufficient information to
describe the investigations conducted or demonstrate the
evidence that supported the proposed root cause.

• In some cases there were no formal CAPA raised and in others
the CAPA were not adequate. There was no review of repeated
deviations which would indicate a trend or failure of CAPAs to
resolve the issue.

• The site had not established and maintained an effective
control system to monitor process and product quality, and
had not applied an appropriate level of investigation or fully
documented all potential serious incidents, with the objective
of determining the root cause and implementing appropriate
corrective and preventive action.

The deviation procedure lacked sufficient detail to ensure that
investigations were appropriately thorough:

• There was no requirement to identify the impact of the deviation
on the batch.

• There was no process to escalate deviations in a timely manner
in the event of an issue having the potential to present a patient
safety impact.

• There was no procedural requirement to consider if the deviation
had occurred previously.

• There was no requirement to ensure that process, procedural
or systems based errors had not been overlooked prior to
identifying ‘Personnel Error’ as a root cause.

• There was no timeline for completion of the deviations in the
procedure (other than ‘in a timely manner’).

• The root causes recorded were not always those identified in the
procedure.

• At least eight overdue CAPA (ranging from 59 days to 242 days
overdue) were observed to have been closed the day before the
inspection.

• Two overdue CAPA were open at the time of the inspection (186
days and 60 days overdue).

• Where 134 deviations were raised between November 2015 and
February 2016, no CAPA were raised.

• Effective monitoring of CAPA was not in place as numerous
CAPA with different due dates could be recorded on a single
form but only the latest date was tracked.

• The review of effectiveness of CAPAs was identified as being
part of Management Review; however there was insufficient
detail describing this process and the process was not risk based
as the Management Review was only carried out once a year.
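
One of the findings above describes a tracking gap: several CAPA were recorded on one form, but only the latest due date was monitored. The short Python sketch below illustrates one way to avoid that gap by tracking every action against its own due date. It is an illustration only and is not taken from the MHRA report or from any regulation; the record layout, the identifiers, and the function names are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CapaAction:
    """One corrective or preventive action with its own due date (hypothetical record)."""
    capa_id: str
    due: date
    closed: Optional[date] = None  # None means the action is still open


def days_overdue(action: CapaAction, today: date) -> int:
    """Days past the due date for an open action; 0 if closed or not yet due."""
    if action.closed is not None:
        return 0
    return max((today - action.due).days, 0)


def overdue_report(actions: List[CapaAction], today: date) -> List[Tuple[str, int]]:
    """Return (capa_id, days overdue) for each open, overdue action, worst first."""
    late = [(a.capa_id, days_overdue(a, today)) for a in actions]
    return sorted((item for item in late if item[1] > 0), key=lambda item: -item[1])
```

Because each action carries its own due date, an early action that slips cannot be hidden behind a later date recorded on the same form.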

GMP EXPECTATIONS
If one were to ask what inspectors from the US, Europe, Canada, and
the World Health Organization expect regarding investigations and
corrective actions, the answer may include the following summary
of expectations found in the GMP regulations from those countries,
regions, and organizations (FDA, 2019; EC, 2020; Health Canada, 2018;
WHO, 2011). Guidelines published by the International Conference on
Harmonisation (ICH, 2008) are also included in this review.

1. All personnel are alert and able to observe potential
problems and unusual or atypical situations, materials, and
results (i.e., “discrepancies”).

• Everyone in a drug manufacturing facility needs to be
continually alert for situations that could result in problems.
In general, these are called discrepancies. Sometimes these are
very subtle: a chromatogram that has a peak with a slightly
different shape, or a raw material that is chunky instead of a
free-flowing powder. Experienced personnel frequently sense
something being different before they can fully describe what
is wrong.

• When anything looks odd or different or when a procedure
or method cannot be followed as written, supervisors, the
quality unit, or other appropriate people must be informed
immediately. Delaying or taking no action could result in more
serious problems that could have been avoided. Sometimes, for
instance if there is a stability failure of a marketed product, a
regulatory authority like the FDA needs to be notified within
72 hours.

GMP Reference Examples

Any unexplained discrepancy (including a percentage of theoretical
yield exceeding the maximum or minimum percentages established
in master production and control records) or the failure of a batch
to meet any of its specifications shall be thoroughly investigated,
whether or not the batch has already been distributed [§ 211.192].

Any deviation from instructions or procedures is avoided. If
deviations occur, qualified personnel investigate, and write a report
that describes the deviation, the investigation, the rationale for
disposition, and any follow-up activities required. The report is
approved by the quality control department and records maintained
[C.02.011 #5].

Any significant or unusual discrepancy observed during
reconciliation of the amount of bulk product and printed packaging
materials and the number of units packaged is investigated and
satisfactorily accounted for before release [C.02.011 #42].

Damage to containers and any other problem which might adversely
affect the quality of a material should be investigated, recorded and
reported to the Quality Control Department [EU 5.4].

WHO

Records are made (manually and/or by recording instruments)
during manufacture to show that all the steps required by the
defined procedures and instructions have in fact been taken and
that the quantity and quality of the product are as expected; any
significant deviations are fully recorded and investigated [WHO
Annex 3–GMP, 2.1(f)].

ICH

(There are no references in Q8(R2), Q9, or Q10 for this specific
expectation.)

2. Procedures define the actions to be taken when a
discrepancy or atypical result is observed.

• Different types of discrepancies may involve different
responses, for example, who to call, what must be done
immediately, and how to investigate the problem.
• Different firms have different terms they use for discrepancies—
some call them “deviations,” “special occurrences,” or
“difficulties.” Whatever terms are used, they should be defined
in the standard operating procedure (SOP).
• The procedure would identify specific situations when
investigations are required, such as rejections, complaints, out
of specification results, and failures.
• The procedure usually mandates that the investigation be
completed in a certain period of time—generally in less than
30 days from when the problem was first observed. Often,
a thorough investigation takes more time to complete. If this
happens, interim reports should be prepared and approved by
QA describing the status, results to date, and what yet needs to
happen.
• Some firms use definitions and decision trees to categorize a
deviation as “critical,” “major,” or “minor,” each with its own
path for documentation and investigation. This type of triage can
help focus attention on the more significant, higher-risk issues.
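
The triage and timing ideas in the bullets above can be sketched in a few lines of Python. This is a minimal illustration, not a regulatory requirement or any firm's actual procedure: the two yes/no questions used for categorization, the function names, and the default 30-day limit as a parameter are assumptions that a real SOP would define for itself.

```python
from datetime import date, timedelta
from enum import Enum


class Category(Enum):
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"


def categorize(patient_safety_risk: bool, product_quality_impact: bool) -> Category:
    """Tiny decision tree; the two questions are hypothetical triage criteria."""
    if patient_safety_risk:
        return Category.CRITICAL
    if product_quality_impact:
        return Category.MAJOR
    return Category.MINOR


def investigation_due(observed: date, limit_days: int = 30) -> date:
    """Target completion date, counted from when the problem was first observed."""
    return observed + timedelta(days=limit_days)


def needs_interim_report(observed: date, today: date, closed: bool,
                         limit_days: int = 30) -> bool:
    """True when an open investigation has run past its target date."""
    return (not closed) and today > investigation_due(observed, limit_days)
```

A procedure built this way makes the escalation path explicit: the category drives how the deviation is documented and investigated, and the due date drives when an interim report is prepared for QA approval.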

GMP Reference Examples

There shall be a quality control unit that shall have the responsibility
and authority to approve or reject all components, drug product
containers, closures, in-process materials, packaging material,
labeling, and drug products, and the authority to review production
records to assure that no errors have occurred or, if errors have
occurred, that they have been fully investigated [§ 211.22(a)].

There shall be written procedures for production and process
controls designed to assure that the drug products have the identity,
strength, quality, and purity they purport or are represented to
possess. Such procedures shall include all requirements in this
subpart [§ 211.100(a)].

Deviations and borderline conformances are evaluated in accordance
with a written procedure [C.02.014 #2.1].

The Quality Control Department as a whole will also have other
duties, such as . . . participate in the investigation of complaints
related to the quality of the product, etc. All these operations should
be carried out in accordance with written procedures and, where
necessary, recorded [EU 6.2].

WHO

Any deviation from instructions or procedures should be avoided
as far as possible. If deviations occur, they should be done in
accordance with an approved procedure. The authorization of the
deviation should be approved in writing by a designated person,
with the involvement of the quality control department, when
appropriate [WHO Annex 3–GMP, 16.3].

ICH

A structured approach to the investigation process should be used
with the objective of determining the root cause [Q10, 3.2.2].

3. Discrepancies are documented and investigated.

• It is critical that the investigation begins immediately, before
data, evidence, and people’s memories are lost or begin to fade.
The first 24 hours after an accident are frequently called “the
golden hours” by investigators.

• Investigations must be thorough and credible; they should be
done using a standardized approach. For significant issues, tools
like “root cause analysis,” Kepner-Tregoe®, or Cause Mapping®
can provide a highly structured approach to finding the direct
and contributing causes.

• At the very least, the person observing or discovering the
discrepancy should immediately document the event.

• Constructing a flowchart or chronology (timeline) of the incident
can be valuable in assessing the logic and completeness of the
investigation.

• The rigor and scope of the investigation should be related to
the potential impact of the discrepancy on the safety, identity,
strength, and purity of the product and the risks to the drug
firm and the end users.

• “Human error” is not an adequate conclusion in an investigation,
as it does not provide enough information to use in creating a
corrective action. (What do you do, get rid of the humans?) Often,
more specific accident investigation models that take into account
human factors are used, such as models from the aviation or
nuclear power industries.

GMP Reference Examples

There shall be a quality control unit that shall have the responsibility
and authority . . . to review production records to assure that no
errors have occurred or, if errors have occurred, that they have been
fully investigated [§ 211.22(a)].

All drug product production and control records, including
those for packaging and labeling, shall be reviewed and approved
by the quality control unit to determine compliance with all
established, approved written procedures before a batch is
released or distributed. Any unexplained discrepancy (including a
percentage of theoretical yield exceeding the maximum or minimum
percentages established in master production and control records)
or the failure of a batch to meet any of its specifications shall be
thoroughly investigated, whether or not the batch has already been
distributed. The investigation shall extend to other batches of the
same drug product and other drug products that may have been
associated with the specific failure or discrepancy [§ 211.192].

CA
Any deviation from instructions or procedures is avoided. If
deviations occur, qualified personnel investigate, and write a report
that describes the deviation, the investigation, the rationale for
disposition, and any follow-up activities required. The report is
approved by the quality control department and records maintained
[C.02.011 #5].

Deviations and borderline conformances are evaluated in
accordance with a written procedure. The decision and rationale
are documented. Where appropriate, batch deviations are subject
to trend analysis [C.02.014 #2.1].

Any non-conformances, malfunctions or errors including those
pertaining to premises, equipment, sanitation, and testing, that may
have an impact on the quality and safety of batches pending release
or released, should be assessed and the rationale documented
[C.02.014 #2.2].

EU
Quality control personnel should have access to production areas
for sampling and investigation as appropriate [EU 6.4].

Out of specification or significant atypical trends should be
investigated. Any confirmed out of specification result, or significant
negative trend, affecting product batches released on the market
should be reported to the relevant competent authorities. The
possible impact on batches on the market should be considered
[EU 6.35].

Products which have been involved in an unusual event should
only be reintroduced into the process after special inspection,
investigation and approval by authorized personnel. Detailed
record should be kept of this operation [EU 5.60].

Any significant or unusual discrepancy observed during
reconciliation of the amount of bulk product and printed packaging
materials and the number of units produced should be investigated
and satisfactorily accounted for before release [EU 5.61].

All incidents, not only system failures and data errors, should
be reported and assessed. The root cause of a critical incident should
be identified and should form the basis of corrective and preventive
actions [EU Annex 11, 13].

WHO
Any deviation from instructions or procedures should be avoided
as far as possible. If deviations occur, they should be done in
accordance with an approved procedure. The authorization of the
deviation should be approved in writing by a designated person,
with the involvement of the quality control department, when
appropriate [WHO Annex 3–GMP, 16.3].

Quality control personnel must have access to production
areas for sampling and investigation as appropriate [WHO Annex
3–GMP, 17.6].

ICH
A structured approach to the investigation process should be
used with the objective of determining the root cause. The level of
effort, formality, and documentation of the investigation should be
commensurate with the level of risk, in line with ICH Q9. CAPA
methodology should result in product and process improvements
and enhanced product and process understanding [Q10, 3.2.2].

[Quality risk management can be used] to identify potential
root causes and corrective actions during the investigation of out of
specification results [Q9 II.7].

4. For significant discrepancies, a written report is prepared.

• The written record of the investigation must include all
observations, studies performed, recommended disposition of
the product in question, and actions recommended to prevent a
recurrence of the failure or discrepancy.

• Any unrelated deviations observed during an investigation are
reported to the appropriate manager.

• Investigations must be thorough and complete, but conducted
in a timely manner. Preventing similar occurrences may avoid
unnecessary financial loss.

• While there is not an official format required by regulatory
authorities, most firms have a standard outline used for a formal
investigation report, as shown below:

First page:
Incident title, number, date of report, and summary (what was the
problem, what was the cause, what will be done).

Following pages:

a. Incident description: basic facts.
b. Immediate actions: what was done when the incident was
discovered to prevent the situation from getting worse.
c. Scope of the investigation: what was affected (product
name, equipment, material, lot numbers).
d. Historical review results.
e. Investigation findings and results (what was looked at and
what was found or not found).
f. Cause conclusion.
g. Product impact assessment and justification.
h. Corrections made on the lot/materials/equipment involved.
i. Corrective actions.
j. Preventive actions.
k. Recommended disposition.
l. Background information (optional).
m. Documents attached (list).
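
An outline like the one above lends itself to a simple completeness check before a report goes for approval. The Python sketch below is one hedged way to do that; it assumes a draft report is held as a dictionary of section title to text, which is a hypothetical representation rather than a feature of any particular quality system. Section titles follow the outline, and "Background information" is left out of the required list because the outline marks it optional.

```python
from typing import Dict, List

# Required sections, in outline order (items a to k and m above).
REQUIRED_SECTIONS: List[str] = [
    "Incident description",
    "Immediate actions",
    "Scope of the investigation",
    "Historical review results",
    "Investigation findings and results",
    "Cause conclusion",
    "Product impact assessment and justification",
    "Corrections made",
    "Corrective actions",
    "Preventive actions",
    "Recommended disposition",
    "Documents attached",
]


def missing_sections(report: Dict[str, str]) -> List[str]:
    """Return required sections that are absent or contain only whitespace."""
    return [name for name in REQUIRED_SECTIONS
            if not report.get(name, "").strip()]
```

A check like this only confirms that each section has content; whether the content is thorough and supported by evidence still requires review by the quality unit.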

GMP Reference Examples

A written record of the investigation shall include the conclusions
and follow-up [§ 211.192].

Any deviation from instructions or procedures is avoided. If

Any deviation from instructions or procedures should be avoided
as far as possible. If a deviation occurs, it should be approved in
writing by a competent person, with the involvement of the Quality
Control department when appropriate [EU 5.15].

The basic requirements of quality control are that: . . . records
are made, manually and/or with or by recording instruments, which
demonstrate that all the required sampling, inspecting, and testing
procedures were actually carried out. Any deviations are fully
recorded and investigated [EU 1.4(iv)].

WHO

The system of quality assurance appropriate to the manufacture
of pharmaceutical products should ensure that: . . . deviations are
reported, investigated and recorded [WHO Annex 3–GMP, 1.2(j)].

ICH
The level of effort, formality, and documentation of the investigation
should be commensurate with the level of risk, in line with ICH Q9
[Q10, 3.2.2].

5. The investigation is expanded as needed to consider other
products, raw materials, equipment, personnel, methods, and
the like that may have some connection to the deviation.

• It is not unusual that once a discrepancy or problem is noticed in
one batch or product and the investigation begins, other batches
or products are found that have that same problem.
• During the investigation, team members need to consider if
there is a common cause of the problem that could affect other
products, materials, or methods.
• Quality auditors and national authority inspectors sometimes
ask a firm, “How do you know that this problem has not affected
other products or batches?” The auditors and inspectors would
expect to see data that supports the decision.
• Retained samples, batch records, certificates of analysis, and
test data may be used to help answer this question.
• Risk assessment tools can help in making decisions about how
far to extend an investigation.
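
The inspector's question quoted above, "How do you know that this problem has not affected other products or batches?", is easier to answer when batch records can be searched for shared inputs. The Python sketch below shows the idea under stated assumptions: the batch numbers, the lot and equipment identifiers, and the simple mapping of batch to inputs are all hypothetical, and a real system would draw this information from batch records or an electronic genealogy.

```python
from typing import Dict, Set


def related_batches(genealogy: Dict[str, Set[str]], affected: str) -> Dict[str, Set[str]]:
    """Map every other batch to the inputs it shares with the affected batch.

    `genealogy` maps a batch number to the set of raw material lots,
    equipment IDs, and similar inputs recorded for that batch.
    """
    suspect_inputs = genealogy[affected]
    related: Dict[str, Set[str]] = {}
    for batch, inputs in genealogy.items():
        if batch == affected:
            continue
        shared = inputs & suspect_inputs  # set intersection
        if shared:
            related[batch] = shared
    return related
```

A shared input does not prove that another batch is affected; it identifies where retained samples, test data, and a risk assessment should be examined next.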

GMP Reference Examples

The investigation shall extend to other batches of the same drug
product and other drug products that may have been associated
with the specific failure or discrepancy. A written record of the
investigation shall include the conclusions and follow-up [§ 211.192].

Should any failure to conform to finished product testing
requirements be identified, an investigation of the extent of the
noncompliance is to be conducted. This investigation may lead to
reassessment and re-testing of all dosage forms from the fabricator.
This procedure may include: re-evaluation of GMP compliance;
and additional complete confirmatory testing, based on the risk
associated with the noncompliance [C.02.019 #5, 5.1, 5.2].

If a quality defect is discovered or suspected in a batch, consideration
should be given to checking other batches and in some cases other
products, in order to determine whether they are also affected. In
particular, other batches which may contain portions of the defective
batch or defective components should be investigated [EU 8.11].

WHO

If a product defect is discovered or suspected in a batch, consideration
should be given to whether other batches should be checked in order
to determine whether they are also affected. In particular, other
batches that may contain reprocessed product from the defective
batch should be investigated [WHO Annex 3–GMP, 5.6].

ICH

The level of effort, formality, and documentation of the investigation
should be commensurate with the level of risk, in line with ICH Q9
[Q10, 3.2.2].

6. The investigation is completed and approved by the quality
control unit before the final disposition of the material or drug
product is approved/certified by the quality control unit or
those responsible for batch release.

• Generally, investigations of significant problems involve a team
of experts from different disciplines, including someone from
the quality unit.

• The investigation must be completed and approved by the
quality control unit before the lot of finished drug product can
be considered for release.

• Those releasing the product (whether the quality unit or the
qualified person) must be confident that the product’s safety,
identity, strength, purity, and quality have not been negatively
affected by the discrepancy or incident.

GMP Reference Examples

All drug product production and control records, including

Any deviation from instructions or procedures is avoided. If

The assessment for the release of finished products embraces
all relevant factors, including the production conditions, the results
of in-process testing, the fabrication and packaging documentation,
compliance with the finished product specifications, an examination
of the finished package, and if applicable, a review of the storage
and transportation conditions [C.02.014 #2].

Arrangements are made for the manufacture, supply and use
of the correct starting and packaging materials, the selection and
monitoring of suppliers and for verifying that each delivery is from
the approved supply chain [EU #1.4(vi)].

The QPs certifying the different finished product batches
may base their decision on the quality control testing of the first
imported finished batch provided that a justification has been
documented based on Quality Risk Management principles [EU
Annex 16, #1.5.7].

WHO

Assessment of finished products should embrace all relevant factors,
including the production conditions, the results of in-process
testing, the manufacturing (including packaging) documentation,
compliance with the specification for the finished product, and an
examination of the finished pack [WHO Annex 3–GMP, #9.2].

ICH

(No specific GMP references are found in Q8, Q9, or Q10.)

REFERENCES
EC (2020) EudraLex - Volume 4 - Good Manufacturing Practice
(GMP) guidelines. European Commission.
https://ec.europa.eu/health/documents/eudralex/vol-4_en.
Accessed 2 Mar 2020.

FDA (2019) Current good manufacturing practice, 21 CFR Part 211.
https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRSearch.cfm?CFRPart=211.
Accessed 2 Mar 2020.

Health Canada (2018) Good manufacturing practices guide for drug
products (GUI-0001).
https://www.canada.ca/en/health-canada/services/drugs-health-products/compliance-enforcement/good-manufacturing-practices/guidance-documents/gmp-guidelines-0001.html.
Accessed 2 Mar 2020.

ICH (2008) Pharmaceutical quality system, Q10. International
Conference on Harmonisation.
https://database.ich.org/sites/default/files/Q10_Guideline.pdf.
Accessed 2 Mar 2020.

MHRA (2016) MHRA GMP inspection deficiency data trend 2016.
Medicines & Healthcare products Regulatory Agency.
https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/609030/MHRA_GMP_Inspection_Deficiency_Data_Trend_2016.pdf.
Accessed 2 Mar 2020.

Unger, B. (2019) FDA FY2019 Drug inspection observations and
trends. Pharmaceutical Online.
https://www.pharmaceuticalonline.com/doc/fda-fy-drug-inspection-observations-and-trends-0001.
Accessed 2 Mar 2020.

WHO (2011) Forty-fifth report of the WHO Expert Committee on
specifications for pharmaceutical preparations. WHO technical
report series, no. 961.
http://whqlibdoc.who.int/trs/WHO_TRS_961_eng.pdf.
Accessed 2 Mar 2020.

ROLES AND RESPONSIBILITIES

Who might be involved in the process of observing and responding
to a quality event or unwanted incident? Who could be indirectly
involved in setting up that process or monitoring it? What do they
need to be able to do? What do they need to know? And what about
those who evaluate these activities as either quality auditors or
health authority inspectors? How many people should be taking an
active part in the investigation?

This chapter examines the roles and responsibilities of these
people and the competencies they need to have, as well as other
important factors for those doing investigations.

To begin, let’s define what we mean by competency.

COMPETENCIES AND COMPETENCY-BASED TRAINING

A competency is the set of knowledge, skills, and attitudes that are
needed for a person to be proficient at a particular task or job.

Competency-based training is used in a variety of professional
and vocational development programs as a way for a person to
acquire and then demonstrate the knowledge and skills needed to
safely and effectively perform the tasks required in a given role or
position. The competency approach differs from an academic model
in that the goal of the learner is not just to receive a grade, advance
to the next level, and then graduate with a degree or certificate, but
rather to be able to achieve a defined level of performance.
Competency-based training is not concerned with groups, but rather
with the individual: what can be done to help this person optimally
achieve the intended goal. For some it might require more coaching and
practice. For others, less. Often, structured on-the-job training is
integrated with a competency-based approach that gives a level of
confidence that the person can successfully perform the task.

Various versions of competency models exist; however, they
all incorporate a combination of a person’s capabilities, basic or
foundational competencies, and job/role-specific competencies.

When assessing if a person has acquired the knowledge and
skills, a pass/fail scoring system is used with behavioral criteria that
can be readily observed. Additionally, the task is broken down to
include the critical success factors for the task, which makes it
easier to remediate an unsuccessful performance. For critical tasks
(experts or trainers must define what criticality is for a task in a
given context), such as aseptic manipulations or preparing a sample
for analysis, it may be useful for the person to complete multiple
successful performances.
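
The pass/fail logic described above can be sketched in Python. This is only an illustration under stated assumptions: the function names and the way observations are recorded (a mapping from each critical success factor to whether it was observed) are hypothetical, and counting the total number of successful performances, rather than consecutive ones, is a simplifying choice that a training program might define differently.

```python
from typing import Dict, List


def failed_factors(observed: Dict[str, bool], critical_factors: List[str]) -> List[str]:
    """Critical success factors that were not observed in one performance."""
    return [f for f in critical_factors if not observed.get(f, False)]


def performance_passed(observed: Dict[str, bool], critical_factors: List[str]) -> bool:
    """Pass/fail: a performance passes only if every critical factor was observed."""
    return not failed_factors(observed, critical_factors)


def is_qualified(attempts: List[Dict[str, bool]], critical_factors: List[str],
                 required_passes: int = 1) -> bool:
    """True once the number of passed performances reaches the required count."""
    passes = sum(performance_passed(a, critical_factors) for a in attempts)
    return passes >= required_passes
```

Listing the failed factors for an unsuccessful attempt points the coach to exactly what needs more practice, which is the benefit of breaking the task down in the first place.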

Advantages of competency-based training include that it
is specific to what the learner/performer does in his or her job;
unnecessary content is not included. Also, it is focused on the
individual with the goal of helping the individual succeed;
adjustments are made as needed to provide remedial coaching and
extra practice time. Even with this individual focus, a competency-
based approach that includes structured on-the-job training will
usually take less time to move someone from a novice to a confident,
competent performer than traditional training. Drawbacks of
competency-based training are that it takes time and expertise to
initially identify what the competencies are and to establish the
learning plans for personnel. This approach also requires more
active involvement from instructors, coaches, and mentors.

When developing a competency model, there are certain
competencies that are “universal” in a good manufacturing practice
(GMP) or good practice (GxP) environment. For example,
record-keeping techniques incorporating the “ALCOA-plus” specifications
that result in reliable and trustworthy records must be used by those
in clinical research, pharmacovigilance, product development,
manufacturing, and distribution. Other competencies are more job
and role specific.

SPECIFIC COMPETENCIES FOR THOSE INVOLVED IN INVESTIGATIONS
The following list presents several different roles and recommended
competencies for each role:

• Production operator, technician, QC analyst

– Observes situations, events, results, etc. that are not as
anticipated and takes appropriate immediate action.

– Has a working understanding of equipment, instruments,
processes, methods, and systems in order to diagnose and
solve problems.

– When a problem or mistake occurs, looks for ways to
prevent a recurrence.

– Provides support to those conducting the investigation.

• Subject matter expert/investigation team member

– Observes situations, events, results, etc. that are not as
anticipated and takes appropriate action.

– Has a detailed working understanding of equipment,
instruments, processes, methods, and systems in order to
diagnose and solve problems.

– Looks for and shares ideas that could improve tasks, quality,
and compliance.

– Applies a methodical problem-solving approach.

– Critically evaluates options.

– When a problem or mistake occurs, looks for ways to
prevent a recurrence.

• Lead investigator/team facilitator

– Applies risk-based thinking when evaluating the criticality
of an event and determining the actions to take.

– Executes effective methods for investigating incidents,
problems, etc.

– Executes procedures for identifying, implementing, and
checking the effectiveness of corrective actions.

– Uses appropriate approaches to understand and effectively
remediate “human error” situations.

– Communicates with management on situations that need
their timely involvement in resolving.

– Selects appropriate tools/methods for investigating.

– Applies critical thinking.

– Facilitates/leads team as appropriate.

– Demonstrates effective time management skills.

• Report writer

– Writes the investigation report in a clear, fact-based, logical,
concise manner.

– Includes relevant, required details.

– Writes using correct structure and grammar.

– Demonstrates effective time management skills.

• Quality unit leadership

– Establishes and implements procedures and systems
that help to identify problems and other undesired
situations and to monitor trends.

– Establishes and implements ways of monitoring and
trending environment quality, product quality, unwanted
incidents, deviations, etc.

– Establishes and implements effective methods for
investigating incidents, problems, etc.

– Establishes and implements procedures for identifying,
implementing, and checking the effectiveness of corrective
actions.

– Establishes a “lessons learned” program for events that
went well and those that were problematic.

– Communicates relevant QA and GMP compliance
requirements to management.

– Applies risk-based thinking when evaluating the criticality
of an event and determining the actions to take.

• Management

– Recognizes the importance and value of an effective
investigation and corrective action program.

– Recognizes the importance and value of an effective
program to collect lessons learned.

– Supports meeting GMP compliance requirements related
to investigations and corrective actions.

– Makes data-driven, risk-based decisions, takes actions,
and provides resources that will contribute to an effective
monitoring, investigation, and corrective action system.

• Quality auditor/health authority inspector

– Evaluates the overall approach for monitoring,
investigations, and corrective actions.

– Evaluates examples of investigation reports.

– Applies risk-based thinking when evaluating the criticality
of a deficiency.

– Uses the appropriate regulatory reference when making an

observation of noncompliance.

– Explains the underlying issue(s)/risk(s) as to why a citation

is being made.
The person and his or her roles can quickly change over time: at
one moment, a technician may observe a problem and then become
involved as a subject matter expert (SME) assisting with the
investigation. Or the lead investigator may also facilitate meetings
with SMEs and then write up the investigation report. These types of
quick role changes are most common in small organizations where
personnel wear multiple hats.

While competencies can be developed through training,

coaching, and experience, it is important that there is a good
foundational match between the person and the role. The quality
unit of a large biopharma firm observed that the investigation reports
that were being written lacked clarity, good grammatical structure,
and key elements of what they (and industry practice) wanted to
include in an adequate investigation. The quality leaders thought that
extensive training and coaching sessions for all the writers would be
the solution to the problem. As part of understanding the problem,
the learning consultant who was brought in interviewed nearly all of
the writers. The predominant comment from the writers was that
they did not want this task of writing reports. Many said they did
not go to college or university because they didn’t like writing. They
said their management gave them this extended assignment as a
“development opportunity,” but they were floundering in it. The
recommendation made to quality leadership was that there needed
to be better alignment between the task, the underlying skills needed,
and the desire of the personnel to acquire or enhance those skills.

What would training have done? For most of those assigned

to write up the investigations, it would have been frustrating,
agonizing, and a waste of time and resources.

DEVELOPING COMPETENCIES
The combination of formal training, learning by doing, and being
coached is a very effective way of becoming competent. It is a
mistake, however, to assume that everyone needs the same type and
level of training. In most cases, it is more effective and efficient to
have a smaller number of lead investigators and report writers who
do many investigations than many people who do only one or so a
year. The more you do, the better you get.

One of the fundamental topics in a new-hire training orientation

is what to do if one should observe a deviation or something that
goes wrong. Alerting a supervisor or a senior person should be an
immediate action, as should anything that can be done to “stop the
bleeding.” (See Chapter 11.)

One type of training that has been shown to be effective is for

teams—including investigators, SMEs, and quality reviewers—to be
in the same training together. This helps to standardize expectations
and, by using a combination of writing/reviewing activities,
“calibrate everybody’s eyeballs” so all are looking at things the
same way.

WHO “OWNS” THE PROBLEM?

When an unwanted event occurs that requires an investigation, it
is important to know who owns the problem. While GMPs require
that the quality unit approve the investigation report, that is not the
same as being responsible for ensuring that an appropriately
thorough investigation occurs, that corrections and corrective
actions have been identified, are aligned with the cause(s), and are
implemented, and that a well-written report is prepared. The owner
should be part of the review and approval process.

Often the owner is the manager of the area where the problem
was found. This should be defined in a policy or procedure.

HOW BIG SHOULD THE TEAM BE?

The unsatisfying answer is “it depends.” Factors to consider
would include the scope and significance of the deviation. For
most investigations, one or maybe two people should be able to
investigate and write up the report. Where a team becomes valuable

is when the investigation is large, complex, or involves complicated

systems or processes. The broader the scope, the more people would
typically be involved.

WHAT MAKES FOR A SUCCESSFUL TEAM?

In the early 2010s, Google began a formal research program on what
makes a successful team (Duhigg, 2016). They found that what was
most important was not who was on the team, but rather how the
team worked together. The five specific characteristics that were
identified, in decreasing order of significance, were:

• Psychological safety: What was the team members’ willingness

to risk embarrassment by saying something controversial,
asking a “stupid” question, or admitting a mistake? If members
did not sense this psychological safety, they would be less
effective.

• Dependability: Team members could be counted on to follow

through on tasks and assignments.

• Structure and clarity: Team members understood what was to

be done and how the tasks could be accomplished.

• Meaning: The work itself and the product generated gave each
team member a sense of pride and accomplishment.

• Impact: Team members felt that what they did was important,
contributing to something of value (re:Work, 2020).

Whether the work group is developing a new product or, in our

case, investigating a significant problem or failure, teams that exhibit
these characteristics will be more effective in reaching their goal.

THE VALUE OF A TEAM

Many of the investigations that are performed are relatively simple—
an investigator who interviews a technician or SME may be all that
is needed. For large, complicated, or significant investigations,
however, having a team—even just two or three members—can

contribute to a better outcome. Having a cross-functional group

of team members adds considerable value: having different
perspectives, experiences, and knowledge will bring about a more
robust and richer result. There are several reasons for this.

First, if the team members have different backgrounds or

experiences, they can use those in the observations they make. For
example, someone might look at a set of coins and identify them
just as money. On the other hand, someone who has traveled
internationally might recognize a loonie from Canada, a Euro, or
a 500 yen coin from Japan. As we will see later when we discuss
observation skills (Chapter 9), being able to specifically articulate—
use specific names or correct words for what one sees—is very
valuable.

Second, people with different backgrounds can ask important

questions—things that you or I may not think to ask. Don’t
discount those without expertise in a specific domain, for example
microbiology or the operation of a high-purity water system. Their
“naïve” questions can crack open a set of assumptions that other
team members thought were true facts.

Third, your team may have subject matter experts—those

working in a field for ten years—who can recognize when something
is wrong but, more importantly, when something is missing. Also, a
true expert can improvise if a defined procedure does not cover how
to address a particular or novel situation (Klein, 2017).

HOW TO BE A GOOD FACILITATOR IF YOU ARE

LEADING AN INVESTIGATION
(This section is adapted from Vesper and McFarland, 2019.)

If you are leading an investigation, and you have SMEs, management,

and others working together and having a meeting, they’re sure to
achieve the desired outcome, right? Perhaps that is true for a simple
investigation, but success is not very likely for one that is even
moderately complicated. That’s where a skilled facilitator is needed.

The root of the word “facilitate” is facile, which means to make

easy. That really says it all. Facilitators are a neutral third party with

a set of special skills and abilities to keep the group on task, follow
acceptable norms and practices, and help them reach their goal. In
other words, facilitators help establish and strengthen relationships
among the group’s members so they can effectively and efficiently
accomplish what they need to.

So, how do you work with the group to make the investigation
process as easy as possible?

First, as a facilitator you need to adopt some key principles of

group behavior that are foundational for an effective team (NOAA,
2010):

• A group of informed individuals, working together, can

accomplish more than one person working alone.

• Everyone’s opinion is of equal value, regardless of rank or

position.

• People are more committed to ideas and plans that they have
helped to create.

• Participants will act responsibly in assuming accountability for

their decisions.

• The process—if designed well and sincerely applied—can be

trusted to achieve results.

With these foundational principles, as a facilitator, you:

• Help establish the goals and objectives for the project and the
agenda for each meeting.

• Use various approaches so the goals and objectives can be

achieved.

• Keep the group on track with the agenda and process by asking
questions and minding the clock.

• Ensure all voices are heard and individuals or factions do not

dominate conversations.

• Challenge statements or opinions by having group members

apply critical thinking.

• Summarize actions or decisions at key points.

• Prepare or assist with meeting summaries that include decisions

made and actions to be taken (and by whom).

• Do not “own” the problem—you’ve been asked to use your

interpersonal and team facilitation skills to guide those who can
solve the problem.

• Create and ensure a safe place where individuals and what they
are contributing are respected.

Selecting the team

Some team members—SMEs, an “owner” of the investigation, and
others—may already have been selected to participate. You may need
to suggest inviting additional SMEs to have the needed expertise
available. (Consider having SMEs come in for short periods when
their knowledge is needed.) You should ensure the team is reasonable
in size—not too large, not too small—and includes people who can
contribute in the process. Consider the roles each person will play
within the group. Depending on the scope and complexity of the
investigation, participants may be together for one meeting or for
several.

Preparing the agenda

For each meeting, an agenda should be provided to the team
members in advance. If there are assignments of what to bring—
flow charts, trend analyses, samples—an advance agenda can help
the participants prepare. To help manage the process, include the
time allotted for each agenda item.

The meeting itself

If you are unfamiliar with the room, get there early to make sure the
technology works, there are enough flip charts or whiteboards, and
markers are working well.

At the start of the session, go over the agenda with the participants
and obtain their agreement. Digressions from the agenda that might
occur during the session should be placed on the “parking lot” flip
chart. Consider having a team member be the timekeeper, perhaps
using the alarm on their smartphone to signal it is time to move
ahead.

As you set the agenda times, beware of “optimism bias” that

assumes everything will go smoothly and take less time than it does.
You will need to be prepared to limit discussion, stop participants
from digressing and descending into rabbit holes, and refocus the
group. The challenge is knowing what is essential to resolve at a
given point and what can be set aside for later.

For a face-to-face meeting, try, if possible, to limit the meeting

time to approximately three to four hours, including a brief break.
If longer than this, decision fatigue may begin to set in, where the
quality of decisions (e.g., brainstorming, problem solving, scoring
hazard likelihood and severity) deteriorates.
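
The timing advice above can be sketched in a few lines of Python. This is only an illustration: the 25% buffer used to offset optimism bias, the 240-minute limit (the upper end of the three-to-four-hour guideline), and the agenda items are all assumptions made for this example.

```python
import math

MAX_MEETING_MINUTES = 240  # assumed limit: upper end of "three to four hours"


def padded_agenda(items, buffer=1.25):
    """Pad each time estimate by an assumed buffer to offset optimism bias."""
    return [(topic, math.ceil(minutes * buffer)) for topic, minutes in items]


def fits_in_meeting(items, limit=MAX_MEETING_MINUTES):
    """Check whether the total allotted time stays within the meeting limit."""
    return sum(minutes for _, minutes in items) <= limit


# Hypothetical agenda: (topic, estimated minutes)
agenda = [
    ("Review agenda and ground rules", 10),
    ("Fishbone exercise", 90),
    ("Break", 15),
    ("Is/is not exercise", 60),
    ("Summary and 3 Ws", 15),
]

padded = padded_agenda(agenda)
for topic, minutes in padded:
    print(f"{minutes:>4} min  {topic}")
print("Fits in one session:", fits_in_meeting(padded))
```

If the padded agenda does not fit, that is a signal to split the work across sessions rather than trim the buffer.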

Ground rules
Most people who have worked in team/group settings are
familiar with having ground rules that set the expectations for the
participants. These are primarily based on common sense and the
need for participants to respect each other and their ideas. Together,
these create a safe working environment to discuss, explore, and
decide.

Sometimes, organizations have formalized these ground rules

and display them on posters in conference rooms. In any case, it
can be helpful to have the participants voice for themselves specific,
relevant behaviors that are to be encouraged (e.g., everyone
participates, is present for the entire meeting, and listens to ideas
without judging) or avoided (e.g., checking phone messages, side
conversations, talking over or interrupting others).

Facilitation techniques
A good facilitator uses a number of techniques. These include:

• Active listening: Listening can take considerable energy as

you hear a person’s words and watch their nonverbal cues.
Encourage people without interrupting them.

• Asking questions: Questions test assumptions, invite

participation, gather information, and probe for hidden points.
The facilitator should ask open-ended questions (questions
that provide more than simple yes or no answers) to encourage
thorough discussion.

• Paraphrasing: Paraphrasing involves repeating what has been

said to let participants know they are being heard, let others hear
the point a second time, and clarify key ideas. It also provides
an opportunity to ascertain if the facilitator has correctly heard
or interpreted what was said.

• Parking lot: When a discussion gets off track and participants

talk about issues not on the agenda, the facilitator can place the
ideas on a designated flip chart or in an electronic note. Be sure
that these are addressed later or that there is a plan to resolve
the issue.

• Remaining neutral: The facilitator must focus on the process

role and avoid the temptation to offer an opinion on the topic
under discussion. A facilitator who becomes involved in the
content discussion must let the group know they are stepping out
of the facilitator role.

• Summarizing: After listening attentively to all that has been

said, a facilitator should offer a concise and timely summary.
Summarizing is a good way to revive a discussion or end one
when things seem to be wrapping up.

• Synthesizing: While it may sometimes be appropriate to record

individual ideas of each participant, in other situations the
facilitator may encourage attendees to comment on and build
on each other’s ideas and then record the collective idea on a flip
chart. This builds consensus and commitment.

• Using flip charts: Making flip chart notes not only records
decisions, priorities, and key points of discussion, but also

focuses the group’s attention and ensures all participants agree

with what is being recorded.

• Using technology: Projecting a text document or a spreadsheet

on a screen allows everyone to see and comment (NOAA, 2010).

Notes/documentation
The facilitator is usually not the person who takes notes during the
session. Ask a team member to help with this, allowing you as the
facilitator to focus on the process and the people. The notetaker works
closely with the facilitator, listening particularly to the summaries
the facilitator gives with actions, timing, and assignments. (In some
situations, such as when the investigation is rather simple, it may be
more expeditious for the facilitator to provide a written summary of
what happened.)

Final summary of the meetings

As the facilitator, you will mention the highlights of the meeting,
including decisions made and actions accomplished. Tie these in
with the goal of the meeting, e.g., doing a fishbone or an is/is not
exercise. This summary can be included in the meeting minutes.
You can wrap things up by referring to the “3 Ws”:

• What actions are to be taken?

• Who is accountable for executing each action?

• When should the actions be completed?
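
For teams that keep meeting summaries electronically, the 3 Ws map naturally onto a small record. The following is a minimal sketch, not a prescribed format; the field names, the roles, and the dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    what: str   # What action is to be taken?
    who: str    # Who is accountable for executing the action?
    when: date  # When should the action be completed?


def summarize(actions):
    """Return one line per action, ordered by due date."""
    ordered = sorted(actions, key=lambda a: a.when)
    return [f"{a.when.isoformat()} | {a.who} | {a.what}" for a in ordered]


# Hypothetical actions from an investigation meeting
actions = [
    ActionItem("Collect trend data for the last 12 lots", "QC analyst", date(2020, 3, 20)),
    ActionItem("Interview second-shift technicians", "Lead investigator", date(2020, 3, 13)),
]

for line in summarize(actions):
    print(line)
```

Making all three fields mandatory means an action cannot be recorded without an accountable person and a date.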

REPORT WRITERS
Preparing the written report—whether it is in a more traditional
technical report format or in sections that are part of an online
tracking application—requires skills in organizing information,
clearly telling the story, and critical thinking. Some firms employ
technical writers for this task while others provide some level of
training for investigators/writers who have other responsibilities as

well. It is not unusual to find firms that approach this by having new
writers use existing reports as templates or models that are to be
followed. As will be discussed in later chapters, grammar, spelling,
and appearance are important to readers, particularly health
authority inspectors and auditors.

CONCLUSION
The size and scope of the investigation affect the decision of who
to include on the investigation team. Having curiosity, wanting to
learn, and critical thinking skills are important characteristics for
team members to have. Training in the investigation process, root
cause investigation tools, and facilitation skills can make team
members more effective and efficient.

REFERENCES
Duhigg, C. (2016) What Google learned from its quest to build
the perfect team. NY Times Magazine, Feb 25, 2016.
https://www.nytimes.com/2016/02/28/magazine/what-google-learned-from-its-quest-to-build-the-perfect-team.html.
Accessed 5 Feb 2020.

Klein, G. (2017) Sources of Power: How People Make Decisions. 20th
Anniversary Edn. Cambridge, MA: MIT Press.

NOAA (2010) Introduction to planning and facilitating effective
meetings. Available online at
https://coast.noaa.gov/data/digitalcoast/pdf/effective-meetings.pdf.
Accessed 28 Nov 2018.

re:Work (2020) Guide: Understand team effectiveness. Google.
https://rework.withgoogle.com/print/guides/5721312655835136/.
Accessed 5 Feb 2020.

Vesper, J. and McFarland, A. (2019) How to become a (better) facilitator
for risk assessment and root cause analysis. Pharmaceutical
Online, Feb 28, 2019.
https://www.pharmaceuticalonline.com/doc/how-to-become-a-better-facilitator-for-risk-assessment-and-root-cause-analysis-0001.
Accessed 2 Mar 2020.

THE BIG PICTURE:

INVESTIGATIONS AND
CORRECTIVE ACTIONS

Every investigation into a quality event or incident is a little different

because of the situation, the facts, and those involved. Despite this,
there are similarities in all investigations. These include activities
that are performed sometimes in a sequential fashion, other times in
parallel, and often with the need to go back and revisit a previous step
based on newer, more accurate information that the investigation
uncovers. In this chapter, we will examine a 14-step process that
begins with the observation of the incident and continues through
to the confirmation that the corrective actions were effective.

THE 14-STEP PROCESS

Figure 1 shows the key activities conducted during an investigation

and in the selection and implementation of the corrections and
corrective actions. These steps can be grouped into four phases
of the investigation and response. Each phase and step will be
described briefly below with more details on how to perform many
of the steps presented in subsequent chapters.

Figure 1. Fourteen steps in the investigation and corrective action
process

Circumstances will dictate the sequence in which the steps are
executed, but it is critical that actions be taken to minimize the
impact as much as possible, to collect evidence that will be important
in helping to determine the direct and contributing causes, and
to keep management, regulators, and, in some cases, health care
professionals informed.

Figure 2. Activities performed when observing the event and taking
immediate actions

Observation and immediate actions

Step 1. Observe incident, initiate communication, and take

immediate action including holds on product(s), materials, and
components.

The process begins when someone notices that an unwanted event

or incident has occurred. This can happen in real time, such as a
technician listening to a centrifuge that is making a strange, painful
sound, or the incident can be noticed after the fact, as when a batch-
record reviewer sees that a recorded value is out of the desired
range. This all hinges on the person being able to perceive a gap
or difference between the “should be” and “as is” states. In some
cases, analysts who are trending and interpreting data may become
aware of something that is out of the ordinary, such as an increase
in a type of product complaint or in-process control data that has an
observable shift. (See Chapter 5.)

Once the incident is discovered, there may be immediate actions

that are appropriate to take. If the incident has just occurred, there
may be one or more immediate steps that can be used to stop the
progression of the event and prevent things from getting worse.
If a fire sprinkler head in a warehouse creates a deluge because it
was hit, an immediate action would be to shut off water and isolate
that section. Another immediate action would be to move items

nearby, preventing them from being damaged. If an incident was

discovered after the fact, perhaps by a lab test, an action would be to
rapidly communicate the information to the quality unit and to put
the lot (or other possibly affected lots) on hold so they remain under
control of the firm while a further impact assessment is conducted.
(See Chapter 11.)

Simultaneously, efforts need to be taken to collect any and all

evidence—including observations and experiential information—
from those involved in the incident or who may have witnessed it.

Communication between stakeholders is critical throughout
all the steps to ensure that the needed resources are provided, that
regulatory/compliance requirements (like the filing of a US FDA
Field Alert Report) are met in a timely way, and that painful
surprises are reduced. Firms should have a communication and
escalation procedure that provides details on who must be contacted
and when. (See Chapter 18.)

Stakeholders at this point should perform a risk assessment,

perhaps using a decision tree or rubric (a set of rules) to determine
the investigative rigor that the incident demands. (See Chapter 6.)
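
A rubric of this kind can be written down as a few explicit rules. The sketch below is purely illustrative: the three inputs, the order of the rules, and the names of the rigor levels are assumptions made for this example, not criteria taken from a regulation. Each firm would define its own in a procedure.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical inputs to a rigor rubric; field names are illustrative."""
    patient_risk: bool      # could the event affect the patient or consumer?
    product_released: bool  # has possibly affected product left the firm's control?
    recurring: bool         # has a similar event been seen before?


def investigative_rigor(incident: Incident) -> str:
    """Return an assumed rigor level by applying simple, explicit rules in order."""
    if incident.patient_risk and incident.product_released:
        return "full team investigation"
    if incident.patient_risk or incident.recurring:
        return "formal investigation"
    return "documented assessment"


example = Incident(patient_risk=False, product_released=False, recurring=True)
print(investigative_rigor(example))
```

Writing the rules out this way makes the decision repeatable: two people assessing the same incident should arrive at the same level of rigor.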

Step 2. Assemble investigation and resource team.

Getting the right people together—those who have expertise in the

issue and possibly the impacted stakeholders—needs to happen
as soon as possible to help identify and implement the immediate
actions, collect important evidence, and be aware of the broader
implications of the event. (See Chapter 3.)

Some pharma firms have designated immediate response

teams, similar to what hospitals and transportation safety agencies
use. For example, when there is a quality incident such as a rogue or
unwanted tablet found on the packaging line, a message is sent out
instructing all response team members to leave what they are
doing—meetings with management, working in their offices—and
go immediately to the site of the event. For the team, this is the
priority of the moment. The firm that implemented the response
team found that this practice has been very successful in efficiently

getting to the root causes and putting in place effective corrective

actions.

An important member to include in the response team would be

a representative from the quality unit who could provide guidance
from a quality/GMP point of view.

Step 3. Collect initial information and facts to describe event (“as

is” situation).

The time the incident occurs is the beginning of the “golden hours,”
which are usually considered the first 24 hours after an accident or
event. Investigators (and detectives) view this period as the most
favorable and productive span of time for gathering information
(e.g., measurements, photographs, witness statements) and artifacts
(e.g., mixed-up tablets, incorrectly labeled bottles, broken gaskets)
before memories fade and items decompose, are lost, or are mishandled.

Having an incident response kit can be very helpful in these

situations. The kit could be a simple toolbox with plastic bags for
collecting samples, a digital camera, measuring tools, pens, markers,
labels, a notebook computer, or other items to help collect and
preserve the evidence.

Clearly identifying the “as is” or problem situation helps

the investigation team focus on the incident. If the situation
involves particulate matter and tablets, what do you know about
the particulate matter at this point? Is it metal? Glass? Dirt? Is its
composition unknown? Also, where is it in or on the tablet? Is it
embedded inside the tablet or only on the outside, or can you see it
both inside and outside the tablet? Depending on the answers, the
team would take different approaches in their investigation. In Step
5 below, this “as is” information is combined with what “should be”
into a problem statement.

At all stages of the investigation, the investigators need to be

considering the scope of the investigation and potential negative
impact of the situation on the materials and products and, ultimately,
on the consumer or patient. The scope of the investigation can and
does change as more information is obtained. The question that

must be anticipated (raised by the quality unit, qualified person,

or NRA inspector) is, “How do you know that only this (or these)
specific lot(s) were involved?” Often, firms will point to a lot of raw
material that was deficient in some way or a unit process that failed
and affected the lot in process. The investigation must consider if
other lots of the raw material or other batches using the same process
step/equipment were affected similarly.

Some people visualize this as putting things into the “scope

basket.” It is easy to put things into the scope basket; that is, there
need not be much evidence for that, but to remove something from
that basket takes evidence to support the move.

Along with scope, impact on the material, product, and

consumer must be considered throughout the investigation. For
something to be impacted (i.e., affected) by the event, it must be
within scope. If something is legitimately not within the scope of the
event, it will not be impacted.

For example:

• Situation: During a storm, the electrical service to a site

failed. The backup generators went on as intended for all the
critical areas with the exception of one temperature-controlled
chamber (2–8 degrees C) which contained in-process materials;
this chamber was identified as CR-05. The chamber remained
without power for 2.5 hours. The investigation found that a
failed circuit board for the affected chamber was the cause of
the problem.

• Scope: Entire contents of CR-05. Justification for limiting

scope: No other areas were considered within the scope of this
event—the utility monitoring system showed that the on-site
power generator for critical GMP areas started immediately as
designed.

• Impact: None of the contents in CR-05 were impacted by this

event. Justification for limiting impact: Utility monitoring system
and local chart recorder both showed temperature remained
within limits during the 2.5-hour outage (see attachments 1 and
2 for temperature data).
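
The “scope basket” rules can be expressed as a small data structure: anything may be added with little evidence, nothing may be removed without evidence, and only items within scope can be impacted. The sketch below uses the power-outage example; the class and method names are hypothetical and not part of any tracking system.

```python
class ScopeBasket:
    """Tracks what is in scope; removing an item requires documented evidence."""

    def __init__(self):
        self.in_scope = set()
        self.removed = {}  # item -> evidence that justified its removal

    def add(self, item: str) -> None:
        # Adding needs little evidence: when in doubt, put it in the basket.
        self.in_scope.add(item)

    def remove(self, item: str, evidence: str) -> None:
        # Removing an item takes evidence to support the move.
        if not evidence.strip():
            raise ValueError(f"Cannot remove {item!r} from scope without evidence")
        self.in_scope.discard(item)
        self.removed[item] = evidence

    def can_be_impacted(self, item: str) -> bool:
        # Only something within scope can be impacted by the event.
        return item in self.in_scope


basket = ScopeBasket()
basket.add("CR-05 contents")
basket.add("Other critical GMP areas")
basket.remove(
    "Other critical GMP areas",
    evidence="Utility monitoring system showed the on-site generator started as designed",
)
print(sorted(basket.in_scope))
```

Keeping the evidence alongside each removed item gives the report writer a ready-made justification for limiting scope.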

Conduct the investigation

Step 4. Prepare an investigation plan based on what you know at

this point.

A practice that some firms use as they initiate a large or significant

investigation is to have the lead investigator develop a plan to
help guide their activities. This plan is shared with the quality unit
and possibly modified with their input. The value in having an
investigation plan (and having agreement by the quality unit) is that
it can help standardize what is done and can build on the collective
knowledge of the organization. This plan becomes a to-do list for
the investigator and the team members. A special investigation plan
may not be needed for very simple investigations, but it can help
with more complex issues.

A frustration for some investigators who are working through

a plan is that during the investigation, additional evidence is found
that changes the path of the investigation. Sometimes investigators
go to their quality unit partner who then suggests more to do or
a very different tack to take based on what has been learned. The
plan is just that: an intended strategy that remains dynamic and is
shaped by both facts and intuition.

Figure 3. Activities performed when conducting the investigation to
determine root, proximal, and contributing causes

In putting together an investigation plan, keep it simple and
useful and realize that it will probably change as the investigation
proceeds. Table 1 presents items that may be useful in such a plan.

Table 1. Items to include in a quality incident investigation plan

1. How will you approach the investigation based on the information obtained up to
this point?

2. What will you look for?

3. Who (positions/roles) will you want to talk with?

4. What specific questions will you want answered?

5. What else will be important to do in this investigation?

Step 5. Clearly describe the “should be” or normal operation/specification and write a clear, precise problem statement.

A clear, concise problem statement is important for the team as it
conducts its investigation. The problem statement usually
includes what is expected, or what “should be,” and contrasts that
with the “as is” situation that is present in the problem (and
described in Step 3); in some cases when the “should be” state is
obvious, stating the “should be” is not needed. (For example, it is
a given that injectable products must be free of visible particulate
matter.) Having a simple, precisely worded problem statement that
everyone thoroughly understands allows the team to construct a
common view of what the issues are. Including the scope/size of the
situation can also be useful. As more information is gathered, the
team may want to refine the problem statement.

Some examples of problem statements include:

• Unknown particulate matter found on outside of metoprolol 25
mg tablets sampled from bulk tablet drum.

• pH of Chaosamix (lot number 15BE032) in filling tank was 6.2;
specification requires pH of 6.8–7.2.

• 38 percent of incident investigations in the current year are
taking longer than 90 days to complete and approve; procedure
requires that they be completed and approved in 40 days.

Problem statements are slightly different from the problem
description. The problem description goes beyond the statement to
include the who, when, where, and how discovered.
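
The contrast between “should be” and “as is,” and the difference between a problem statement and a problem description, can be sketched as two small records. This is only an illustration: the class and function names are invented for this example, and the who/when/where values are placeholders because the text does not supply them. The pH figures are the ones from the example list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProblemStatement:
    """The observed ("as is") state, contrasted with the expected ("should be") state."""

    as_is: str
    should_be: Optional[str] = None  # may be left out when the expectation is obvious

    def text(self) -> str:
        """Render the statement as one sentence fragment."""
        if self.should_be is None:
            return self.as_is
        return f"{self.as_is}; {self.should_be}"


@dataclass(frozen=True)
class ProblemDescription:
    """The statement plus the who, when, where, and how discovered."""

    statement: ProblemStatement
    who: str
    when: str
    where: str
    how_discovered: str


def within_spec(value: float, low: float, high: float) -> bool:
    """Return True when a measured value lies inside an inclusive range."""
    return low <= value <= high


# The pH example from the list above: observed 6.2, specification 6.8 to 7.2.
ph_statement = ProblemStatement(
    as_is="pH of Chaosamix (lot number 15BE032) in filling tank was 6.2",
    should_be="specification requires pH of 6.8 to 7.2",
)

# Placeholder values only; the text does not give these details.
ph_description = ProblemDescription(
    statement=ph_statement,
    who="(role of the person who took the reading)",
    when="(date and time of the reading)",
    where="(filling tank identifier)",
    how_discovered="(for example, an in-process pH check)",
)

print(ph_statement.text())
print("Within specification:", within_spec(6.2, 6.8, 7.2))
```

Keeping the statement separate from the description mirrors the distinction drawn above: the statement stays short enough for the whole team to hold in mind, while the description carries the surrounding detail.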

Step 6. Collect additional information as needed.

The amount of time and resources spent on collecting data during
the investigation can vary widely and is dependent on a number of
factors including the significance of the event, how apparent or not
the direct and contributing causes are, and if the event has happened
previously. During this step and the following step, a variety of
different techniques and tools may be used, for instance change
analysis and cause-effect diagrams (Chapter 9) and interviewing
(see Chapter 10).

A useful practice is to record the information being collected onto
a worksheet that also guides the team in their inquiry. An example
of a data collection worksheet is found in Appendix 2. The intent
of this worksheet is to organize the information as it is learned; it
should not be considered “raw data” that supports the conclusions.
If this type of worksheet is used, the governing procedure should
clearly state how it is to be used and if (and where) it is to be retained
after the final investigation report has been reviewed and approved.

Often, when a potential cause is examined, no direct or indirect
connection to the unwanted event is found. That is still useful
information to record. Sometimes called “negative information,”
documenting this at the time and detailing it in the final investigation
report helps support the conclusion of the true causes and also
demonstrates the diligence of the investigation team.

(In my early years as a member of a quality assurance
department, I was asked to explain an investigation I had done on an
out of specification result for an antibiotic. The US FDA investigator
had my report and asked me why I had not talked to anyone in
the product development group. I replied that I had—I distinctly
remembered a phone call I had with our expert on that product. He
said it was “like a rock” and that he had never seen a failure like this
before. As I told the investigator this, he very dramatically looked
through the report and said, “I don’t see that written anywhere here,
and if it isn’t written down . . . ” I filled in the rest of that well-worn
adage: If it isn’t written down, it didn’t really happen.)

Step 7. Analyze information and reflect on its meaning.

As information is being collected, it is simultaneously being
analyzed to some degree, with technical, regulatory, and operational
judgments being made as to its usefulness and also how it supports
a working hypothesis the investigation team might be using.

Analysis techniques are varied and valuable, and they are often
integrated with data collection. Timelines, flowcharts/process maps,
fishbone diagrams, fault trees, scatter plots, graphic visualizations,
and the like can all be used depending on the specific situation. (See
Chapter 9.) Oftentimes, several techniques are used to examine the
data from different vantage points.

Reflection is a specific way to make meaning of the information
that is available. “Reflection in action” (Schön, 1983) refers to how
conscious critical thinking can be used to understand and resolve
unexpected situations (surprises) that do not fit the expected
outcome.

A useful definition is:

Reflection is a form of mental processing – like a form of thinking – that
we use to fulfill a purpose or to achieve some anticipated outcome. It
is applied to relatively complicated or unstructured ideas for which
there is not an obvious solution and is largely based on the further
processing of knowledge and understanding and possibly emotions
that we already possess (Moon, 2001).

An important part of that definition is “knowledge and
understanding”—key concepts that are now viewed as integral in
a modern pharmaceutical quality system described in ICH’s Q10
Pharmaceutical Quality System (2008).

Step 8. Identify root, contributing, and proximal causes; confirm scope and impact.

The culminating step in conducting the investigation is to identify
the causes of the unwanted event. This is a point where language,
specifically definitions, can be both a help and a distraction.
Definitions for causes (Appendix 1) are:

• Root cause: Causal factors that, if corrected, would prevent
recurrence of the same or similar accidents. Root causes are
the specific underlying causes, can be reasonably identified,
are under the control of management to fix, and effective
recommendations can be developed to correct/prevent them
(adapted from Rooney et al., 2004).

• Contributing cause: A factor, situation, or agent that accelerates
or intensifies the occurrence of the unwanted event. If the
contributing cause is removed, it does not prevent the unwanted
event from occurring.

• Proximal cause: The action closest to the unwanted event.
Sometimes considered the “but for” event. For example, “The
fire would not have started but for the spark from turning on a
switch that ignited the fuel.” Another example is “the straw that
broke the camel’s back.”

Positively identifying the root and proximal causes is essential
in crafting specific corrective actions. If these causes are not known,
the corrective actions probably will have little positive effect. If the
causes cannot be identified, it would be useful to look for a strategy
to detect if the problem is occurring and collect information that
would help to identify the causes.

When identifying causes, there is often the temptation to use
the term human error. Human error is never the root cause; it is a
category, an output that needs to be understood. Why did the human
error occur? Was it because the technology was difficult to use or
that the procedure was confusing? Those answers are things that are
“actionable”—you can address them with a corrective action. (See
Chapters 8, 13, and 14.) The only action you can take when you say
“human error” is to get rid of the people.

As the causes are being identified and confirmed, the scope of
the deviation and its impact need to be finalized as much as possible.
One of the most common questions that you should be prepared to
answer is, “How do you know that the scope of the deviation or
incident is limited to what you have identified?”

Figure 4. Activities performed when identifying and implementing
corrections and corrective actions

Identify and implement effective corrections and corrective actions

Step 9. Identify corrections and corrective actions; consider the risks.

Corrections are applied to the thing or things affected by the
unwanted event. In some cases, the immediate action may also
have been a correction. In other situations, a correction may not be
possible, for example, if the material must be rejected.

The corrective action is taken so the specific unwanted event
does not happen again—you are preventing a recurrence of the
problem. (A true preventive action is where the unwanted event has
not occurred, but you are taking action to try to ensure it does not
happen in the future.)

There are a variety of corrective actions, some of which are
superior to others. For example, elimination, substitution, and
engineering controls would be the top three to consider. The most
overused corrective actions are usually training personnel and
revising the relevant procedures. These should only be considered if
the root or proximal causes show that there was a lack of or incorrect
information, or a lack of knowledge or skills on the part of personnel
doing the job.

As part of identifying corrections and corrective actions,
consideration must be given to risk. Often this is done through the
change management quality system element that asks questions
such as:

• What are the risks that could occur if we implement this change?

• What are possible unintended consequences that might occur?

• What other changes do we need to make so this change will be
successful?

• What are the risks if we do not make this change?

Step 10. Implement actions; establish effectiveness checks.

The corrective actions identified and approved (through the change
management process) are implemented at this point. While some
actions can be fully carried out, others—capital projects requiring
significant funding, engineering, regulatory approvals, and the
like—may take longer. In these situations, consideration must be
given for stop-gap measures. These would be controls put in place
to provide some level of protection while the more comprehensive
solution is instituted. (See Chapter 12.)

As implementation of the corrective actions occurs, methods
need to be in place to provide confidence that the actions are
working as intended—that is, to prevent a recurrence of the
problem. There are a variety of approaches that can be used. In some
cases, qualification and validation can provide confidence about
equipment, instruments, and processes. Monitoring the situation
to determine if recurrences occur is another useful approach. (See
Chapter 15.)

Step 11. Look for broader application of the actions.

Corrective actions are targeted at avoiding a recurrence; preventive
actions are intended to thwart a “precurrence,” an event that might
happen. This sometimes requires reflecting on the question, “What
did we learn and how can we extract some sort of value from the
unwanted event?” This is where an organization learns from its
problems and takes proactive actions. Preventive actions could be
applied elsewhere at the same site or at other locations; perhaps the
approaches could be applied to similar unit operations or platforms.
(See Chapters 19 and 20.)

Figure 5. Activities performed when documenting the event and
investigation and continuing to communicate to stakeholders

Document and continue to communicate

Step 12. Write the report; send for review and approval.

Document the investigation and actions taken to provide evidence
for what was done. Investigation reports are favorite starting places
for health authority investigators and quality auditors. Inspectors
and auditors often ask for a list of investigations (the title, impact
level, product(s), processes) and select those that are of most interest
to them.

Reports are produced in several different ways: most medium
and large pharma firms use special incident and CAPA applications
that require the writer to fill in various sections of the report (e.g.,
description, results of investigation). These online tools support
the uploading of documents such as photos, lab summaries, and
complete narrative reports. Smaller firms use paper-based systems:
one- or two-page forms for the simpler investigations and more
formal reports for investigations that are complex.

Since these reports are subject to health authority review,
spelling, grammar, and professional appearance are important.

Step 13. (Continue to) communicate with stakeholders.

This task may have started when the unwanted event was first
recognized; it often spans the entire investigation and CAPA process.
As lessons learned become evident and solutions are put in place,
information should be shared with those who may be affected by the
event or could benefit in some way. This should not be considered
the same as training, but rather disseminating relevant information.
(See Chapter 18.)

Step 14. Celebrate as appropriate.

Completing a significant investigation takes a lot of work, sometimes
on the scale of several months. Recognizing those who contributed
to the investigation and a better understanding of the product,
process, method, or task is important.

CONCLUSION
The process described in this chapter is meant to be a general guide
for those performing, managing, or supporting an investigation.
Steps may be performed in a different sequence; in some cases a
step may be added or deleted. In any case, these eight key questions
need to be answered:

• What happened?

• Why did it happen?

• What is the scope of the problem (or how big is it)?

• What are the various impacts (or effects) of the problem?

• Can we fix the thing(s) that were affected? If so, how?

• What can we do to prevent this from happening again?

• What have we learned?

• Who do we share this information with?

REFERENCES
ICH (2008) Q10 – Pharmaceutical quality system. Geneva:
International Conference on Harmonisation.
https://database.ich.org/sites/default/files/Q10_Guideline.pdf. Accessed 12 Mar 2020.

Moon, J. (2001) PDP working paper 4: reflection in higher education
learning. LTSN Generic Centre – Learning and Training Support
Network.
https://nursing-midwifery.tcd.ie/assets/director-staff-edu-dev/pdf/PD-%20Working-Paper-4-Moon.pdf.
Accessed 12 Mar 2020.

Rooney, J. and Vanden Heuvel, L. (2004) Root cause analysis for
beginners. Quality Progress, July:45–53.

Schön, D. (1983) The reflective practitioner: how professionals think
in action. New York: Basic Books.

THE INITIAL DISCOVERY OF AN EVENT

The investigation process starts with the observation or discovery
of a situation that is different in some way. It could be as obvious as
a swarm of insects in a controlled area, a piece of manufacturing
equipment unexpectedly failing, or a test result that is way beyond
its specification. On the other hand, the event might be a subtle
shift in a pattern of complaints or an in-process result that is within
the specifications but, for some reason, is different from what is
expected or typically seen. Or, the event could come when doing
a trend analysis of a variety of factors during a product’s annual
product quality review.

Just because the situation has caught the attention of someone
doesn’t mean that it will demand a full-on, extensive investigation.
Chapter 6 discusses a risk-based triage that categorizes what will
happen next. But first, a more extensive look at how a
situation could be discovered.

What is being observed, whether the specification failure, the shift
in complaint trends, or the swarm of insects, is a symptom. (In
clinical medicine, symptoms and signs are different. A symptom is
something that is observed and reported by the patient; a sign is a
more objective piece of evidence that is observed or measured by
a healthcare professional. When a person says they feel hot from a
fever, they are experiencing a symptom; the 40°C temperature from
the oral thermometer is the sign. To keep things simple here, we are
not going to differentiate between the symptoms and signs.) In your
investigation, you will be uncovering how and why this symptom
is appearing in the particular place, at the particular time, and what
the potential impact could be on patients, processes, and other
things of value.

For organizations with integrated quality management systems
that have multiple quality system elements like complaints and adverse
events, vendor/contractor qualification, change management,
and the like, it is through these system elements that a deviation
is frequently introduced into the investigation process. Almost all
of the approaches discussed below (e.g., direct observation) can
be investigation initiators. In the quality system element (like
complaints) there is often a procedurally based risk assessment
that provides information on what should be done, who should be
informed, how to escalate particular issues, and how to document
any findings or activity.

PSYCHOLOGICAL SAFETY
Before describing ways that unwanted events can be discovered,
an important question to ask is, “How willing is someone to tell
others, including management, the news of a problem or failure,
particularly if that person was involved or responsible?” While
there are often human factors and error traps that are the underlying
causes of what we call “human error,” it still takes a large amount of
courage for someone to self-report a mistake or problem.

Psychological safety is an area of study by Harvard professor
Amy Edmondson. She defines it as:

Psychological safety, or the belief that one will not be rejected or
humiliated in a particular setting or role, describes a climate in which
people feel free to express work-relevant thoughts and feelings. In
psychologically-safe environments, people believe that if they make
a well-intentioned mistake, others will not think less of them for it,
nor will they resent or penalize them for asking for help, information,
or feedback. Psychological safety thus fosters the confidence to take
interpersonal risks, allowing oneself and one’s colleagues to learn
and focus on collective goals and problem prevention rather than on
self-protection (Edmondson et al., 2009, p. 48).

If personnel believe that they can come forward without fear
of retribution, they will be more willing to speak up. (Chapter 8
discusses self-reporting in relation to human error in more detail.)

DIRECT OBSERVATION
The simplest way to discover an event that needs some level of
investigation is when one senses a problem. The problem could be
of a type that anyone would conclude that “we’ve got a problem
here”—a solvent leak, a packaging line that unexpectedly stops, or
a person who has called to say she found a red tablet in a bottle
of white tablets. In each of these examples, the person who raises
the issue has determined that what they are experiencing does
not match the mental model of what they were expecting or the
“normal,” desired situation.

Mental models
A mental model is a picture that a person has in their mind of how
something works, how the pieces fit together, and the expectations
of what should happen. Examples of using a mental model could
be an employee who notices something that seems out of place or a
customer who calls in with a complaint that there is a blue tablet in
what should be a bottle of all white tablets.

The idea of mental models was described by the British
psychologist Kenneth Craik (1943) who said a person’s mind creates
“small-scale models” based on reality and used to anticipate future
events. Mental models, sometimes called “cognitive constructions,”
tend to be simple representations of the elements, concepts, and
relationships that make up a complex system and how the system
works: “The image of the world around us, which we carry in
our head, is just a model. Nobody in his head imagines all the
world, government or country. He has only selected concepts,
and relationships between them, and uses those to represent the
real system” (Forrester, 1961). A mental model enables a person to
predict what something should do (Norman, 1983).

A simple example of a mental model is: If I push the button on a
doorbell, a sound will occur that lets the people inside know that
someone is at the door, and they will come to the door to greet me.
Therefore, if I want someone to greet me, I need to push the doorbell
button.

The mental model that a person has can include:

• Structure: The parts and how they fit together to form a system
(the push button, the wire or wireless connection, and the device
that creates the audible sound).

• Causal relationships: If… then… (if I push the button and, if
someone is there, then they will come to the door).

• Time and distance: Typical, atypical (it might take a few
moments for someone to respond; if I do this repeatedly and no
one comes to the door, either no one is at home or, if they are,
they don’t want to see me).

• Behaviors: What normally occurs (a dog might bark; someone
might ask, “Who is it?”; the door opens).

Individuals create their mental models based on knowledge and
the experiences they have had. Working with a more experienced
person or expert, using visuals (diagrams or photos), and seeing
examples are ways that someone new to a process or system can
develop their understanding and create their own mental model.

Although the model may be incomplete or not be totally correct,
it might still represent someone’s experiences. For example, when
I push the button by the door, the dog goes crazy barking, and
somebody comes to the door, I might think that the button only
causes the dog to bark and that is what prompts the person to greet
me (something that could still be true, but not be the full picture).
At other times, a mental model can be incorrect, which will result
in miscommunication or a wrong prediction. Learning is when
knowledge and experience correct or refine someone’s mental
model so it more completely and accurately represents how the
system works. Having a correct mental model can result in better
communication, decisions, and actions.

Why mental models are important

There are several important benefits to people having accurate
mental models for a process or system. These include:

• Processing information more quickly and flexibly. For example,
expert pilots remembered many more important “concept”
words than “filler” words when listening to air traffic control
radio transmissions (Endsley, 2006).

• Improved communication and coordination due to a common,
shared understanding of required behaviors and expectations
as seen in high-performing sports teams (Hodges et al., 2006).

• Developing situational awareness: having the ability to
visualize and keep track of how different elements fit into the
bigger, dynamic picture.

• Improving descriptions: what should have happened versus
what actually did occur. This point is key in our discussion: a
certain result should occur; however, for some reason, there was
a different result. This observed discrepancy is what prompts
further investigation.

Situational awareness
A key difference that distinguishes a novice airplane pilot from those
who are the best in their profession (and have the most flying time)
is situational awareness (Endsley, 2006), a characteristic that world-
class athletes have as well (Epstein, 2013). Hockey great Wayne
Gretzky put it simply: “A good hockey player plays where the puck
is. A great hockey player plays where the puck is going to be.”

Situational awareness is “the perception of the elements in the
environment within a volume of time and space, the comprehension
of their meaning, and the projection of their status in the near future”
(Endsley, 2006, p. 634). A significant number of plane crashes are due
to pilots losing situational awareness when flying into a whiteout or
fog.

A pharmaceutical example: A manufacturing operator at a
vaccine production facility was manually cleaning equipment
following a production run at the same time a clean-in-place (CIP)
skid was operating. The operator heard an unusual sound coming
from the CIP skid. He immediately interrupted what he was doing,
went to the skid, and shut it down. Upon investigation it was
determined that something inside the CIP skid had broken, causing
the strange sound, and had that immediate action not been taken,
the equipment would have had significant damage.

From the above definition and example, one can see that
situational awareness has three levels (Endsley, 2006). First,
there is perception: acquiring important information about the
situation. The operator heard something unusual. Perception has
two important requirements. First, the person must be able to sense
and acquire the information. An interesting study of hundreds
of baseball players showed that professional players’ vision was
superior to the vision of college athletes and far better than the
general public. Visual acuity that allows players to sense information
earlier (literally milliseconds) and from farther away is a predictor
of success (Epstein, 2013). The second important requirement of
perception is paying attention to the important information and
ignoring nonvalue-adding information. Having expertise in a field
(meaning that the person has had many years of being exposed
to a variety of experiences and challenges) is critical in discerning
what is and is not important. And, even with that expertise, in many
complex, dynamic situations, this determination still can be difficult.
In the CIP example, the operator was able to distinguish the normal
sounds of the equipment and ambient room noise from the unusual
one.

The second level of situational awareness is comprehension, a
process that involves understanding what the acquired information
means along with its significance. Someone hearing the CIP skid
operating for the first time may not put any importance to what they
are hearing, but to the experienced operator the abnormal sound
is a signal that something is wrong and action needs to be taken:
not when they have completed their assigned task, but immediately.

The third level is projection: using the information on what is
happening now to anticipate what could happen in the future.
Projection requires knowledge of the immediate system involved
but also the variety of other factors and uncertainties that may be in
a very dynamic state. The intent of projection can be summarized
in a phrase pilots use: “Stay ahead of the airplane,” or according to
Gretzky, “Play where the puck is going to be.”

Mental models, situational awareness, experience, and expertise
Situational awareness depends on having an accurate, robust mental
model of the system and the environment in which the system sits
(Endsley, 2000). Mental models are developed and improved as one
acquires more knowledge and skill in a specific field (Dreyfus et
al., 2005). As one sees more and different situations, new mental
models are developed and existing mental models are expanded
and refined. This doesn’t happen right away; it takes hundreds
of hours and many years of diligent practice to become an expert
(Ericsson et al., 2007).

But what about those cases when someone without any specific
experience in the field walks into a laboratory or looks at a graph with
“new eyes” and asks a question that initiates an investigation? They
may not have extrasensory perception (ESP), but they might have
another mental model (e.g., data recording and integrity) that they
are applying in a different situation. Or, perhaps those who have
spent many hours in one particular lab or reviewing data have failed
to keep their mental models current. Or there may be a perceptual
bias often described as “we see what we want to see.”

Signal detection
Another avenue that can identify an event is signal detection. While
there are many definitions of signal, a simple, useful one is that
a signal is “a clue or a sign or a piece of evidence” that gives an

LICENSED TO JOSE CASTELLA

Vesper Book.indb 69 5/29/2020 10:55:51 AM

Licensed to Enger, Tehya/PDA: Copying and Distribution Prohibited.
70 Root Cause Investigations for CAPA: Clear and Simple

early warning about a particular event or danger (adapted from
Wohlstetter, 1962, p. 2). The challenge when looking for a signal
is to differentiate it from the noise (random patterns that might be
mistaken for signals) (Silver, 2012, p. 416). While used in a range
of fields from electronics to radio communication and information
technology, signal detection has also become important in
pharmaceuticals, particularly in the area of pharmacovigilance, which
is “the science and activities relating to the detection, assessment,
understanding and prevention of adverse effects or any other drug-
related problem” (WHO, 2002, p. 42).

Signal detection is the first of the three phases of signal
management used in pharmacovigilance. The second phase is
signal prioritization, which involves understanding the importance
of the signal based on its context and validity. The third phase is
signal assessment to determine what next steps, if any, need to be
taken. Prioritization and assessment are similar to what is done in
risk evaluation and risk reduction—identifying what are the more
significant risks and what can be done to reduce them.

A relatively simple approach for signal detection and assessment
has been process control or process-behavior charts that were
developed by Walter Shewhart at Bell Labs in the 1920s (Shewhart,
1931). A control chart is a time-sequence chart showing plotted
values of a statistic or individual measurement, including a central
line and one or more statistically derived control limits (NIST, 2012).
By comparing an individual measurement or data point to the
centerline that shows the mean value and the statistically derived
upper and lower control limits, one can determine if the process is in
or out of control. At least 12 different types of process control charts
are used, each adapted for a certain need or type of data (Tague,
2005).
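To make the arithmetic behind a control chart concrete, the following is a minimal Python sketch of one chart type, the individuals chart. It is an illustration under stated assumptions rather than a method prescribed by the sources cited above: the function names and example values are hypothetical, the limits are assumed to sit three estimated standard deviations from the mean, and the standard deviation is estimated from the average moving range divided by 1.128 (a common convention for individuals charts).

```python
from statistics import mean


def individuals_chart_limits(values):
    """Return (center, lcl, ucl) for an individuals control chart.

    Assumption: sigma is estimated as (average moving range) / 1.128,
    and the control limits are placed at center +/- 3 * sigma.
    """
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    center = mean(values)
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    sigma_hat = mean(moving_ranges) / 1.128
    return center, center - 3 * sigma_hat, center + 3 * sigma_hat


def points_outside_limits(values, lcl, ucl):
    """Return the positions of measurements beyond either control limit."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


if __name__ == "__main__":
    # Hypothetical historical data used to set the limits.
    baseline = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0]
    center, lcl, ucl = individuals_chart_limits(baseline)
    # Hypothetical new measurements compared against those limits.
    print(points_outside_limits([10.1, 10.9, 9.2], lcl, ucl))
```

In practice the limits would be derived from a period when the process was judged stable and then applied to later data, as in the usage at the bottom of the sketch.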

Process control charts are meant to separate a signal (i.e.,
special cause variation due to an assignable cause, which would be
identified in an investigation and then corrected) from noise (common
cause variation that is inherent in the system or process). Rules or
heuristics have been developed to give guidance as to when actions
should be taken so as not to overcorrect and cause more potential
problems.
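As an illustration of what such a rule can look like, here is a small Python sketch of one commonly cited heuristic: a run of consecutive points on the same side of the centerline. The function name is hypothetical, and the default run length of eight is an assumption; published rule sets differ on the exact number, so a firm would take the value from its own procedure.

```python
def has_run_on_one_side(values, center, run_length=8):
    """Return True if `run_length` consecutive points fall on one side of `center`.

    Points exactly on the centerline break the run. The default of 8 is an
    assumed value for illustration only.
    """
    run = 0
    last_side = 0
    for v in values:
        # +1 if above the centerline, -1 if below, 0 if exactly on it.
        side = (v > center) - (v < center)
        if side != 0 and side == last_side:
            run += 1
        else:
            run = 1 if side != 0 else 0
        last_side = side
        if run >= run_length:
            return True
    return False
```

A rule like this can flag a sustained shift even when no single point falls outside the control limits, which is why such heuristics are usually applied alongside the limits rather than instead of them.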


Control charts show the results of the process, not the cause.
They can be used to help identify special cause variation where a
change has occurred resulting in variation that is outside the typical
or historical patterns seen in the process. Control charts can also
show common cause variation, such as natural, typical patterns or
when the process devolves or deteriorates. To understand these
types of problems, one needs to have a thorough understanding of
the process and factors that can affect it. Figure 1 shows a simplified
example of a control chart.

Figure 1. Example of a control chart

BIG DATA AND DATA MINING

Another approach that has the potential to identify deviations and
trends is big data. Almost every field is collecting and storing huge
amounts of raw data, in part because it is so easy and inexpensive
to do. Gartner, an information technology research firm, defines big
data as “high-volume, high-velocity and high-variety information
assets that demand cost-effective, innovative forms of information
processing for enhanced insight and decision making” (Gartner,
undated). With all this data, special approaches such as exploratory
data analysis (EDA) (NIST, 2012) need to be used to obtain
information and understanding. Using EDA generally involves a
variety of graphic techniques that put the investigator in the role of
a detective (Tukey, 1977).

Applying data mining techniques to large data sets can provide
new insights about populations and individuals. Five different
outcomes (Furnas, 2012) and examples of uses of data mining are:

• Anomaly detection: Identifying products that meet
specifications but have results that are slightly different from
those typically seen.

• Association identification: Connecting specific types of
complaints to sources of packaging materials.

• Cluster determination: Finding subgroups within a population
of results such as locations and types of microbial contamination.

• Classification: Categorizing new complaints based on criteria
developed.

• Regression: Developing and continuing to refine predictive
models, such as using component specifications, processing
parameters, and stability data collected over time, to create a
stability model for a given batch of a product.
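The first item in the list, anomaly detection, can be sketched in a few lines of Python. This is a deliberately simple illustration, not a technique taken from Furnas (2012): the function name, the example numbers, and the choice of a z-score with a threshold of 3 are all assumptions, and real data mining tools use far more elaborate methods.

```python
from statistics import mean, stdev


def flag_atypical_results(history, new_results, lower_spec, upper_spec,
                          z_threshold=3.0):
    """Return results that meet specification but differ from typical history.

    Assumption: "atypical" means more than `z_threshold` sample standard
    deviations away from the historical mean.
    """
    center = mean(history)
    spread = stdev(history)
    if spread == 0:
        raise ValueError("history has no variation; z-score is undefined")
    flagged = []
    for value in new_results:
        in_spec = lower_spec <= value <= upper_spec
        z_score = abs(value - center) / spread
        # Out-of-specification results are excluded here on purpose:
        # they would already be handled as a quality event.
        if in_spec and z_score > z_threshold:
            flagged.append(value)
    return flagged


if __name__ == "__main__":
    # Hypothetical assay results (percent of label claim), specification 95-105.
    history = [100.0, 100.5, 99.5, 100.2, 99.8, 100.1, 99.9, 100.0]
    print(flag_atypical_results(history, [100.1, 101.5, 106.0], 95.0, 105.0))
```

The point of the sketch is the filter itself: a result can pass its specification and still be worth a second look because it does not resemble what the process normally produces.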

The approaches and algorithms used in data mining are still
in their infancy and are sometimes controversial. For example, in
2008 Google began developing a tool, Google Flu Trends (GFT), to
monitor the progression of seasonal flu based on the words people
were using in their searches. The intent was to give an early warning
signal of flu outbreaks based on searches on 45 flu-related terms.
Other researchers, however, found that GFT overestimated flu
cases in 100 of 108 weeks, sometimes producing estimates double
those of the US Centers for Disease Control and Prevention (CDC)
(Lazer et al., 2014). These researchers said that better, more accurate
prediction results could be obtained by using a combination of the
traditional reporting methods (which have a two-week time lag) and
adjusting the algorithms used.


SO WHAT DOES THIS ALL MEAN?

We have seen that unwanted events can be discovered in a variety
of ways. Some situations or problems are obvious—smoke coming
from a motor or water that does not come from an outlet when the
water valve is opened. Other events can be more subtle and require
that someone with a mental model of what should happen perceives
that the situation they are facing does not match. Acquiring that
model takes time and significant hands-on experience, often guided
by a more experienced mentor. Feedback from others—patients,
caregivers, and healthcare professionals—is another valuable
source of information on problems. Finding anomalies through the
trending of data, whether with simple tools like control charts or
by applying machine learning to big data, is another way to
make sense of the variety of factors that could cause a process to fall
out of a state of control, something that would demand an investigation.

CONCLUSION
Discovering that an event is happening or has occurred can be
done through direct observation of an event or a symptom that is in
evidence. Events can also be discovered by looking at data that has
been collected using a range of tools—some simple and others very
complex.

Not every event is significant, however. Determining the event’s
importance depends on the potential risk, particularly the impact
it may have on the product and, ultimately, the patient. That risk
assessment will be discussed in Chapter 6.

REFERENCES
Craik, K.J.W. (1943) The Nature of Explanation. Cambridge, UK:
Cambridge University Press.

Dreyfus, H.L. and Dreyfus, S.E. (2005) Expertise in real world
contexts. Organization Studies, 26(5):779–792.


Edmondson, A. and Roloff, K. (2009) “Overcoming barriers to
collaboration: Psychological safety and learning in diverse
teams.” In Salas, E., Goodwin, G.F., Burke, C.S. (Eds.), Team
Effectiveness in Complex Organizations: Cross-Disciplinary
Perspectives and Approaches (pp. 183–208). Routledge/Taylor &
Francis Group.

Endsley, M.R. (2006) “Expertise and situational awareness.” In
Ericsson, K.A., Charness, N., Feltovich, P.J., Hoffman, R.R.
(Eds.), The Cambridge Handbook of Expertise and Expert Performance
(pp. 633–651). Cambridge, UK: Cambridge University Press.

Endsley, M.R. (2000) “Theoretical underpinnings of situational
awareness: A critical review.” In Endsley, M.R. and Garland,
D.J. (Eds.), Situation Awareness Analysis and Measurement (pp.
3–32). Mahwah, NJ: Lawrence Erlbaum Associates.

Epstein, D. (2013) The Sports Gene. New York, NY: Current.

Ericsson, K.A., Prietula, M.J., Cokely, E.T. (2007) The making of an
expert. Harvard Business Review, Jul/Aug; 85(7–8):114–21.

Forrester, J.W. (1961) Industrial Dynamics. Cambridge, MA: MIT
Press.

Furnas, A. (2012) Everything you wanted to know about data
mining but were afraid to ask. Atlantic, April 3, 2012.
https://www.theatlantic.com/technology/archive/2012/04/everything-you-wanted-to-know-about-data-mining-but-were-afraid-to-ask/255388/.
Accessed 5 Feb 2020.

Gartner (Undated) Gartner glossary.
https://www.gartner.com/en/information-technology/glossary/big-data.
Accessed 9 Apr 2020.

Hodges, N.J., Starkes, J.L., MacMahon, C. (2006) “Expert performance
in sport: A cognitive perspective.” In Ericsson, K.A., Charness,
N., Feltovich, P.J., Hoffman, R.R. (Eds.), The Cambridge Handbook
of Expertise and Expert Performance (pp. 471–488). Cambridge,
UK: Cambridge University Press.

Kim, D.H. (1993) The link between individual and organizational
learning. Sloan Management Review, 35(1):37–50.


Lazer, D., Kennedy, R., King, G., Vespignani, A. (2014) The parable
of Google flu: Traps in big data analysis. Science, 343:1203–1205.

NIST (2012) NIST/SEMATECH e-Handbook of Statistical Methods.
https://www.itl.nist.gov/div898/handbook/index.htm. Accessed 9
Apr 2020.

Norman, D. (1983) “Some observations on mental models.” In
Gentner, D. and Stevens, A.L. (Eds.), Mental Models. New York,
NY: Psychology Press.

Shewhart, W. (1931) Economic Control of Quality of Manufactured
Product. New York, NY: D. Van Nostrand.

Silver, N. (2012) The Signal and the Noise: Why So Many Predictions
Fail – but Some Don’t. New York, NY: Penguin Press.

Tague, N.R. (2005) The Quality Toolbox, 2nd ed. Milwaukee, WI:
Quality Press.

Tukey, J.W. (1977) Exploratory Data Analysis. Reading, MA:
Addison-Wesley.

WHO (2002) The importance of pharmacovigilance. Geneva: World
Health Organization.

Wohlstetter, R. (1962) Pearl Harbor: Warning and Decision. Stanford,
CA: Stanford University Press.


APPLYING RISK-BASED THINKING TO QUALITY EVENTS AND DEVIATIONS

Risk management, in any of its forms, is all about making decisions.
Using formal tools like failure mode and effects analysis (FMEA) or
preliminary risk assessment (PRA) helps reduce bias and the
influences that can mislead an individual or group into making an
incorrect choice. Often risk management is used proactively; that is,
determining what could potentially go wrong and then reducing the
risks by taking preventive actions—the PA in CAPA.

Risk management can also be used in a retrospective way;
that is, considering the risks that a quality event or deviation could
pose in terms of patients, compliance, product availability, or any
other set of factors. It is retrospective risk assessment that is of most
interest to us here.

In this chapter we will briefly review quality risk management
and risk-based thinking and examine how risk principles can be
applied to incidents, investigations, corrections, and corrective
actions.

The pharma industry’s view of risk management (RM) was
heavily influenced by the 2005 publication of Q9: Quality
Risk Management (QRM) by an expert working group from the
International Conference on Harmonisation (ICH, 2005). This
guideline has been adopted by regulatory authorities around the
world and defines QRM as “a systematic process for the assessment,
control, communication and review of risks to the quality of the
drug (medicinal) product across the product lifecycle” (p. 8).

Oftentimes, even with the best of intentions, QRM becomes an
exercise that is required and performed seemingly only for the sake
of doing a risk assessment. When QRM is done properly, asking
a valid risk question, it can contribute to a better outcome. As Q9
states, QRM is “a proactive means to identify and control potential
quality issues during development and manufacturing” that “can
improve the decision making if a quality problem arises” (p. 1). An
important caveat is that QRM should be used to answer a valid risk
question and help make a decision, not to justify a decision that has
already been made.

Q9 presented the now familiar conceptual lifecycle for QRM
(Figure 1) with the different phases/activities (described in more
detail below). The QRM framework can help provide answers
to eight questions (Table 1) that are commonly asked in any risk
management activity, regardless of the industry. The Q9 guideline
provides examples of where QRM can be used when responding to
quality events:

To provide the basis for identifying, evaluating, and communicating
the potential quality impact of a suspected quality defect, complaint,
trend, deviation, investigation, out of specification result, etc.

To facilitate risk communications and determine appropriate action
to address significant product defects, in conjunction with regulatory
authorities (e.g., recall) (ICH, 2005, p. 15).

However, the guideline does not prescribe an approach, method, or
tool; these are left to the individual firm to adopt or develop and
then apply.

Figure 1. ICH Q9 Quality Risk Management model (ICH, 2005)

Table 1. Eight questions that apply to any risk management process

1. What can go wrong?
2. What is the likelihood that it could happen?
3. What are the consequences if it does happen?
4. What are the priority risks to address?
5. What can be done and what are the options available?
6. What can be done to communicate what has been done?
7. What can be done to document what has been done?
8. How will we know if any conditions or assumptions have changed?

Source: Kaplan et al., 1981, revised Vesper


THE ICH Q9 PROCESS

The Q9 model has been widely adopted in the pharma/biopharma
industry, with most firms providing some level of training to their
personnel; a brief summary of 10 key activities to review is provided
below. (The descriptions are based on current industry practice that
has evolved since 2005 and differ slightly from descriptions found
in Q9.)

Initiate the QRM process. A cross-functional team is formed
with members who have expertise in the area of interest and some
basic training in the QRM process and tools. Often an experienced
facilitator (Vesper et al., 2019; McFarland et al., 2020) guides the group
through the QRM process and use of the tools. Here at the start of
the process, the goal and scope of the risk exercise are defined along
with the decision that the risk assessment is meant to inform. Details
concerning the “thing of value,” such as the process (e.g., process
diagram) or product being assessed (e.g., critical quality attributes,
specifications) are needed as well as rating scales (the criteria for
severity/impact, likelihood of occurrence, and sometimes detection/
controls). One important output of the initiation step is a well-written
risk question that guides the team. For example, “What is the risk to
product quality and stability if Chaosamix tablets were made when
the room temperature averaged 3°C above specification?”

Risk (hazard) identification. Strictly speaking, this step is
actually looking for hazards, the sources of harm. (Since you have
not determined the impact of the harm or likelihood of occurrence,
you cannot yet estimate the risk.) Hazard identification can be done
by using simple brainstorming or through a more defined tool like
a fishbone diagram, the hazard identification tool (HIT), or the
hazard and operability (HAZOP) method.

Risk analysis. For each hazard, the team determines its cause,
severity, current control/detection strategy, and likelihood of
occurrence. Tools used for risk analysis include risk rating check
sheets, preliminary risk assessment (PRA), the overused (Vesper
et al., 2016) failure mode and effects analysis (FMEA), and fault
tree analysis (FTA). Rating scales are used to assign categories of
severity, likelihood, and sometimes, if FMEA is used, detection.


Risk evaluation. With information about the hazard, its impact,
and likelihood of occurrence, the identified risks can now be
prioritized by using the FMEA risk priority number (RPN) or a risk
matrix (Figure 2), for example. This is an important output because
a goal of the risk assessment process is to determine which risks are
more important than others, thereby requiring the most attention.
Usually three categories are used here: high (must be reduced),
moderate (should be reduced if possible), and low/negligible (can
be accepted as they are).
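For readers who have not worked with an RPN, the calculation can be sketched in Python as follows. The RPN is conventionally the product of the severity, occurrence, and detection ratings. Everything else here is an assumption made for illustration: the function names, the 1-to-10 rating scale, and the failure modes in the example are hypothetical, and each firm defines its own scales.

```python
def risk_priority_number(severity, occurrence, detection):
    """Return the FMEA risk priority number: severity x occurrence x detection.

    Assumption: each rating is an integer on a 1-10 scale.
    """
    for name, rating in (("severity", severity),
                         ("occurrence", occurrence),
                         ("detection", detection)):
        if not 1 <= rating <= 10:
            raise ValueError(f"{name} rating must be between 1 and 10")
    return severity * occurrence * detection


def rank_by_rpn(failure_modes):
    """Sort failure modes from highest to lowest RPN.

    `failure_modes` maps a name to a (severity, occurrence, detection) tuple.
    """
    scored = [(name, risk_priority_number(*ratings))
              for name, ratings in failure_modes.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical failure modes and ratings.
    print(rank_by_rpn({
        "wrong label applied": (8, 3, 5),
        "tablet chipped": (2, 2, 2),
        "room temperature excursion": (5, 4, 3),
    }))
```

The ranking, not the absolute number, is what informs the decision: the output simply orders the identified risks so that attention goes to the highest-scoring ones first.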

Figure 2. Example of a risk matrix used for evaluating risks

                               Severity
Likelihood     Minor (1)         Major (2)      Critical (3)
High (3)       Medium (3)        High (6)       High (9)
Medium (2)     Low (2)           Medium (4)     High (6)
Low (1)        Negligible (1)    Low (2)        Medium (3)
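The matrix in Figure 2 can also be expressed as a short Python lookup, which may be useful when many risks have to be rated consistently. The scores and category boundaries below are read directly from the figure (score = severity × likelihood); the function and dictionary names are hypothetical.

```python
SEVERITY = {"minor": 1, "major": 2, "critical": 3}
LIKELIHOOD = {"low": 1, "medium": 2, "high": 3}


def risk_level(severity, likelihood):
    """Return (score, level) for a severity/likelihood pair per Figure 2."""
    score = SEVERITY[severity] * LIKELIHOOD[likelihood]
    # Boundaries taken from the cells of Figure 2:
    # 6 and 9 are High, 3 and 4 are Medium, 2 is Low, 1 is Negligible.
    if score >= 6:
        level = "High"
    elif score >= 3:
        level = "Medium"
    elif score == 2:
        level = "Low"
    else:
        level = "Negligible"
    return score, level


if __name__ == "__main__":
    print(risk_level("critical", "medium"))
```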

Risk reduction. For those risks that must or can be reduced,
a variety of approaches exist. For the risks determined to be high,
risk reduction is a necessity. For risks classified as moderate,
consideration can be given to the extent of risk reduction and the
benefits that would result. Low risks usually do not require any risk
treatment; however, if there is something that can simply reduce the
risk even more, it could be considered.


Reducing risks can be done in two primary ways: control and
mitigation. Risk control is sometimes called prevention because it
attempts to reduce the likelihood of an unwanted event happening.
(Improving the detectability of the hazard or failure mode before it
causes harm is often considered to be an approach in reducing risk.)
While it cannot always be accomplished, risk mitigation assumes
the unwanted event will happen and attempts to reduce the severity
of impact or harm. For example, one cannot prevent a hurricane
from striking a manufacturing facility in Puerto Rico, but one can
have a mitigation strategy, such as having a well-constructed and
maintained building, reducing the inventory kept at the site, having
an emergency response plan, having multiple power generators
with lots of diesel fuel, and so on. If neither control nor mitigation
is appropriate, for example with a risk that has high severity and a
low likelihood of occurrence, preparation should be considered. This
could include having an action plan in place that can be executed
should the event happen. A product recall plan would be an example
of this.

Prior to its implementation, the proposed risk reduction
plan should be reviewed to determine if any new risks might be
unintentionally introduced by the planned measures. Usually the
risk reduction plan goes through the change control process.

Residual risk evaluation. After plans to reduce the risks have
been identified, residual risks still remain. Decision makers need
to determine if the risks have been reduced to tolerable, acceptable
levels or if the residual risk level requires more treatment. When the
risks have been sufficiently reduced, they are then accepted.

Risk communication. Throughout the QRM process, there
should be mechanisms in place so that stakeholders (management,
regulatory authorities, employees, healthcare professionals, patients,
caregivers) are appropriately notified. Within an organization, this
is often called “escalation” and should be described in a procedure.

Output/Result of the QRM process. As quality professionals
and regulatory inspectors often say, “If it was not written down, it
did not really happen”; therefore, documenting the risk assessment
inputs and outputs is a critical part of QRM. Some of the tools (e.g.,
PHA, PRA, FMEA) use self-documenting tables that can be included
with a formal report. Other times, less structured approaches to risk
assessment (the “what if…?” method) can be documented using a
simple one- or two-page risk memo or report.

Risk review and monitoring. Unless the risk assessment
is intentionally a once-and-done exercise to inform a discrete
decision (e.g., quality event triage or determining the level of change
management), most QRM reports are periodically revisited. The
time period selected for this risk review depends on factors such
as how dynamic or stable the process/product or system is (e.g.,
early stage versus late stage). Additionally, there might be events
that could trigger a review such as a complaint, stability failure,
critical deviation, or change. Risk review and monitoring should be
described in a procedure and its performance should be documented.

QRM AND RISK-BASED THINKING

QRM is a specialized application of risk-based thinking (RBT). RBT
is a concept that is being embraced by all varieties of organizations
and is described in an international quality standard, ISO 9000:2015.
RBT can be defined as:
A broader approach to considering risks and ways to reduce them. It
is integrated into systems, decisions, and actions throughout all parts
and levels of the organization.

RBT is something that we each do every day. It depends on our
having a practical, working understanding of the process, system,
or activity under evaluation, a level of curiosity, and basic common
sense. We can use RBT without going through the trouble of creating
some sort of risk table; often it is a set of heuristics or rules of thumb
we can apply before taking an action or making a decision. A personal
example: After many years of flying commercial airlines, I have my
own set of “rules” that I use when making travel reservations. If
I’m going to an important meeting or training event, I will usually
go the night before or at least have one other flight option to help
improve the chances of arriving. I’ve sworn off several different
airlines and, depending on the time of year (e.g., summer or winter),
certain airports and routings. If I’m flying internationally, I always
allow an extra day just in case there is a problem (and 15 percent of
the time there is). There isn’t a written procedure I follow but rather
a mental decision tree that I use when making travel decisions.

As we will see, RBT and QRM can be integrated into an
investigation and the resultant actions taken, and can be used in
considering the impact of changing and implementing these actions.

Incorporating RBT and QRM into the investigation process
If we consider RBT and QRM as methods we can use to make better
data-driven, proactive decisions, there are several opportunities
that allow us to answer questions like:

• What are the quality events or deviations that we should spend
the most time and effort investigating?

• What are the potential impacts of the event that occurred?

• What events can we legitimately, from a risk-based approach,
simply note as having occurred?

• How likely is it that this event will happen again?

• How much time, effort, and resource should we put into
executing corrective actions?

• What is the likelihood that a defective product could have a
negative effect on the patient?

In applying RBT and QRM, there are three particular times
when they are useful:

1. Deciding how much effort to expend when investigating the
unwanted event,

2. Considering unwanted or unintended effects that could occur
when taking an immediate action, and

3. Determining the impact that the unwanted event could have on
the patient or other things of value.


QRM in the context of an investigation (particularly items
one and three above) is a retrospective use: the event has already
happened. These assessments and actions can also be considered ad
hoc; that is, they answer a very specific question which does not
require the risk assessment to be reviewed and monitored in the
future.

Risk-based triage
The most frequently seen application of RBT and QRM related to
quality events and deviations is having an established set of rules
to determine the extent and thoroughness of the investigation
that is warranted based on what is initially known. The concept
of triage is thought to have originated in the Napoleonic Wars,
where responders to an event with mass casualties would quickly
categorize those who would be expected to succumb to their injuries
regardless of further treatment; those who would be expected to
survive without additional treatment because their injuries were
minor; and those who would survive only if they received medical
attention. Obviously, the evaluation criteria for quality events
are different, but the concept of categorizing the events to decide
where to spend one’s limited resources is similar. Using this type of
risk-based approach is aligned with the Pareto Principle, or the 80/20
rule, that quality guru Joseph Juran said applied to quality problems
in manufacturing where a small number of causes produced most
of the defects, or as he said, “the vital few and the trivial many”
(Hindle, 2008, p. 55).
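A Pareto analysis of this kind is easy to sketch in Python. The example below is illustrative only: the function name and the defect counts are hypothetical, and the 80 percent cut-off is simply the conventional reading of the 80/20 rule, not a fixed requirement.

```python
def vital_few(defect_counts, share=0.8):
    """Return the causes that together account for at least `share` of defects.

    Causes are taken in descending order of count until the cumulative
    fraction of all defects reaches `share` (assumed default: 0.8).
    """
    total = sum(defect_counts.values())
    if total <= 0:
        raise ValueError("no defects to analyze")
    ranked = sorted(defect_counts.items(), key=lambda item: item[1], reverse=True)
    selected = []
    running = 0
    for cause, count in ranked:
        selected.append(cause)
        running += count
        if running / total >= share:
            break
    return selected


if __name__ == "__main__":
    # Hypothetical complaint counts by cause.
    counts = {"labeling": 55, "seal integrity": 30, "particulates": 8,
              "fill volume": 4, "other": 3}
    print(vital_few(counts))
```

In this invented data set, two of the five causes account for 85 percent of the complaints, which is the kind of concentration that justifies focusing investigative effort on a small number of causes.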

Table 2 shows some of the decision criteria used to categorize
quality events into four categories. These are based, in part, on a
decision tree prepared by the World Health Organization (WHO,
2013).

The outcome of this risk-based triage should also be used in
determining how quickly and how extensively the issue is to be
escalated to the organization’s management.

It is important to highlight when there is uncertainty or if
something is unknown; because of the lack of information or a
high level of uncertainty, there is higher risk, or in this case, a more
critical event. As more information becomes available, the deviation
category can be adjusted if it is appropriate. As knowledge increases,
the level of true risk can be more accurately estimated.
Table 2. Characteristics that can be used in triaging unwanted events
into four categories

                  Category 1 events    Category 2 events    Category 3 events    Category 4 events

Term              Critical deviation   Major deviation      Minor deviation      Incident

Impact            Most serious or      Moderate, known or   Least serious,       No impact, known
                  unknown impact       unknown impact       known

Cause             Unknown              Unknown              Recognized, known    Obvious/known

Corrections,      Need to be           Need to be           Obvious; immediate   Obvious; could be
corrective        determined           determined           actions may obviate  self-correcting or
actions                                                     the need for         self-limiting
                                                            additional actions.
                                                            Procedures often
                                                            provide a specific
                                                            path forward

Patient impact    Would affect a       Could have some      Would definitely     Would definitely
                  critical quality     impact on a          not affect critical  not affect SISPQ-A
                  attribute,           noncritical quality  quality attribute
                  SISPQ-A*, or affect  attribute, a         or SISPQ-A
                  unknown but judged   qualified piece of
                  to be serious        equipment,
                                       validated process,
                                       or some impact on
                                       SISPQ-A

Regulatory/       Very significant or  Possible if not      None                 None
compliance        unknown but thought  thoroughly
significance      to be serious        addressed

Scope or          Large or unknown     Limited to item,     Limited to item,     Limited, known
pervasiveness     scope                product, batch       product, batch
of the event                           involved; scope is   involved
                                       known

Type or rigor of  Significant          Significant          Usually              None
investigation                                               procedurally defined

Documentation     Full                 Full                 Note in batch        Notation
of event                                                    record or log book

Possible          True sterility       Temporary            Material released    Temporary power
examples**        failure; use of an   unexpected power     from warehouse not   disruption to
                  unreleased raw       failure in a GMP     according to FIFO    warehouse that does
                  material; insect     area; HEPA filter    requirement          not contain any
                  infestation          shutdown during      (decision to use     temperature-
                                       filling; mix-up      material based on    sensitive materials;
                                       that occurs in       expiration date and  temperature does
                                       formulation          not on receipt       not go out of
                                                            date. Sometimes      typical range
                                                            called FEFO, first
                                                            expired, first
                                                            out).

*SISPQ-A is an acronym for characteristics of a GMP-compliant product as listed in the
US FDA CGMPs: Safety, Identity, Strength, Purity, and Quality. “A” is for availability of
product, a recent additional expectation, though not explicitly found in the CGMPs.
**Providing examples in these situations is always a risk in itself. Actual, specific situations
must be carefully understood and investigated as needed.

Criteria for not moving to further action or investigation

Very clear criteria need to be developed that describe the situations
that do not need to move further through the investigation process.
For example, criteria that must be completely met in order to avoid
moving forward in the investigation could be:

• The cause of the event is immediately known

AND
• The event can be immediately corrected (and perhaps that the
correction is defined by a procedure)
AND
• The event does not have any impact on product safety, identity,
strength, purity, or quality.


Some organizations also stipulate that this type of event (one that
has low criticality) has happened fewer than x times in a 12-month
period.
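Because every criterion must be met, the check reduces to a simple logical AND, which the following Python sketch illustrates. The class, field, and function names are hypothetical. The recurrence limit (the “x” above) is deliberately left as a required argument because the text does not specify a value; each organization would define its own.

```python
from dataclasses import dataclass


@dataclass
class QualityEvent:
    """Minimal, hypothetical record of what is initially known about an event."""
    cause_immediately_known: bool
    immediately_corrected: bool
    affects_sispq: bool            # any impact on safety, identity, strength, purity, or quality
    occurrences_last_12_months: int


def may_skip_further_investigation(event, recurrence_limit):
    """Return True only if ALL criteria for not moving forward are met.

    `recurrence_limit` is the organization-defined "x": the event must have
    happened fewer than this many times in a 12-month period.
    """
    return (event.cause_immediately_known
            and event.immediately_corrected
            and not event.affects_sispq
            and event.occurrences_last_12_months < recurrence_limit)
```

Writing the rule this way makes one property explicit: failing any single criterion, including the recurrence count, sends the event forward into the investigation process.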

A point to highlight is that most national health authority
inspectors expect to see trending performed for all types of quality
events, regardless of the criticality. While, by definition, these
events do not have product or patient impact, they might be signals
of more serious events to come or, more likely, be opportunities
for quality improvements. Documenting and trending will benefit
the organization and enable continuous process improvement (see
Chapter 20).

Risk-based thinking applied to immediate actions

When the event is first observed, one assesses the situation and
mentally determines an immediate action plan in order to limit the
scope and impact of the unwanted event. If the event was anticipated
as a potential occurrence, there is often a procedure that
prescribes actions to be taken. Without that guidance, however, there
are two simple but useful risk questions that should be answered by
the responders:

1. What are the risks to people (including myself), products,
facilities, equipment, and other things of value in my proposed
immediate action? (In other words, “What could go wrong with
my plan?”)

2. What are the risks if I do nothing?

Patient impact
Two questions that a health authority inspector, when reviewing a
deviation investigation, will want to see answered are, “What is the
scope of the problem?” i.e., “How big is the problem?” and “What
is the effect or impact of this problem on patients?” One way to
answer the effect question is through an impact assessment. There
are different ways this can be considered. One qualitative approach
comes from the “Swiss cheese” model discussed in Chapter 7.


Using this concept, one can describe the quality system elements or
practices that act as multiple effective barriers, keeping the defect
from harming the patient.

One other question that might be important to ask when deciding
to reject a batch of product (which is a type of corrective action) is,
“What is the risk to patients if this product is not available?” For
medically necessary products where patients have no other option,
this can be a very important question to answer. There may be
ways to reduce the impact or harm by way of a mitigation strategy.
For example, if there is a medically necessary injectable product
that has particle contamination and there is no similar approved
therapy, a limited mitigation strategy might be to advise healthcare
practitioners to use a filter when withdrawing the liquid from the
vial. Typically this reliance on the controls that users must apply
is not acceptable; however, here you are attempting to examine the
relative risks of two different options.

Patient impact—no; Compliance impact—maybe

There are times when it can be shown that a deviation has no real
negative impact on patients, but the situation could be seen as a
failure of GMP or commitment to the health authority. Often the
advice in these situations would be to make a scientific argument
for proceeding or releasing a product, but under other conditions
the firm may decide it is not in its best interest to do so. One such
event occurred in a firm that was under significant scrutiny by a
health authority for making a series of poor decisions that eroded
trust with the agency. The firm had committed to doing certain in-
process tests, one of which was missed by mistake. All the other
tests were done as required and any problem would have shown
up in the later tests that were correctly performed. The firm decided
that it was worth more to demonstrate to the agency that they were
living up to their commitments than trying to build a case to release
the batch. These are very difficult situations to be in; taking the long,
strategic view is often in the best interest of the organization.


CONCLUSION
There are many important decisions that need to be made related
to a quality event or deviation. Quality risk management and risk-
based thinking can be the source of information that will contribute
to consistent, data-driven decisions. It is important that the decision
is as free from bias as possible and not simply a justification of a
poor practice.

REFERENCES
Hindle, T. (2008) Guide to Management Ideas and Gurus. London:
Profile Books.

ICH (2005) Quality risk management – Q9. International Conference
on Harmonisation. https://database.ich.org/sites/default/files/Q9_Guideline.pdf.
Accessed 2 Mar 2020.

Kaplan, S. and Garrick, J. (1981) On the quantitative definition of
risk. Risk Analysis, Vol. 1, pp. 11–27.

McFarland, A. and Vesper, J. (2020) How to facilitate great virtual
meetings during a pandemic (or any other time). Pharmaceutical
Online, Mar 20, 2020.
https://www.pharmaceuticalonline.com/doc/how-to-facilitate-great-virtual-meetings-during-a-pandemic-or-any-other-time-0001.
Accessed 22 Mar 2020.

MHRA (2016) MHRA GMP inspection deficiency data trend 2016.

Vesper, J. and McFarland, A. (2019) How to become a (better) facilitator


Vesper, J. and O’Donnell, K. (2016) Current challenges in
implementing quality risk management. Pharmaceutical
Engineering, Nov-Dec, Vol. 36:6, pp. 73–79.

WHO (2013) Deviation Handling and Quality Risk Management –
Draft Guidance. Geneva: World Health Organization.
https://www.who.int/immunization_standards/vaccine_quality/risk_july_2013.pdf.
Accessed 16 Apr 2020.


MODELS USED IN DESCRIBING INCIDENTS

(Note: This chapter is adapted from Chapter 4, “Accident Models,”
in Risk Assessment and Risk Management in the Pharmaceutical Industry,
James Vesper, 2006.)

Here are two favorite quotes to begin this chapter:

All models are wrong, but some are useful. (George E. P. Box)

Nothing is as practical as a good theory. (Kurt Lewin)

Models are human-made constructs used to explain a system or
experiment with a phenomenon. They are frames of reference that
are important as we each try to create a picture in our minds—a
mental model—that we can use to understand how something
works (see example in Chapter 5). Mental models also help us as
we try to explain an event or tell a story—one of the things that
we do as we write an investigation report. Additionally, if we can
better understand how accidents occur by having one or more
mental models available to us, we may be more likely to see factors
that contribute to or cause these undesirable events with the goal of
controlling them.

Accident theory has evolved a great deal in the past 90 years. It
has given us models to use as we examine hazards, harm, and risks
that have developed. As we will see, some of the models no longer
accurately describe the way we view reality and have been largely
discarded from serious study. Other models have been expanded to
include a new understanding of organizations and organizational
systems. New models have been proposed to address emerging
technologies and advances in disciplines like engineering and
computer science. Some models incorporate features from other
categories, resulting in hybrid or “blended” models.

As we discuss the models that have often been used to describe how
accidents occur, we will use the broader term incident, which
does not necessarily imply any type of loss.

SINGLE-EVENT MODEL
Perhaps the oldest—and simplest—paradigm explaining how
incidents happen is the single-event model. Its simplicity, however,
is also its weakness. According to the model, an incident is caused by
a single event that is necessary (i.e., the situation would not happen
if the event didn’t take place) and sufficient (i.e., the event was all
that was needed to cause the incident). In other words, the model
simply shows cause and direct effect.

For example, I’m walking and fall because I slip on an icy patch.
Or a shark bites a swimmer’s leg because she was in the “wrong
place at the wrong time.” Or the wrong ingredient is added to a
tank because the operator misread the ingredient’s label. In each of
these examples there is a cause (ice, shark, operator mistake) and
an effect (fall, injury, contaminated batch). The single-event model
merely presents the occurrence of an event, without asking why or
how the triggering (proximal) event occurred.

While this model may seem highly unsatisfactory, it continues
to be used, often in news reports, for several reasons. First, we may
not know the “whys” or the “hows”; this model doesn’t require that
we understand the reasons for the incident. For example, we may
not know why a shark suddenly appears in an area where sharks
usually do not venture. Second, those who present an incident
as a simple cause and effect may prefer a simple model because
they assume others will be able to “fill in the blanks” explaining
what led to the incident. Third, those reporting or describing the
incident may not have the time (or want to take the time) to ask
why or how events unfolded. The news media often present events
in this way: A reporter may report that an alligator has gotten into
a family’s swimming pool in Florida (cause) and scared the family
(effect) but feels no need to explain how the alligator got there in the
first place—the houses encroaching on and destroying the natural
habitat of the alligator, etc. This approach is also evident in accident
statistics, in which events are regarded as having only one cause,
and in insurance policies, which refer to “acts of God.”

CHAIN-OF-EVENTS MODELS OR DOMINO THEORY

Because of the obvious limitations of the single-event model, others
have come up with models that consider multiple events that result
in the accident. The “domino” theory or “sequence of events” model
was developed by H.W. Heinrich in 1931 and later refined by him
(1936). While working at an insurance firm, he was investigating
ways of preventing industrial accidents. Heinrich examined 75,000
accident records (by hand!) and determined that:

• 88 percent of the accidents were caused by unsafe acts by people.

• 10 percent were caused by unsafe physical conditions.

• 2 percent were caused by “acts of God” and would have
occurred anyway.

In trying to understand how people cause accidents, he came up
with five sequential factors, or dominoes (Figure 1):

1. Ancestry and social environment (including education)

2. Fault of the person (e.g., carelessness)

3. Unsafe act or condition

4. Accident

5. Damage or injury

Figure 1. Heinrich’s domino theory model

For example, consider the case of a worker injured because he
didn’t use a ladder properly. In Heinrich’s model, the person:

1. Is unfamiliar with ladders and the safety precautions for their
use (ancestry and social environment)
2. Climbs to the top rung of the ladder (fault of the person)
3. Leans over to reach something (unsafe act)
4. Loses his balance and falls (accident)
5. Falls off the ladder and breaks his arm (injury)

Heinrich goes on to discuss how accidents could be prevented
by modifying one or more of the first three dominoes through
engineering, education, and enforcement of safety rules.
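
The linear logic of the domino model can be sketched as a chain that stops at the first domino that has been removed. This is only an illustration of the idea of removing a domino; the function name `last_domino_to_fall` and the representation of dominoes as strings are assumptions made here, not something taken from Heinrich.

```python
from typing import Optional, Set

# The five sequential factors, in the order given in the text.
DOMINOES = [
    "ancestry and social environment",
    "fault of the person",
    "unsafe act or condition",
    "accident",
    "damage or injury",
]


def last_domino_to_fall(removed: Set[str]) -> Optional[str]:
    """Walk the chain in order and stop at the first removed domino.

    Returns the last domino that fell, or None if the first one was removed.
    """
    fallen = None
    for domino in DOMINOES:
        if domino in removed:
            break
        fallen = domino
    return fallen


# With nothing removed, the chain runs through to damage or injury.
print(last_domino_to_fall(set()))
# Removing the unsafe act stops the chain before the accident.
print(last_domino_to_fall({"unsafe act or condition"}))
```

The sketch also makes the model's weakness visible: there is exactly one path, and each step depends only on the step before it.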

While those in the pharmaceutical/biopharma industry today
may have different categories, the domino model, one that is very
linear, is widely used. An indication of this is the number of times
training or retraining is the sole corrective action for a problem. This
implies that all one needs to do to prevent a recurrence of the event
is to provide more information.

A deficiency of the domino model is that it treats accidents and
incidents as mechanistic and linear and having only one path. Also, by
emphasizing the person involved and not the conditions and systems
that can contribute to an accident, the model “blames the victim.”


The “multilinear events sequencing” (MES) model is a more
complex chain-of-events model devised by Benner (1975) that is
frequently used to investigate airplane accidents. A person using the
model investigates an accident by examining the various animate
or inanimate “actors” and the “actions” that they perform or affect.
Using this model involves identifying the actors—which can be
human, mechanical (e.g., a switch), or conditions (e.g., weather)—
and plotting what they do and how they interact over time. The
MES model also incorporates aspects of the hierarchical models
discussed below.
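
The idea of plotting actors and their actions over time can be sketched as one timeline per actor. This is a simplified illustration and not Benner's notation: the function `build_timelines`, the tuple layout, and the sample events are all hypothetical.

```python
from collections import defaultdict


def build_timelines(events):
    """Group (time, actor, action) tuples by actor, each timeline ordered by time."""
    timelines = defaultdict(list)
    for time, actor, action in sorted(events):
        timelines[actor].append((time, action))
    return dict(timelines)


# Invented events: a human actor, a mechanical actor, and a condition.
events = [
    (3, "operator", "opens valve"),
    (1, "weather", "temperature drops below freezing"),
    (2, "switch", "sticks in closed position"),
    (4, "operator", "notices no flow"),
]

for actor, timeline in build_timelines(events).items():
    print(actor, timeline)
```

Laying the timelines side by side is what lets an investigator see where the actions of one actor affect another.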

James Reason, a British researcher, described a model that is a
slight variation on the linear model. In the “defenses in depth”
model (sometimes called the “Swiss cheese” model, Figure 2), each
level of defense creates a barrier to an accident (Reason, 1990, p.
208). A void or gap in the barrier (depicted in the model as
a hole in the slice of cheese) may allow a hazard to pass through.
The next level of defense presents yet another potential barrier to
the hazard. Again, any gap in the defense may permit the hazard to
progress into an accident. One way to think of this is that each piece
of cheese is an element in the firm’s quality system.

Figure 2. James Reason’s Swiss cheese model (Reason, 2000)
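
One rough way to put numbers on this picture is to treat each slice as a barrier with some chance of letting a hazard through. This is an assumption added for illustration and is not part of Reason's qualitative model: the barriers are treated as independent, and the barrier names and probabilities below are invented.

```python
from math import prod


def probability_hazard_passes_all(gap_probabilities):
    """Chance that a hazard slips through every barrier.

    Assumes the barriers fail independently of one another, which real
    quality system elements often do not.
    """
    for p in gap_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each gap probability must be between 0 and 1")
    return prod(gap_probabilities)


# Hypothetical quality system elements and the chance each one misses a defect.
barriers = {
    "incoming inspection": 0.1,
    "in-process testing": 0.05,
    "batch record review": 0.2,
}

print(probability_hazard_passes_all(barriers.values()))
```

With these invented values the result is roughly 1 in 1,000. If the holes line up because the barriers share a common weakness, the independence assumption fails and the real chance is higher.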


HIERARCHICAL MODELS
A third type of model used to describe accidents is the “hierarchical
model.” Unlike the domino model, in which one thing causes another
to occur, the hierarchical model identifies factors that contributed to
or set the stage for an accident but did not cause it directly.

One example is that the structures, systems, processes, and
procedures set the stage for an event to happen. These can be
conditions (like oxygen and fuel required for a fire) that just require
an event or action (such as a spark) to trigger the incident. The
conditions in and of themselves are not sufficient for the incident,
but they are necessary.

Accidents or incidents of this sort are frequently “organizational
accidents” (Reason, 1997) that affect the organization, its
stakeholders, and its employees. Regulatory noncompliance of this
sort rarely develops overnight—it generally involves a series of bad
decisions and inadequate systems, as demonstrated in this scenario:

• Management focuses on productivity rather than quality,
perhaps not via a conscious decision but rather through a general
culture wherein productivity is emphasized over quality.

• Emphasis is placed on getting “good” lots released, not on
deviation investigations of environmental monitoring data
related to microbial contamination.

• The source of the microbial contamination is not found and it
grows, affecting more lots of product.

• The contamination is detected through sterility testing.

• Numerous batches are rejected, causing the medically important
product to be unavailable, which arouses the interest of the
regulatory agency.

• The regulatory agency conducts an inspection and finds a
number of deficiencies.

Another specific example of a hierarchical model is the “human
factors analysis classification system” (Figure 3).

Figure 3. An example of a hierarchical model used to identify human
factors that can contribute to the category of human error (Shappell
and Wiegmann, 1997)


FACTORIAL MODEL

A frequently used problem-solving tool for accidents and for quality
improvement initiatives is the cause-and-effect or “fishbone”
diagram (Figure 4). It is also called the Ishikawa diagram, after Kaoru
Ishikawa, a prominent quality engineer in Japan, who formalized
its use in 1943 (Ishikawa, 1986). W. Edwards Deming, J. Juran, and
others have recommended the Ishikawa diagram as one of the first
tools to use for improving quality.

The cause-and-effect diagram looks at several categories
(historically the “4 Ms”: methods, manpower, materials, machines)
to determine what may be causing or contributing to the accident or
incident. The results can be used to identify direct and contributing
causes as well as causes of variability.
The factorial model doesn’t necessarily show how a specific
accident occurred, but rather what the possible factors are that
could, directly and indirectly, cause the undesired event.
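
A fishbone diagram can be represented as a simple data structure: one effect and a list of candidate causes under each category. The sketch below uses the 4 Ms named above; the function `build_fishbone` and the sample causes are hypothetical and serve only to show the shape of the model.

```python
FOUR_MS = ("methods", "manpower", "materials", "machines")


def build_fishbone(effect, candidate_causes):
    """Group (category, cause) pairs under the 4 Ms for a single effect."""
    bones = {category: [] for category in FOUR_MS}
    for category, cause in candidate_causes:
        if category not in bones:
            raise ValueError(f"unknown category: {category}")
        bones[category].append(cause)
    return {"effect": effect, "bones": bones}


# Invented brainstorming output for an invented incident.
diagram = build_fishbone(
    "wrong ingredient added to tank",
    [
        ("methods", "no second-person verification step"),
        ("materials", "look-alike container labels"),
        ("manpower", "operator covering an unfamiliar line"),
    ],
)

for category, causes in diagram["bones"].items():
    print(category, causes)
```

An empty category, such as "machines" here, is itself useful information: it prompts the team to ask whether that area was really considered.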

SYSTEMS ACCIDENT

The accidents described above are examples of situations caused
by equipment breakdowns, by failure to follow procedures for a
variety of reasons, or due to organizational failures. In a “systems
accident,” on the other hand, there is not necessarily a tank
rupturing under normal operations or an operator mistake. Perrow
defines a systems accident as one that arises from the interaction
between components rather than from the failure of the individual
components themselves. He calls these scenarios “normal accidents”
in that they are inevitable in complex, tightly-coupled processes
because of dysfunctional interactions of the components (Perrow,
1999).

Figure 5 shows an example of a systems accident.

Figure 4. Fishbone diagram—a type of factorial model

Figure 5. Example of a systems accident (Leveson, 2002)

A classic example (Leveson, 2002) of a systems accident
occurred in a chemical facility. The “rig”—including the reaction
vessel, condenser, and material-charging devices—was controlled
by a computer programmed to freeze all operations in response to
an alarm of any kind. On the day of the accident, the reaction started
normally: a catalyst was being added to the reactor; the vapor passed
over a water-cooled condenser and returned to the reactor as a
reflux. Since the reaction was just starting, the condenser hadn’t yet
reached a level of maximum cooling. Early in the reaction, an alarm
signaling a low oil level in the gearbox was sent to the computer. The
computer responded by freezing all activity. The problem was that
the reaction was starting to heat up in response to the catalyst, and
more and more hot vapor was being generated and not being cooled
down by the condenser because the amount of cooling water had
been locked in place by the computer. The reaction continued until
it erupted from its containment vessels, releasing harmful material
into the environment.
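
The interaction described above can be mimicked with a toy simulation in which no single component fails: the alarm works, the computer freezes operations as programmed, and the condenser cools as much as its locked water flow allows. Every number, unit, and name in this sketch is invented for illustration; it is not a model of the actual plant.

```python
def simulate_reactor(freeze_at_step, steps=20, limit=100.0):
    """Return the step at which temperature exceeds `limit`, or None if it never does.

    freeze_at_step is the step at which the computer locks all settings
    in response to an alarm (None means no alarm occurs).
    """
    temperature = 20.0
    cooling = 1.0                       # cooling-water flow, arbitrary units
    for step in range(steps):
        heat = 1.0 + 0.5 * step         # reaction heats up as the catalyst takes effect
        frozen = freeze_at_step is not None and step >= freeze_at_step
        if not frozen:
            cooling = heat              # controller raises cooling to match the heat load
        temperature += 5.0 * (heat - cooling)
        if temperature > limit:
            return step
    return None


print(simulate_reactor(freeze_at_step=None))  # no alarm: cooling keeps pace
print(simulate_reactor(freeze_at_step=2))     # alarm early in the reaction
```

In the second run the cooling is held at its early, low setting while the heat load keeps climbing, so the limit is eventually exceeded even though each part behaved as designed.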

As complicated systems play a larger and larger part in our
lives, and those operating the systems become more and more
detached from their functions, systems accidents undoubtedly will
become more frequent and have a greater impact on organizations
and communities.

INDIVIDUAL AND HUMAN FACTORS

The last category of accident/incident models emphasizes the people
directly involved in accidents. Some estimates from the nuclear
power industry, for example, show that human performance
problems are the main cause of incidents (Reason, 1990, p. 186). One
reason to explain this is that people are the last line of defense when
unsafe or problem situations arise (Graeber, 1999).

Individual factors involve a person’s individual weaknesses: he
or she might be ill, or stressed from getting to work amid traffic
or a transit workers’ strike, or perhaps distracted by thoughts of
upcoming holiday plans. Ten percent of accidents are estimated to
be caused by such individual factors (Bridges et al., 1994).

Ergonomics is the scientific discipline concerned with
interactions between humans and other elements of a system
(IEA, 2000). Among the various disciplines associated with human
factors, one focuses on the person–machine interface, and others
examine larger issues, including organizational and environmental
structures (realms that are similar to the hierarchical model).

When considering the role of human factors in an accident, the
workplace is of particular interest. From a human factors standpoint,
the workplace consists of interactions between:

• Workers, who vary in size, strength, range of motion, intellect,
education, expectations, and other physical and mental
capacities.

• Work setting, consisting of parts, tools, furniture, control and
display panels, and other objects.

• Work environment, created by climate, lighting, noise, vibration,
and other conditions (Ergoweb, 2010).


Models and approaches that are useful in uncovering human
error traps and conditions that set someone up to fail are discussed
in detail in Chapter 8.

Other human factors models examine the interactions between
physical human factors, cognitive human factors, and industrial
design. A mismatch or conflict between these components or factors
makes an accident or incident highly probable.

Organizational, structural, and economic factors may combine
with human error to cause accidents. For example, the lead
investigator in the April 1986 Chernobyl nuclear reactor disaster
attributed the accident to human errors and procedural violations:
“After being at Chernobyl, I drew the unequivocal conclusion that
the accident was . . . the summit of all the incorrect running of the
economy, which had been going on in our country for many years”
(Reason, 1997, p. 16). Several factors may also have contributed
to the Bhopal disaster. Originally the disaster was believed to
have resulted from the introduction of water (intentionally or by
accident) into a tank of methyl isocyanate. Two years before the
disaster, however, Union Carbide inspectors had noted inadequate
safety practices and poor maintenance in the Bhopal facility. These
concerns went unaddressed (Gupta, 2002).

Almost all the tools used for risk estimation can consider the
contribution of human factors to accidents or incidents. Checklists
of points to consider when using a preliminary risk analysis or
guide words when examining processes and procedures have been
identified (Bridges et al., 1994), but a simpler—and potentially
more powerful—approach when identifying a potential human
contributor to a possible incident is to ask repeatedly, “How could
that happen?” or “What is the true cause of that?”

CONCLUSION
We’ve looked at some models used to explain how accidents occur;
the most useful ones are those that take into consideration indirect
or contributing factors as well as direct causes.


Having a wider understanding of what can cause and contribute
to accidents, incidents, and other unwanted events helps risk
managers and analysts to understand better how risk assessment
tools work and what their strengths and limitations are. The models
also help risk analysts identify hazards—to know what they look
like, where they may be found, and when they might appear.
Finally, some of the models, particularly those developed more
recently, show that contributing factors at times can be manipulated
to reduce the chances that a hazard will be expressed.

REFERENCES
Benner, L. (1975) Accident investigation: Multilinear events
sequencing methods. Journal of Safety Research, 7:2.

Bridges, W.G., Kirkman, J.Q., Lorenzo, D.K. (1994) Include human
errors in process hazard analysis. Chemical Engineering Progress,
Vol. 74.

Ergoweb (2010) Ergonomics concepts.
https://ergoweb.com/ergonomics-concepts/. Accessed 17 Apr 2020.

Graeber, C. (1999) The role of human factors in improving aviation
safety. Aero magazine, no. 8.
http://www.boeing.com/commercial/aeromagazine/aero_08/human.pdf.
Accessed 17 Apr 2020.

Gupta, J.P. (2002) The Bhopal gas tragedy: could it have happened
in a developed country? Journal of Loss Prevention, 15:1–4.

Heinrich, H.W. (1936) Industrial Accident Prevention. New York, NY:
McGraw Hill.

IEA (2000) The discipline of ergonomics. International Ergonomics
Association. http://www.iea.cc/ergonomics. Accessed 18 Apr 2020.

Ishikawa, K. (1986) Guide to Quality Control. 2nd rev. edn. Tokyo:
Asian Productivity Organization. Ann Arbor, MI: UNIPUB.

Leveson, N. (2002) A new approach to safety system engineering. In
process. http://sunnyday.mit.edu/book2.pdf. Accessed 18 Apr 2020.


Perrow, C. (1999) Normal Accidents. Princeton, NJ: Princeton
University Press.

Reason, J. (2000) Human error: Models and management. BMJ
2000; 320:768–770.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1117770/.
Accessed 3 Mar 2020.

Reason, J. (1997) Managing the Risks of Organizational Accidents.
Burlington, VT: Ashgate Publishing.

Reason, J. (1990) Human Error. Cambridge, UK: Cambridge
University Press.

Shappell, S. and Wiegmann, D. (1997) A human error approach
to accident investigation: The taxonomy of unsafe operations.
International Journal of Aviation Psychology, 7:4, pp. 269–271.


HUMAN ERRORS AND HUMAN FACTORS

Consider this situation: You have been asked to do a major
presentation to the leadership of your organization. You arrive at
the conference room early to connect your laptop into the projection
system. Things seem to be powering up but nothing is showing on
the conference room screen. Suddenly your computer displays the
“blue screen of death” with the message:
An equipment failure has occurred.

Or consider this: The same situation, but in this case, the message on
your screen says:
This computer is not properly connecting with the projector. Check the
connection or replace the cable.

Which is more helpful or “actionable”? Undoubtedly you selected
the second message because it provides details on what you should
do to correct the problem.

Now consider this situation: You are sent a quality event notice
that there was a mix-up in the clinical supplies that were sent to a
clinical study site. The root cause of this was identified as:
Human error.

Or, the same situation, but in this case more details are provided:
There was not a packaging list or checklist used to confirm contents of package
being sent to clinical study site PQR.


Which of these scenarios is more actionable? Which one indicates
a cause that can be specifically addressed? Again, the second one is
obviously the more helpful one.

This analogy illustrates that calling something a human error is
as unhelpful as saying that “equipment failure” was the root cause of
a problem. Human error, if anything, is really a category that needs
to be explored and defined in much more depth by focusing on
human factors. When we use the term human error, we are referring to
a class that involves people and an unintended, unwanted outcome
influenced by or due to more specifically identified factors or causes.
These factors are what we are going to examine in this chapter.

The term human factors is used in two different ways. The first is
in the context of ergonomics, specifically:
Human factors use knowledge of human abilities and limitations to
design systems, organizations, jobs, machines, tools, and consumer
products for safe, efficient, and comfortable human use (HFES, 2019).

The second usage, the one that is more suited to our discussion, can
be defined as:
Human factors are the faulty systems, processes, circumstances,
conditions, and causes that lead people to make mistakes or fail to
prevent them.

CLASSIFICATIONS
You can categorize the things that cause or contribute to human
error in different ways. The simplest has three groupings:

• Commission errors: These errors involve incorrectly performing
a specific task such as adding the wrong material to a container,
turning the valve the wrong way, or taking the incorrect action
during an emergency.

• Omission errors: These errors occur when an action is not
performed that should have been. For example, failing to
submit all required documents for review, not calibrating an
instrument by the required due date, or not taking a sample at
a critical timepoint.


• Intentional errors: These deliberate and intentional events
happen when a person knows what should be done but does not
do it. Taking an unauthorized shortcut or violating a procedure
would be examples of these.

The categories above do not provide any information on why or how
the errors occurred—that must be determined in an investigation.
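
The three groupings can be written as a small decision rule. This is a hedged sketch only: the enum `ErrorType`, the function `classify`, and its three yes/no inputs are assumptions made here to show how the categories relate, and real events will rarely reduce to three flags.

```python
from enum import Enum
from typing import Optional


class ErrorType(Enum):
    COMMISSION = "task performed incorrectly"
    OMISSION = "required action not performed"
    INTENTIONAL = "person knew what should be done but deliberately did not do it"


def classify(action_performed: bool, performed_correctly: bool, deliberate: bool) -> Optional[ErrorType]:
    """Map three yes/no observations onto the simplest classification.

    Returns None when the action was performed correctly and nothing was deliberate.
    """
    if deliberate:
        return ErrorType.INTENTIONAL
    if not action_performed:
        return ErrorType.OMISSION
    if not performed_correctly:
        return ErrorType.COMMISSION
    return None


# Wrong material added to a container: the task was done, but incorrectly.
print(classify(action_performed=True, performed_correctly=False, deliberate=False))
# Sample not taken at a critical timepoint: the task was not done at all.
print(classify(action_performed=False, performed_correctly=False, deliberate=False))
```

Note that the function only labels what happened; like the categories themselves, it carries no information about why or how the error occurred.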

Other researchers have used the term “active” or “sharp end”
of an incident and “latent” or “blunt end” of the incident (Reason,
1990). “Latent” refers to conditions that may have existed for a long
time, but for some reason have never been activated until a certain
failure sequence was initiated. Reason also uses the term “resident
pathogens” when he refers to latent conditions. The “sharp end”
is where the incident actually occurs; this is where the person is in
direct contact with the system, process, or activity.

Another approach divides events into either “planning failures”
or “execution failures” (Reason, 1997). Planning failures are also
called “mistakes.” In this situation, a person correctly performs an
action that is intended to achieve a desired goal, but the action is not
the appropriate one or the action is inherently deficient. Execution
failures occur when the action is not properly performed, either
because of “slips,” usually due to inattention or perception failures,
or because of “lapses,” when you forget to do something, often because
of memory failures.

One other unique way that errors have been classified is as a
“phenotype,” what the error or deviation looks like (similar to what
is seen at the “sharp end” of the failure sequence), and a “genotype,”
the potential or contributing causes of the unwanted event (Moura et al.,
2016). As in genetics, a gene for a trait (the genotype) may not be
expressed, yet it remains lurking beneath the surface.

THE PERSON OR THE “SYSTEM”?

James Reason, a noted British authority on the topic of human
error, writes (2000) that there are two investigational approaches
that can be used (Table 1). The first, less constructive approach
considers the person(s) involved with the incident. The focus is on
the improper acts that the person did and the intentional violations
that were made. The reasons for these unacceptable behaviors are
presumed to be personal flaws—not paying attention, forgetting
things, negligence, and not being careful enough. So what are we
going to do about this? We’re going to put more controls in—we
will modify procedures (almost always by making them longer and
more complicated); we will scare the person: “If you do this one
more time, there will be a memo put into your permanent HR file.
We will let everyone know that you were the cause of the failure.
And one more thing—we will force you to sit through the 3-hour
training program again.”

Reason suggests that there is another, better way of investigating
the causes of an unwanted event. He says the emphasis should be on
the system and that when people are involved in any task, one needs
to recognize that there will be failures, because all of us humans are
fallible. Using the systems view, we need to look upstream—that
is, to consider the organizational and process factors that can, often
unwittingly, set someone up to fail. These can be error traps, for
instance paper-based log books that are inconsistent in their design
from one instrument to another, or sampling outlets that are difficult
for a person to safely and easily access. What we can do about this
is think more broadly about the problem—what are the defenses
and barriers that we can use, perhaps people working together as a
team instead of just relying on one overworked professional, having
checklists that a trained person can use to prevent omitting a critical
step, or changing the location of the sampling outlet.
Table 1. Two approaches when considering the category of human
error (Reason, 2000)

Attribute          Person approach                   System approach

Focus              Unsafe acts; errors and           Humans are fallible; errors are
                   violations                        to be expected

Presumed causes    Forgetfulness, inattention,       “Upstream” failures, error
                   negligence, carelessness          traps; organizational failures
                                                     that contribute to these

Countermeasure     Fear, more/longer procedures,     Establish system defenses and
to apply           retraining, disciplinary          barriers
                   measures, shaming

WHY SUCH A HIGH PROPORTION OF SO-CALLED HUMAN ERRORS?
If you ask a pharma or biopharma firm for a list of causes of quality
events, it is not uncommon to see anywhere from 30 to 60 percent
of the unwanted events having their root cause ascribed to
human error. Corrective actions in these situations are almost always
training, retraining, or modifying the procedure (usually by adding
more detail). Why is human error identified so frequently as the
root cause? There are a number of reasons, including:
• It is fast and easy. It takes time, resources, and expertise to do
an in-depth investigation. As MIT’s aeronautical engineering
professor Nancy Leveson writes, “The less that is known about
an accident, the more likely it will be attributed to human error”
(Leveson, 2011, p. 37).
• Investigations start too late. The “golden hours” are the first
12–24 hours after an incident occurs. Often, investigations are
driven more by the need to close them out within 30 days,
so the real work starts a week or so before the deadline. This
means that evidence may have been lost or those involved have
forgotten critical, subtle details.
• There is usually a person at the end of the failure chain. This
is the unlucky individual (or group of individuals) who is
present when the proximal cause triggers the failure. All of the
latent factors may have existed, but that last particular action
or decision at the “sharp end” is what causes everything to go
wrong. Maintenance may have been delayed so the equipment
is held together with a patchwork of temporary fixes. At a
random (and usually worst-possible) time, something occurs
that catalyzes the failure, with those personnel present getting
the blame.
• People are viewed as unreliable and negligent. This type of
statement is based on a variety of biases such as attribution bias,
where people are judging others without any substantial facts
or support.
• A culture of blame. If the first question from management is
“Who did it?,” it is very likely that someone will be blamed for
the failure. Another question that is sometimes asked is “Who
is responsible for that?” If that question is asked, management
may want to look deeper into factors (and those in positions of
responsibility) that are more distantly involved.
• Hindsight bias. This is often coupled with a culture of blame.
This bias is evident when someone says “They should have
known!” When you are looking backward from a failure, it
is easy to put the pieces together that point to people. “Why
weren’t they more careful? If they hadn’t taken that shortcut,
everything would have been OK.” No one seems to want to
understand why that shortcut was taken. (One of the
counter-approaches to hindsight bias is recognizing the concept of
bounded rationality (Simon, 1982), which says that when a person makes a
decision, they are limited by the knowledge, often imperfect,
that they have at that particular moment.)
• Earlier signals were not thought to be important enough
to prompt action. These are the “near misses” that a robust
employee safety program looks for and takes action upon:
“There was no injury or damage this time, but think about what
might have happened. How can we prevent this from happening
again in the future?”
• Fear of speaking truth to power. This is the elephant in the
room that no one really wants to point out. Maybe the incident
was blamed on a technician who was doing a task at the end of a
double shift—they had been working for 16 hours with limited
rest breaks. Why? There aren’t enough people? Why? Hiring
restrictions? Why? Managers were told to cut their budgets but
were required to have the same amount of productivity. Why. . .?
Addressing these types of issues requires an organizational
culture where there is psychological safety (Chapter 5) so people
can be honest with leadership.

WHAT ABOUT A “BLAMELESS” CULTURE?

An approach that at least one pharma firm uses is what they call a
“blameless” culture, which is the opposite of what usually occurs
after a manager demands to know, “Who did it?” As discussed
earlier (Chapter 5), a blameless culture can make people more willing
to come forward with a failure or a near-miss that they caused or
witnessed. A criticism of this blameless culture is that it fails to hold
people responsible for their actions.

For errors or failures involving people, there might be other
ways to approach this rather than a simple blame or blameless
dichotomy. Table 2 presents a spectrum of reasons for human-
related failure from Amy Edmondson (2011). Her contention is that
deviances, i.e., intentional violations, are actions that may be cause
for placing blame on the person. Failures due to inattention should
be examined carefully to understand why—perhaps the deviation
was due to an alarm that needed attention, causing the performer to
address that and not attend to another important task. In all of these
cases, understanding the why is important.

Table 2. A range of reasons for human-related failure. Adapted from
Edmondson (2011)

The categories run from blameworthy (top) to praiseworthy (bottom).

Category             Example

Deviance             Individual chooses to violate a prescribed process,
                     procedure, or practice.

Inattention          Individual inadvertently deviates from procedure or
                     specification.

Lack of ability      Individual does not have knowledge, skills,
                     capability, or training to perform task.

Process inadequacy   Competent individual follows procedures or
                     instructions for an inherently faulty or incomplete
                     process.

Task challenge       Competent individual attempts a task too difficult
                     to be executed reliably every time.

Process complexity   A process composed of many elements fails when it
                     encounters novel interactions.

Uncertainty          Lack of clarity about future events causes people
                     to take reasonable actions with undesired results.

Hypothesis testing   An experiment conducted to prove that an idea or
                     design will succeed fails.

Exploratory testing  An experiment conducted to expand knowledge and
                     investigate a possibility leads to an undesired
                     result.

Another model, used by the US Department of Energy,
focuses on determining the intent of the individual. If the violating
actions were intended but the failure outcome was not, that is a very
different situation than if both the violating actions AND the failure
outcome were intended; the latter would be deemed as sabotage.
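
The intent distinction above can be written as a small decision rule. The following Python sketch is illustrative only: the function name and the label strings are assumptions made for this example and are not taken from the DOE material. The "error" label for an unintended action is borrowed from the error categories described earlier in this chapter.

```python
def classify_by_intent(action_intended: bool, outcome_intended: bool) -> str:
    """Hypothetical helper: label an event by what the person intended."""
    if action_intended and outcome_intended:
        # Both the violating action and the failure outcome were intended.
        return "sabotage"
    if action_intended:
        # The action was deliberate, but the failure outcome was not wanted.
        return "violation"
    # The action itself was not intended (assumed label, see lead-in).
    return "error"
```

A rule like this only sorts the event; it does not explain why the person acted as they did, which still has to come from the investigation.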

FIVE PRINCIPLES OF HUMAN PERFORMANCE

As you try to achieve the desired level of performance from
individuals, there are five key principles that need to be kept in
mind (adapted from DOE, 2009).

1. People are fallible—even the best people make mistakes.

Imperfection is part of the human condition. Given the “right”
set of conditions and circumstances, even experts can omit
a critical step or misinterpret information due to a cognitive
illusion or bias that is in operation (Kahneman, 2011).

2. Error-likely situations are predictable, manageable, and
preventable.

Cognitive and behavioral science as well as extensive experience
have identified precursors to errors as well as conditions where
errors can be expected. The TWIN approach discussed later in
this chapter provides numerous specific examples of things to
look for. By identifying these in a system, job, or area, one can
take corrective actions.

3. Individual behavior is influenced by organizational processes
and values.

An organization’s culture, meaning the values, behaviors, and practices
that form the fabric that holds an organization together,
shapes what individuals do and how they respond to events
and situations. If people perceive a lack of psychological safety
(the belief that the work environment is safe for interpersonal
risk taking; Edmondson, 2019), they will be reluctant to report
or take responsibility for an unwanted event. If people are in
fear for their jobs (getting demerits for often trivial behaviors
or those out of their control), they may not be willing to extend
themselves beyond the bare requirements. People are smart;
we are quick to see what gets rewarded—even if the behaviors
are improper—and what does not. In the Institute of Medicine
report entitled To Err is Human, this statement appears:
“Although almost all accidents result from human error, it is
now recognized that these errors are usually induced by faulty
systems that ‘set people up’ to fail. Correction of these systems
failures is the key to safe performance of individuals” (IOM,
2000, p. 169).

4. People achieve high levels of performance because of the
encouragement and reinforcement received from leaders, peers,
and subordinates.

Behaviorists such as psychologist B.F. Skinner have shown that
in most cases, positive forms of feedback and reinforcement
produce more lasting desirable effects than negative
reinforcement. When giving feedback, it needs to be:

– Specific: Generalities do not help people isolate what was
good or what needs improvement.

– Behavior focused: What was the action or decision that is
the real issue? Don’t make comments about the person. For
example, “I noticed that you were 20 minutes late to the
meeting,” not “Can’t you keep track of time?!”

– Timely: Try to provide it as soon as possible after an event
while it is still fresh in people’s minds.

– Personal impact: What effect might this behavior have?
What does it mean to you? What might it mean to patients?

5. Events can be avoided through an understanding of the reasons
mistakes occur and application of the lessons learned from past
events (or errors).

As will be seen later in this chapter, there have been a number
of different models and approaches developed that are aimed at
better understanding why events involving people happen. The
problem that is often seen in the pharma and biopharma industry
is that we do not apply these tools that have demonstrated their
value in other domains (e.g., worker health and safety) or in
other industries (e.g., nuclear power and aviation). We need to
take these lessons to heart.

Now that we have seen that we need to more deeply examine
the human factors that can lead to unwanted events and contribute
to undesired levels of human performance, we can look at models
and tools to help us do this.

MODELS AND TOOLS TO IDENTIFY CAUSES THAT RESULT IN HUMAN ERROR
There are any number of approaches that can be used in drilling
down to the causes and contributors of unwanted events that fit
into the broad category of human error. In the examples below you
will see some that are very high-level and conceptual, one that is a
hierarchical model, and another that is more of a checklist. They all
have the same intent, which is to determine factors against which
actions can be taken.

Human performance model

This example (Figure 1) was intended to provide eight broad
categories to consider when looking for barriers that prevented one
from reaching a human performance goal. If you begin from the
starting point and everything goes properly, you achieve the goal.
But often, there are barriers that have to be resolved or structures
that need to be put in place or strengthened.

This is an example of a conceptual hierarchical model. To
consistently reach the goal, the items higher up on the list need to
be in place and be operational before the lower ones, particularly
training. If training is given without the chance to practice or having
the needed tools or the performer receiving feedback, the training
will be a waste of time and money.

Figure 1. A human performance model (Vesper, 1993)

Looking at the concept of data integrity as an example will help
us see how the pieces fit together.

• Organizational structure and systems: Governance needs to be
in place. This would establish and implement policies followed
by monitoring outcomes. For data integrity, this would mean
writing a policy, communicating it to all individuals, enforcing
its use, having an active monitoring program, and if there were
data integrity issues, investigating and taking appropriate
immediate and corrective actions.

• Aptitude and capability of personnel: Personnel are given
assignments that they are able to perform. For example, lab
technicians in a chemistry lab have the cognitive ability to do
the necessary calculations.

• Design of the job: The processes are capable of achieving the
desired intent or goal. A data integrity requirement is that data
be attributable: you positively know where the data came from
or who entered or processed it. A barrier to this would be a firm
that requires sharing of login and password credentials among
numerous people because it did not pay for the number of users
(i.e., seats or licenses) that need to access the system. In other
words, the design of the process or task does not support the
goal.

• Clear, known requirements: Procedures and expectations
are communicated to everyone. People need to know that
data integrity is a “must have,” not an optional “nice to have.”
Specifically, people need to know practices for signing and
dating an entry.

• Appropriate feedback: Those doing a task are provided
information concerning their performance. It could be a
supervisor telling a technician, “The data sheet you are
completing looks great; the writing is very legible. Continue
doing that!” or “You might want to consider printing a bit
more clearly; the entries on this form are very difficult to
read.” Feedback needs to be timely, specific, and focused on the
behavior, not the person.

• Appropriate rewards: Personnel are recognized (in various
ways) for their positive contributions and performance. The
important word here is appropriate. Inappropriate rewards can
drive inappropriate behavior. For example, being positively
recognized for never having an out of specification result could
push an analyst to delete or modify data in a poorly controlled
laboratory information management system (LIMS) in order to
have an unblemished record.

• Tools and opportunities to practice: Repeated practice helps
develop proficiency in performing a task. Practice also helps
one acquire tacit knowledge, the “know-how” related to a
task. Tools might be as simple as having the correct procedure
or method available when performing the tasks or having the
correct pens (black or blue indelible ink) that conform to the
procedural requirements.

• Skill and knowledge: Education and training are needed
to safely, effectively, and compliantly perform a task. This could
include training on how verifications are to be performed or
on the rules for significant figures and rounding of numerical
results.

How you can use this

After you have identified the event as an omission, commission, or
violation, you can begin at the bottom of the list to ask if lack of skill
and knowledge had anything to do with the event. Regardless of the
answer (i.e., yes or no), go up to the next level and ask if the person
had the proper tools and opportunities to practice; if no, ask why not.
Then continue moving up the list. You will be collecting a variety
of factors that can be used to determine the root, contributing, and
proximal causes and find a set of possible corrective actions.

An example

A health authority inspector who was inspecting a large pharma
manufacturing site commented about the number of cross-outs and
corrections that were present in a manufacturing batch record. The
site’s leadership, upon reviewing the records, found that most of the
corrections involved calculations (e.g., yields, reconciliations). They
told the training manager that training in calculations and arithmetic
was to be done immediately. The training manager delayed that
request so she could talk to some of the operators involved. They
told the training manager that this was a known issue but it had
not been resolved. Specifically, the problem was the hand-held
calculators had very small keys that could not easily accommodate
the large fingers of many of the male operators, particularly if they
were wearing gloves. What would happen is that the calculation
would be performed, the result written down on the batch record,
and then another operator would re-do the calculation and, in
some cases, get a different result because the wrong digits were
inadvertently tapped. So, the root cause of the problem was really an
inappropriate tool—the small calculator—that was quickly replaced
with one that had larger keys. An effectiveness check three months
later showed a major improvement in what the batch records looked
like: fewer cross-outs related to calculations.
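
The bottom-up walk described under “How you can use this” can also be sketched in code. The Python fragment below is a minimal illustration under stated assumptions: the names LEVELS and walk_up and the dictionary format are hypothetical, the eight levels are copied from the list in the text (top of the list first), and the single finding is taken from the calculator example.

```python
# The eight categories from the text, top of the list first.
LEVELS = [
    "Organizational structure and systems",
    "Aptitude and capability of personnel",
    "Design of the job",
    "Clear, known requirements",
    "Appropriate feedback",
    "Appropriate rewards",
    "Tools and opportunities to practice",
    "Skill and knowledge",
]


def walk_up(findings):
    """Visit every level from the bottom of the list to the top.

    `findings` maps a level name to a note describing what was lacking.
    Every level is visited regardless of the answer at the level below,
    and the levels that have a note are returned in the order visited.
    """
    factors = []
    for level in reversed(LEVELS):
        note = findings.get(level)
        if note:
            factors.append((level, note))
    return factors


# Calculator example: skill and knowledge were adequate; the tool was not.
factors = walk_up({
    "Tools and opportunities to practice":
        "calculator keys too small for gloved fingers",
})
```

The collected factors are only candidates; deciding which are root, contributing, or proximal causes remains a judgment for the investigation team.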

Human factors analysis and classification system (HFACS)

HFACS (Figure 2) is based on work from James Reason and was
developed for the US Navy by Shappell and Wiegmann (1997). It
is an example of a hierarchical model that also includes aspects of
a decision tree. Compared to the human performance model discussed
above, HFACS provides more detail and considers active
errors as well as latent errors and underlying factors. If you replace
the term “unsafe acts” with “quality event” or “deviation,” this
model can be easily applied to the pharma industry.
Figure 2. Human factors analysis classification system

Beginning on the lowest level, we can describe each of the causes
or contributors and provide examples (adapted from Shappell
et al., 1997). The causes on this level are active errors that cause an
immediate effect.

• Errors: These can be viewed as “honest mistakes” that could
happen to anyone; they are actions or inactions that have
unintended results. If these errors are found, you need to ask
multiple whys—there are often reasons that are more subtle
than just lack of knowledge or skills. There are three types of
errors:

– Skill-based errors are failures in performing a task. These can
be due to attention failures, such as being interrupted and
losing one’s place in a method or attempting to do multiple
tasks at the same time and failing to notice a problem. This
category includes slips (action related) and lapses (memory
related).

– Decision errors are actions that are taken as intended but do not
achieve the desired result. These mistakes can be “rule-based”
(for example, using the wrong procedure in an emergency) or
“knowledge-based” (not having enough experience to solve a
problem in the given time). Another variation of a mistake is
choosing a course of action that is dangerous or difficult to
implement. One of the critical aspects of a decision error
or a mistake is that those involved truly believe they are
doing the right thing. If this is combined with confirmation
bias (interpreting the information in a way that supports the
plan that was taken), a potentially serious failure can occur.
Decision errors can be very difficult to detect until it is too
late.

– Perceptual errors are those due to not correctly acquiring
or processing input data from your senses; for example,
misreading a label.

• Violations: Deliberately performing an action (or not performing
a required action) with the intent that there will not be an adverse
event or deviation. (If one commits a violation with the intent of
having a negative outcome, it would be considered sabotage.)

These often occur with the best of intentions such as a manager
trying to juggle multiple important deadlines. Shortcuts are often
an example of a violation. The dangerous thing about shortcuts
is that they often do work without a problem, something that
gives people an invalid confidence that the shortcut can be
justified. There are two types of violations:

– Routine: A violation that occurs frequently and is not viewed
as an abnormal situation but is seen as “typical.” Routinely
driving above the speed limit because “everybody does it”
is an example of this.

– Exceptional: A violation that is not done very often. An
example might be signing someone else’s name to a paper
or electronic document in order to meet an important
deadline.

Moving up a level finds situations that could either be “active
errors” or “latent errors,” that is, situations and underlying
conditions that are preconditions for an unwanted incident. Latent
errors are just waiting for the right combination of factors to be
present for the unwanted event to be expressed.

• Situational factors: Part of the context of where and how the
task was performed. These are often conditions over which the
individual has little control.

– Tools and technology: The tools and supports that are part
of the task. It could be that the procedure (a simple, often
paper-based technology) was confusing or missing a step.
Another example could be poorly designed batch records
(whether on paper or online) that vary widely from product
to product.

– Physical environment: The conditions in which the tasks
are performed. For example, the gauge was incorrectly read
by the technician because there was not enough light. Or
the noise and activity levels in a laboratory were very loud,
making it difficult to concentrate and resulting in a lapse.

• Condition of operators: These are the factors that are generally
internal to those doing the task, whether operational personnel
or technical experts or management.

– Adverse physiological state: Being ill, having jetlag that
could affect decision making, or an aseptic processing
technician being prone to “shedding” micro-organisms that
could contaminate products.

– Adverse mental state: Psychological or mental limitations
that result in stress or fatigue. For example, being distracted
because of worrying about a child or elderly parent who is ill.
This category also includes biases such as overconfidence.

– Physical or mental limitations: Not having the needed
capabilities to perform a task. For example, short-term
memory issues, limited vision or hearing, or not being able
to adapt one’s sleep patterns to different work shifts or
schedules.

• Personnel factors: This category is a combination of elements
that negatively influence a person’s individual performance
and their contribution to a team.

– Personal readiness: Factors here could be a combination of
things that do not break any rules (for example, not getting
proper nutrition because one is eating only candy bars and
potato chips) or actions that are violative, such as drinking
alcohol or using drugs at inappropriate times. (Aviation
has what is called the “bottle to throttle” rule, which says no
alcoholic beverages are to be consumed within eight hours before a
flight; some airlines have increased that to 12 hours.)

– Communication, coordination, and planning: Aviation
uses the term crew resource management (CRM), which
refers to teamwork and effective communication between
all members of the flight crew. Ineffective CRM has been
seen in several high-profile plane crashes (Langewiesche,
2014). A major component of cabin crew training is meant
to promote effective teamwork. Research has shown
that teams are more effective in catching problems than
individual performers alone (Edmondson, 1999).

The next level up in the HFACS model considers the role that
supervision has. Generally these are all latent errors or underlying
conditions that are often uncovered by repeatedly asking “why?”
Four categories related to supervision are:

• Inadequate supervision: Mismanagement of individuals on
a personal level, for example, by not providing coaching
or training as needed. This also could be not having enough
supervisors on an overnight shift or supervisors who do not
spend enough time being present “on the floor” in a laboratory
or production area.

• Planned inappropriate operations: Improper scheduling or
“normalizing” activities that should only be done in emergency
situations, such as requiring people to routinely work double
shifts.

• Supervisory violations: Supervisors encouraging shortcuts to
be taken or turning a blind eye to (ignoring) practices that do not
align with policies, procedures, or other requirements.

• Failure to correct known problems: Deficiencies are allowed
to continue uncorrected. This could include operating a critical
instrument that hasn’t been properly maintained, or hearing
about water on the floor in a washroom and not taking any
actions to remediate the problem.

The top layer of the HFACS model is with management, to
whom a large proportion of incidents can be traced. “Objective
researchers have regularly shown that about 80 to 90 percent of the
damage done by poor quality is traceable to managerial actions”
(Juran, 1992, p. 248). Latent errors and underlying conditions are
found here. Three management-related categories are:

• Resource management: Not having the right number of
qualified personnel or not providing personnel with tools, time,
or money required to get a task accomplished—for example, not
enough technical staff to do timely, complete investigations.

• Organizational culture: The spoken and unspoken “rules” and
culture within a group. This might look like a firm that has a
quality statement saying that “patient safety is our number
one priority,” yet allowing products that have known serious
deficiencies to remain on the market.

• Organizational processes: Not having accepted and
standardized ways of doing things, such as policies and
procedures that do not exist or are inadequate.

How you can use this

Starting out at the lowest level, you want to determine if the event is
due to a commission or omission type of error or a violation of one
kind or another. Asking “why” moves the inquiry up one or more
levels to preconditions and underlying conditions.

An example
It was observed that operators were signing the “verified by” sections
of production records in a way inconsistent with industry practice.
Specifically, the personnel sat at a table at the end of their shift and
filled in the “verified by” entries. The consultant asked the operators
what “verification” meant and received a variety of answers, none of
them being “having real-time witness of a critical activity,” which is
a definition often used in the pharma industry. There was no formal
training provided on verification nor were there definitions in any
policy or procedure. The event was considered to be a decision
error—that is, “conscious, goal-intended behavior that proceeds as
designed, yet the plan proves inadequate or inappropriate for the
situation” (HFACS, 2014). (Had there been proper definitions and
if people had been trained on these, the event would have been
classified as a violation.)

There probably was nothing on the next level, preconditions,
related to this event. Asking why this had occurred at the
supervisory factors level could point to inadequate supervision;
supervisors with experience in pharma and data integrity issues
would have and should have intervened. Why was this? Moving
up to the organizational influences level, resource management
(not having experienced supervisors or quality unit personnel to
identify the issue) and organizational processes (not defining important
data integrity terms in a policy and procedure and then providing
training on them) would be root causes.

When using this model, it is important not to point a finger
in blame. You are identifying things that can be corrected and
improved.
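
The walk up the levels in the verification example can be sketched as a small data structure. This is only an illustration, not part of the HFACS framework itself: the `Finding` class and `walk_up` function are hypothetical names, "unsafe acts" is the HFACS term for the lowest level, and the findings are paraphrased from the example above.

```python
from dataclasses import dataclass

# Levels ordered from the lowest upward, as used in this chapter.
LEVELS = [
    "unsafe acts",
    "preconditions",
    "supervisory factors",
    "organizational influences",
]

@dataclass
class Finding:
    level: str      # one of LEVELS
    category: str   # e.g. "decision error"
    note: str       # what the investigator observed

def walk_up(findings):
    """Return (level, category, note) tuples from the lowest level upward.

    A level with no finding is reported explicitly, so the record shows
    that the level was considered rather than skipped.
    """
    report = []
    for level in LEVELS:
        hits = [f for f in findings if f.level == level]
        if hits:
            report.extend((level, f.category, f.note) for f in hits)
        else:
            report.append((level, "nothing identified", ""))
    return report

# Findings paraphrased from the "verified by" example in the text.
findings = [
    Finding("unsafe acts", "decision error",
            "verification entries filled in at end of shift"),
    Finding("supervisory factors", "inadequate supervision",
            "supervisors did not intervene"),
    Finding("organizational influences", "resource management",
            "no experienced supervisors or quality unit personnel"),
    Finding("organizational influences", "organizational processes",
            "verification not defined in any policy or procedure"),
]

report = walk_up(findings)
for level, category, note in report:
    print(f"{level}: {category} {note}".rstrip())
```

Recording "nothing identified" for a level keeps the focus on conditions that can be corrected rather than on the person at the lowest level.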

TWIN or WITH approach

At the start of this chapter, when we looked at five principles of
human performance, we stated that situations that could cause a
human error are “predictable, manageable, and preventable.” In
other words, one can look for various factors in a situation that
are frequently seen in other deviation events. If these are found,
actions can be taken against these factors. That is what was done
by the Institute of Nuclear Power Operations (INPO) and resulted
in the TWIN (or WITH) listing of precursor events. There are
four categories, each with a number of things that could cause or
contribute to an unwanted event. The database search of incidents
that INPO performed found that some of these precursor events
were seen more often than others; these are shown in boldface type
in Table 3.

Table 3. Error precursors

“The conditions listed below were derived from an in-depth study of INPO’s event
database and several highly regarded technical references on the topic of error. Many
references refer to error precursors as behavior-shaping factors or performance-shaping
factors. The bolded error precursors are more prevalent and are listed in order of impact.
Other error precursors are not listed in any particular order” (DOE, 2009, pp. 2–35 and
2–36).

Task Demands

• Time pressure (in a hurry)
• High workload (memory requirements)
• Simultaneous, multiple tasks
• Repetitive actions/Monotony
• Irreversible acts
• Interpretation of requirements
• Unclear goals, roles, or responsibilities
• Lack of or unclear standards
• Confusing procedure/Vague guidance
• Excessive communication requirements
• Delays; idle time
• Complexity/High information flow
• Long-term monitoring
• Excessive time on task

Work Environment

• Distractions/Interruptions
• Changes/Departure from routine
• Confusing displays/Controls
• Work-arounds
• Instrumentation
• Hidden system response
• Unexpected equipment conditions
• Lack of alternative indication
• Personality conflicts
• Back shift or recent shift change
• Excessive group cohesiveness/peer pressure
• Production overemphasis
• Adverse physical climate (habitability)
• No accounting of performance
• Conflicting conventions; stereotypes
• Poor equipment layout; poor access
• Fear of consequences of error
• Mistrust among work groups
• Meaningless rules
• Nuisance alarms
• Unavailable parts or tools
• Acceptability of “cookbooking” practices
• “Rule book” culture
• Equipment sensitivity (inadvertent actions)
• Lack of clear strategic vision or goals

Individual Characteristics

• Lack of knowledge (faulty mental model)
• New technique not used before
• Imprecise communication habits
• Lack of proficiency/Inexperience
• Indistinct problem-solving skills
• “Unsafe” attitudes for critical task
• Illness/Fatigue/Injury (general health)
• Unawareness of critical parameters
• Inappropriate values
• Major life event: medical, financial, emotional
• Poor manual dexterity
• Low self-esteem; moody
• Questionable ethics (bends the rules)
• Sense of control/Learned helplessness
• Personality type

Human Nature

• Stress (limits attention)
• Habit patterns
• Assumptions (inaccurate mental picture)
• Complacency/Overconfidence
• Mindset
• Inaccurate risk perception
• Mental shortcuts (biases)
• Limited short-term memory
• Pollyanna effect
• Limited perspective (bounded rationality)
• Avoidance of mental strain
• First day back from vacation/days off
• Sugar cycle (after a meal)
• Fatigue (sleep deprivation and biorhythms)
• Tunnel vision (lack of big picture)
• “Something is not right” (gut feeling)
• Pattern-matching bias
• Social deference (excessive professional courtesy)
• Easily bored
• Close-in-time cause-effect correlation
• Difficulty seeing own errors
• Frequency and similarity biases
• Availability bias
• Imprecise physical actions

How you can use this

The list of TWIN error precursors can be used in at least two different
ways. First, when investigating an event that is categorized as human
error, the lists can be checklists of points to consider. Second, these
lists can also be used proactively—for example, a smaller subset of
the items under each TWIN category could be selected, in part based
on trends within a department or organization. Prior to beginning
a task, those involved can see if any of these precursors might exist
and take appropriate precautions.
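
The proactive use described above, picking a smaller subset based on trends, could be sketched as a simple tally. This is a minimal illustration under stated assumptions: the event records are invented, and `pre_task_checklist` is a hypothetical helper, not something defined by INPO or DOE.

```python
from collections import Counter

# Hypothetical investigation records: each lists the TWIN precursors
# that investigators judged to be present. The data are invented.
past_events = [
    ["Time pressure (in a hurry)", "Distractions/Interruptions"],
    ["Distractions/Interruptions", "Simultaneous, multiple tasks"],
    ["Time pressure (in a hurry)", "Distractions/Interruptions"],
    ["Lack of proficiency/Inexperience"],
]

def pre_task_checklist(events, size=3):
    """Return the `size` most frequently recorded precursors."""
    counts = Counter(p for event in events for p in event)
    return [name for name, _ in counts.most_common(size)]

checklist = pre_task_checklist(past_events)
for item in checklist:
    print("Could this be present today?", item)
```

A department would substitute its own deviation data; the point is only that the pre-task list is driven by what has actually been seen locally.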

An example

A busy QC chemistry laboratory was having a number of failures,
though the problems rarely caused an actual out-of-specification
(OOS) failure. When an anomaly was observed, the first phase of
the investigation, in which a supervisor and analyst reviewed what
was done, usually found the problem quickly: a wrong dilution or
a mix-up of flasks and reagents. QC analysts looked at lists similar to
those in the TWIN model and realized that several of the top TWIN
items commonly occurred: time pressure, simultaneous/multiple
tasks (multitasking), and distractions/interruptions. The lab staff
thought distractions/interruptions were the most significant because
the analysts had to refocus after each interruption, and some
interruptions caused mental distractions that lasted beyond the initial
interruption. The solution was for each workstation to have yellow
tape printed with the words DO NOT DISTURB that the analyst
could put up during critical parts of the sample preparation and
testing phases. An evaluation of the results several months later
showed that the incidents had decreased significantly.

This is similar to the “sterile cockpit rule” to which commercial
pilots and flight crews must adhere. There is supposed to be no
talking about things other than the tasks at hand during taxiing,
takeoffs, and landings. Specifically:

Flight crewmember duties:

No flight crewmember may engage in, nor may any pilot in command
permit, any activity during a critical phase of flight which could
distract any flight crewmember from the performance of his or her
duties or which could interfere in any way with the proper conduct of
those duties (US FAA 14 CFR § 121.542(b)).

A brief comment on multitasking, which neuroscientists call
“switch-tasking” (Napler, 2014): Our brains cannot actually
consciously do multiple things at the same time. What multitaskers
try to do is rapidly switch between the tasks they are performing.
But each switch involves a start/stop cycle; it takes the brain fractions
of a second (at least) to reconnect with the task at hand.
Multitasking does not save time; it actually takes more time than
working through one task before moving to the next.
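
The arithmetic behind this point can be shown with a toy model. All numbers below are invented for illustration, and the assumption that every switch costs the same fixed amount is a simplification; `total_time` is a hypothetical helper.

```python
def total_time(task_minutes, switches=0, switch_cost_minutes=0.0):
    """Total elapsed time: the work itself plus a fixed cost per switch."""
    return sum(task_minutes) + switches * switch_cost_minutes

tasks = [30, 30]  # two tasks of 30 minutes each (invented)

# Finish one task, then start the other: a single switch.
one_at_a_time = total_time(tasks, switches=1, switch_cost_minutes=0.5)

# Bounce between the two tasks twenty times.
interleaved = total_time(tasks, switches=20, switch_cost_minutes=0.5)

print(one_at_a_time, interleaved)
```

The work content is identical in both cases; only the number of switches differs, so any nonzero switch cost makes the interleaved approach slower.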

Checklist questions
One other tool that can be useful during an investigation of human
error is a set of questions to ask. This list (Table 4) was developed by
Kevin O’Donnell, PhD, a highly regarded inspector with the Irish
Health Products Regulatory Authority (HPRA) (Poska, 2010).

Table 4. Suggested checklist for incorporating into deviation and
complaint investigation procedures for use when human error is
suspected (Used by permission of K. O’Donnell)

Questions to help establish whether the cause of the incident may be
process, procedural, equipment, or environment related
(answer columns: Yes, No, N/A, Comment):

• Can manufacturing process or other work activity
be considered to be robust, capable, and stable?

• Have the necessary process and other validation
studies been executed and completed?

• Is the necessary equipment (including
instrumentation) in place for executing the work
activity correctly?

• Is the manufacturing process or the concerned
work activity formally proceduralized?

• Are there up-to-date written procedures,
guidance, or policies for the work activity in place
and do they provide sufficient detail so that this
incident should not have occurred?

• Are work instructions clearly written without
ambiguity in what is required?

• Is key terminology in procedures consistent? Are
there documented materials for executing a work
task available at the location in which the activity
occurs, where relevant?

If the answer is NO to any of the above
questions, the cause of the incident may not be
human error; it may be related to one or more of
the areas above.

• Could the design of the manufacturing process or
the work activity of concern have contributed to
or caused the incident?

• Could the design of the operator/equipment
interface or the operator/process interface have
been a factor in the incident?

• Could the design of the working environment in
which the incident occurred have contributed
to or caused the incident? (For example, is there
adequate lighting and space to carry out the
task?)

• Could the design of the operator/equipment
interaction or the operator/process interface
have been a factor in the incident?

If the answer is YES to any of the above
questions, the cause of the incident may not be
human error; it may be related to one or more of
the areas above.

Questions to help establish whether the cause of the incident may be
training or communications related
(answer columns: Yes, No, N/A, Comment):

• Has the staff member in question received formal
training for executing the required work activity
or process in which the incident occurred?

• Was the effectiveness of this training assessed
and were the results of that assessment
satisfactory? (Note: it is recognized that not all
training activity needs to be subject to formal
assessment, but this should be determined on a
case-by-case basis, and the principles of quality
risk management may help in deciding what
training needs formal assessment.)

• Do change control procedures ensure that the
relevant staff are always formally informed (and
given training, where necessary) when a change is
occurring in a procedure, set of instructions, or in
a policy document?

• Were all relevant changes that were made to
the work activity in which the incident occurred
communicated to the staff member in question?

• Is a supervisor or manager available when a
staff member is performing the work activity or
process of interest in order to assess if there
are questions about how to exactly execute the
activity or process correctly? (This concerns the
need for sufficient person-to-person interaction
for operators.)

If the answer is NO to any of the above
questions, the cause of the incident may not be
human error; it may be related to one or more of
the areas above.

• When executing the item of work of concern, are
staff required to make absolute judgments that
may be considered beyond their capabilities or
experience?

[Author’s note: One other very useful question
that can be asked here is: If the person’s life
depended on it, could they correctly perform the
task without any training?]

If the answer is YES to any of the above
questions, the cause of the incident may not be
human error; it may be related to one or more of
the areas above.

Questions to help establish whether the cause of the incident may be
related to a lack of staff empowerment in their own area of work
(answer columns: Yes, No, N/A, Comment):

• Do operating staff have any means of measuring
or knowing their own performance when working
through a process or when carrying out a piece
of work, either during the process/work activity
or immediately after they have completed it? (An
example of this would be where documented
acceptance criteria or examples of acceptable
versus unacceptable work are provided to
operators.)

• Are there in-process checks or other monitoring
systems in place that allow potential problems to
be detected and prevented?

• Do operators have a means of keeping track of
their work as it progresses?

• Do operators have any means for reversing an
unintended action, and do they have sufficient
time available for error detection and correction?

• Are operators routinely given regular feedback on
the quality of their work so that they know when
they are carrying out the work activity correctly?

• Do operating staff have a means of adjusting
their manufacturing process or other work
activity to correct for any potential problems
or nonconformances? (Note: it is important
to ensure that any adjustments that staff can
make are kept within the design space/validated
process.)

If the answer is NO to any of the above
questions, the cause of the incident may not be
human error; it may be related to one or more of
the areas above.

How you can use this

The questions shown above can be a checklist for an investigator.
As shown in the note above each section, the checklist considers
the process, procedures, equipment, work environment, training,
communications, and staff empowerment. If one finds that
equipment is involved, for example, applying the “five whys” tool
(see Chapter 9) or another approach that dives into the active and
latent causes would be appropriate.
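
The decision rule built into Table 4 (some groups of questions point away from human error when answered NO, others when answered YES) can be sketched as follows. This is an illustration only: the `Question` class and `flagged` function are hypothetical, the question texts are shortened from Table 4, and the answers are invented.

```python
from dataclasses import dataclass

@dataclass
class Question:
    text: str
    flag_when: str  # the answer ("yes" or "no") that points away from human error

# Three questions shortened from Table 4, with the rule for their group.
QUESTIONS = [
    Question("Is the necessary equipment in place?", "no"),
    Question("Are work instructions clearly written?", "no"),
    Question("Could the design of the working environment have contributed?", "yes"),
]

def flagged(questions, answers):
    """Return the questions whose answer suggests a cause other than human error.

    `answers` maps question text to "yes", "no", or "n/a";
    unanswered questions are treated as "n/a".
    """
    return [q.text for q in questions
            if answers.get(q.text, "n/a").lower() == q.flag_when]

# Invented answers from one investigation.
answers = {
    "Is the necessary equipment in place?": "Yes",
    "Are work instructions clearly written?": "No",
    "Could the design of the working environment have contributed?": "Yes",
}

follow_up = flagged(QUESTIONS, answers)
for text in follow_up:
    print("Follow up:", text)
```

Each flagged question is a starting point for a deeper tool such as the five whys, not a conclusion in itself.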

Other tools
If you do a web search for tools related to investigating human error,
you will find a large number of them. Each has its own advantages
and disadvantages. Some tools are very detailed and can be used in
cataloging and creating metrics about specific human factors that
cause or contribute to unwanted events. This can be very helpful
but also requires significantly more effort as one learns to effectively
use the tool. There are other tools that use a defined set of categories
and failure causes. These are helpful to new investigators but can be
limiting as they do not encourage one to think outside of the lines or
boundaries of the tool.

CONCLUSION
One of the most important improvements you can make in your
organization’s root cause investigation program would be to not
allow human error to be named as the root cause of an unwanted
event. When you see that there is a person at the “sharp end” of a
failure sequence, there are other factors that caused or contributed to
that event. Your challenge will be to find those causes by using some of
the tools discussed in this chapter.

VALUABLE RESOURCES
Two very useful manuals are available (free!) on the web and are
among the best detailed guides for understanding human factors
and performance. Both were written for the US Department of
Energy and have been used in preparing this chapter:

Human Performance Improvement Handbook, Volume 1: Concepts
and Principles, 2009. (DOE-HDBK-1028-2009)
https://www.standards.doe.gov/standards-documents/1000/1028-BHdbk-2009-v1/@@images/file

Human Performance Improvement Handbook, Volume 2: Human
Performance Tools for Individuals, Work Teams, and Management,
2009. (DOE-HDBK-1028-2009)
https://www.standards.doe.gov/files/doe-hdbk-1028-2009-human-performance-improvement-handbook-volume-2-human-performance-tools-for-individuals-work-teams-and-management

REFERENCES
DOE (2009) Human Performance Improvement Handbook, Volume
2: Human Performance Tools for Individuals, Work Teams, and
Management. DOE-HDBK-1028-2009.

Edmondson, A. (2019) The Fearless Organization. New York, NY:
Wiley.

Edmondson, A. (2011) Strategies for learning from failure. Harvard
Business Review, April 2011.

Edmondson, A. (1999) Psychological safety and learning behavior
in work teams. Administrative Science Quarterly, Vol. 44, No. 2,
pp. 350–383.

HFACS (2014) The HFACS Framework.
https://www.hfacs.com/definitions.html. Accessed 8 Feb 2020.

HFES (2019) What is human factors/ergonomics.
https://www.hfes.org/about-hfes/what-is-human-factorsergonomics.
Accessed 3 Mar 2020.

IOM (2000) To err is human: Building a safer health system (complete
report). Washington: Institute of Medicine.
https://www.nap.edu/download/9728. Accessed 8 Feb 2020.

Juran, J. (1992) Juran on Quality by Design: The New Steps for Planning
Quality into Goods and Services. New York, NY: The Free Press.

Kahneman, D. (2011) Thinking, Fast and Slow. New York, NY: Farrar,
Straus and Giroux.

Langewiesche, W. (2014) The human factor. Vanity Fair, September
2014. https://www.vanityfair.com/news/business/2014/10/air-france-flight-447-crash.
Accessed 6 Feb 2020.

Leveson, N. (2011) Engineering a Safer World: Systems Thinking Applied
to Safety. Cambridge, MA: MIT Press.

Moura, R., Beer, M., Patelli, E., Lewis, J., Knoll, F. (2016) Learning from
major accidents to improve system design. Safety Science, Vol.
84, April 2016, pp. 37–45.
https://strathprints.strath.ac.uk/70459/1/Moura_etal_SS2016_Learning_past_accidents_improve_system_design.pdf.
Accessed 3 Mar 2020.

Napler, K. (2014) The myth of multitasking. Psychology Today (online).
https://www.psychologytoday.com/us/blog/creativity-without-borders/201405/the-myth-multitasking.
Accessed 8 Feb 2020.

Poska, R. (2010) Human error and retraining: An interview with
Kevin O’Donnell, Ph.D., Irish Medicines Board. Journal of
Validation Technology, Vol. 16, Issue 3.

Reason, J. (2000) Human error: Models and management. BMJ,
Vol. 320, pp. 768–770.
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1117770/.
Accessed 3 Mar 2020.

Reason, J. (1997) Managing the Risks of Organizational Accidents.
Aldershot, England: Ashgate Publishing.

Reason, J. (1990) Human Error. Cambridge, UK: Cambridge University Press.

Shappell, S. and Wiegmann, D. (1997) A human error approach
to accident investigation: The taxonomy of unsafe operations.
International Journal of Aviation Psychology 7:4, pp. 269–271.

Simon, H. (1982) Models of Bounded Rationality and Other Topics in
Economics. Cambridge, MA: MIT Press.

Vesper, J. (1993) Training for the Healthcare Manufacturing Industries:
Tools and Techniques to Improve Performance. Boca Raton, FL: CRC
Press.

METHODS AND TOOLS USED WHEN CONDUCTING INVESTIGATIONS

If you ever do work around your home or in the kitchen or on your
bicycle, you know the importance of having the right tool to do the
job. Trying to get to a hard-to-reach bolt with a large pair of pliers
can be frustrating; having a properly sized socket wrench with
an extender makes all the difference in the world. Or, if you are
cooking, having a knife with a blade of the optimal size will make
your chopping safer and more effective and efficient.

When we look at those doing root cause analysis (RCA), we often
see they are using the same one or two tools (i.e., the “five whys” and
fishbone diagrams) regardless of the problem or its complexity.
There are a number of different tools that can be used, some simple
and others complex, and choosing the right one can result in faster,
more complete investigations. Presenting some of these methods and
tools and when and how to use them is the focus of this chapter.

WHY USE METHODS AND TOOLS?

A fundamental question is why use methods and tools in the first
place? How are these going to help you get to the causes of an
unwanted event?

Using a method (the process for accomplishing a task) gives
a standardized structure to guide your activities. A method
often will involve a tool, such as a technique (the five whys) or a set of
symbols with rules for combining them (a fault tree).

Another important reason for using methods and tools is that
they will help you avoid biases that reduce objectivity and lead to
incorrect conclusions. Some of the typical biases that can affect an
investigation and the investigators, whether novices or experts,
include those described below.

• Anchoring bias: Remembering and relying on the first value or
data encountered and then, through adjustment, accommodating
other values or data. For example, when hearing a cost estimate
range, we as buyers tend to anchor on the lower amount; the
seller tends to anchor on the higher amount in that range. In
an investigation, a subject matter expert might say that the root
cause is X; those listening would become anchored or influenced
by that statement, making it difficult for them to consider other
possibilities.

• Authority bias: Giving greater credence to something said by
a person of higher rank or someone perceived to be an expert.

• Availability bias: Overweighting facts that readily come to mind;
ignoring information that may not be immediately accessible.

• Confirmation bias: Picking and choosing data or selectively
interpreting it in order to support the premise or the experiment.
Data that does not support the premise is ignored or explained
away. Often this is done without any deliberate intent.

• Hindsight bias: Looking at the facts after an (unwanted) event
has occurred and thinking that anyone involved in that situation
“should have known” it was going to happen as it did.
The saying “hindsight is 20/20” illustrates this.

• Optimism bias: Having (misplaced) confidence that projects
will run flawlessly, on time, and under budget.

• Selection bias: Taking samples or picking examples that distort
the results. An example is taking nonrandom “convenience
samples” because it is easier and faster than picking samples
truly at random.
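
The difference between a convenience sample and a random sample, as described under selection bias, can be sketched with a few lines of code. The lot below is entirely invented: 1,000 units in which only the last 100 are defective, so that a sample taken from the front of the lot cannot see the problem.

```python
import random

# Hypothetical lot of 1,000 units; only positions 900-999 are defective.
lot = [i >= 900 for i in range(1000)]  # True means defective

# Convenience sample: the first 50 units within easy reach.
convenience = lot[:50]

# Simple random sample: every unit is equally likely to be drawn.
rng = random.Random(0)  # fixed seed so the run is repeatable
simple_random = rng.sample(lot, 50)

print("defects in convenience sample:", sum(convenience))
print("defects in random sample:", sum(simple_random))
```

The convenience sample contains no defects by construction, whatever the true defect rate, which is exactly the distortion the bias describes.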

In addition to these cognitive biases, we all are also affected by
cognitive illusions that, simply put, mess up (or negatively affect)
our brains and the way we think. Optical illusions are examples
of this category. Daniel Kahneman, a Nobel Prize winner who has
studied and written extensively about cognitive biases, has said,
“There are many biases, and I certainly do not claim to be immune
from them. I suffer from all of them” (Kahneman, 2013). He has
said that even though his brain knows when he sees a particular
illustration (Figure 1) that the horizontal lines are the same length,
when confronted with the drawing, he perceives the middle line as
longer than the others. Having methods and tools that help us avoid
or minimize the influence of these cognitive biases and illusions can
make our investigations more data-driven and effective.

Figure 1. Müller-Lyer illusion: An example of a cognitive illusion.

SPECIFIC METHODS AND TOOLS

Many of the methods and tools described below can be used for both
collecting and analyzing data. For simplicity, they are all described
in this chapter. Tools and approaches that are more specific to
understanding human factors that lead to human error are examined
in Chapter 8. The International Society for Pharmaceutical Engineering
(ISPE) and the Parenteral Drug Association (PDA) have prepared a
summary of tools that can be used in quality improvement and root
cause investigations that is available on the web (ISPE and PDA,
2019).

Flow charting/Process mapping

A flow chart or a process map is a visual representation of the
sequenced steps or tasks in an activity that uses standardized
symbols. Flow charts can be presented in a number of ways, from simple
line drawings (Figure 2) to more complicated swim-lane diagrams
(Figure 3). Flow charts are very effective ways to document,
communicate, and better understand what goes into, comes from,
and happens within a process. Some organizations attach flow
charts to their procedures in order to show a graphic description of
the task that is defined in the procedure and how that task fits into
the larger process or system.

The symbols used in flow charting have long been used in data
processing and information technology (IT) applications. Figure 4
shows some of the more commonly used symbols.

Figure 2. Example of a process map (PDA, 2008)

Figure 3. Example of a swim lane flow chart (Wikipedia, 2019)

Figure 4. Examples of flow chart symbols originally used in computer
programming

One variation on a detailed process map would be to use it
in identifying sources of contamination. For example, if one were
concerned about cellulose fibers getting into an injectable product,
the process map could be annotated to show where paper, cardboard,
wipes, and the like were found or used.
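
An annotated process map of this kind can be held as a simple ordered list. The sketch below is only an illustration: the step names and the observations at each step are invented, and `fiber_sources` is a hypothetical helper.

```python
# Hypothetical process steps, in order, each with the fiber-shedding
# materials observed there. All names and observations are invented.
process_map = [
    ("Component receipt", ["cardboard cartons"]),
    ("Component washing", []),
    ("Filling", ["paper batch record", "wipes"]),
    ("Capping", []),
]

def fiber_sources(steps):
    """Return only the steps where a possible cellulose source was noted."""
    return {step: items for step, items in steps if items}

sources = fiber_sources(process_map)
for step, items in sources.items():
    print(f"{step}: {', '.join(items)}")
```

Keeping the steps in process order means the output reads like the map itself, with only the annotated steps shown.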

Cause and effect diagram/Fishbone diagram

This is also known as an Ishikawa diagram because of the work of
Japanese statistician Kaoru Ishikawa in the 1940s in formalizing
this tool. The fishbone diagram helps event investigators collect
and organize causes—direct, contributing, and root—that are
potentially related to the unwanted event that is identified in the
“head” of the fish, usually on the far right side of the diagram. Early
versions of the diagram used four categories known as the “four
Ms”—machine, materials, methods, and manpower. More recently,
practitioners often use six categories (Figure 5):

• Methods

• Materials

• People

• Equipment

• Facilities

• Environment (or “mother nature”)

Other categories can be used or added as appropriate and can
organize ideas that come from brainstorming or as information
emerges from the investigation. Sometimes investigation team
members get frustrated over what category to put something in:
is it materials, or is it the method used to inspect the materials? It
usually doesn’t matter; the important thing is getting the item on the
diagram. When using a fishbone diagram, it is important to “drill
down” by asking why or how. In this way finer and finer details (the
smaller bones in this analogy) are documented.

A fishbone diagram can also be useful as you are writing the
report and methodically describing what potential causes were
considered and which items were rejected as causes, and why.

Figure 5. Example of a fishbone diagram
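
For teams that keep investigation notes electronically, a fishbone can be held as nested categories, with the "drill down" captured as deeper nesting. This is a minimal sketch: the six category names come from the list above, while the event, the causes, and the `render` helper are invented for illustration.

```python
# A fishbone as nested dictionaries: category -> cause -> finer causes.
# The event and all causes are invented.
fishbone = {
    "event": "Particles found in filled vials",
    "bones": {
        "Methods": {"Gowning procedure unclear": {}},
        "Materials": {"Stoppers shed particles": {"New supplier lot": {}}},
        "People": {},
        "Equipment": {},
        "Facilities": {},
        "Environment": {},
    },
}

def render(bones, depth=1):
    """Return one indented line per bone; finer details are indented more."""
    lines = []
    for name, children in bones.items():
        lines.append("  " * depth + name)
        lines.extend(render(children, depth + 1))
    return lines

lines = render(fishbone["bones"])
print(fishbone["event"])
print("\n".join(lines))
```

Empty categories are kept on purpose: in the report they show that the category was considered even though no cause was found there.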

Brainstorming
The term “brainstorming” was used first by advertising executive
Alex Faickney Osborn in 1948 and then later in his 1953 book, Applied
Imagination, to describe a method of generating novel ideas (the
wilder and crazier the better). Since then it has been a method used
to solicit ideas or thoughts for almost anything, including potential
hazards and, in our case, reasons for an unwanted event.

Osborn identified four key rules of brainstorming (in Kartoglu,
2018):

1. The more ideas the better.

2. Practice no criticism.

3. The wilder the idea, the better.

4. Complement and improve already existing ideas.

Usually brainstorming is led by a facilitator who encourages the
team and reminds the members of the four rules identified above.
Sometimes, someone is asked to be the scribe and write the responses
on flipchart paper. While generating ideas this way is often useful,
it is subject to some of the biases mentioned earlier, such as
authority bias: people will sometimes not speak up and give an
idea if it is at odds with what a more highly regarded or organizationally
senior person would say. Or, if someone has a louder voice and is
someone team members might regard as having more knowledge
and confidence, his or her ideas may drown out the voices of others.
One way to minimize these biases is to have participants write
down their ideas on Post-It® notes. The facilitator can then clarify
and distill the ideas; these ideas can then be used to augment the
investigation and as inputs for other tools such as fishbone diagrams
and fault trees.

Timelines
A relatively simple tool that has special utility when you are trying to
understand an event where the sequence or chronology is important
is a timeline. A simple timeline (Figure 6) could be used in presenting
information about an event, while a comparison timeline (Figure
7) can be used in helping to understand the difference between
two or more items; change analysis (described below) would be a
complement to this tool.

Figure 6. Simple timeline to show chronology or sequence

Figure 7. Timelines used to show comparison of two events
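
Both kinds of timeline can be built from timestamped records. The sketch below is an illustration under stated assumptions: the two batches, their events, and the `timeline` helper are all invented, and comparing step descriptions is only one simple way to spot differences.

```python
from datetime import datetime

def timeline(events):
    """Sort (timestamp, description) pairs into chronological order."""
    return sorted(events, key=lambda e: e[0])

# Invented records for a batch that failed and one that passed.
failed = timeline([
    (datetime(2020, 3, 2, 9, 40), "Mixing started"),
    (datetime(2020, 3, 2, 8, 0), "Tank cleaned"),
    (datetime(2020, 3, 2, 9, 30), "Operator shift change"),
])
passed = timeline([
    (datetime(2020, 2, 24, 8, 0), "Tank cleaned"),
    (datetime(2020, 2, 24, 9, 40), "Mixing started"),
])

# Steps seen in the failed batch but not in the passed one are
# candidates for a closer look.
passed_steps = {description for _, description in passed}
only_in_failed = [d for _, d in failed if d not in passed_steps]

for when, description in failed:
    print(when.strftime("%H:%M"), description)
print("Only in failed batch:", only_in_failed)
```

Sorting first matters because records are rarely collected in the order in which things happened.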

Five whys
If you have spent much time with a six- or seven-year-old child, you
probably have experienced being asked multiple “why” questions––

LICENSED TO JOSE CASTELLA

Vesper Book.indb 145 5/29/2020 10:56:14 AM

Licensed to Enger, Tehya/PDA: Copying and Distribution Prohibited.
146 Root Cause Investigations for CAPA: Clear and Simple

Why is the sky blue? Why can you not see the blue at night? Why do
stars only come out at night? And on and on. That same tenacity at
asking questions is what drives the five whys approach to problem
solving. Among those involved in pharma/biopharma
investigations, the five whys method is one of the most commonly
used. The intent of the five whys is to look for the series of
cause-and-effect relationships that form the failure sequence. By asking
these repeated why questions, you are in fact creating a fault tree
(discussed later in this chapter).

The five whys method can be used in two different ways. First, it
can be used to generate hypotheses—what are the possible reasons
that the unwanted event could happen? Think of this as looking at
the problem broadly. Second, the method can help go more deeply
into a particular set of reasons that are evidence-supported.

The method usually starts by looking at the symptom (the feature
of the unwanted event that can be noticed or measured) and asking
why that occurred. That response is followed by another why, and
so on until a root cause is determined. As you are moving through
the series of whys, it is important to look for evidence to support
the particular response. If there is no supporting evidence, you
may be going down the wrong path. Figure 8 shows an example
with supporting evidence. Even though the method is called the
five whys, there is nothing magical about the number of times why
is asked. You may get to the root cause (something that you can
reasonably take action against) in three tries; it could take you six
or seven as well. What is important is getting to the point where
specific actions can be taken.
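
One way to keep evidence attached to each why is to record the chain as data. The sketch below is only an illustration: the `Why` class, the `first_unsupported` helper, and the car example entries are hypothetical and are not taken from Figure 8. It flags the first answer that has no supporting evidence, which is where the chain may be heading down the wrong path.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Why:
    question: str
    answer: str
    evidence: Optional[str] = None  # None means nothing yet supports the answer

def first_unsupported(chain):
    """Return the 1-based position of the first answer lacking evidence, or None."""
    for position, step in enumerate(chain, start=1):
        if not step.evidence:
            return position
    return None

# Hypothetical chain for a car that will not start
chain = [
    Why("Why will the car not start?", "The battery is dead", "Voltmeter reads 9 V"),
    Why("Why is the battery dead?", "The alternator is not charging it", "Charging test failed"),
    Why("Why is the alternator not charging?", "The drive belt is broken"),  # no evidence yet
]

gap = first_unsupported(chain)
```

In this example the third answer is still a guess, so that is the point at which to look for data before asking the next why.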

One caution in using the five whys is that once you start in a
category (for instance, “equipment”), you tend to stay confined to that
category. For example, if you are asking why your car will not start
and you are not looking for evidence as you move down, you can become
locked into a particular pathway (e.g., people or machines). To avoid
this, consider asking why about other fishbone categories as well, for
example, methods or materials. If evidence cannot be found for a
category, you have at least tried to look broadly at potential causes.

Figure 8. Example of five whys along with evidence

Change analysis
Things are working well, specifications are being met, and then all
of a sudden your results are not as expected. You have a problem.
One way to determine the cause of the problem is to perform a
change analysis and look at the differences between the acceptable
and nonacceptable product, process, event, or decision. To isolate
one or more factors responsible for the difference, change analysis uses
a set of categories including what, where, when, who, and extent
(how much or how many). Once the factors that have changed are
identified by this method, the five whys could be used to determine
the root cause of the change.

(Change analysis is also called “is/is not” and is used in the
proprietary Kepner-Tregoe, or KT, method for problem solving and
decision making that was developed in the 1960s.)

For change analysis to be effective, the problem statement needs to be
very specific. For example, “Clinical trial material X was exposed to
temperatures above the 8 °C temperature specification when
being shipped from the manufacturing site to the clinical study site
in Phoenix, Arizona” would be better than “The shipment of clinical
trial material was out of specification.”

A pharma company was reviewing its deviation data and
noticed that it was having recurring problems due to failed line
clearances. (Line clearances are required in filling, packaging,
and labeling operations and involve removing anything product/lot
specific from the work area, including signage and refuse.
Additionally, equipment is disassembled to some extent so as to
ensure that labels or dosage forms have not gotten caught in the
mechanisms of the equipment.) The investigator used
change analysis when collecting and analyzing the data. Look at
Table 1 and see if you can determine the reason for the line clearance
failures.

Table 1. Data used in a change analysis comparing situations at two
different manufacturing and packaging locations.

Problem statement: The number of line clearance failures is higher during some
months than at other times during the year.

What
  IS: There are a higher number of quality incidents due to presence of unwanted (“rogue”) tablets
  IS NOT: There is not a problem with other operational metrics

Where
  IS: The higher failure rate is at the Tech Ops facility
  IS NOT: The higher rate is not seen at other facilities

When
  IS: The higher rate occurs in Feb, Aug, and Nov
  IS NOT: The significantly higher rate does not occur at other times

Extent
  IS: The higher rate is seen across all solid labeling/packaging lines
  IS NOT: There is not one line higher than another at the TO facility

Who
  IS: Most operational associates have >25 years of experience
  IS NOT: Most operational associates at other facilities have 8–12 years of experience

As the investigator did the analysis, he was able to see that the
site where the problems were occurring (and recurring) was where
there were long-service employees who had considerable amounts
of vacation. They would take their vacations in the winter (often
leaving the cold Midwest for warmer climes), at the end of summer
with their children or grandchildren (before school started), and
before the Christmas holidays. This insight brought attention to the
inadequate training of replacement and temporary personnel used
to fill in for the vacationing personnel.
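
The comparison at the heart of change analysis can be sketched as a small function that lists the factors whose values differ between the “is” and “is not” situations. The function name and dictionary keys below are hypothetical. The values loosely paraphrase Table 1, except the "line type" entry, which is an assumption added to show a factor that does not differ.

```python
def changed_factors(is_case, is_not_case):
    """Return {factor: (is_value, is_not_value)} for factors whose values differ."""
    return {
        factor: (is_case[factor], is_not_case.get(factor))
        for factor in is_case
        if is_case[factor] != is_not_case.get(factor)
    }

# Values paraphrased from Table 1; "line type" is an assumed, unchanged factor
is_case = {
    "where": "Tech Ops facility",
    "when": "Feb, Aug, Nov",
    "who": ">25 years of experience",
    "line type": "solid labeling/packaging",
}
is_not_case = {
    "where": "other facilities",
    "when": "no elevated months",
    "who": "8-12 years of experience",
    "line type": "solid labeling/packaging",
}

differences = changed_factors(is_case, is_not_case)
```

The factors that remain (where, when, and who) are the starting points for asking why, as the investigator did when connecting long service, vacation timing, and replacement personnel.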

Five Ws and One H (5W1H)

The only problem-solving tool thought to be based on a poem is
known as 5W1H. It is also known as the “Kipling Method” because
of the first stanza of a poem by Rudyard Kipling (1922) entitled
“I Keep Six Honest Men”:

I keep six honest serving-men
(They taught me all I knew);
Their names are What and Why and When
And How and Where and Who.
I send them over land and sea,
I send them east and west;
But after they have worked for me,
I give them all a rest.

This method is organized around six questions that are asked
by the investigator. The questions are:

• What is the unwanted event?

• Where does the unwanted event occur?

• When does the unwanted event occur?

• Why does the unwanted event occur?/Why is this a problem?

• How did we and will we respond?

• Who is involved/Who are the stakeholders?

Together, the first three questions form the basis of the
problem description; including all of them is similar to the opening
paragraph of a news article that conveys the basic facts. Performing
a root cause analysis using other tools will hopefully answer the
why question. When writing the report, these six categories can also
provide a useful organizational framework.
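
As a minimal sketch of how the six questions might be captured, the record below stores each answer and joins the first three into a one-sentence problem description. The class name, field defaults, and example wording are hypothetical; the example loosely reuses the line clearance case described earlier.

```python
from dataclasses import dataclass

@dataclass
class FiveWOneH:
    what: str
    where: str
    when: str
    why: str = "to be determined by root cause analysis"
    how: str = ""
    who: str = ""

    def problem_description(self):
        """Join what, where, and when into a one-sentence problem description."""
        return f"{self.what} at {self.where} during {self.when}."

record = FiveWOneH(
    what="Line clearance failures increased",
    where="the Tech Ops facility",
    when="February, August, and November",
)

description = record.problem_description()
```

The why field stays open until the other tools have produced an evidence-supported answer.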

Fault trees
A fault tree (FT) is a graphical tool that can be used to identify the
actual or potential root causes of an event; the tool can be used
prospectively (in a risk assessment) or reactively (in an incident
investigation). It can give details about the failure pathway or failure
mechanism and the various factors that are needed for the unwanted
event to occur. FTs are described as “top-down” tools because they
begin with an unwanted event at the top of the diagram and then
work down to identify the root causes. FTs use deductive reasoning
by asking the question, “What caused . . . ?” or more simply, “Why?”
(As mentioned earlier, the five whys method can be used to create
FT diagrams.) When used in a root cause investigation, FTs allow the
team to consider multiple paths to the top event (the failure or fault);
as evidence is collected, it supports particular failure pathways.

Fault tree analysis was developed in the early 1960s by Bell
Laboratories for the launch control system of the US Air Force
Minuteman missile.

Constructing a fault tree diagram requires following a number
of rules to help ensure consistency of use and interpretation. A
well-prepared, detailed diagram shows the root (or basic) causes and
also the other factors that are or are not required for the top event
to occur.

A fault tree begins with the “top event.” Usually this is the
symptom that was observed. The diagram builds incrementally,
moving deeper and deeper into the reasons why this event occurred.
It is important that you move through these systematically; you
want to understand each level of the events.

The standard graphic symbols used in a fault tree are shown in
Figure 9. Figure 10 shows an annotated example of this.

Figure 9. Symbols used in a simple fault tree

Figure 10. Example of a simple fault tree

As with the five whys, fault trees can be used to consider
potential reasons for the failure or incident, which can help guide
the start of the investigation. As evidence is gathered, one particular
section (the actual failure path) may grow in detail.

A relatively simple fault tree can be developed by following
these steps:

1. Identify the top event; this is usually the symptom you
observed.

2. Identify what could cause this top event. These are known as
fault events. Use rectangles to name these; if you do not intend
to go any further into the cause of a fault event, put the name in
a diamond shape. A diamond acts as a placeholder to show that
you considered this but, for whatever reason, are not going into
more detail.

3. Connect this layer of fault events to the top event by using a gate
(a logical connector). Usually at this first level, each of the fault
events you identify is sufficient on its own to cause the top
event, so you use an OR gate. This indicates that any of
the events below the gate is a potential cause of the top event.
If two or more events are needed, meaning that they are both/
all necessary but not sufficient in and of themselves, you use an
AND gate.

4. Consider the evidence or data that you have. What events are
supported by the evidence? Go deeper into these by asking
what caused each of these events based on the data you have.
If you are making an informed guess, this would be where you
would search out data to support or reject your idea. Again,
think incrementally. Add AND or OR gates as appropriate.

5. Continue until you get to the basic (or root) causes, where you
can take action so these things do not happen again.

As you can probably see, this is a more structured way of doing
the five whys, with the benefit of being able to see where additional
contributors to the event (those that you noted with the AND gate)
are required.
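
The gate logic in the steps above can be sketched in a few lines. This is a minimal illustration, not a full fault tree tool: the nested-tuple representation, the `occurs` function, and the example causes are all assumptions made for this sketch, built around a hypothetical top event of a shipment exceeding 8 °C.

```python
def occurs(node, confirmed):
    """Return True if this node's event occurs, given the confirmed basic causes."""
    if isinstance(node, str):               # leaf: a basic (root) cause
        return node in confirmed
    gate, children = node                   # internal node: (gate, [child nodes])
    if gate not in ("AND", "OR"):
        raise ValueError(f"unknown gate: {gate}")
    results = [occurs(child, confirmed) for child in children]
    return all(results) if gate == "AND" else any(results)

# Hypothetical tree for the top event "shipment exceeded 8 °C"
top_event = ("OR", [
    "refrigeration unit failed",
    ("AND", ["too little coolant packed", "shipment delayed in transit"]),
])

delay_only = occurs(top_event, {"shipment delayed in transit"})
delay_and_coolant = occurs(
    top_event, {"shipment delayed in transit", "too little coolant packed"}
)
```

A delay alone does not produce the top event in this tree because it sits under an AND gate; it needs the second contributor. The refrigeration failure sits directly under the OR gate and is sufficient on its own.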

Observation
Looking and seeing employ the sense of sight, but observation takes
seeing to another level. Sherlock Holmes, the detective created
by the author Sir Arthur Conan Doyle (1930), has an exchange with his
colleague, Dr. Watson, in the story “A Scandal in Bohemia,” and
makes the distinction between seeing and observing. At issue was
whether Watson, who had been up and down a set of stairs on many
occasions, knew how many stairs there were. Watson claimed his
eyes were just as good as Holmes’s, to which Holmes replied:
“Quite so,” [Holmes] answered, lighting a cigarette, and throwing
himself down into an armchair. “You see, but you do not observe. The
distinction is clear. For example, you have frequently seen the steps
which lead up from the hall to this room.”
“Frequently.”
“How often?”
“Well, some hundreds of times.”
“Then how many are there?”
“How many? I don’t know.”
“Quite so! You have not observed. And yet you have seen. That is just
my point. Now, I know that there are seventeen steps, because I have
both seen and observed.” (Emphasis added.)

Observation is an important method of collecting information in
an investigation. Observation takes work and often uses
our other senses beyond sight, such as smell, hearing, and touch, as
a complement to our vision.
Amy Herman, an art historian, lawyer, educator, and author
of the book Visual Intelligence (2017), teaches first responders and
healthcare professionals how to really observe a scene through the
use of artworks. The intent, she says, is helping people develop the
skill “in understanding how to look slowly and how to look more
carefully.” Her approach has four steps which she calls the “Four
As,” some of which will be discussed in more detail in later chapters:

1. Assess: What do you see in front of you?

2. Analyze: What is important? What is there? What is missing?

3. Articulate: Telling someone what you see or writing about it.

4. Act: Make a decision; do something with the information that
you now have.

Within the Assess step, she uses another acronym, COBRA, to guide
the observer through the process:

Camouflage: Some things are more difficult to find than others.
Hunt for what might be hidden: items that are small, blend into the
background, or are part of a pattern. Be aware that we all have blind
spots where we might miss something important.

One thing at a time: Observe a scene methodically, perhaps by
superimposing a grid (which could be real or virtual) and examining
each unit section by section. As you are doing this, do not try to
do anything else because attempting to multitask causes you to put
your attention on information that is not relevant.

Break often: Intensively focusing your attention consumes
considerable mental energy, so it is important to take a break from your
observation. Every 20 minutes, take five minutes and do something
else. If you are doing this for 90 minutes, take 10 minutes to relax.

Realign expectations: We all can be subject to selection bias,
where we see what we want to see, or confirmation bias, when
we give importance to findings that support our hypothesis and
disregard (or explain away) things that don’t fit. When we are too
attached to a position or a hypothesis, we are less skeptical about
what confirms our position. Turn off your filters; be willing to be
surprised.

Ask: Ask someone else to observe with you. They may have a
very different set of experiences and expertise that allows them to
pick up something you didn’t see or didn’t think was important. Try
to find someone who is neutral and not as invested in the situation
as you are. They may also be able to more precisely name something
that can help as you communicate or articulate your findings.

While observing, take notes that are objective and descriptive;
minimize subjectivity here and do not draw conclusions. Instead
of writing, “Many of the incoming pallets of raw material were
damaged and spilling sodium chloride,” write “The first four of
eight pallets of 40-pound sodium chloride bags unloaded from the
semi had tears and rips on the lowest two layers; white crystals that
appear to be sodium chloride are seen on the floor of the truck.” Also,
it is not yet time to conclude that the rips in the bags were caused by
poor handling or truck-loading practices. That can be determined
with more investigation.

A3
The A3 approach to problem solving is really more than a simple
tool: it is a formalized method that takes one through the
problem-solving and improvement process. The result is a record of the
investigation and its results that is used for documentation and
communication. The approach comes from the Toyota Quality
System (Shook, 2009) and is centered on a piece of ISO A3 paper,
11.7 x 16.5 inches (29.7 x 42.0 cm), which has formally designated
places for presenting different types of information. And yes, the
method provides a defined way to fold the paper so it fits neatly in
a report or folder.

Figure 11. Example of an A3 problem-solving template used to guide
users through the problem-solving process and document results.

Several different variations of the A3 template exist. The sections
are described in more detail below:

Header: Basic information for identification and approval
purposes.

Background: Contextual information about the situation, e.g.,
where, when, who, how discovered.

Current conditions: The problem that was observed, for
example, black particulates found on the outside surface of 325 mg
aspirin tablets.

Goal/targets: The underlying reason why you want to
fix the problem: the goal or targets were not met. For example, it
could be “protect the product from contamination by ensuring that
HVAC systems are properly maintained,” or “ensure data quality/
data integrity by reducing documentation errors on batch records
for all products made at the Mayton site.” This can help as you later
look to see where else this corrective action may apply.

Cause analysis: The results of your root cause analysis would
go in this box; however, this is where the one-page A3 tool may
be too limiting for pharma/biopharma investigations. In the
Toyota Quality System model, this is where the results of the five
whys would be entered or the output from another type of tool. In
pharma/biopharma investigations, health authorities also want to
know what you considered, ruled out, and why. You could use this
A3 Cause Analysis space for a summary or conclusion of your root
cause analysis and have an attachment that provides more details.

Proposed countermeasures: A countermeasure is not
necessarily the solution that addresses the root cause(s), but an
action or set of actions that help you achieve your target or return
you to the “should be” state. This could mean also including a short-
term fix that would be the immediate action to prevent the problem
from getting worse (“stopping the bleeding”) and the correction (a
remedy to fix what was affected by the incident if that is possible).
As countermeasures are selected it is important that they are aligned
with the root, contributing, and proximal cause(s) identified.

Plan: Defining, communicating, and executing the plan of
action that would include the countermeasures that were identified.
This would be a typical project management plan with actions,
timeframes, and assignment of responsibilities.

Follow-up: Evaluating the results would be an effectiveness
check done in different ways at different times. As discussed in a
later chapter, this could be a very simple confirmation done during
a quality audit or a much more extensive “Stage 3” ongoing process
verification.

As can be seen in this description of the A3 sections, they
guide the user through all the steps of the overall investigation and
corrective action process.

Events and Causal Factors Charting and Analysis

A graphical tool (Figure 12) that is useful when initially collecting
information and then when analyzing for the root and contributing
cause(s) is Events and Causal Factors (ECF). It is referred to as a
“chart” when collecting and organizing the information; “analysis”
is used when causes (explanations as to why the event happened)
and corrective actions are being identified. The predominant feature
of ECF is a chronological or sequential timeline on which events,
actions, decisions, and contextual information are placed. It is a very
useful tool in creating a graphical model that allows a group to have
a common understanding of what happened.

ECF is not usually appropriate if the event is simple and the
cause-effect relationships are obvious.

There are several variations of ECF that have been published
(US NRC, 2010; NRIF, 2014). A proprietary version called TapRooT®
is similar to ECF.

ECF charts can be developed as the investigation unfolds and more
evidence and knowledge become available. Doing a logic check at
the end can show where there are knowledge gaps that require more
investigation. The ECF chart can help generate the story or narrative
that describes what happened and why; it is usually attached to the
investigation report.

A limitation of the ECF tool is that it is very linear in structure,
i.e., events are arranged by sequence or timeline. However, by using
extensions (conditions, controls that worked/did not work) that
are placed above or below the sequenced items, one can see how
these had or did not have an impact on the failure sequence.

When preparing an ECF chart, a standard set of symbols is used
as shown in Table 2. Some practitioners use standardized colors of
Post-It notes instead of symbols and shapes.

Table 2. Symbols used in Events and Causal Factors charting
(symbol graphics not reproduced)

Name: Event
Meaning: An action that someone performs or something that happens.
Should include the actor (who: a person, machine, or instrument);
activity (did what); and object (the thing to or on which the
activity was performed).

Name: Decision
Meaning: A choice that is made before an action is taken.

Name: Context
Meaning: Factors that affect a decision or an action. Relevant
background information that helps understand what was going through
a person’s mind, e.g., competing priorities, cultural issues. Can
also include an explanation of the event, such as speed,
specification, or result and controls that were present and worked.

Name: Failed or nonexistent control
Meaning: A method to prevent or a way to detect an adverse event,
action, or deviation.

Name: Deviation, accident
Meaning: The unwanted event that causes some sort of anomaly,
variance or harm.

Name: Connector between events; Connector from a condition

Name: Assumed event
Meaning: Data is not available to confirm event; no solid data
available. (Dotted lines can make any symbol into one that is
assumed.)

ECF charting usually starts with noting the
observed symptom that caused people to notice that an unwanted
event occurred. Some practitioners begin with finding all the initial
events (on the far left) and then move right toward the unwanted
event itself. Personal experience has shown that starting on the
right and moving left is a more effective way to collect information,
although it can be subject to hindsight bias (seeing how events
appear to align in causing the accident or deviation).

The horizontal display of actions, events, and symptoms
identifies the “whats” associated with the unwanted event. It is
important to understand why these occurred; for this we describe
the context or underlying factors behind the actions and decisions.
These could be controls that were present but didn’t work, or someone
having an incorrect mental model or picture of how something was
supposed to work. These factors are placed vertically above or
below the respective action, event, or decision. Included would be
procedures intended to describe the task with the relevant steps and
sub-steps, as well as information available to the personnel at the
time that was to be used in making a decision.

The vertically-placed conditions should also include
organizational and personal factors that may have influenced or
impeded the performance.

Information that is added to the ECF chart can come from any of
the other tools we have discussed; the chart is a tool used to organize
and display the information so that it can be more easily analyzed to
discover the root, proximal, and contributing causes.

Using the completed chart, you can create the narrative (tell
the story) of how the unwanted event occurred. Based on the chart,
you should be able to include information about the intent of the
process, how it was progressing and being (or not being) controlled,
actions taken, potential goal conflicts, and why the course of actions
and decisions made sense to the person at the time (also known as
bounded rationality [Simon, 1982]). This narrative can be included
in the investigation report; the ECF chart should also be included,
as it is a graphical summary of what happened. Figure 12 presents an
actual event due to a mix-up and the resulting ECF chart.

Figure 12. Example of an events and causal factors chart with
information about a specific mix-up

Once the chart has been constructed, analysis can begin by
asking questions like:

• What was the unwanted event that resulted in the initial
symptom?

• What was the action, event, or decision that caused the unwanted
event? This is the “proximal event.”

• What were the events that began the path toward the ultimate
unwanted event (e.g., failure)? (Note: you may find more than
one of these!) These would be the “root cause(s).”

• What were the events or conditions that didn’t cause the
unwanted event but made it worse, faster, or more
intense? These are the “contributing causes.” Consider these
items as you are establishing your corrective actions.

• What are the controls that did not work or did not exist? These
are candidates for corrective actions.
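
The last question above can be sketched in code once the chart is held as data. In this illustration, each item on the horizontal timeline carries the conditions placed above or below it, and a helper pulls out the failed or nonexistent controls. The `ChartItem` class, the `failed_controls` function, and the mix-up entries are hypothetical and are not taken from Figure 12.

```python
from dataclasses import dataclass, field

@dataclass
class ChartItem:
    order: int        # position on the horizontal timeline, left to right
    kind: str         # "event", "decision", or "deviation"
    text: str
    # (kind, text) pairs placed above or below the item,
    # e.g. "context" or "failed control"
    conditions: list = field(default_factory=list)

def failed_controls(chart):
    """Return (item text, control text) pairs for each failed or nonexistent control."""
    found = []
    for item in sorted(chart, key=lambda i: i.order):
        for kind, text in item.conditions:
            if kind == "failed control":
                found.append((item.text, text))
    return found

# Hypothetical label mix-up
chart = [
    ChartItem(3, "deviation", "Wrong labels applied to lot"),
    ChartItem(1, "event", "Operator picks label roll from staging cart",
              [("context", "Two similar label rolls staged side by side")]),
    ChartItem(2, "decision", "Operator skips second-person check",
              [("context", "Line behind schedule"),
               ("failed control", "Second-person verification not performed")]),
]

candidates = failed_controls(chart)
```

Keeping the conditions attached to the item they explain mirrors the vertical placement on the chart, so the narrative and the corrective action candidates come from the same record.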

SO WHAT TOOL SHOULD BE USED? TOOL SELECTION GUIDANCE
For those who have spent years conducting investigations, intuition
is often their guide in selecting the tool or tools to use. For those
starting out, some rules of thumb may be useful:

• Tools often have an inherent primary utility, for example:

  – Data collection: Observation, interviews, flow charts,
    process maps, brainstorming.

  – Data organization: Timelines, fishbone diagrams, A3, fault
    trees, causal factors charting.

  – Data analysis: 5Ws and 1H, five whys, causal factors
    analysis.

• Some tools can be used in multiple ways, such as causal factors
and the five whys.

• Tools have a range of simplicity and difficulty.

In selecting the tool or tools to use, consider the goal (collecting,
organizing, or analyzing data) and then try using the simplest one.

As you move through the process, you might find the tool being
used is not helping you achieve your objective. In that case, try
another one.

CONCLUSION
As with other activities—preparing lab samples, doing home
repairs, performing surgery—having the right tools will contribute
to better and more efficient outcomes. The same holds true when
investigating an unwanted event; there are a number of tools beyond
the commonly used fishbone and five whys. Taking some time to
learn and apply them can greatly improve your investigational
skills.

REFERENCES
Doyle, A.C. (1930) The complete Sherlock Holmes. Garden City, NY:
Doubleday.

Faickney-Osborn, A. (1953) Applied Imagination: Principles and
Procedures of Creative Problem Solving. New York, NY: Charles
Scribner’s Sons.

Herman, A. (2017) Visual Intelligence. New York, NY: Houghton
Mifflin Harcourt.

ISPE and PDA (2019) ISPE – PDA guide to improving quality culture
in pharmaceutical manufacturing facilities.
https://ispe.org/sites/default/files/regulatory/ispe-pda-guide-to-improving-quality-culture.pdf.
Accessed 5 Mar 2020.

Kahneman, D. (2013) Socialsciencebites.com podcast, January 2013.
http://socialsciencebites.libsyn.com/daniel-kahneman-on-bias.

Kartoglu, U. (2018) Go authentic. http://kartoglu.ch/goauthentic/.
Accessed 6 Feb 2020.

Kipling, R. (1922) Just So Stories. Garden City, NY: Doubleday, Page
& Co. https://www.loc.gov/item/41042441. Accessed 21 Apr 2020.

NRIF (2014) Events and conditional factors analysis manual, Second
edn. Noordwijk Risk Initiative Foundation.
http://www.nri.eu.com/ecfa.html. Accessed 6 Feb 2020.

PDA (2008) Technical Report #54: Quality Risk Management for Aseptic
Processes. Bethesda, MD: PDA.

Shook, J. (2009) Toyota’s secret: The A3 report. MIT Sloan Management
Review, July. sloanreview.mit.edu/article/toyotas-secret-the-a3-report/.
Accessed 18 Apr 2020.

Simon, H. (1982) Models of Bounded Rationality and Other Topics in
Economics. Cambridge, MA: MIT Press.

US NRC (2010) Root cause evaluation manual.
https://www.nrc.gov/docs/ML1021/ML102120054.pdf. Accessed 20 Mar 2020.

Wikipedia. (2019) Swim lane (image used under creative commons
license). Accessed 25 Mar 2020.

INTERVIEWS

One of the primary ways of collecting information concerning
a quality event is talking with those who were involved or with
subject matter experts who can share their experience and analysis.
On the face of it, it seems rather simple—ask the questions, get the
answers. In reality, however, it can be complicated by schedules,
organizational dynamics, fear, and how the mind works. In this
chapter we will examine these potential barriers to collecting
information by way of interviews and ways they can be overcome.

INTERVIEWS COMPARED TO INTERROGATIONS

When many of us think of interviewing a person, the visual that
comes to mind is what we see on a television drama: the good cop/
bad cop routine or Jack Bauer on the television show 24. In most
of those cases, we’re not seeing an interview—we’re watching an
interrogation. There is a significant difference. Interrogation involves
a presumption that the person you are talking with is a suspect
for an unwanted act. The tone of the interaction is accusatory and
has the goal of seeking a confession or obtaining evidence that the
person usually does not want to give up. An interview, on the other
hand, is performed in a non-accusatory and conversational way: the
interviewer is seeking information and understanding. (One of the
best interviewers these days is Terry Gross on the National Public
Radio program Fresh Air.)


FEAR
As you are getting ready to meet with those involved in the event,
think of their predicament. Because you want to talk with them,
they may fear that their job is on the line. As was discussed in
Chapter 5, the concept of psychological safety is critically important.
If personnel perceive that there is a safe space where they can be
honest and vulnerable without fear of retribution, they will be much
more willing to be candid (Edmondson, 2018).

In the North American pharma/biopharma industry, we see a
wide spectrum of practices that are rooted in the company’s culture.
On one side, there is the “no fault/no blame/no fear” culture, where
the emphasis is on people stepping up and identifying where they
have made a mistake or contributed to a problem without any fear of
retribution (see Chapter 5). Those working for such an organization
sometimes complain that this approach reduces individual
accountability and promotes carelessness, since if personnel make a
mistake they just need to tell their supervisor. At the other extreme,
there are organizations trying to prevent problems and losses by
scaring people into not making mistakes. They do this by requiring a
person to have a face-to-face encounter with a senior leader or by
assigning demerit points; if a certain number are accumulated in
a period, a formal reprimand is given or the person is dismissed.
This can obviously cause apprehension that affects the sharing
of information or encourages presenting it in such a way as to
protect oneself or others. One of W. Edwards Deming’s principles
was “drive out fear, so that everyone may work effectively for the
company” (Deming, 1982, p. 24).

AN INTERESTING CASE STUDY OF HOW OUR MEMORIES CAN WARP
If you are a follower of television news in the US, you undoubtedly
know of the problems that Brian Williams, the NBC television
network news anchor, experienced in early 2015. If you had
not heard, he was found to have embellished his experience as a
passenger when he was flying in a military helicopter in Iraq while
covering a news story in 2003. He claimed his helicopter was hit by
antiaircraft fire, when, in actuality, it was the lead helicopter that was
struck; his aircraft was not affected. In April 2015, it was reported
that there were other exaggerations attributed to him (Roig-Franzia
et al., 2015; Farhi, 2015).

There are two different views we can take in considering
how Mr. Williams’s story changed over time. On one hand, his
exaggerations might be blamed on showmanship or intentionally
telling the proverbial “fish that got away” story. On the other hand,
his inflation of the facts could be due to what happens when one’s
memories are saved, retrieved, subtly changed, resaved, and then
found to have drifted considerably from the original event (Chabris
and Simons, 2014), causing a “false memory” (Robb, 2015). (Either case is a
bad situation for a journalist to put himself in.) If we think of it this
way, Mr. Williams’s predicament provides a learning opportunity
for those involved in conducting interviews related to deviation and
quality event investigations.

HOW MEMORIES ARE CREATED—AND RECREATED

To better understand the cognitive interview process, which is
a widely accepted and evidence-based approach for obtaining
information from witnesses (Wikipedia, 2020), we need to first have
a high-level understanding of how our brains store and retrieve
information. Information is thought to have a “storage” strength and
a “retrieval” strength. As we hear, read, or experience an event, that
experience initially goes into our short-term memory storage, where
there are relatively simple chemical changes to the brain’s synapses
that form the electrical connections between neurons (Carey, 2014;
Bjork, 2012). For the information to become a long-term memory,
the neurons must make proteins that consolidate the memories into
the brain cells. Pulling the information from the brain requires
retrieval of that memory, a task that is made easier each time that
memory is retrieved. Studying by the use of flash cards or quizzes or
teaching a concept to someone (or yourself) strengthens retrieval, in
part by re-storing that information in the neurons and increasing the
links that the particular memory has with other experiences. Studies
have shown self-quizzing is far more effective for long-term storage/
retrieval than just multiple re-readings of that information.


But the retrieval and re-storage of the information may cause the
memory to be altered in some ways (Bridge and Paller, 2012). For example,
if you are telling a colleague the story of an event you witnessed,
you may see their eyes widen at a certain point or that they appear
to be bored at another point. Your brain is registering this, saving
the additional new “meta” information as it re-stores and forms a
new, “richer” memory of the story you are telling. The next time
you tell the story, you might give a subtle punch to the part that
excited the previous listener and quickly gloss over the part that was
a bit duller. Doing this several times can wipe out the initial version
of the story, replacing it with a new, slightly different account.
Conceivably, this could have been what affected Mr. Williams’s
memory if he did multiple retellings of his story.

WAYS TO OBTAIN THE MOST ACCURATE RECOUNTING OF AN INCIDENT
If you need to collect information from a witness or someone
involved in a quality event or incident, there is some guidance that
is based on cognitive research and investigators’ experience. The
first point is to try to get the information as soon after the event as
possible. Accident investigators refer to the optimal time window
for this as the “golden hours” (College of Policing, 2019) that last
up to 24 hours after the incident. After that time, memories fade or
change and important pieces of evidence degrade or disappear. The
sooner you can talk with those involved, the better!
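
If your deviation system records timestamps, the golden-hours guideline can be checked with a few lines of code. The following is a minimal sketch, assuming a 24-hour window as described above; the function names and example times are hypothetical and not part of any particular quality system.

```python
from datetime import datetime, timedelta

# Assumption: the "golden hours" window is treated as 24 hours after the event.
GOLDEN_HOURS = timedelta(hours=24)


def within_golden_hours(event_time: datetime, interview_time: datetime) -> bool:
    """Return True if the interview starts no more than 24 hours after the event."""
    elapsed = interview_time - event_time
    return timedelta(0) <= elapsed <= GOLDEN_HOURS


def hours_remaining(event_time: datetime, now: datetime) -> float:
    """Hours left in the window; 0.0 once the window has closed."""
    remaining = event_time + GOLDEN_HOURS - now
    return max(remaining.total_seconds() / 3600.0, 0.0)
```

A scheduling tool could use `hours_remaining` to flag interviews that have not yet been booked while the window is still open.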

The location of the interview can also be important. Doing it
in the work environment of the interviewee or where the incident
occurred can make the person more comfortable. Another advantage
is that seeing the specific location allows the investigator/interviewer
to apply their observation skills (see Chapter 9).

In some situations, people who witness an incident are asked to
write a statement of what happened. This can be helpful, but those
who have researched knowledge acquisition and management have
an expression: “We know more than we can say; we say more than
we can write” (Polanyi, 1966, p. 4). So while a written statement might
have some utility, getting the person to talk about what happened
by using a structured interview process is preferred.

THE COGNITIVE INTERVIEW PROCESS

The widely accepted approach for doing interviews of witnesses and
those involved in an incident is called “cognitive interviewing” and
is structured in a way to maximize the reliability of the information
being obtained (Fisher and Geiselman, 1992). Doing this as soon after the event
as possible, within the golden hours, helps to minimize the drift
in the story that can occur. The cognitive interview process has five
specific tasks; each is identified below and described in some detail.
Table 1 summarizes these points and specific suggested questions.

Table 1. Interviewing guidelines based on the Cognitive Interview Process by Fisher and Geiselman (1992)

Each entry lists the phase/step, its goal, an example question or instruction, and comments (where given).

1. Introduction

Step: Develop rapport
Goal: Develop a personal, meaningful relationship between interviewer and interviewee.
Example question or instruction: What is your typical day like?

Step: Encourage interviewee to be involved
Goal: Have interviewee understand that he or she is an important source of information.
Example question or instruction: Since you were involved in this, we need to get your perspective of things. We’re after facts and not wanting to place blame.

Step: Need for concentration
Goal: Let interviewee know that remembering and discussing this incident takes effort.
Example question or instruction: I know this is going to take some work on your part. I’m going to ask you to concentrate on what was happening before and during the incident.

Step: Transfer control
Goal: Let interviewee know that since they were present or involved in the incident, the interviewer will be following their lead as they describe the event.
Example question or instruction: We’ll start with you going from the beginning of the situation to the end. I’m going to do my best not to interrupt you, because I don’t want to interfere with your telling me the story. Then, we’ll go back over some of the specifics and try to probe them in more detail.

Step: Detailed recall
Goal: Ensure that interviewee knows that even small details (sounds, smells, feelings) are important to note; they shouldn’t self-censor.
Example question or instruction: As you go through the events, we need to have you give details like what sounds you heard, if there were any things that were odd or different. Don’t worry about going too deep.

2. Open-ended narration

Step: Restatement of context
Goal: Have the interviewee mentally go back to the time and place of incident, recreating as many factors as possible.
Example question or instruction: Before you begin to tell me what happened, take a few minutes, and in your mind, go back to the event. Think about where you were, what was going on, anything you saw, heard, felt, or smelled. What were you thinking and feeling at that time? Take a bit of time, and then we’ll talk about the event.
Comments: Give interviewee enough time to do this; have them signal when they are ready to continue.

Step: Begin the narration
Goal: Have the interviewee present the story, with the interviewer identifying items to probe in more detail.
Example question or instruction: Tell me, in your own words, from the beginning to the end, what happened.
Comments: Do not interrupt; it will frustrate the interviewee! Take brief notes using the exact words spoken.

3. Ask follow-up questions; probe for details

Step: Principle of detail
Goal: Obtain more details of each “scene” or section of the story, beginning with the scene that the interviewer feels is most promising.
Example question or instruction: Let’s start with… Tell me about… (using the interviewee’s own words as captured above).
Comments: Avoid asking added questions while interviewee is thinking. Ask fewer but open-ended questions. Tell interviewee to be as specific as possible and add as many details as they remember. Pause for 3–4 seconds after each answer is given. When appropriate, ask interviewee to make a sketch or drawing.

Step: Principle of momentum
Goal: Focus on each scene or section of the story; do not jump around.
Example question or instruction: You’re talking about the sound you heard when the unit malfunctioned. Was there anything else that you noticed at the time?

Step: Multiple and varied recall
Goal: Enhance recall by asking the interviewee to tell about the event in reverse order or from a different point of view.
Example question or instruction: Let’s start at the end of the event when you realized the malfunction... What were you doing right before that? You said your supervisor was on the other side of the room. What do you think she would have seen or heard?
Comments: Do the reverse order only when the story has been completely told and other follow-up questions have been asked; otherwise it could distort the chronology in the witness’s mind.

4. Review

Step: Check for accuracy
Goal: Review what the interviewee said so as to ensure accuracy and allow the interviewee to add any additional information.
Example question or instruction: Let me see if I’ve gotten this correct. On Tuesday at 2 PM, you were…
Comments: Clarify uncertainties or discrepancies. Read back your notes. Point out and try to clarify any contradictions; allow for interviewee to say, “I don’t know…”

5. Close

Step: Thank interviewee
Goal: Share your appreciation with interviewee for the assistance.
Example question or instruction: I know that you’re busy, but I really appreciate your taking time to talk with me.

Step: Describe next steps
Goal: Mention what happens next, that you’ll be available if any new information is available, etc.
Example question or instruction: We’ll be looking at some of the other information we’ve gathered to try and understand why things happened the way they did. In the meantime, if you think of anything else, please give me a call. Here’s my number and email address…

1. Introduction: During this first step, the interviewer establishes
rapport with the interviewee and starts to build a level of trust.
One doesn’t just begin asking for incident-related facts, but
rather, uses neutral questions that can build a relationship. For
example, “What do you do in a typical workday?” or “What
was the path that got you to your current position?” The
interviewer will also mention that the interview process requires
concentration and a recall of details that the individual, at the
outset, may not think of as important or valuable. This could
include anything related to sounds, smells, and feelings.

2. Open-ended narration: The interviewer next asks the
interviewee to mentally go back to the time and place of the
incident and think about it silently for a moment in order to
reestablish the context. When ready, the person begins to tell the
story. It’s critical to avoid interrupting the interviewee—a story
that is free-flowing is what is desired here. Nonverbal cues like
nodding one’s head can provide useful feedback to the person
giving the information without breaking up the narrative. One
of the most difficult aspects of this technique for the interviewer
is not interrupting the person and asking for more detail; the
opportunity of drilling down occurs after the first complete
telling of the story. While the interviewee is giving the story, the
interviewer should take only minimal notes and focus on what
is being said.

3. Asking follow-up questions, probing for detail: Once the whole
story has been told, the interviewer asks questions, inquiring
about more details. If the interview takes place shortly after
the event, the witness may still have a variety of informational
bits still available—information that hasn’t yet been converted
into long-term memory or simply discarded. The interviewer
can select different scenes or interesting points in the story and
probe for more specifics, or go back to the beginning and follow
the chronology forward. Another useful technique is to ask the
witness to tell the story in reverse order or to ask the interviewee
to give the perspective from another person—for example, what
an operator in an adjacent room might have heard when a piece
of equipment failed.

4. Review: After the information has been obtained, the interviewer
should go back over the facts, asking the witness to confirm or
clarify them as needed. Times, places, names of materials, and
the like are important to verify.

5. Close: Thanking the interviewee and describing the next steps
in the investigation process are done as the interview comes
to an end. The interviewer might also give his or her phone
number to the interviewee in case other details come to mind.
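
For teams that track interviews electronically, the five phases can be treated as an ordered checklist. The sketch below is one hypothetical way to do that: the class name `InterviewRecord` and its methods are illustrative assumptions, not part of the cognitive interview method itself. It simply refuses to mark a phase complete out of order, which mirrors the advice to let the full narrative finish before probing for detail.

```python
from dataclasses import dataclass, field

# The five phases of the cognitive interview, in the order given in the text.
PHASES = [
    "Introduction",
    "Open-ended narration",
    "Follow-up questions",
    "Review",
    "Close",
]


@dataclass
class InterviewRecord:
    """Hypothetical tracker for one interview; names are illustrative only."""
    interviewee: str
    completed: list = field(default_factory=list)

    def next_phase(self):
        """Return the next phase to perform, or None when all are done."""
        if len(self.completed) < len(PHASES):
            return PHASES[len(self.completed)]
        return None

    def complete(self, phase: str) -> None:
        """Mark a phase complete, enforcing the prescribed order."""
        expected = self.next_phase()
        if phase != expected:
            raise ValueError(f"Expected {expected!r}, got {phase!r}")
        self.completed.append(phase)
```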

CONCLUSION
Interviews are an important part of investigating an incident as
one tries to determine the root, contributing, and proximal causes
of a quality event or deviation. Conducting and documenting the
interview as soon after the event as possible helps to minimize the
“drift” of facts and to ensure that some of the subtle information
surrounding the event is retrieved from the interviewee. The
cognitive interview process is a structured approach that can help the
investigation team understand the quality event and increase the
chances of preventing a recurrence.

REFERENCES
Bjork, R. (2012) Storage strength vs. retrieval strength. https://www.youtube.com/watch?v=1FQoGUCgb5w. Accessed 8 Feb 2020.

Bridge, D.J. and Paller, K.A. (2012) Neural correlates of reactivation and retrieval-induced distortion. Journal of Neuroscience, 32(35):12144–12151.

Carey, B. (2014) How We Learn: The Surprising Truth about When, Where, and Why It Happens. New York, NY: Random House.

Chabris, C.F. and Simons, D. (2014) Why our memory fails us. New York Times. https://www.nytimes.com/2014/12/02/opinion/why-our-memory-fails-us.html. Accessed 3 Mar 2020.

College of Policing (2019) Investigation process: Golden hour. https://www.app.college.police.uk/app-content/investigations/investigation-process/ - golden-hour. Accessed 8 Feb 2020.

Deming, W.E. (1982) Out of the Crisis. Cambridge, MA: MIT Press.

Edmondson, A. (2018) The Fearless Organization: Creating Psychological Safety in the Workplace for Learning, Innovation, and Growth. New York, NY: Wiley.

Farhi, P. (2015) NBC News finds Brian Williams embellished at least 11 times. Washington Post. www.washingtonpost.com › lifestyle › style › 2015/04/25. Accessed 3 Mar 2020.

Fisher, R.P. and Geiselman, R.E. (1992) Memory Enhancing Techniques for Investigative Interviewing: The Cognitive Interview. Springfield, IL: Charles C. Thomas.

Polanyi, M. (1966) The Tacit Dimension. Garden City, NY: Doubleday and Company.

Robb, A. (2015) A memory expert explains Brian Williams’s ‘false’ memory. New Republic, 6 May. http://www.newrepublic.com/article/120990/memory-expert-explains-brian-williamss-false-memory-helicopter. Accessed 3 Mar 2020.

Roig-Franzia, M., Higham, S. and Farhi, P. (2015) Within NBC, an intense debate over whether to fire Brian Williams. Washington Post. www.washingtonpost.com › lifestyle › style › 2015/02/11. Accessed 3 Mar 2020.

Wikipedia (2020) Cognitive interview. Accessed 8 Feb 2020.


IMMEDIATE ACTIONS AND CORRECTIONS

When an unwanted situation is observed (through direct observation
of the symptoms or some sort of notification like a complaint), the
actions taken can be important in limiting the impact on the things
of value—people, products, facilities, materials, equipment—that
might be in harm’s way. When the observation is disconnected
in time from when the event actually occurred, for example, a
trend analysis going in an unwanted direction or a serious patient
complaint, immediate actions can occur but will be very different
than if the actual event is caught in progress or right after it
happened. In this chapter we will look at what might be done to
prevent the problem from becoming worse.

IMMEDIATE ACTIONS
Earlier, we defined immediate action as: the action(s) taken to stop the
event or nonconformance from getting worse. In other words, you are
trying to figuratively “stop the bleeding.” Initiating an investigation
because of a deviation would not be considered an immediate action.

The specific course of action taken is highly variable and
obviously depends on the event, but can often be based on several
principles that include:


• Prevent: Addressing the source of the problem by shutting it
down, turning it off, or diverting the hazard. This reduces the
likelihood of other things of value being affected.

• Isolate or quarantine: Separating items that were affected or
potentially affected from those that were not affected by the
source of the harm (i.e., the hazard).

• Inform: Communicating to stakeholders what is happening or
has happened. This could include organizational management,
healthcare professionals, patients, and regulatory officials.
Often there is an escalation procedure to ensure the appropriate
level of management within a firm is informed. In some cases,
such as with field alert reports (FARs) to the US FDA, there are
requirements that the FAR be submitted within three working days.
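
Reporting deadlines like the one for FARs are easy to miscount around weekends. The sketch below assumes the three days are counted as working days (Monday to Friday), ignores public holidays, and starts counting on the day after the firm receives the information. All three are simplifying assumptions, so confirm the actual rule against the current regulation and your firm's procedure. The function name is hypothetical.

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date that falls `working_days` Monday-to-Friday days after `start`.

    Assumption: weekends are skipped and public holidays are NOT considered.
    """
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


# Example: information received on a Thursday.
received = date(2020, 3, 5)
due = add_working_days(received, 3)
print(f"Received {received}, report due by {due}")
```

A production version would need a site-specific holiday calendar, which is deliberately left out of this sketch.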

Some of these immediate actions could be spontaneous
improvisations based on one’s knowledge and experience. Having
a mental model of how the components in a system fit together
would be important in comprehending the signals coming from the
incident. Applying risk-based thinking would also be important.
Two questions one would want to ask are: What are the risks if a
course of action is taken? and What are the risks if nothing is done?

Immediate actions can also be detailed in a procedure or
checklist; everyone likely to be involved should be trained on these.
The procedure might be very informal, as with a mnemonic such as
PASS, which applies when using a fire extinguisher: pull the safety
pin, aim the nozzle at the base of the flames, squeeze the handle,
and sweep from side to side. At other
times, the actions may be well defined in a checklist or procedure
and practiced.

As immediate actions are being taken, it is important to
preserve any evidence that could be important in the investigation.
For example, samples and photos taken after the incident has been
stabilized may be very helpful in understanding the causes.


CORRECTIONS
The International Organization for Standardization (ISO) defined
correction as “Action to eliminate a detected nonconformity”
(ISO 9000:2005 – 3.6.6). Associated with this definition are two notes:

Note 1 – A correction can be made in conjunction with corrective
action (3.6.5).

Note 2 – Corrections can be, for example, rework (3.6.7) or
re-grade (3.6.8).

In the pharma and biopharma world, a correction could be the
rejection of a raw material. Rework or reprocessing is usually not
possible or permitted when making drug product dosage forms;
it might be allowed when producing drug substances or active
pharmaceutical ingredients (APIs). The marketing authorizations
or product approvals/registrations must be considered before doing
a rework or reprocessing.

In some situations the immediate action may also serve as a
correction; for example, correcting an entry on a batch record
when the mistake is initially made. If there is a time lag between
the immediate action and what is done next, that gap separates the
immediate action from the correction. For example, if a sprinkler
head in a fire suppression system (also sometimes called a “drop”)
is hit and a water deluge occurs, one of the immediate actions would
be to turn off the water. The corrections would be to replace the
sprinkler head, discard items damaged by the water, and clean up
the mess.

CONCLUSION
When an incident occurs, the immediate actions can make a
significant difference in the scope and impact of the situation. In
some cases the knowledge of a system and common sense may be
enough; in other cases having some level of preparation (“just in
case”) or even a checklist or procedure may be warranted.

Do not spend a lot of time deciding if something is an immediate
action or a correction. Just be sure to give yourself credit for the
actions taken by documenting them.


CORRECTIVE ACTIONS AND PREVENTIVE ACTIONS

At this point in your investigation process you have, ideally, evidence
that points to the causes of the unwanted event. These could be
some combination of root causes, proximal causes, and possibly one
or more contributing causes. What is essential is that the root and
proximal causes be “actionable”—that is, you can identify specific
controls that can be put in place to prevent them from happening
again.

Earlier we used the Global Harmonization Task Force’s words
(GHTF, 2010) as we defined a corrective action to be:

• Action to eliminate the cause of a detected nonconformity (3.6.2)
or other undesirable situation

Note 1: There can be more than one cause for nonconformity.

Note 2: Corrective action is taken to prevent recurrence whereas
preventive action (3.6.4) is taken to prevent occurrence.

Note 3: There is a distinction between correction (3.6.6) and corrective
action.

And preventive action:

• Action to eliminate the cause of a potential nonconformity
(3.6.2) or other undesirable situation

Note 1: There can be more than one cause for nonconformity.

Note 2: Preventive action is taken to prevent occurrence whereas
corrective action (3.6.5) is taken to prevent recurrence.

Note 3: There is a distinction between correction (3.6.6) and corrective
action.

Since the unwanted event happened and you want to keep it from
happening again, you are taking a corrective action.

Some people get tied up in knots thinking about what a preventive
action would be. Let us say one site (Site A) had the problem and
another site (Site D), with a similar process, has not experienced the
problem. Putting controls in place to prevent a recurrence at Site A
would be the corrective action; applying the same actions at Site D
to prevent the event would be considered a preventive action. What
about implementing the controls in a different area at Site A? You
could call that either type of action; most practitioners would call
it a preventive action since the problem had not occurred in that
area. As we have said regarding other issues, do not let yourself
get sidetracked on whether it is a “true” preventive action. The
important thing is you do not want the event to occur or reoccur.
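
The Site A and Site D reasoning can be written down as a simple rule of thumb. The sketch below is only an illustration, assuming that an action counts as corrective when it is applied where the event occurred and as preventive everywhere else (including a different area of the same site, following the majority view mentioned above). The `Location` type, the function name, and the area names are hypothetical.

```python
from typing import NamedTuple


class Location(NamedTuple):
    """Hypothetical identifier for where a process runs."""
    site: str
    area: str


def classify_action(event_at: Location, applied_at: Location) -> str:
    """Label an action by comparing where the event happened with where the control is applied."""
    if applied_at == event_at:
        return "corrective"  # the problem actually occurred here
    return "preventive"      # the problem has not occurred here (yet)


event = Location("Site A", "Filling")
print(classify_action(event, Location("Site A", "Filling")))    # corrective
print(classify_action(event, Location("Site D", "Filling")))    # preventive
print(classify_action(event, Location("Site A", "Packaging")))  # preventive
```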

In this chapter, we will look at the range of options available
for either corrective actions or preventive actions.
This is followed by some important cautions about the limitations of
these control options.

LINKING CORRECTIVE ACTIONS TO THE CAUSES

One of the most important factors needed in correcting an unwanted
event is that the actions put in place address the causes that were
found during the investigation. If the causes were not found or were
merely assumed, the corrective actions will probably be a waste of time.
(The appropriate actions to take if your diligent investigation could
not find the root or proximal causes are discussed later in this
chapter.) Corrective actions that are not aligned with the root cause
are usually why effectiveness checks show that the problem has not
been resolved.


When selecting the corrective actions to apply, you are targeting
the root and proximal causes, not the contributing causes that may
be present, since the contributing causes only increase the likelihood
of the event happening or the speed or intensity of the event itself.
Finding ways to reduce or eliminate contributing causes is definitely
to your benefit, particularly if the effort or cost for doing so is low.
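
One practical way to check the link between causes and actions is to list each cause with its type and confirm that every root and proximal cause has at least one action aimed at it. The sketch below assumes a very simple data model; the `Cause` class, the `unaddressed_causes` function, and the example causes and actions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cause:
    name: str
    kind: str  # "root", "proximal", or "contributing"


def unaddressed_causes(causes, actions):
    """Return names of root/proximal causes that no corrective action targets.

    `actions` maps an action description to the list of cause names it addresses.
    Contributing causes are ignored here, since they are optional targets.
    """
    covered = {name for targets in actions.values() for name in targets}
    return [
        c.name
        for c in causes
        if c.kind in ("root", "proximal") and c.name not in covered
    ]


# Illustrative data only.
causes = [
    Cause("unclear SOP step", "root"),
    Cause("valve left open", "proximal"),
    Cause("night shift fatigue", "contributing"),
]
actions = {"Rewrite SOP step": ["unclear SOP step"]}
print(unaddressed_causes(causes, actions))
```

An empty result does not prove the actions will work; it only shows that no root or proximal cause was left without a planned action.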

CHANGE CONTROL AND RISK ASSESSMENT

If you have identified one or more corrective actions to solve your
problem and you are working in a GMP setting, do not forget about
change control. Depending on what is proposed, there may be
actions that will need management and regulatory/health authority
reviews and approvals. Failure to identify these can result in delays
and potential compliance and regulatory issues.

You also should consider new or residual risks that might occur
as a result of the change—this is part of the risk management process
discussed earlier (Chapter 6). A residual risk is defined as risk that
remains after the corrections, corrective actions, and preventive
actions have been put into place. The risk assessment does not
necessarily have to be complicated. For most simple proposed
corrective actions, five questions can get you through the process:

1. What could go wrong?

2. How likely is it to occur?

3. If it did happen, how bad could it get?

4. Do we need to do something about this?

5. What options are available to us?
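
Questions 2 and 3 (likelihood and severity) are often combined into a rough score that helps answer question 4. The sketch below assumes a hypothetical three-level scale and an arbitrary threshold; neither comes from this book or from any standard, and a real procedure would define its own scales and would often treat high-severity items separately regardless of likelihood.

```python
# Hypothetical three-level scales; a real procedure defines its own.
LEVELS = {"low": 1, "medium": 2, "high": 3}


def risk_score(likelihood: str, severity: str) -> int:
    """Multiply the likelihood and severity ratings into a single score (1 to 9)."""
    return LEVELS[likelihood] * LEVELS[severity]


def needs_action(likelihood: str, severity: str, threshold: int = 4) -> bool:
    """Answer 'Do we need to do something about this?' using an assumed threshold."""
    return risk_score(likelihood, severity) >= threshold


print(risk_score("medium", "high"))
print(needs_action("low", "high"))
```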

LOOKING AHEAD TO AN EFFECTIVENESS CHECK

As you are establishing one or more corrective actions, how will you
know if they will accomplish your goal? If there are several different
roughly equivalent options, might one option be easier than another
to demonstrate that it has been successfully implemented and is
effective? Consider planning your effectiveness check strategy as
you are putting the corrective actions in place.

THE RANGE OF CORRECTIVE ACTION OPTIONS

There are many options that may be appropriate given the specific
root and proximal causes that were identified. Some of these are
listed and described below. While there is no absolute priority list,
some actions are preferred over others; those listed first
should be considered before those farther down the list.

Elimination
If your investigation indicated that something specific was the cause of the
unwanted event, could it be eliminated? For example, if cardboard
cartons that caused nonviable particulate contamination were used
to transport items into a classified environment, would it be possible
to get rid of the cardboard? Elimination also applies to processing
steps or activities associated with a process. Having a detailed flow
diagram or process map is very helpful here.

Eliminating the hazard (the source of harm) may take some
work and creativity, but it can result in a significant cost savings to
the organization.

Substitution
If elimination of a hazard is not possible, consider substituting a
less hazardous action or material. For example, in an equipment
preparation area (grade C/D or ISO 8), washed/cleaned parts were
placed in plastic bags and heat sealed. Nonviable air samples
showed that the heating element was sometimes generating small
amounts of fumes when sealing the bags. The firm eliminated the
bagging/sealing process step and substituted it with putting the
parts into clean plastic totes that had a lid which could be tightly
sealed. Another example is that instead of using general-purpose
copy paper for batch records in an aseptic processing area, the
instructions are printed on low-shedding paper that can be safely
autoclaved. And some firms are going even further, substituting the
paper records with notebook computers.

Engineering controls/automation
One way to eliminate “human error” is to not have people involved
in the activity, something often called “engineering the person out.”
Automating and then validating the process can provide a high
degree of assurance that it will be performed the same way every
time. For example, some firms use robots to load and unload freeze
dryers or they have highly automated aseptic filling lines placed in
isolators.

Automation is not a panacea, as will be discussed below.

Isolation
Isolating an activity can remove the hazard from what it could
adversely affect. Using an isolator in which aseptic filling occurs
is an obvious example, as is keeping wood pallets out of drug
manufacturing areas. Isolation may be physical (like placing
unapproved or rejected materials in a quarantine lockup) or
temporal (relying on timing and scheduling).

Alerts or notifications
Alarms can be useful in certain environments to notify personnel,
for example, if an air handling system is not functioning correctly.
The type of alarm—visual or auditory—needs to be considered in
terms of where it will be used and the personnel working in that
area. A major concern with alarms, however, is alarm fatigue—
“sensory overload when clinicians (in a hospital setting, for
example) are exposed to an excessive number of alarms, which can
result in desensitization to alarms and missed alarms” (Sendelbach
et al., 2013). Research has shown that 70–99% of all alarms are false
(Gaines, 2019).

Design of forms and documents

Poorly designed forms are one of the major causes of documentation-
related errors. For example, legibility problems often occur because
people must insert 200 characters of handwritten text in a space that
really only fits 50 characters.

An environmental monitoring group was using paper-based
forms for recording viable and nonviable data in their controlled
areas. The forms had been modified by different people at different
times, which resulted in many mistakes by those filling out the
forms. The team working on improving the forms—and decreasing
mistakes—looked for “exemplars”—data collection sections that
were the best of the set. These were further improved by the use
of background colors (shades of gray) and lines to offset certain
sections filled in by other people. Indexing was used—alternating
sets of instructions in white and light gray that logically grouped
related collection points.

“Cues” can be used to remind users of the information to be
included or the format. For example, having a date column with the
format in light or small type in the header (e.g., ## Mon ## or Da Mon
Year) can help writers put the date in correctly (e.g., 22 Oct 2020) as
well as help the reader interpret how the entry is to be made.
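Where a form is electronic rather than paper, the same cue can be backed by a simple format check at the point of entry. The following is only a minimal sketch, assuming a two-digit day, a three-letter English month abbreviation, and a four-digit year; the function name parse_form_date and the exact pattern are illustrative assumptions, not taken from any particular system.

```python
import re
from datetime import date
from typing import Optional

# Hypothetical month abbreviations matching a "DD Mon YYYY" cue on the form.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Two-digit day, three-letter month, four-digit year, single spaces between.
DATE_PATTERN = re.compile(r"^(\d{2}) ([A-Za-z]{3}) (\d{4})$")


def parse_form_date(entry: str) -> Optional[date]:
    """Return a date if the entry follows the cued format, otherwise None."""
    match = DATE_PATTERN.match(entry.strip())
    if match is None:
        return None
    day, month, year = match.groups()
    if month not in MONTHS:
        return None
    try:
        return date(int(year), MONTHS.index(month) + 1, int(day))
    except ValueError:
        # The format was right but the date does not exist (e.g., 31 Feb).
        return None


if __name__ == "__main__":
    for entry in ["22 Oct 2020", "10/22/2020", "31 Feb 2020"]:
        print(entry, "->", parse_form_date(entry))
```

An entry that returns None could be flagged to the writer immediately, while the correction is still easy to make.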

Having standardized footnote comments can also help
standardize and simplify form completion.

5S: Sort, Set in Order, Shine, Standardize, Sustain

This approach has its origins in Toyota Motor Company as a
complement to the other quality improvement programs developed
by the firm. The original 5S terms and their more literal meanings
are shown in Table 1.

Table 1. Terms and definitions of the Toyota 5S system (adapted from ASQ, undated)

Japanese  Translated   English      Definition and comments

Seiri     organize     sort         Eliminate whatever is not needed by
                                    separating needed tools, parts, and
                                    instructions from unneeded materials.
                                    This can apply to workspaces as well
                                    as documents and forms.

Seiton    orderliness  set in order Arrange whatever remains by neatly
                                    placing and identifying parts and tools
                                    for ease of use. Information from
                                    users can help match the positioning
                                    of objects and data collection fields in
                                    a way that is aligned with how people
                                    do the job.

Seiso     cleanliness  shine        Clean the work area by conducting a
                                    cleanup campaign. Ensure that only the
                                    right, appropriate tools are available
                                    and each is in optimal shape.

Seiketsu  standardize  standardize  Schedule regular cleaning and
                                    maintenance by conducting seiri,
                                    seiton, and seiso daily. Create practices,
                                    procedures, and checklists so
                                    consistency is achieved.

Shitsuke  discipline   sustain      Make 5S a way of life by forming the
                                    habit of always following the first four
                                    S’s. Make it simple for people to do
                                    the right thing; provide feedback and
                                    proper incentives.

Uncoupling or loosely coupling process steps or operations
A tightly coupled process is similar to the cars on a train—each is
hooked to another which is hooked to another, and on and on. This
means that if a problem occurs, the impact can be transmitted to the
other connected units. One place you can see this is in escalators for
underground subway or rail lines or moving walkways in airports:
instead of one very long escalator or moving walkway, they are
usually broken up into several shorter ones. This would mean that
if something broke down, the problem would not affect the entire
system (e.g., walkway, process, etc.) but only part of it.

A downside of uncoupling is that it may make the operation
less efficient because of the starts and stops or the multiple pieces.
On the other hand, a process that is uncoupled could reduce the impact
should there be a problem.

Decreasing the frequency of an event happening

Risk is defined as the combination of the likelihood of an unwanted
event occurring and the severity of impact should the event occur.
By reducing the frequency of doing something that could result in
a negative consequence, you are reducing risk. Interventions are
an example of this: you are reducing the risk of contaminating
the product by decreasing the number of interventions made.
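A little arithmetic shows why frequency matters. The sketch below is an illustration only: it assumes every occurrence (for example, every intervention) carries the same small, independent chance of causing the unwanted event, and the 1% figure and the function name are invented for the example rather than drawn from any data.

```python
def prob_at_least_one_event(p_per_occurrence: float, occurrences: int) -> float:
    """Probability that the unwanted event happens at least once.

    Assumes every occurrence has the same, independent probability.
    """
    if not 0.0 <= p_per_occurrence <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if occurrences < 0:
        raise ValueError("occurrences cannot be negative")
    # Chance of no event in any occurrence is (1 - p) ** n; subtract from 1.
    return 1.0 - (1.0 - p_per_occurrence) ** occurrences


if __name__ == "__main__":
    p = 0.01  # hypothetical chance that a single intervention causes harm
    for n in (10, 5, 2):
        print(f"{n} interventions: {prob_at_least_one_event(p, n):.3f}")
```

Under these assumptions, the severity of the event is unchanged, but the likelihood term of the risk falls as the number of occurrences falls.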

This strategy was used with regard to preventive maintenance
(PM) in aircraft (Reason et al., 2003): an airline did a study to
show that if they reduced how often certain PMs
were performed, they could reduce potential problems. (They also
discovered that deviations most frequently occurred in the
reassembly process: parts could be omitted or installed improperly.)

Changing the source

This action sounds simple but can get very complicated
depending on what is involved. If a large number of defects are
found in incoming components, it may be advisable to change to a
different vendor, but that can require audits,
qualification samples, increased testing or inspection, and the like.

Duplicating the assets

This is another option that can get expensive and complicated. Having
a single supplier of materials—particularly critical materials—can
be very risky should the supplier have quality or supply issues. So,
having a second approved supplier would be advisable. In other
cases, having a back-up qualified instrument would be appropriate
in case there is a problem with one that is typically used. Some
companies that make medically necessary products have their own
second or third sites (or a contract manufacturing site) with the
capacity to make up a shortage should one of the other sites have
some sort of failure.

Providing additional information

If there is nothing else you can do, providing additional
information may be one remaining option. It is not regarded highly
because, simply put, we as people tend to not read instructions. The
lengthy instruction manuals that come with appliances are often
shortened to a one-page “quick start” guide, usually with graphics.

Additional information may be in the form of signs or posters;
for example, poster-sized photos showing the proper sequence of
gowning mounted in the gowning room. If these types of job aids
are used, you need to link them to the underlying document or
procedure so proper change control procedures can be applied.

One firm had multiple recurrences of too many people in
their changing room at one time. (The ventilation system and
environmental monitoring program only supported three people in
the room at one time.) Typical signs (for example, NO MORE THAN
3 PEOPLE ALLOWED IN ROOM AT ONCE) did not seem to work.
They then changed their approach and instead of giving a directive,
they used a sign on the door’s window that asked a question, “How
many people are in the room?” followed by “If 3, wait until someone
leaves. (Maximum limit: 3 people).” In doing this, they took
advantage of a bit of psychology: people want to answer questions.
There is something in us that is uneasy if a question is kept open. So,
in this case, the question prompted the person wanting to enter the
area to count how many others were actually in the room and then
apply the rule. (And yes, that sign had a unique identity number
that linked it to the gowning room procedure so it would be under
change control.) Another example is seen on doors to patient rooms
at a hospital: HAVE YOU WASHED YOUR HANDS?

Adding or changing a procedure

Addressing a deviation by changing a procedure is one of the most
frequent corrective actions. What is the result? Usually more and
more details being “bolted on” to the procedure, oftentimes with no
valid reason except the need to do something. Modifying a procedure
is appropriate when the information is incorrect or unclear to those
using it.

A vaccine manufacturer had a seven-page procedure addressing
the setup and operation of a peristaltic pump. On one particular
day, an operator threaded the tubing through the pumping head’s
gates (which rotate). When the pump began operating, fluid went in
the wrong direction, resulting in a substantial material and financial
loss. The investigation team looked at the procedure and decided
to modify it. Instead of adding something to the procedure, they
removed most of the text, reducing its length from seven pages to
three. This was done by using three photographs: the first showing
what the pump looks like without tubing; the second showing someone
beginning to put the tubing through the first gate, indicating the
proper direction; and the third showing the tubing fully in place.
The other action
taken was to place small arrow labels on the pump itself showing
the proper flow of tubing and material. Since then, the firm has
never had a recurrence.

Because of the importance of procedures, this topic is covered in
more detail in Chapter 13.

Training
Training or “re-training” is the other most frequently seen corrective
action, often because the investigation was not performed or was
performed very poorly. Health authority inspectors know this, and
effectiveness checks that find the corrective action did not prevent
a recurrence show this as well.

One inspector used this against a firm when he was performing
an inspection of a quality control laboratory. Using the firm’s training
records, he pointed to an employee who was trained, re-trained, and
re-re-trained because of laboratory failures. In pointing this out
to the lab manager, the inspector said this looked like there was a
significant problem with the lab’s training program; the manager
reluctantly agreed. The inspector then used some logical jiu-jitsu
on the manager: “You’ve just admitted your training program is
inadequate. So, how do you know that a particular result that was
within limits that this lab analyst generated was truly within the
specification? How do you know that it was not just luck that caused
a bad product to be tested with the results of a good product because,
as you just said, you have an inadequate training program?” The
inspector had backed the manager into a corner that he could not
get out of.

While training should be limited to where there is a true
knowledge or skill deficiency, the event and its resolution should be
communicated to all stakeholders. More detail on communication is
found in Chapter 19; additional detail on training can be found in
Chapter 14.

CORRECTIVE ACTIONS SPECIFIC TO CAUSES CATEGORIZED AS “HUMAN ERROR”
If the investigation used one of the models discussed in Chapter 8
and was able to identify the root and/or proximal cause(s), these
models can guide you to more specific actions that are applicable.

For example, if using the human factors analysis and classification
system (HFACS) tool (Shappell et al., 1997), you can ask “why”
three or four times to determine a more precise reason. Suppose you
ask why people are routinely violating a procedure. You may discover
that the technician is trying to juggle two or three goals (following
GMPs, getting the needed number of units produced, and keeping
to a strict schedule) and believes that the procedural shortcuts they
take are “no big deal.” Continuing to ask why can lead to broader
organizational issues (discussed more below).

If you make use of the “event precursors” listed in Chapter 8,
these in themselves are quite specific conditions/situations that can
be addressed. For example, if the investigation shows that, while
preparing a sample, the lab analyst added a reagent twice because
they were interrupted, the corrective action would be, in part, to
limit distractions and interruptions.

Mistake proofing (poka-yoke)

One of the best ways to prevent an unwanted event that would be
considered a human error is to try to make it impossible for that
event to occur or to make it immediately detectable by the process of
mistake proofing (or as it is known in Japan, poka-yoke). Often this will
involve some sort of design or engineering solution to the problem.
If you consider only the cables used for your electronics, you can see
mistake proofing in action. For example, an HDMI cable between a
video source and a television can only be inserted into the devices
in one way. The three-prong US standard 120 volt electrical plug
can only be connected to an electrical outlet one way; the Lightning
cable that plugs into an iPad® can be inserted in one of two ways,
but it makes no difference because both ways work. Mistake proofing
makes it easy for someone to do the right thing.
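The same idea carries over to electronic records: rather than accepting free text and hoping it is right, the entry field can be limited to the few values that are valid. The following is a minimal sketch only; the VisualInspectionResult field and the record_result function are hypothetical names made up for this illustration.

```python
from enum import Enum


class VisualInspectionResult(Enum):
    """The only entries this hypothetical record field will accept."""
    PASS = "pass"
    FAIL = "fail"


def record_result(raw_entry: str) -> VisualInspectionResult:
    """Convert a typed entry into an allowed value or reject it at once."""
    # Looking up an Enum member by value raises ValueError for anything
    # that is not predefined, so a mistaken entry is detected immediately.
    return VisualInspectionResult(raw_entry.strip().lower())


if __name__ == "__main__":
    print(record_result(" Pass "))
    try:
        record_result("maybe")
    except ValueError as error:
        print("Rejected:", error)
```

As with the HDMI plug, the design choice is to make the wrong entry impossible to complete, or at least immediately detectable, instead of relying on a later review to catch it.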

“Nudges”
Nobel prize winner Richard Thaler and legal scholar Cass Sunstein
define nudges in their book of the same name (Thaler et al., 2009) as
positive reinforcements and indirect suggestions used to influence
the behavior and decision making of groups or individuals. Nudging
is a form of choice architecture that attempts to guide people to an
appropriate decision or action through the use of psychologically
based cues. One of the most notable nudges was implemented at
Amsterdam’s Schiphol airport. Not long after the airport was built,
facility management was reviewing the operational costs and noticed
that they were over budget in housekeeping and cleaning. In asking
“why?” multiple times, they found that cleaners were spending
more time cleaning the men’s washrooms, particularly around the
urinals. The issue, to put it most delicately, was the poor aim that
some men had when using the appliance. To reduce the problem
of poor aim, they had urinals made with the image of a black fly
etched into the porcelain. (If you are not a male, ask a man in your
life what a target can do in terms of an incentive to improve “aim.”)
In other words, the designers of the solution used something they
knew about how people (i.e., men) think and took advantage of that.

Nudges go beyond simply making it easy to do the correct thing
(for example, having only black or blue pens with indelible ink
available to those who need to record information in a notebook).
Humans, for instance, can be resistant to making changes after
having arrived at a decision. Organizations that want to encourage
retirement savings often will set a default savings withholding rate
of, say, 3 percent; new hires can reduce or raise that if they want.

The use of nudges is sometimes criticized for shaping behavior
in ways that manipulate an individual or consumer. Grocery stores
are notorious for how items are positioned to build sales: middle
shelves (the easiest to reach) are for the best sellers (or those that the
store is trying to promote), lower shelves are for products that children
are interested in, and bottom shelves are for the cheapest products.

Swiss cheese
When we looked at accident models (Chapter 7), we were considering
how unwanted events occurred because barriers were either not
present or had inadequacies (conceptually
represented by the holes in the Swiss cheese). This model, however,
can be viewed another way, with the cheese slices representing
controls that are put in place to prevent a recurrence.

An important concept that the Swiss cheese model illustrates is
a layering approach: you are not relying only on one control, but
on redundancies should the initial control fail. Instead of having just
one control to prevent an unwanted event, it is often more feasible
and robust to have multiple controls that are not “perfect” to prevent
an event. For example, we do not rely only on finished product
testing; rather, there are tests conducted on incoming materials so
they can be released for use, in-process tests, qualified equipment,
and validated processes, all of which together provide confidence
that the product meets its requirements.
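The value of layering can be shown with a small calculation. This is a sketch under a strong assumption, namely that the layers fail independently of one another, which real controls often do not; the miss rates below are invented numbers, and chance_all_layers_miss is a hypothetical helper name.

```python
from math import prod
from typing import Sequence


def chance_all_layers_miss(miss_rates: Sequence[float]) -> float:
    """Chance a problem slips past every layer, assuming independent layers."""
    for rate in miss_rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each miss rate must be between 0 and 1")
    # With independent layers, the holes must line up in every slice,
    # so the individual miss rates are multiplied together.
    return prod(miss_rates)


if __name__ == "__main__":
    single_good_control = [0.05]                 # one control, misses 5%
    imperfect_layers = [0.20, 0.30, 0.50, 0.40]  # four weak controls
    print("single control:", chance_all_layers_miss(single_good_control))
    print("four layers:   ", round(chance_all_layers_miss(imperfect_layers), 4))
```

With these made-up figures, four individually weak layers let less through than one much stronger control, which is the point of not relying on a single barrier. If the layers share a common weakness, the benefit will be smaller than this simple product suggests.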

Verifying, checking, reviewing

One of the common corrective actions that is applied when people
are involved in an unwanted event is to add a verification or check
step. Batch records and logs then become bloated, sometimes with
every step being signed off by a second person.

If one looks at the GMP regulations, requirements, and
guidances from different countries and regions, health authorities in
the US, Canada, UK, EU, and PIC/S do not define the terms check,
review, and verify in the GMPs. The common concept in these words
is that health authorities want there to be a
confirmation that something is done and has been done correctly.
Two key questions are then, (1) What do these words each mean? and
(2) If they mean different things, how are they then to be applied?
Could risk-based thinking help us understand their use?

For much of industry, “verify” or “verification” is something
done in real time. “Witness” is often a synonym: you are watching
someone perform a critical activity to ensure that it is done and done
correctly. “Review” or “check” are terms that, as used in pharma/
biopharma, are legitimately done after the fact; they do not need to
be performed in real time.

Why is this so important? The emphasis on data integrity is part
of this. Reviewing, checking, and verifying information help ensure
the data is “reliable and trustworthy,” a requirement found in
US 21 CFR Part 11, the regulation that covers electronic records and
electronic signatures, which has elements that can be conceptually
applied to paper-based documents as well. Other characteristics of
a well-prepared document are arranged as an acronym known as
ALCOA-plus.

DEFINING KEY TERMS

Since there are no “official” definitions that can be found in the GMP
requirements of the US, Canada, and EU, we can look to industry
practice and see how it puts these words into practice (Table 2).

Table 2. Terms frequently used to provide confidence that an activity was correctly performed

Term          Typical industry definition             Example

Verify        (a) Having real-time witness of an      Adding materials (“charging”) to a
              event.                                  blender.

              (b) Having a second person repeat       Performing a line clearance on a
              the same activity with the hope that    packaging/labeling line.
              the same result will be achieved.

Double        Having a second person repeat the       Recalculating a result, for example,
check         same activity with the hope that the    a reconciliation.
              same result will be achieved.

Review        Having a second person examine          Reviewing a procedure for technical
              something (usually a document)          accuracy prior to approval.
              generally for adherence to certain
              requirements. Often a word or two
              follows the requirement for “review.”

Check         Examining something to ensure it        Checking filter assembly to confirm
              meets requirements or is correct.       it was properly set up and installed;
              Often used interchangeably with         checking a printout of a validated,
              review. Best practice is to provide     automated system (e.g., autoclave) to
              some guidance for what is required.     ensure that it operated as intended.
              Often a word or two follows the
              requirement for “review.”

What all of these actions do is “confirm” that something is
correct, meets the specifications or requirements, or that the proper
action was taken.

Knowing that there are no clear GMP definitions on what
verification, review, and check mean, and knowing that some
regulators distinguish between critical and noncritical entries in a
record, how can we apply risk-based thinking to what “confirmation”
is? Or, in other words, what level of confidence do you really
need that an action or event occurred or that the recorded data is
accurate? If everything is held to the same high standard, meaning
that everything is important, then in fact, nothing is important.

Table 3 illustrates the ways that confirmation can be achieved.

Table 3. Terms used to provide confirmation using a risk-based approach

Term: Device or process controlled
  Proposed definition:
    • Correct operation or action incorporated into function of device.
    • Qualified equipment or device.
    • Device does not function if correct requirements or entries are
      not present or not met.
    • Often used for safety or mission-critical reasons.
    • No additional human confirmation is needed.
  Example:
    • Closing and securing door to cabinet washer; otherwise alarm
      sounds and washer does not start.
    • Electronic batch record that does not move to next step unless
      previous step and its requirements are met.

Term: Check of automated equipment
  Proposed definition:
    • Human review of a machine-generated record or data to ensure
      proper operation or that intentions were achieved.
    • Qualified device or equipment.
    • This occurs at the end of the machine’s operation and before
      the next processing step.
    • Situation can be remediated if needed but may have some
      negative impact (repeat of operation/delay of next step).
  Example:
    • Reviewing printout from washer of the cycle that was used and
      ensuring that correct time, temperatures, etc. were achieved.

Term: Verify
  Proposed definition:
    • Direct, real-time observation of actions, events, or behaviors
      (i.e., witness).
    • Could also be a second, independent performance of the task.
    • Impact could be on CQAs, SISPQ, or important business goals.
    • Immediate intervention possible.
  Example:
    • Charging a reactor.
    • Adding something to a batch.
    • Line clearance.

Term: Check
  Proposed definition:
    • Direct observation of the results of an action, event, or
      behavior by a second person.
    • Situation can be remediated if needed without significant
      impact or loss.
  Example:
    • Identifying and counting items on autoclave rack before cycle
      is started to ensure rack was loaded correctly.
    • Periodic review of logs and records that are being completed
      during the work shift.

Term: Review
  Proposed definition:
    • Examination of data or evidence after the fact by a second
      person to ensure correct action was taken or correct result was
      achieved.
    • Most useful if there have been confirmations of different types
      that have occurred earlier in the activity.
    • Impact can range from minor (changes to a draft procedure)
      to extreme (rejection of a batch because of failures that occurred
      during manufacture).
    • Often no opportunity for remediation without significant impact.
  Example:
    • QA batch record review.
    • Technical (content) review of a procedure before it is approved.

As one moves down the list of practices (Table 3) from the device
or process controlled confirmation to review, the rigor decreases and
there is more time between the event that is being confirmed (and
in a sense controlled) and the confirmation performed. There is also
an increased risk to operators (e.g., safety and health), products,
and patients if the event is not executed fully and properly.

If a confirmation is truly needed as a corrective action, ensure
that the term that is used has the appropriate amount of rigor based
on the risk of what you want to confirm.
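The most rigorous form of confirmation, where the device or process itself enforces the requirement, can be pictured as an electronic batch record that will not accept a step until the previous one has been completed. The sketch below is a toy illustration only; the BatchRecord class, its step names, and the StepNotReadyError exception are all hypothetical and do not represent any commercial system.

```python
from typing import List, Optional


class StepNotReadyError(Exception):
    """Raised when a step is attempted out of the required sequence."""


class BatchRecord:
    """Toy electronic batch record that enforces the order of steps."""

    def __init__(self, steps: List[str]) -> None:
        self.steps = list(steps)
        self.completed: List[str] = []

    def next_step(self) -> Optional[str]:
        """Return the next required step, or None when all are done."""
        if len(self.completed) < len(self.steps):
            return self.steps[len(self.completed)]
        return None

    def complete(self, step: str) -> None:
        """Record a step only if it is the next one required."""
        expected = self.next_step()
        if expected is None:
            raise StepNotReadyError("all steps are already complete")
        if step != expected:
            raise StepNotReadyError(
                f"cannot record {step!r}; next required step is {expected!r}"
            )
        self.completed.append(step)


if __name__ == "__main__":
    record = BatchRecord(["weigh", "charge", "mix"])
    try:
        record.complete("charge")  # attempted before "weigh"
    except StepNotReadyError as error:
        print("Blocked:", error)
    record.complete("weigh")
    record.complete("charge")
    print("Next step:", record.next_step())
```

Because the record itself blocks the out-of-sequence entry, no second person is needed to confirm the order of steps, which matches the first row of Table 3.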

WHERE DO QUALIFICATION AND VALIDATION FIT INTO CORRECTIVE ACTIONS?
Equipment qualification and process validation are ingrained into
GMPs around the world. Two useful definitions of these activities
are:
Equipment qualification: Action of proving that any premises,
equipment, and supporting systems work correctly and actually lead
to the expected results (WHO, 2016).

Process validation: The collection and evaluation of data, from
the process design stage through commercial production, which
establishes scientific evidence that a process is capable of consistently
delivering quality products (FDA, 2011).

So are qualification and validation appropriate to select
as corrective actions? Simply put, no. Qualification provides
documented evidence that the equipment or instrument involved in
the correction performs as intended (that is, to the predetermined
specifications); validation provides similar evidence that
the operation or activity consistently results in the predetermined
outcome. Qualification and validation are therefore not corrective
actions, but rather can be used as evidence of effectiveness (see
Chapter 15).

WHEN YOU CANNOT PREVENT, TRY TO MANAGE

When looking at ways to prevent behaviors often defined as
“human error,” you may find yourself at a dead-end: there may be
no effective solution available. In this case, look for ways to manage
those events.

Error management consists of three phases, each described
below.

1. Error detection: Observing or determining that an error has
been or is about to be made.

2. Error recovery: Responding to the situation in a way that brings
the situation back to an acceptable condition.

3. Error tolerance: Having a system that is robust enough to allow
or permit a situation without resulting in a failure.

An example can take us through these three phases: I was driving
from my home to the airport, usually a quick 12-minute trip. On this
particular day, instead of turning right on the street that I always
take to the airport, I went directly ahead. Halfway through the
intersection I realized that I was headed to the gym I use, not the
airport. Error detection—recognizing my mistake—was very fast
and efficient.

I knew that farther ahead was another street that could get me
to the airport. I continued ahead, made a right turn, and was on this
parallel road without any problem. Error recovery had just occurred.

Unfortunately, after one mile of driving there was a diversion
due to road construction. I wasn’t worried because, for whatever
reason, I had an extra 20 minutes before I had to appear for my
flight; this was a day I wasn’t running tight on time. I was able to
tolerate the initial error.

Error management is a useful concept to include in either
procedures or on-the-job training. When possible, have an approved
option along with clues or signals that the performer can use to
quickly detect the problem. And, having some flexibility with
time, materials, and equipment is important so the “plan B” can be
implemented without causing too much delay or anxiety.

WHEN THE ROOT CAUSES CANNOT BE FOUND

There may be instances when the root cause cannot be found
despite a diligent effort by investigators; they may have identified
“probable” root causes.

In these situations, using training as the corrective action is a
waste when there is no solid evidence that a lack of knowledge or
skills caused the problem. Also, revising the procedure
is unlikely to be an effective measure.

If a definitive root cause cannot be identified, the actions that
can be taken are those that would mitigate (i.e., reduce the impact
of) the event should it recur, as well as those that collect more data
that could lead to determining the root cause(s). (See Chapter 16 for
ways to defend this when writing the report.)

SHORT TERM VS. LONG TERM

Significant corrective actions often will take many months to
become operational. Large projects often require months if not years
to accomplish a multitude of tasks such as writing requirement
documents; designing, ordering, and installing equipment or building
facilities; and conducting qualifications. During this period, one still
must consider ways to prevent a recurrence of the problem.

In the corrective action plan, the interim controls need to be
specifically called out and detailed. This could involve having an
extra inspection step, an additional person or two, or procedures
that may be adequate for a relatively brief period but are not
realistically sustainable. The interim controls must also go through
a change control process.

Having a timeline associated with these corrective actions will
help not just with planning but in reducing any concerns a health
authority may have.

RESIDUAL RISKS OF CORRECTIVE ACTIONS

Risk management has a concept called residual risk—the risk
that remains after controls are put in place. This can apply when
corrective actions are being planned and implemented.

Identifying residual risks may not necessarily require any type
of formal risk assessment; asking the simple question “what
if…?” or “what could go wrong in putting this corrective action into
practice?” may be enough.

We do know, however, of some common residual risks that
should be considered. These include:

• Alarm fatigue: If you have ever spent time as a patient or visitor
in an acute-care hospital you know it is not a restful place.
Beeps, alarms, blinking lights, sirens, and automated voices all
contribute to a cacophony that has been shown to actually cause
poor patient outcomes (Sendelbach and Funk, 2013). Professional
caregivers often tune these out as they are part of the general
background noise. In other situations, people may disregard
an actual alarm because there have been too many false alarms
and people just assume it is a problem with the equipment. An
interesting case where this happened in the pharma industry
was a major theft at a drug company’s distribution center. The
thieves triggered the alarm three times during a storm, with the
alarm company responding each time by visiting. On the fourth
triggering, the alarm company did not respond because they figured
that the storm was causing false alarms; the thieves made their
move, stealing nearly $80 million of brand-name drug products
(NPR, 2010; Mahoney, 2012).

• Not really understanding the control: Some corrective actions
are put into place without a thorough knowledge of how they
actually work. It becomes a “black box” where things happen.
An example of this was with what is known as the Gaussian
Copula function (Figure 1). This mathematical formula was
used for many years by financial institutions to calculate their
level of risk. At the end of every business day, computers
would run through the calculation, arriving at a number that
management would use as a relative measure of their financial
exposure and risk. The mathematician who developed and
published a technical paper describing the formula said it was
proven to work under certain conditions but he had no evidence
of its performance under other conditions, specifically when all
asset categories were moving in parallel. Everything moving
down at once is what happened during the financial collapse
of 2008–2009. Yet, institutions were trying to determine their
risk using an unvalidated, inappropriate tool (Krantz, 2009).
Understanding how something really works is important in
reducing overall risk.

Figure 1. The Gaussian Copula function (Krantz, 2009)

• An over-reliance on technology: Automatic spell check has
caused many of us frustration or sometimes embarrassment as
we write texts, emails, or reports and have our intended word
automatically changed into something else. There are often news
articles about drivers who are overly reliant on and overly trusting of
their GPS systems. In his book The Glass Cage (2014), Nicholas
Carr provides examples of how this has contributed to near-misses
and actual crashes of planes. He defines two conditions
in particular: automation complacency, which is having (misplaced)
confidence that the computer or system will work flawlessly,
handling any challenge presented; and automation bias, which is placing
undue weight on the information given by a computer or system
(e.g., if it is on the internet it must be true). These systems are not
always perfect; we need to use common sense and sometimes
question the results.

• Inadequate change control: Corrective actions can sometimes
have a ripple effect: a change in one thing may also impact
something else. For example, changing a procedure may require
other changes to forms or logs, training materials, job aids, batch
records, signs, and the like. These potential ramifications need
to be considered early on.

• Organizational resistance: Changes and corrective actions
might be viewed as needed and important, but unless there is
true agreement by stakeholders, particularly by those who will
be most affected by the change, the implementation will not be
successful. The “we’ve always done it this way” mode can be
hard to overcome. Using a variety of different approaches can
help, but try to avoid the position of “it’s a compliance issue.”
Rather, think of the scientific rationale and what the change
means to patients—how will it better support safety, identity,
strength, purity, quality, and availability of the product.

• Sustainability: In the heat of the moment, often with corrective
actions that occur as the result of a health authority inspection,
an organization will submit and have a plan accepted but
several months later realize it is not sustainable. It may require
much greater effort, more resources (people or money), or other
things that were not fully considered. Asking the question, “Can
we maintain this over an extended period?” and considering the
risks can help make a plan or corrective action more reasonable
in terms of practicality.


CONCLUSION
To be effective, corrective actions need to be aligned with the root
and proximal causes. Fortunately, there are a number of different
approaches that can be taken depending on what was discovered.
If the causes are not found, efforts should emphasize how to detect
the event should it happen again and to protect the things of value.

REFERENCES
ASQ (Undated) What are the five S’s (5S) of lean? https://asq.org/
quality-resources/lean/five-s-tutorial. Accessed 26 Mar 2020.
Carr, N. (2014) The Glass Cage: Automation and Us. New York, NY: W.
W. Norton and Company.
FDA (2011) Guidance for industry process validation: General
principles and practices. https://www.fda.gov/files/drugs/published/
Process-Validation--General-Principles-and-Practices.pdf. Accessed
27 Apr 2020.
Gaines, K. (2019) Alarm fatigue is way too real (and scary) for nurses.
Aug 19, 2019. https://nurse.org/articles/alarm-fatigue-statistics-
patient-safety/. Accessed 8 Feb 2020.
GHTF (2010) Quality management system – Medical Devices –
Guidance on corrective action and preventive action and related
QMS processes. Global Harmonisation Task Force. http://www.
imdrf.org/docs/ghtf/final/sg3/technical-docs/ghtf-sg3-n18-2010-qms-
guidance-on-corrective-preventative-action-101104.pdf. Accessed
27 Apr 2020. NOTE: The work of the now-defunct Global
Harmonisation Task Force became part of the International
Medical Device Regulators Forum (IMDRF) in 2011.
Krantz, J. (2009) Recipe for disaster: The formula that killed Wall
Street. https://www.wired.com/2009/02/wp-quant/. Accessed 3 Mar
2020.
Mahoney, E. (2012) Feds say they crack $80 million drug heist from
pharmaceutical warehouse in Enfield. https://www.courant.com/
news/connecticut/hc-enfield-eli-lilly-drugs-0504-20120503-story.
html. Accessed 4 Mar 2020.


NPR (2010) Thieves grab $75 million worth of Eli Lilly pills. Mar
17, 2010. https://www.npr.org/templates/story/story.php?storyId=
124758613. Accessed 3 Mar 2020.
Reason, J. and Hobbs, A. (2003) Managing Maintenance Error. Boca
Raton, FL: CRC Press/Taylor and Francis Group.
Sendelbach, S. and Funk, M. (2013) Alarm fatigue: a patient safety
concern. AACN Adv Crit Care, 24(4):378–86.
Shappell, S. and Wiegmann, D. (1997) A human error approach
to accident investigation: The taxonomy of unsafe operations.
International Journal of Aviation Psychology. 7:4, pp. 269–271.
Thaler, R. and Sunstein, C. (2009) Nudge: Improving Decisions about
Health, Wealth, and Happiness. New York, NY: The Penguin
Group.
WHO (2016) [Draft] Guidelines on validation. Geneva: World Health
Organization. https://www.who.int/medicines/areas/quality_safety/
quality_assurance/validation-without_appendices_2016_05_17.pdf.
Accessed 27 Apr 2020.


PROCEDURES: CAUSES OF
PROBLEMS AND POTENTIAL
CORRECTIVE ACTIONS

In Chapter 12 we noted that revising procedures after an unwanted
event is one of the most common corrective actions. You can probably
prove this to yourself at your firm if you can sort the different types
of actions taken. However, changing the procedure (and the
subsequent training that occurs) is often not the best course of
action to prevent a recurrence. In this chapter we will present the
reasons for that. We will also look at how checklists can be used as a
way to assure correct and complete performance.

Standard operating procedures (SOPs or simply, procedures) provide
the instructions on how to do a task. ICH Q7 defines them as:
Description of the operations to be carried out, the precautions to be
taken, and measures to be applied directly or indirectly related to
the manufacture of medicinal products, APIs, or intermediates (ICH,
2000, p. 41).

PROCEDURES AS A CAUSE AND A CONTRIBUTOR TO
UNWANTED EVENTS
If one were to ask, “What’s the problem with procedures? How are
they involved with deviations?” there might be a variety of answers,
but they can be organized into five categories:

• Procedures are outdated, not accurate, incomplete, or not
aligned with how the task is actually performed.
• Procedures are not available or not being used by those
performing the task.
• Personnel misinterpret the procedure.
• Procedures are not written for the users.
• Personnel are violating the procedure.

Each of these categories has different causes, but all can
result in similar compliance, performance, quality, safety, and
business impacts.

Procedures are outdated, not accurate, incomplete . . .

Reason and Hobbs (2003) state that 80% of the procedures used in the
nuclear power industry are incorrect or inaccurate. (They do not say
where these problems were found in the document lifecycle, that is,
prior to review or after approval.) Strictly following a procedure that
is inaccurate will result in an incorrect outcome or it may prompt the
user to make an improvisation to avoid the anticipated error. Those
writing procedures need to be knowledgeable practitioners of the
task; having the people who perform the task as part of the writing
(or at least the review) of the procedure will bring a practical reality
to the document. If a copy-and-paste method is used in creating new
procedures or revising existing ones, this can contribute to mistakes in the
document. An incomplete procedure could be missing
a key step or fail to describe all of the potential issues that might be
anticipated.

A very telling question about the documentation quality system
to ask is, “How long does it usually take to make a change in a
procedure?” If the answer is beyond several weeks, it is likely that
people will make the change on their own—the procedure does
not change, but the practice does because people know that it takes
“forever” to have the document officially changed. In those cases,
personnel do not start the change process unless there is a significant
amount of detail in the SOP to change.


There are occasions when the procedural description (the steps
and substeps to be followed) does not match how the task actually
flows. This could happen because the writer had not actually
walked the process or did not have a current process map or flow diagram,
or perhaps because the task was rearranged. Representing a process
where several things are happening at one time can be a challenge;
having a process map or flow chart can help.

Procedures are not being used by those performing the
task or are not available
Personnel are supposed to follow the procedures as written, but
often they do not actually have the documents in hand as they are
performing the task. For example, technicians will be following a
batch record as they prepare, wash, inspect, and autoclave parts
needed for an aseptic filling process. Supporting that batch record,
however, could be a dozen procedures that describe setup and
operation of the washer, what to look for when inspecting parts, how
to clean equipment, and how to document and reconcile quantities.
These procedures may be available in a binder or online, but most
of the time technicians do not have the documents in front of them
when they do the task—it is expected that they know the steps and
substeps to the point of successful performance. (Some firms that are
moving to electronic batch records and using notebook computers
or iPads® have procedures and support documents that can be called
up as needed. One firm has large flat screens that display checklists
and procedures for team members to look at when performing
certain tasks.)

Users misinterpret instructions

What is clearly understandable to you may not be to someone else,
particularly to a person new to the industry and your firm. Words
can sometimes be vague or misconstrued. One story (perhaps
apocryphal) is of a very inexperienced microbiology lab technician
who did not have proper training (only a “read and understand”
method was used). She understood the step of “pour melted
growth media into test tube and leave room between the media and
inserting a sterile cotton ball into the top of the tube” to mean for
her to physically leave the laboratory, instead of what was intended:
leaving a space between the media and cotton. As workforces get
more diverse and technology transfers extend to different countries,
having a consistent understanding of instructions is critical.

Procedures are not written for the users

Procedures should be written with the users in mind. A number
of years ago, a firm performed a reading-level evaluation of its
procedures using the Flesch-Kincaid tool (available as a preference
option in Microsoft Word >> Tools >> Spelling and Grammar)
and found them to be at the 12th–14th grade level, similar to a college
textbook for an introductory course. While that might be appropriate
for scientists with native English skills working in a development
lab, it was not appropriate for production technicians who were not as
skilled in reading complex sentences. (By comparison, a random set
of articles in the New York Times was tested using Flesch-Kincaid
and ranged from the 11th to the 12th grade level; this paragraph is at
an 11th grade level.) As with other writing and training endeavors, it is important
to know your audience. Procedure writers also need to understand
how and where the procedures are to be used and the appropriate
level of detail that they need to contain (see below).
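
The reading-level evaluation described above can be approximated in a few lines of code. The following is a minimal sketch in Python, assuming the standard Flesch-Kincaid grade-level formula (0.39 × words per sentence + 11.8 × syllables per word − 15.59) and a crude vowel-group syllable counter. The function names are hypothetical, and the result will not exactly match what Microsoft Word or any other tool reports.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate (an assumption, not a dictionary lookup):
    count groups of vowels and drop one for a likely silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate US school grade level of a piece of text."""
    # Sentences are split on ., ! and ?; words are runs of letters/apostrophes.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


if __name__ == "__main__":
    step = "Press the red button. Wait for the light."
    print(round(flesch_kincaid_grade(step), 1))
```

Running a draft procedure through a check like this gives only a rough signal; it is most useful for comparing two versions of the same step, or for flagging documents that score well above the reading level of the intended users.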

Personnel violating the procedure

As was seen in the chapter on human error, a violation is when
someone knows the correct, required steps or technique that should
be performed and decides to do something different. Taking an
unapproved shortcut—an intentional deviation from an approved
procedure—is an example.

Often, violations are not as simple as deciding to not follow
a procedure. One needs to understand what competing pressures
the technician, operator, analyst, or manager was trying to balance.
Was it a decision about either trying to fill the last 15 percent of the
batch or do a thorough sanitization of the room? Or being required
to have someone do a real-time verification (witnessing) of a step
but the second person was not available when the step had to be
completed? Or a supervisor sharing their password with someone
else to enter the data because the supervisor could not keep up with
the workload assigned? Most people do not want to simply violate
rules or procedures; there is often an underlying reason for it.

THE BIGGEST WRITING CHALLENGE: APPROPRIATE
LEVEL OF DETAIL
For those who are writing procedures, the most difficult part is
determining the level of detail that will enable users to perform
the task safely, correctly, efficiently, and consistently. Level of
detail means how much information and the number of steps and
substeps the instructions will include. Table 1 shows examples of
different levels of detail.

Table 1. Different levels of detail for driving instructions

Level of detail | Number of steps | Example

(Goal of procedure) | N/A | Drive from New York’s LaGuardia Airport (LGA) to the Park Ridge (New Jersey) Hotel

Low | 3 |
1. LGA to Garden State Parkway (GSP)
2. Take GSP Exit #172
3. Follow Mercedes Rd to Hotel

Medium | 6 |
1. LGA to Grand Central Parkway
2. Take Triborough Bridge to George Washington Bridge (GWB)
3. Take GWB to Route 80W
4. Take 80W to Garden State Parkway (GSP)
5. Take GSP Exit #172
6. Follow Mercedes Rd to Hotel

High | 9 |
1. LGA to Grand Central Pkwy
2. Take Triborough Bridge
3. Use lanes for Major Deegan Expressway
4. Look for George Washington Bridge (GWB) to New Jersey
5. Once across GWB, find route 80W
6. Take 80W to Garden State Parkway (GSP)
7. Take GSP North to exit 172, Park Ridge/Montvale (last NJ exit)
8. Take first right to Mercedes Blvd
9. Take next right to Brae Blvd (look for Hotel sign on left)

Fine (MapQuest approx. 1998) | 32 |
1. Start out going East on TERMINAL B C AND D towards TERMINAL B by turning left. 0.0
2. Stay straight to go onto TERMINAL C AND D. 0.
3. TERMINAL C AND D becomes TERMINAL C AND D. 0.0
4. Take the ramp towards PARKWAY WEST. 0.2
5. Keep LEFT at the fork in the ramp. 0.1
6. Merge onto GRAND CENTRAL PKWY W. 1.7
7. GRAND CENTRAL PKWY W becomes I-278 E. 1.0
8. I-278 E becomes I-278 E/TRIBOROUGH BRIDGE. 1.8
9. I-278 E/TRIBOROUGH BRIDGE becomes I-278 E. 0.3
10. Take I-87 NORTH RAMP towards UPSTATE. 0.2
11. Merge onto I-87 N/MAJOR DEEGAN EXWY. 2.8
12. Take the I-95/CROSS BRONX EXPWY exit 0.1
13. Stay straight to go onto ramp. 0.0
14. Stay straight to go onto ramp. 0.1
15. Keep LEFT at the fork in the ramp. 0.4
16. Merge onto I-95 S/CROSS BRONX EXWY/US-1 S. 0.1
17. Take I-95 S/CROSS BRONX EXWY towards UPPER LEVEL/BRIDGE. 0.1
18. …
32. Turn RIGHT onto BRAE BLVD. 0.0 (DESTINATION)

If you had done this trip many times, you might only need the
simple memory jog that the low level of detail provides. If you had
never been there and were unfamiliar with the highways, bridges,
and how they change names, a higher level of detail may be best.

The fine level of detail comes from MapQuest directions in
the late 1990s that break the journey into 32 steps. (The
capitalization, spacing, and distance of each leg of the trip appear
as they did when the directions were printed.) As you can see,
the extra detail can get in the way, and there is some redundancy in the
last example above (steps #2 and #3, steps #13 and #14). This was the
time before smart phones and widely available GPS systems. Think
about how practical this fine level of detail would be as you are
driving yourself in rush hour New York City traffic and it is getting
dark in the evening. Excessive detail does not make the procedure
better!

(By comparison, in August 2019, that journey was described in
MapQuest in 20 steps.)


The impact of excessive detail found in a procedure (in this
case, actually a checklist, which is somewhat different) was cited as
a possible contributing factor in the airplane crash investigation of
Swissair flight 111 in 1998. Investigators found that it would take
about 30 minutes for flight crews to complete all the activities in
the checklist once they first noticed smoke in the cabin. The plane,
however, crashed 20 minutes after the first report of the smoke
(Canada Transportation Safety Board, 1998).

Deciding upon the level of detail that is necessary and
appropriate depends on the knowledge and experience of the users,
the training that is or will be provided, and the other documents
and information sources (see below) available to the user. What
distinguishes one level of detail from another (beyond the number of
steps and substeps) are the words that are selected. As seen in Table
2, some verbs or action words are “macro” in nature; they include
many actions. Other words name much more discrete actions to be
taken. Generally, if the task being performed is routine, less detail
needs to be included in the procedure. This can be thought of in
terms of risk: if there are higher levels of uncertainty (for example,
when the step says “clean the room,” what am I really supposed to
do?), there are higher levels of risk present. Providing more detail
decreases that risk by reducing uncertainty on the part of the
procedure user but, at the same time, can complicate the written
document.

The level of detail in a procedure was discussed in the 1978
Preamble to the US FDA Current Good Manufacturing Practices
(CGMPs):

Some comments suggested deletion from §211.80(a) of the words “in
detail” because it would require the documentation of minutiae and
voluminous written procedures.

The Commissioner agrees that the phrase “in detail” could be construed
to include description of insignificant portions of the procedure, which
is not the intent. Therefore, he is inserting the word “sufficient” before
the word “detail” (FDA, 1978, comment #202, p. 45043).

Table 2. Examples of words used in different levels of detail

Low detail (more “macro,” complex actions) | Operate, Maintain, Investigate, Install, Audit, Calibrate
Medium | Start up, Disassemble, Review, Clean, Load, Sample
High | Connect, Enter, Adjust, Set, Record, Confirm
Fine detail (more “micro,” discrete, simple actions) | Turn on, Press, Write down, Move, Turn off, Brush

Some firms have the philosophy that the procedures should be
written at a level so “someone coming off the street” can successfully
perform the tasks. This is not reasonable; all personnel need to have
the “education, training, (or) experience” (FDA, 1978, p. 45078) to
perform their assigned tasks. There are very few positions in pharma/
biopharma where a complete novice can find all the information
they need to be successful in a written procedure.

THE INFORMATION ECOSYSTEM

For a pharma or biopharma firm with even a very basic quality
management system (QMS), procedures do not exist in a vacuum;
there are other documents that will usually support a procedure
and those performing a task. Table 3 shows how these are typically
arranged.

Table 3. Types of documents usually found in a pharma/biopharma
organization

Name of document | Description

Policies | The rationale, expectations, commitment, and connection with business goals and compliance. Covers the “what” and the “why.”
Procedures (SOPs), methods, protocols | Different levels of “what” and “how to”; stress rules, relationships, roles, responsibilities (“who”), and accountabilities; how to perform a task safely, efficiently, and effectively.
Work instructions | Detailed “how-tos” that apply to a particular job or team.
Job Aids | (Sometimes equivalent to work instructions.) Steps, critical steps, and tips; usually observable actions, often illustrations or signage.
Guidelines, guidebooks, playbooks | Recommended practices; could supplement training; sometimes similar to work instructions but with more flexibility.
Checklists | “Quick and simple tools aimed to buttress the skills of expert professionals” and embrace “a culture of teamwork and discipline” (Gawande, 2009).
Structured on-the-job training guides | Detailed process descriptions that integrate various procedures. Include key activities (“what”), critical steps/substeps (critical “hows”), and observable performance behaviors and outputs.

DO WE NEED A PROCEDURE FOR THIS?

Here are four key questions that can help determine if a procedure
is needed.

• Is a procedure required by the regulations?

• Is the task done by different people? At different times? In
different locations?
• Could the task directly or indirectly affect the safety, identity,
strength, purity, quality, or availability of the product if not
done correctly?
• Is the task important?


If a “yes” or “maybe” is the answer to any one of the questions, a
procedure should be prepared and used.
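
The decision rule above (any “yes” or “maybe” means a procedure is needed) can be written down explicitly, for example as part of a document-request form. This is a minimal sketch in Python; the class name, field names, and function name are hypothetical and chosen only to mirror the four questions.

```python
from dataclasses import astuple, dataclass

VALID_ANSWERS = {"yes", "maybe", "no"}


@dataclass
class TaskAssessment:
    """Answers to the four questions; each must be 'yes', 'maybe', or 'no'."""
    required_by_regulation: str
    varies_by_person_time_or_location: str
    could_affect_product: str
    task_is_important: str


def procedure_needed(assessment: TaskAssessment) -> bool:
    """Return True if any answer is 'yes' or 'maybe'."""
    answers = [a.strip().lower() for a in astuple(assessment)]
    for answer in answers:
        if answer not in VALID_ANSWERS:
            raise ValueError(f"unexpected answer: {answer!r}")
    return any(answer in ("yes", "maybe") for answer in answers)


if __name__ == "__main__":
    # Hypothetical task: only the product-impact question is uncertain.
    task = TaskAssessment("no", "no", "maybe", "no")
    print(procedure_needed(task))  # True
```

Treating “maybe” the same as “yes” reflects the conservative rule stated in the text: a procedure is skipped only when all four answers are a clear “no.”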

WHAT SHOULD A PROCEDURE LOOK LIKE?

When the US FDA was writing the 1978 version of the CGMPs, this
question was put to the agency, which answered in
the Preamble published with the final CGMP regulation:
“The Commissioner believes that, with relatively few exceptions, the
CGMP regulations do describe ‘what’ is to be accomplished and
provide great latitude in ‘how’ the requirement is achieved. For
example, written records and procedures are required, but FDA will
recognize as satisfactory any reasonable format that achieves the
desired results” (emphasis added, FDA, 1978, comment #3, p. 45016).

The most commonly seen format for written procedures is the
outline or “engineering” style. For example:

1. Setup of equipment
   1.1. Step 1
   1.2. Step 2
      1.2.1. Substep 1
      1.2.2. Substep 2
         1.2.2.1. Sub-substep 1
         1.2.2.2. Sub-substep 2
   1.3. Step 3
   1.4. Step 4

If you are using this format, do not use more than
four levels of indentation; more than that can cause the reader to
become disoriented.
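
The four-level limit is easy to check automatically when procedures are drafted in outline style. The following is a minimal sketch in Python, assuming that each step begins with a dotted number such as "1.2.2.1." and that the depth equals the count of number groups; the function names and the regular expression are illustrative assumptions, not part of any document-management tool.

```python
import re

MAX_LEVELS = 4  # limit suggested in the text

# Leading dotted step number, e.g. "1.", "1.2.", "1.2.2.1."
STEP_NUMBER = re.compile(r"^\s*(\d+(?:\.\d+)*)\.?\s")


def outline_depth(line: str) -> int:
    """Return the outline level of a step line, or 0 if it has no step number."""
    match = STEP_NUMBER.match(line)
    return len(match.group(1).split(".")) if match else 0


def steps_too_deep(lines, max_levels=MAX_LEVELS):
    """Return the step lines nested deeper than max_levels."""
    return [ln.strip() for ln in lines if outline_depth(ln) > max_levels]


if __name__ == "__main__":
    draft = [
        "1. Setup of equipment",
        "1.2.2.1. Sub-substep 1",
        "1.2.2.1.1. One level too many",
    ]
    print(steps_too_deep(draft))
```

A line of ordinary text that happens to start with a number is counted as level 1 by this sketch, which is harmless for the purpose of flagging steps that are nested too deeply.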

Another format is a modified playscript as shown in Table 4. If
one person does all of the tasks, the “who” column can be eliminated.

Table 4. Example of a column playscript format for a procedure

Step # | Who (job title) | What | Critical how | Caution, other info

A practice that is becoming much more common is using photos
(for example, in sequence if you are showing a process, or a photo of
how something is to be set up), particularly if the outcome is what is
important and not so much how it is accomplished.

REVISING A PROCEDURE AS A CORRECTIVE ACTION

Does a procedure need to be revised as a result of an unwanted event?
Here are two things to consider. First, unless the root, contributing,
or proximal cause(s) directly point to an inaccurate, incomplete, or
misleading procedure or something else points to an inadequacy in
the document, there probably will not be a benefit in revising
it. What is often seen is that something new is “bolted on” to the
procedure, making it even more complex. If there is a deficiency in
the procedure, instead of adding, consider how it can be simplified.
Could a drawing, flowchart, or photo be used to add clarity? What
about using a job aid to support the performer when and where they
are doing the task? Consider asking those doing the task what they
would recommend.

One of the arguments for not adding anything to a procedure
comes from James Reason (1990), who writes that most failures
involving people performing a procedure are due to their omitting a
step. Why then would you want to make a procedure longer, giving
people more things that they could potentially not do?

Second, you may want to apply risk-based thinking to the
situation. If the event has not occurred before and the impact of the
event was low, you might consider what else could be done other
than change the procedure. Was knowledge lacking? Is there another
way to provide it? Is just one person out of many having the problem
or is it a group of people? If, on the other hand, the unwanted event
proves to be part of a trend, that might be a reason to change
the document.

CHECKLISTS
In his book The Checklist Manifesto: How to Get Things Right, Atul
Gawande (2009) describes the very interesting history behind
checklists and what makes for a good one. He states, “Under
conditions of complexity, not only are checklists a help, they are
required for success” (p. 79). And, “Checklists can provide protection
against . . . elementary (simple) errors” (p. 50). Checklists are meant
to be tools for performance, not procedures or training guides.

Based on what he learned at Boeing, an aircraft manufacturer
that develops checklists for its planes, Gawande describes two
types of checklists. The first is “read-do,” a list of actions that are
to happen, which the person reads and then performs. This would be
appropriate for one person working at a task that has a moderate
level of criticality and also gives the person some level of flexibility.
At a certain point they would stop and ensure that the actions were
taken. The second type of checklist is a “do-confirm,” where actions
to be taken are again written down, but this time they are confirmed
by or to a second person. Tasks that have a high level of criticality
would benefit from this approach. (If you are flying on a commercial
airline, it is common to hear a flight attendant say, “Arm doors and
cross-check” with responses from other members of the flight crew
like “One-L, One-R.” This is an example of do-confirm.)

An effective checklist is practical, short, and simple, with five to
nine items. It is not meant to cover everything, and it cannot be long
(see the Swissair 111 situation above). The checklist should be tested
as it is being developed and refined based on experience. It needs to
be precise and easy to use under possibly difficult conditions.
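The length guidance above can be expressed as a simple review aid. The following is a minimal sketch in Python, not a validated tool: the `Checklist` class, the `review_checklist` function, and the default limits of five and nine items are illustrative names and values taken from the guidance in this section.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Checklist:
    """A checklist of one of the two types described above (hypothetical structure)."""
    title: str
    kind: str  # "read-do" or "do-confirm"
    items: List[str] = field(default_factory=list)


def review_checklist(checklist: Checklist, min_items: int = 5, max_items: int = 9) -> List[str]:
    """Return a list of findings; an empty list means nothing was flagged."""
    findings = []
    if checklist.kind not in ("read-do", "do-confirm"):
        findings.append(f"unknown checklist type: {checklist.kind}")
    count = len(checklist.items)
    if count < min_items:
        findings.append(f"{count} items is below the suggested minimum of {min_items}")
    elif count > max_items:
        findings.append(f"{count} items exceeds the suggested maximum of {max_items}")
    return findings
```

A check like this only looks at type and length. Whether each item is precise and usable under difficult conditions still has to be judged by testing the checklist with the people who will use it.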

Figures 1 and 2 are examples of checklists that were developed
with Dr. Gawande’s assistance. (An article published in The New
Yorker by Dr. Gawande that gives an overview of his checklist work
can be found at his website: atulgawande.com >> The Checklist.)

Figure 1. Checklist recommended by World Health Organization for
use in surgical procedures (WHO, 2009)

Surgical Safety Checklist

Before induction of anaesthesia (with at least nurse and anaesthetist)

• Has the patient confirmed his/her identity, site, procedure, and consent? (Yes)
• Is the site marked? (Yes / Not applicable)
• Is the anaesthesia machine and medication check complete? (Yes)
• Is the pulse oximeter on the patient and functioning? (Yes)
• Does the patient have a:
  Known allergy? (No / Yes)
  Difficult airway or aspiration risk? (No / Yes, and equipment/assistance available)
  Risk of >500ml blood loss (7ml/kg in children)? (No / Yes, and two IVs/central access and fluids planned)

Before skin incision (with nurse, anaesthetist and surgeon)

• Confirm all team members have introduced themselves by name and role.
• Confirm the patient’s name, procedure, and where the incision will be made.
• Has antibiotic prophylaxis been given within the last 60 minutes? (Yes / Not applicable)
• Anticipated Critical Events
  To Surgeon: What are the critical or non-routine steps? How long will the case take? What is the anticipated blood loss?
  To Anaesthetist: Are there any patient-specific concerns?
  To Nursing Team: Has sterility (including indicator results) been confirmed? Are there equipment issues or any concerns?
• Is essential imaging displayed? (Yes / Not applicable)

Before patient leaves operating room (with nurse, anaesthetist and surgeon)

• Nurse Verbally Confirms:
  The name of the procedure
  Completion of instrument, sponge and needle counts
  Specimen labelling (read specimen labels aloud, including patient name)
  Whether there are any equipment problems to be addressed
• To Surgeon, Anaesthetist and Nurse: What are the key concerns for recovery and management of this patient?

This checklist is not intended to be comprehensive. Additions and modifications to fit local practice are encouraged. Revised 1 / 2009 © WHO, 2009

Figure 2. Checklist for creating checklists (Checklistproject, 2010)

A CHECKLIST FOR CHECKLISTS

Development

• Do you have clear, concise objectives for your checklist?
• Is each item:
  A critical safety step and in great danger of being missed?
  Not adequately checked by other mechanisms?
  Actionable, with a specific response required for each item?
  Designed to be read aloud as a verbal check?
  One that can be affected by the use of a checklist?
• Have you considered:
  Adding items that will improve communication among team members?
  Involving all members of the team in the checklist creation process?

Drafting

• Does the Checklist:
  Utilize natural breaks in workflow (pause points)?
  Use simple sentence structure and basic language?
  Have a title that reflects its objectives?
  Have a simple, uncluttered, and logical format?
  Fit on one page?
  Minimize the use of color?
• Is the font:
  Sans serif?
  Upper and lower case text?
  Large enough to be read easily?
  Dark on a light background?
• Are there fewer than 10 items per pause point?
• Is the date of creation (or revision) clearly marked?

Validation

• Have you:
  Trialed the checklist with front line users (either in a real or simulated situation)?
  Modified the checklist in response to repeated trials?
• Does the checklist:
  Fit the flow of work?
  Detect errors at a time when they can still be corrected?
• Can the checklist be completed in a reasonably brief period of time?
• Have you made plans for future review and revision of the checklist?

Please note: A checklist is NOT a teaching tool or an algorithm

Last updated 1/14/10


CONCLUSION
Procedures are an important tool to document and communicate
information and requirements. The challenge is making them useful
tools that promote consistent, correct performance of the tasks.
Before changing procedures after an unwanted event, be sure that
the procedures caused or contributed to the deviation or that they
did not provide a mitigation strategy to protect the things of value.
Well-prepared checklists can help trained personnel perform their
tasks consistently, resulting in effective, efficient, and compliant
results.

REFERENCES
Canada Transportation Safety Board (1998) http://www.tsb.gc.ca/eng/rapports-reports/aviation/1998/a98h0003/a98h0003.html. Accessed 3 Mar 2020.

Checklistproject (2010) A checklist for checklists. www.checklistproject.org. Accessed 1 Sep 2012. (Site not available as of 3 Mar 2020.)

FDA (1978) Current good manufacturing practice in manufacture, processing, packing, or holding. Final Rule (U.S.) Federal Register, Vol. 43, No. 190. https://www.fda.gov/media/78493/download. Accessed 3 Mar 2020.

Gawande, A. (2009) The Checklist Manifesto: How to Get Things Right. New York, NY: Metropolitan Books.

Gawande, A. (2007) The checklist. New Yorker. http://www.newyorker.com/reporting/2007/12/10/071210fa_fact_gawande. Accessed 3 Mar 2020.

ICH (2000) Q7: Good manufacturing practice guide for active pharmaceutical ingredients. International Conference on Harmonisation. https://database.ich.org/sites/default/files/Q7_Guideline.pdf. Accessed 22 Feb 2020.

Reason, J. and Hobbs, A. (2003) Managing Maintenance Error: A Practical Guide. Boca Raton, FL: CRC Press, Taylor & Francis Group.


Reason, J. (1990) Human Error. Cambridge, UK: Cambridge University Press.

WHO (2009) Surgical safety checklist. https://www.who.int/patientsafety/safesurgery/checklist/en/. Accessed 3 Mar 2020.


TRAINING AS A CORRECTIVE ACTION

If “revise the procedure” is one of the most common corrective
actions, “train (or retrain) the operator, technician, or analyst” is
certainly one of the others. If a lack of knowledge and skills is the
root, proximal, or contributing cause, then yes, training should
be provided. But training is usually used as the default corrective
action, which often means that a good root-cause investigation was
not performed. Beware: auditors and health authority inspectors are
aware of this and, if in their judgment they see this action overused,
they will ask for evidence.

In this chapter we will assume that a lack of knowledge and
skills caused or contributed to the unwanted event, and we will
look at ways that training can be effectively used. Certain portions
of this chapter are adapted from “Creating a Culture and System for
Learning” by James Vesper in Phase Appropriate GMP for Biological
Processes, edited by Trevor Deeks (2018).

TRAINING AS PART OF A SYSTEM

As with procedures, training does not exist in a vacuum. For training
to be effective, other supplementary factors need to be included.
As shown in Figure 1, to achieve a goal, a number of things need
to occur. If everything works, the goal is achieved. More often,
however, there are barriers that arise or structures that need to be
addressed and strengthened in order to achieve the goal. Chapter 8
provides a detailed description of each of these levels.

Figure 1. Example of a performance model (Vesper, 1993)

In this model, knowledge and skills (training) were
intentionally listed last. Why? Because if the other elements are not
in place and functioning, simply providing knowledge and skills
will not result in a sustained performance.

TACIT AND EXPLICIT KNOWLEDGE

The goal of training is to provide the relevant knowledge and skills
so someone can accomplish a particular activity or task or make
the appropriate decision(s) given a situation. In doing this, we are
providing the learner two different types of knowledge: tacit and
explicit (Polanyi, 1962).

“Tacit knowledge” can include a variety of information
(concepts, sensory inputs, and images) that one uses to interpret
a situation or phenomenon. Other authors have expanded on this,
saying that tacit knowledge is highly personal, not easily shared,
and deeply rooted in action in a significant context (Nonaka, 1991).
“Explicit knowledge,” on the other hand, is formal, systematic, easily
communicated, and shared through written words, procedures,
specifications, formulas, and the like. Tacit knowledge is “knowing
how”; explicit knowledge is “knowing that” (Brown and Duguid, 2000).
Tacit and explicit knowledge (know how and know that) are often
needed together to accomplish a task or goal. A classic example
given is playing chess (Ryle, 1949): knowing the rules of the game
(the know that from explicit knowledge) doesn’t mean that one can
play the game that requires the know how from tacit knowledge. A
person learns to play chess by practicing with someone, usually
someone more knowledgeable and skilled. Polanyi (1966) gives
another example: riding a bicycle. One can create the complex
mathematical equations that describe the physics by which the rider
achieves and maintains balance, but that is not what the rider uses
to learn to ride; he or she tries again and again until
successful. It is by doing the task that the tacit information of
know how is acquired.

Recently, the neurological basis for this separation of know
that from know how has been described: knowing that is stored
in the medial temporal lobe and the temporal and frontal cortices
of the brain, parts that have become optimized for conscious
knowledge. Knowing how is encoded in the cerebellum and basal
ganglia, parts of the brain that excel in reflex response and
do not consciously express themselves (Marcus, 2012).

INSTRUCTIONAL METHODS: WAYS TO PRESENT KNOWLEDGE AND SKILLS
Assuming that your investigation and analysis have shown that
the cause(s) of the unwanted event is a deficit in the person’s
knowledge and skills, there are different ways to provide them the
information that is needed. One approach is content focused; the
other is competency focused.


Content-focused approach to training

Content training exists in many forms: leader-led presentations,
workshops, e-learning courses, and webinars are some of the most
typical examples. A common characteristic of these is that the learner
has little or no control of what they are learning or when it happens.
Rather, the training is dependent on the instructor or curriculum
developer. Most courses (high school and undergraduate university)
are designed this way: learners move through the courses as a cohort
with the goal of getting a grade indicating successful completion
of the course. In industry, these sessions are sometimes given in
large group settings; there is often no learner assessment conducted
(meaning no one really knows if or what the participant learned).

(With most asynchronous e-learning courses that learners
can take at their own time and pace, faster learners can move through
the material more quickly.)

Advantages of content-based training include that it is easy to
administer, it is “comfortable” to most people because we are all
familiar with it, and there is a level of control on the part of the
instructor or organization. A major drawback is that unless the
sessions are well designed with hands-on activities that involve
the participants (such as case studies, reflection, and group work)
and stimulate social learning (participants learning from each other,
discussed more below), content training is sometimes considered
“death by PowerPoint.”

One other variation of content-based training often used for
instruction on procedures is referred to as “read and understand”
(or as some say, “read and hope”). While it does have some utility,
“read and understand” is greatly overused, particularly in situations
when someone needs to correctly perform a cognitive task (e.g.,
thinking through a problem like determining the significance of
a temperature excursion) or a motor task (e.g., packing a shipping
carton with cool packs using a diagram). The limitations of the “read
and understand” approach were described by Michael Polanyi
(1966), who wrote about knowledge management, saying “We know
more than we can say; we say more than we can write.” In practical
terms this means that if you are getting trained by only reading a
procedure, the knowledge that you are acquiring is limited to what
has been written on the page.

Usually “read and understand” standard operating procedure
(SOP) training requires that the learner read the SOP prior to seeing
the task performed. This is a mistake, particularly for a person who
is new to an area or technology. The problem is that without any
experience, they do not have a mental model or a picture in their
mind of the equipment, instrument, or process. A better practice is
to have the learner watch a skilled person correctly perform the task,
perhaps with someone else describing it to the novice (this could be
done live or using video) and then have the learner read through the
procedure while they mentally map what they are reading to what
they have seen.

Competency-based training approach

Competency-based training is used in a variety of professional and
vocational development programs as a way for a person to achieve
the knowledge and skills needed to safely and effectively perform
the tasks required in a given role or position. The competency
approach differs from the content-based training (or academic)
approach described above in that the goal of the learner is not just
to complete a course, receive a grade, and then advance to the next
level (or perhaps graduate or receive a certificate), but rather to be able
to achieve a defined level of performance.

Competency-based training is not concerned with groups but
rather with the individual: what can be done to help this person
optimally achieve the intended goal. For some it might require more
coaching and practicing. For others, less. Often, structured on-the-
job training (discussed below) is integrated with a competency-
based approach that gives a level of confidence that the person can
successfully perform the task.

When assessing if a person has acquired the knowledge and
skills, a pass/fail scoring system is used with criteria that can be
readily observed. Additionally, the task is broken down to include
the critical success factors for the task, which makes it easier to
remediate an unsuccessful performance. For critical tasks (experts
or trainers must define what criticality is for a task in a given
context), it may be useful for the person to complete multiple
successful performances.
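The assessment logic described above can be sketched in a few lines of Python. This is an illustration only: the names `Observation` and `is_competent` are hypothetical, and the rule that the most recent attempts must all be passes is an assumption, since the text says only that multiple successful performances may be useful for critical tasks.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Observation:
    """One observed attempt: each critical success factor is scored pass (True) or fail (False)."""
    results: Dict[str, bool]

    def passed(self) -> bool:
        # An attempt passes only if at least one factor was scored and every factor passed.
        return bool(self.results) and all(self.results.values())

    def failed_factors(self) -> List[str]:
        # The failed factors point to what needs remedial coaching or practice.
        return [name for name, ok in self.results.items() if not ok]


def is_competent(observations: List[Observation], required_passes: int = 1) -> bool:
    """True once the most recent `required_passes` attempts all passed (assumed rule)."""
    if required_passes < 1 or len(observations) < required_passes:
        return False
    return all(obs.passed() for obs in observations[-required_passes:])
```

For a critical task, a trainer might set `required_passes` to 2 or 3. Because each attempt records which factor failed, remediation can focus on that factor rather than on repeating the whole training.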

When developing a competency model, there are certain
competencies that are universal in a GxP environment. For
example, recordkeeping techniques incorporating the ALCOA-plus
specifications that result in reliable and trustworthy records must
be used by those in product development, manufacturing, and
distribution. Other competencies are more job and role specific. Table
1 provides examples of competencies for someone who performs
root-cause analyses and writes investigation reports.

Table 1. Examples of competencies for a root-cause investigator and report writer

• Proficient in using a variety of methods and tools for root-cause analysis.
• Applies critical thinking in the course of doing investigations and analyses.
• Develops relationships with personnel at all organizational levels.
• Strategically approaches an investigation to make the best use of time.
• Keeps an open mind while doing an investigation and avoids biases that could
interfere with it.
• Identifies corrective actions that are aligned with the root, proximal, and
contributing causes.
• Writes clear investigation reports that are technically and grammatically correct.
• Can assess the criticality of an event and the need to inform leadership and
request needed resources.
• Investigates and writes up the report to meet the expectations of health
authorities, management, technical experts, and operations leadership.

Advantages of using competency-based training include
that it is specific to what the learner/performer does in his or her
job; unnecessary content is not included. Also, it is focused on
the individual with the goal of helping the individual succeed;
adjustments are made as needed to provide remedial coaching and
extra practice time. Even with this individual focus, a
competency-based approach that includes structured on-the-job training will
usually take less time in moving someone from a novice to a
confident, competent performer. Drawbacks of competency-based
training include that it takes time and expertise to initially identify
what the competencies of a position are and to establish the learning
plans for personnel. This approach also requires more active
involvement from instructors, coaches, and mentors.

Using stories in learning events

There are a variety of ways of presenting content—activities, games,
stories, demonstrations, and video. Stories are one of the most
powerful ways of conveying information. At a conference some
years ago, one of the leading figures in information technology and
education said, “We are a campfire people. For generations, we
have sat around campfires telling each other important stories of
who we are and where we have come from.” If the intent of a story
is to encourage someone to do the right, correct thing, a story with
a happy ending is appropriate—for example, how preparing an air
shipment properly ensured that a vaccine needed in an emergency
arrived at its destination safely and helped prevent a major outbreak.
On the other hand, if the intent of the story is to warn someone about
what not to do, the story should have a negative conclusion—a
horror story, in other words. For example, a technician doing a
bubble-point filter integrity test didn’t fully understand it and wrote
down results that did not make sense. He was finishing the sterile
filtration of the liquid when a quality unit person looked at the batch
records and realized there was a major problem. At this point, the
lot could not be saved; a $1 million batch of biological product had
to be scrapped.

Social learning
One of the most powerful ways that we learn is through social
learning: we learn by watching and listening to others. In the 1930s,
the Russian researcher Lev Vygotsky examined how infants learn
language. They don’t use books or formal instruction; rather, they
are dependent on those who know more (parents, older siblings,
and the like). Vygotsky (1978) used the term “more knowledgeable
others” when referring to these people.

Social learning can be used with content- or competency-based
approaches. It can (and should) be a part of any of the learning
approaches described here.

The “more knowledgeable other” may be someone designated as
a trainer or supervisor, or it may be a peer with a bit more experience
who informally shows the novice how to do a task or where certain
information can be quickly accessed. Generally, this is viewed as
a very positive side of teamwork. Occasionally, however, learning
from others in a workgroup can be a negative, such as when a long-
experienced (and often respected) incumbent tells the novice, “Yes,
I know the procedure says that, but here is a faster way to do it . . .”

Structured on-the-job training

Structured on-the-job training (S-OJT) can be combined with other
approaches (for example, e-learning) that can supply foundational
information-type knowledge; it is particularly complementary with
competency-based training. While most often used with tasks that
are motor-based (e.g., packing a shipping container, preparing a lab
sample for chemical analysis), S-OJT is also appropriate for cognitive
skills, such as reviewing the output from a data logger or reviewing
batch records in anticipation of product release.

On-the-job training (OJT) that is not formally structured is
one of the oldest methods of providing someone with knowledge
and skills. An apprentice would learn in stages and through a
combination of observation, performance, and trial and error,
with the master identifying what went wrong and what could be
improved. The apprentice would move on to the next level based on
time spent and if the master was satisfied. (There were also financial
and social factors that came into play here.)

When OJT is not conducted properly, it can be a very passive
activity, as connoted in the British phrase for it, “sitting next to
Nelly.” S-OJT, on the other hand, is done using a formalized approach
with specially trained instructors using a standardized outline that
includes safety and compliance topics, what can typically go wrong,
critical steps, and the like. Table 2 shows a more complete list of
what is usually included. S-OJT trainers need to have the skills (and
patience) to perform the task and also to coach the learners as they
develop their competency.

Table 2. Topics/sections typically included in structured on-the-job training

• Goal of the S-OJT activity
• Intent of the process
• Specific outcomes/outputs
• Context/background information
• Prerequisites
• Environmental, health, and safety information and requirements
• Specific regulatory requirements
• Equipment/materials/documents needed in task performance
• Key points to know and remember
• What can typically go wrong (e.g., defects)/how to detect/what to do
(troubleshooting)
• Learning guide with steps, substeps, and output criteria
• Performance assessment (with criteria) and knowledge questions to be asked
• Confirmation of successful performance (statement and signatures, dates)

Typical (i.e., unstructured) OJT has several serious potential
deficiencies. Since the task’s steps and substeps (the whats and the
hows) are not defined in a structured training guide and because
there can be several trainers, inconsistencies occur in what is told
and shown. Also, OJT trainers may not be competent in the task itself
or in the complementary communication and coaching activities
that are needed. Finally, demonstrations of the learner’s successful
performance are not typically required or are done without
standardized criteria of what constitutes acceptable performance.


In contrast, S-OJT is a planned, systematic way of providing
knowledge and skills and includes (adapted from Jacobs, 2003):

• A criteria-based decision on when S-OJT should be used, based
in part on the complexity of the task, its criticality (i.e., what
failure would mean), and the difficulty someone would have in
learning the task.

• S-OJT trainers who are selected in part because of the skills they
have in performing the task and in providing feedback and
coaching to the learner.

• S-OJT units that are developed using a structured, templated
approach along with people who are exemplary task performers.

• S-OJT units that are presented to the learners using a rational
plan that sets the learners up for success (for example, moving
from simpler subtasks to more complex ones). This includes
demonstration, explanation, and having the learner try the task,
with feedback and coaching provided from the trainer.

• Assessment of learner performance according to predefined
criteria that would include quantity and quality attributes.
This is done without the trainer saying anything unless there is
danger to people or equipment or facilities.

What type of skills training should be used?

If you need to provide knowledge or skills training for personnel,
what approach should you use? Table 3 identifies four types of
training and suggestions of when they may be most appropriate.

The coversheet in the OJT and S-OJT approaches would include
the details provided in Table 2 above. The headings for a specific
learning guide in the S-OJT approach are shown in Table 4.

Table 3. Four types of training and when they may be most useful

Read and understand only (R&U)
Method: Only method used

• SOP describes an administrative task
• SOP has clear rules that do not need an explanation
• Words on the paper say it all; no real value in saying more
• SOP supports a template or form that needs to be filled out
• Form or template is online and uses a real-time edit for correctness
• When risk of failure is relatively low
• When the what (output) is more important than the how (method)
• When the SOP can be available for reference
• When the equipment or instrument has an interface that guides the user

Instructor-led training (ILT)
Method: With instructor guide or detailed PPT slides

• For knowledge-based tasks
• When explanations are important to understanding the task
• When the learner needs help in discerning actions or options to take
• When there are gray areas that need to be distinguished
• When conveying rationale is important
• For tasks that are independent of the environment where they are performed
• When social learning (learning from others) is valuable
• When creating relationships (learner–learner or learner–instructor) is important

On-the-job training (OJT)
Method: Unstructured; use of a generic template AND a specific task cover sheet

• When some variation in performing the task is acceptable
• When job aids, guides, photos, videos, etc. are easily available to person when performing task
• When outcome is important but not critical
• When instructors or mentors have significant experience that they can convey
• When task is not too difficult
• When the equipment, instrument, etc. is very or somewhat intuitive; a helpful user interface

Structured on-the-job training (S-OJT)
Method: Uses task coversheet AND a specific, valid learning guide

• When outcome AND method (how it is done) are both critical
• When variation in task performance is not permitted
• When job aids, guides, photos, videos, etc. are either not available or not able to show critical actions
• For complicated tasks
• When there are a large number of personnel who need to be trained
• When a high degree of performance standardization is required
• When the experience, expertise level of instructors is moderate
• When conveying (through practice) tacit knowledge is important
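
A few of the criteria in Table 3 can be turned into a rough first-pass screening aid. The Python sketch below is a deliberate simplification and not part of the original table: the function name, the five yes/no inputs, and especially the order in which the rules are applied are assumptions made for illustration. The full table, and the judgment of people who know the task, should drive the actual decision.

```python
def suggest_training_type(
    outcome_critical: bool,
    method_critical: bool,
    low_risk: bool,
    needs_explanation: bool,
    knowledge_based: bool,
) -> str:
    """Return "S-OJT", "R&U", "ILT", or "OJT" using a simplified reading of Table 3."""
    # S-OJT: outcome AND method (how it is done) are both critical.
    if outcome_critical and method_critical:
        return "S-OJT"
    # R&U: risk of failure is relatively low and the SOP needs no explanation.
    if low_risk and not needs_explanation:
        return "R&U"
    # ILT: knowledge-based tasks, where explanation and rationale matter.
    if knowledge_based:
        return "ILT"
    # OJT: remaining hands-on tasks where the outcome is important but not critical.
    return "OJT"
```

For example, a filling-line aseptic intervention, where both outcome and method are critical, returns "S-OJT", while a low-risk administrative SOP with clear rules returns "R&U".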

Table 4. Headings and an example of an S-OJT learning guide

Learning guide: Perform environmental monitoring sampling

Headings: Step/sub, What, How, Criteria, Photo

Step 1
  What: Label samples
  How: Write reference map location and description on sample labels
  Criteria: All required information is correct and included; ALCOA+
  Photo: (Photo shows example of a properly filled out label on a petri dish)

Step 2
  What: Place samplers in correct locations
  How: Use locations defined by map
  Criteria: Location matches map, description, and sample label
  Photo: (Photo shows image of map)

Step 3… (Guide continues…)

ASSESSMENT AND EVALUATION OF THE LEARNING

To have confidence that the learners have acquired the needed
knowledge and skills, an assessment of that learning is conducted at
the end of the training event. You also want to have confidence that
personnel are applying and using the knowledge and skills back in
their workplace—something called transfer. Assessment of transfer
is usually done 4–12 weeks after the training event. (Chapter 15
discusses this in more detail.)

CONCLUSION
More often than not, training is given as a corrective action to a
quality event or deviation, particularly if a true or probable root
cause has not been found. Training is a powerful tool to improve
knowledge and skills if the investigation shows that they are lacking.
Otherwise, you will be wasting time and resources. If the problem is
performance related, solving it needs to consider human factors as
well as other contributors discussed in other chapters.


REFERENCES
Brown, J.S. and Duguid, P. (2000) The Social Life of Information. Boston, MA: Harvard Business School Press.

Deeks, T., ed. (2018) Phase Appropriate GMP for Biological Processes. Bethesda, MD: PDA/DHI.

Jacobs, R.L. (2003) Structured On-the-Job Training, 2nd edn. San Francisco, CA: Berrett-Koehler Publishers, Inc.

Marcus, G. (2012) Guitar Zero: The New Musician and the Science of Learning. New York, NY: Penguin Press.

Nonaka, I. (1991) The knowledge creating company. Harvard Business Review, 69(6):96–104.

Polanyi, M. (1966) The Tacit Dimension. Garden City, NY: Doubleday and Company.

Polanyi, M. (1962) Tacit knowing: Its bearing on some problems of philosophy. Reviews of Modern Physics, 34(4):601–616.

Ryle, G. (1949) The Concept of Mind. Middlesex, UK: Penguin.

Vesper, J. (1993) Training for the Healthcare Manufacturing Industries: Tools and Techniques to Improve Performance. Buffalo Grove, IL: Interpharm Press (now CRC Press).

Vygotsky, L.S. (1978) Mind in Society: The Development of Higher Psychological Processes. Cambridge, MA: Harvard University Press.


CORRECTIVE ACTION EVALUATION AND EFFECTIVENESS CHECKS

If you have identified the root cause(s) and put in place the
appropriate corrective actions, the next step is to have confidence
that your actions have been successful. Evaluation and effectiveness
checks are a way to do that. And, even if you show that the efforts
did not have the intended positive impact, you can use this as an
opportunity to gain information and knowledge.

There are several different ways to develop confidence in the
effectiveness of your solution. These can be done at different times.

One of the best ways to build a comprehensive plan to check
effectiveness is to follow the advice of the poet T.S. Eliot: “The
end is where we start from.” In other words, it is advisable to think
about your evaluation strategy as you begin your CAPA implementation.

FORMATIVE AND SUMMATIVE EVALUATION

Evaluation has two different phases: formative and summative.
Formative evaluation is done while the project is still in development.
For example, if you are trying different designs of data collection
sheets or user interfaces on a notebook application, in the formative
evaluation phase, you are collecting feedback to make improvements
in the tool. This can be accomplished by getting feedback from users
and customers or by using a more quantitative approach if errors or
successes can be counted—for example, how many documentation
errors have typically been seen on batch records using the old style
versus the new style. Formative evaluation can provide confidence to
stakeholders on the specific solution before it is put into widespread
practice.

Summative evaluation is used in making a decision: Do we want
to implement this new format or new interface to all departments and
all sites? Summative evaluation is used following the development
of the tool, process, or project that included formative evaluation.

TIMING AND METHODS FOR EFFECTIVENESS CHECKS

When and how would you do effectiveness checks? There are no
clear rules, but what follows are some general guidelines that you
can consider and apply.

One consideration that would shape your effectiveness check
plan would be the criticality of the unwanted event and the corrective
action. The larger the scope, the more complex the action, and the
more significant the impact of failure, the more formal and robust
the effectiveness check should be.

For checks that will be more significant or complicated, putting
together a plan may be useful. This would include things such as:

• What will be measured? Is it the symptom of the problem (which,
if it occurred, could be due to a different root cause) or the
identified root cause(s)?

• Where will it be measured? This could include the specific place
in the process or activity as well as the location of the site(s) or
department(s).

• When will it be measured? Will this be done upon completion
of implementing the new solution or after a time when people
have become familiar with using it, or both?

• How will it be measured? What is the approach that will be used?
Surveys? Interviews? Objective, quantifiable (e.g., countable)
evidence? Is there a standard, appropriate tool for measuring
a specific attribute? What about sampling (type of sample,
such as random or convenience, as well as number of sampling
points)?

• How will the measurements be analyzed? What sort of statistical
treatment, trend analysis, comparisons, and the like would be
valid and provide more information? Who will do the analysis?

• Who will perform the measurements? Might there be some
unintentional bias that could be present, and if so, is that a
problem?

• Who will the results be communicated to? Management,
vendors, other stakeholders (including personnel), or a health
authority?
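
The planning questions above can be captured in a simple structured record so that no element is overlooked. The following is a minimal sketch in Python; the class name `EffectivenessCheckPlan`, its field names, and the sample values are illustrative assumptions and are not taken from any standard or quality system.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class EffectivenessCheckPlan:
    """Hypothetical record mirroring the planning questions listed above."""
    what_measured: str = ""     # symptom, or the identified root cause(s)
    where_measured: str = ""    # process step, site(s), department(s)
    when_measured: str = ""     # on completion, after familiarization, or both
    how_measured: str = ""      # surveys, interviews, countable evidence, sampling
    how_analyzed: str = ""      # statistical treatment, trending, comparisons
    who_measures: str = ""      # person or role collecting the data
    results_audience: str = ""  # management, vendors, personnel, health authority

    def unanswered(self) -> List[str]:
        """Return the names of planning questions that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative values only.
plan = EffectivenessCheckPlan(
    what_measured="Documentation errors per batch record (symptom)",
    where_measured="Packaging line 2",
    when_measured="8 weeks after the new form is issued",
)
print(plan.unanswered())
```

A record like this could be reviewed before the corrective action is implemented; any names returned by `unanswered()` point to parts of the plan that still need a decision.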

Qualification and validation

As mentioned in a previous chapter, qualification and validation
are really not corrective actions. Instead, they give you confidence
that the solution works. In some cases, this could show that a new,
improved, or changed piece of equipment or setting works—for
example, changing a vial cleaning process because of a higher level
of particle contamination on the incoming glass containers.

A guideline (GHTF, 2010) to corrective actions used in the
medical device industry divides the effectiveness checks into two
parts, verification and validation:
Verification: Confirmation through provision of objective evidence
that specified requirements have been fulfilled.

Note 1: The term “verified” is used to designate the corresponding
status.

Note 2: Confirmation can comprise activities such as:

• Performing alternative calculations.

• Comparing a new design specification with a similar
proven design specification, undertaking tests, performing
demonstrations, and reviewing and approving documents
prior to issue.

Validation: Confirmation through provision of objective evidence that
the requirements for a specific intended use or application have been
fulfilled.

Note 1: The term “validated” is used to designate the corresponding
status.

Note 2: The use conditions for validation can be real or simulated.

Under most quality systems, the verification step as defined
above would be done in conjunction with change control. Validation,
in which you are more concerned that outcomes are consistently
achieved, would be aligned with the effectiveness check.

Examples of approaches for evaluation and effectiveness checks
As mentioned earlier, the specific approach or approaches that are
selected need to be thoughtfully aligned with the criticality, scope,
and impact of the unwanted event and the complexity and timing of
the solution (i.e., the corrective action); however, here are examples
that may be helpful. Some of these approaches can be combined to
give a more robust evaluation.

• Pre/post or before/after: If you have data describing the
situation before the new approach was implemented, this could
be compared to data collected after the solution was put in
place. Time should be given between implementation and data
collection to give those using the new solution time to become
comfortable with it. While less reliable due to potential biases,
self-reports of those doing a task or process could describe the
benefits they see in the new approach.
• Recurrence: This is the most typical way to look at effectiveness:
did the corrective action “turn off” the problem? If you had a
very specific root cause and a tightly aligned corrective action,
this will be a useful approach if the unwanted event occurred
frequently enough to be measured in a reasonable amount of
time. If looking for recurrence, you need to differentiate what
you are looking for: root or proximal cause(s) or, alternatively,
the symptom by which the problem is observed.
• Indirect measures: While not proof, indirect measures can
provide evidence concerning the success of the corrective
actions.
• Audits: This would include looking for continued effective
implementation of a corrective action when auditing contractors,
key vendors, and other facilities.
• Spot checks: Periodic random looks to ensure compliance with
a corrective action. For example, checking the training records
of 5–10 personnel in a department each quarter to ensure areas
are meeting their training requirements.
• Trend analysis: Looking at a large number of results over time
or a given scope to look for shifts or patterns.
• Complaints: Specific feedback from internal or external
“customers.”
• Near misses: If you have a situation where there is a high level
of detection before the harm occurs, detecting the event can be
useful. Sometimes these are called “near misses.”
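
As one illustration of the pre/post approach in the list above, countable results from before and after the change can be compared with a simple statistical test. The sketch below uses a two-proportion z-test with a normal approximation; the function name and the counts are hypothetical, and a statistician may well recommend a different method for small samples or rare events.

```python
import math


def compare_error_rates(errors_before, records_before, errors_after, records_after):
    """Two-proportion z-test comparing error rates before and after a change.

    Returns (rate_before, rate_after, z, two_sided_p).
    Assumes at least one error and at least one error-free record overall,
    otherwise the standard error is zero and the test is undefined.
    """
    rate_before = errors_before / records_before
    rate_after = errors_after / records_after
    pooled = (errors_before + errors_after) / (records_before + records_after)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / records_before + 1 / records_after))
    z = (rate_before - rate_after) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return rate_before, rate_after, z, p_value


# Hypothetical counts: batch records with at least one documentation error.
before, after, z, p = compare_error_rates(30, 200, 12, 200)
print(f"before={before:.1%} after={after:.1%} z={z:.2f} p={p:.4f}")
```

A small p-value suggests the drop in the error rate is unlikely to be chance alone, but it does not by itself show that the corrective action caused the improvement.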

If the unwanted event does occur again, it does not necessarily mean
that the corrective action taken was not effective. It could mean that
the same symptom appears (the event) but another root cause and
failure sequence are active—they were not uncovered or addressed
earlier. For example, my car did not start (the symptom) due to a
battery problem last month and I replaced the battery (the valid root
cause). Now my car won’t start, but it is because I am trying to start
it with the wrong key: same symptom but different root cause.

Timing
As you consider your corrective action solution(s) to the problem,
you may decide that a one-time effectiveness check is sufficient or
that multiple checks over a defined period are more appropriate.
If sustainability is important or if something could affect ongoing
usage, doing more periodic checks would be important, at least
for a period. You have the option for determining and setting the
frequency of these. When setting the timing, be sure to not be overly
optimistic.

DOCUMENTING THE EFFECTIVENESS CHECKS

You want to be able to have evidence to show that your actions
have been taken and (hopefully) that they have been effective. If
the correction and corrective actions were processed through
your change management quality system element, it should have
provisions for these checks. The approach you use to record and
track the corrections (e.g., paper-based, electronic) needs to have the
capability of documenting the checks.

A consideration when doing this is not to keep the investigation
in an “open” status when effectiveness checks are being performed.
If an auditor or inspector reviewed lists of deviations, this could
make it appear that the firm is taking an excessive amount of time in
completing the needed actions.

EVALUATION AND EFFECTIVENESS CHECKS RELATED TO TRAINING AND PERFORMANCE
If the investigation pointed to a true lack of personnel knowledge
and skills, there are two particular ways that this can be examined.
At the end of the training session, there can be an assessment of
learning, sometimes called a Level 2 assessment (Kirkpatrick, 1994;
Kurt, 2016). The questions you want to answer are, “What did
the learner learn? What did the learner not learn?” This type of
assessment is conducted at the end of the training, using a method
that is as close to how the person uses the information or skill as
possible. If training is provided using structured on-the-job training
(S-OJT) as discussed in Chapter 14, a demonstration of performance
is included in that method. As the person is performing the task, the
instructor can ask questions that are listed on the S-OJT worksheet
such as:

• Why is this task important?

• What are the things that can go wrong when performing this
task?

• What would give you an indication that something is starting
to go wrong?

• What would you do should something start to go wrong?

If the task is more of a cognitive-type task (such as reviewing lab
notebooks or writing an investigation report), having the person
correctly do that task will give you a level of confidence that they
have gained the needed knowledge and skills.

Level 2 assessments give you information as to whether the
person has the knowledge and skills at the end of the training event.
What you do not know, however, is if they can actually do the task
back in their job, under actual conditions of use. Level 3 assessments
(Kirkpatrick, 1994; Kurt, 2016) conducted 4 to 16 weeks after the
training event provide this information. Here you are asking the
question, “Is the person effectively/correctly applying the knowledge
and skills as they perform the task?” This is where a host of other
factors can influence performance: other personnel in the work
area (maybe those with more experience) who might be violating
the procedure by taking shortcuts, not having the appropriate tools
available, inadequate feedback from supervision, work conditions
(high levels of noise, distractions, and stress) that interfere with
good performance, and other factors. These assessments should be
done without being preannounced—you want to get as much of a
real-life view of the situation as possible.
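
The 4-to-16-week window for a Level 3 assessment can be worked out from the training date. The following is a small sketch using Python's standard `datetime` module; the function name `level3_window` and the example date are assumptions made for illustration.

```python
from datetime import date, timedelta


def level3_window(training_date, min_weeks=4, max_weeks=16):
    """Return the (earliest, latest) dates for a follow-up assessment."""
    earliest = training_date + timedelta(weeks=min_weeks)
    latest = training_date + timedelta(weeks=max_weeks)
    return earliest, latest


# Hypothetical training date.
earliest, latest = level3_window(date(2020, 3, 2))
print(earliest, latest)
```

Because these assessments should not be preannounced, a window like this would typically be visible to the assessor or scheduler only, with the actual date chosen somewhere inside it.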

If there was learning (a successful Level 2 assessment) but not
proper transfer into the workplace, consider using a performance
checklist or the TWIN error precursors discussed in Chapter 14 to
identify what is interfering with performance.

A CAUTION
As you think of metrics, you need to consider what inappropriate
behaviors they can potentially produce. For example, the head of a
manufacturing site had “decreased deviations” as one of his major
goals. Can you think of several ways this could be accomplished?
One might be fewer people reporting deviations. A metric of
closing out investigations within 30 days can result in inadequate
investigations that do not find the true root causes and that often
have “retrain personnel” as the principal corrective action.

CONCLUSION
Effectiveness checks provide confidence that the actions taken were
effective in correcting the underlying problem and preventing
a recurrence. There are a number of ways that this can be done
depending on the significance of the unwanted event and the
potential impact that the event could have on the product and
patients. A failed effectiveness check often means that the actual
root causes were not identified or that the actions taken were not
adequate. When several different methods are used at different time
points following the corrective actions, an organization can have
a high level of confidence in the fixes that were applied.

REFERENCES
Kirkpatrick, D.L. (1994) Evaluating Training Programs: The Four Levels.
San Francisco, CA: Berrett-Koehler.

Kurt, S. (2016) Kirkpatrick model: Four levels of learning
evaluation. Educational Technology, October 24, 2016.
https://educationaltechnology.net/kirkpatrick-model-four-levels-learning-evaluation/.
Accessed 5 Mar 2020.

WRITING THE REPORT

If you ask someone from a pharma or biopharma manufacturer what
they produce, they will quickly say, “medicine” or “tablets,” or they
may give the names of their top-selling products. Only after more
prodding might someone say “data,” “information,” or “documents.”
What they and many in our industry do not understand is the
value that information coming from an investigation can have in
improving processes, products, and systems, immediately as well
as in the future.

Q10, the International Conference on Harmonisation (ICH)
Guideline on Pharmaceutical Quality Systems (2008), describes
a modern quality management system (QMS), with quality
risk management and knowledge management as enablers or
foundations of the QMS (Figure 1). Q10 defines knowledge
management as the “systematic approach to acquiring, analysing,
storing and disseminating information related to products,
manufacturing processes and components” (p. 14). If there can be
a “bright side” or positive element in a failure, it is that you have
now acquired new and additional information about the product
or process. This could include identifying a new failure mode or a
control that did not perform as expected. It is important that this
information be captured; reports are one way to do that.

In this chapter, we will look at some general considerations
when writing a report, ways of presenting the information, and the
topics in an abbreviated and an extended report.

Figure 1. Components of a pharmaceutical quality system according
to Q10 (ICH, 2008)

GENERAL CONSIDERATIONS OF AN INVESTIGATION REPORT
Your report may be read by a variety of people (or “audiences”)
who are looking for answers to their specific questions. Table 1
shows some audiences with a sample of questions they could have
concerning a quality event or deviation.

Table 1. Readers of investigation reports and questions they want answered

Audience: Area/department involved in incident
• What happened?
• Why and how did it happen?
• If something was affected, can it be saved (reworked, reprocessed)?
• Is there a disciplinary action that should be taken?
• What is the impact to the schedule/other products?

Audience: Management
• What are the business implications of the incident?
• What does this mean for product availability?
• What are the compliance or regulatory implications?
• How bad was it?
• How will it be fixed?

Audience: Quality
• Can products involved be released?
• What are the implications to other products and processes?
• What are the compliance or regulatory implications?
• What does this mean for product availability?
• Was the investigation adequate?
• Was the root cause found?
• Were scope and impact properly assessed?

Audience: Health authority (inspectors)
• What does this mean for patient safety?
• Was the root cause found?
• If the cause wasn’t found, how will patients be protected if it
happens again?
• How might this affect drug availability?
• Is the process/facility (still) in a state of control?
• Were scope and impact properly assessed?
• Is this incident a recurrence or part of a trend?
• Are other products or lots in the market potentially affected
by this same issue?

Formats
There is no single standard for what a report looks like. Reports can
and do vary in a number of ways, including the complexity of the
investigation and the type of system (paper-based or electronic)
used for documenting. For example, some firms (often smaller,
early start-ups) use paper-based forms for simple investigations
and narrative reports for more complex ones. Other firms use an
electronic system (TrackWise® is frequently seen; other database
options also exist) that has designated fields one fills in. Other
organizations use a combination—putting summaries into the fields
that come from the written report; a PDF version of the report is
uploaded into the application.

The type, complexity, and length of the report should be based
on the significance of the event or incident. As with the investigations
discussed earlier, there is no “one-size-fits-all.” Some situations
may just require a note in a batch record or notebook, others may
have a one-page incident report, while others (hopefully, a small
percentage) would need an extensive, more formal report.

Including your rationale and decision-making process

A key question that a reader of your report, be it short and simple or
long and complex, has is, “How do you know?” How do you know
that materials were not affected by the deviation? How do you
know that the scope of the incident was limited to three batches?
Being able to provide an information-based justification for your
conclusions is essential.

To name or not to name

When writing the report, do you include the names of those involved
in the unwanted event? Here, industry practice varies considerably.
National laws, particularly in Europe, and also labor contracts may
spell out what can and cannot be done. Here is the range of options
that can be seen:

• No names are used: Only the roles are included (e.g., production
supervisor, laboratory analyst, materials mover). In some cases,
firms will have a confidential annotated copy that does have
names that can be used to link the incident with reviews of
training records, qualifications, and corrective actions.

• Unique identifiers are used: This can range from unique initials
of the person (e.g., GAB) to the whole or partial employee
number (e.g., X-1140).

• Full names: Firms that use this approach say that this is meant
to hold people responsible for their actions. This, however, does
seem to be placing blame on those at the “sharp end” of the
incident. (The use of identifiers like full name may be subject to
national labor laws or contracts.)

Word choice
When writing the report, one needs to be careful about what
words to use. Words that are imprecise or vague (“probably” or
“typically”) should not be used, or if they are, should have evidence
to support them. For example, saying “most probable root cause”
would need to be supported by what was considered and ruled out
(and why) and why this likely cause should be considered to be the
most probable one.

As you are writing, you are not just telling the story and providing
facts; you also need to manage the anxiety or fear of the reader. For
example, instead of saying “massive contamination,” provide a
quantitation of contamination: “30 to 40 percent of the batch showed
visible particulate matter.” Be aware that some reviewers have their
own list of “trigger words” that are to be avoided.

Having a writing style that is clear and simple is important. If
you can convey the information without a lot of complex language,
it is to your advantage. Having flowcharts and photos is also very
helpful.

It is also useful to have a list of standardized words that are used
for consistency. For example, consistently using one term, either
“visible particulate matter” or “particles,” can make searches easier
and results more comparable.

Give yourself credit

Sometimes writers do not specifically give themselves (or their firms)
credit for what was done: for example, not specifically saying what
items were checked that did not yield any positive information—
“the logbook was reviewed and all entries were appropriately
completed as per procedure.” When describing something, if you
can legitimately (i.e., can support with evidence) reinforce a good or
required practice, it can be useful. Instead of saying “the analytical
balance,” say “the calibrated analytical balance” or not just “the
technician” but “the fully qualified technician.” Again, you need to
be prepared to show the evidence for claims like this that you make.

Spelling counts
Small details such as correct spelling, proper grammar, good
sentence structure and organization are important as they give
the reader a sense of the firm’s attention to detail. (If the report is
written in English by people for whom English is not the primary
language, I personally do not judge this the same way.) Instead of
long paragraphs, consider using bullet points that reduce the words
needed to present the facts.

Searches for past events

It is important to include a search for previous similar events.
Having a database or online capability to search reports is invaluable
for this—you want to do more than just asking personnel for what
they remember. It is useful to search for both the symptom (e.g.,
“temperature excursion”) as well as the root cause (e.g., “thermostat
malfunction”).

The report should describe the search terms used (e.g.,
“temperature excursion,” “chillroom failure,” or other words or
phrases) and the timeframe for the search. Health authorities want
to see at least one year searched; in some cases when there are few
opportunities for failure, multiple years should be included. It is
helpful to include the justification for why a time period was
selected: for example, “A search for past events was made covering
all chillrooms at the Cambridge, MA, site for a full two years, as this
was when the site became operational.”
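
A search for past events, by both symptom and root cause and over a stated timeframe, might look something like the sketch below. The record structure, identifiers, and dates are hypothetical; a real search would run against the firm's deviation database rather than an in-memory list.

```python
from datetime import date

# Hypothetical past-event records; a real system would query a database.
past_events = [
    {"id": "DEV-001", "date": date(2019, 1, 15),
     "text": "Temperature excursion in chillroom 2; thermostat malfunction."},
    {"id": "DEV-002", "date": date(2019, 6, 3),
     "text": "Label misprint on secondary packaging."},
    {"id": "DEV-003", "date": date(2017, 11, 20),
     "text": "Chillroom failure after power outage."},
]


def search_past_events(events, terms, start, end):
    """Return events dated within [start, end] whose text contains any term."""
    lowered = [t.lower() for t in terms]
    return [e for e in events
            if start <= e["date"] <= end
            and any(t in e["text"].lower() for t in lowered)]


# Search terms cover both the symptom and the suspected root cause.
terms = ["temperature excursion", "chillroom failure", "thermostat malfunction"]
hits = search_past_events(past_events, terms, date(2018, 7, 1), date(2020, 6, 30))
print([e["id"] for e in hits])
```

Whatever tool is used, the `terms`, `start`, and `end` values are the items the report should record, along with the reason that timeframe was chosen.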

Attachments

Oftentimes, attachments provide the bulk of the document. These
could include analytical results, particularly important procedures
and training records. If the process or system is complex or unique,
having an attachment of the flowchart and a narrative of how the
system works can be useful. Putting this as an attachment does
not interrupt the telling of the story; if someone needs background
information they can find it here. (One firm has a folder with
standardized flowcharts and detailed descriptions of different unit
operations, which are pulled from the folder and included in the
reports when appropriate.)

Low-significance (impact) events

Documentation of these events can be done in situ—that is, on the
records being generated with the event, such as batch records, log
books, and data collection sheets. These recorded events, categorized
using a risk-based decision tree, would include what happened,
where, how, and immediate action and correction taken. Generally
for a one-off (i.e., infrequent), low-level event, no corrective or
preventive action is needed beyond the immediate action or
correction. However, these must be reviewed and approved as part
of the document and/or batch review and release process.

As mentioned above, the language used should be standardized
to support efficient tracking and trending.

While these events do not impact product safety, identity,
strength, purity, quality, and availability (SISPQ-A), preventing
them may support continual improvement efforts.

Moderate-significance (impact) events

Using a risk-based approach to your investigations, this category
would include events that are not low-level events but could
potentially affect SISPQ-A or involve a compliance or business
requirement. These events would require more diligent
investigations and thorough documentation.

High-significance (impact) events

These events may or may not have an actual significant impact
on processes, products, patients, or compliance but need to be
thoroughly understood to prevent a recurrence. These are the
investigation reports that are the longest and often are written up
as formal reports.
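
The three significance tiers, together with the documentation formats this chapter associates with them, can be summarized as a simple lookup. This is only a sketch: the dictionary and function names are hypothetical, and the criteria for assigning an event to a tier would come from your own risk-based decision tree, which is not modeled here.

```python
# Hypothetical lookup: significance tier -> documentation format named in this chapter.
REPORT_FORMAT = {
    "low": "in situ entry on the record being generated (batch record, logbook, data sheet)",
    "moderate": "one- or two-page short form (paper or electronic)",
    "high": "long, formal report",
}


def report_format(significance):
    """Return the suggested documentation format for a significance tier."""
    key = significance.strip().lower()
    if key not in REPORT_FORMAT:
        raise ValueError(f"Unknown significance tier: {significance!r}")
    return REPORT_FORMAT[key]


print(report_format("Moderate"))
```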

Short form example

For those investigations classified as having moderate significance
and/or when the cause(s) are easily known, a one- or two-page form
(paper or electronic) may be all that is needed. Figure 2 shows an
example of such a form that integrates quality risk management
principles.

Figure 2. “Short form” for investigations using risk-based headings

NARRATIVE PRA EXAMPLE – Incident Investigation

Risk-based heading: Form field
Describe the event: Description of the problem (unwanted event)
Describe the context: How discovered
Describe the context: Cause
Describe the context: Historical review
Hazard identification: Scope of the event
Hazard identification: Impact (risk analysis and risk assessment not
needed because there was no impact to the things of value, i.e.,
contents of freezer)
Risk treatment: Immediate actions
Risk treatment: Correction
Risk treatment: Corrective action
Risk communication: Communication
Review and monitoring: Effectiveness check
Those involved in the investigation and RA

Long, formal reports

If the investigation has high significance or the investigation itself
was extensive, a longer, formal report may be appropriate. Table 2
shows headings that should be considered for inclusion.

Table 2. Outline of topics found in a formal investigation report

• Executive summary (stand-alone first page)
• Incident description—basic facts
• Immediate actions
• Scope of the investigation
• Investigation approach—tools and methods used
• Historical results review—key words, time frame, results
• Investigation findings and results
• Cause conclusion
• Product impact assessment and justification
• Patient impact (provided by medical expert, as needed)
• Corrections
• Corrective actions
• Recommended disposition and justification
• What was learned because of this event; how this information can be used
• Effectiveness check
• Background information (optional)
• Documents, flowcharts, photographs, supporting evidence attached (as needed)
• Names, titles of those involved in investigation
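
The outline in Table 2 can also serve as a checklist when reviewing a draft. The sketch below checks a draft for a subset of the Table 2 headings using plain text matching; the function name, the choice of which headings to treat as required, and the sample draft are all assumptions for illustration.

```python
# Headings taken from Table 2; optional and "as needed" items are left out here.
REQUIRED_HEADINGS = [
    "Executive summary",
    "Incident description",
    "Immediate actions",
    "Scope of the investigation",
    "Investigation approach",
    "Historical results review",
    "Investigation findings and results",
    "Cause conclusion",
    "Product impact assessment",
    "Corrections",
    "Corrective actions",
    "Recommended disposition",
    "What was learned",
    "Effectiveness check",
]


def missing_headings(report_text, required=REQUIRED_HEADINGS):
    """Return required headings that do not appear anywhere in the report text."""
    text = report_text.lower()
    return [h for h in required if h.lower() not in text]


# Hypothetical, incomplete draft.
draft = """Executive summary
Freezer alarm sounded overnight.

Incident description
Details of the alarm.

Immediate actions
Contents moved to a backup freezer.

Cause conclusion
Door gasket worn.
"""
missing = missing_headings(draft)
print(missing)
```

Simple substring matching like this only confirms that a heading is present; it says nothing about whether the section under it is adequate, which still requires a reviewer.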

For long reports, consider having a one-page, stand-alone
executive summary that gives the reader a “digested” overview
of all the pertinent facts. This would be written as one of the last
steps in the report writing process and would include the problem
description; root, proximal, and contributing causes; scope and
impact; and immediate actions, corrections, and corrective actions.

The rationale in having this summary is that it will give the
reader (and specifically an auditor or health authority inspector)
a quick view of what happened and actions taken. In some cases,
inspectors can be satisfied with the summary and avoid the time of
reading through the whole report—a benefit to all involved.

When writing a report, keep in mind that a reader will want
to determine what evidence you have that supports a particular
root or proximal cause. This would include “positive” facts that are
confirmatory, as well as what was looked at which did not support
what is alleged. For example, if a room went out of temperature
specification due to a malfunctioning thermostat, you might say that
testing showed the thermostat to have failed but other components
(that you would name) in the HVAC system were functioning
normally.

If tools like a fishbone diagram or a causal factors chart were
used, they can be effective models to present the findings, both
positive and negative. For example, the report could state what was
examined in terms of the major fishbone categories (i.e., people,
equipment, methods, environment, facilities, and materials) and
what was found or not found. Providing details on what was
examined, even though nothing “positive” resulted gives the reader
an idea of the thoroughness and diligence used by the investigators.

Question and answer format

One unique approach by an organization was to use standard
questions that would be answered in the report. The result was an
easy-to-read, clear report. Table 3 shows an example of how this
could be prepared.

Table 3. Example of a question and answer investigation report format

Question                           Answer
What went wrong?                   Roof leak with water dripping into hallway.
Where?                             Building 23 hallway CA, leading to solid dosage manufacturing offices.
When did it occur?                 Overnight (after 11 pm) 10 July 2019.
When was it discovered?            6 AM 11 July 2019.
Who discovered the event?          Stan Smith (extension #7345).
What immediate action was taken?   S. Smith called plant services; barrier placed around water; personnel detoured from hallway.
Etc. . . .
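
Where reports are drafted electronically, a question and answer template of this kind could also be checked by a small script so that a report is not issued with a standard question left blank. The following is only a minimal sketch under stated assumptions: the questions and answers are those shown in Table 3, while the names STANDARD_QUESTIONS and render_report are hypothetical and do not describe any particular firm's system.

```python
# Standard questions taken from Table 3; a real template would likely have more.
STANDARD_QUESTIONS = [
    "What went wrong?",
    "Where?",
    "When did it occur?",
    "When was it discovered?",
    "Who discovered the event?",
    "What immediate action was taken?",
]


def render_report(answers):
    """Return the report text, or raise ValueError if any question is unanswered."""
    missing = [q for q in STANDARD_QUESTIONS if not answers.get(q, "").strip()]
    if missing:
        raise ValueError("Unanswered questions: " + "; ".join(missing))
    # One line per question, in the standard order.
    return "\n".join(f"{q} {answers[q].strip()}" for q in STANDARD_QUESTIONS)


# Example answers copied from Table 3.
answers = {
    "What went wrong?": "Roof leak with water dripping into hallway.",
    "Where?": "Building 23 hallway CA, leading to solid dosage manufacturing offices.",
    "When did it occur?": "Overnight (after 11 pm) 10 July 2019.",
    "When was it discovered?": "6 AM 11 July 2019.",
    "Who discovered the event?": "Stan Smith (extension #7345).",
    "What immediate action was taken?": (
        "S. Smith called plant services; barrier placed around water; "
        "personnel detoured from hallway."
    ),
}

report = render_report(answers)
print(report)
```

Keeping the questions in one fixed list means every report presents the facts in the same order, which is part of what makes this format easy to read.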

CONCLUSION
Investigation reports, whether they are brief notes in a batch record
or longer, formal documents, need to provide enough details so one
can reconstruct the event and investigation, and see the alignment
between the event and the correction and the actions taken.
The report is not just a document to file; rather, it helps provide
knowledge to others. A good test of a report is asking if it will be
self-explanatory to a reader three or four years from now without
the assistance of the writer.

REFERENCE
ICH (2008) Pharmaceutical quality system – Q10. International
Conference on Harmonisation.
https://database.ich.org/sites/default/files/Q10_Guideline.pdf.
Accessed 3 Mar 2020.

REVIEW AND APPROVAL OF THE INVESTIGATION AND REPORT

Before the investigation can be closed out and finalized, health
authorities expect that the documentation of it (the report) be
reviewed and approved.

The US CGMPs (FDA, 2019) provide the “whats” that are to
happen:
211.22 (a) There shall be a quality control unit that shall have the
responsibility and authority to approve or reject all components, drug
product containers, closures, in-process materials, packaging material,
labeling, and drug products, and the authority to review production
records to assure that no errors have occurred or, if errors have
occurred, that they have been fully investigated.

211.192 All drug product production and control records, including
those for packaging and labeling, shall be reviewed and approved by
the quality control unit to determine compliance with all established,
approved written procedures before a batch is released or distributed.
Any unexplained discrepancy (including a percentage of theoretical
yield exceeding the maximum or minimum percentages established in
master production and control records) or the failure of a batch or any
of its components to meet any of its specifications shall be thoroughly
investigated, whether or not the batch has already been distributed.
The investigation shall extend to other batches of the same drug
product and other drug products that may have been associated with
the specific failure or discrepancy. A written record of the investigation
shall be made and shall include the conclusions and followup.

Details about how to review and approve are not provided in
the regulations; it is up to the firm to define these in its procedures.
In general, “review” refers to subject matter experts critically
looking at the report from a technical, quality, operations, and
business perspective to ensure that the information, logic, construct,
and conclusions are correct and appropriate. “Approval” would
mean that the organization is in agreement with the contents of the
report (information, conclusions, and the like) and will be taking
the actions that are specified in the report or that would
result from the report (releasing or rejecting the batch in question,
for example).

The review and approval phase is often where delays occur;
these can happen for a variety of reasons:

• No specific requirements as to how the report should be written.
There may not be formats or templates that everyone has agreed
upon beforehand, or agreement on what specifically should be
covered in a section.

• Individual personal preferences of reviewers and approvers. This
can lead to writers “shopping” for a reviewer or approver who may
be more aligned with the investigation report writer’s approach.

• Inadequate, nonspecific feedback given to the writers. This is
frustrating for the writers because they are not sure why
the reviewer or approver has a concern over a section in the
report.

Overcoming some of these hurdles is discussed below.

STATED REQUIREMENTS
Whether the documentation of the incident is performed using
an online tool or in a paper format, it is helpful if templates are
provided that guide the writers, reviewers, and approvers through
what is expected in each section. For example, if one of the sections is
“incident description,” the annotated form might say, “Include what
the problem is (problem statement), when the problem occurred,
when and how it was discovered, who discovered the problem, and
the immediate action taken.” This could be expanded further with
an example that writers could use as a model.

MINIMIZING PERSONAL PREFERENCES

In their book on how to read critically, Adler and Van Doren (1972)
discuss three “stages” of reading, each having a particular goal.
Specifically, the three stages and how they apply to reports can be
described as follows:

Stage 1—The structural stage. What is the report about as a whole?

• Can you summarize it briefly?

• Can you outline the various sections of the report? Do they
logically fit together? Do they tell a story?

• What was the problem that the writer was intending to solve in
the report?

Stage 2—The interpretive stage. What is being said in detail?

• What are the arguments the writer was making (e.g., root cause,
proximal cause)?

• What problem was the writer trying to solve? Did the author
solve it (e.g., corrections and corrective actions aligned with the
causes)? Are there problems that were not solved?

• Can you create a mental model or picture of the incident/event
in your mind?

Stage 3—The critical stage. Is it true? So what?

• Do you thoroughly understand what the writer is saying? (You
must understand before you can make judgments!)

• Where is the writer:

– Correctly informed?

– Uninformed?

– Misinformed?

– Illogical?

– Providing an incomplete analysis?

The authors also distinguish between “disagreement,” where
there are specific faults identified (as above), and “dislike,” where
the reviewer has a personal preference (e.g., word choice) that is
different from the writer’s. Reviewer preferences
that are at odds with the writer’s are one of the reasons why report
churning (that is, going back and forth between
writers and reviewers) occurs.

There are two particular practices that have been seen to make
reviews more effective and efficient. First, it can be helpful to have
a matrix or table of the roles of different reviewers and the points on
which they are to focus. This could also be in the form of a checklist
(see Table 1 and Table 2). Second, having a training session where
all reviewers and approvers together can look at sample reports
in order to “calibrate their eyeballs” can be effective in developing
consistency in reviews. To be most effective, these calibration
sessions need to be repeated periodically as new reviewers and
approvers join in the process.

GIVING FEEDBACK
If you are providing feedback to a writer, you need to remember that
most investigation report authors are not trained technical writers
and that they are learning their craft as they are writing the reports.
As a reviewer, you can provide comments that can strengthen both
the writer and the specific report or, if done incorrectly, can be
frustrating and disheartening.

How you provide the feedback can make a difference as well.
If you know the person and have a good working relationship
with each other, written feedback (annotated reports or emailed
comments) will usually work. On the other hand, if the writer is
relatively new and/or you do not know them well, having a
face-to-face (or a virtual equivalent) session may be better.

Some suggestions for giving feedback include:

• Look for the positive things in the report; provide at least one at
the start and at the conclusion of your feedback session.

• Be specific: what specifically works and what specifically does
not work in the report?

• Be descriptive: are there suggestions you can give the author to
improve the report?

• Use the continue and consider model. For practices you want to
positively reinforce, tell the writer something like this: “As you
are building your case for the identified root cause, continue
to refer to the fishbone diagram.” Or, for practices that you
think are not effective, say, “Consider doing a spellcheck before
sending out the report for review. It really helps make the
report look more professional.”

• Keep your focus on the report, not on the person.

RECEIVING FEEDBACK
Getting feedback on an investigation report (or other piece of writing)
is not always pleasant, even if the reviewer is professional and
courteous. The most important thing is realizing that their feedback
is not about you; it is the report that they are commenting upon.
In actuality, their goal is similar to yours: to have a report that is
complete, clear, correct, and that can successfully withstand a critical
review by an auditor or inspector.

Some suggestions to make receiving feedback more useful (and
possibly less uncomfortable) include:

• Ask for it. When you ask someone to review your report,
it subtly changes the dynamic in that it gives you a level of
control. When asking, be specific if there is something you want
the reviewer to focus on, for instance a section outlining the
controls that are in place or the description of the process.

• Ask for examples. If the reviewer criticizes something that you
are unsure about, ask for an example or two. Is there a word that
they would prefer to be used?

• Ask for the rationale. Be sure you understand the reason for
their change. Is this something that may be new to you or comes
from the reviewer’s experience?

• If there is a conflict between reviewers (not an uncommon
situation where one reviewer has a negative comment and others
find a section not to be problematic), understand the issue: is
it a dislike (personal preference) or a disagreement (technical or
content-related)? Conflicts can also arise when reviewers want
different things. Again, try to discern the underlying issue.

INCLUDING THE BASIS OF THE REVIEWERS’ AND APPROVERS’ SIGNATURES
While the report is being written and in the review and approval
cycle, it is important to have the word DRAFT written as a footer or
as a watermark on the page. From a technical and legal perspective,
this means that the report is not complete. Only when the final
approval signature is placed on the document should the word
DRAFT be removed.

One of the most useful requirements in the US FDA’s regulation
on electronic records and electronic signatures known as Part 11
(FDA, 1997) is that there needs to be a stated basis for the signature.
In other words, what does my signature mean in this given situation?
This is recommended for all investigation reports, be they in a
paper-based or electronic form. For example, the basis for the reviewer’s
signature could be:

My signature below indicates that I have read this report and that it is
accurate and complete based on my knowledge and experience.

An example of the basis for an approver’s signature could be:

My signature below indicates that I am in agreement with the content,
conclusions, and recommendations made in this report.

CHURNING METRICS
If you are managing the investigation/CAPA process, it can be
useful to track the number of cycles (or churns) that it takes to go
from a draft report to the final, approvable one. For a short report,
one or sometimes two cycles is typical; for a much longer, more
extensive investigation report, one can expect to see another one
or two cycles. If you see more cycles than this, you need
to understand why. Not only does this consume the time of the
reviewers and writer, but it can also delay decisions about the lot or
lots in question. While these lots are on hold, work-in-process (WIP)
inventory charges are being assessed. The sooner a decision can be
made about the affected lots, the lower the WIP charges.
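
The cycle count described above is simple to compute if each round of review is logged. The following is a minimal sketch, assuming a hypothetical log of review events; the ReviewEvent record, the report identifiers, and the threshold value are illustrative assumptions and are not taken from any particular CAPA system.

```python
from dataclasses import dataclass


@dataclass
class ReviewEvent:
    report_id: str  # hypothetical identifier for an investigation report
    cycle: int      # 1 for the first draft sent for review, 2 for the next, ...


def cycles_per_report(events):
    """Return the highest review cycle number seen for each report."""
    cycles = {}
    for event in events:
        cycles[event.report_id] = max(cycles.get(event.report_id, 0), event.cycle)
    return cycles


def reports_needing_attention(cycles, threshold):
    """List reports whose cycle count exceeds the chosen threshold."""
    return sorted(rid for rid, count in cycles.items() if count > threshold)


# Illustrative log: three reports with two, one, and five review cycles.
events = [
    ReviewEvent("INV-001", 1),
    ReviewEvent("INV-001", 2),
    ReviewEvent("INV-002", 1),
    ReviewEvent("INV-003", 1),
    ReviewEvent("INV-003", 2),
    ReviewEvent("INV-003", 3),
    ReviewEvent("INV-003", 4),
    ReviewEvent("INV-003", 5),
]

cycles = cycles_per_report(events)
# Threshold of 2 reflects "one or sometimes two cycles" for a short report;
# a longer report would warrant a higher value.
flagged = reports_needing_attention(cycles, threshold=2)
print(cycles)
print(flagged)
```

The threshold is left as a parameter because the typical number of cycles depends on the length and complexity of the report.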

CONCLUSION
Reviews of a draft report by subject matter experts and leadership
help ensure the report is on target and answers questions that might
be asked about the situation in the future by auditors and inspectors.
Reviews are meant to improve the report. Clarity, accuracy, and
completeness are important to consider, as is keeping the report
as long as needed but no longer. Approval, including a final
sign-off by the Quality Unit, means that the report has been accepted by
the organization and the organization will implement the report’s
recommendations.

Table 1. An example of a checklist covering the investigation and
CAPA process that could be used by the lead investigator

Investigation checklist
Investigation:
Investigation team leader:
Date:

1. Did the investigation team have members’ and subject matter
   experts’ support with the appropriate knowledge and skills?
   ☐ Yes  ☐ No  ☐ Not applicable    Comment:

2. Is it clear what the incident was and how it varied from the
   “should be” or expected?
   ☐ Yes  ☐ No  ☐ Not applicable    Comment:

3. Are the facts clearly stated as to the what, where, when, who, etc.?
   ☐ Yes  ☐ No  ☐ Not applicable    Comment:

4. Were information and evidence collected through interviews with
   key people involved and the taking of samples if possible?
   ☐ Yes  ☐ No  ☐ Not applicable    Comment:

5. Were previous similar events looked for, compared, and
   conclusions made? Were key words used? Was a time frame of at
   least 12 months used?
   ☐ Yes  ☐ No  ☐ Not applicable    Comment:

6. Were training records of key people involved reviewed and
   conclusions made?
   ☐ Yes  ☐ No  ☐ Not applicable    Comment:

7. Were the protocols and reports for the qualifications and
   validations of equipment, methods, and processes reviewed and
   conclusions made?
   ☐ Yes  ☐ No  ☐ Not applicable    Comment:

8. Were maintenance and calibration procedures and records reviewed
   and conclusions made?
   ☐ Yes  ☐ No  ☐ Not applicable    Comment:

9. Was the scope of the investigation initially defined and then
   reduced or expanded based on the facts?
   ☐ Yes  ☐ No  ☐ Not applicable    Comment:

10. Was an impact evaluation completed that examined the potential
    patient, safety, identity, strength, purity, and quality (SISPQ),
    regulatory, and compliance impacts?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

11. Did the investigation identify root cause(s)?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

12. Did the investigation identify contributing cause(s)?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

13. Did the investigation identify the proximal cause(s)?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

15. Did the investigation document what was examined but found not
    to be causing or contributing to the incident?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

16. If the root cause(s) were not identified, were detection methods
    and controls defined?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

17. Were the immediate actions taken upon discovery of the incident
    described? Were the actions effective?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

18. Were corrections to fix the item/system affected described? Were
    the corrections effective?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

19. Were corrective actions taken to prevent it from happening
    again? Were they effective?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

20. Has there been appropriate communication about this incident
    with the stakeholders?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

21. Was a plan developed for an effectiveness check?
    ☐ Yes  ☐ No  ☐ Not applicable    Comment:

Table 2. An example of a checklist for investigation report reviewers

Checklist for Root Cause Investigation and CAPA Reports

Intent: This checklist can be used as you review reports for root cause investigations and
CAPAs. If something is marked as not applicable (N/A), you should have a strong, valid
reason for marking it that way.

Item #, Requirement / Detail / Present?

1. Executive summary
   Detail: Stand-alone first page with all key details.
   Present? ☐ Y  ☐ N  ☐ N/A

2. Problem description
   Detail: What, where, when (occurrence), who, how/when the
   deviation (symptom) was discovered.
   Present? ☐ Y  ☐ N  ☐ N/A

3. Problem statement
   Detail: Requirement/should be versus as is; gap is evident.
   Present? ☐ Y  ☐ N  ☐ N/A

4. Immediate actions taken
   Detail: Immediate actions taken to minimize further impacts.
   Present? ☐ Y  ☐ N  ☐ N/A

5. Scope (initial and final)
   Detail: Items/products/equipment involved in the deviation; how
   you know the scope does not extend beyond this.
   Present? ☐ Y  ☐ N  ☐ N/A

6. Impact
   Detail: Impact of the deviation on the product, state of control
   (qualified or validated state), users/patients. Alternatively, how
   you know the deviation did not have an impact.
   Present? ☐ Y  ☐ N  ☐ N/A

7. History/trend analysis
   Detail: Look-back to determine if root cause(s) or symptoms have
   occurred in past. Should be at least 1 year look-back with search
   terms provided.
   Present? ☐ Y  ☐ N  ☐ N/A

8. Risk assessment and evaluation
   Detail: Results of retrospective risk assessment that answers the
   question, “What is the risk that this deviation could have an
   impact on the thing(s) of value (see above)?” (Impacts and
   likelihood based on controls in place.) Evaluation would be event
   classification (minor, major, critical).
   Present? ☐ Y  ☐ N  ☐ N/A

9. Root cause analysis method(s)
   Detail: Mention of the method(s) used to determine root cause.
   Present? ☐ Y  ☐ N  ☐ N/A

10. Results of root cause analysis
    Detail: What was found to be the root cause and proximal cause
    and why these are viewed as the causes (must be actionable).
    Present? ☐ Y  ☐ N  ☐ N/A
    Detail: What potential causes were considered, rejected, and why.
    Present? ☐ Y  ☐ N  ☐ N/A
    Detail: Contributing causes that were identified.
    Present? ☐ Y  ☐ N  ☐ N/A

11. Personnel interviewed
    Detail: Summary of interviews/statements of those interviewed
    (position and some sort of unique personnel identifier used).
    Present? ☐ Y  ☐ N  ☐ N/A

12. Controls present
    Detail: What controls worked as intended; what controls did not
    work as intended.
    Present? ☐ Y  ☐ N  ☐ N/A

13. If no definitive root cause identified…
    Detail: Probable cause(s) identified.
    Present? ☐ Y  ☐ N  ☐ N/A

14. If human error is considered… (Human error not considered a
    valid root cause; human factors should be identified)
    Detail: Specific actions are identified and should be other than
    training or changing procedure.
    Present? ☐ Y  ☐ N  ☐ N/A

15. Corrections made
    Detail: Corrections taken if applicable; what was done to
    material, product, etc. as a result of the deviation.
    Present? ☐ Y  ☐ N  ☐ N/A

16. Corrective action(s)/due dates/person responsible
    Detail: Actions taken to prevent recurrence of this specific
    deviation; must be aligned with the root and proximal causes.
    Present? ☐ Y  ☐ N  ☐ N/A

17. Preventive actions/due dates/person responsible
    Detail: Actions taken elsewhere to prevent a similar problem
    from recurring.
    Present? ☐ Y  ☐ N  ☐ N/A

18. Communication
    Detail: What was done to communicate issue to stakeholders
    (technicians, leadership, health authorities as applicable).
    Present? ☐ Y  ☐ N  ☐ N/A

19. Effectiveness checks
    Detail: Completed and/or proposed actions to determine if actions
    were effective.
    Present? ☐ Y  ☐ N  ☐ N/A

20. Personnel involved in investigation
    Detail: Names, position, expertise of personnel.
    Present? ☐ Y  ☐ N  ☐ N/A

21. Description of system process, equipment, etc. involved in deviation
    Detail: Flowcharts, process maps, floor plans, narratives as
    appropriate.
    Present? ☐ Y  ☐ N  ☐ N/A

22. Data or links to data or documents reviewed
    Detail: Copies of key records (e.g., training records), SOPs.
    Present? ☐ Y  ☐ N  ☐ N/A

23. Spelling has been reviewed
    Detail: Spellcheck!
    Present? ☐ Y  ☐ N  ☐ N/A

REFERENCES
FDA (2019) Part 211 Current good manufacturing practice for
finished pharmaceuticals. Code of Federal Regulations Title 21.
https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRSearch.cfm?CFRPart=211.
Accessed 4 Mar 2020.

FDA (1997) Part 11 Electronic records, electronic signatures. Code
of Federal Regulations Title 21.
https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRSearch.cfm?CFRPart=11.
Accessed 4 Mar 2020.

COMMUNICATION

Even though the topic of communication is one of the last steps
in the investigation and corrective action process and we are
discussing it in one of the last chapters in this book, communicating
with personnel, health authorities (as needed), management, and
other stakeholders may be appropriate throughout the process,
depending on the scope and potential impact of the event.

“Communication,” as we are using the word here, is not the same
as “training.” Communication is meant to provide information that
can result in knowledge and awareness; it may or may not demand
a change in practice or behavior. Training, on the other hand, does
have a specific intent to change behavior; that is one of the ways that
you know the training has been effective. Communication should be
included in all major or critical deviations; training is appropriate
only if the investigation determines there was a deficiency in
knowledge or skills.

The basic messages that should be conveyed through
communication are:

• This particular unwanted event has occurred.

• This is the scope and impact, particularly to patients, of the
event.

• We as an organization are sharing what we have learned due to
this event.

• To prevent this or a similar situation from happening in other
areas, these are actions that should be considered.

WHO SEES WHAT?

Communication is important, but it needs to be done in a thoughtful
way; for example, consider who should receive the message. One of
the easiest—and worst—ways of communicating about an incident
would be sending out a group text or e-mail message to everyone
about every possible event. Personnel would soon be overburdened
by the number of notifications. Some factors to consider when
deciding who should be notified could be:

• Significance (scope and impact) of the event

• Immediacy (how quickly the information needs to be shared)

• Location (department or plant site)

• Similar equipment, materials, components, processes, or
products

• Who should take action

Health authorities may need to receive communication about
certain events. For example, if there is an issue (e.g., mix-up,
contamination, stability failure) with a marketed and distributed small
molecule product, the US FDA requires a Field Alert Report (FAR)
be submitted within 3 working days of receiving information concerning
the quality problem (FDA, 2018). A similar US FDA requirement,
the Biological Product Deviation Report (BPDR), exists for large
molecule products (FDA, 2020). The European Medicines Agency
(EMA) has a similar program for reporting defective products
(EMA, undated). Completing these reports may incorporate some
of the information obtained during the investigation.

Communicating an event to a firm’s management is often
referred to as escalation. Organizations should have a procedure
that provides details on who sees what and how quickly various
categories of incidents, including complaints and adverse events,
need to be communicated once they are first observed. The speed
and organizational level of escalation are typically based on the
significance of the impact and the scope of the problem.
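
Some of the factors listed above (significance, location, and similar equipment) can be written down as explicit routing rules rather than left to a blanket e-mail. The following is only a sketch under stated assumptions: the Event and Recipient records, the three rules, and the example names are hypothetical, and a firm's own escalation procedure would define the real criteria. Immediacy (how quickly to notify) is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    classification: str  # "minor", "major", or "critical"
    site: str
    department: str
    equipment: set = field(default_factory=set)


@dataclass
class Recipient:
    name: str
    site: str
    department: str
    equipment: set = field(default_factory=set)
    is_site_leadership: bool = False


def should_notify(event, recipient):
    """Apply the illustrative routing rules to one recipient."""
    same_site = recipient.site == event.site
    if same_site and recipient.department == event.department:
        return True   # location: the affected department always hears about it
    if event.classification == "minor":
        return False  # significance: minor events stay local in this sketch
    if recipient.equipment & event.equipment:
        return True   # similar equipment elsewhere may be vulnerable
    return same_site and recipient.is_site_leadership  # escalation at the site


event = Event("major", "Plant A", "Packaging", {"blister line"})
recipients = [
    Recipient("Packaging operator", "Plant A", "Packaging", {"blister line"}),
    Recipient("Plant B packaging lead", "Plant B", "Packaging", {"blister line"}),
    Recipient("Plant A site head", "Plant A", "Management", is_site_leadership=True),
    Recipient("Plant A warehouse clerk", "Plant A", "Warehouse"),
]

notified = [r.name for r in recipients if should_notify(event, r)]
print(notified)
```

In this example the warehouse clerk is not notified, which is the point of the exercise: the message reaches those who are affected or who should act, and no one else.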

METHODS FOR INCIDENT COMMUNICATION

There is no one way to inform people about a particular incident;
it very much depends on the organization and the incident itself.
What follows are several options that have been seen in the industry.

• An abbreviated summary of the event emailed to stakeholders.
For operational areas, supervisors may share this with personnel
during a shift or team meeting.

• Simple posters. One manufacturing site wanted to communicate
their “learnings of the week” that could include deviations or
other things that they felt were important. To do so, personnel
in the departments would arrive at a consensus of what they
felt was important and, using a standardized PowerPoint®
template, create a poster that was placed on the bulletin board
outside the department’s office. This way, information could
be shared with departmental personnel and others walking by.
The emphasis here was “What did we learn that is important to
share with others?” (Figure 1).

• Summaries using in-plant video monitors. A pharma site that
used video monitors would put images and brief summaries
of certain deviations in their daily video slide shows (that also
included work anniversaries and other announcements).

• On-demand summaries. One firm is doing a pilot/demonstration
for safety incidents. Using an electronic work-order system,
maintenance personnel can directly link to cautions and
incidents relevant to the task they are to do. For example, if
they are entering a confined space, they would see a document
(or, in the future, a video) on the procedure and also summaries
of incidents where problems occurred. The intent is for
maintenance people to specifically see where they might be
vulnerable to risks.

Figure 1. Example of a poster used to communicate lessons learned

Learning of the Week

20 August 2019
What we learned: It’s important to consider GMPs when making a
change to improve safety!

From the materials handling group

Background

A safety suggestion was made to install “speed bumps” on the
main hallways so forklifts would need to move more slowly.
(There have been 2 near-miss accidents this past year due to
speeding forklifts.) As part of the change control process, we
had different people talk about the proposed change and we
realized there could be a risk to glass containers (and the
finished products) if the vials were subjected to extra physical
stress due to vial-to-vial contact. Particulates might be
generated or the necks chipped that could prevent a good seal
and the assurance of sterility. We’ll look for other ways to
reduce the potential of forklift accidents.
Bottom line
Change control takes time, but having different people involved
contributing their perspectives is important!
Want to know more?
Call Chris Summers at extension 7345 or email
csummers@acmepharma.com

Concept created by LearningPlus, Aug 2019

COMMUNICATING POTENTIAL RISKS

One of the messages that may be communicated could be concerning
risk—the likelihood of a deviation occurring and the severity should
it occur (Vesper, 2016). In their work at Bell Telephone Laboratories,
one of the most important industrial research centers in the 1900s,
Shannon and Weaver (1949) created a model of the communication
process and three categories of problems that could arise, specifically:

• Technical problems: the accuracy of transmitting the signal
between the transmitter and the receiver.

• Semantic problems: issues in the interpretation of the
information by the receiver compared against the sender’s
intended meaning.

• Effectiveness problems: the success of the information in
producing the desired outcome or result.

In terms of risk communication, there can be technical problems
if a report is garbled or if color codes are not apparent on a heat map,
but it is likely that semantic and effectiveness problems will be more
significant. Semantic problems, for example, could be the receiver
misinterpreting a vague term like “low risk” or “high likelihood of
occurrence.” An effectiveness problem might be a decision-maker
who is not persuaded by the communicated message to take
appropriate risk-reducing actions such as timely implementation
of the corrective action. The communication needs to answer the
“So what?” question. Specifically: How should the recipient of the
information use it? What decisions are to be made? What actions
should be taken?

Other researchers (Zikmund-Fisher, 2011) who have looked at
communicating health risks focused on the best ways of presenting
information to achieve a desired outcome, examples of which are
shown in Table 1.

Table 1. Risk concepts and messages (based on Zikmund-Fisher, 2011)

Risk concept              Risk message (example)                            Purpose of message
Possibility               It could happen/it might not happen.              Awareness, avoid surprise
Relative possibility      It is more likely to happen.                      Recognize best option
Comparative possibility   This is more likely to happen than that.          Recognize best option
Categorical possibility   There is a high chance of this happening.         Motivate to act
Relative probability      This risk is higher by this much (e.g., 50%).     Recognize best option
Absolute probability      The risk is this (e.g., 10%).                     Motivate to act
Comparative probability   The risk is 20% if I do X; it is 80% if I do Z.   Make magnitude dependent decisions on options
Incremental probability   The risk will increase by 25% if I do this.       Make magnitude dependent decisions on options
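
The difference between a relative and an absolute probability message is easy to lose, and it is one way a semantic problem can arise. The short sketch below combines the two example figures from the table (an absolute risk of 10% and a relative increase of 50%) purely for illustration; pairing them in this way, and the function name, are assumptions made here and do not come from Zikmund-Fisher.

```python
def with_relative_increase(baseline, relative_increase):
    """Absolute probability after a relative increase (0.50 means 50% higher)."""
    return baseline * (1 + relative_increase)


baseline = 0.10           # absolute probability: "the risk is 10%"
relative_increase = 0.50  # relative probability: "this risk is higher by 50%"

new_risk = with_relative_increase(baseline, relative_increase)
added_points = new_risk - baseline

print(f"Baseline risk: {baseline:.0%}")
print(f"Risk after a 50% relative increase: {new_risk:.0%}")
print(f"Absolute change: {added_points * 100:.0f} percentage points")
```

A 50% relative increase on a 10% baseline gives an absolute risk of 15%, a change of 5 percentage points. A reader who hears only “50% higher” may picture something far larger, so stating which kind of figure is being reported helps the message land as intended.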

CONCLUSION
Communicating information about a quality event and the related
impacts and risks is part of knowledge management. It not only builds
awareness of such events among stakeholders but also contributes
to transparency within an organization. Beyond simple awareness,
communicating problems can also give personnel direction on how
to prevent such events and how to respond should they arise.

REFERENCES
EMA (undated) Quality defects and recalls.
https://www.ema.europa.eu/en/human-regulatory/post-authorisation/compliance/quality-defects-recalls#reporting-obligations-section.
Accessed 3 Mar 2020.

FDA (2020) Biological product deviation reports, Feb 14, 2020.
https://www.fda.gov/vaccines-blood-biologics/report-problem-center-biologics-evaluation-research/biological-product-deviations.
Accessed 3 Mar 2020.

FDA (2018) Field alert reports, Jul 18, 2018.
https://www.fda.gov/drugs/surveillance/field-alert-reports.
Accessed 3 Mar 2020.

Shannon, C. and Weaver, W. (1949) The Mathematical Theory of
Communication. Urbana, IL: University of Illinois Press.

Vesper, J. (2015) Q9+ ten years: Examining risk communication.
Special edition, Journal of Validation Technology.
http://www.ivtnetwork.com/article/q9-ten-years-examining-risk-communication.
Accessed 1 Mar 2020.

Zikmund-Fisher, B.J. (2011) To “know” your risk: Some
thoughts on goals in risk communication. Paper
presented at the FDA Risk Communication Advisory
Committee, Silver Spring, MD.
http://www.fda.gov/downloads/AdvisoryCommittees/CommitteesMeetingMaterials/RiskCommunicationAdvisoryCommittee/UCM280630.pdf.
Accessed 3 Mar 2020.

LEARNING FROM SUCCESSES AND FAILURES

If you do a Google search for pithy quotes on the phrase “learning
from mistakes,” you will find sites that have upwards of 1,000
relevant ones. (For example, go to www.brainyquote.com/topics/mistakes-quotes.)
Most of the quotes fit into the category of “mistakes
happen but they can be valuable sources of learning.”

Publicly discussing mistakes has become its own industry: the
Wall Street Journal described Eli Lilly and Company’s “Failure
Parties” (Burton, 2004). In Los Angeles, the Museum of Failure
(www.failuremuseum.com) showed examples of over 100 objects
that, viewed with hindsight, prompt the question, “Who
thought that was a good idea?” For entrepreneurs, there are
“failurecons” (www.thefailcon.com) and similar events held around
the world where it becomes a badge of honor to share stories of
what went wrong in an endeavor and what was learned. Failcon’s
motto: “Embrace your mistakes. Build your success.”

In this chapter we will look at why we sometimes do not
learn from mistakes and failures, as well as some structured and
unstructured ways that we can do this better.

When we are pursuing a goal or are on a path, we may not know
that we have made a mistake until we recognize it on our own or
through the intervention of someone else. For example:

• We see that we’re going the wrong direction on a highway.

• An instrument gives a result showing that we have incorrectly
diluted a sample.

• A colleague points out that the scale used in weighing out
materials has been out of calibration for the past two weeks but
the situation had not been communicated.

For most of us, when things like that happen, embarrassment,
shame, self-doubt, and fear take hold as well as thoughts of how to
hide, redo, or fix the problem so no one else discovers it—things
we learned from a very early age: when we do something that is
“wrong,” we get blamed for it. In school, if we did not select the right
answer because we interpreted the question in a different way than
the teacher intended, we still will get the answer scored as wrong.
So, from early on, we are molded by the social environment around
us to see errors, mistakes, failures, and not meeting expectations
or goals as bad—something to be avoided. But errors, mistakes,
failures, and the like are part of what we are as humans. According
to St Augustine, “Fallor ergo sum,” or “I err, therefore I am.” In
complicated systems, failures are inevitable (Perrow, 1999).

How then can we learn from our errors and failures if we’ve
grown up being told mistakes are bad? How do our organizations
extract some sort of value from deviations and quality events? This
chapter presents several different approaches to learning from
mistakes that are used in a variety of organizations.

“FAIL FAST, FAIL OFTEN” (BUT FAIL SAFELY)

Some entrepreneurs, including many in Silicon Valley, have adopted
the mantra of “fail fast, fail often.” That approach can be useful in
some phases of discovery or development, but it is not the best
approach if one is trying to operate in a state of control.

When someone is learning a task and you want them to acquire
hands-on experience and the tacit knowledge (the “know how”)
that comes through practice, “fail safely” needs to be a requisite
part of that mantra. Being able to develop one’s skills in a low-risk
setting is critical. In the information technology (IT) world, the
multiple instances of a computer application usually include a
“sandbox” where programmers can experiment without affecting
the validated production system. Some pharma and biopharma
firms and training institutes (like the PDA’s Training and Research
Institute in Bethesda, MD) have production lines or manufacturing
suites dedicated to training without putting product (or patients) at
risk. New technologies such as augmented reality and virtual reality
are being used to provide an authentic learning experience in a
safe way.

Success is a useful teacher, but experiencing a failure can be
an even more compelling and memorable experience. We have all had
situations where, afterwards, we say “I will never do that again!”
Providing a safe environment for learning by failures—and having
a “soft landing” where no real harm occurs—is a very powerful way
to acquire knowledge.

CHARACTERISTICS OF ORGANIZATIONS THAT LEARN FROM MISTAKES
Based on their study of a variety of organizations, Cannon and
Edmondson (2005) identified two types of barriers that prevented
organizations from learning. The first, social systems, was
mentioned above, specifically those things that we have learned
through experience and observation that are intended to protect our
personal positions, egos, and self-esteem. The second barrier is with
technology systems—not having the ability to track and identify
events or not having the foundational knowledge of the process that
can help stakeholders understand the issue. The authors propose
three steps organizations can take:

• Identifying failures: Having a safe environment where people can
report failures; using minimal inventory practices that make
problems more visible (the rocks in the river metaphor where
having a bloated inventory is the high water level that covers up
the problems that occur); trending and looking for anomalies,
including those that are considered not serious.

• Analyzing failures: An organization that has a long-term outlook,
patience, and openness; investigating small failures, not only
large ones; using formal analysis tools, not just expert intuition;
having a thorough understanding of the underlying process;
commitment to deep learning and continually asking “why?”
to get beyond the easy answers like “the procedure was not
followed.”

• Deliberately experimenting: A spirit of innovation; a willingness
to accept that there will be “failures”; teams of experts who can
design and interpret a study; well-designed experiments that
can capture knowledge.

One of the most critical aspects in identifying failures is having
an organizational culture where people feel safe in reporting
such events. Harvard researcher Amy Edmondson (1999) studied a
number of nursing units in different hospitals and observed that
where the nurses had higher levels of coaching and higher
quality-of-relationship scores, a higher number of error
events was also reported. In studying the culture of the nursing
units more carefully, looking at the management style of the head
or supervising nurse, it was apparent that groups that had higher
levels of internal trust and worker cohesion were more likely to
point out errors. In units where the nurses felt nervous and defensive
about admitting mistakes, or where talking about issues with the
supervisor was like going to the principal’s office, the reporting of
errors was suppressed.

WHAT ABOUT A “BLAMELESS” CULTURE?

There is at least one multinational pharma firm that has established a
“blameless” culture, where there are no penalties for someone if they
come forward admitting a mistake or error. Managers at the site, as
well as managers at other firms, have said that going “blameless”
concerns them in that it would lead to carelessness and people not
taking responsibility for their actions.

Edmondson (2011) proposed a spectrum of failures (Chapter
8, Table 2), including many that would be “blameless” while some
events, such as fraud, data integrity violations, and cover-ups, would
justify disciplinary action. Other situations, such as when the
process is unstable or when it is overly complex, would not warrant
discipline. Another approach, used by the US Air Force, gives
personnel 24 hours after an event to report it without retribution.
After 24 hours, if it is discovered, those involved would be subject to
disciplinary action. It is important that the organization is initially
(and repeatedly) clear with all personnel about what their position
is and, even more importantly, that the organization stands by
its policy and is consistent in applying it. The first time that the
organization goes back on its promises, trust will dissolve quickly.

AFTER-ACTION REVIEWS
When analyzing failures as part of learning from mistakes,
the organization is trying to make sense of, or find meaning in,
the quality event or deviation that has occurred. This can be done
through reflection, when people carefully consider an event, action, or
decision and gain personal insights.

One defined approach for this reflection is through after-action
reviews, a method formalized by military organizations (US Army,
1993; Garvin, 2000). True after-action reviews not only occur when
there has been a failure but also when there has been a success, so
as to hopefully repeat the success and apply lessons learned to other
situations. In its most basic form, an after-action review seeks to find
answers to four key sets of questions:

1. What was supposed to happen? What was planned?

2. What actually did happen? What were the results?

3. Was there a difference between what was planned and the actual
results? Why? Why not?

4. What can we learn from this? What do we want to continue/sustain?
What should we do differently next time?

For large, complicated projects, an after-action review may
require significant time, while in other situations, an adequate
review may be done very quickly, such as at the end of a work shift
or a meeting. Asking “What did we learn today?”, “How can we
use this information?”, and “How can we improve?” links learning,
knowledge generation, and continual improvement.

An approach (Birkinshaw and Haas, 2016) that seeks to increase the
“return on failure ratio” lays out a short process for extracting as much
benefit as possible when a failure happens or when a less than
optimal result occurs. The three steps are:

1. Extract as many insights as possible concerning the failure.
This can include “assets” such as what was learned, what
assumptions should be changed, awareness about future trends,
improvement of knowledge and skills by individuals and team
members, identification of development needs, and what they
would do differently next time. Liabilities of direct, external,
internal, and reputational costs must also be considered.

2. Share the results with others in the organization. What were the
essential elements of value that have broad applicability?

3. Periodically review patterns of failure. Step back and look to see
if there are particular trends that point to underlying issues or
vulnerabilities.
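
The third step, reviewing patterns of failure, can be as simple as counting how often each cause category recurs across closed investigations. The sketch below is one hypothetical way to do that and is not part of the Birkinshaw and Haas process itself. The record identifiers, the cause categories, and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter


def recurring_causes(records, min_count=2):
    """Return cause categories seen at least min_count times, most frequent first.

    records is an iterable of (record_id, cause_category) pairs.
    """
    counts = Counter(cause for _, cause in records)
    return [(cause, n) for cause, n in counts.most_common() if n >= min_count]


# Hypothetical closed investigations: (record id, assigned cause category).
records = [
    ("DEV-001", "procedure unclear"),
    ("DEV-002", "equipment calibration"),
    ("DEV-003", "procedure unclear"),
    ("DEV-004", "labeling mix-up"),
    ("DEV-005", "equipment calibration"),
    ("DEV-006", "procedure unclear"),
]

if __name__ == "__main__":
    for cause, n in recurring_causes(records):
        print(f"{cause}: {n} occurrences")
```

A count like this only flags where to look. Whether a recurring category points to a real underlying vulnerability still has to be judged by people who understand the process.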

NASA’s Goddard Space Flight Center in Maryland established
a similar program called “Pause and Learn” (PaL) that is led by
the site’s Chief Knowledge Management Officer. In a white paper
describing the process (Rogers, 2004), the roles and responsibilities
of attendees and management were described, as shown in Table
1. PaL sessions are conducted throughout a larger project, focus on
one recent event, and do not result in a lengthy report (that isn’t
usually read).

A brochure describing the program (Goddard Space Flight
Center, undated) explains that the PaL facilitator starts the meeting
with these four ground rules:

• Be discreet. A PaL session is a closed-door discussion among
team members. Unless explicitly stated otherwise, what gets
said in the room stays in the room.

• Be honest. When the activity being discussed directly involves
you, call it as you see it.

• Be tolerant. Others’ opinions and perspectives are equally
important, regardless of rank or experience.

• Be a team. When looking at an individual’s actions, view them from
the perspective of team responsibility for ensuring excellence.

Table 1. Roles and responsibilities of those involved in NASA’s Pause
and Learn process (Rogers, 2004)

Project attendees need to:

• Show up to the event when scheduled
  – Bring notes or supporting documents
  – Be prepared to restate portions of an event in your own words
• Do not consider this a lecture or critique
  – Relate what happened from your own point of view
  – Explore alternative courses of action
  – Handle discovery of errors positively
• Follow up on needed actions that you have identified for yourself
  – The PaL is not intended to be an action assignment forum
  – The team may agree on an action or improvement for themselves
  – Likewise, you may have actions you identify for your own improvement

Supporting staff need to:

• Gather attendees; some projects already hold debrief or talk-down
  sessions which can be used for PaL sessions
• Have a moderator who will review the events
  – Encourage participation
  – Summarize key events
• Have junior leaders restate portions of their activity
• Do not lecture or critique
  – Ask why certain events were taken
  – Ask how those involved reacted to situations
  – Ask when actions were initiated
  – Exchange “war stories”
  – Relate events to subsequent results
  – Explore alternative courses of action
  – Handle discovery of errors positively
  – Take notes during the PaL, so all team participants can listen and learn
  – Prepare simple report of notes and submit back to the team for review

THE ROLE OF LEADERSHIP

Creating an organizational culture that intentionally learns from its
mistakes and failures is something that needs to start from the top.
Leadership needs to not just say that this is important, but needs to
show it. It needs to reward its personnel who, despite the personal
and social pressures, have the courage to point out problems that
they were involved with. Having policies in place and consistently
being in alignment with these policies gives personnel a clear
message that everyone has a role in continual improvement.

CONCLUSION
Looking at life, one can say that everything we do is some sort of an
experiment. We often get the intended results, but now and then,
particularly when there may be information or experience lacking,
the outcome is less than what we were hoping for. To continually
improve (a characteristic of every quality program), it is essential
to continually learn. Leadership that sees the value of this and
encourages it as a daily practice can find value even in failures.

REFERENCES
Birkinshaw, J. and Haas, M. (2016) Increase your return on failure.
Harvard Business Review, May 2016, pp. 88–93.

Burton, T. (2004) By learning from failures, Lilly keeps drug pipeline
full. Wall St Journal, Apr 21, 2004.
https://www.wsj.com/articles/SB108249266648388235.
Accessed 6 Mar 2020.

Cannon, M. and Edmondson, A. (2005) Failing to learn and learning
to fail (intelligently): How great organizations put failure to
work to innovate and improve.
https://www.researchgate.net/publication/228630786_Failing_to_Learn_and_Learning_to_Fail_Intelligently_How_Great_Organizations_Put_Failure_to_Work_to_Innovate_and_Improve.
Accessed 6 Mar 2020.

Edmondson, A. (2011) Strategies for learning from failure. Harvard
Business Review, April 2011.

Edmondson, A. (1999) Psychological safety and learning behavior
in work teams. Administrative Science Quarterly, 44(2):350–383.

Garvin, D.A. (2000) Learning in Action: A guide to putting the learning
organization to work. Boston, MA: Harvard Business School Press.

Goddard Space Flight Center (undated) Pause and Learn.
https://www.nasa.gov/centers/goddard/pdf/431367main_OCKO-Pal-Brochure-Rev_noLOGO.pdf.
Accessed 2 Mar 2020.

Perrow, C. (1999) Normal Accidents: Living with High-Risk Technologies.
Princeton, NJ: Princeton University Press.

Rogers, E. (2004) Introducing the Pause and Learn (PaL) process:
Adapting the army after action review process to the NASA
project world at the Goddard Space Flight Center. NASA White
paper.
https://www.nasa.gov/centers/goddard/pdf/287922main_PALwhitepaperV3.pdf.
Accessed 1 Mar 2020.

US Army (1993) A leader’s guide to after-action reviews. Training
Circular 25–20. Washington, DC: US Army Headquarters.
https://www.acq.osd.mil/dpap/ccap/cc/jcchb/Files/Topical/After_Action_Report/resources/tc25-20.pdf.
Accessed 2 Mar 2020.

MANAGEMENT RESPONSIBILITIES

In the previous chapters, we have discussed the process steps
and tools used by investigators and writers in conducting and
documenting a root cause investigation and proposing corrective
actions. But what about management? What roles and
responsibilities does leadership have? How can leadership support
a program for effective investigations and corrective actions? These
questions are the focus of this chapter.

Written GMP requirements for management regarding
investigations and corrective actions are limited but include the
following from US, European, and global health authorities:

US CGMP – 211.180 (FDA, 2019)

(f) Procedures shall be established to assure that the responsible officials
of the firm, if they are not personally involved in or immediately aware
of such actions, are notified in writing of any investigations conducted
under 211.198 (complaints), 211.204 (returned drug products where
the issue implicates other products or lots), or 211.208 (salvaging of
drug products) of these regulations, any recalls, reports of inspectional
observations issued by the Food and Drug Administration, or any
regulatory actions relating to good manufacturing practices brought
by the Food and Drug Administration.

EU GMP – Chapter 1, Quality Management (EMA, 2012)

1.5 Senior management has the ultimate responsibility to ensure
an effective Pharmaceutical Quality System is in place, adequately
resourced and that roles, responsibilities, and authorities are defined,
communicated and implemented throughout the organisation.
Senior management’s leadership and active participation in the
Pharmaceutical Quality System is essential. This leadership should
ensure the support and commitment of staff at all levels and sites
within the organisation to the Pharmaceutical Quality System.

1.4 A Pharmaceutical Quality System appropriate for the manufacture
of medicinal products should ensure that:

(xiv) An appropriate level of root cause analysis should be applied
during the investigation of deviations, suspected product defects
and other problems. This can be determined using Quality Risk
Management principles. In cases where the true root cause(s) of
the issue cannot be determined, consideration should be given to
identifying the most likely root cause(s) and to addressing those.
Where human error is suspected or identified as the cause, this should
be justified having taken care to ensure that process, procedural
or system-based errors or problems have not been overlooked, if
present. Appropriate corrective actions and/or preventative actions
(CAPAs) should be identified and taken in response to investigations.
The effectiveness of such actions should be monitored and assessed, in
line with Quality Risk Management principles.

ICH Q10 – Pharmaceutical Quality System (ICH, 2008)

2. MANAGEMENT RESPONSIBILITY
Leadership is essential to establish and maintain a company-wide
commitment to quality and for the performance of the pharmaceutical
quality system.

2.1 Management Commitment

(a) Senior management has the ultimate responsibility to ensure an
effective pharmaceutical quality system is in place to achieve the
quality objectives, and that roles, responsibilities, and authorities are
defined, communicated, and implemented throughout the company.

3.2.2 Corrective Action and Preventive Action (CAPA) System

The pharmaceutical company should have a system for implementing
corrective actions and preventive actions resulting from the
investigation of complaints, product rejections, non-conformances,
recalls, deviations, audits, regulatory inspections and findings, and
trends from process performance and product quality monitoring. A
structured approach to the investigation process should be used with
the objective of determining the root cause. The level of effort, formality,
and documentation of the investigation should be commensurate with
the level of risk, in line with ICH Q9. CAPA methodology should
result in product and process improvements and enhanced product
and process understanding.

WHAT CAN LEADERSHIP DO?

These regulatory requirements provide outlines of responsibilities,
but not much detail. Recommendations for leadership to consider
based on what has been seen in effective and not-so-effective
programs include the following:

• Set the expectations. Why accept that quality events will
occur and reoccur in the first place? While almost every country
accepts that there will be traffic fatalities, government officials
in Sweden challenged that notion. “In 1997, the Swedish
parliament wrote into law a ‘Vision Zero’ plan, promising to
eliminate road fatalities and injuries altogether. ‘We simply
do not accept any deaths or injuries on our roads,’ says Hans
Berg of the national transport agency” (Economist, 2014). “You
get what you tolerate,” is a quote that is attributed to multiple
authors.
• Remember the golden hours. Beginning the investigation as
soon as possible after the unwanted event has been observed provides the
best chance of collecting useful information and evidence. While
this is not possible in all situations (for example, customer complaints
or environmental monitoring excursions), encouraging
investigators not to wait until the deadline is on the horizon
can result in more effective root cause investigations. Providing
enough resources to investigate as soon as possible will make
for better investigations.
• Thorough investigations can take time, but do not let
investigations languish. These are two sides of the same coin.
Do not obsess over the 30-day time limit, as doing so often results
in investigations where “human error” is presented as the
root cause. At the same time, investigations need to be done
in a timely way. (If investigations do take more than 30 days,
management and the quality unit need to be kept abreast of
what is happening with an interim report.)
• Leadership and learning are indispensable to each other (J.F.
Kennedy). Leadership can provide an environment that supports
learning by ensuring psychological safety where people can ask
questions and admit mistakes; appreciating differences in people
and their points of view; being open to new ideas and taking risks;
and allowing time for reflection (Garvin et al., 2008).
• Conducting investigations and writing reports take
knowledgeable, skilled people. A large manufacturing site
was often not able to get to root and proximal causes in their
investigations of process problems. When asked, they said that
the primary investigators they used were newly graduated
engineers that the firm wanted to be exposed to a variety of areas.
While their intentions were commendable, for investigations,
you want people involved who have considerable expertise and
can apply their experience, intuition, and analytical skills. If not
leading the investigations, they need to take an active part in
them.
• Avoid “temporary” patches wherever possible. Patches or
Band-Aids are generally temporary fixes often put in place
when there is not enough knowledge about the true root
cause(s). Sometimes this is because the root cause(s) can’t be
found despite a robust investigation or, probably more often,
because the investigation was not thorough enough. These
short-term fixes can be forgotten about until the temporary
patch fails. If an organization frequently uses patches, it usually
indicates that a “firefighting” culture of problem response
exists. Firefighting is frequently engrained in an organization’s
fabric—recognizing the firefighters as organizational heroes
contributes to perpetuating this. Why not recognize those who
prevent fires?
• Do not accept “retrain the technician” or “change the
procedure” as the sole (or frequent) corrective actions. When
these are specified as ways to fix the problem, it usually indicates
that only a cursory investigation was performed. These actions
might be part of a total solution, but on their own, they will have
little positive impact. Training or retraining should be used only
when there is evidence to show a lack of knowledge or skills.
• Consider what incentives, disincentives, and misincentives
may be present. People rarely want to do a bad job or cause
problems, but often they are put in positions where they need
to make choices. If personnel are taking shortcuts or violating
procedures, you need to consider what the underlying drivers of
that behavior are. Are they trying to juggle competing priorities?
Do they need to accomplish tasks under unreasonable
conditions or time constraints?
• Prioritize investigations based on risk. Having a risk-based
triage system that emphasizes events with significant risk can
prevent overloading the system. If everything is important, then
nothing is truly important.
• People will make mistakes. This is a given. We are not perfect.
Consider the “human factors” that can be error traps that result
in undesired results. Find ways to make it easy for people to do
the right thing.
• Keep it simple. A favorite quote attributed to Einstein is, “Make
it as simple as possible, but no simpler.” Complexity makes a
process or system more vulnerable to failure.
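
The recommendation above to prioritize investigations based on risk can be made concrete with a simple scoring rule. The following is a minimal sketch under stated assumptions: the 1 to 5 severity and likelihood scales, the multiplication rule, the threshold of 12, and all names are hypothetical illustrations, not a method prescribed in this chapter or by any regulation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityEvent:
    event_id: str
    severity: int    # hypothetical scale: 1 (negligible) to 5 (critical)
    likelihood: int  # hypothetical scale: 1 (rare) to 5 (frequent)

    @property
    def risk_score(self) -> int:
        # Assumed scoring rule: severity multiplied by likelihood.
        return self.severity * self.likelihood


def triage(events, threshold=12):
    """Split events into priority and routine lists, highest risk first."""
    ranked = sorted(events, key=lambda e: e.risk_score, reverse=True)
    priority = [e for e in ranked if e.risk_score >= threshold]
    routine = [e for e in ranked if e.risk_score < threshold]
    return priority, routine


# Hypothetical open events awaiting investigation.
events = [
    QualityEvent("EV-1", severity=2, likelihood=2),
    QualityEvent("EV-2", severity=5, likelihood=3),
    QualityEvent("EV-3", severity=4, likelihood=3),
    QualityEvent("EV-4", severity=1, likelihood=5),
]

if __name__ == "__main__":
    priority, routine = triage(events)
    print("Investigate first:", [e.event_id for e in priority])
    print("Routine handling:", [e.event_id for e in routine])
```

The exact scales and threshold matter less than applying them consistently, so that investigators' time goes first to the events that carry significant risk.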

INVESTIGATIONS AND QUALITY CULTURE

Since around 2005, health authorities, pharma firms, and pharma
industry associations like the International Society for Pharmaceutical
Engineering (ISPE) and the Parenteral Drug Association (PDA) have
been interested in what specifically contributes to quality outcomes
and how these could be quantified and used in risk ratings and
comparisons. What came from the effort was a set of survey
questions that could be used in rating an organization’s quality
culture on 21 different dimensions. Two of these dimensions, both related
to continuous improvement and CAPA robustness, are of interest to
us here: root cause and human error. Table 1 shows characteristics
for each of these categorized into five levels, with five being traits
seen in organizations with the most mature and developed quality
cultures.

Table 1. Range of root cause and human error that can be observed as
an organization’s quality culture evolves (PDA, 2018)

Continuous Improvement: CAPA Robustness

Root Cause

Level 1
• No program or techniques to determine true root cause
• No repeat deviation monitoring

Level 2
• Rudimentary program in place to categorize root cause but no
  techniques used to dig deep to find true root cause

Level 3
• Robust program in place with standardized techniques or tools to
  identify true root cause (e.g. 5 whys, fish bone, etc.)

Level 4
• CAPA effectiveness rate is monitored
• Individual CAPA records are verified to address the true root cause

Level 5
• CAPA effectiveness rate is improving and decreasing number of
  repeat and total deviations over a number of years

Level unclear in source
• A basic repeat deviation monitoring program

Human Error

Level 1
• Deviation investigations frequently attribute Human Error (HE) as
  root cause
• Lacking basic knowledge of HE prevention
• CAPA frequently involves just re-training

Level 2
• Ineffective human error prevention program, metrics/monitoring in
  place but show little or no improvements

Level 3
• Effective human error program and improvement initiatives in place
  and metrics/monitoring show limited improvement

Level 4
• Human factors and HE prevention is mandated when designing a process
• Evidence of Associate-initiated HE reduction initiatives
• All relevant Associates trained on HE/Human Performance (HP)

Level 5
• HE & HP champions/practitioners across organization
• Associates proactively use HE prevention tools
• Demonstrate sustained reduction of HE deviations over a number of
  years

Level unclear in source
• Formal training in human error and human factor concepts (e.g.
  Kaizen, 5S, 6 blocks, etc.)
• Prioritization of error reduction activities based on risk analysis

Using a tool like this can help leadership determine where they
currently are and plot a course to make improvements.

CONCLUSION
Having a successful, robust program for investigations and
corrective actions requires that leadership have a comprehensive
understanding of how and why failures can occur and continually
look for ways to improve. Leaders need not only to set and
communicate the vision to others in the organization, but also to help
eliminate barriers and provide a variety of resources to get there.

REFERENCES
Economist (2014) Why Sweden has so few road deaths. The Economist, 26 Feb 2014.
EMA (2012) EU Guidelines for Good Manufacturing Practice for Medicinal Products for Human and Veterinary Use, Chapter 1, Pharmaceutical Quality System, 1.4-1.5. https://ec.europa.eu/health/sites/health/files/files/eudralex/vol-4/vol4-chap1_2013-01_en.pdf. Accessed 2 Mar 2020.
FDA (2019) Current good manufacturing practice regulations, 21 CFR 211.180. https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfcfr/CFRSearch.cfm?fr=211.180. Accessed 2 Mar 2020.
Garvin, D., Edmondson, A., Gino, F. (2008) Is yours a learning organization? Harvard Business Review, March 2008.
ICH (2008) Pharmaceutical quality system, Q10. https://database.ich.org/sites/default/files/Q10_Guideline.pdf. Accessed 2 Mar 2020.
PDA (2018) PDA Quality Culture Guided Assessment Tool. Bethesda, MD: PDA.

Appendix 1

DEFINITIONS

The definitions below come from ICH guidelines or other sources as indicated. Many of the other definitions are composites from a variety of sources and revised to be appropriate for pharma/biopharma.

Accident: An adverse outcome that was not caused by chance or fate. Most accidents and their contributing factors are predictable and the probability of their occurrence may be reduced through system improvement (Kartoglu, 2015).
CAPA (corrective action/preventive action) system: A system for implementing corrective actions and preventive actions resulting from the investigation of complaints, product rejections, nonconformances, recalls, deviations, audits, regulatory inspections and findings, and trends from process performance and product quality monitoring (ICH Q10).
Common cause variation: Fluctuation caused by unknown factors resulting in a steady but random distribution of output around the mean or average of the attribute being measured. It is considered the “noise” in a system, process, activity, or situation; it is the natural pattern of data.
Contributing cause: A factor, situation, or agent that accelerates or intensifies the occurrence of the unwanted event. If the contributing cause is removed, it does not prevent the unwanted event from occurring.
Control: Actions taken to reduce risk by reducing the chances of the unwanted event occurring; sometimes called prevention.
Control strategy: A planned set of controls, derived from current product and process understanding, that assures process performance and product quality. The controls can include parameters and attributes related to drug substance and drug product materials and components, facility and equipment operating conditions, in-process controls, finished product specifications, and the associated methods and frequency of monitoring and control (ICH Q10).
Correction: Action to eliminate a detected nonconformity.
    Note 1: A correction can be made in conjunction with corrective action.
    Note 2: Corrections can be, for example, rework or re-grade (GHTF).
Corrective action: Action to eliminate the cause of a detected nonconformity or other undesirable situation.
    Note 1: There can be more than one cause for nonconformity.
    Note 2: Corrective action is taken to prevent recurrence, whereas preventive action is taken to prevent occurrence.
    Note 3: There is a distinction between correction and corrective action (GHTF).
Critical process parameter (CPP): A process parameter whose variability has an impact on a critical quality attribute and therefore should be monitored or controlled to ensure the process produces the desired quality (ICH Q8R2).
Critical quality attribute (CQA): A physical, chemical, biological, or microbiological property or characteristic that should be within an appropriate limit, range, or distribution to ensure the desired product quality (ICH Q8R2).
Detectability: The ability to discover or determine the existence, presence, or fact of a hazard (ICH Q9).
Deviation: Departure from an approved instruction or established standard (ICH Q7).
Discrepancy: A categorical term that would include something outside of the expected range; an unfulfilled requirement; a nonconformity, defect, deviation; an out-of-specification, out-of-limit, or out-of-trend result.
Harm: Damage to health, including the damage that can occur from loss of product quality or availability (ICH Q9).
Hazard: The potential source of harm (ICH Q9).
Immediate actions: Actions taken when the event is first observed or discovered to limit or minimize the impact or scope of the event.
Knowledge management: Systematic approach to acquiring, analysing, storing, and disseminating information related to products, manufacturing processes and components (ICH Q10).
Likelihood: The chance that the unwanted event and consequence will occur.
Mitigation: Actions taken to reduce risk by reducing the impact or consequence of the unwanted event should it occur; also called protection.
Nonconformity: Nonfulfillment of a requirement (GHTF).
Preventive action: Action to eliminate the cause of a potential nonconformity or other undesirable situation.
    Note 1: There can be more than one cause for nonconformity.
    Note 2: Preventive action is taken to prevent occurrence, whereas corrective action is taken to prevent recurrence (GHTF).
Proximal cause: The event that causes or sets off a series of events that results in the symptom. “The straw that broke the camel’s back.”
Qualification: Action of proving and documenting that equipment or ancillary systems are properly installed, work correctly, and actually lead to the expected results. Qualification is part of validation, but the individual qualification steps alone do not constitute process validation (ICH Q7).
Quality event: An unwanted result or situation that requires some sort of response, usually defined by a procedure. It is the umbrella category that would include excursions, nonconformances, deviations, out-of-specification results, and accidents.
Residual risk: Risk remaining after risk treatment measures (risk control and/or mitigation) have been taken.
Risk: The combination of the probability (likelihood) of occurrence of harm and the severity (impact) of that harm (ICH Q9).
Risk treatment: Activities taken to reduce risk. Typically includes control, mitigation, and preparation.
Root cause: Causal factor that, if corrected, would prevent recurrence of the same or similar accidents. Root causes are the specific underlying causes, can be reasonably identified, are under the control of management to fix, and effective recommendations can be developed to correct/prevent them (adapted from Rooney and Vanden Heuvel, 2004).
Severity: A relative measure of the impact of the unwanted event’s consequence(s) on the thing of value.
Special cause variation: A factor that has changed the system, process, activity, or situation. It can be due to a new or neglected factor that has, for some reason, appeared or emerged. Special cause variation is the “signal” that is expressed.
Specification: A list of tests, references to analytical procedures, and appropriate acceptance criteria that are numerical limits, ranges, or other criteria for the test described. It establishes the set of criteria to which a material should conform to be considered acceptable for its intended use. “Conformance to specification” means that the material, when tested according to the listed analytical procedures, will meet the listed acceptance criteria (ICH Q7).
State of control: A condition in which the set of controls consistently provides assurance of continued process performance and product quality (ICH Q10).
Symptom: Circumstances, events, or conditions that indicate a problem situation exists or has occurred.
Validation: A documented program that provides a high degree of assurance that a specific process, method, or system will consistently produce a result meeting predetermined acceptance criteria (ICH Q7).
Verification: Confirmation through provision of objective evidence that specified requirements have been fulfilled.
Vulnerability: The “soft spot” of a system, process, or activity; the place where, if attacked or affected, the most serious impact or damage would result.
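
The distinction between common cause variation (“noise”) and special cause variation (“signal”) defined above can be illustrated with a short sketch. It is a simplified illustration, not a method prescribed by the book: it assumes a baseline of results that show only common cause variation, and it flags later results that fall outside the baseline mean plus or minus three sample standard deviations, a common control-chart convention. The function name, the parameter k, and the fill-weight data are hypothetical.

```python
from statistics import mean, stdev


def flag_special_cause(baseline, new_points, k=3.0):
    """Return the new points that fall outside mean +/- k * sd of the baseline.

    baseline   -- historical results assumed to show only common cause variation
    new_points -- later results to screen for a possible signal
    k          -- width of the limits in standard deviations (assumed convention)
    """
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    lower, upper = center - k * spread, center + k * spread
    return [x for x in new_points if x < lower or x > upper]


# Hypothetical fill weights in grams; not data from the book.
baseline_weights = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
print(flag_special_cause(baseline_weights, [10.0, 10.3, 11.5]))  # [11.5]
```

A flagged point is only a prompt to look for a new or neglected factor; it does not by itself identify a cause.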

REFERENCES
GHTF (2010) Quality management system – medical devices – guidance on corrective action and preventive action and related QMS processes. Global Harmonisation Task Force. http://www.imdrf.org/docs/ghtf/final/sg3/technical-docs/ghtf-sg3-n18-2010-qms-guidance-on-corrective-preventative-action-101104.pdf. Accessed 27 Apr 2020. NOTE: The work of the now-defunct Global Harmonisation Task Force became part of the International Medical Device Regulators Forum (IMDRF) in 2011.
ICH Q7 (2000) Good manufacturing practice guide for active pharmaceutical ingredients. https://database.ich.org/sites/default/files/Q7%20Guideline.pdf.
ICH Q8R2 (2009) Pharmaceutical development (revision 2). https://database.ich.org/sites/default/files/Q8%28R2%29%20Guideline.pdf.
ICH Q9 (2005) Quality risk management. https://database.ich.org/sites/default/files/Q9%20Guideline.pdf.
ICH Q10 (2008) Pharmaceutical Quality System. https://database.ich.org/sites/default/files/Q10_Guideline.pdf.
Kartoglu, U. (2015) Pharmaceutical and vaccine quality illustrated. Freely available at www.kartoglu.ch. Accessed 26 Mar 2020.
Rooney, J. and Vanden Heuvel, L. (2004) Root cause analysis for beginners. Quality Progress, 37(7).

Appendix 2

INCIDENT INVESTIGATOR’S WORKSHEET

This worksheet can be used to help you collect and organize information to be used in writing an Incident Investigation Report.

Compiling information is not a linear process: as you learn more, you may need to revise some of the information already collected. Not all the questions may be applicable to your investigation, but try to complete as many as possible. Using this form may stimulate your thinking! If a question is answered in another part of the form, you can simply refer to it.

All related procedures take precedence over this worksheet. If there is any perceived conflict, follow the relevant procedures.

Part 1. The facts

What is involved?
    Be as specific as you can be: product name/item number and lot number(s), equipment name/tag number, room name/number, procedure number/revision date, etc. Note: This information could change during the course of the investigation; you may find that the scope of the problem has widened or narrowed.
What is the anomaly?
    Identify the “as is” or “as found”; this is the reason that this incident is being investigated.
What is the requirement?
    Identify the “should be” (specification, regulatory commitment, etc.) along with the source (e.g., USP, finished product specification).
When did it happen?
    Include the date, time, and any other relevant factors (e.g., first day after a shutdown).
Where did it happen?
    Be as specific and descriptive as possible (e.g., room name, location within room, interior or exterior of equipment, step of the process).
When was it observed?
    The observation of the incident could be some time after it happened.
How was it observed or discovered?
    This would explain how the incident was discovered (e.g., by chance, as part of a standard inspection practice).
Who observed or discovered it?
    Name, phone (important for interviewing and follow-up questions), and title.
What was the first thing done when the incident was discovered?
    Identify the immediate actions taken (if any) to stop the problem from getting worse or from affecting more product.
Who “owns” the process?
    Name, phone (important for interviewing and follow-up questions), and title of the process owner.
What were the conditions at the time of the event?
    Identify the relevant environmental conditions (e.g., temperature, humidity) or data from environmental monitoring (e.g., viable, nonviable particulates).
What procedure was involved?
    If a procedure was used, what was its name/number and revision number? What step or substep in the procedure was used?
Was anything different with this event compared to similar or “normal” situations?
    Identify what may have been different: first batch made after a shutdown, changes in a batch record, etc.

Part 2. The investigation plan

What is your plan in conducting the investigation?
    State the documents that will be examined, the people to be interviewed, the records to be reviewed, etc. This becomes a “checklist”; it will be expected that each item listed here is commented on in the report. Usually, the following items (2A–2D) will be included in your review.

Part 2A. Historical review

Has this, or something very similar, happened before?
    This is used to determine if this event is an isolated event or one that has occurred before. If it is a recurring event, you will need to explain why.
What are the review parameters used in your search? What was your rationale for selecting the time period you examined?
    Use a relevant time period based on the number of events or a relevant cycle (e.g., calibration period). Consider other places that the material, equipment, or test may have been used. Look-backs should be at least one year (or more if the opportunity for the unwanted event rarely occurs).
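
The look-back described in Part 2A can be sketched in a few lines of code. This is only an illustration under stated assumptions: the record layout (a list of dictionaries with "id", "date", and "equipment" keys), the function name, and the example deviations are all hypothetical, and the default window of 365 days simply mirrors the one-year minimum suggested above.

```python
from datetime import date, timedelta


def events_in_lookback(events, incident_date, lookback_days=365):
    """Return earlier events dated within the look-back window before the incident."""
    window_start = incident_date - timedelta(days=lookback_days)
    return [e for e in events if window_start <= e["date"] < incident_date]


# Hypothetical deviation history; IDs, dates, and equipment names are made up.
history = [
    {"id": "DEV-001", "date": date(2018, 11, 2), "equipment": "Filler 2"},
    {"id": "DEV-014", "date": date(2019, 6, 20), "equipment": "Filler 2"},
    {"id": "DEV-027", "date": date(2020, 1, 9), "equipment": "Filler 2"},
]

incident = date(2020, 3, 1)
recent = events_in_lookback(history, incident)
print([e["id"] for e in recent])  # ['DEV-014', 'DEV-027']
```

Widening lookback_days is one way to apply the advice to look back further when the opportunity for the unwanted event rarely occurs; the rationale for the chosen window still needs to be recorded in the worksheet.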

Part 2B. Training record review

If people are directly or indirectly involved in this incident, did they have the appropriate education, training, or experience?
    Examine the relevant training records for the person, including when they were trained and the version of the document on which they were trained.

Part 2C. Qualification and validation record review

If this was a qualified device or a validated process, what were the relevant ranges or conditions included in the protocol?
    Examine, as needed, the relevant qualification and validation documents and how they compare to the conditions surrounding this incident.

Part 2D. Calibration, maintenance, and repair record review

Was this instrument or piece of equipment properly calibrated and maintained? When? Has it been repaired recently?
    Examine the schedule and records to determine if the defined scheduled events had occurred. Determine if there were any recent problems with the device.

Part 2E. Typical items to consider for scope (as appropriate)

What did your review of these items find?
    Consider
    • Raw materials
    • Batch records
    • Processing parameters (speeds, times, etc.)
    • In-process specifications, inspections, and test results
    • Cleaning procedures and records
    • Packaging and labeling components
    • In-process or equipment hold times
    • Calculations

Part 2F. Additional items to consider for scope (expand items as necessary)

What were the other items that were examined during the investigation?
    There may be other areas identified in the investigational plan (above). You may want to include the findings in sections like this.

Part 3. Product impact

Did the incident impact the safety, identity, strength (potency), purity, or quality (SISPQ) of the product? Did the incident cause the product to not meet a release specification or regulatory (filing) requirement?
    If products or materials are identified as being within the scope of the investigation (see Part 1), you must determine if there was any adverse impact on the products or materials. If you determine there is no product impact, you will need to fully justify this below.
What is your reason for classifying the impact in this way?
    Provide your justification in terms of safety, identity, strength, purity, and quality (SISPQ), specifications, risk to the patient, and regulatory (filing) requirements.
Might this incident have affected the stability of this product? How or how not?
    Consider stability information that is available, including factors that have been identified that may negatively affect the shelf-life of the product.

Part 4. Cause investigation

What were the POTENTIAL root causes you considered for the incident?
    This is the set of items that could have directly caused the incident. You could get this list through a simple brainstorming activity or a more formal root cause analysis process. Your work here gives evidence that you didn’t just consider the first or most obvious cause(s) that you identified. You will later give a reason for why some of these were eliminated from consideration.
What was the proximal or initiating event believed to cause the incident?
    The proximal cause is the event closest to the incident. It is sometimes called the “sharp end” of the causal sequence.
What were the root causes believed to actually cause the incident?
    Here you would identify the assignable cause(s) or the most probable cause(s).
What were the reasons for eliminating the other potential causes you considered?
    Give your reasons for ruling out the causes you suspected but found no evidence for. Doing so strengthens your argument that you have found the assignable cause(s) or the most probable cause(s).
What were some of the contributing (or indirect) causes for the incident? How do you know these were involved?
    Contributing (or indirect or latent) causes are those that set the stage for the event to occur. Identifying them (and providing a rationale) can help you create a more effective preventive action (PA) plan.
If you didn’t find an assignable or most probable cause, do you have a reason why?
    Sometimes, for a one-time event, you will not be able to identify the reason. This may be because of a lack of information. You need to be able to give evidence that you reasonably tried to discover the cause. Maybe you will have an idea that will make it easier next time to solve the problem if it happens again. If so, you can include it here.
What were the methods used in the investigation?
    Identify the process you used to come to these conclusions, for example, by using a fishbone diagram or cause mapping process. In some simple, obvious cases, the method may have been through direct observation.

Part 5. Corrections

What must be done to the items affected by the incident to allow their release or use?
    Corrections are actions taken to fix the materials, equipment, or process or potentially “rework” the material. If a satisfactory “fix” cannot be made, an intermediate or product may be rejected and destroyed.
How do you know the corrections will be adequate?
    Provide your justification for the correction you are recommending. Your justification needs to logically and scientifically relate to the impact.

Part 6. Corrective actions

What must be done so THIS problem does not happen again?
    Corrective actions are taken so that this particular problem does not happen again. (From a risk-based perspective, this is a control that focuses on reducing the likelihood of the event happening.)
How do you know the corrective action(s) will be adequate?
    Provide your justification for the corrective action(s) you are recommending. Your justification needs to logically and scientifically relate to the impact. It may be useful to use statistics and statistical thinking in building your case.
Should the event happen again, how can you reduce risk to the product or other things of value?
    From a risk-based perspective, this is mitigation that is meant to reduce the impact of the unwanted event.
When will these corrective actions be completed? Who is responsible for the corrective actions?
    Identify the time frame and who (by name, phone number, title) is overseeing the activities.
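
The risk-based view in Part 6, where a control lowers the likelihood of the event and mitigation lowers its impact, can be made concrete with a small scoring sketch. The assumptions here are not from the book: likelihood and severity are each rated on a hypothetical 1 to 5 scale, and the two ratings are combined by multiplication, which is one common convention among several. The function name and the example ratings are also hypothetical.

```python
def risk_score(likelihood, severity):
    """Combine likelihood and severity ratings (each 1 to 5) by multiplying them."""
    for rating in (likelihood, severity):
        if not 1 <= rating <= 5:
            raise ValueError("ratings must be between 1 and 5")
    return likelihood * severity


# Hypothetical ratings for one unwanted event.
before = risk_score(likelihood=4, severity=5)            # no action taken
after_control = risk_score(likelihood=2, severity=5)     # control lowers likelihood
after_mitigation = risk_score(likelihood=4, severity=3)  # mitigation lowers severity
after_both = risk_score(likelihood=2, severity=3)        # both applied

print(before, after_control, after_mitigation, after_both)  # 20 10 12 6
```

The numbers only show the direction of each effect; any real rating scale and acceptance threshold would come from the organization's own risk management procedure.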

Part 7. Preventive actions

What can be done so a similar event doesn’t happen?
    Preventive actions are closely tied to the root (direct) and contributing causes identified above. Include those that are necessary and sufficient to prevent a recurrence. Training is usually not an adequate, stand-alone preventive action. Use multiple or redundant layers of prevention to achieve a more robust solution.
How do you know the preventive action(s) will be adequate?
    Provide your justification for the preventive action(s) you are recommending. Your justification needs to logically and scientifically relate to the impact and risk.
When will these preventive actions be completed? Who is responsible for overseeing these actions?
    Identify the time frame and who (by name, phone number, title) is overseeing the activities.

Part 8. Recommended disposition

Based on the information collected and in consultation with other stakeholders, what should be done with this material? Why?
    Options may include approve, reject, or rework. Your reason or justification for “why” will probably include the points identified in Part 3, Product impact.

Part 9. Communication

Who should know about this incident and its outcome? Those involved? Other sites? Management? Regulatory officials?
    Consider if there needs to be a report of this incident to regulatory authorities. Also, should it be escalated to senior management? Should other people or sites be notified so they can take preventive actions?

Part 10. Investigation personnel

Who was involved in the investigation and/or provided subject matter expertise? (Names, titles, and specialties.)
    Include names of team members as well as expert resources used.

Part 11. Background information (optional)

What additional information may be useful for a technically competent person to more fully understand the process, equipment, or material?
    This is an optional section that could be a process flow diagram, description of a method, etc. Putting information in this section allows for a simpler, more streamlined report.

Part 12. Attachments to include (optional)

What are the documents that you will want to include in the full report?
    This is an optional section that grows during the course of the investigation and might include procedures, batch records, training records, etc.

INDEX

5S 186 cause and effect diagram 142

accident theory 93 change analysis 55, 145, 147
active errors 120–122 change control 82, 183, 189, 200, 202,
adverse mental state 123 238
adverse physiological state 123 changing procedures 218
after-action reviews 277 changing the source 188
alarm fatigue 185, 200 check 80, 119, 158, 183, 184, 194, 195,
ALCOA 32, 194, 226 201, 216, 235, 236, 238, 239, 242
alerts 66, 185 checklist 116, 129, 132, 178, 179, 211,
anchoring bias 138 216, 217, 241, 258
as is 49, 51, 54, 185 checklists 104, 110, 128, 205, 207, 216,
authority bias 138, 144 218
availability 2, 77, 138, 202, 213, 249 churning 258, 261
background information 248 cognitive illusions 139
bias 42, 69, 77, 90, 111, 112, 114, 121, cognitive interview process 167, 169,
138, 144, 155, 160, 202, 237 174
big data 71, 73 commission errors 108
blame 111, 112, 113, 126, 166, 246 common cause variation 70, 71
blameless culture 112 communication 14, 49, 50, 67, 70, 78,
blunt end 109 82, 123, 156, 191, 229, 267–269,
brainstorming 42, 80, 142, 144, 162 270, 271
CAPA 15, 16, 17, 23, 60, 61, 77, 235, competencies 31, 32, 33, 36, 226, 227
261, 284, 287 competency-based training 31, 32, 225,
capability of personnel 117 226, 227, 228

complaints 19, 20, 63, 64, 72, 239, 268, effectiveness checks 59, 182, 190, 235,
285 236, 237, 238, 240, 242
condition of operators 123 elimination 58, 184
confirmation bias 121, 138, 155 empowerment 133
content-based training 224, 225 engineering controls 58, 185
contributing cause 57, 158, 221 equipment failure 14, 108
correction 58, 87, 115, 157, 179, 198, ergonomics 103, 108
240, 249, 253 error detection 198
corrective action 8, 15, 21, 35, 57, 58, error management 198, 199
89, 96, 157, 158, 179, 181, 182, error precursors 128, 241
184, 190, 192, 197, 199, 200, 202, error recovery 198, 199
215, 221, 232, 236, 238, 239, 242, error tolerance 198
267, 271, 284 error traps 64, 104, 110, 287
corrective action plan 200 escalation procedure 50, 178
critical 3, 9, 19, 21, 23, 32, 34, 40, 44, 45, events and causal factors (ECF) 158,
48, 50, 52, 56, 68, 80, 82, 83, 86, 159, 160
108, 110, 111, 114, 121, 124, 125, evidence 16, 21, 48, 50–53, 60, 63, 69,
129, 188, 194, 195, 208, 225, 226, 73, 111, 146, 150, 153, 158, 165,
229, 257, 259, 267, 275, 276 167, 168, 178, 181, 198, 199, 201,
critical stage 257 221, 237, 239, 240, 246, 247, 252,
critical thinking 34, 40, 44, 45, 56 285, 287
data mining 71, 72 execution failures 109
decision errors 121 expectations 2, 6, 13, 14, 17, 37, 42, 65,
decision tree 50, 84, 85, 120, 249 67, 103, 118, 155, 274, 285
design of forms 186 expertise 8, 32, 39, 41, 50, 68, 69, 80,
design of the job 117 111, 155, 227, 286
deviation 16, 18, 19, 20, 22, 23, 25, 26, explicit knowledge 222, 223
28, 37, 58, 64, 77, 83, 86, 88–90, facilitation techniques 42
98, 109, 113, 120, 121, 126, 148, facilitator 34, 39, 40, 42, 43, 44, 80, 144,
160, 167, 174, 177, 190, 208, 218, 278
232, 244, 246, 268, 270, 277 factorial model 100
direct observation 64, 65, 73, 177 fail fast, fail often 274
disincentives 6, 9, 287 failure chain 111
domino theory 95 failure mode effects analysis (FMEA)
Edmondson, Amy 64, 113 77, 80, 81, 83
effectiveness 17, 34, 35, 59, 119, 158, failure to correct known problems
182–184, 190, 198, 235–240, 124
242, 271 fault tree 80, 138, 146, 150, 153

feedback 15, 64, 73, 115, 116, 118, 230, human error 9, 14, 21, 34, 57, 64, 65,
235, 236, 239, 241, 256, 258, 259 104, 108, 109, 111, 115, 116, 126,
finished product 26, 29, 193 128, 129, 133, 139, 185, 191, 192,
fishbone diagram 80, 142, 143, 252, 198, 208, 285, 287
259 human factors 9, 14, 21, 64, 98, 103,
five whys 133, 137, 138, 145, 146, 147, 104, 108, 116, 120, 133, 139, 191,
150, 153, 157, 162, 163 232, 287
five Ws and one H (5W1H) 149 human factors analysis and classification
flow charts 41, 140, 162 system (HFACS) 120, 124, 125,
formal reports 61, 249, 251 191
formative evaluation 235, 236 human performance model 116
Gaussian copula function 201 immediate action 8, 33, 37, 49, 58, 68,
Gawande, Atul 216 88, 157, 177, 179, 249, 257
genotype 109 immediate actions 49, 50, 88, 177, 178,
giving feedback 115, 258, 259 179, 251
global harmonization task force impact 4, 5, 7, 8, 16, 21–23, 38, 48, 50–
(GHTF) 15, 181, 237 52, 57, 58, 60, 64, 73, 80–82, 84,
GMP 13, 17–21, 23, 25–29, 32, 35, 51, 87–89, 103, 115, 159, 177, 179,
52, 89, 183, 194, 195, 221, 283 187–199, 202, 211, 215, 235, 236,
golden hours 21, 51, 111, 168, 169, 285 238, 242, 249, 251, 267–269, 287
good manufacturing practice 32 inadequate supervision 124, 125
Google 38, 72, 273 incentives 6, 9, 287
Google flu trends 72 incident communication 269
ground rules 42, 278 individual factors 103
hazard 42, 80, 81, 82, 97, 105, 178, 184, information ecosystem 212
185 inspectors 14, 15, 17, 26, 31, 45, 60, 82,
health authorities, health authority 13, 88, 104, 190, 221, 252, 261
15, 157, 194, 248, 255, 267, 268, Institute of Medicine (IOM) 5, 6, 115
283, 287 intentional errors 109
Health Canada 17 interim controls 200
healthcare 3, 5, 6, 15, 63, 73, 82, 89, International Conference on Harmon
154, 178 isation (ICH) 17, 19, 20, 23, 26,
Heinrich, H.W. 95 27, 29, 56, 77, 80, 205, 243, 284
Herman, Amy 154 interpretive stage 257
hierarchical model 98, 103, 116, 120 interrogation 165
high reliability organizations (HROs) interview 165, 167, 168, 169, 174
7, 8 investigation plan 53
hindsight bias 112, 138, 160 investigation reports 14, 35, 36, 60,
Holmes, Sherlock 154 226, 249, 253, 260

investigation team 33, 45, 51, 55, 56, 142, 174, 190
Ishikawa diagram 100, 142
is/is not 44, 147
ISO-9000 14
ISO-13485 14, 15
isolation 185
Kahneman, Daniel 139
Kipling, Rudyard 149
know how 223, 274
know that 9, 26, 52, 58, 66, 110, 118, 191, 206, 223, 246
lapses 109, 121
latent errors 120, 122, 124
lead investigator 34, 36, 53, 104
learning from mistakes 273, 274, 277
lessons learned 35, 61, 115, 277
level of detail (in a procedure) 208, 209, 210, 211
level of effort 23, 26, 27
Leveson, Nancy 111
limitations 95, 105, 123, 182, 224
major 19, 107, 119, 123, 185, 186, 200, 224, 227, 242, 252, 267
management 9, 14, 17, 23, 29, 34–36, 39, 48, 50, 57, 59, 64, 70, 77, 78, 82, 83, 85, 90, 98, 111, 112, 118, 123, 124, 126, 134, 158, 168, 178, 183, 192, 198, 199–201, 212, 224, 237, 240, 243, 250, 267, 268, 272, 276, 278, 283, 284, 286
medical devices 14, 15
Medicines and Healthcare products Regulatory Agency 15
mental model 8, 65, 66, 67, 69, 73, 93, 160, 178, 225, 257
minor 19, 85
mistake proofing 192
models 7, 14, 21, 32, 45, 65–67, 69, 72, 93–95, 97, 98, 103–105, 115, 116, 191, 193, 252
monitoring 17, 29, 31, 35, 52, 59, 83, 98, 117, 186, 189, 285
multilinear events sequencing (MES) 97
near misses 112, 239
negligence 110
noise 68, 70, 103, 122, 200, 241
nuclear power 3, 7, 14, 21, 103, 116, 126, 206
nudges 192, 193
observation 15, 36, 39, 47, 49, 63, 64, 65, 73, 154, 155, 162, 168, 177, 228, 275
observe 9, 17, 36, 37, 49, 154, 155
O’Donnell, Kevin 129
omission errors 108
opportunities to practice 118, 119
optimism bias 42, 138
organizational culture 112, 125, 276, 280
organizational processes 114, 125
organizational resistance 202
organizational structure 117
other industries 3, 116
out of specification (OOS) 19, 22, 23, 55, 118, 128, 147
over-reliance on technology 201
owner 37, 41
patient impact 88, 89
perceptual errors 121
personal readiness 123
pharmaceutical quality system (ICH Q10) 56, 284
pharmacovigilance 15, 33, 70
phenotype 109
planned inappropriate operations 124

planning failures 109
poka-yoke 192
preventive action 15, 16, 58, 181, 182, 249, 284
probable root cause 232, 246
problem description 55, 149, 251
problem statement 51, 54, 257
procedure 13, 14, 16, 18–23, 27, 37, 39, 50, 54, 55, 57, 82–84, 87, 88, 109, 111, 118, 121, 122, 125, 126, 140, 178, 179, 189–191, 199, 202, 205, 206, 208, 210–215, 221, 225, 228, 241, 247, 268, 269, 276, 286
procedures (SOPs) 5, 18–25, 28, 34, 35, 59, 98, 100, 104, 110, 118, 124, 125, 133, 140, 160, 189, 190, 199, 200, 205–209, 212, 214, 216, 218, 221, 223, 224, 248, 256, 287
process control charts 70
process improvement 88
process map 140, 142, 184, 207
production operator 33
providing additional information 189
proximal cause 57, 111, 157, 191, 215, 239, 252, 257
psychological safety 38, 64, 112, 114, 166, 286
Q9 19, 23, 26, 27, 29, 77, 78, 80
qualification 59, 64, 188, 197, 198, 237
qualified personnel 18, 22, 25, 28, 124
quality 6, 7, 13–16, 18–23, 25–29, 31, 33, 34–37, 42, 47, 50–56, 59, 60, 63, 64, 77, 78, 80–85, 88–90, 97, 98, 100, 107, 111, 120, 124–126, 139, 156–158, 165, 167, 168, 174, 186, 188, 190, 202, 206, 212, 213, 227, 230, 232, 238, 240, 243, 244, 249, 250, 256, 261, 268, 272, 274, 276, 277, 280, 283–287
quality auditors 14, 26, 31, 60
quality control 18, 20, 21, 22, 23, 25, 27, 28, 29, 190
quality control unit 20, 21, 22, 27, 28
quality culture 287
quality risk management (QRM) 23, 29, 77–80, 82–85, 90, 243, 250
quality unit 13, 18, 27, 28, 34, 36, 37, 50, 51, 52, 53, 126, 227, 261, 286
quality unit leadership 34
rationale 18, 22, 25, 28, 202, 246, 251, 260
RCA 137
Reason, James 97, 109, 120, 215
receiving feedback 116, 259
records 18–22, 25, 26, 28, 33, 43, 95, 119, 122, 125, 157, 184, 185, 190, 194, 202, 207, 226–228, 236, 239, 246, 248, 249, 260
recurrences 2, 9, 59, 189
reflection 56, 224, 277, 286
regulatory requirements 285
rejection 2, 179
reports 14–16, 19, 35, 36, 45, 60, 61, 83, 94, 178, 201, 226, 238, 243, 245, 248, 249, 251, 253, 257–260, 268, 286
report writer 34, 256
report writing 251
residual risk 82, 183, 200
resilience 8
resource management 123, 124, 125
review 16, 17, 20, 21, 28, 29, 37, 60, 61, 63, 77, 78, 80, 83, 108, 128, 194, 195, 197, 206, 249, 256, 259, 260, 277
rewards 118


risk 3, 7, 13, 14, 17, 19, 23, 26, 27, 29, 34–36, 38, 50, 59, 63, 64, 70, 73, 77, 78, 80–86, 88–90, 104, 105, 114, 150, 178, 183, 188, 194, 195, 197, 200, 201, 211, 215, 243, 249, 250, 270, 271, 274, 275, 287
risk analysis 80, 104
risk-based thinking 34, 35, 77, 83, 88, 90, 178, 194, 195, 215
risk evaluation 70, 81, 82
risk matrix 81
risk reduction 70, 81, 82
risk review 83
risk treatment 81
role of leadership 280
root cause 8, 15, 16, 20, 21, 23, 45, 57, 107, 108, 111, 119, 133, 137–139, 146, 147, 149, 150, 157, 162, 182, 199, 232, 235, 236, 238, 239, 246, 248, 257, 259, 283, 285–287
root cause analysis 21, 137, 149, 157
sabotage 114, 122
Sandle, Tim 13
scope 21, 37, 38, 41, 45, 51, 52, 54, 57, 58, 61, 80, 88, 179, 236, 238, 239, 246, 251, 267–269
selecting the team 41
selection bias 138, 155
sharp end 109, 111, 133, 246
shortcuts 122, 124, 191, 241, 287
should be 18–23, 25–27, 31, 37, 41, 42, 49, 51, 54, 58, 61, 64, 65, 70, 78, 81–83, 88, 109, 110, 113, 157, 160, 162, 178, 184, 191, 200, 208, 212, 214, 216, 221, 230, 236, 238, 241, 245, 247–249, 251, 256–268, 271, 287
signal 42, 68, 69, 70, 72, 270
signal detection 69, 70
single-event model 94, 95
situational awareness 7, 67, 68, 69
situational factors 122
skill-based errors 121
Skinner, B.F. 115
slips 109, 121
social learning 224, 227, 228
special cause variation 70, 71
spelling 45, 61, 208, 247
sterile cockpit rule 129
stories 2, 227, 273
structural stage 257
structured on-the-job training (S-OJT) 32, 225, 228–230, 240, 241
structured on-the-job training (S-OJT) learning guide 32, 225, 228, 240
subject matter expert (SME) 33, 36, 38, 138
substitution 58, 184
summative evaluation 235, 236
supervisory violations 124
sustainability 202, 240
Sweden 285
Swiss cheese model 193
symptom 63, 64, 73, 146, 150, 153, 160, 162, 236, 239, 248
systems accident 100, 102
tacit knowledge 118, 222, 223, 274
timelines 56, 144, 162
time pressure 3
to err is human 5, 115
tool selection 162
Toyota Motor Company 186
training 5, 31, 32, 36, 37, 44, 45, 58, 61, 80, 83, 96, 110, 111, 116, 118, 119, 123–126, 133, 148, 190, 191, 199, 202, 205, 207, 208, 211, 212, 216, 221, 222, 224–230, 232, 239–241, 246, 248, 258, 267, 275, 287

trend 16, 22, 41, 63, 177, 216, 237, 239
triage 19, 63, 83, 85, 287
TWIN 114, 126, 128, 241
US FDA 15, 50, 55, 178, 211, 214, 260, 268
validation 59, 197, 198, 237, 238
value of a team 38
verification 125, 158, 194, 195, 208, 237, 238
verify 194
violations 104, 110, 113, 121, 122, 124, 208, 277
Williams, Brian 166
World Health Organization (WHO) 2, 4, 7–9, 17, 19, 20, 23, 25, 27, 29, 31, 37–39, 41–45, 49, 50, 51, 53, 55, 61, 62, 64, 65–67, 69, 70, 80, 85, 94, 100, 103, 111, 112, 117, 123, 139, 144, 147–149, 154, 155, 157, 162, 165, 168, 190, 193, 201, 202, 206–209, 214, 215, 224–228, 230, 237, 241, 244, 256, 257, 268, 271, 273, 276, 280, 286
written report 24, 37, 44, 245


Suite 600, Bethesda, MD 20814.


1 WHY INVESTIGATIONS AND CORRECTIVE

2 REGULATORY REQUIREMENTS AND

3 ROLES AND RESPONSIBILITIES 31

4 THE BIG PICTURE: INVESTIGATIONS AND

5 THE INITIAL DISCOVERY OF AN EVENT 63

6 APPLYING RISK-BASED THINKING TO

7 MODELS USED IN DESCRIBING INCIDENTS 93

Chain-of-events models or domino theory 95

8 HUMAN ERRORS AND HUMAN FACTORS 107

9 METHODS AND TOOLS USED WHEN

11 IMMEDIATE ACTIONS AND CORRECTIONS 177

12 CORRECTIVE ACTIONS AND PREVENTIVE

13 PROCEDURES: CAUSES OF PROBLEMS AND

14 TRAINING AS A CORRECTIVE ACTION 221

Tacit and explicit knowledge 222

15 CORRECTIVE ACTION EVALUATION AND

16 WRITING THE REPORT 243

17 REVIEW AND APPROVAL OF THE

Communicating potential risks 270

19 LEARNING FROM SUCCESSES AND FAILURES 273

20 MANAGEMENT RESPONSIBILITIES 283

APPENDIX 1: DEFINITIONS 291

APPENDIX 2: INCIDENT INVESTIGATOR’S

Investigations are an essential part of the regulated healthcare

This is a time of significant and unprecedented opportunity

monitoring and testing, and manufacturing intelligence data

In parallel, external influences are affecting the business

With this change in emphasis come corresponding opportunities

Improvement means challenging the status quo. It comes

The objectives of failure investigations and corrective actions

This book incorporates three of the most essential process

As members of the global healthcare product community,

objective. This volume captures information and presents insight

Thanks to the participants in workshops and also the clients who

Thanks to my colleagues at ValSource, particularly Hal

Thanks to Umit Kartoglu and Tom Reeves for their

So I had to stop. What this means is that this is not going to be a

This book is based on public and in-house workshops that I

The chapters are arranged to first provide some high-level

Chapter 1, Why Investigations and Corrective Actions Matter,

Chapter 2, Regulatory Requirements and Expectations, looks at

Chapter 3, Roles and Responsibilities, identifies who

Chapter 4, The Big Picture: Investigations and Corrective

Chapter 5, The Initial Discovery of the Event, considers how

Chapter 6, Applying Risk-based Thinking to Quality Events

this. The ICH Q9 Quality Risk Management model is presented and

Chapter 7, Models Used in Describing Incidents, presents ways

Chapter 8, Human Errors and Human Factors, is the chapter

Chapter 9, Methods and Tools Used When Conducting

Chapter 10, Interviews, describes why getting statements as soon

Chapter 11, Immediate Actions and Corrections, looks at

Chapter 12, Corrective Actions and Preventive Actions, defines

Chapter 13, Procedures: Causes of Problems and Potential
