Data Management - Unit 1
Clinical Trial
What is Clinical Trial?
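Any investigation in human subjects intended to discover or verify the clinical, pharmacological
and/or other pharmacodynamic effects of an investigational product(s), and/or to identify any adverse
reactions to an investigational product(s), and/or to study absorption, distribution,
metabolism, and excretion of an investigational product(s) with the object of ascertaining its
safety and/or efficacy. The terms clinical trial and clinical study are synonymous.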
Multidisciplinary Teams in CT
Clinical Trials are supported by:
• Clinical Investigator
• Site coordinator
• Pharmacovigilance
• Clinical Data Management
• Biostatistician
• Regulatory affairs
• Project manager
Introduction to CDM
What is Clinical Data Management?
Clinical data management (CDM) is the process of collecting and managing research data in
accordance with regulatory standards to obtain quality information that is complete and
error-free.
The goal is to gather as much of such data for analysis as possible that adheres to federal,
Why CDM
• Review & approval of new drugs by Regulatory Agencies is dependent upon a trust that clinical trials
data presented are of sufficient integrity to ensure confidence in results & conclusions presented by
pharma company
• Hence companies must assure that all staff involved in the clinical research are trained & qualified to
DM Role in Clinical Research
CDM has evolved from a mere data entry process to a much more diverse process today
• The data management function provides all data collection and data validation for a clinical trial
program
• Data management is essential to the overall clinical research function, as its key deliverable is the data to
support the submission
• Assuring the overall accuracy and integrity of the clinical trial data is the core business of the data
management function
• Data is collected to establish whether the objective of the Clinical Trial is met
Important Abbreviations
• DM – Data Management
• DB – Database
• QC – Quality Control
• SDTMIG - Study Data Tabulation Model Implementation Guide for Human Clinical Trials
Objective of CDM
Data Collection --
• Paper, Electronic and Remote data capture
Data integration --
• Integration of data received from all sources in a single DB.
Scope of CDM
• The main scope of CDM is to collect, validate and analyze clinical data
• Data collection instruments include the paper CRF, electronic CRF, clinical database, etc.
• Validation tools include edit checks, User Acceptance Testing, etc.
Importance of CDM
CDM is a vital vehicle in Clinical Trials to ensure integrity & quality of data being transferred from trial
• That trial database is complete, accurate & a true representation of what took place in trial
• That trial database is sufficiently clean, to support statistical analysis, its subsequent presentation & interpretation
• At the study level, data management ends when the database is locked and the Clinical Study Report is final
• At the compound level (of the drug), data management ends when the submission package is assembled and complete
CDM Communication and interfaces
Interdependent groups in CDM
• Data Cleaning
• Programming
• Biostatistics
CDM Activities – At different stages of study
Start Up Phase
• CRF Design
• Database Design
• Edit Check
• Database Activated
Conduct Phase
• Data Entry
• Query Management
• Coding of Medical Terms
• SAE Reconciliation
• Database Update
Closeout Phase
• Database QC
• Database Lock
CDM Activities – Start Up Phase
Protocol → CRF Design → Validation/Derivation Procedure → Database Activation
CDM Activities – Conduct Phase
• Database Activation
• Data Entry/Loading (CRF and external data)
• Discrepancy Management
• Query Resolution and Update DB
• Safety Data Reconciliation
• Coding Terms
• Manual Check, QC & CRF Tracking
• DB Lock and freeze
CDM Activities – Startup to Closeout Phase
Start up phase Conduct phase Closeout phase
CDM Activities – Start Up Phase
Roles and Responsibilities
CRF Designers - Design the CRF as per the protocol
DB Designers - Design the DB as per the protocol, CRF or CRF specifications, and activate it
Programmers - Program the validation and derivation procedures, and activate them
Data Managers - Review the CRF prior to activation, test the database prior to activation, write the
validation and derivation procedures/checks, and test them prior to activation
CDM Activities – Conduct & Closeout Phase
Roles and Responsibilities
Data Entry/Data Loaders - Manually enter the data (in case of paper studies), load the data (in case of
electronic studies) and load external data (Example: lab, ECG, subject diaries etc.)
Data Managers - Identify and resolve discrepancies, issue queries to site & resolve them, carry out manual
Safety Data Managers - Perform the safety reconciliation by comparing the clinical database with the
safety database
Dictionary Coders - Code medical terms collected during the clinical trial. Example: medications and adverse
events
Data Capture Tools
• Paper-based CRF and eCRF
• Differences:
Paper-based CRF: a data entry associate enters the data into the clinical database
eCRF: the investigator enters the data into the database
A printed, optical, or electronic document designed to record all of the protocol required
CRF Design
CRF & CRF Design
The following points are to be borne in mind while designing a CRF:
• Use of consistent formats, font style and font sizes throughout the CRF booklet
• Visual cues, such as boxes that clearly indicate place and format of data to be recorded should be
provided to the person recording the data as much as possible
• Using the option of “circling of answers” should be limited as it's hard to interpret; instead check boxes
would be appropriate
• Clear guidance about skip patterns like what to skip and what not to skip should be mentioned at
appropriate places
CRF & CRF Design
The following points are to be borne in mind while designing a CRF:
• Provide boxes or separate lines to hold the answers. This indirectly informs the data recorder where to
write/enter the response and helps to differentiate it visually from the entry fields for other questions
• Do not split modules/sections (a set of one or more related groups of questions that pertain to a single
clinical study visit). For example, the AE section should not be split and laid across pages such that
information related to a single AE has to be collected from different pages
• Use “no carbon required (NCR)” copies to ensure an exact replica of the CRF
• Use instructions, including page numbers, where data has to be entered. For example, during a follow-up visit the
investigator is supposed to record whether any AE has occurred and, if so, details of the AE have to
be recorded in the AE module. Hence, the field corresponding to this question on the module for the
particular visit would have the options “yes” or “no”, with an instruction “If ‘yes’, please
provide the information in the AEs page (page no. XX)”
CRF & CRF Design
Eg: A sample case report form (CRF) page. An adverse event page of the CRF is depicted, showing codes and skip
questions
CRF & CRF Design
Eg: Illustration of well-designed and poorly designed data fields, showing the significance of visual cues that
help the site personnel understand the format
CRF & CRF Design
Eg: Illustration of Coding on the case report form module
• These templates are of great help while conducting multiple studies in the same research area, as
they have the same design principles; because the design is familiar, users can enter data with ease,
and there is no need for special training on these CRF modules.
• The most commonly used standard CRF templates are the inclusion criteria, exclusion criteria, demography,
medical history, PE, AE, concomitant medication and study outcome modules.
• The modules that capture efficacy data, by contrast, do not have a single standard design. Their design varies from study to
study depending on the protocol specifications.
CRF & CRF Design
Eg: Connectivity/well referenced case report forms
• Linking of CRF (paper CRF and eCRF) pages wherever necessary is known as CRF connectivity.
• Each CRF booklet is assigned a unique subject ID, and it is the duty of site personnel to make sure
that the same ID is entered on all pages of the CRF booklet.
• A consistently entered subject ID helps in tracking missing CRF pages.
• Fields such as protocol ID, site code, subject ID, and patient initials make database design
easier and help link CRF pages to the study database.
• Fields like protocol ID and visit labels are informative features, as they provide brief descriptions
of the study and the schedule of assessments, respectively.
• The CRF version number is a critical field that prevents an incorrect CRF page from being used.
• All pages of the CRF booklet should be numbered in sequential order, which helps in identifying
queries through data validation procedures and manual reviews.
CRF & CRF Design
Eg: Challenges In Case Report Form Designing
Challenge: Commonly encountered challenges in CRF designing are consistency in the design, collection of precise data and user-friendliness.
Mitigation:
• Proper planning by a team of data management personnel, biostatisticians, clinicians, and medical writers.
• Objectives should be defined clearly before designing.
• Consistent design is a crucial aspect as it reduces the number of mistakes in data entry.
• Maintaining standard CRF templates would resolve this issue.
Challenge: Collection of extraneous data is another issue as processing this becomes tedious. In such instances, ensuring accuracy and quality become major challenges.
Mitigation:
• Attention should be paid to avoid duplication.
• Design the CRF to avoid referential and redundant data collection.
• For example, collecting calculated fields/derivable data should be avoided to ensure that data collection is cost-effective.
Challenge: Designing user-friendly CRF to reduce data entry errors is again a challenge.
Mitigation:
• Simple/standard designs should be incorporated wherever possible.
• Providing CRF completion guideline aids in minimizing the challenges in data capture and data entry.
CRF & CRF Design
Eg: Case Report Form Completion Guidelines
• A CRF completion guideline is a document that assists the investigator in completing the CRF step by step;
it is drafted concurrently with the CRF and in line with the protocol.
• It should be prepared in such a way that it enables the site personnel to complete the CRFs easily and
legibly.
• The CRF completion manual should provide clear instructions to site personnel for accurate completion of CRFs
along with clear expectations, including proper instructions on handling unknown data.
• For example, if the exact date is unknown, use a preferred notation in place of the missing value (e.g., UK/UNK/2012). The
language used should be simple, concise, and easy to understand, with clear instructions.
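A minimal sketch of how a program downstream of the CRF might read the unknown-date notation mentioned above. The DD/MMM/YYYY layout, the use of UK for an unknown day and UNK for an unknown month, and the name parse_partial_date are assumptions made for illustration; the real convention is whatever the study's CRF completion guideline specifies.

```python
from typing import NamedTuple, Optional

# Three-letter month abbreviations assumed for the MMM part of the date.
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


class PartialDate(NamedTuple):
    day: Optional[int]    # None when recorded as UK
    month: Optional[int]  # None when recorded as UNK
    year: int             # this sketch requires the year to be known


def parse_partial_date(text: str) -> PartialDate:
    """Parse 'DD/MMM/YYYY', allowing UK for the day and UNK for the month."""
    parts = text.strip().upper().split("/")
    if len(parts) != 3:
        raise ValueError(f"expected DD/MMM/YYYY, got {text!r}")
    day_text, month_text, year_text = parts

    day = None if day_text == "UK" else int(day_text)

    if month_text == "UNK":
        month = None
    elif month_text in MONTHS:
        month = MONTHS[month_text]
    else:
        raise ValueError(f"unrecognised month {month_text!r}")

    return PartialDate(day, month, int(year_text))
```

Keeping the unknown parts as None, rather than substituting a default day or month, preserves what the site actually recorded; any imputation would be a separate, documented step.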
Annotated CRF
The annotated CRF is a blank CRF including treatment assignment forms that maps each
blank on the CRF to the corresponding element in the database.
• The annotated CRF should provide the variable names and coding.
• Each page and each blank of the CRF is represented in an annotated CRF.
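One way to picture an annotated CRF is as a lookup from each blank on a page to a database variable and its codes. The sketch below is illustrative only: the Annotation class, the variable name SEX, the code values and the helper unannotated_blanks are hypothetical and do not come from any particular CDMS.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Annotation:
    page: int                      # CRF page number
    crf_label: str                 # text of the blank as printed on the CRF
    variable: str                  # database variable the blank maps to
    codes: Dict[str, str] = field(default_factory=dict)  # stored code -> meaning


def unannotated_blanks(
    crf_blanks: List[Tuple[int, str]],
    annotations: List[Annotation],
) -> List[Tuple[int, str]]:
    """Return the (page, label) blanks that have no annotation yet."""
    annotated = {(a.page, a.crf_label) for a in annotations}
    return [blank for blank in crf_blanks if blank not in annotated]
```

A check like unannotated_blanks reflects the rule that every page and every blank must be represented: an empty result means the annotation is complete for the blanks listed.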
Database and Database Design
Databases are clinical software applications built to facilitate CDM tasks across multiple studies.
Generally, these tools have built-in compliance with regulatory requirements and are easy to
use.
Database Design & Validation
Edit Checks
• “Edit checks,” sometimes called “constraints” or “validation,” automatically compare
entered values with criteria set by the form builder.
• The criteria may be a set of numerical limits, logical conditions, or a combination of the
two.
• If the entered value violates any part of the criteria, a warning appears, stating why the
input has failed and guiding the user toward a resolution (without leading her toward any
particular replacement).
Edit Checks
• In the above example, an edit check has been created in the system as per the protocol requirement that
subjects younger than 40 years are not to be enrolled. The site entered the age as 25, and a
validation prompt popped up.
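The age example can be sketched as a small rule that compares an entered value with a protocol limit and returns a warning. This is a minimal illustration, not how any specific CDMS implements edit checks; the EditCheck class, the field name age and the message wording are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EditCheck:
    field_name: str                   # CRF field the check applies to
    passes: Callable[[object], bool]  # True when the entered value is acceptable
    message: str                      # says why the input failed, without proposing a value


def run_edit_checks(record: Dict[str, object], checks: List[EditCheck]) -> List[str]:
    """Return one warning per failed check; an empty list means no warnings."""
    warnings = []
    for check in checks:
        value = record.get(check.field_name)
        if value is None:
            # Missing values would be caught by a separate check in this sketch.
            continue
        if not check.passes(value):
            warnings.append(f"{check.field_name}: {check.message}")
    return warnings


MIN_AGE_YEARS = 40  # protocol limit taken from the example above

age_check = EditCheck(
    field_name="age",
    passes=lambda v: isinstance(v, int) and v >= MIN_AGE_YEARS,
    message=f"Age is below the protocol minimum of {MIN_AGE_YEARS} years. Please verify.",
)
```

Calling run_edit_checks({"age": 25}, [age_check]) yields a single warning, while an age of 45 yields none. The message states why the value failed and asks for verification, and it does not suggest a replacement.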
CRF Tracking
• The entries made in the CRF will be monitored by the Clinical Research Associate (CRA)
for completeness and filled up CRFs are retrieved and handed over to the CDM team.
• The CDM team will track the retrieved CRFs and maintain their record.
• CRFs are tracked for missing pages and illegible data manually to assure that the data are
not lost.
• In case of missing or illegible data, a clarification is obtained from the investigator and the
issue is resolved.
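Tracking for missing pages can be sketched as a comparison between the pages expected in a booklet and the pages actually received for each subject. The function names and the assumption that pages are numbered 1 to N are illustrative only.

```python
from typing import Dict, Iterable, List


def missing_pages(expected_page_count: int, received_pages: Iterable[int]) -> List[int]:
    """Return the page numbers (1..expected_page_count) that were not received."""
    received = set(received_pages)
    return [page for page in range(1, expected_page_count + 1) if page not in received]


def tracking_report(
    expected_page_count: int,
    received_by_subject: Dict[str, Iterable[int]],
) -> Dict[str, List[int]]:
    """Map each subject ID with gaps to its list of missing pages."""
    report = {}
    for subject_id, pages in received_by_subject.items():
        gaps = missing_pages(expected_page_count, pages)
        if gaps:
            report[subject_id] = gaps
    return report
```

Subjects with a complete booklet are left out of the report, so the output lists only the pages that still need follow-up with the site.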
CRF Entry
• Data entry takes place according to the guidelines prepared along with the DMP.
• This is applicable only in the case of paper CRF retrieved from the sites.
• Usually, double data entry is performed wherein the data is entered by two operators separately.
• The second pass entry (entry made by the second person) helps in verification and reconciliation
by identifying the transcription errors and discrepancies caused by illegible data.
• Moreover, double data entry helps in getting a cleaner database compared to a single data entry.
• Earlier studies have shown that double data entry ensures better consistency with the paper CRF, as
denoted by a lower error rate.
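The verification step in double data entry amounts to comparing the two passes field by field and listing every disagreement for review against the paper CRF. A minimal sketch, with hypothetical field names and a hypothetical compare_entries helper:

```python
from typing import Dict, List, Optional, Tuple

Mismatch = Tuple[str, Optional[str], Optional[str]]


def compare_entries(first_pass: Dict[str, str], second_pass: Dict[str, str]) -> List[Mismatch]:
    """Return (field, first value, second value) for every field where the passes differ."""
    mismatches = []
    for field_name in sorted(set(first_pass) | set(second_pass)):
        first_value = first_pass.get(field_name)    # None if the first operator skipped it
        second_value = second_pass.get(field_name)  # None if the second operator skipped it
        if first_value != second_value:
            mismatches.append((field_name, first_value, second_value))
    return mismatches
```

A transposition such as "72" in one pass and "27" in the other is reported as a mismatch. The comparison does not decide which value is right; that is settled by checking the source CRF.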
Discrepancy management
• This is also called query resolution.
• Discrepancy management helps in cleaning the data and gathers enough evidence for the
deviations observed in data.
• Almost all CDMS have a discrepancy database where all discrepancies will be recorded and
stored with audit trail.
Discrepancy management
• Based on the types identified, discrepancies are either flagged to the investigator for clarification
or closed in-house by Self-Evident Corrections (SEC) without sending a Data Clarification Form (DCF) to the site.
• For discrepancies that require clarifications from the investigator, DCFs will be sent to the site.
• Investigators will write the resolution or explain the circumstances that led to the discrepancy in
data.
• When a resolution is provided by the investigator, the same will be updated in the database.
• In case of e-CRFs, the investigator can access the discrepancies flagged to him or her and will be able to
provide the resolutions online.
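The routing described above (close in-house as an SEC, or send a DCF and record the investigator's resolution) can be sketched as a small state change with an audit trail. The Discrepancy class, the status names and the trail format are assumptions for illustration and do not mirror any particular CDMS.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class Discrepancy:
    subject_id: str
    field_name: str
    description: str
    self_evident: bool            # True when an in-house correction is allowed
    status: str = "open"
    audit_trail: List[str] = field(default_factory=list)

    def log(self, action: str) -> None:
        """Append a timestamped entry so every change stays traceable."""
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_trail.append(f"{stamp} {action}")


def route(discrepancy: Discrepancy) -> str:
    """Close self-evident discrepancies in-house; send the rest to the site."""
    if discrepancy.self_evident:
        discrepancy.status = "closed_sec"
        discrepancy.log("closed in-house by self-evident correction")
    else:
        discrepancy.status = "dcf_sent"
        discrepancy.log("DCF sent to site")
    return discrepancy.status


def record_resolution(discrepancy: Discrepancy, resolution: str) -> None:
    """Store the investigator's answer and mark the discrepancy resolved."""
    discrepancy.status = "resolved"
    discrepancy.log(f"investigator resolution recorded: {resolution}")
```

Each step appends to audit_trail rather than overwriting it, which is the property the discrepancy database is expected to provide.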
Medical Coding
• Medical coding helps in identifying and properly classifying the medical terminologies associated with the
clinical trial.
• Technically, this activity needs the knowledge of medical terminology, understanding of disease entities,
drugs used, and a basic knowledge of the pathological processes involved.
• Functionally, it also requires knowledge about the structure of electronic medical dictionaries and the
hierarchy of classifications available in them.
Eg: Adverse events occurring during the study, prior and concomitant medications, and pre-existing or co-existing
illnesses are coded using the available medical dictionaries.
• Commonly, the Medical Dictionary for Regulatory Activities (MedDRA) is used for coding adverse events as
well as other illnesses, and the World Health Organization–Drug Dictionary Enhanced (WHO-DDE) is used for
coding medications.
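As a rough illustration of auto-coding, the sketch below matches verbatim terms against a lookup table and sets aside anything unmatched for a human coder. The table is a made-up placeholder, not MedDRA or WHO-DDE content; the real dictionaries are licensed, versioned and hierarchical, and the function name autocode is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical toy dictionary: normalised verbatim text -> dictionary term.
TOY_DICTIONARY = {
    "headache": "Headache",
    "head ache": "Headache",
    "nausea": "Nausea",
}


def autocode(
    verbatim_terms: Iterable[str],
    dictionary: Dict[str, str],
) -> Tuple[Dict[str, str], List[str]]:
    """Return (coded terms, terms left for manual coding)."""
    coded: Dict[str, str] = {}
    uncoded: List[str] = []
    for term in verbatim_terms:
        # Normalise case and whitespace before looking the term up.
        key = " ".join(term.lower().split())
        if key in dictionary:
            coded[term] = dictionary[key]
        else:
            uncoded.append(term)
    return coded, uncoded
```

Exact matching after normalisation is a deliberately conservative choice here: anything the table does not recognise goes to a dictionary coder instead of being guessed.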
AE/SAE Reconciliation
Database Locking
• After a proper quality check and assurance, the final data validation is run.
• If there are no discrepancies, the SAS datasets are finalized in consultation with the
statistician.
• All data management activities should have been completed prior to database lock.
• To ensure this, a pre-lock checklist is used and completion of all activities is confirmed.
• This is done as the database cannot be changed in any manner after locking.
• Once the approval for locking is obtained from all stakeholders, the database is locked and
clean data is extracted for statistical analysis.
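The pre-lock checklist and stakeholder approvals can be thought of as a gate that must be fully satisfied before locking. A minimal sketch, assuming the checklist and approvals are kept as simple name-to-boolean mappings; the item names and the ready_to_lock helper are hypothetical.

```python
from typing import Dict, List, Tuple


def ready_to_lock(
    checklist: Dict[str, bool],
    approvals: Dict[str, bool],
) -> Tuple[bool, List[str]]:
    """Return (True, []) only when every activity is done and every approval is given."""
    blockers = [f"incomplete: {item}" for item, done in checklist.items() if not done]
    blockers += [f"approval pending: {who}" for who, given in approvals.items() if not given]
    return len(blockers) == 0, blockers
```

Returning the list of blockers alongside the decision makes it clear what remains to be done before the lock can proceed.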
Database Locking
• Generally, no modification in the database is possible. But in case of a critical issue or for
other important operational reasons, privileged users can modify the data even after the
database is locked.
• This, however, requires proper documentation and an audit trail has to be maintained with
sufficient justification for updating the locked database.
Computer System Validation (CSV)
• Computerized System Validation (CSV) is the documented process of “achieving and
maintaining compliance with applicable GxP regulations and fitness for intended use by
the adoption of principles, approaches, and life cycle activities within the framework of
validation plans and reports and by the application of appropriate operational controls
throughout the life of the system”
• The objective of the CSV process, per 21 CFR Part 11, §11.10(a), is to ensure accuracy, reliability, consistent
intended performance, and the ability to discern invalid or altered records.
Computer System Validation (CSV)
• The overall CSV process must take the following four criteria into account:
1. Will the system be able to meet the required regulatory standards laid down by 21
CFR Part 11
2. The impact those systems might have on the: Accuracy, Reliability, Integrity,
Availability and Authenticity of the required records and signatures
3. That a robust risk analysis is carried out to ensure the highest risk aspects of the
systems are tested adequately
SOP Name: Preparation, Review and Approval of DMP
Purpose:
• The contents of the DMP are: objective, study and protocol overview, study team
information, study directory, study specific training, CRF & eCRF design &
development, clinical data management system, data processing, data security,
backup and restoration, regulatory guidelines, SAE reconciliation, medical coding,
data transfer, study status report, data management QC, QA, database freeze &
lock, interim database lock and data transfer, archival process
SOPs in Data Management
SOP Name: Data Management Study Binder
Purpose:
• The aim of this SOP is to describe the development and content of the Clinical
Data Management Study Binder.
• This SOP defines the process for the DM study binder preparation, maintenance,
storage and transfer to the Client.
SOP Name: Preparation and Review of the Database Design Document
Purpose:
• The aim of this Standard Operating Procedure (SOP) is to describe the
preparation and review of the database design.
SOP Name: Risk Management Plan
Purpose:
• The aim of this SOP is to define the process by which a study specific Risk
Management Plan (RMP) shall be prepared, reviewed and approved at Clinical
Data Management (CDM).
• The RMP includes risk identification, analysis, reporting, mitigation, etc.
SOP Name: Design and Development of the Case Report Form
Purpose:
• The aim of this Standard Operating Procedure (SOP) is to define the process by
which study specific Case Report Form (CRF) shall be designed, reviewed, and
approved in compliance with the protocol, applicable regulations, guidelines and
SOPs.
SOP Name: Database Designing
Purpose:
• The aim of this SOP is to define the process by which the study specific database
shall be designed.
SOP Name: Designing and Programming Edit Checks
Purpose:
• The aim of this SOP is to define the process by which study specific edit checks
shall be designed and programmed in CDM.
SOP Name: User Acceptance Testing
Purpose:
• The aim of this SOP is to define the process by which the User Acceptance Testing
(UAT) shall be performed to validate the designed database, test programmed
edit checks and user roles for study specific clinical data management activities.
SOP Name: Database Go Live
Purpose:
• The aim of this SOP is to define the process for ‘Database Go-Live’ initiating the
CDM activities in the production instance.
SOP Name: Receipt and Tracking of Case Report Forms
Purpose:
• The aim of this Standard Operating Procedure (SOP) is to describe the Receipt
and Tracking of Case Report Forms.
SOP Name: Data Entry and Verification
Purpose:
• The aim of this Standard Operating Procedure is to define the process by which
data entry and verification shall be performed for all paper Case Report Form
(CRF) data collection studies performed by CDM.
SOP Name: Medical Coding
Purpose:
• This Standard Operating Procedure (SOP) describes the procedure for medical
coding and review of coded terms in Clinical Data Management (CDM).
SOP Name: External Data Handling
Purpose:
• In most clinical trials, there are external data which are not recorded in the CRF
or entered in the database screen but are submitted to the CDM Department by
external vendors in defined formats, at a defined frequency.
• As per project specific conventions, CDM receives such data either from a central
or local vendor. Generally, such vendors provide electronic transfer of
computerized data to CDM.
• External data generally cover but are not limited to the following:
  • Laboratory Data
  • Pharmacokinetics (PK) / Pharmacodynamics (PD) Data
  • Device Data (Electrocardiograms, Vital Signs, Images etc.)
  • Electronic Patient Diaries / Electronic patient reported outcome (ePRO)
SOP Name: SAE Reconciliation
Purpose:
• The objective of this Standard Operating Procedure (SOP) is to explain how SAE
reconciliation will be performed in Clinical Data Management (CDM).
• The SAE Reconciliation Plan is prepared, reviewed and approved by the Data
Manager in consultation with the Sponsor, PV team and Medical Monitor.
• The SAE Reconciliation Plan may include, but is not limited to, the following
details:
✓ The frequency of SAE reconciliation
✓ Details of safety database holder
✓ The format that the data will be supplied in
✓ Variables to be reconciled
✓ Type of match
✓ The cut-off time point, after which no SAEs will be added to the Clinical
Database, even if the Safety Database is updated
✓ Any special requirements for SAE reconciliation
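The comparison at the heart of SAE reconciliation can be sketched as matching records from the clinical database against the safety database on agreed key fields, then comparing the variables to be reconciled. This sketch assumes an exact match on the key; the record layout, the field names and the reconcile_sae helper are hypothetical, and the actual variables and type of match come from the SAE Reconciliation Plan.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, str]
Key = Tuple[str, ...]


def reconcile_sae(
    clinical: List[Record],
    safety: List[Record],
    key_fields: Sequence[str],
    compare_fields: Sequence[str],
) -> Tuple[List[Key], List[Key], List[Tuple[Key, str, str, str]]]:
    """Return (keys only in clinical, keys only in safety, field-level mismatches)."""

    def index(rows: List[Record]) -> Dict[Key, Record]:
        # Index each record by its key fields for exact matching.
        return {tuple(row[k] for k in key_fields): row for row in rows}

    clinical_by_key = index(clinical)
    safety_by_key = index(safety)

    only_clinical = sorted(set(clinical_by_key) - set(safety_by_key))
    only_safety = sorted(set(safety_by_key) - set(clinical_by_key))

    mismatches = []
    for key in sorted(set(clinical_by_key) & set(safety_by_key)):
        for name in compare_fields:
            clinical_value = clinical_by_key[key].get(name, "")
            safety_value = safety_by_key[key].get(name, "")
            if clinical_value != safety_value:
                mismatches.append((key, name, clinical_value, safety_value))
    return only_clinical, only_safety, mismatches
```

Events found in only one database and field-level mismatches are reported separately, since they typically lead to different follow-up actions.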
SOP Name: Database Freeze Lock and Unlock
Purpose:
• The objective of this Standard Operating Procedure (SOP) is to explain how the
procedure of Database Freeze, Lock and Unlock shall be performed upon
completion of all project-specific activities in Clinical Data Management (CDM).
SOP Name: Data Sharing
Purpose:
• The aim of this SOP is to describe the process of data sharing and situations
when the data can be shared outside CDM, and how data transfer will be
authorised and performed.
SOP Name: Quality Control
Purpose:
• This Standard Operating Procedure (SOP) describes the process for Quality
Control (QC) to be performed by Clinical Data Management (CDM) personnel at
various stages of the CDM project lifecycle.
• Quality Control is performed to ensure data integrity and quality of the final
deliverables by CDM. The detailed QC process is described in the Data
Management Plan (DMP) or an independent document such as the Study Specific
Procedure (SSP) document.
SOP Name: Change Control in Clinical Data Management
Purpose:
• The aim of this SOP is to describe the process for change control within Clinical
Data Management (CDM).
• Change control is the process by which all the changes related to CDM are
documented, processed, tracked and controlled.
• The CDM procedures involve possibilities to implement changes of various types
that include but are not limited to:
✓ Database change driven by a project specific decision or due to change in the
study documents like Database Design Document (DBDD), Edit Check
Specification (ECS)
✓ Coding dictionary version change
✓ Software change
✓ Process change
✓ Non-significant administrative change