Domain 7 - Security Operations

This document provides an overview of Domain 7: Security Operations from the CISSP exam. It discusses key aspects of security operations including administrative security controls like least privilege, separation of duties, and job rotation. It also covers digital forensics, incident response management, and business continuity planning and disaster recovery. Ensuring proper access controls, monitoring systems, and maintaining operations are important parts of security operations to keep networks, systems, and environments secure on a daily basis.

Uploaded by

Ngoc Do

Domain 7:

Security Operations
Domain 7: Overview
• Involves the application of information security concepts and best
practices to the operation of enterprise computing systems.
• Covers the tasks and situations that information security professionals
are expected to perform or are presented with on a daily basis.
Domain 7: Overview
• Security operations pertains to everything that takes place to keep
networks, computer systems, applications, and environments up and
running in a secure and protected manner.
• It consists of ensuring that people, applications, and servers have the
proper access privileges to only the resources to which they are entitled
and that oversight is implemented via monitoring, auditing, and
reporting controls.
• Operations take place after the network is developed and implemented.
This includes the continual maintenance of an environment and the
activities that should take place on a day-to-day or week-to-week basis.
These activities are routine in nature and enable the network and
individual computer systems to continue running correctly and securely.
Domain 7: Security Operations
• Administrative Security
• Forensics
• Incident Response Management
• Operational Preventive and Detective Controls
• Asset Management
• Continuity of Operations
• BCP and DRP Overview and Process
• Developing a BCP/DRP
• Backups and Availability
• DRP Testing, Training and Awareness
• Continued BCP/DRP Maintenance
• Specific BCP/DRP Frameworks
Domain 7: Security Operations
Unique Terms and Definitions
• Business Continuity Plan (BCP)—a long-term plan to ensure the
continuity of business operations
• Collusion—An agreement between two or more individuals to subvert
the security of a system
• Continuity of Operations Plan (COOP)—a plan to maintain operations
during a disaster.
• Disaster—any disruptive event that interrupts normal system
operations
• Disaster Recovery Plan (DRP)—a short-term plan to recover from a
disruptive event
Domain 7: Security Operations
Unique Terms and Definitions
• Mean Time Between Failures (MTBF)—quantifies how long a new or
repaired system will run on average before failing
• Mean Time to Repair (MTTR)—describes how long it will take to
recover a failed system
• Mirroring—Complete duplication of data to another disk, used by some
levels of RAID.
• Redundant Array of Inexpensive Disks (RAID)—A method of using
multiple disk drives to achieve greater data reliability, greater speed, or
both
• Striping—Spreading data writes across multiple disks to achieve
performance gains, used by some levels of RAID
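As a worked illustration of how MTBF and MTTR fit together, the sketch below uses a common reliability rule of thumb that estimates steady-state availability as MTBF / (MTBF + MTTR). The formula is not stated on the slides, and the function name and sample figures are hypothetical.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Estimate steady-state availability as MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must be non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical system: fails every 990 hours on average, 10 hours to repair.
    print(f"Estimated availability: {availability(990, 10):.2%}")
```

A longer MTBF or a shorter MTTR both push the estimate toward 100%.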
Domain 7: Security Operations
Administrative security
• Administrative Security provides the means to control people's
operational access to data
Least Privilege or Minimum Necessary Access
• Dictates that persons have no more than the access that is strictly
required for the performance of their duties
• May also be referred to as the principle of minimum necessary
access
• Discretionary Access Control (DAC) – most often applicable
Domain 7: Security Operations
Need to know
• Mandatory Access Control (MAC)
• Access determination is based upon clearance levels of subjects
and classification levels of objects
• An extension to the principle of least privilege in MAC
environments is the concept of compartmentalization:
• A method for enforcing need to know that goes beyond reliance upon
clearance level: holding the clearance is not enough, and the subject must
also actually require access to the information.
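The combination of clearance and need to know described above can be sketched as a two-part check. This is a minimal illustration only: the level names, their ordering, and the `Subject`/`Resource` types are assumptions, not part of any real MAC implementation.

```python
from dataclasses import dataclass, field

# Hypothetical classification levels, ordered lowest to highest.
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}


@dataclass
class Subject:
    name: str
    clearance: str
    compartments: set = field(default_factory=set)  # need-to-know grants


@dataclass
class Resource:
    name: str
    classification: str
    compartment: str


def can_read(subject: Subject, resource: Resource) -> bool:
    """Allow access only if clearance is sufficient AND need to know exists."""
    cleared = LEVELS[subject.clearance] >= LEVELS[resource.classification]
    need_to_know = resource.compartment in subject.compartments
    return cleared and need_to_know
```

A subject with a "secret" clearance but no grant for a resource's compartment is still denied, which is the point of compartmentalization.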
Domain 7: Security Operations
Separation of Duties
• Prescribes that multiple people are required to complete critical
or sensitive transactions
• Goal of separation of duties is to ensure that, in order for
someone to abuse their access to sensitive data or transactions,
they must convince another party to act in concert
• Collusion is the term used for the two parties conspiring to undermine the security
of the transaction
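One way to picture separation of duties is a transaction that cannot run until a second, different person approves it. The sketch below is illustrative only; the `Transaction` class and its method names are hypothetical.

```python
class Transaction:
    """A sensitive transaction that requires approval from a second person."""

    def __init__(self, initiator: str, description: str):
        self.initiator = initiator
        self.description = description
        self.approver = None

    def approve(self, approver: str) -> None:
        # The person who initiated the transaction may not approve it.
        if approver == self.initiator:
            raise PermissionError("initiator cannot approve their own transaction")
        self.approver = approver

    def execute(self) -> str:
        # Refuse to run until someone other than the initiator has approved.
        if self.approver is None:
            raise PermissionError("transaction needs approval from a second person")
        return f"executed: {self.description}"
```

Abusing this control would require the initiator and the approver to collude.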
Domain 7: Security Operations
Rotation of Duties/Job Rotation
• Also known as job rotation or rotation of responsibilities
• Provides a means to help mitigate the risk associated with any one individual having
too many privileges
• Requires that critical functions or responsibilities are not continuously performed by
the same single person without interruption
• “hit by a bus” or “win the lottery” scenario

Exam Warning: Though job or responsibility rotation is an important control, this, like
many other controls, is often compared against the cost of implementing the control.
Many organizations will opt for not implementing rotation of duties because of the cost
associated with implementation. For the exam, be certain to appreciate that cost is
always a consideration, and can trump the implementation of some controls.
Domain 7: Security Operations
Mandatory Leave/Forced Vacation
• Also known as forced vacation
• Can identify areas where depth of coverage is lacking
• Can also help discover fraudulent or suspicious behavior
• Knowledge that mandatory leave is a possibility might deter some
individuals from engaging in the fraudulent behavior in the first
place
Domain 7: Security Operations
Non-Disclosure Agreement
• A work-related contractual agreement that ensures that, prior to
being given access to sensitive information or data, an individual
or organization appreciates their legal responsibility to maintain
the confidentiality of sensitive information.
• Often signed by job candidates before they are hired, as well as
consultants or contractors
• Largely a directive control
Domain 7: Security Operations
Background Checks
• Also known as background investigations or preemployment screening
• Majority of background investigations are performed as part of a
preemployment screening process
• The sensitivity of the position being filled or data to which the individual will
have access strongly determines the degree to which this information is
scrutinized and the depth to which the investigation will report
• Ongoing, or postemployment, investigations seek to determine whether the
individual continues to be worthy of the trust required of their position
• Background checks performed in advance of employment serve as a
preventive control while ongoing repeat background checks constitute a
detective control and possibly a deterrent.
Domain 7: Security Operations
Privilege Monitoring
• Heightened privileges require both greater scrutiny and more
thoughtful controls
• Some of the job functions that warrant greater scrutiny include:
account creation/modification/deletion, system reboots, data
backup, data restoration, source code access, audit log access,
security configuration capabilities, etc.
Domain 7: Security Operations
Digital Forensics
• Provides a formal approach to dealing with investigations and evidence
with special consideration of the legal aspects of the process
• Forensics is closely related to incident response
• Main distinction between forensics and incident response is that forensics is
evidence-centric and typically more closely associated with crimes, while incident
response is more dedicated to identifying, containing, and recovering from security
incidents
• The forensic process must preserve the “crime scene” and the
evidence in order to prevent unintentionally violating the integrity of
either the data or the data's environment
Domain 7: Security Operations
Digital Forensics
• Prevent unintentional modification of the system
• Antiforensics makes forensic investigation difficult or impossible
• One method is malware that is entirely memory-resident, and not installed on the disk drive. If an
investigator removes power from a system with entirely memory-resident malware, all volatile
memory including RAM is lost, and evidence is destroyed.
• Valuable data is gathered during the live forensic capture
• The main source of forensic data typically comes from binary images of secondary
storage and portable storage devices such as hard disk drives, USB flash drives, CDs,
DVDs, and possibly associated cellular phones and mp3 players
• A binary or bit stream image is used because an exact replica of the original data is
needed
• Normal backup software will only capture the active partitions of a disk, and only
that data which is marked as allocated
Domain 7: Security Operations
Digital Forensics
The four types of data that exist:
• Allocated space—portions of a disk partition which are marked as
actively containing data.
• Unallocated space—portions of a disk partition that do not contain
active data. This includes space that has never been allocated, and
previously allocated space that has been marked unallocated. If a
file is deleted, the portions of the disk that held the deleted file are
marked as unallocated and available for use.
Domain 7: Security Operations
Digital Forensics
The four types of data that exist:
• Slack space—data is stored in specific size chunks known as clusters. A cluster
is the minimum size that can be allocated by a file system. If a particular file,
or final portion of a file, does not require the use of the entire cluster then
some extra space will exist within the cluster. This leftover space is known as
slack space: it may contain old data, or can be used intentionally by attackers
to hide information.
• “Bad” blocks/clusters/sectors—hard disks routinely end up with sectors that
cannot be read due to some physical defect. The sectors marked as bad will
be ignored by the operating system since no data could be read in those
defective portions. Attackers could intentionally mark sectors or clusters as
being bad in order to hide data within this portion of the disk.
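The slack space arithmetic described above can be sketched as follows. The 4096-byte default cluster size is an assumption chosen for illustration; real file systems vary.

```python
def slack_bytes(file_size: int, cluster_size: int = 4096) -> int:
    """Return the unused bytes left in the last cluster allocated to a file."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    remainder = file_size % cluster_size
    # A file that exactly fills its clusters leaves no slack.
    return 0 if remainder == 0 else cluster_size - remainder
```

For example, a 10,000-byte file on 4096-byte clusters occupies three clusters and leaves 2,288 bytes of slack in the last one.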
Domain 7: Security Operations
Digital Forensics
• Numerous tools can be used to create the binary backup, including free
tools such as dd and windd as well as commercial tools such as Ghost (when
run with specific nondefault switches enabled), AccessData's FTK, or
Guidance Software's EnCase.
• The general phases of the forensic process are:
• the identification of potential evidence;
• the acquisition of that evidence;
• analysis of the evidence;
• production of a report
• Hashing algorithms are used to verify the integrity of binary images
• When possible, the original media should not be used for analysis
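Verifying a binary image with a hash, as noted above, can be sketched with Python's standard `hashlib` module. The function names, the choice of SHA-256, and the chunk size are assumptions for illustration; an actual investigation would follow the procedures of its forensic tooling.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large images do not need to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as image:
        for chunk in iter(lambda: image.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def images_match(original_path: str, copy_path: str, algorithm: str = "sha256") -> bool:
    """Return True when the working copy hashes to the same value as the original."""
    return file_digest(original_path, algorithm) == file_digest(copy_path, algorithm)
```

If the two digests differ, the working copy is not an exact replica and should not be relied on for analysis.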
Domain 7: Security Operations
Live Forensics
• Forensics investigators have traditionally removed power from a
system, but the typical approach now is to gather volatile data.
Acquiring volatile data is called live forensics.
• The need for live forensics has grown tremendously due to non-
persistent tools that don’t write anything to disk
• One example from Metasploit…
Domain 7: Security Operations
Live Forensics - Metasploit
• Popular free and open source exploitation framework
• Metasploit framework allows for the modularization of the underlying
components of an attack, which allows for exploit developers to focus
on their core competency without having to expend energy on
distribution or even developing a delivery, targeting, and payload
mechanism for their exploit
• Provides reusable components to limit extra work
• A payload is what Metasploit does after successfully exploiting a target
Domain 7: Security Operations
Live Forensics – Metasploit & Meterpreter
• Meterpreter is one of the most powerful Metasploit payloads
• Can allow the password hashes of a compromised computer to be dumped to an
attacker's machine
• The password hashes can then be fed into a password cracker
• Or the password hashes might be usable directly in Metasploit's
PSExec exploit module, which is an implementation of functionality provided by
Sysinternals' (now owned by Microsoft) PSExec, but bolstered to support Pass the
Hash functionality.

Information on Microsoft's PSExec can be found at
http://technet.microsoft.com/en-us/sysinternals/bb897553.aspx. Further details on
Pass the Hash techniques can be found at
http://oss.coresecurity.com/projects/pshtoolkit.htm
Domain 7: Security Operations
Live Forensics – Metasploit & Meterpreter
• Dumping password hashes with Meterpreter.
• In addition to dumping password hashes, Meterpreter provides features such
as:
• command execution on the remote system
• uploading or downloading of files
• screen capture
• keystroke logging
• disabling the firewall
• disabling antivirus
• registry viewing and modification
• Meterpreter's capabilities are updated regularly
Domain 7: Security Operations
Live Forensics – Metasploit & Meterpreter
• Dumping the registry with Meterpreter.
• Meterpreter was designed with detection evasion in mind
• Meterpreter can provide almost all of the functionalities listed above
without creating a new file on the victim system
• Runs entirely within the context of the exploited victim process, and all
information is stored in physical memory rather than on the hard disk.
Domain 7: Security Operations
Live Forensics – Metasploit & Meterpreter
• If the forensic investigator removed the power supply from the
compromised machine, destroying volatile memory, there would be
little to no information for the investigator to analyze
Domain 7: Security Operations
Electronic Discovery (eDiscovery)
• Legal counsel gains access to pertinent electronic information during
the pre-trial discovery phase of civil legal proceedings
• Seeks ESI, or electronically stored information
• ESI does not need to be conveniently accessible or transferable
• Data Retention Policy (IMPORTANT)
• Legal/Regulatory reasons?
• Business reasons?
Domain 7: Security Operations
Incident Response Management
• Every organization faces information security incidents
• Regimented and tested methodology for identifying and responding to
incidents is critical
• Computer Security Incident Response Team (CSIRT) is a term used for
the group that is tasked with monitoring, identifying, and responding
to security incidents
• Overall goal of the incident response plan is to allow the organization
to control the cost and damage associated with incidents, and to make
the recovery of impacted systems quicker
Domain 7: Security Operations
Incident Response Management – Methodology
Different books and organizations may use different terms and phases associated
with incident response; this section will mirror the terms associated with the
examination.
Step 1 - Detection (what I can’t prevent, can I detect?)
• Events are analyzed in order to determine whether these events might
comprise a security incident
• Emphasis on detective controls
Domain 7: Security Operations
Incident Response Management – Methodology
Step 2 - Containment (OK I’ve detected it, now what?)
• The point at which the incident response team attempts to keep
further damage from occurring
• Might include taking a system off the network, isolating traffic,
powering off the system, or other items to control both the scope and
severity of the incident
• Typically where a binary (bit by bit) forensic backup is made of systems
involved in the incident
Domain 7: Security Operations
Incident Response Management – Methodology
Step 3 - Eradication
• Involves the process of understanding the cause of the incident so that
the system can be reliably cleaned and ultimately restored to
operational status later in the recovery phase
• The cause of the incident must be determined BEFORE recovery
• Root cause analysis is key
Domain 7: Security Operations
Incident Response Management – Methodology
Step 4 - Recovery
• Involves restoring the system or systems to operational status
• Typically, the business unit responsible for the system will dictate when
the system will go back online
• Close monitoring of the system after it is returned to production is
necessary
Domain 7: Security Operations
Incident Response Management – Methodology
Step 5 - Reporting
• Most likely to be neglected in immature incident response programs
• If done right, this phase has the greatest potential to effect a positive
change in security posture
• Goal is to provide a final report on the incident, which will be delivered
to management
Domain 7: Security Operations
Incident Response Management – Methodology
• NIST Special Publication 800-61r2: Computer Security Incident
Handling Guide (see:
http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-61r2.pdf)
• 4 Step Lifecycle
• Preparation
• Detection & Analysis
• Containment, Eradication, and Recovery
• Post-incident Activity
Domain 7: Security Operations
Incident Response Management – Methodology
• Exam lists a 7-step lifecycle; book calls for 8-step (adding “Preparation”):
• 1. Preparation
• 2. Detection (aka Identification)
• 3. Response (aka Containment)
• 4. Mitigation (aka Eradication)
• 5. Reporting
• 6. Recovery
• 7. Remediation
• 8. Lessons Learned (aka Post-incident Activity, Post Mortem, or Reporting)
Domain 7: Security Operations
Incident Response Management – Methodology
1. Preparation
• training, writing incident response policies and procedures, providing tools
such as laptops with sniffing software, crossover cables, original OS media,
removable drives, etc.
• Everything that you do to prepare for an incident
• Policy and procedures
• Incident handling checklist and other forms for tracking
• Classification
• Impact
Domain 7: Security Operations
Incident Response Management – Methodology
2. Detection (aka Identification)
• What are all of the inputs into my incident response process?
• Events → Incidents
3. Response (aka Containment)
• Step-by-step, depending upon classification & severity
• Forensic response? Protection of evidence, while containing damage
• Start root cause analysis
Domain 7: Security Operations
Incident Response Management – Methodology
4. Mitigation (aka Eradication)
• Root cause analysis completed (mostly/hopefully)
• Get rid of the bad things
5. Reporting
• Actually not really a step (happens throughout)
• More formal here; include incident responders (technical and non-technical)
Domain 7: Security Operations
Incident Response Management – Methodology
6. Recovery
• Restore systems and operations
• Increase monitoring
7. Remediation – broader in context
8. Lessons Learned (aka Post-incident Activity, Post Mortem, or
Reporting) – there are always lessons
Domain 7: Security Operations
Operational Preventive And Detective Controls
• Intrusion Detection Systems (IDS) and Intrusion Prevention Systems
(IPS)
• True Positive: Conficker worm is spreading on a trusted network, and NIDS alerts
• True Negative: User surfs the Web to an allowed site, and NIDS is silent
• False Positive: User surfs the Web to an allowed site, and NIDS alerts
• False Negative: Conficker worm is spreading on a trusted network, and NIDS is
silent
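The four outcomes above depend on two yes/no facts: whether the traffic was actually malicious and whether the NIDS alerted. A minimal sketch, with a hypothetical function name:

```python
def classify_event(malicious: bool, alerted: bool) -> str:
    """Label an IDS outcome from ground truth (malicious) and IDS behavior (alerted)."""
    if malicious and alerted:
        return "true positive"
    if not malicious and not alerted:
        return "true negative"
    if not malicious and alerted:
        return "false positive"
    return "false negative"  # malicious traffic that the IDS missed
```

Using the slide's examples, a spreading worm with no alert is `classify_event(True, False)`, a false negative.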
Domain 7: Security Operations
Operational Preventive And Detective Controls
• NIDS, NIPS, HIDS, and HIPS (detection types)
• Pattern Matching
• Protocol Behavior
• Anomaly Detection
• Security Information and Event Management
• Continuous Monitoring
• Data Loss Prevention (network & host)
Domain 7: Security Operations
Operational Preventive And Detective Controls
Endpoint Security
• HIDS/HIPS
• Antivirus
• Application Whitelisting
• Removable Media Controls
• Disk Encryption
• Privileged Access
Domain 7: Security Operations
Asset Management (Configuration Management)
The goal is to move beyond the default system configuration to one that
is both hardened and meets the operational requirements of the
organization.
• Hardened baseline configurations
• Center for Internet Security (see: http://www.cisecurity.org/)
• Disabling unnecessary services, removing extraneous programs,
enabling security capabilities such as firewalls, antivirus, and intrusion
detection or prevention systems, and the configuration of security and
audit logs
Domain 7: Security Operations
Asset Management (Configuration Management)
Baselining
• The process of capturing a point in time understanding of the current
system security configuration
• Helpful in responding to a potential security incident
• Continual baselining is important
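To show why a captured baseline helps during incident response, the sketch below compares a current configuration snapshot against the baseline and reports drift. Representing a configuration as a flat dictionary of setting names is a simplifying assumption, and the function name is hypothetical.

```python
def baseline_drift(baseline: dict, current: dict) -> dict:
    """Report settings added, removed, or changed relative to the baseline."""
    added = {k: current[k] for k in current.keys() - baseline.keys()}
    removed = {k: baseline[k] for k in baseline.keys() - current.keys()}
    changed = {
        k: (baseline[k], current[k])
        for k in baseline.keys() & current.keys()
        if baseline[k] != current[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

An unexpected entry under "added" or "changed", such as a newly enabled service, is a natural starting point for investigation.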
Domain 7: Security Operations
Asset Management (Configuration Management)
Patch Management
• The process of managing software updates
• All software has flaws that are not fully addressed in advance of being
released
• Software vendors announce patches both publicly and directly to their
customers
• Once notified of a patch, organizations need to evaluate the patch from
a risk management perspective to determine how aggressively the
patch will need to be deployed.
Domain 7: Security Operations
Asset Management (Configuration Management)
Vulnerability Management
• Vulnerability scanning is a way to discover poor configurations and
missing patches in an environment
• Vulnerability management is used rather than just vulnerability
scanning to emphasize the need for management of the vulnerability
information
• Prioritization and remediation of the vulnerabilities
Domain 7: Security Operations
Asset Management (Configuration Management)
Vulnerability Management
Section 12.6 of the ISO/IEC 27002:2013 provides guidance on technical vulnerability management. A
vulnerability management process should be implemented in an effective, systematic, and repeatable way
with measurements taken to confirm its effectiveness. Vulnerability management starts with asset
management: the information required to support systems technically includes operating system
software, version numbers, lists of software installed, and the person or persons responsible for
maintaining the systems. Additionally, the organization should define and establish the roles and
responsibilities associated with technical vulnerability management, including vulnerability monitoring,
vulnerability risk assessment, patching, asset tracking, and any required coordination responsibilities.
Domain 7: Security Operations
Asset Management (Configuration Management)
Vulnerability Management
Once a potential technical vulnerability has been identified, the organization should identify the
associated risks and the actions to be taken - such action could involve the patching of vulnerable systems
and/or applying other controls. Depending on how urgently a technical vulnerability needs to be
addressed, the action taken should be carried out according to the controls related to change
management or by following information security incident response procedures. Critical-risk and high-risk
systems should be addressed first. Patches should be tested and evaluated before they are installed to
ensure they are effective and do not result in side effects that cannot be tolerated; if no patch is available,
other controls should be considered. The technical vulnerability management process should be regularly
monitored and evaluated in order to ensure its effectiveness and efficiency.
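The guidance that critical-risk and high-risk systems should be addressed first can be sketched as a simple sort over scan findings. The four risk labels, their ordering, and the shape of a finding are assumptions made for illustration.

```python
# Hypothetical risk labels, most urgent first.
RISK_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def prioritize(findings: list) -> list:
    """Return findings ordered so the highest-risk items are remediated first."""
    # sorted() is stable, so findings with equal risk keep their original order.
    return sorted(findings, key=lambda finding: RISK_ORDER[finding["risk"]])


if __name__ == "__main__":
    scan = [
        {"host": "web01", "risk": "low"},
        {"host": "db01", "risk": "critical"},
        {"host": "app01", "risk": "high"},
    ]
    for finding in prioritize(scan):
        print(finding["host"], finding["risk"])
```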
Domain 7: Security Operations
Asset Management (Configuration Management)
Zero-Day Vulnerabilities and Zero-Day Exploits
• The average window of time between a patch being released and an
associated exploit being made public is decreasing
• Recent research even suggests that for some vulnerabilities, an exploit can be
created within minutes based simply on the availability of the unpatched and
patched program
• The term for a vulnerability being known before the existence of a patch (or
workaround) is zero-day vulnerability.
• A zero-day exploit, rather than vulnerability, refers to the existence of exploit
code for a vulnerability which has yet to be patched
Domain 7: Security Operations
Change Management
• A system that does not change will become less secure over time
• Not an exact science; every organization will be a little different
• The general flow of the change management process includes:
• Identifying a change
• Proposing a change
• Assessing the risk associated with the change
• Testing the change (backout plan)
• Scheduling the change
• Notifying impacted parties of the change
• Implementing the change
• Reporting results of the change implementation
• Changes must be closely tracked and auditable
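The general flow above, together with the requirement that changes be tracked and auditable, can be sketched as a change request that only accepts its steps in order and records each one. The step labels paraphrase the slide, and the `ChangeRequest` class is hypothetical.

```python
# Step labels paraphrasing the general change management flow.
STEPS = [
    "identified", "proposed", "risk assessed", "tested",
    "scheduled", "notified", "implemented", "reported",
]


class ChangeRequest:
    """Tracks a change through the flow and keeps an audit trail of steps."""

    def __init__(self, title: str):
        self.title = title
        self.history = []  # completed steps, in order

    def complete(self, step: str) -> None:
        done = len(self.history)
        expected = STEPS[done] if done < len(STEPS) else None
        # Reject skipped, repeated, or out-of-order steps.
        if step != expected:
            raise ValueError(f"expected step {expected!r}, got {step!r}")
        self.history.append(step)

    @property
    def closed(self) -> bool:
        return len(self.history) == len(STEPS)
```

Because each organization's process differs, the step list would be adapted rather than treated as fixed.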
Domain 7: Security Operations
Continuity of Operations
Service Level Agreements (SLA)
• Critical where organizations have external entities perform critical services or
host significant assets and applications
• Goal is to stipulate all expectations regarding the behavior of the department
or organization that is responsible for providing services and the quality of
the services provided
• Availability is usually the most critical security consideration of a service level
agreement
• Organizations must negotiate all security terms of a service level agreement
prior to engaging with the company
• Cloud computing
Domain 7: Security Operations
Fault Tolerance
Backup
• Recoverability in the event of a failure
• Magnetic tape media is old technology, but is still the most common
repository of backup data
• Three basic types of backups exist: full, incremental, and
differential
Domain 7: Security Operations
Fault Tolerance
Backup
• Full backup - a replica of all allocated data on a hard disk
• The most costly in terms of media and time to backup
• Often coupled with either incremental or differential backups to balance the time
and media considerations
Domain 7: Security Operations
Fault Tolerance
Backup
• Incremental backup - only archive files that have changed since the last
backup of any kind was performed
• The most recent full backup and each and every incremental backup since the full
backup is required to initiate a recovery
• Time to perform each incremental backup is extremely short; however, the
downside is that a full restore can require many tapes, especially if full backups are
performed less frequently
• The odds of a failed restoration due to a tape integrity issue (such as broken tape)
rise with each additional tape required
Domain 7: Security Operations
Fault Tolerance
Backup
• Differential - will back up any files that have been changed since the
last full backup
• Only the most recent full backup and most recent differential backup are required
to initiate a full recovery
• As more time passes since the last full backup the length of time to perform a
differential backup will also increase
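The difference in what each scheme needs at restore time can be sketched as follows. The sketch assumes a schedule that pairs full backups with either incrementals or differentials, not both; the function name and tuple layout are hypothetical.

```python
def restore_set(backups: list) -> list:
    """Return the backups needed for a full restore.

    `backups` is a chronological list of (label, kind) tuples, where kind is
    "full", "incremental", or "differential". Raises ValueError if there is
    no full backup or the schedule mixes schemes.
    """
    full_indexes = [i for i, (_, kind) in enumerate(backups) if kind == "full"]
    if not full_indexes:
        raise ValueError("no full backup available")
    last_full = full_indexes[-1]
    later = backups[last_full + 1:]
    kinds = {kind for _, kind in later}
    if kinds <= {"incremental"}:
        # Every incremental since the last full backup is required.
        return [backups[last_full]] + later
    if kinds <= {"differential"}:
        # Only the most recent differential is required.
        return [backups[last_full]] + later[-1:]
    raise ValueError("mixed incremental/differential schedules are not handled")
```

With a Sunday full backup and daily incrementals, a Wednesday restore needs four backups; with daily differentials it needs two.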
Domain 7: Security Operations
Fault Tolerance
Redundant Array of Inexpensive Disks (RAID)
• Mitigates the risk associated with hard disk failures
Domain 7: Security Operations
Fault Tolerance - Redundant Array of Inexpensive Disks (RAID)
Three terms that are important to understand with respect to RAID are: mirroring;
striping; and parity
• Mirroring - used to achieve full data redundancy by writing the same data to multiple
hard disks
• Write times are slower
• Read times are faster
• Most costly in terms of disk usage - at least half of the drives are used for redundancy
• Striping - increases the read and write performance by spreading data across
multiple hard disks
• Reads and writes can be performed in parallel across multiple disks rather than serially on one disk
• Parallelization provides a performance increase, and does not aid in data redundancy
• Parity - achieves data redundancy without incurring the same degree of cost as that of
mirroring in terms of disk usage and write performance
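The slides do not say how parity is computed. XOR is the usual choice and is assumed in the sketch below, which shows how one lost block can be rebuilt from the surviving blocks plus parity. The function name and sample data are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    # Three data blocks, as if striped across three disks.
    data = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(data)

    # Simulate losing the second disk and rebuild it from the rest plus parity.
    rebuilt = xor_blocks([data[0], data[2], parity])
    print(rebuilt == data[1])
```

Only one extra block is stored regardless of how many data blocks there are, which is why parity costs less disk space than mirroring.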
Domain 7: Security Operations
Fault Tolerance - Redundant Array of Inexpensive
Disks (RAID)
RAID 0: Striped Set
• Striping to increase the performance of reads and
writes
• No data redundancy - poor choice if recovery of
data is the reason for leveraging RAID
Domain 7: Security Operations
Fault Tolerance - Redundant Array of Inexpensive
Disks (RAID)
RAID 1: Mirrored Set
• Creates/writes an exact duplicate of all data to
an additional disk
• Write performance is decreased
• Read performance can increase
• Highest disk cost
Domain 7: Security Operations
Fault Tolerance - Redundant Array of Inexpensive Disks (RAID)
RAID 2: Hamming Code
• Not considered commercially viable for hard disks and is not used
• Requires either 14 or 39 hard disks and a specially designed hardware
controller
• Cost prohibitive
• RAID 2 is not likely to be tested
Domain 7: Security Operations
Fault Tolerance - Redundant Array of Inexpensive Disks (RAID)
RAID 3: Striped Set with Dedicated Parity (byte level)
• Data, at the byte level, is striped across multiple disks
• An additional disk is leveraged for storage of parity information, which
is used for recovery in the event of a failure
RAID 4: Striped Set with Dedicated Parity (block level)
• Exact same configuration and functionality as that of RAID 3, but
stripes data at the block, rather than byte, level
• Employs a dedicated parity drive rather than having parity data
distributed amongst all disks, as in RAID 5
Domain 7: Security Operations
Fault Tolerance - Redundant Array of Inexpensive Disks
(RAID)
RAID 5: Striped Set with Distributed Parity
• One of the most popular RAID configurations
• Striped Set with Distributed Parity
• Leverages a block level striping
• Writes parity information that is used for recovery purposes
• Distributes the parity information across multiple disks
• Disk cost for redundancy is lower than that of a Mirrored
set
• Support for both hardware and software based
implementations
• Allows for data recovery in the event that any one disk fails
Domain 7: Security Operations
Fault Tolerance - Redundant Array of Inexpensive Disks (RAID)
RAID 6: Striped Set with Dual Distributed Parity
• Can allow for the failure of two drives and still function
• Redundancy is achieved by writing two independent sets of parity information, distributed across the disks
RAID 1+0 or RAID 10
• Example of what is known as nested RAID or multi-RAID (one standard RAID level is
encapsulated within another)
• Configuration is a striped set of mirrors

NOTE: There are many and varied RAID configurations which are simply combinations of the standard RAID
levels. Nested RAID solutions are becoming increasingly common with larger arrays of disks that require a
high degree of both reliability and speed. Some common nested RAID levels include RAID 0+1, 1+0, 5+0, 6+0,
and (1+0)+0, which are also commonly written as RAID 01, 10, 50, 60, and 100, respectively.
Domain 7: Security Operations
Fault Tolerance - System Redundancy
Redundant Hardware
• Built-in redundancy (power supplies, disk controllers, and NICs are most
common)
• An inventory of spare modules to service the entire datacenter's servers
would be less expensive than having all servers configured with an installed
redundant power supply
Redundant Systems
• Entire systems available in inventory to serve as a means to recover
• Have an SLA with hardware manufacturers to be able to procure
replacement equipment quickly
Domain 7: Security Operations
BCP and DRP Overview and Process (formerly a domain by itself)
Unique terms and definitions
• Business Continuity Plan (BCP)—a long-term plan to ensure the continuity of
business operations
• Continuity of Operations Plan (COOP)—a plan to maintain operations during a
disaster.
• Disaster—any disruptive event that interrupts normal system operations
• Disaster Recovery Plan (DRP)—a short-term plan to recover from a disruptive event
• Mean Time Between Failures (MTBF)—quantifies how long a new or repaired system
will run on average before failing
• Mean Time to Repair (MTTR)—describes how long it will take to recover a failed
system.
Domain 7: Security Operations
BCP and DRP Overview and Process
Business Continuity Planning and Disaster Recovery Planning are two very distinct
disciplines
Business Continuity Planning (BCP)
• Goal of a BCP is to ensure that the business will continue to operate
before, throughout, and after a disaster event
• Focus of a BCP is on the business as a whole
• Business Continuity Planning provides a long-term strategy
• Takes into account items such as people, vital records, and processes in
addition to critical systems
Domain 7: Security Operations
BCP and DRP Overview and Process
Business Continuity Planning and Disaster Recovery Planning are two very
distinct disciplines
Disaster Recovery Planning (DRP)
• Disaster Recovery Plan is more tactical in its approach
• Short-term plan for dealing with specific IT-oriented disruptions
• Provides a means for immediate response to disasters
• Does not focus on long-term business impact
Domain 7: Security Operations
BCP and DRP Overview and Process
Business Continuity Planning and Disaster Recovery Planning are two very distinct
disciplines
Relationship between BCP and DRP
• Business Continuity Plan is an umbrella plan that includes multiple specific
plans, most importantly the Disaster Recovery Plan
• Two plans, which have different scopes, are intertwined
• Disaster Recovery Plan serves as a subset of the overall Business Continuity
Plan
• NIST Special Publication 800-34 provides a visual means for understanding
the interrelatedness of a BCP and a DRP, as well as the Continuity of Operations
Plan (COOP), Occupant Emergency Plan (OEP), and others.
Domain 7: Security Operations
Disasters or Disruptive Events
Classifications of disasters
• Three common ways of categorizing the causes of disasters are by whether the threat agent is
natural, human, or environmental in nature
• Natural: the most obvious threats that can result in a disaster are naturally occurring. This category includes
such threats as earthquakes, hurricanes, tornadoes, floods, and some types of fires (closely related to geographical
location)
• Human: the human category of threats represents the most common source of disasters. Human threats can be
further classified as intentional or unintentional
• Examples of human-intentional threats include terrorists, malware, rogue insiders, Denial of Service, hacktivism,
phishing, social engineering, etc.
• Examples of human-unintentional threats are primarily those that involve inadvertent errors and omissions, in which
a person, through lack of knowledge, laziness, or carelessness, serves as a source of disruption
• Environmental: focused on the environment as it pertains to the information systems or datacenter. This class of threat
includes items such as power issues (blackout, brownout, surge, spike), system component or other equipment failures,
and application or software flaws
• Analysis of threats and associated likelihoods is an important part of the BCP and DRP process
Domain 7: Security Operations
Disasters or Disruptive Events
Errors and omissions
• Typically considered the single most common source of disruptive events
• Threat is inadvertently caused by humans, most often in the employ of the
organization, who unintentionally serve as a source of harm
• Data entry mistakes are an example of errors and omissions
Natural Disasters
• Include earthquakes, hurricanes, floods, tsunamis, etc.
• Likelihood of natural threats occurring is largely based upon the geographical location
of the organization's information systems or datacenters
• Generally have a rather low likelihood of occurring
• Impact can be severe
Domain 7: Security Operations
Disasters or Disruptive Events
Electrical or power Problems
• Much more common than natural disasters
• Considered an environmental disaster
• Mitigated with uninterruptible power supplies (UPS) and/or backup generators
Temperature and Humidity Failures
• Critical controls that must be managed during a disaster
• Increased server density can provide for significant heat issues
• Mean Time Between Failures (MTBF) for electrical equipment will decrease if
temperature and humidity levels are not within a tolerable range.
Domain 7: Security Operations
Disasters or Disruptive Events
Warfare, terrorism, and sabotage
• Human-intentional threats
• Threat can vary dramatically based on geographic location, industry, brand
value, as well as the interrelatedness with other high-value target
organizations
• Cyber-warfare
• “Aurora” attacks (named after the word “Aurora,” which was found in a
sample of malware used in the attacks). As the New York Times reported on
2/18/2010: “A series of online attacks on Google and dozens of other
American corporations have been traced to computers at two educational
institutions in China, including one with close ties to the Chinese military, say
people involved in the investigation.”
Domain 7: Security Operations
Disasters or Disruptive Events
Financially-motivated Attackers
• Exfiltration of cardholder data, identity theft, pump-and-dump stock
schemes, bogus anti-malware tools, or corporate espionage, etc.
• Organized crime syndicates
Personnel Shortages
• Another significant source of disruption can come by means of having staff
unavailable
• Most organizations will have some critical processes that are people-
dependent
Domain 7: Security Operations
Disasters or Disruptive Events
Personnel Shortages
• Pandemics and Disease
• Major biological problems such as pandemic flu or highly communicable infectious disease
outbreaks
• A pandemic occurs when an infection spreads through an extremely large geographical area, while
an epidemic is more localized
• Strikes
• Strikes usually are carried out in such a manner that the organization can plan for the occurrence
• Most strikes are announced and planned in advance, which provides the organization with some
lead time
• Personnel Availability
• Sudden separation from employment of a critical member of the workforce
Domain 7: Security Operations
Disasters or Disruptive Events
Communications Failure
• Increasing dependence of organizations on call centers, IP telephony, general
Internet access, and providing services via the Internet
• One of the most common disaster-causing events is telecommunications lines
being inadvertently cut by someone digging where they are not supposed to

NOTE: One of the eye-opening impacts of Hurricane Katrina was a rather significant outage of Internet2,
which provides high-speed connectivity for education and research networks. Qwest, which provides the
infrastructure for Internet2, suffered an outage in one of the major long-haul links that ran from Atlanta to
Houston. Reportedly, the outage was due to lack of availability of fuel in the area. In addition to this outage,
which impacted more than just those areas directly affected by the hurricane, there were substantial outages
throughout Mississippi, which at its peak had more than a third of its public address space rendered
unreachable.
Domain 7: Security Operations
The Disaster Recovery Process
The general process of disaster recovery involves responding to the
disruption; activation of the recovery team; ongoing tactical
communication of the status of disaster and its associated recovery;
further assessment of the damage caused by the disruptive event; and
recovery of critical assets and processes in a manner consistent with the
extent of the disaster.
• Different organizations and experts alike might disagree about the
number or names of phases in the process
• Personnel safety remains the top priority
Domain 7: Security Operations
The Disaster Recovery Process
Respond
• Initial response begins the process of assessing the damage
• Speed is essential (initial assessment)
• The initial assessment will determine if the event in question constitutes a
disaster
• The initial response team should be mindful of assessing the facility's safety
for continued personnel usage
Activate Team
If during the initial response to a disruptive event a disaster is declared, then
the team that will be responsible for recovery needs to be activated.
Domain 7: Security Operations
The Disaster Recovery Process
Communicate
• Ensure that consistent, timely status updates are communicated back to the central
team managing the response and recovery process
• Communication often must occur out-of-band
• The organization must also be prepared to provide external communications
Assess
• More detailed and thorough assessment
• Assess the extent of the damage and determine the proper steps to ensure the
organization's ability to meet its mission and Maximum Tolerable Downtime (MTD)
• Team could recommend that the ultimate restoration or reconstitution occurs at the
alternate site
Domain 7: Security Operations
The Disaster Recovery Process
Reconstitution
• Successfully recover critical business operations either at primary or
secondary site
• If an alternate site is leveraged, adequate safety and security controls
must be in place in order to maintain the expected degree of security
the organization typically employs
• A salvage team will be employed to begin the recovery process at the
primary facility that experienced the disaster
Domain 7: Security Operations
Developing a BCP/DRP
• High-level steps, according to NIST 800-34:
• Project Initiation
• Scope the Project
• Business Impact Analysis
• Identify Preventive Controls
• Recovery Strategy
• Plan Design and Development
• Implementation, Training, and Testing
• BCP/DRP Maintenance
• NIST 800-34 is the National Institute of Standards and Technology's Information
Technology Contingency Planning Guide, which can be found at
http://csrc.nist.gov/publications/nistpubs/800-34/sp800-34.pdf.
Domain 7: Security Operations
Project Initiation
In order to develop the BCP/DRP, the scope of the project must be determined
and agreed upon. This involves seven distinct milestones:
• 1. Develop the contingency planning policy statement: A formal department
or agency policy provides the authority and guidance necessary to develop an
effective contingency plan.
• 2. Conduct the business impact analysis (BIA): The BIA helps to identify and
prioritize critical IT systems and components. A template for developing the
BIA is also provided to assist the user.
• 3. Identify preventive controls: Measures taken to reduce the effects of
system disruptions can increase system availability and reduce contingency
life cycle costs.
Domain 7: Security Operations
Project Initiation
In order to develop the BCP/DRP, the scope of the project must be determined
and agreed upon. This involves seven distinct milestones:
• 4. Develop recovery strategies: Thorough recovery strategies ensure that the
system may be recovered quickly and effectively following a disruption.
• 5. Develop an IT contingency plan: The contingency plan should contain
detailed guidance and procedures for restoring a damaged system.
• 6. Plan testing, training, and exercises: Testing the plan identifies planning
gaps, whereas training prepares recovery personnel for plan activation; both
activities improve plan effectiveness and overall agency preparedness.
• 7. Plan maintenance: The plan should be a living document that is updated
regularly to remain current with system enhancements.
Domain 7: Security Operations
Management Support
“C”-level managers:
• Must agree to any plan set forth
• Must agree to support the action items listed in the plan if an emergency
event occurs
• Refers to people within an organization like the chief executive officer (CEO),
the chief operating officer (COO), the chief information officer (CIO), and the
chief financial officer (CFO)
• Have enough power and authority to speak for the entire organization when
dealing with outside media
• High enough within the organization to commit resources
Domain 7: Security Operations
Other Roles
BCP/DRP Project Manager
• Key Point of Contact for ensuring that a BCP/DRP is completed and routinely
tested
• Must be a good manager and leader in case there is an event that causes the
BCP or DRP to be implemented
• Point of Contact (POC) for every person within the organization during a crisis
• Must be very organized
• Credibility and enough authority within the organization to make important,
critical decisions with regard to implementing the BCP/DRP
• Does not need to have in-depth technical skills
Domain 7: Security Operations
Other Roles
Continuity Planning Project Team (CPPT)
• Comprises those personnel that will have responsibilities if/when an
emergency occurs
• Comprised of stakeholders within an organization
• Focuses on identifying who needs to play a role if a specific emergency event
were to occur
• Includes people from the human resources section, public relations (PR), IT
staff, physical security, line managers, essential personnel for full business
effectiveness, and anyone else responsible for essential functions
Domain 7: Security Operations
Scoping the Project
• Define exactly what assets are protected by the plan, which emergency
events the plan will be able to address, and what resources are
necessary to completely create and implement the plan
• “What is in and out of scope for this plan?”
• After receiving C-level approval and input from the rest of the
organization, objectives and deliverables can be determined
Domain 7: Security Operations
Scoping the Project
• Objectives are usually created as “if/then” statements
• For example, “If there is a hurricane, then the organization will enact plan H—the Physical
Relocation and Employee Safety Plan.” Plan H is unique to the organization but it does encompass
all the BCP/DRP subplans required
• An objective would be to create this plan and have it reviewed by all members of the organization
by a specific date.
• The objective will have a number of deliverables required to create and fully vet this plan: for
example, draft documents, exercise planning meetings, table top preliminary exercises, etc.
• Executive management must at least ensure that support is given for three BCP/DRP
items:
• 1. Executive management support is needed for initiating the plan.
• 2. Executive management support is needed for final approval of the plan.
• 3. Executive management must demonstrate due care and due diligence and be held liable under
applicable laws/regulations.
Domain 7: Security Operations
Assessing the Critical State
• Assessing the critical state can be difficult
because determining which pieces of the IT
infrastructure are critical depends solely on
how each piece supports the users within the
organization.
• When compiling the critical state and asset
list associated with it, the BCP/DRP project
manager should note how the assets impact
the organization in a section called the
“Business Impact” section.
Domain 7: Security Operations
Conduct Business Impact Analysis (BIA)
• Formal method for determining how a disruption to the IT system(s) of an
organization will impact the organization
• An analysis to identify and prioritize critical IT systems and components
• Enables the BCP/DRP project manager to fully characterize the IT contingency
requirements and priorities
• Objective is to correlate the IT system components with the critical services they
support
• Also aims to quantify the consequence of a disruption to the system component and
how that will affect the organization
• Determine the Maximum Tolerable Downtime (MTD) for a specific IT asset
• Also provides information to improve business processes and efficiencies because it
details all of the organization's policies and implementation efforts
The BIA comprises two processes: identification of critical
assets and a comprehensive risk assessment.
Domain 7: Security Operations
Conduct Business Impact Analysis (BIA)
Identify Critical Assets
• A BIA and Critical State Asset List are completed for every IT system within the
organization, no matter how trivial or unimportant, leading to…
• A list of those IT assets that are deemed business-essential by the
organization
Conduct BCP/DRP-focused Risk Assessment
• Determines what risks are inherent to which IT assets
• A vulnerability analysis is also conducted for each IT system and major
application
Domain 7: Security Operations
Determine Maximum Tolerable Downtime
• Describes the total time a system can be inoperable before an organization is
severely impacted
• It is also the maximum time it takes to execute the reconstitution phase
• Comprises two metrics: Recovery Time Objective (RTO) and Work
Recovery Time (WRT)
Alternate terms for MTD
• Depending on the business continuity framework that is used, other terms
may be substituted for Maximum Tolerable Downtime. These include
Maximum Allowable Downtime (MAD), Maximum Tolerable Outage (MTO),
and Maximum Acceptable Outage (MAO).
Domain 7: Security Operations
Failure and Recovery Metrics
• Used to quantify how frequently systems fail, how long a system may
exist in a failed state, and the maximum time to recover from failure.
• These metrics include the Recovery Point Objective (RPO), Recovery
Time Objective (RTO), Work Recovery Time (WRT), Mean Time
Between Failures (MTBF), Mean Time to Repair (MTTR), and
Minimum Operating Requirements (MOR).
Domain 7: Security Operations
Recovery Point Objective
• The amount of data loss or system inaccessibility (measured in time)
that an organization can withstand.
• “If you perform weekly backups, someone made a decision that your
company could tolerate the loss of a week's worth of data. If backups
are performed on Saturday evenings and a system fails on Saturday
afternoon, you have lost the entire week's worth of data. This is the
recovery point objective. In this case, the RPO is 1 week.”
• RPO represents the maximum acceptable amount of data/work loss
for a given process because of a disaster or disruptive event
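The weekly-backup example above can be worked through in a few lines. This is only a sketch: the dates and the function names `data_loss_window` and `meets_rpo` are hypothetical, chosen so that the last backup ran on a Saturday evening and the failure occurs the following Saturday afternoon.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup, failure_time):
    """Return how much work is lost if the system fails at failure_time."""
    return failure_time - last_backup


def meets_rpo(backup_interval, rpo):
    """Worst-case loss equals the backup interval, so it must not exceed the RPO."""
    return backup_interval <= rpo


# Hypothetical dates: both fall on a Saturday.
last_backup = datetime(2024, 1, 6, 20, 0)   # Saturday evening backup
failure = datetime(2024, 1, 13, 14, 0)      # next Saturday afternoon

loss = data_loss_window(last_backup, failure)
weekly_ok = meets_rpo(timedelta(weeks=1), rpo=timedelta(weeks=1))
daily_rpo_ok = meets_rpo(timedelta(weeks=1), rpo=timedelta(days=1))
```

Here almost a full week of data is lost, which is acceptable only when the RPO is a week or longer. A shorter RPO would require more frequent backups.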
Domain 7: Security Operations
Recovery Time Objective (RTO) and Work Recovery Time (WRT)
• Recovery Time Objective (RTO) describes the maximum time allowed
to recover business or IT systems
• RTO is also called the systems recovery time. It is one part of the Maximum
Tolerable Downtime: once the system is physically running, it must still be
configured.
• Work Recovery Time (WRT) describes the time required to configure a
recovered system.
• “Downtime consists of two elements, the systems recovery time and
the work recovery time. Therefore, MTD = RTO + WRT.”
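The relationship MTD = RTO + WRT can be expressed as a small calculation. The function names and the hour values below are illustrative assumptions, not figures from the slides.

```python
def maximum_tolerable_downtime(rto_hours, wrt_hours):
    """MTD is the sum of systems recovery time (RTO) and work recovery time (WRT)."""
    if rto_hours < 0 or wrt_hours < 0:
        raise ValueError("recovery times cannot be negative")
    return rto_hours + wrt_hours


def work_recovery_budget(mtd_hours, rto_hours):
    """Given a fixed MTD, return the time left for configuration (WRT)."""
    remaining = mtd_hours - rto_hours
    if remaining < 0:
        raise ValueError("RTO alone already exceeds the MTD")
    return remaining


# Hypothetical figures: 36 hours to bring systems up, 12 hours to configure them.
mtd = maximum_tolerable_downtime(rto_hours=36, wrt_hours=12)
budget = work_recovery_budget(mtd_hours=48, rto_hours=36)
```

Working backwards from an agreed MTD in this way shows how much of the window is left for configuration once the systems recovery time is known.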
Domain 7: Security Operations
Mean Time Between Failures
• Quantifies how long a new or repaired system will run before failing
• Typically generated by a component vendor and is largely applicable to
hardware as opposed to applications and software.
• A vendor selling LCD computer monitors may run 100 monitors 24 hours a
day for 2 weeks and observe just one monitor failure. The vendor then
extrapolates the following:
100 LCD monitors x 14 days x 24 hours/day = 33,600 hours of operation with 1 failure, or 1 failure per 33,600 hours
• The BCP/DRP team determines the correct amount of expected failures
within the IT system during a course of time.
• The calculated MTBF becomes less reliable when an organization uses fewer
and fewer hardware assets.
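The vendor extrapolation above, and the expected-failure estimate the BCP/DRP team derives from it, can be sketched as follows. The function names, the fleet size of 50, and the one-year period are assumptions for illustration. The estimate also assumes a constant failure rate, which the slides do not state.

```python
def observed_mtbf_hours(units, days, hours_per_day, failures):
    """Total operating hours across all units divided by observed failures."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to estimate MTBF")
    return units * days * hours_per_day / failures


def expected_failures(units, period_hours, mtbf_hours):
    """Rough expected failure count for a fleet over a period (constant rate assumed)."""
    return units * period_hours / mtbf_hours


# The monitor example from the slide: 100 units, 2 weeks, 24 hours a day, 1 failure.
mtbf = observed_mtbf_hours(units=100, days=14, hours_per_day=24, failures=1)

# Hypothetical fleet: 50 monitors running continuously for one year (8,760 hours).
yearly = expected_failures(units=50, period_hours=8760, mtbf_hours=mtbf)
```

With only a handful of devices, the expected count is a fraction of one failure, which is one way to see why the estimate becomes less useful as the number of hardware assets shrinks.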
Domain 7: Security Operations
Mean Time to Repair (MTTR)
• Describes how long it will take to recover a specific failed system
• Best estimate for reconstituting the IT system so that business continuity may
occur
Minimum Operating Requirements
• Describes the minimum environmental and connectivity requirements in
order to operate computer equipment
• Important to determine and document for each IT-critical asset because, in
the event of a disruptive event or disaster, proper analysis can be conducted
quickly to determine if the IT assets will be able to function in the emergency
environment
Domain 7: Security Operations
Identify Preventive Controls
• Preventive controls prevent disruptive events from having an impact
• The BIA will identify some risks which may be mitigated immediately
Recovery Strategy
• Once the BIA is complete, the BCP team knows the Maximum Tolerable
Downtime. This metric, as well as others including the Recovery Point
Objective and Recovery Time Objective, are used to determine the recovery
strategy.
• Always maintain technical, physical, and administrative controls when using
any recovery option
Domain 7: Security Operations
Recovery Strategy
Supply Chain Management
• In an age of “just in time” shipment of goods, organizations may fail to acquire
adequate replacement computers.
• Some computer manufacturers offer guaranteed replacement insurance for a specific
range of disasters. The insurance is priced per server, and includes a service level
agreement that specifies the replacement time. All forms of relevant insurance
should be analyzed by the BCP team.
Telecommunication Management
• Ensures the availability of electronic communications during a disaster
• Often one of the first processes to fail during a disaster
• Wired circuits such as T1s, T3s, frame relay, etc., need to be specifically addressed
• Power can be provided by generator if necessary.
Domain 7: Security Operations
Recovery Strategy
Utility Management
• Utility management addresses the availability of utilities such as power,
water, gas, etc. during a disaster
• The utility management plan should address all utilities required by business
operations, including power, heating, cooling, and water.
• Specific sections should address the unavailability of any required utility.
Recovery options
• Once an organization has determined its maximum tolerable downtime, the
choice of recovery options can be determined. For example, a 10-day MTD
indicates that a cold site may be a reasonable option. An MTD of a few hours
indicates that a redundant site or hot site is a potential option.
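The idea of choosing a recovery option from the MTD can be sketched as a simple lookup. The cut-off values below are hypothetical, picked only to be consistent with the examples in these slides (a few hours suggests a hot or redundant site, a 10-day MTD suggests a cold site). A real decision would also weigh cost and the other metrics from the BIA.

```python
def suggest_recovery_option(mtd_hours):
    """Map an MTD (in hours) to a candidate recovery option.

    The thresholds are illustrative assumptions, not values from NIST 800-34.
    """
    if mtd_hours < 0:
        raise ValueError("MTD cannot be negative")
    if mtd_hours < 1:
        return "redundant site"
    if mtd_hours < 24:
        return "hot site"
    if mtd_hours < 7 * 24:
        return "warm site"
    return "cold site"


few_hours = suggest_recovery_option(4)
two_days = suggest_recovery_option(48)
ten_days = suggest_recovery_option(10 * 24)
```

The ordering reflects the general trade-off: the longer the organization can tolerate downtime, the less expensive the recovery option it can consider.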
Domain 7: Security Operations
Recovery Options
Redundant Site
• A redundant site is an exact production duplicate of a system that has the capability to seamlessly
operate all necessary IT operations without loss of services to the end user of the system.
• A redundant site receives data backups in real time so that in the event of a disaster, the users of the
system have no loss of data.
• The most expensive recovery option
Hot Site
• A hot site is a location that an organization may relocate to following a major disruption or disaster.
• It is a datacenter with a raised floor, power, utilities, computer peripherals, and fully configured
computers.
• Will have all necessary hardware and critical applications data mirrored in real time.
• A hot site will have the capability to allow the organization to resume critical operations within a
very short period of time—sometimes in less than an hour.
• Has all the same physical, technical, and administrative controls as the production site.
Domain 7: Security Operations
Recovery Options
Warm Site
• Has some aspects of a hot site, for example, readily-accessible hardware and connectivity, but it will have
to rely upon backup data in order to reconstitute a system after a disruption.
• It is a datacenter with a raised floor, power, utilities, computer peripherals, and fully configured computers.
• MTD of at least 1-3 days
• The longer the MTD is, the less expensive the recovery solution will be.
Cold Site
• The least expensive recovery solution to implement.
• Does not include backup copies of data, nor does it contain any immediately available hardware.
• Longest amount of time of all recovery solutions to implement and restore critical IT services for the
organization
• MTD—usually measured in weeks, not days.
• Typically a datacenter with a raised floor, power, utilities, and physical security, but not much beyond that.
Domain 7: Security Operations
Recovery Options
Reciprocal Agreement
• A bi-directional agreement between two organizations in which one organization
promises another organization that it can move in and share space if it experiences a
disaster.
• Documented in the form of a contract
• Also referred to as Mutual Aid Agreements (MAAs)
Mobile Site
• “datacenters on wheels”: towable trailers that contain racks of computer equipment,
as well as HVAC, fire suppression and physical security.
• A good fit for disasters such as a datacenter flood
• Typically placed within the physical property lines, and are protected by defenses
such as fences, gates, and security cameras
Domain 7: Security Operations
Recovery Options
Subscription Services
• Some organizations outsource their BCP/DRP planning and/or
implementation by paying another company to perform those
services.
• Effectively transfers the risk to the insurer company.
• Based upon a simple insurance model, and companies such as
IBM have built profit models and offer services for customers
offering BCP/DRP insurance.
Domain 7: Security Operations
Related Plans
The Business Continuity Plan is an umbrella plan that contains other
plans:
• Disaster recovery plan
• Continuity of Operations Plan (COOP)
• Business Resumption/Recovery Plan (BRP)
• Continuity of Support Plan
• Cyber Incident Response Plan
• Occupant Emergency Plan (OEP)
• Crisis Management Plan (CMP)
Domain 7: Security Operations
Related Plans
Continuity of Operations Plan (COOP)
• Describes the procedures required to maintain operations during a disaster
• Includes transfer of personnel to an alternate disaster recovery site, and operations of that
site.
Business Recovery Plan (BRP)
• Also known as the Business Resumption Plan
• Details the steps required to restore normal business operations after recovering from a
disruptive event
• May include switching operations from an alternate site back to a (repaired) primary site.
• Picks up when the COOP is complete
• Narrow and focused: the BRP is sometimes included as an appendix to the Business Continuity
Plan
Domain 7: Security Operations
Related Plans
Continuity of Support Plan
• Focuses narrowly on support of specific IT systems and applications
• Also called the IT Contingency Plan, emphasizing IT over general business support
Cyber Incident Response Plan
• Designed to respond to disruptive cyber events, including network-based attacks, worms, computer
viruses, Trojan horses, etc.
Occupant Emergency Plan (OEP)
• Provides the “response procedures for occupants of a facility in the event of a situation posing a potential
threat to the health and safety of personnel, the environment, or property. Such events would include a
fire, hurricane, criminal attack, or a medical emergency.”
• Facilities-focused, as opposed to business or IT-focused.
• Focused on safety and evacuation, and should describe specific safety drills, including evacuation drills
(also known as fire drills)
• Specific safety roles should be described, including safety warden and meeting point leader
Domain 7: Security Operations
Related Plans
Crisis Management Plan (CMP)
• Designed to provide coordination among the managers of the
organization in the event of an emergency or disruptive event
• Details the actions management must take to ensure that life and
safety of personnel and property are immediately protected in case of
a disaster
• Crisis Communications Plan
• Component of the Crisis Management Plan
• Sometimes called the communications plan
• A plan for communicating to staff and the public in the event of a disruptive event
Domain 7: Security Operations
Related Plans
• Crisis Communications Plan
• Call Trees
• Is used to quickly communicate news throughout an organization without
overburdening any specific person
• Works by assigning each employee a small number of other employees they are
responsible for calling in an emergency event
• Most effective when there is two-way reporting of successful communication
• Should contain alternate contact methods, in case the primary methods are
unavailable
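The call tree idea, in which each employee is responsible for calling only a small number of others, can be sketched as follows. The staff names, the `fanout` parameter, and the function name `build_call_tree` are hypothetical. Two-way reporting and alternate contact methods are left out of this sketch.

```python
from collections import deque


def build_call_tree(staff, fanout=3):
    """Assign each person at most `fanout` people to call.

    staff[0] starts the tree. Returns a dict mapping caller -> list of callees.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    if not staff:
        return {}
    tree = {name: [] for name in staff}
    callers = deque([staff[0]])
    for person in staff[1:]:
        # Move to the next caller once the current one has a full list.
        while len(tree[callers[0]]) >= fanout:
            callers.popleft()
        tree[callers[0]].append(person)
        callers.append(person)
    return tree


# Hypothetical staff list; "A" initiates the calls.
tree = build_call_tree(["A", "B", "C", "D", "E", "F", "G"], fanout=2)
```

With a fanout of 2, seven people are reached in two rounds of calls and nobody places more than two calls, which is the point of not overburdening any specific person.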
Domain 7: Security Operations
Calling Tree
Domain 7: Security Operations
Related Plans
• Crisis Communications Plan
• Automated Call Trees
• Automatically contact all BCP/DRP team members after a disruptive event
• Tree can be activated by an authorized member, triggered by a phone call, email,
or Web transaction
• Once triggered, all BCP/DRP members are automatically contacted
• Can require positive verification of receipt of a message, such as “press 1 to
acknowledge receipt.”
• Automated call trees are hosted offsite, and typically supported by a third-party
BCP/DRP provider
Domain 7: Security Operations
Related Plans
• Crisis Communications Plan
• Emergency Operations Center (EOC)
• The command post established during or just after an emergency event
• Placement of the EOC will depend on resources that are available
• Vital Records
• Should be stored offsite, at a location and in a format that will allow access during
a disaster
• Have both electronic and hardcopy versions of all vital records
• Include contact information for all critical staff. Additional vital records include
licensing information, support contracts, service level agreements, reciprocal
agreements, telecom circuit IDs, etc.
Domain 7: Security Operations
Executive Succession Planning
• Organizations must ensure that there is always an executive
available to make decisions during a disaster
• A common mistake is allowing entire executive teams to be
offsite at distant meetings
• One of the simplest executive powers is the ability to endorse
checks and procure money.
Domain 7: Security Operations
Plan Approval
• Now that the initial BCP/DRP plan has been completed, senior
management approval is the required next step
• It is ultimately senior management's responsibility to protect an
organization's critical assets and personnel
• Senior management must understand that they are responsible
for the plan, fully understand the plan, take ownership of it, and
ensure its success.
Domain 7: Security Operations
Backups and availability (again…)
• In order to successfully recover critical business operations,
the organization needs to be able to effectively and efficiently back up
and restore both systems and data
• Verification of recoverability from backups is often overlooked
• Critical backup media must be stored offsite
• Ensure that the organization can quickly procure large high-end tape
drives (if necessary)
• If the MTTR is greater than the MTD, then an alternate backup or
availability methodology must be employed
Domain 7: Security Operations
Backups and availability (again…)
Hardcopy Data
• Hardcopy data is any data that is accessed by reading or
writing on paper rather than by processing through a computer
system.
• In weather-emergency-prone areas such as Florida, Mississippi,
and Louisiana, many businesses develop a “paper only” DRP,
which will allow them to operate key critical processes with just
hard copies of data, battery-operated calculators, and other small
electronics, as well as pens and pencils
Backups and availability (again…)
Electronic Backups
• Archives that are stored electronically
• Full Backups
• Every piece of data is copied and stored on the backup repository
• Time consuming, bandwidth intensive, and resource intensive
• Will ensure that any necessary data is available
• Incremental Backups
• Archive data that have changed since the last full or incremental backup
• Differential Backups
• Archive data that have changed since the last full backup
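The difference between the three backup types comes down to which reference point a file's modification time is compared against. The sketch below assumes a simplified model in which each file has one numeric modification timestamp; the `FileInfo` class, the function name, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    name: str
    modified: int  # hypothetical timestamp, e.g. hours since some reference point


def select_for_backup(files, kind, last_full, last_full_or_incremental):
    """Return the names of the files a backup of the given kind would copy.

    last_full: time of the last full backup.
    last_full_or_incremental: time of the most recent full or incremental backup.
    """
    if kind == "full":
        # Every piece of data is copied.
        return [f.name for f in files]
    if kind == "differential":
        # Everything changed since the last full backup.
        return [f.name for f in files if f.modified > last_full]
    if kind == "incremental":
        # Everything changed since the last full or incremental backup.
        return [f.name for f in files if f.modified > last_full_or_incremental]
    raise ValueError(f"unknown backup kind: {kind}")


files = [FileInfo("a.txt", 5), FileInfo("b.txt", 15), FileInfo("c.txt", 25)]
# Assumed history: full backup at t=10, incremental backup at t=20.
for kind in ("full", "differential", "incremental"):
    print(kind, select_for_backup(files, kind, 10, 20))
```

With this assumed history, the differential backup keeps growing until the next full backup, while the incremental backup contains only what changed since t=20. The trade-off is that restoring from incrementals requires the last full backup plus every incremental since, whereas a differential restore needs only the last full backup and the latest differential.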
Backups and availability (again…)
Electronic Backups
• Archives that are stored electronically
• Electronic vaulting
• Batch process of electronically transmitting data that is to be backed up on a routine, regularly
scheduled time interval
• Used to transfer bulk information to an offsite facility
• Good tool for data that need to be backed up on a daily or possibly even hourly basis
• Stores sensitive data offsite
• Can perform the backup at very short intervals to ensure that the most recent data is backed up
• Occurs across the Internet in most cases (important that the information sent for backup be sent
via a secure communication channel and protected through a strong encryption protocol)
Backups and availability (again…)
Electronic Backups
• Archives that are stored electronically
• Remote Journaling
• A database journal contains a log of all database transactions
• May be used to recover from a database failure
• Remote Journaling saves the database checkpoints and database journal to a remote
site
• Database shadowing
• Uses two or more identical databases that are updated simultaneously
• Can exist locally, but it is best practice to host one shadow database offsite
• Allows faster recovery when compared with remote journaling
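Recovery from a checkpoint plus a journal can be sketched as loading the checkpointed state and replaying every journal entry recorded after it. The example below is an illustration only: the key/value "database", the entry format, and the sequence numbers are assumptions, not how any particular database product stores its journal.

```python
def recover(checkpoint: dict, journal: list) -> dict:
    """Rebuild the database state from a checkpoint and the transaction journal."""
    state = dict(checkpoint["data"])  # start from the saved checkpoint
    for entry in journal:
        if entry["seq"] <= checkpoint["seq"]:
            continue  # already reflected in the checkpoint
        if entry["op"] == "set":
            state[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            state.pop(entry["key"], None)
        else:
            raise ValueError(f"unknown operation: {entry['op']}")
    return state


# Hypothetical copies retrieved from the remote site.
checkpoint = {"seq": 2, "data": {"alice": 100, "bob": 50}}
journal = [
    {"seq": 1, "op": "set", "key": "alice", "value": 100},
    {"seq": 2, "op": "set", "key": "bob", "value": 50},
    {"seq": 3, "op": "set", "key": "alice", "value": 80},
    {"seq": 4, "op": "delete", "key": "bob"},
    {"seq": 5, "op": "set", "key": "carol", "value": 20},
]
recovered = recover(checkpoint, journal)
print(recovered)
```

The replay step is why recovery from remote journaling takes longer than switching to a shadow database, which is already up to date.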
Software Escrow
• Allows organizations to maintain the availability of their applications
even if the vendor that initially developed the software goes out of business
• A neutral third party holds the source code
• Should the development organization go out of business or
otherwise violate the terms of the software escrow agreement,
then the third party holding the escrow will provide the source
code and any other information to the purchasing organization.
DRP testing, training, and awareness
• Skipping these steps is one of the most common BCP/DRP mistakes
• A DRP is never complete; it is a continually amended method
for ensuring that the organization can recover in an acceptable
manner
• Testing is used to find and correct mistakes in the plan
• An effective DRP inherently involves some complex
operations and maneuvers to be performed by administrators
• Each member of the DRP team should be exceedingly familiar with the
particulars of their role in the DRP
DRP Testing
• In order to ensure that a Disaster Recovery Plan represents a viable
plan for recovery, thorough testing is needed
• Routine infrastructure, hardware, software, and configuration changes
materially alter the way in which the DRP needs to be carried out
• To ensure both the initial and continued efficacy of the DRP as a feasible
recovery methodology, testing needs to be performed
• Different types of tests
• At a minimum, regardless of the type of test selected, tests should be
performed on an annual basis
DRP Testing
DRP Review
• Most basic form of DRP testing
• Focused on simply reading the DRP in its entirety to ensure completeness of coverage
• Typically performed by the team that developed the plan, and will involve team members
reading the plan in its entirety to quickly review the overall plan for any obvious flaws
Checklist
• Also known as consistency testing
• Lists all necessary components required for successful recovery, and ensures that they are, or
will be, readily available should a disaster occur
• Often performed concurrently with the structured walkthrough or tabletop testing as a first
testing threshold
• Focused on ensuring that the organization has, or can acquire in a timely fashion, sufficient
resources on which their successful recovery is dependent
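A checklist test amounts to comparing what recovery requires against what is on hand or readily obtainable. The sketch below assumes both sides can be expressed as simple item counts; the function name and the inventory values are hypothetical.

```python
def checklist_gaps(required: dict, available: dict) -> dict:
    """Return the shortfall for every required item that is not fully covered."""
    gaps = {}
    for item, needed in required.items():
        have = available.get(item, 0)  # a missing item counts as zero
        if have < needed:
            gaps[item] = needed - have
    return gaps


# Hypothetical recovery requirements and current inventory.
required = {"servers": 4, "backup tapes": 10, "generator": 1}
available = {"servers": 4, "backup tapes": 6}

gaps = checklist_gaps(required, available)
print(gaps)
```

An empty result would mean the checklist passes; any entry in `gaps` is a resource the organization must acquire or arrange before relying on the plan.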
DRP Testing
Parallel Processing
• Common in environments where transactional data is a key component of the
critical business processing
• Typically involves recovering critical processing components at an alternate
computing facility and restoring data from a previous backup
• Regular production systems are not interrupted
• Transactions from the day after the backup are then run against the newly
restored data; the recovery system's results should match the results
achieved during normal operations for the date in question
• Organizations that are highly dependent upon mainframe and midrange
systems will often employ this type of test.
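The comparison at the heart of a parallel processing test can be sketched as: restore the backup, replay the next day's transactions, and check the outcome against what production recorded. The account-balance model, function names, and figures below are hypothetical and chosen only to make the comparison concrete.

```python
def apply_transactions(balances: dict, transactions: list) -> dict:
    """Replay (account, amount) transactions against a copy of the restored data."""
    result = dict(balances)
    for account, amount in transactions:
        result[account] = result.get(account, 0) + amount
    return result


def parallel_test(backup: dict, transactions: list, production_result: dict) -> dict:
    """Return accounts where the recovery system disagrees with production.

    Each mismatch maps the account to (recovered value, production value).
    """
    recovered = apply_transactions(backup, transactions)
    accounts = set(recovered) | set(production_result)
    return {
        acct: (recovered.get(acct), production_result.get(acct))
        for acct in accounts
        if recovered.get(acct) != production_result.get(acct)
    }


# Hypothetical data: restored backup, the following day's transactions,
# and the balances production reported at the end of that day.
backup = {"A": 100, "B": 200}
transactions = [("A", -30), ("B", 50), ("C", 10)]
production_result = {"A": 70, "B": 250, "C": 10}

mismatches = parallel_test(backup, transactions, production_result)
print("PASS" if not mismatches else f"FAIL: {mismatches}")
```

Because the replay runs on a copy at the alternate facility, regular production systems are not interrupted by the test.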
DRP Testing
Partial and Complete Business Interruption
• This type of test can actually be the cause of a disaster, so
extreme caution should be exercised before attempting an
actual interruption test
• Testing will include having the organization stop processing
normal business at the primary location, and instead leverage
the alternate computing facility
• More common in organizations where fully redundant,
load-balanced operations exist
Training
• An element of DRP training comes as part of performing the tests
• More detailed training on some specific elements of the DRP process may be
required.
Starting Emergency Power
• Converting a datacenter to emergency power, such as backup generators
• Specific training and testing of changing over to emergency power should be
regularly performed.
Calling Tree Training/Test
• Individuals with calling responsibilities are expected to be able to answer
within a very short time period, or otherwise make arrangements.
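A calling tree test can be simulated by walking the tree and recording who was reached. The sketch below is a simplified model under stated assumptions: the tree has no cycles, and when someone does not answer, their caller takes over that person's contacts (one possible reading of "make arrangements"). The names and tree layout are hypothetical.

```python
from collections import deque


def run_call_tree(tree: dict, root: str, answers: set):
    """Simulate a calling tree test and return (reached, unreached).

    tree maps each person to the people they are responsible for calling.
    answers is the set of people who pick up within the allowed time.
    """
    reached, unreached = [root], []
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        pending = deque(tree.get(caller, []))
        while pending:
            person = pending.popleft()
            if person in answers:
                reached.append(person)
                queue.append(person)  # this person now calls their own contacts
            else:
                unreached.append(person)
                # Assumed escalation: the caller covers the missing person's list.
                pending.extend(tree.get(person, []))
    return reached, unreached


tree = {
    "cio": ["ops lead", "net lead"],
    "ops lead": ["dba", "sysadmin"],
    "net lead": ["noc"],
}
answers = {"net lead", "dba", "sysadmin", "noc"}  # "ops lead" does not answer

reached, unreached = run_call_tree(tree, "cio", answers)
print(reached)
print(unreached)
```

The `unreached` list is the useful output of such a test: it identifies the contacts whose details or availability arrangements need to be fixed.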
Awareness
Even for those members who have little active role with
respect to the overall recovery process, there is still the
matter of ensuring that all members of an organization are
aware of the organization's prioritization of safety and
business viability in the wake of a disaster.
Continued BCP/DRP maintenance
• The BCP/DRP must be kept up to date
• BCP/DRP plans must keep pace with all critical business and IT changes.
Change Management
• The Change Management process is designed to ensure that security is not adversely
affected as systems are introduced, changed, and updated.
• Includes tracking and documenting all planned changes, formal approval for
substantial changes, and documentation of the results of the completed change
• All changes must be auditable
• The change control board manages this process
• The BCP team should be a member of the change control board, and attend all
meetings to identify any changes that must be addressed by the BCP/DRP plan
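The change management requirements above (tracking, formal approval for substantial changes, documented results, auditability) can be illustrated with a small record type. This is a minimal sketch, not a model of any real change management tool; the class, field, and method names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    summary: str
    substantial: bool           # substantial changes need formal approval
    affects_bcp: bool = False   # flag for the BCP team to review
    approved: bool = False
    result: str = ""
    audit_log: list = field(default_factory=list)

    def approve(self, approver: str) -> None:
        self.approved = True
        self.audit_log.append(f"approved by {approver}")

    def complete(self, result: str) -> None:
        # Enforce formal approval before a substantial change is closed out.
        if self.substantial and not self.approved:
            raise PermissionError("substantial change requires formal approval")
        self.result = result
        self.audit_log.append(f"completed: {result}")


change = ChangeRequest("Replace core switch", substantial=True, affects_bcp=True)
change.approve("change control board")
change.complete("switch replaced, failover verified")
print(change.audit_log)
```

The `affects_bcp` flag reflects the point that the BCP team attends change control board meetings to spot changes the BCP/DRP plan must address.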
BCP/DRP Mistakes
Common BCP/DRP mistakes include:
• Lack of management support
• Lack of business unit involvement
• Lack of prioritization among critical staff
• Improper (often overly narrow) scope
• Inadequate telecommunications management
• Inadequate supply chain management
• Incomplete or inadequate crisis management plan
• Lack of testing
• Lack of training and awareness
• Failure to keep the BCP/DRP plan up to date
Specific BCP/DRP frameworks
Specific frameworks include NIST SP 800-34, ISO/IEC-27031,
BS-25999, and BCI.
NIST SP 800-34
• The National Institute of Standards and Technology (NIST)
Special Publication 800-34 “Contingency Planning Guide for
Information Technology Systems”
• May be downloaded at
http://csrc.nist.gov/publications/nistpubs/800-34/sp800-34.pdf
Specific BCP/DRP frameworks
ISO/IEC-27031
• Draft guideline that is part of the ISO 27000 series, which also includes ISO 27001 and ISO 27002
• Focuses on BCP (DRP is handled by another framework)
• The current formal name is “ISO/IEC 27031 Information technology—Security techniques—Guidelines for ICT Readiness
for Business Continuity (final committee draft).” According to http://www.iso27001security.com/html/27031.html,
ISO/IEC 27031 is designed to:
• “Provide a framework (methods and processes) for any organization—private, governmental, and nongovernmental;
• Identify and specify all relevant aspects including performance criteria, design, and implementation details, for improving ICT readiness as
part of the organization's ISMS, helping to ensure business continuity;
• Enable an organization to measure its continuity, security and hence readiness to survive a disaster in a consistent and recognized manner.”
• Terms and acronyms used by ISO/IEC 27031 include:
• ICT—Information and Communications Technology
• ISMS—Information Security Management System
• A separate ISO plan for disaster recovery is ISO/IEC 24762:2008, “Information technology—Security techniques—
Guidelines for information and communications technology disaster recovery services.” More information is available at
http://www.iso.org/iso/catalogue_detail.htm?csnumber=41532
Specific BCP/DRP frameworks
BS-25999
• British Standards Institution (BSI, http://www.bsigroup.co.uk/) released BS-25999, which is in two parts:
• “Part 1, the Code of Practice, provides business continuity management best practice recommendations. Please note that this is a guidance
document only.
• Part 2, the Specification, provides the requirements for a Business Continuity Management System (BCMS) based on BCM best practice.
This is the part of the standard that you can use to demonstrate compliance via an auditing and certification process.” [14]
BCI
• The Business Continuity Institute (BCI, http://www.thebci.org/) published the six-step Good Practice Guidelines (GPG) in
2008; the latest version (2013) describes the Business Continuity Management (BCM) process:
• Management Practices
• PP1 Policy & Program Management
• PP2 Embedding Business Continuity
• Technical Practices
• PP3 Analysis
• PP4 Design
• PP5 Implementation
• PP6 Validation
Thank you.
