
CERC SATA Best Practices Reference Guide

Authored By:
Worldwide Services Team

March 2007 rev A00

____________________

Information in this document is subject to change without notice.

© Copyright 2007 Dell Inc. All rights reserved.


Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN
TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED
AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

Trademarks used in this text: Dell, the DELL logo, PowerEdge, PowerVault, Precision, and OpenManage
are trademarks of Dell Inc.; Microsoft, Windows, Windows NT, and Windows Server are either trademarks
or registered trademarks of Microsoft Corporation in the United States and/or other countries; Red Hat, Red
Hat Enterprise Linux, and Red Hat Linux are registered trademarks of Red Hat, Inc. in the United States
and other countries; Novell and Netware are registered trademarks of Novell, Inc., in the United States and
other countries.
Other trademarks and trade names may be used in this document to refer to either the entities claiming the
marks and names or their products. Dell disclaims proprietary interest in the marks and names of others.

TABLE OF CONTENTS

OBJECTIVE AND SCOPE

SECTION 1: INTRODUCTION
CERC SATA 1.5/6ch
CERC SATA 1.5/2s
Unsupported SATA Solutions

SECTION 2: OVERVIEW OF STEPS ENSURING RAID BEST PRACTICES
Maintenance of Arrays
Recovery of Arrays
Upgrading and Reconfiguring Arrays

SECTION 3: MAINTENANCE OF ARRAYS
Utilities and Applications Used for Array Maintenance
Definition of Drive and Array Status as Reported in the Controller BIOS
RAID Support in Linux
Consistency Check of RAID Arrays – CERC SATA 1.5/6ch
Background Consistency Check of RAID Arrays – CERC SATA 1.5/6ch
Consistency Check of RAID Arrays – CERC SATA 1.5/2s
Backup and Recovery of Data

SECTION 4: RECOVERY OF ARRAYS
Capture System Logs and Details Surrounding Array Failures to Assist in the Recovery
Write Down the Circumstances or the Exact Steps Performed Preceding the Failure
Understand Possible Causes of Drive Array Failure
Common BIOS Messages
Simple Troubleshooting Steps When a Failure Is Discovered
Recovering from Arrays in a Degraded State
Recovering from Arrays in a Failed State
CERC SATA 1.5/6ch Array Restoration - <CTRL><R> Enable/Restore RAID
CERC SATA 1.5/2s Array Restoration
Double Fault Scenario
Rebuilding
Known Hard Drive Replacement Issues

SECTION 5: UPGRADING AND RECONFIGURING ARRAYS
Array Reconstruction
Capacity Expansion
RAID Level Migration
Past Known Issues

SECTION 6: PERFORMANCE

APPENDIX: SATA BEST PRACTICES
CERC SATA 1.5/6ch Controller Specifications
Minimum System Requirements
CERC SATA 1.5/2s Controller Specifications
Setting Up Automated Scheduling of Consistency Checks on Windows Systems

OBJECTIVE AND SCOPE
This document contains the best practices for routine maintenance of systems using the CERC
SATA 1.5/6ch and CERC SATA 1.5/2s controllers to handle their RAID needs. This document is
not intended for addressing or recommending the type or size of arrays for specific applications.
These maintenance best practices are recommended to all Dell™ Enterprise users to avoid
failures, downtime, and data loss. These practices will help to ensure a better user experience by
maintaining the integrity of data and minimizing cost of downtime.

The document covers the following practices:


• Maintenance of arrays
• Recovery of arrays
• Upgrading and Reconfiguring arrays

SECTION 1: INTRODUCTION
The goal of Redundant Array of Independent Disks (RAID) is to provide better performance
and/or reliability from combinations of disk drives than the performance provided with non-RAID
configurations. Serial ATA (SATA) disks are based on a low-cost technology that replaces
Parallel ATA (PATA) disk drives in value servers. Serial ATA incorporates significant technical
enhancements over traditional ATA making it ideal for RAID implementations. Along with several
configuration benefits, SATA improves data transmissions through a point-to-point topology,
which eliminates bus sharing and allows up to a full 1.5 Gb/s bandwidth to each drive. The SATA
standard also specifies a power connector that is different from the 4-pin connector used by
Parallel ATA (PATA) drives. The larger number of pins allows the connector to supply three
different voltages if required: 3.3 V, 5 V, and 12 V. Hot-swapping is another key feature
supported by some SATA solutions but not by PATA.
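
As a rough worked example of the 1.5 Gb/s figure: first-generation SATA uses 8b/10b line encoding, so each payload byte occupies 10 bits on the wire. The sketch below converts the line rate into an approximate peak payload rate per drive. The function name is illustrative only and does not come from any Dell or Adaptec tool; real throughput is lower because of protocol overhead and the drive itself.

```python
def sata_payload_mb_per_s(line_rate_gbps: float = 1.5) -> float:
    """Approximate peak payload rate in MB/s for one SATA link.

    Assumes 8b/10b encoding, so 10 line bits carry one payload byte.
    """
    line_bits_per_s = line_rate_gbps * 1_000_000_000
    payload_bytes_per_s = line_bits_per_s / 10
    return payload_bytes_per_s / 1_000_000


# Each drive has its own point-to-point link, so this ceiling applies per drive.
print(sata_payload_mb_per_s())
```

Because the topology is point-to-point, this ceiling is not shared among the drives attached to the controller.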

Dell offers two cost-effective SATA RAID solutions, specifically the CERC SATA 1.5/6ch and the
CERC SATA 1.5/2s.

CERC SATA 1.5/6ch


The CERC SATA 1.5/6ch is a six-port Serial ATA I/O processor-based RAID controller that
supports advanced RAID technology features. The controller’s RAID features include:

• Optimized Disk Utilization – Enables use of the full capacity of all the drives, even if the
drive sizes vary.

• Online Capacity/Volume Expansion – Enables capacity expansion of the RAID array
during system operation.

• Online RAID Level Migration – Enables migration between RAID levels without
rebuilding the array from scratch.

• Multiple Arrays – Enables the user to create multiple arrays from a single set of drives.

• SATA Disk Hot Plug – The PowerVault 745N storage solution with a CERC SATA
1.5/6ch supports hot plug hard drives. Hot plug hard drives can be added and removed
without shutting down the system.

The CERC SATA 1.5/6ch supports RAID levels 0, 1, 5, 10, and simple volume configurations. It
also supports automatic failover, which allows the controller to automatically rebuild an array
when a failed drive is replaced with a new drive. This feature applies only to fault-tolerant arrays.
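
To make the trade-off between these RAID levels concrete, the following sketch computes textbook usable capacity for each level. It assumes all member drives are the same size and uses the conventional minimum drive counts; it does not model the controller's Optimized Disk Utilization feature for mixed drive sizes, and the function name is illustrative.

```python
def usable_capacity_gb(raid_level: str, drive_gb: float, drive_count: int) -> float:
    """Textbook usable capacity, assuming identical member drives."""
    if raid_level == "0":
        if drive_count < 2:
            raise ValueError("RAID 0 is normally built from at least 2 drives")
        return drive_gb * drive_count            # striping, no redundancy
    if raid_level == "1":
        if drive_count != 2:
            raise ValueError("RAID 1 mirrors exactly 2 drives")
        return drive_gb                          # one drive's worth of data
    if raid_level == "5":
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return drive_gb * (drive_count - 1)      # one drive's worth of parity
    if raid_level == "10":
        if drive_count < 4 or drive_count % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return drive_gb * drive_count / 2        # striped mirrors
    raise ValueError(f"unsupported RAID level: {raid_level}")


for level, count in [("0", 4), ("1", 2), ("5", 4), ("10", 4)]:
    print(level, usable_capacity_gb(level, 250, count))
```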

The CERC SATA 1.5/6ch card is offered with PowerEdge systems 700, 750, 800, 1800, 830,
850, and SC1420; PowerVault system 745N; and Precision Workstation systems 470 and 670.
Figure 1 is a product image of the CERC SATA 1.5/6ch card:

Figure 1: CERC SATA 1.5/6ch Product Image

For more controller specifications and supported operating systems, please refer to Appendix A in
this document.

CERC SATA 1.5/2s


The CERC SATA 1.5/2s supports two SATA disk drives and is an integrated software-based
RAID implementation. It can be a cost-effective alternative when the more advanced capabilities
of a hardware implementation are not needed. The CERC SATA 1.5/2s supports RAID levels 0
and 1 and up to two single configured drives. No other RAID types (for example, 5, 10, or 50) are
supported by the CERC SATA 1.5/2s. Supported systems include the PowerEdge SC420,
SC1420, SC1425, 800, and 850; and Precision Workstation systems 470 and 670. The CERC
SATA 1.5/2s does not support hot plugging on any of these systems.

For more controller specifications and supported operating systems, please refer to Appendix B in
this document.

Unsupported SATA Solutions


The CERC SATA 1.5/2s cannot coexist with the CERC SATA 1.5/6ch controller. If both are
enabled, there might be boot issues. The CERC SATA 1.5/2s must be disabled in the BIOS
(System Setup) when using the CERC SATA 1.5/6ch, or the CERC SATA 1.5/6ch must be
removed when using the CERC SATA 1.5/2s. With the CERC SATA 1.5/2s BIOS disabled, the
attached drives will need to be managed by the CERC SATA 1.5/6ch. Using multiple CERC SATA
1.5/6ch cards in a single system is also unsupported.

Migrating or upgrading from the CERC SATA 1.5/2s to the CERC SATA 1.5/6ch is not supported.
In addition, migrating the CERC SATA 1.5/2s from non-RAID mode (“RAID off”) to RAID mode
(“RAID on”) is also not supported.
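
The coexistence rules above can be expressed as a simple pre-deployment check. This is only a sketch: the function and its parameters are hypothetical, and the inputs would have to come from a manual inventory of the system, not from any Dell API.

```python
from typing import List


def check_controller_config(num_6ch_cards: int, onboard_2s_enabled: bool) -> List[str]:
    """Return a list of unsupported conditions for a planned configuration."""
    problems = []
    if num_6ch_cards > 1:
        problems.append(
            "Multiple CERC SATA 1.5/6ch cards in one system are unsupported."
        )
    if num_6ch_cards >= 1 and onboard_2s_enabled:
        problems.append(
            "CERC SATA 1.5/2s and 1.5/6ch cannot coexist: disable the 1.5/2s "
            "in System Setup or remove the 1.5/6ch."
        )
    return problems


for issue in check_controller_config(num_6ch_cards=1, onboard_2s_enabled=True):
    print(issue)
```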

SECTION 2: OVERVIEW OF STEPS ENSURING RAID BEST
PRACTICES

The following is an overview of the steps that can be taken to ensure RAID Best Practices.

Maintenance of Arrays
• Run regular consistency checks on the system.

• Perform all recommended driver, firmware, and Storage Management Application
updates.

• Monitor System Event Logs and Array Manager Event Logs.

• Establish Best Practices for Backup and Recovery of data.

• Ensure that properly qualified SATA cables are used and that they are not excessively
bent.

Recovery of Arrays
• Capture system logs and details surrounding array failures to assist in the recovery.

• Write down the exact steps or circumstances that caused the system to enter the failed
state.

Upgrading and Reconfiguring Arrays


• Before any Array Expansion operation, it is advisable to back up all critical data in
case the array reconstruction fails.

• Proper procedures should be followed when increasing array size depending on whether
an array expansion is done by increasing hard-drive size or by increasing the number of
drives.

SECTION 3: MAINTENANCE OF ARRAYS


Utilities and Applications Used for Array Maintenance
This section describes the main utilities and applications used to maintain arrays.

BIOS RAID Configuration Utility


The Adaptec BIOS RAID Configuration Utility is an embedded BIOS utility that includes the
following:

• CERC Array Configuration Utility – Used to create, configure, and manage arrays. Also
used to initialize logical drives and rescan hard drives.
• SATASelect – Used to change device and controller settings.
• Disk Utilities – Used to format or verify media.

Note: The CERC SATA 1.5/2s BIOS RAID Configuration Utility only consists of the Array
Configuration Utility and the Disk Utilities. It does not contain a SATASelect utility.

To run the utility, press <Ctrl><A> when prompted by the following message during system
startup: "Press <Ctrl><A> for BIOS RAID Configuration Utility".

CERC Array Configuration Utility


The CERC Array Configuration Utility enables the management, creation, and deletion of arrays.
It also supports hard-drive initialization and rescan, and hot-spare assignment. The Array
Configuration Utility can be used to create a bootable array for the system. It is recommended
that the system is configured to boot from an array instead of a single disk, in order to take
advantage of the redundancy and performance features of arrays.

Some key points to take note of:

• During array creation, there is an option available to enable read and write caching for the
array. When enabled (default setting), maximum performance is seen. However, there is a
potential for data loss or corruption during a power failure. Caching should be enabled to
optimize performance, unless the user data is highly sensitive, or the user’s application
performs completely random reads.

• During array creation, 3 options will be provided – Build, Clear and Quick Init. The Build
operation is a background initialization of a redundant array. The array is accessible
throughout. The Clear operation is a foreground initialization of a fault-tolerant array and
zeros out all blocks of the array. The array is not accessible until the clear task is complete.
With the Quick Init operation, an array is available immediately with no on-going background
controller activity. For a RAID 5, write performance is impacted until a Verify with Fix is run on
the array.

• When deleting an array, a backup of the data on the array should be done. Deleted arrays
cannot be restored and all the data on the array will be lost.

• The CERC SATA 1.5/6ch has a Disk Initialization option, which overwrites the partition table
on the disk and makes any data on the disk inaccessible. If the drive is used in an array, the
array may not be able to be used again. A drive that is part of a boot array should not be
initialized. (The boot array is the lowest numbered array – normally 00)

• The CERC SATA 1.5/2s has a Configure Drives option. If an installed disk does not appear in
the disk selection list for creating a new array or if it appears grayed out, it will need to be
configured before it can be used as part of an array. If a drive is configured, but not made
part of a RAID 0 or a RAID 1, it will function as a simple volume. Configuring a single drive
overwrites the partition table on the disk and makes any data on the disk inaccessible. If the
drive is used in an array, the array may not be able to be used again.

SATA Select Utility


The SATA Select utility allows the device and controller settings to be changed without opening
the system or handling the card. With this utility, the Channel Interface Definitions and Device
Configuration Options can be modified.

Disk Utilities
With the disk utilities, a low-level format or a verify operation of the hard disks can be done via the
Format Disk or Verify Disk Media options. Format Disk is a low-level format of the hard drive that
writes zeros to the entire disk. SATA drives are formatted at the factory and do not need to be
reformatted. Formatting destroys all the data on the drive. It is recommended that a fully tested
backup of all the data that is to be recovered is available before performing the Format Disk
option. Verify Disk Media scans the media of a disk drive for defects and remaps any recoverable
defects.

BIOS Event Logs


The BIOS-based event log stores all firmware events (configuration changes, array creation, boot
activity etc.). The event log has a fixed size, and once it is full, old events are flushed as new
events are stored. The log is also volatile and hence cleared after each system reboot. The BIOS
event logs are only available for the CERC SATA 1.5/6ch and not the CERC SATA 1.5/2s.
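
The fixed-size, overwrite-oldest behavior described above is that of a ring buffer. The sketch below illustrates the idea only; the class name and the capacity of 3 are hypothetical, and the controller's real log size and record format are not documented here.

```python
from collections import deque


class FixedSizeEventLog:
    """In-memory log that drops the oldest entry once capacity is reached."""

    def __init__(self, capacity: int):
        # deque(maxlen=...) discards items from the opposite end when full.
        self._events = deque(maxlen=capacity)

    def record(self, message: str) -> None:
        self._events.append(message)

    def entries(self) -> list:
        return list(self._events)


log = FixedSizeEventLog(capacity=3)
for event in ["array created", "config changed", "boot", "rebuild started"]:
    log.record(event)

# The oldest event ("array created") has been flushed.
print(log.entries())
```

Because the real log is also volatile, any events needed for troubleshooting should be captured before the system is rebooted.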

Definition of Drive and Array Status as Reported in the Controller BIOS

Drive Status
Optimal - An array member disk with this status is in the optimal state and there are no errors
detected. For drives that do not belong to an array, this status indicates that they are ready for
use.

Rebuilding - A drive with this status is currently in the rebuilding process.

Unable to Access Drive - A drive obtains this status when the controller card is unable to detect
any physical connection between the controller and the hard drive. This could occur due to
hardware errors on the drive, loose connections between the controller port and the hard drive, or
accidental unseating of the drive.

Missing member - A drive obtains this status from the previous Unable to Access Drive status
after a rescan or a system reboot is done, during which the bus is rescanned and the
configuration is updated to reflect the missing drive.

Grayed out - A drive with this status is one that used to be part of a logical array, and is
recognized as a previous member of that array, but is not currently incorporated as a member of
the degraded or failed array.

Array Status
Optimal - An array with this status is optimal and ready for use.

Degraded - An array with this status is no longer fault tolerant.

Building/Verifying - An array with this state is currently building the mirror for a RAID 1 array or
calculating the parity for a RAID 5 array.

Rebuilding - An array with this status is currently rebuilding.

Failed - An array assumes the failed status when two or more hard drives fail and the data is lost.

Impacted - An array obtains this status when its performance becomes impacted. This could
happen when:

• The two mirrors of a RAID 1 are not identical.
• There are parity inconsistencies in a RAID 5 array.
• A building (scrubbing) process is aborted before the array becomes optimal.
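
As a simplified illustration of the Optimal, Degraded, and Failed definitions, the sketch below derives an array status from the number of failed member drives. It covers RAID 1 and RAID 5 only, where a single drive failure removes fault tolerance and a second failure loses data; RAID 10 is excluded because the outcome depends on which drives fail. The function name is hypothetical, and the Rebuilding, Building/Verifying, and Impacted states are not modeled.

```python
def redundant_array_status(raid_level: str, failed_drives: int) -> str:
    """Status of a RAID 1 or RAID 5 array given the count of failed members."""
    if raid_level not in ("1", "5"):
        raise ValueError("this sketch covers RAID 1 and RAID 5 only")
    if failed_drives < 0:
        raise ValueError("failed_drives cannot be negative")
    if failed_drives == 0:
        return "Optimal"
    if failed_drives == 1:
        return "Degraded"   # still readable, but no longer fault tolerant
    return "Failed"         # two or more drives lost: data is lost


for failures in range(3):
    print(failures, redundant_array_status("5", failures))
```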

DOS Flash Utility


The DOS Flash Utility (applicable to the CERC SATA 1.5/6ch only) is used to update the flash
EEPROM components on one or more RAID controllers. The utility can also be used to verify a
controller's current flash contents against the flash images in a specified file or to save a
controller's current flash contents to a file.

The CERC SATA 1.5/6ch controller uses nonvolatile flash to store on-board software, such as
BIOS, microprocessor kernel, and monitor. Whenever it becomes necessary to update any of
those components, you can update your controller's flash components using this utility. The utility
updates the controller's flash by reading flash image data from a supplied User Flash Image (UFI)
file and writing it to the controller's flash components. A UFI file contains all of a controller's flash
images, as well as information about each image. It also includes general controller information,
such as controller type, to ensure that the utility uses the correct UFI file when updating the
controller's flash.

The utility performs the following primary functions:

• Update - Updates all the flash components on a controller with the flash image data from
a UFI file.
• Save - Reads the contents of a controller's flash components and saves the data to a UFI
file. This enables you to later restore a controller's flash to its previous contents should
the need arise.
• Verify - Reads the contents of a controller's flash components and compares them to the
contents of the specified flash image file.
• Version - Displays version information about a controller's flash components.
• List - Lists all the supported controllers detected in your system.

RAID Storage Manager


The RAID Storage Manager (RSM) is a storage management solution used in SC-class servers
and Dell Precision™ workstations. It is an Adaptec utility used to manage only Adaptec-based
controllers and can be run under Windows or Linux.

The following options can be configured in RSM:

• Spanned volumes and RAID volumes
• Read and Write caching
• Array capacity
• Stripe size
• Array initialization settings
• Array rebuild rate

The array options include Creating, Migrating, Deleting, Rebuilding and Verifying an Array, and
Preparing an array for Windows.

Some key points to take note of:

• When performing a RAID level migration, interrupting this process may result in data
loss. Partitioning or formatting the new array will result in complete data loss.
• Deleting an array destroys all data on the array. Deleting an array in which the
operating system resides will destroy the operating system and the system will no
longer boot. RSM will not allow the deletion of an array in which the operating
system resides. The partition must first be deleted, or the array will need to be deleted
from the controller BIOS.
• If multiple drives fail in separate disk groups, replace each defunct drive. If multiple
physical drives fail simultaneously within the same disk group, contact your service
representative.

OpenManage Storage Management


The Dell™ OpenManage™ Storage Management (OMSM) provides storage management
information in an integrated graphical view. Storage Management provides RAID storage
management that is integrated with Server Administrator.

OMSM has drop-down menus and wizards for executing storage management and configuration
tasks, such as the following:

• Create Virtual Disk
• Reconfigure Virtual Disk
• Maintain Integrity of Redundant Virtual Disks
• Assign Hot Spares
• Rebuild a Failed Array Disk
• Restore Dead Segments

With OMSM, the Rebuild rate, Background Initialization rate, and the Check Consistency rate,
can all be set. Foreign configurations can also be imported.

OpenManage Array Manager


Array Manager (AM) is a storage management application that allows the configuration and
management of local and remote storage attached to a server while the server is online and
continuing to process requests. AM retrieves information about storage devices attached to a
server, including controllers and array disks, and information on the storage system’s logical
components, such as virtual disks and volumes. AM consists of the Console (Client), Managed
System (Server), and Array Manager Utilities and can be used to create, configure, reconfigure,
format, and delete virtual disks, check consistency, and assign hot spares. AM is supported on both
Microsoft® Windows® and Novell® Netware®. AM is not supported on Red Hat® Linux®
operating systems. Please note that AM versions 3.5 and later have added support for the CERC
SATA 1.5/6ch controller, and AM versions 3.6 and later have added support for the CERC SATA
1.5/2s controller.

Versions of Dell OpenManage Server Administrator (OMSA) previous to version 4.4 used AM as
the RAID management utility. OMSA versions 4.4 and later use OMSM.
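
The version thresholds above can be captured in a small helper. This is a sketch with hypothetical function names; it assumes version strings are dot-separated integers such as "4.4", and it compares them as integer tuples so that, for example, 4.10 sorts after 4.4.

```python
def parse_version(text: str) -> tuple:
    """Turn '4.10' into (4, 10) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def storage_tool_for_omsa(omsa_version: str) -> str:
    """OMSA 4.4 and later use OMSM; earlier versions use Array Manager."""
    return "OMSM" if parse_version(omsa_version) >= (4, 4) else "AM"


# Minimum Array Manager version that supports each controller.
AM_MINIMUM = {
    "CERC SATA 1.5/6ch": (3, 5),
    "CERC SATA 1.5/2s": (3, 6),
}


def am_supports(controller: str, am_version: str) -> bool:
    return parse_version(am_version) >= AM_MINIMUM[controller]


print(storage_tool_for_omsa("4.3"), storage_tool_for_omsa("4.10"))
print(am_supports("CERC SATA 1.5/2s", "3.5"))
```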

Note: The functionality of the Storage Management applications is limited on the CERC SATA
1.5/2s. No reconfiguration of the subsystem can be done. However, the applications are still
useful for obtaining array status, starting consistency check, and forcing rebuilds if they do not
start automatically.

Table 1 is a feature comparison chart of RAID Storage Manager, Array Manager, and OpenManage
Storage Management (OMSM):

Feature                  RAID Storage Manager   Array Manager   OMSM
Remote RAID Management   No                     Yes             Yes
Alarm Functionality      Yes                    Yes             Yes
Adaptec Support          Yes                    Yes             Yes
AMI/LSI Support          No                     Yes             Yes
Force Online Option      No                     Yes             Yes
Automatic Rebuild        Yes                    Yes             Yes

Table 1: Feature Comparison Chart

RAID Support in Linux


For systems supporting the CERC SATA 1.5/6ch and Linux OS, in addition to RSM or OMSM, the
Command Line Interface (CLI) can be used to manage controller components. CLI commands
can enable test automation or array creation in a production environment using Linux shell
scripts.

For more information on RAID support in Linux, please refer to the CERC SATA 1.5/6ch or CERC
SATA 1.5/2s user guides found on support.dell.com.

Understanding Drive and Array Status

Tables 2 and 3 show the various possible Array and Drive status for the CERC SATA 1.5/6ch
across the three main Storage Management utilities.

BIOS RAID Utility    AM                          OMSM                RSM
Optimal              Ready                       Ready               Optimal
Degraded             Failed Redundancy           Degraded            Degraded
Building/Verifying   Resynching, Not Redundant   Resynching          Verifying
Rebuilding           Rebuilding                  Regenerating        Rebuilding
Failed               Failed                      Failed              Failed
Impacted             Failed Redundancy           Failed Redundancy   Impacted

Table 2: CERC SATA 1.5/6ch Array Status
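
When correlating reports from different utilities, Table 2 can be encoded as a lookup. The sketch below is one way to do that; the dictionary and function names are illustrative, and the column assignments reflect a reading of Table 2 and should be checked against the utilities themselves. The reverse lookup shows that some labels are ambiguous: Array Manager reports "Failed Redundancy" for more than one BIOS status.

```python
# BIOS array status -> label shown by each storage management utility (Table 2).
ARRAY_STATUS_6CH = {
    "Optimal": {"AM": "Ready", "OMSM": "Ready", "RSM": "Optimal"},
    "Degraded": {"AM": "Failed Redundancy", "OMSM": "Degraded", "RSM": "Degraded"},
    "Building/Verifying": {
        "AM": "Resynching, Not Redundant",
        "OMSM": "Resynching",
        "RSM": "Verifying",
    },
    "Rebuilding": {"AM": "Rebuilding", "OMSM": "Regenerating", "RSM": "Rebuilding"},
    "Failed": {"AM": "Failed", "OMSM": "Failed", "RSM": "Failed"},
    "Impacted": {
        "AM": "Failed Redundancy",
        "OMSM": "Failed Redundancy",
        "RSM": "Impacted",
    },
}


def bios_statuses_for(utility: str, label: str) -> list:
    """Return every BIOS status that the given utility reports as `label`."""
    return [
        bios for bios, labels in ARRAY_STATUS_6CH.items() if labels[utility] == label
    ]


print(ARRAY_STATUS_6CH["Rebuilding"]["OMSM"])
print(bios_statuses_for("AM", "Failed Redundancy"))
```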

BIOS RAID Utility                                   AM                   OMSM                 RSM
Unable to Access Drive                              Offline              Offline              (Drive Disappears)
Missing Member                                      (Drive Disappears)   (Drive Disappears)   (Drive Disappears)
(Grayed Out) – Degraded                             Degraded             Offline              Optimal
Drive is displayed as part of an array (Whitened)   Ready                Online               Optimal
(Grayed Out) – Rebuilding                           Ready                Online               Rebuilding

Table 3: CERC SATA 1.5/6ch Drive Status

Similarly, Tables 4 and 5 show the various possible Array and Drive status for the CERC SATA
1.5/2s across the three main Storage Management utilities.

BIOS RAID Utility          AM                                OMSM                              RSM
Optimal                    Ready                             Ready                             Optimal
Degraded                   Failed Redundancy                 Failed Redundancy                 Degraded
Building (Initial Build)   Resynching, Not Redundant         Resynching                        Verifying
Building (Rebuild)         Rebuilding                        Regenerating                      Rebuilding
Failed                     Failed (BSOD if OS Array fails)   Failed (BSOD if OS Array fails)   Failed (BSOD if OS Array fails)

Table 4: CERC SATA 1.5/2s Array Status

BIOS RAID Utility                                   AM                   OMSM                 RSM
Unable to Access Drive                              Offline              Offline              (Drive Disappears)
Missing Member                                      (Drive Disappears)   (Drive Disappears)   (Drive Disappears)
(Grayed Out) – Degraded                             Degraded             Offline              Optimal
Drive is displayed as part of an array (Whitened)   Ready                Online               Optimal
(Grayed Out) – Rebuilding                           Ready                Online               Rebuilding

Table 5: CERC SATA 1.5/2s Drive Status

Consistency Check of RAID Arrays – CERC SATA 1.5/6ch
RAID arrays are used mainly to protect critical data through redundancy, either in the form of
parity calculations or simple mirroring. Hard drive media defect rates have improved over time,
even as drive sizes continue to increase. Hard drives, however, are not expected to be completely
flawless, and normal wear on a drive may lead to an increase in media or “grown” defects over
time. These bad blocks need to be remapped to another location on the drive. If a bad block
is detected during a normal write operation, the controller marks that block as bad and the
block is added to the “grown defects list” in the drive’s NVRAM. The write operation is
considered incomplete until the data has been written successfully to a remapped location. If a
bad block is detected during a normal read operation, the controller reconstructs the missing
data from the array's redundancy and remaps it to a new location.
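
For a parity-based array, the reconstruction step relies on the XOR property of RAID 5 parity: the missing block equals the XOR of the surviving data blocks and the parity block in the same stripe. The sketch below demonstrates this with tiny byte strings. It illustrates generic RAID 5 parity only, not the controller's firmware or on-disk layout, and all names and data values are made up for the example. On a RAID 1 array the data would instead be copied from the mirror.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe across three data drives, plus its parity block.
data = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
parity = xor_blocks(data)

# Suppose the block on the second drive is unreadable: rebuild it from the rest.
recovered = xor_blocks([data[0], data[2], parity])
print(recovered == data[1])
```

The same property explains why a second unreadable block in the same stripe cannot be recovered: there is then more than one unknown in a single XOR equation.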

A double fault scenario is one in which the controller detects a bad block on a drive in a RAID
array and then detects a second bad block on another drive in the same data stripe. This
scenario can also occur when rebuilding a degraded logical drive, when the controller encounters
a bad block on a good drive in the array. This will lead to a rebuild failure and potential data loss.
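
The double fault condition can be stated precisely: a stripe is unrecoverable when two or more member drives have an unreadable block in that stripe. The following sketch finds such stripes from a per-drive map of bad stripe numbers. The data structure, drive names, and stripe numbers are hypothetical; real controllers track defects internally and do not expose them in this form.

```python
def double_fault_stripes(bad_stripes_by_drive: dict) -> list:
    """Return stripe numbers that are unreadable on two or more drives."""
    hits = {}
    for stripes in bad_stripes_by_drive.values():
        for stripe in stripes:
            hits[stripe] = hits.get(stripe, 0) + 1
    return sorted(stripe for stripe, count in hits.items() if count >= 2)


# Normal operation: isolated bad blocks on different stripes are recoverable.
healthy = {"drive0": {7}, "drive1": {42}, "drive2": set()}
print(double_fault_stripes(healthy))   # []

# Degraded array of 100 stripes: drive1 has failed, so every stripe on it is
# unreadable, and any bad block on a surviving drive becomes a double fault.
degraded = {"drive0": {7}, "drive1": set(range(100)), "drive2": set()}
print(double_fault_stripes(degraded))  # [7]
```

This is why latent bad blocks on otherwise healthy drives matter most during a rebuild.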

For the CERC SATA 1.5/6ch, two types of consistency checks are offered: the consistency
check and the background consistency check. The consistency check (CC) is used to restore
consistency for redundant arrays after unexpected events, such as a power loss. For RAID 5
based arrays, it recalculates and restores parity if needed. For RAID 1 based arrays, it
restores the mirror. If media errors are encountered, data recovery is initiated as in a
background consistency check (BCC). A CC can be initiated from any of the storage management
applications: RSM, AM, or OMSM. Regular consistency checks reduce the risk of double fault
scenarios. To avoid downtime and to ensure data integrity, it is recommended that consistency
checks be included as part of routine maintenance of all RAID systems. For more information on
scheduling consistency checks in Windows, refer to Appendix C. To start a consistency check
in each of the array management utilities, perform the following steps:

OpenManage Array Manager:

1. Open the Array Manager Console.
2. Under Arrays -> PERC Subsystem -> CERC SATA 1.5/6ch Controller, right-click on the
required virtual disk.
3. Click on Check Consistency to enable the consistency check.
4. To view the event logs, click on the Events tab.

Dell OpenManage Storage Management:

1. Open Dell OpenManage Storage Management.
2. Under System -> Storage -> CERC SATA 1.5/6ch, click on Virtual Disks to view the required
array.
3. Under Tasks, scroll down and select Check Consistency. Click on Execute to enable the
consistency check.

RAID Storage Manager:

1. Open RAID Storage Manager.
2. Click on RAID Controller CERC SATA 1.5/6ch.
3. Under Actions, click on Enable background consistency check.

Background Consistency Check of RAID Arrays – CERC SATA 1.5/6ch
Background consistency check is a method used by the CERC SATA 1.5/6ch controller to detect
hard drive media errors and recover data. It can be enabled in the controller BIOS to run in the
background while other processes are going on, in order to proactively and efficiently detect and
fix media errors. When a hard drive media error is detected, it proceeds to recover the lost data
by regenerating the right data from peer disks and relocating the generated data. Background
consistency check will only run on redundant arrays.

The Background Consistency Check feature was implemented in the CERC SATA 1.5/6ch
firmware version 4.1.0.7417. It is disabled by default and will need to be manually enabled in the
controller BIOS in order to implement it. Typical performance impact when this feature is enabled
is about 1-4%. The worst-case scenario can be approximately 10% for Random Writes.
Performance numbers may also vary depending on the configuration.

To enable/disable Background Consistency Check in the CERC SATA 1.5/6ch BIOS:

1. Press <Ctrl><A> to enter the Adaptec RAID Configuration Utility.
2. From the main Options menu, choose SATASelect Utility.
3. On the next Options menu, choose Controller Configuration.
4. Scroll down to Array Background Consistency Check. Press <Enter> to select option
and choose either Disabled or Enabled.
5. Save changes made upon exit.

Consistency Check of RAID Arrays – CERC SATA 1.5/2s


For the CERC SATA 1.5/2s, a consistency check can be done in the same way as for the CERC SATA
1.5/6ch. However, there is no background consistency check option available for the CERC SATA
1.5/2s. In the CERC SATA 1.5/2s BIOS, there is a Verify command that is similar to a consistency
check. If a data mismatch is found during a build of a RAID array, an option to verify the drives
will be available. This Verify option is available only if the array is optimal. If the array has
failed, it will have to be rebuilt.

To verify the drives, press <Ctrl><S>. A prompt asks whether the utility should
automatically fix any errors. When the verification is complete, a verification complete message
appears.

Another key point to note is that the Verify command cannot be performed on the CERC SATA
1.5/2s while another operation is queued, such as rebuild or initialization. If the Verify command is
run while another activity is in progress, the system will return to the Manage Arrays section
without completing the verify process.

Scheduling Consistency Checks

It is recommended that consistency checks on each RAID logical volume be performed at least
once a month. Regular checks increase the chance of detecting media defects (bad blocks), remapping
them, and recalculating the parity on the data stripes. They also reduce the probability of
encountering a double fault scenario during a rebuild and the inconvenient downtime that results.

Please refer to the Appendix for instructions on how to set up automated scheduling of
consistency checks on Windows Systems.
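
The monthly recommendation above can be tracked with a small script. The following Python sketch is illustrative only: the volume names, the 30-day reading of "once a month", and the function name are assumptions, and it does not call any Dell utility. It simply reports which volumes are due for a check.

```python
from datetime import date, timedelta

# Assumption: "at least once a month" is interpreted here as 30 days.
CHECK_INTERVAL = timedelta(days=30)


def overdue_volumes(last_checked, today):
    """Return the names of volumes whose last consistency check is too old.

    last_checked maps a volume name (hypothetical) to the date of its last
    completed check, or to None if the volume has never been checked.
    """
    overdue = []
    for name, checked_on in sorted(last_checked.items()):
        if checked_on is None or today - checked_on > CHECK_INTERVAL:
            overdue.append(name)
    return overdue


if __name__ == "__main__":
    history = {"vd0": date(2007, 1, 1), "vd1": date(2007, 2, 20), "vd2": None}
    print(overdue_volumes(history, date(2007, 3, 1)))
```

The dates would normally come from the storage management event log; how they are collected is outside the scope of this sketch.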

Upgrading Firmware, Drivers and Storage Management Utilities Concurrently to the Latest Versions
The latest RAID controller firmware and driver for both the CERC SATA 1.5/6ch and CERC SATA
1.5/2s can be found on support.dell.com. This upgrade will ensure maximum performance,
reliability, and functionality of the RAID controllers. Upgrading the firmware and drivers along with
the latest versions of the Storage Management utilities will ensure correct functionality at all levels
and availability of all features. It is recommended that the driver be updated before
the firmware.

Monitoring System Event Logs and Storage Management Utility


Event Logs
System event logs inform the user about events that may affect the physical security and
availability of their data. With a Windows based
OS and with Array Manager installed, the help file of the application can be reviewed to get a
complete list of the event types. The system event logs should be checked regularly for any
warnings or error messages.

All storage management applications’ event logs should also be monitored regularly for any
media errors, including corrected media errors. Corrected media errors are normal, but an
excessive number of such errors within a short period of time may be indicative of a drive that will
need to be proactively replaced during a maintenance cycle. These event logs will also be
available from any Novell Netware server with the Windows console. These events will be
displayed on the Array Manager Event logs, as shown in Figure 2.

Figure 2: Array Manager Event Log
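
The guide does not define what an "excessive number" of corrected media errors is. The Python sketch below shows one way such a rule could be applied to exported log entries. The 24-hour window, the threshold of 5, and the (timestamp, drive ID) input format are all assumptions chosen for illustration and are not Dell guidance.

```python
from datetime import datetime, timedelta


def drives_to_review(events, window=timedelta(hours=24), threshold=5):
    """Flag drives with more than `threshold` corrected media errors in any `window`.

    events is an iterable of (timestamp, drive_id) pairs taken from an event log.
    The default window and threshold are illustrative values only.
    """
    by_drive = {}
    for stamp, drive in events:
        by_drive.setdefault(drive, []).append(stamp)

    flagged = []
    for drive, stamps in sorted(by_drive.items()):
        stamps.sort()
        start = 0
        for end, stamp in enumerate(stamps):
            # Advance the window start until the span is no longer than `window`.
            while stamp - stamps[start] > window:
                start += 1
            if end - start + 1 > threshold:
                flagged.append(drive)
                break
    return flagged


if __name__ == "__main__":
    base = datetime(2007, 3, 1)
    log = [(base + timedelta(hours=i), "0:01") for i in range(6)]
    log += [(base + timedelta(days=i), "0:02") for i in range(6)]
    print(drives_to_review(log))
```

A drive flagged this way is only a candidate for review; the decision to replace it should still follow the drive diagnostics.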

The BIOS-based event logs can also be monitored. The BIOS-based event logs store all firmware
events like configuration changes, array creation, boot activity, and so on. This event log has a
fixed size and once full, older events are flushed as newer events are populated. This log is also
volatile, and it is cleared with each system restart.
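
The fixed-size behavior described above, in which the oldest events are discarded as new ones arrive, is that of a ring buffer. The Python sketch below is only a model of that behavior; the class name and the capacity are hypothetical, and the controller's actual log size is not stated in this guide.

```python
from collections import deque


class FixedSizeEventLog:
    """In-memory log that keeps only the newest `capacity` events."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest entry when a new one is
        # appended to a full buffer.
        self._events = deque(maxlen=capacity)

    def record(self, message):
        self._events.append(message)

    def entries(self):
        # Oldest surviving event first, newest last.
        return list(self._events)


if __name__ == "__main__":
    log = FixedSizeEventLog(capacity=3)
    for message in ["boot", "array created", "config change", "drive inserted"]:
        log.record(message)
    print(log.entries())
```

Because the real log is also cleared at restart, events of interest should be captured before the system is rebooted.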

To access the event log:

1. Press <Ctrl><A> to access the BIOS when prompted.

2. From the BIOS RAID Configuration utility menu, press <Ctrl><P>.
The Controller Service menu appears.
3. Select Controller Log Information, and then press <Enter>.
The current log is displayed.

Backup and Recovery of Data


It is highly recommended that a comprehensive backup and recovery strategy be implemented in
order to protect all data. This recovery strategy should be reviewed and tested regularly in order
to ensure that it will be suitable as well as efficient. During a backup process, there may be a
reduction in the normal system performance due to the increased workload.

Hotspare Assignments
A hot spare is a drive that is reserved to replace a failed drive in a redundant array. In the event of
a drive failure, the hot spare replaces the failed drive and the array is rebuilt automatically. Before
becoming an array member as a result of a failure, a hot spare can be unassigned using a
management utility.

Note: For the rebuild to complete successfully, a hot spare must be of the same size or larger
than the smallest drive in an array.

For the CERC SATA 1.5/2s, there is an Add/Delete Hotspares option in the controller BIOS.
However, the CERC SATA 1.5/2s supports only two-drive configurations, so no hot spare can be
assigned while the array is in an optimal state. This option can be used only on a
degraded array, when a rebuild fails to start.

The CERC SATA 1.5/6ch supports two types of hot spares:

• Global: Protects any array that the spare drive has sufficient capacity to protect
• Dedicated: Protects only the array to which it has been assigned

Global Hot Spares


When a drive in an array fails, a global hot spare with enough capacity is automatically used to
store the data contained on the failed drive. The system’s behavior after a failure depends on the
size of the spare relative to the drive it is replacing.
• If the global hot spare is larger than the drive it is replacing by 100MB or more, the
spare will replace the failed drive, but still remain as a global hot spare. The unused
portion will be available for use in the event of future failures.
• If the global hot spare is the same size as the drive it is replacing, or less than
100MB larger, it becomes a member of the array with the failed drive and will no longer
be marked as a global hot spare.
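
The two rules above can be summarized as a single decision. The Python sketch below is an illustration of those rules only; the function name and the returned strings are hypothetical, and sizes are assumed to be given in megabytes.

```python
def global_spare_outcome(spare_mb, failed_mb, margin_mb=100):
    """Describe what happens to a global hot spare that replaces a failed drive.

    Sizes are in megabytes. The 100MB margin is taken from the text; the
    return strings are illustrative labels.
    """
    if spare_mb < failed_mb:
        # A spare smaller than the drive it replaces cannot protect the array.
        return "too small"
    if spare_mb - failed_mb >= margin_mb:
        return "replaces drive, remains global hot spare"
    return "replaces drive, no longer a hot spare"


if __name__ == "__main__":
    print(global_spare_outcome(250000, 80000))
    print(global_spare_outcome(80050, 80000))
```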

Note: For a RAID 10 array, the system can use the same global hot spare to replace two
failed drives in the same array if the global hot spare is at least twice the size of the failed
drives in the array. This is not recommended because redundancy will be affected. When
assigning a global hotspare in a system with a RAID 10 array, a spare which is the same size
as the members of the array should be used.

Dedicated Hot Spares

When a drive in an array containing a dedicated hot spare fails, the spare is automatically used to
store the data contained on the failed drive if the spare has enough capacity. The spare becomes
a member of the array and will no longer be identified as a hot spare. If the spare is larger than
the drive it is replacing, the extra portion will remain unused.

Automatic Failover
The automatic failover feature allows the controller to automatically rebuild an array when a failed
drive is replaced with a new drive. This feature applies only to fault tolerant arrays. In the CERC
SATA 1.5/6ch controller BIOS, to ensure that automatic failover is enabled, the following steps
can be performed:

1. At the BIOS screen, press the <Ctrl> + <A> keys together when prompted to enter the
Adaptec RAID Configuration Utility.

2. From the Options menu, select SATASelect Utility. Then select Controller
Configuration. Verify that Automatic Failover is enabled. If it is disabled, press <Enter>
to select the Enabled option. Press <Esc> to exit and choose Yes to save changes
made.

Cabling Practices
The following are some general cabling best practices that should be followed:
• Ensure that properly qualified cables are being used
• Ensure that the SATA cables are properly connected to the controller or SATA ports and
SATA hard drives and that there are no loose connections
• Ensure that the cables are not excessively bent
• Ensure that the cable lengths are appropriate for installation
• Examine the cables for cuts or exposed shielding

SECTION 4: RECOVERY OF ARRAYS


To avoid loss of data integrity or to aid in the recovery of lost arrays, perform the following simple
steps.

Capture System Logs and Details Surrounding Array Failures to Assist in the Recovery
In the Windows® OS environment, the use of the Dell™ Server E-Support Tool (DSET) is
recommended. This tool will capture all the system description and configuration data needed in a
debug or recovery effort. DSET is a small, non-intrusive tool that does not require a reboot of the
system to provide basic functionality. Immediately after installation, DSET can collect information
about Windows® drivers, services, network settings, etc. It will also collect basic information
about the system's storage such as active drives and RAID containers. DSET will also collect
extended hardware information such as processors, memory, PCI cards, ESM log, BIOS/firmware
versions and system health (fan/voltage levels). If Array Manager 2.5 or later is installed, DSET
will gather Dell-specific storage information such as CERC and PERC controllers and their
firmware, array/containers, logical disk signatures, enclosures and physical hard drives installed.
DSET always collects Windows NT® 2000 information such as services, tasks, drivers, and
events logs. The ESM Log of supported systems can be cleared so that the amber hardware
warning light on Dell PowerEdge™ systems can be properly reset once an event has been
remedied.

Note: The latest version of DSET can be found at ftp://dropbox.us.dell.com/dropbox2/ips/DSET/

For the Linux operating system, the “Getconfig” tool can be used to retrieve issue information
when possible. Getconfig is a utility that pulls the hardware configuration data from various
sources on a Linux system. Retrieving the system data from a Linux system manually is very labor
intensive, and this tool fully automates the process.

Write Down the Circumstances or the Exact Steps Performed Preceding the Failure
As far as possible, all the steps that occurred prior to the failure, including any changes
made to the operating system, system software or hardware, or the CERC driver or firmware,
should be recorded. The ability to backtrack and understand the activities that may have
contributed to a failure will assist in the attempt to recover a failed array and could also help
correct the conditions that contributed to the failure.

Understand Possible Causes of Drive Array Failure


While the top priority during an array failure is to bring the system back online, it is also important
to determine the root cause of the problem. Failure to determine a root cause may lead to further
outages and data loss.

Cables – All cables should be approved cables for the particular application. Pins should be
inspected and all external and internal SATA cables attached to the system should be reseated. If
any damage is identified on the cable pins, the female connection should be inspected as well for
resultant damage.

Power – Each system in the rack should be protected by an approved Uninterruptible Power
Supply (UPS) and it should be verified that each server has the proper amount of power. Low
voltage or power spikes can knock an array offline.

Firmware and Driver – A mismatch of firmware and driver could result in random controller
lockups, hangs or BSODs. The system should be at the current approved level. The latest
firmware and driver versions can be found on support.dell.com. When performing a firmware and
a driver update, it is recommended that the driver be updated before the firmware. The driver is
always backward compatible with older firmware. A new firmware with an older driver will usually
result in abnormal behavior.

Defective Hard Drive – In some cases, a drive can cause noise on the SATA bus and knock an
array offline. If this is the case, the diagnostics should fail the drive, unless the drive is producing
sufficient noise to render the bus inoperable. System logs should be reviewed for drive reported
errors. Drives can also be reseated to ensure good connections.

Common BIOS Messages


The following are the common CERC SATA 1.5/6ch and CERC SATA 1.5/2s messages that
appear in the controller BIOS:

The following message indicates Port Roaming:

The following drives have moved to different Ch:Id:Lun
0:3:0 0:4:0

The following message is seen when the firmware detects that a new drive has been inserted
since the last boot:

New devices detected at the following SATA Ports:
Port#4

The following message appears when an array is in a degraded state or rebuilding:

The following Arrays have Missing or Rebuilding or Failed members and are degraded:
Array#4

The following message appears when an array is missing more than one drive and has failed:

The Following Arrays have missing required members and cannot be configured:
Array#4

The following message appears when there is a SMART error:

Port# Vendor Product Info Rev# SMART Error
----------------------------------------------------------------------------------------
0 ST340014 AS 8.05 Y

The following message appears when the firmware has a problem with one of the attached drives.
Either the firmware cannot prepare the driver during boot up or a firmware kernel crash has
occurred:

“Fatal Error: Controller monitor failed. Controller not started. Press any key to continue.”

Simple Troubleshooting Steps When a Failure Is Discovered


The following simple actions and troubleshooting steps can be performed before calling technical
support for assistance. These steps will also assist the technical support representative.

1. Capture controller logs.
2. Record RAID configuration.
3. If the array is degraded or failed, note the offline or grayed out drive IDs.
4. Shut down the system and disconnect/reconnect all SATA power and data cables.
5. Check cable ends for bent pins/problems.
6. Move the CERC SATA 1.5/6ch controller, if present, to another PCI slot.
7. Boot system and check the array and drive status after performing a rescan operation.

The CERC SATA 1.5/6ch and CERC SATA 1.5/2s user guides available on support.dell.com can
be used to find more troubleshooting tips and guidelines.

Recovering from Arrays in a Degraded State


An array is considered to be in a degraded state when it loses its redundancy. This can be
caused by a drive dropping offline, failing, or by a drive becoming degraded (still online but
grayed out). When an array is in a degraded state, a rebuild can be attempted.

For the CERC SATA 1.5/6ch, the Automatic Failover feature allows the controller to automatically
rebuild an array when a failed drive is replaced. This feature only applies to fault-tolerant arrays.
This feature is enabled by default. If disabled, the array will need to be rebuilt manually. The drive
replacement should be done when the system is turned off (except in the case of the PV745N,
which supports hot plugging). A hot spare is a disk that is not used in data storage, but is
reserved for use as a replacement for one of the other drives in the array in the event of a failure.

If the rebuild does not start automatically, storage management utilities like Array
Manager or OMSM can be used to perform a rescan and trigger the rebuild. If the rebuild still
does not start, a manual rebuild should be attempted. In the controller BIOS, <CTRL><S>
can be used to assign the newly inserted drive as a dedicated hot spare, or <CTRL><G>
can be used to assign it as a global hot spare, for the array to be rebuilt. Once the hot spare is
assigned, the rebuild should start. Storage management utilities can also be used to assign a
drive as a hot spare. If an error message is displayed while assigning the hot spare, initialize the
newly inserted hard drive first to erase the old configuration data from its previous use.

Note: Initializing a hard drive will destroy all the data on that drive.

Recovering from Arrays in a Failed State


Note: The following recovery methods can be used to recover data, if possible.

CERC SATA 1.5/6ch Array Restoration - <CTRL><R> Enable/Restore RAID
<CTRL><R> is a feature present in the CERC SATA 1.5/6ch BIOS RAID configuration utility.
When a redundant array fails, this option can be used to recover access to some or all of the data
on the failed array. However, <CTRL><R> cannot guarantee the consistency of the data. The
integrity of the data needs to be verified before it is used.

<CTRL><R> can only be used when the array status is failed. All the original array
member disks must be present in the system. Drives grayed out under Array Members can be
considered to be original members of the failed array. <CTRL><R> can incorporate these
drives back into the array. If the array is in a degraded state, a hot spare should be assigned
to initiate a rebuild to restore the array to its optimal status. <CTRL><R> cannot be used to
recover RAID 0 arrays.
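
The preconditions above can be collected into a simple pre-check. The Python sketch below is a reading of those conditions only; the function name, the status strings, and the drive ID format are hypothetical and do not correspond to any controller or utility interface.

```python
def can_attempt_restore(raid_level, array_status, original_members, present_drives):
    """Return (ok, reason) for attempting Enable/Restore RAID on an array.

    array_status is assumed to be one of "optimal", "degraded" or "failed".
    """
    if raid_level == 0:
        return False, "RAID 0 arrays cannot be recovered with Enable/Restore RAID"
    if array_status == "degraded":
        return False, "assign a hot spare and rebuild instead"
    if array_status != "failed":
        return False, "array status must be failed"
    missing = set(original_members) - set(present_drives)
    if missing:
        return False, "original members missing: " + ", ".join(sorted(missing))
    return True, "conditions met; verify data integrity after the restore"


if __name__ == "__main__":
    members = ["0:00", "0:01", "0:02"]
    print(can_attempt_restore(5, "failed", members, ["0:00", "0:01", "0:02"]))
    print(can_attempt_restore(5, "failed", members, ["0:00", "0:02"]))
```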

Note: <CTRL><R> cannot guarantee the consistency of the data. The integrity of the data will
need to be verified. This option should be used only to try to recover the data. The data may be
lost permanently.

To perform an Enable/Restore RAID (<CTRL><R>) operation, follow these steps:

1. At the BIOS screen, press <CTRL><A> to enter the BIOS RAID Configuration Utility.
2. From the Options menu, select Array Configuration Utility. Then select Manage
Arrays.
3. Choose the desired array under List of Arrays. Press <CTRL><R> to Enable/Restore
RAID. When the warning message appears, type Y to continue. Back up as much data as
possible from the recovered array.

The Enable/Restore RAID function is also available in Array Manager and OMSM and is referred
to as Restore Dead Disk Segments. If the OS is up and running, this option can be used to
force the drives online.

CERC SATA 1.5/2s Array Restoration


As the CERC SATA 1.5/2s supports only two drives, if both drives are grayed out in the system,
follow the steps mentioned below to attempt to recover the data.

1. Delete the array. When asked whether to delete the boot sectors, select NONE.
2. Re-create an array with the same size as the one that failed.
3. Check if the system can be booted to the OS.
4. If the system can be booted to the OS, perform a backup of all the data required. Then,
perform a Verify Disk Media (in the BIOS) or Dell Diagnostics hard drive long DST test on
the originally problematic drives.
5. If the test passes, re-create the array, perform an Array Verify, and restore the data from
the backup.
6. If the test fails, replace the hard drives that fail the hard drive diagnostics.

Note: The preceding restoration process is not a design feature of the CERC SATA 1.5/2s and it
cannot be guaranteed to help restore a system in a failed state.

Double Fault Scenario


A double fault scenario occurs when an array is in a degraded state and a bad block is detected
on another drive, which is part of the degraded array. This scenario can occur when a rebuild of a
degraded array is in progress. A double fault scenario may result in data loss if data is present on
the stripe.

To determine if a double fault scenario has happened under Array Manager or OMSM:

If a rebuild fails, the array disks should be first checked. If a drive other than the one that was
replaced or re-inserted to fix the original issue appears “Degraded” or “Offline”, this indicates the
double fault scenario. If a drive that was re-inserted or replaced appears “Degraded” or “Offline”,
then follow the steps described in the Recovering from Arrays in a Degraded State section.

Alternatively, under the Events log, the ID of the hard drive which showed the medium error can
be checked. If the ID is that of an existing drive in the array, one that was not replaced or re-
inserted, this indicates the double fault scenario. The medium error and rebuild failure error
messages will appear as shown below:

Error 544 Virtual Disk (RAID5 0) rebuild failed
Error 691 Medium Error: ID (0:00) Medium Error - Bad Block Replacement Possible.

Note: When rebuild fails due to a double fault scenario, it is advisable to back up all critical data,
re-create the array, and restore the data. To avoid this scenario in the future, consistency checks
should be scheduled on a regular basis.

To determine if a double fault scenario has happened under the controller BIOS:

If a rebuild fails, check the Array Members under Array Properties. If one of the existing array
disks, one that was not replaced or re-inserted, appears to be grayed out or missing, this
indicates the double fault scenario.
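
The checks described above, for both the storage management utilities and the controller BIOS, follow the same logic: a problem on any drive other than the replaced one points to a double fault. The Python sketch below expresses that logic; the function name, the result labels, and the drive ID format are assumptions made for illustration.

```python
def classify_rebuild_failure(replaced_drive, unhealthy_drives):
    """Classify a failed rebuild.

    replaced_drive is the ID of the drive that was replaced or re-inserted.
    unhealthy_drives lists the IDs shown as Degraded, Offline, grayed out or
    missing, or reported with a medium error in the event log.
    """
    others = sorted(d for d in unhealthy_drives if d != replaced_drive)
    if others:
        # A problem on an existing array member indicates a double fault.
        return "double fault", others
    if replaced_drive in unhealthy_drives:
        return "replacement drive problem", [replaced_drive]
    return "undetermined", []


if __name__ == "__main__":
    print(classify_rebuild_failure("0:02", ["0:00"]))
    print(classify_rebuild_failure("0:02", ["0:02"]))
```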

Rebuilding
If a rebuild fails due to a double fault scenario, the rebuild will not start again even if a drive is
assigned as a hot spare. This is working as designed. In a double fault scenario, the firmware is
unable to generate the parity for the stripe because of a bad block on a second drive in the
array, so rebuilding onto the replacement drive is disabled. The rebuild will not start
even if the drive is replaced again with another drive.

In some cases, a rebuild might not kick off at all on the new or original drive even if there’s no
double fault scenario. To recover from this situation, initialize the drive and/or assign the
replaced or original drive as a hot spare to kick off the rebuild process.

Known Hard Drive Replacement Issues


1. On a CERC SATA 1.5/6ch, once a rebuild fails on the replaced drive, the rebuild will
not kick off again.
2. On a CERC SATA 1.5/6ch, once rebuild fails on a virtual disk, the rebuild may
not restart. The rebuild could automatically fail due to the following reasons:

i. The replaced drive is bad. In this case, run hard drive diagnostics on the
replaced drive to verify if the drive is truly bad before replacing it again.
ii. One of the existing drives, in a degraded volume, has a bad block (Double
Fault Scenario). In this case, depending on which virtual disk is affected by
the bad block, the rebuild will fail ONLY on the affected Virtual Disk.
However, it will continue and complete on the rest of the virtual disks.
a. If a user reinserts the same drive or replaces it with a new one, the
rebuild will not restart ONLY for those virtual disks that had failed
earlier (due to the double fault scenario).
b. Array Manager logs the following error in the AM log:
Perc2Pro 544 CERC SATA1.5/6ch Controller 0 , Virtual Disk (OS
0) rebuild failed

This problem can be avoided by updating the CERC SATA 1.5/6ch firmware to
version 4.1.0.7417 or later. It is also recommended that regular consistency checks
be scheduled to avoid running into rebuild failure issues.

3. A RAID 1 rebuild may not start and may generate a stop error on a CERC SATA
1.5/2s system.

If a drive fails in a RAID 1 and the rebuild option is selected within OMSS 1.0, the
rebuild may not start or may generate a “stop error”.

The server should be restarted and the Configure Drives option should be selected
in the controller BIOS. The new disk should be selected, followed by the Add/Delete
Hotspare option. The new disk should then be selected again. After rebooting the
system, the virtual disk will rebuild automatically when the operating system starts.
This procedure can also be performed when the drive is first replaced, in which case,
no operating system boot will be required.

All newer OMSS versions have a fix for this issue. No hardware replacements should
be required.

SECTION 5: UPGRADING AND RECONFIGURING ARRAYS


Array Reconstruction
In the CERC SATA 1.5/6ch, array Capacity Expansion (CE) and RAID Level Migration (RLM) are
supported. The process of rebuilding the new array that is created by CE and RLM operations is
called Array Reconstruction. The latest OpenManage Array Manager User Guide or Open
Manage Storage Manager User Guides can be referred to for instructions regarding Array
Reconstruction (available at support.dell.com). Redundancy is maintained during reconstruction
when the initial and final RAID levels are redundant levels. If a disk fails during this process, the
reconstruction process must continue and finish before the degraded array can be rebuilt.

Note: Virtual disks or arrays larger than 2 terabytes (TB) cannot be created on any Dell CERC
controller. The SATA specification supports a 2 TB virtual disk (array). However, the 2TB limitation
on CERC SATA is imposed due to BIOS, driver, and Application Programming Interface (API)
restrictions. Currently, Dell does not have any plans to support Virtual Disks larger than 2TB on
the CERC SATA 1.5/6ch or the CERC SATA 1.5/2s. Due to this limitation, certain CE or RLM
operations beyond the 2TB limit may not work.
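
Before planning a CE or RLM operation, the resulting size can be estimated and compared against the 2TB limit. The Python sketch below uses the standard usable-capacity formulas for equal-sized drives. The function names are hypothetical, and treating 2 TB as 2048 GB is an assumption, since this guide does not state whether the limit is binary or decimal.

```python
# Assumption: 2 TB is taken as 2048 GB; the guide does not specify the unit base.
LIMIT_GB = 2048


def usable_capacity_gb(raid_level, drive_count, drive_gb):
    """Usable capacity of an array of equal-sized drives (standard RAID formulas)."""
    if raid_level == 0:
        return drive_count * drive_gb
    if raid_level == 1:
        if drive_count != 2:
            raise ValueError("RAID 1 uses exactly 2 drives")
        return drive_gb
    if raid_level == 5:
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        return (drive_count - 1) * drive_gb
    if raid_level == 10:
        if drive_count < 4 or drive_count % 2:
            raise ValueError("RAID 10 needs an even number of drives, at least 4")
        return (drive_count // 2) * drive_gb
    raise ValueError("unsupported RAID level")


def exceeds_limit(raid_level, drive_count, drive_gb):
    """True if the planned array would be larger than the assumed 2TB limit."""
    return usable_capacity_gb(raid_level, drive_count, drive_gb) > LIMIT_GB


if __name__ == "__main__":
    print(usable_capacity_gb(5, 6, 400), exceeds_limit(5, 6, 400))
    print(usable_capacity_gb(5, 6, 500), exceeds_limit(5, 6, 500))
```

The estimate ignores any capacity reserved by the controller, so results close to the limit should be treated with caution.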

Capacity Expansion
Capacity Expansion involves adding a physical disk member to an existing RAID array and
expanding the logical drive by utilizing the additional capacity. CE also allows expansion of the
logical drive by utilizing the unused space in the existing drives, without inserting a new drive.
Windows Server 2003, Windows 2000, and Netware also support Online Capacity Expansion
(OCE). Upon completion of an array expansion, the additional capacity can be used without
restarting the system. This feature is not available in the CERC SATA 1.5/2s.

The following are the basic procedures that need to be followed when expanding an array. The
first way to perform an array expansion is by increasing hard drive size, for example by
upgrading all 80GB hard drives to 250GB or 400GB hard drives. The second way is by
increasing the total number of hard drives that make up the array, for example by adding
a drive to a three-drive RAID 5 to make a four-drive RAID 5.

Note: Before any array expansion operation, all critical data should be backed up in the event of
an array reconstruction failure.

Note: Array expansion cannot be done via the BIOS. A storage management application such as
Array Manager or OMSM is required.

Note: The CERC SATA 1.5/6ch does not support hard drive sizes larger than or equal to 1 TB.

Array expansion via increasing hard drive size:


This type of array expansion includes making available existing unused hard drive space, as well
as replacing the existing drives with drives of larger capacity. The following steps should be taken
to replace the existing drives with larger capacity drives:

RAID 0 Array:
1. Back up all data, replace the existing drives with the new drives, re-create the array, and
restore the data on the new array from backup.

RAID 1 Array:
1. Back up all data (Recommended).
2. Remove the first drive and add the replacement drive.
3. Perform a rebuild process (either manually or automatically if auto rebuild is enabled).
4. Upon rebuild completion, remove the second drive and add the replacement drive.
5. Perform a rebuild process (either manually or automatically if auto rebuild is enabled).
6. Perform a Capacity Expansion.

Note: This series of steps may be very time consuming depending on the size of the drives and
the system utilization by applications, due to the dual rebuild cycles required. Alternatively, the
following steps can be performed to reduce the number of rebuild cycles:
1. Back up all data.
2. Delete the existing array and create a new RAID 1 array.
3. Restore the data on the new array from backup.

RAID 5 Array:
1. Back up all data (Recommended).
2. Remove the first drive and add the replacement drive.
3. Perform a rebuild process (either manually or automatically if auto rebuild is enabled).
4. Upon rebuild completion, remove the second drive and add the replacement drive.
5. Perform a rebuild process (either manually or automatically if auto rebuild is enabled).
6. Repeat this process until all the drives are replaced.
7. Perform a Capacity Expansion.

Note: This series of steps may be very time consuming depending on the number of drives in the
RAID 5 array, the system utilization by applications, and the size of the drives, due to the multiple
rebuild cycles required. Alternatively, the following steps can be performed to reduce the number
of rebuild cycles.

1. Back up all data.
2. Delete the existing array and create a new RAID 5 array.
3. Restore the data on the new array from backup.

RAID 10 Array:
1. Back up all data (Recommended).
2. Remove the first drive and add the replacement drive.
3. Perform a rebuild process (either manually or automatically if auto rebuild is
enabled).
4. Upon rebuild completion, remove the second drive and add the replacement drive.
5. Perform a rebuild process (either manually or automatically if auto rebuild is
enabled).
6. Repeat this process until all the drives are replaced.
7. Perform a Capacity Expansion.

Note: This series of steps may be very time consuming depending on the size of the drives and
system utilization by applications, due to the multiple rebuild cycles required. Alternatively, the
following steps can be performed to reduce the number of rebuild cycles.
1. Back up all data.
2. Delete the existing array and create a new RAID 10 array.
3. Restore the data on the new array from backup.

Note: When there is a drive failure, and the failed drive is replaced with a drive that is larger than
the rest of the drives in the array, the size of the virtual disk will not increase due to disk coercion.
The leftover space on the new drive however is available for use by other virtual disks in the
system.

Array expansion via increasing the number of hard drives:


RAID 0 Array:
1. Back up all data.
2. Select the additional drives to be added.
3. Recreate the RAID 0 array and restore the data on the new array from backup.

RAID 1 Array:
No additional drives can be added to a RAID 1 array, because by definition, it is formed using
only 2 drives.

RAID 5 Array:
1. Back up all data (Recommended).
2. Select the additional drives to be added.
3. Perform an Array Reconfiguration operation.

RAID Level Migration


Online RAID level migration is an advanced RAID technology feature present in the CERC SATA
1.5/6ch. This feature allows RAID levels to be changed without rebuilding the array from scratch.
This feature is not available in the CERC SATA 1.5/2s.

The CERC SATA 1.5/6ch supports modifying existing arrays by expansion, migration from one
array type to another, and changing the stripe size. These migration scenarios are described in
Table 6.

Current Array Type    New Array Type
RAID 0                RAID 5 or 10
RAID 1                RAID 0, 5 or 10
RAID 5                RAID 0 or 10
RAID 10               RAID 0 or 5

Table 6: Array Migration Possibilities in the CERC SATA 1.5/6ch

RLM can occur by migrating from a lower redundancy RAID level to a higher level or from a
higher redundancy RAID level to a lower level, both without taking the array offline. Both types of
RLM must involve migration to an array with a capacity greater than or equal to the original array.
This can be done by combining the RLM operation with the CE operation. Figures 3, 4 and 5
illustrate an RLM from a 2-drive RAID 1 array to a 4-drive RAID 5 array.
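
Table 6 and the capacity rule above can be expressed as a lookup. The Python sketch below encodes the table as a dictionary and checks that the target array is at least as large as the original. The names are hypothetical, and the check does not account for the 2TB limit or for any other controller restriction.

```python
# Allowed target RAID levels for each current level, taken from Table 6.
ALLOWED_MIGRATIONS = {
    0: {5, 10},
    1: {0, 5, 10},
    5: {0, 10},
    10: {0, 5},
}


def migration_allowed(current_level, new_level, current_gb, new_gb):
    """True if Table 6 permits the migration and capacity does not shrink."""
    permitted = new_level in ALLOWED_MIGRATIONS.get(current_level, set())
    return permitted and new_gb >= current_gb


if __name__ == "__main__":
    # 2-drive RAID 1 of 80GB drives to a 4-drive RAID 5 of 80GB drives (240GB usable).
    print(migration_allowed(1, 5, 80, 240))
    print(migration_allowed(5, 1, 160, 80))
```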

Figure 3: Selecting the Array Disks to be added to the Virtual Disk

Figure 4: Selecting the Attributes for the Reconfigured Virtual Disk

Figure 5: New and Old Virtual Disk Configuration Information

Past Known Issues


1. There was an issue with the CERC SATA 1.5/6ch controller (firmware version 4.1.0.7401
and all earlier versions) in which attempting to morph an array of size greater than 1.09
terabytes (TB) could result in data loss, whereas creating a new array greater than 1 TB
was not a problem. Morphing encompasses RLM, OCE, shrinking, stripe size migration
and so on. The limitation was on the array size only, and NOT on the number of arrays.
To work around this issue, a different array would need to be created rather than modifying the
existing array. This issue has been fixed in all firmware versions later than 4.1.0.7401.

2. A second issue involved the CERC SATA 1.5/6ch crashing Windows Server when morphing a
RAID 5 array. This happened when the hard drive sequence of the morph destination did not
match the hard drive sequence of the source on the same set of hard drives. This issue has
been fixed in the latest firmware version, 4.1.0.7417.

Note: To avoid the issues listed above, and any other issues pertaining to array morphing,
update the CERC SATA 1.5/6ch firmware to the latest version, 4.1.0.7417, which can be found
on support.dell.com.
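
When auditing several servers, the firmware guidance above can be expressed as a version comparison. The following Python sketch is an illustration only: the function names and the returned messages are hypothetical, and the installed version string is assumed to have been read manually from the management utility.

```python
def parse_version(text):
    """Turn a dotted version such as '4.1.0.7401' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


# Version numbers taken from the known issues described above.
LAST_AFFECTED = parse_version("4.1.0.7401")   # large-array morph issue
RECOMMENDED = parse_version("4.1.0.7417")     # latest version named in this guide


def morph_firmware_status(installed):
    """Classify an installed CERC SATA 1.5/6ch firmware version string."""
    version = parse_version(installed)
    if version <= LAST_AFFECTED:
        return "update required"
    if version < RECOMMENDED:
        return "update recommended"
    return "ok"
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes when a version component has a different number of digits.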

SECTION 6: PERFORMANCE
The CERC SATA 1.5/2s is a software-based RAID implementation and has no internal cache
memory. This software implementation is integrated within the driver, which contains the code to
run the RAID engine within the OS environment. Driver-based RAID depends completely on the
resources of the system processor and memory for RAID execution, and may affect system
performance in high CPU utilization environments. There is also a known issue with Windows
2003: if a system is running the native Microsoft driver atapi.sys instead of the CERC
SATA 1.5/2s driver, the system may run in Programmed Input/Output (PIO) mode, which may
lead to poor performance. For all CERC SATA 1.5/2s RAID implementations, it is recommended
that the latest aarich.sys driver, found on support.dell.com, be used.

The CERC SATA 1.5/6ch is a hardware RAID implementation, in which dedicated hardware with
embedded firmware is used to control the RAID operations. The performance of a hardware RAID
solution is dependent on the processing power of the controller’s I/O processor and the cache
size, unlike software RAID, whose performance is directly dependent on server CPU performance
and load.

Cache is fast-access memory that serves as intermediate storage for data that is read from,
or written to, drives. Two caches can affect performance: the hard drive cache and the
controller cache. The hard drive cache is enabled by default. With the hard drive cache
enabled, a performance gain of up to 40% on write commands can be obtained. The CERC SATA
1.5/6ch has 64 MB of fixed ECC SDRAM controller cache memory which, when enabled, can
significantly improve sequential and random write performance.

The I/O throughput of the CERC SATA 1.5/6ch is mostly determined by the performance of the
attached hard drives, the processing power of the onboard I/O processor, and the onboard
cache size. The local CPU speed and memory size have little effect on the throughput of the
RAID storage subsystem. However, if a machine is running applications that consume a lot of
system memory and free memory becomes scarce, this could affect the RAID subsystem’s
operations.

The following are the main SATA configuration options that may affect the performance of the
CERC SATA 1.5/6ch.

• Write Cache (Default: ENABLED) – When Write Cache is enabled, performance is
maximized. Caching should usually be enabled to optimize performance, unless the data
is highly sensitive, or unless an application performs completely random reads, which is
unlikely.

Note: When Write Cache is enabled, there is a potential for data loss or corruption during
a power failure. A UPS solution is recommended to ensure fault tolerance.

• DMA (Default: ENABLED) – When enabled, Direct Memory Access (DMA) mode is used
for the drive, providing maximum performance.

• Allow Read Ahead (Default: ENABLED) – When enabled, the drive’s read ahead cache
algorithm is used, providing maximum performance under most circumstances.

• Stripe Size (Default: 64 KB) – The default stripe size gives the best overall performance
in most network environments.

• Array Background Consistency Check (Default: DISABLED) – When enabled,
consistency checking processes reduce performance. For RAID 5, the performance
reduction is significant.
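
When reviewing a controller's configuration, it can help to compare the current settings against the documented defaults. The Python sketch below covers only the on/off options from the list above. The dictionary name, the function name, and the sample settings are hypothetical, and the current values are assumed to have been recorded by hand from the controller's configuration utility.

```python
# Documented defaults for the on/off options listed above
# (True = enabled, False = disabled). Names are illustrative only.
DOCUMENTED_DEFAULTS = {
    "Write Cache": True,
    "DMA": True,
    "Allow Read Ahead": True,
    "Array Background Consistency Check": False,
}


def non_default_options(current):
    """Return the options in `current` whose value differs from the default."""
    return {
        name: value
        for name, value in current.items()
        if name in DOCUMENTED_DEFAULTS and value != DOCUMENTED_DEFAULTS[name]
    }


# Hypothetical settings recorded from one controller.
observed = {
    "Write Cache": False,
    "DMA": True,
    "Allow Read Ahead": True,
    "Array Background Consistency Check": True,
}
deviations = non_default_options(observed)
```

Here deviations contains Write Cache (disabled) and Array Background Consistency Check (enabled), the two settings from this sample that are worth a second look when investigating performance.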

Write Cache
The write cache policy for the CERC SATA 1.5/6ch is usually set during the creation of a
Virtual Disk. This policy can be changed using an array management utility such as Array
Manager or OMSM. The hard drive cache is enabled by default. With the hard drive cache
enabled, a performance gain of up to 40% on write commands can be obtained.

The write cache policy cannot be changed with Array Manager 3.6 or earlier. Array Manager 3.7
provides a way, via a registry change, to enable the write cache on the CERC SATA 1.5/6ch
controller. This registry change allows an Array Manager user to perform a Change Policy
Virtual Disk command and select the Write Cache Enabled Always setting. The change permits
an Array Manager user to enable this setting, on CERC SATA 1.5/6ch controllers only, without
recreating their Virtual Disks. Before making any registry changes, it is recommended that
all critical data be backed up.

The Write Cache Enabled Always setting can lead to loss of cached data: data in the write
cache will be lost if power is lost to the server. This setting should only be used when
there is a UPS battery backup for the system. Even with UPS battery backup for the server,
there is no guarantee that cached data will not be lost during a power failure. This setting
should be selected only on virtual disks that contain non-critical data or data where the
potential for data loss will not be catastrophic.

Note: This functionality was added with Array Manager 3.7 and will not work with earlier versions.
Array Manager 3.7 can only be installed while installing OMSA 4.3 and above.

Appendix: SATA BEST PRACTICES
CERC SATA 1.5/6ch Controller Specifications
Minimum System Requirements
Server or workstation with one universal PCI slot and a motherboard and BIOS that complies with
the PCI Local Bus Specification, Revision 2.2 and provides large memory-mapped address
ranges.

Controller Specifications
Component                   Description
--------------------------  ------------------------------------------------------------
Computer bus                32 or 64-bit PCI local bus
On-board processors         Intel 80302 Intelligent I/O Processor
                            Three Silicon Image SI3512 dual SATA 1.0 controllers with
                            command queuing
Cache memory                64 MB, fixed ECC SDRAM
Data safety                 Audible alarm
Device protocol             SATA 1.0 and SATA II
RAID levels                 RAID 0, RAID 1, RAID 10, RAID 5, and Simple volume
Container (array) support   Up to 64 containers per controller; 64 partitions maximum
                            per container
PCI bus                     64-bit, 66 MHz (32-bit, 33 MHz-compatible)
SATA channels               Six internal channels
Device support              Up to six SATA devices per controller (1 per channel)
                            Supports a RAID container as a boot device

Supported Operating Systems


Red Hat Linux Advanced Server 3
Red Hat Linux Advanced Server 2.1
Red Hat Linux Professional (Depending on Version)
Windows Server® 2003 (32bit) Standard Edition
Windows 2003 Enterprise Server (32bit)
Small Business Server 2003 (32bit)
Windows 2003 Web Server (32bit)
Windows 2000 Server
Windows 2000 Advanced Server
Windows 2000 Small Business Server
Novell NetWare, versions 5.1 and 6.5
Novell NetWare Small Business Suite

CERC SATA 1.5/2s Controller Specifications

Controller Specifications
Component           Description
------------------  ------------------------------------------------------------------
SATA Controller     Integrated ICH5R (SC1420/SC1425/PE1800) and ICH6R (SC420/PE800)
Cache memory        None
Device protocol     SATA 1.0
RAID levels         RAID 0, RAID 1, Up to two single configured drives
Container support   One container. One logical drive
SATA channels       Two SATA ports
Device support      One SATA drive per port, maximum two HDDs
                    Supports Logical Drive as boot device
SMART Support       Yes

Supported Operating Systems


Windows 2003 Server (32bit) Standard Edition
Windows 2003 Enterprise Server (32bit)
Small Business Server 2003 (32bit)
Windows 2003 Web Server (32bit)
Windows 2000 Server
Windows 2000 Advanced Server
Windows 2000 Small Business Server
Novell Netware, versions 5.1 and 6.5

Setting Up Automated Scheduling of Consistency Checks on
Windows Systems

1. For systems running a Windows operating system with Array Manager installed, you can use
the Scheduled Tasks option from the menu under the Accessories folder. Double-click
Add Scheduled Task and the following wizard will appear:

2. Click Next and the following screen will appear:

3. Click Browse and locate the file amcli.exe. The AMCLI executable is located in the Array
Manager installation directory.
4. Select the file and click OK.

5. Click Next and the following screen will appear:

6. Enter a name for this task and select how often the task should be run. It is
recommended that this task be run at least once a month.

7. Click Next and the following screen will appear:

8. Select the time at which the Consistency Check should run. Remember that the check
affects system performance, so schedule it for a low-traffic time.
9. Click Next and the following screen will appear:

10. Fill in the name and password fields appropriately so the task can be executed correctly.

11. Click Next and the following screen will appear:

12. Select the checkbox for Open advanced properties for this task when I click Finish.
13. Click Finish and the following screen will appear:

14. In the Run text box, you can type different parameters. The following is example syntax
for scheduling a consistency check on virtual disk 1: "C:\PathName\amcli.exe" /c1, where
PathName is the path to the AMCLI executable. This is the command that the scheduler
executes every time it runs this event.
15. To run a consistency check, the parameter is amcli /cn, where the c option indicates
consistency check and n is the number of a virtual disk as displayed in the Array
Manager tree view.
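
If several virtual disks need scheduled checks, the text for the Run box can be generated rather than typed by hand. The Python sketch below only assembles the command string in the /cn form described above. The function name is hypothetical, and C:\PathName is the same placeholder path used in step 14, not a real installation directory.

```python
def consistency_check_command(amcli_path, virtual_disk):
    """Build the Run-box command for a consistency check on one virtual disk.

    amcli_path   -- full path to amcli.exe (placeholder in this example)
    virtual_disk -- virtual disk number as shown in the Array Manager tree view
    """
    if not isinstance(virtual_disk, int) or virtual_disk < 0:
        raise ValueError("virtual disk number must be a non-negative integer")
    # Quote the path so that directory names containing spaces are handled.
    return '"{}" /c{}'.format(amcli_path, virtual_disk)


print(consistency_check_command(r"C:\PathName\amcli.exe", 1))
# prints: "C:\PathName\amcli.exe" /c1
```

Each generated string corresponds to one scheduled task, so a separate task would be created for every virtual disk that should be checked.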
