Docu32691 - White Paper - EMC FAST VP
Abstract
This white paper discusses EMC® Fully Automated Storage
Tiering for Virtual Pools (FAST VP) technology and describes its
features and implementation. Details on how to use the product
in Unisphere™ are discussed, and usage guidance and major
customer benefits are also included.
July 2012
Copyright © 2012 EMC Corporation. All Rights Reserved.
For the most up-to-date listing of EMC product names, see EMC
Corporation Trademarks on EMC.com.
Audience
This white paper is intended for EMC customers, partners, and employees who are
considering using the FAST VP product. Some familiarity with EMC midrange storage
systems is assumed. Users should be familiar with the material discussed in the
white papers Introduction to EMC VNX Series Storage Systems and EMC VNX Virtual
Provisioning.
[Table: workload characteristics by drive tier, including read/write mixes, multiple streams, predictable performance, and single-threaded large sequential I/O comparable to SAS]
FAST VP operations
FAST VP operates by periodically relocating the most active data up to the highest
available tier (typically the Extreme Performance or Performance Tier). To ensure
sufficient space in the higher tiers, FAST VP relocates less active data to lower tiers
(Performance or Capacity Tiers) when new data needs to be promoted. FAST VP works
at a granularity of 1 GB. Each 1 GB block of data is referred to as a “slice.” FAST VP
relocates data by moving the entire slice to the highest available storage tier.
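As a rough illustration of this slice granularity, the following Python sketch models a pool LUN as a list of 1 GB slices that are tracked and relocated as whole units. The class names, tier labels, and functions are illustrative assumptions, not part of any FAST VP interface.

```python
from dataclasses import dataclass

SLICE_SIZE_GB = 1  # FAST VP tracks and relocates data in 1 GB slices

@dataclass
class Slice:
    lun_id: int
    offset_gb: int            # position of the slice within the LUN
    current_tier: str         # e.g. "extreme_performance", "performance", "capacity"
    temperature: float = 0.0  # weighted I/O activity (see "Statistics collection")

def carve_lun_into_slices(lun_id: int, size_gb: int, initial_tier: str) -> list:
    """A LUN of N GB is tracked as N slices; each slice moves between tiers as a whole."""
    return [Slice(lun_id, offset, initial_tier)
            for offset in range(0, size_gb, SLICE_SIZE_GB)]

# Example: a 500 GB pool LUN is tracked as 500 independently relocatable slices.
slices = carve_lun_into_slices(lun_id=7, size_gb=500, initial_tier="performance")
print(len(slices))  # 500
```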
Storage pools
Storage pools are the framework that allows FAST VP to fully use each of the storage
tiers discussed. Heterogeneous pools are made up of more than one type of drive.
LUNs can then be created within the pool. These pool LUNs are no longer bound to a single storage tier; instead, their data can be spread across the storage tiers in the pool. LUNs must reside in a pool to be eligible for FAST VP relocation. Pools support thick
LUNs and thin LUNs. Thick LUNs are high-performing LUNs that use logical block
addressing (LBA) on the physical capacity assigned from the pool. Thin LUNs use a
capacity-on-demand model for allocating drive capacity. Thin LUN capacity usage is
tracked at a finer granularity than thick LUNs to maximize capacity optimizations.
FAST VP is supported on both thick LUNs and thin LUNs.
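The difference between the two allocation models can be pictured with a small sketch. This is only a toy model with assumed class names and a coarse allocation granularity; actual pool internals differ.

```python
# A toy contrast of the two pool LUN allocation models: a "thick" LUN reserves its full
# capacity from the pool at creation, while a "thin" LUN draws capacity from the pool
# only as data is written (capacity on demand).
class Pool:
    def __init__(self, capacity_gb):
        self.free_gb = capacity_gb

    def allocate(self, gb):
        if gb > self.free_gb:
            raise RuntimeError("pool out of space")
        self.free_gb -= gb

class ThickLUN:
    def __init__(self, pool, size_gb):
        pool.allocate(size_gb)          # entire capacity reserved up front

class ThinLUN:
    def __init__(self, pool, size_gb):
        self.pool, self.size_gb, self.allocated_gb = pool, size_gb, 0

    def write(self, gb):
        self.pool.allocate(gb)          # capacity consumed only when data is written
        self.allocated_gb += gb

pool = Pool(1000)
ThickLUN(pool, 300)                     # pool free space drops by 300 GB immediately
thin = ThinLUN(pool, 500)               # no pool space consumed yet
thin.write(50)
print(pool.free_gb)                     # 650
```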
RAID groups are by definition homogeneous and therefore are not eligible for sub-LUN
tiering. LUNs in RAID groups can be migrated to pools using LUN Migration. For a more
in-depth discussion of pools, please see the white paper EMC VNX Virtual
Provisioning - Applied Technology.
FAST VP algorithm
FAST VP uses three strategies to identify and move the correct slices to the correct
tiers: statistics collection, analysis, and relocation.
Statistics collection
A slice of data is considered hotter (more activity) or colder (less activity) than
another slice of data based on the relative activity level of those slices. Activity level
is determined by counting the number of I/Os (reads and writes) bound for each slice. FAST VP maintains a cumulative I/O count and weights each I/O by how recently it arrived. This weighting decays over time: new I/O is given full weight, and after approximately 24 hours the same I/O carries only about half of its original weight.
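A minimal sketch of this kind of decayed I/O weighting is shown below. It assumes a simple exponential decay with a 24-hour half-life; the actual FAST VP statistics implementation is not published, so the formula is illustrative only.

```python
HALF_LIFE_HOURS = 24.0  # an I/O carries roughly half of its weight after about 24 hours

def updated_temperature(old_temperature: float, hours_elapsed: float, new_io_count: int) -> float:
    """Decay the previous weighted I/O count, then add new I/Os at full weight."""
    decay = 0.5 ** (hours_elapsed / HALF_LIFE_HOURS)
    return old_temperature * decay + new_io_count

# Example: a slice with a weighted count of 1000 receives 200 new I/Os over the next 24 hours.
print(updated_temperature(1000.0, 24.0, 200))  # 700.0 -> old activity halved, new I/O at full weight
```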
Analysis
Once per hour, the collected data is analyzed. This analysis produces a rank ordering
of each slice within the pool. The ranking progresses from the hottest slices to the
coldest. This ranking is relative to the pool. A hot slice in one pool may be cold by
another pool’s ranking. There is no system-level threshold for activity level. The user
can influence the ranking of a LUN and its component slices by changing the tiering
policy, in which case the tiering policy takes precedence over the activity level.
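The ranking step can be sketched as a per-pool sort in which the tiering policy overrides raw activity. The numeric policy ranks below are an assumption for illustration; they are not values exposed by FAST VP.

```python
# Rank slices within one pool from hottest to coldest, letting the LUN tiering policy
# take precedence over activity level. Policy rank values are illustrative only.
POLICY_RANK = {
    "highest_available_tier": 0,   # sorts ahead of activity-ranked slices
    "start_high_then_auto_tier": 1,
    "auto_tier": 1,
    "lowest_available_tier": 2,    # sorts behind activity-ranked slices
}

def rank_slices(slices):
    """slices: iterable of (slice_id, policy, temperature). Returns a hottest-first ordering."""
    return sorted(slices, key=lambda s: (POLICY_RANK[s[1]], -s[2]))

pool = [
    ("lun1/slice0", "auto_tier", 900.0),
    ("lun2/slice3", "highest_available_tier", 15.0),    # cold, but pinned high by its policy
    ("lun3/slice7", "lowest_available_tier", 5000.0),   # hot, but pushed low by its policy
]
print([s[0] for s in rank_slices(pool)])  # ['lun2/slice3', 'lun1/slice0', 'lun3/slice7']
```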
Relocation
During user-defined relocation windows, slices are promoted according to the rank
ordering performed in the analysis stage. During relocation, FAST VP prioritizes
relocating slices to higher tiers. Slices are only relocated to lower tiers if the space
they occupy is required for a higher-priority slice. In this way, FAST VP attempts to ensure maximum utility from the highest tiers of storage. As data is added to the pool, it is initially distributed across the tiers and then moved up to the higher tiers if space is available. Ten percent of the space in each tier is kept free to absorb new allocations defined as “Highest Available Tier” between relocation cycles.
Lower tier spindles are used as capacity demand grows.
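The effect described above can be approximated with a short sketch that fills each tier from the top of the ranked list while holding back 10% headroom. Tier names and capacities are made up for the example and do not reflect a real configuration.

```python
HEADROOM = 0.10  # keep ~10% of each tier free for new "Highest Available Tier" allocations

def plan_relocations(ranked_slice_ids, tiers):
    """ranked_slice_ids: hottest first. tiers: list of (name, capacity_in_slices), highest tier first."""
    plan, cursor = {}, 0
    for name, capacity in tiers:
        usable = int(capacity * (1 - HEADROOM))          # leave headroom in every tier
        for slice_id in ranked_slice_ids[cursor:cursor + usable]:
            plan[slice_id] = name
        cursor += usable
    for slice_id in ranked_slice_ids[cursor:]:           # overflow lands on the lowest tier
        plan[slice_id] = tiers[-1][0]
    return plan

ranked = ["slice%02d" % i for i in range(30)]            # output of the analysis stage
tiers = [("extreme_performance", 10), ("performance", 10), ("capacity", 20)]
placement = plan_relocations(ranked, tiers)
print(placement["slice00"], placement["slice15"], placement["slice29"])
# extreme_performance performance capacity
```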
The following two features are new enhancements that are included in the EMC®
VNX™ Operating Environment (OE) for block release 5.32, file version 7.1:
Pool Rebalancing upon Expansion
When a storage pool is expanded, the sudden introduction of new empty disks
combined with relatively full existing disks causes a data imbalance. This imbalance
is resolved by an automated, one-time data relocation referred to as rebalancing. This rebalance relocates slices within the storage tier that was expanded to achieve the best performance. Rebalancing occurs both with and without the FAST VP
enabler installed.
Ongoing Load-Balance within tiers (With FAST VP license)
In addition to relocating slices across tiers based on relative slice temperature, FAST
VP can now also relocate slices within a tier to achieve maximum pool performance
gain. Some disks within a tier may be heavily used while other disks in the same tier
may be underused. To improve performance, data slices may be relocated within a
tier to balance the load. This is accomplished by augmenting FAST VP data relocation
to also analyze and move data within a storage tier. This new capability is referred to
as load balancing, and occurs within the standard relocation window discussed
below.
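A greatly simplified sketch of the load-balancing idea follows: slices are shifted from the busiest disk in a tier toward the least busy one whenever doing so narrows the gap between them. The data structures and stopping rule are assumptions for illustration, not the actual FAST VP algorithm.

```python
def balance_tier(disk_loads, max_moves=100):
    """disk_loads: dict of disk -> list of (slice_id, temperature). Returns planned moves."""
    moves = []
    for _ in range(max_moves):
        totals = {d: sum(t for _, t in s) for d, s in disk_loads.items()}
        hot = max(totals, key=totals.get)                  # busiest disk in the tier
        cold = min(totals, key=totals.get)                 # least busy disk in the tier
        gap = totals[hot] - totals[cold]
        # Only move a slice if doing so strictly narrows the gap between the two disks.
        candidates = [s for s in disk_loads[hot] if 0 < s[1] < gap]
        if not candidates:
            break
        hottest = max(candidates, key=lambda s: s[1])
        disk_loads[hot].remove(hottest)
        disk_loads[cold].append(hottest)
        moves.append((hottest[0], hot, cold))
    return moves

tier = {"disk0": [("a", 800.0), ("b", 700.0)], "disk1": [("c", 50.0)], "disk2": []}
print(balance_tier(tier))  # [('a', 'disk0', 'disk2')]
```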
The Tier Status section of the window shows FAST VP relocation information specific
to the pool selected. For each pool, the Auto-Tiering option can be set to either
Scheduled or Manual. Users can also access the array-wide relocation schedule using the Relocation Schedule button, which is discussed in the Automated scheduler section. Data Relocation Status displays the state of the pool with regard to FAST VP. The Move Down and Move Up figures represent the amount of
data that will be relocated in the next scheduled window, followed by the total
amount of time needed to complete the relocation. “Data to Move Within” displays
the total amount of data to be relocated within a tier.
The Tier Details section displays the data distribution per tier. This panel shows all
tiers of storage residing in the pool. Each tier then displays the free, allocated, and
total capacities; the amount of data to be moved down and up; the amount of data to
move within a tier; and RAID Configuration per tier.
Another RAID enhancement coming in this release is the option for more efficient
RAID configurations. Users have the following options in pools (new options noted
with an asterisk):
Table 2. RAID Configuration Options
RAID 5 (8+1) and RAID 6 (14+2) provide roughly 50% parity-capacity savings over the current options because of their higher data:parity ratios. The tradeoff for a higher data:parity ratio is a larger fault domain and potentially longer rebuild times. This is especially true for RAID 5, which has only a single parity drive. Users are advised to choose carefully between (4+1) and (8+1) according to whether robustness or efficiency is the higher priority.
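The arithmetic behind the capacity savings is straightforward, as the short sketch below shows. It assumes that (4+1) and (6+2) are the pre-existing options being compared against, which is an inference from the surrounding text rather than from the table itself.

```python
# Parity capacity consumed per unit of data capacity for each layout.
layouts = {
    "RAID 5 (4+1)":  (4, 1),
    "RAID 5 (8+1)":  (8, 1),
    "RAID 6 (6+2)":  (6, 2),
    "RAID 6 (14+2)": (14, 2),
}

for name, (data, parity) in layouts.items():
    overhead = parity / data              # parity drives needed per data drive
    usable = data / (data + parity)       # fraction of raw capacity usable for data
    print(f"{name}: {overhead:.3f} parity per data drive, {usable:.1%} usable")

# RAID 5 (8+1) needs 0.125 parity drives per data drive versus 0.250 for (4+1): half the
# parity capacity, at the cost of a wider fault domain and potentially longer rebuilds.
```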
Automated scheduler
The scheduler launched from the Pool Properties dialog box’s Relocation Schedule
button is shown in Figure 4. You can schedule relocations to occur automatically. EMC recommends either running shorter relocation windows at a high relocation rate (typically during off-peak hours) or running relocations at a low rate during production hours to minimize any potential performance impact that the relocations may cause.
The Data Relocation Schedule shown in Figure 4 initiates relocations every 24 hours
for a duration of eight hours. You can select the days on which the relocation schedule runs, as well as the start time and duration.
In this example, relocations run seven days a week, which is the default setting. From this status window, you can also control the data relocation rate. The default rate is Medium; the achievable rate depends on system type, array utilization, and other tasks competing for array resources, and high utilization may reduce it.
Manual Relocation
Unlike automatic scheduling, manual relocation is user-initiated. FAST VP performs
analysis on all statistics gathered, independent of its default hourly analysis and prior
to beginning the relocation.
Although the automatic scheduler is an array-wide setting, manual relocation is
enacted at the pool level only. Common situations when users may want to initiate a
manual relocation on a specific pool include:
• When new LUNs are added to the pool and the new priority structure needs to be realized immediately
• When adding a new tier to a pool
• As part of a script for a finer-grained relocation schedule (a minimal scripting sketch follows this list)
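For the scripting case in the last bullet, a wrapper as small as the one below is enough. START_RELOCATION_CMD is a placeholder for whatever CLI invocation your array documentation specifies for starting a manual relocation on a pool; it is not an actual FAST VP command.

```python
import subprocess
import sys

# Placeholder command: replace with the documented CLI call for starting a manual
# relocation on a pool. The "echo" stand-in only demonstrates the wrapper structure.
START_RELOCATION_CMD = ["echo", "start-manual-relocation", "--pool"]

def relocate_pool(pool_name: str) -> int:
    """Kick off a manual relocation for one pool and return the command's exit status."""
    result = subprocess.run(START_RELOCATION_CMD + [pool_name],
                            capture_output=True, text=True)
    print(result.stdout.strip())
    return result.returncode

if __name__ == "__main__":
    # Invoke from cron for a finer-grained schedule than the array-wide relocation window.
    sys.exit(relocate_pool("Pool_0"))
```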
The Tier Details section displays the current distribution of slices within the LUN. The
Tiering Policy section displays the available options for tiering policy.
Tiering policies
As its name implies, FAST VP is a completely automated feature that implements a set of user-defined policies to ensure it works to meet the data service levels required by the business. FAST VP tiering policies determine how new allocations and
ongoing relocations should apply to individual LUNs in a storage pool by using the
following options:
• Start High then Auto-tier (New default Policy)
• Highest available tier
• Auto-tier
• Lowest available tier
• No data movement
Auto-tier
FAST VP relocates slices of these LUNs based solely on their activity level, after all slices belonging to LUNs with the Highest or Lowest Available Tier policies have been relocated. LUNs set to Highest Available Tier take precedence over LUNs set to Auto-tier.
No data movement
No data movement may only be selected after a LUN has been created. Once this selection is made, FAST VP still relocates slices within a tier, but does not move data up or down from its current tier. Statistics are still collected on these slices for use if and when the tiering policy is changed.
The tiering policy chosen also affects the initial placement of a LUN’s slices within the
available tiers. Initial placement with the pool set to Auto-tier results in the data being distributed across all storage tiers available within the pool. The distribution is based on the capacity available in each tier.
When a pool consists of LUNs with stringent response time demands but relatively infrequent data access, it is not uncommon for users to set certain LUNs in the pool to Highest Available Tier. That way, the data is assured of remaining on the highest tier when it is subsequently accessed.
For example, an office may have a very important report that is accessed only once a week, such as every Monday morning, yet contains information that affects the productivity of the entire office. In this case, you want to ensure the highest available performance even though the data is not accessed frequently enough to be promoted.
Management
The process for implementing FAST VP for File begins by provisioning LUNs from a
storage pool with mixed tiers that are placed in the File Storage Group. Rescanning
the storage systems from the System tab in the Unisphere software starts a diskmark
operation that makes the LUNs available to VNX for File storage. The rescan
automatically creates a pool for file using the same name as the corresponding pool
for block. Additionally, it creates a disk volume in a 1:1 mapping for each LUN that
was added to the File Storage Group. A file system can then be created from the pool
for file on the disk volumes. The FAST VP policy that was applied to the LUNs
presented to the VNX for File will operate as it does for any other LUN in the system,
dynamically migrating data between storage tiers in the pool.
FAST Cache: Copies data from HDDs to Flash drives when it is accessed frequently.
FAST VP: Moves data between different storage tiers based on a weighted average of access statistics collected over a period of time.

FAST Cache: Adapts continuously to changes in workload.
FAST VP: Uses a relocation process to periodically make storage tiering adjustments. The default setting is one 8-hour relocation per day.

FAST Cache: Is designed primarily to improve performance.
FAST VP: While it can improve performance, it is primarily designed to improve ease of use and reduce TCO.
You can use the FAST Cache and the FAST VP sub-LUN tiering features together to
yield high performance and improved TCO for the storage system. As an example, in
scenarios where limited Flash drives are available, the Flash drives can be used to
create FAST Cache, and the FAST VP can be used on a two-tier, Performance and
Capacity pool. From a performance point of view, FAST Cache dynamically provides
performance benefits to any bursty data while FAST VP moves warmer data to
Performance drives and colder data to Capacity drives. From a TCO perspective, FAST Cache with a small number of Flash drives serves the most frequently accessed data, while FAST VP optimizes disk utilization and efficiency across the pool.
Conclusion
Through the use of FAST VP, users can remove complexity and management overhead
from their environments. FAST VP utilizes Flash, Performance, and Capacity drives (or
any combination thereof) within a single pool. LUNs within the pool can then leverage
the advantages of each drive type at the 1 GB slice granularity. This sub-LUN-level
tiering ensures that the most active dataset resides on the best-performing drive tier
available, while maintaining infrequently used data on lower-cost, high-capacity
drives.
Relocations can occur without user interaction on a predetermined schedule, making
FAST VP a truly automated offering. In the event that relocation is required on-
demand, you can invoke FAST VP relocation on an individual pool by using the
Unisphere software.
Both FAST VP and FAST Cache work by placing data segments on the most
appropriate storage tier based on their usage pattern. These two solutions are
complementary because they work on different granularity levels and time tables.
Implementing both FAST VP and FAST Cache can significantly improve performance
and reduce cost in the environment.
References
The following white papers are available on the EMC Online Support website:
• EMC Unified Storage Best Practices for Performance and Availability –
Common Platform and Block — Applied Best Practices
• EMC VNX Virtual Provisioning
• EMC Storage System Fundamentals for Performance and Availability