CR-2-496 - NetApp Flash Pool Deep Dive
How it works
Design considerations
NetApp VST Overview
NetApp VST Strategy
Flash Pool Overview
Flash Pool delivers the ability to add an SSD cache to an
existing HDD aggregate, which provides:
− Offload of expensive HDD operations into the SSD cache
to balance peaks in workloads
− Persistent cache across failover events, which allows the SSD
cache to be immediately available (no rewarming)
− Reduced HDD spindle count while achieving the same
performance at a lower total configuration cost
Flash Pool Overview
Flash Pool does not …
− Accelerate write operations –
Data ONTAP is already write optimized!
− Reduce or alleviate high CPU or
memory utilization
− Cache sequential (read or write) or large
block write (>16kB) operations
− Increase maximum IOPS or
throughput limits for a system
Data ONTAP Requirements
Software requirements
Data ONTAP 8.1.1 or later operating in 7-Mode or Cluster-Mode
HDD based 64-bit aggregate
Aggregate state must be healthy (can’t be FAILED,
LIMBO, offline or in a foreign state)
Can be a RAID-DP, RAID 4 or SyncMirror aggregate
Root aggregate support requires PVR (8.1.1 only)
No license is required – it’s free!
Platform Support Requirements
Supported platforms
− FAS2220, FAS2240-2 and FAS2240-4
− FAS/V3160, FAS/V3170, FAS/V3240, and FAS/V3270
− FAS/V6030, FAS/V6040, FAS/V6070, FAS/V6080,
FAS/V6210, FAS/V6240, and FAS/V6280
Storage Requirements
Supported disk shelves
DS14mk4, DS2246, DS424X and DS4486
− Recommended stack depth <= 4 shelves
V-Series must use only NetApp SSD/HDD
Supported drive types:
− FC (DS14mk4 FC only, no MetroCluster FC + Flash Pool)
Supported SSD:
− X441A-R5 100GB SLC SSD
How It Works
SSD Cache
All SSD data drives in the aggregate provide
cache capacity accessible by Flash Pool
− Individual read and write cache capacity is
variable based on the cache policies set and
actual cache usage pattern
Read Caching
1. The first read request goes to HDD –
block is brought into memory and
sent to the requestor.
2. Read is evicted from memory – if it
matches the insertion policy (random
read), it is inserted into the SSD cache in a CP.
3. Any additional requests for the same
block are serviced from the SSD cache
– the block is copied back into system
memory and sent to the requestor.
Block in HDD = actual block
Block in memory = copy of actual block
Block in SSD cache = copy of actual block
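The three read-caching steps can be sketched as a toy model. This is only an illustration, not Data ONTAP code: the class name, method names, and the assumption that a block matching the insertion policy is placed in the SSD cache when it leaves memory are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ReadPathSketch:
    """Toy model of the three read-caching steps (hypothetical, not Data ONTAP code)."""
    hdd: dict                                   # block id -> actual block
    memory: dict = field(default_factory=dict)  # copies held in system memory
    ssd: dict = field(default_factory=dict)     # copies held in the SSD cache

    def read(self, block_id):
        """Return (data, tier), where tier names what served the request."""
        if block_id in self.memory:
            return self.memory[block_id], "memory"
        if block_id in self.ssd:
            # Step 3: serviced from the SSD cache and copied back into memory.
            self.memory[block_id] = self.ssd[block_id]
            return self.memory[block_id], "ssd"
        # Step 1: first read goes to HDD and the block is brought into memory.
        self.memory[block_id] = self.hdd[block_id]
        return self.memory[block_id], "hdd"

    def evict_from_memory(self, block_id, matches_insertion_policy):
        """Step 2: on eviction from memory, keep a copy in SSD if the policy matches."""
        data = self.memory.pop(block_id)
        if matches_insertion_policy:
            self.ssd[block_id] = data
```

In every step the HDD keeps the actual block; memory and the SSD cache only ever hold copies.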
Overwrite Caching
1. First random write is sent to HDD
in a CP. All sequential writes are
sent to HDD.
2. An overwrite of the same random
block arrives in memory – if it
matches the insertion policy (small
block random overwrite), it is sent
to the SSD cache in a CP.
3. Actual block resides in SSD (block in
HDD is invalid) – will eventually be
de-staged to HDD when evicted
from the SSD cache.
Block in HDD = actual block
Block in memory = actual block
Block in SSD cache = actual block
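The write-side insertion rule can be sketched as a single decision function. This is a minimal sketch: the function name and parameters are hypothetical, and the 16kB cut-off is taken from the ">16kB" large-block limit stated elsewhere in this deck.

```python
SMALL_BLOCK_LIMIT_BYTES = 16 * 1024  # writes larger than 16kB are not cached


def write_destination(is_random, is_overwrite, size_bytes):
    """Return "ssd" or "hdd" for a write committed in a CP (illustrative only)."""
    if is_random and is_overwrite and size_bytes <= SMALL_BLOCK_LIMIT_BYTES:
        # Step 2: a small-block random overwrite goes to the SSD cache.
        return "ssd"
    # Step 1: first random writes, sequential writes, and large writes go to HDD.
    return "hdd"
```

For example, `write_destination(True, True, 4096)` returns `"ssd"`, while the first write of that same block (`is_overwrite=False`) returns `"hdd"`.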
Eviction Scanner
Purpose of the eviction scanner
Runs in order to evict cold blocks to make room for
new blocks that are being inserted - starting when:
− The cache is 75%+ used
− The trending usage is expected to exceed 75% used
within the next hour
Each scanner pass demotes a block, eventually to the
point that the block is evicted from cache
− It takes multiple scanner passes to result in an eviction,
especially if the block is being accessed between
scanner passes
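The two start conditions for the eviction scanner can be expressed as a small check. This is a sketch under assumptions: the function and parameter names are hypothetical, and the one-hour trend is modeled as a simple linear projection, which may not match how Data ONTAP actually estimates it.

```python
def scanner_should_run(used_fraction, growth_per_hour, threshold=0.75):
    """Return True if the eviction scanner would start (illustrative only).

    used_fraction   -- current cache usage, 0.0 to 1.0
    growth_per_hour -- assumed linear usage growth over the next hour
    """
    projected = used_fraction + growth_per_hour
    return used_fraction >= threshold or projected > threshold
```

A cache at 70% used that is growing by 10% per hour would trigger the scanner; a cache at 50% with the same growth would not.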
Read Cache Management
Hot → scanner → Warm → scanner → Neutral → scanner → Cold → scanner → Evict*
Write Cache Management
Neutral → scanner → Cold → scanner → Evict*
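The read and write temperature ladders can be sketched as ordered lists, with each scanner pass moving a block one step toward eviction. The names below are hypothetical, and promotion of a block when it is accessed between passes is not modeled.

```python
READ_LADDER = ["hot", "warm", "neutral", "cold", "evict"]
WRITE_LADDER = ["neutral", "cold", "evict"]


def demote(state, ladder):
    """Move a block one step down the ladder, as one scanner pass would."""
    i = ladder.index(state)
    return ladder[min(i + 1, len(ladder) - 1)]


def passes_until_evicted(state, ladder):
    """Scanner passes needed to evict a block that is never re-accessed."""
    return len(ladder) - 1 - ladder.index(state)
```

Under this model a hot read block survives four scanner passes, and a neutral write block survives two, if neither is touched in between.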
Cache Policies
Cache policies can be modified for each volume that
resides in the Flash Pool
The “priority” command is used to modify volume
cache policies
priority hybrid-cache set <vol_name> <read|write>-cache=<policy>
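As an illustration of the command template above, a small helper can fill in the placeholders. The helper name and the volume name `vol1` are hypothetical; the policy value is passed through unchecked because only `none` is listed in this deck.

```python
def hybrid_cache_command(vol_name, cache_type, policy):
    """Build a 'priority hybrid-cache set' command string from the template."""
    if cache_type not in ("read", "write"):
        raise ValueError("cache_type must be 'read' or 'write'")
    return f"priority hybrid-cache set {vol_name} {cache_type}-cache={policy}"
```

For example, `hybrid_cache_command("vol1", "read", "none")` yields `priority hybrid-cache set vol1 read-cache=none`, which disables read caching for that volume.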
Cache Policies
Read cache policies
none: disable read caching
Cache Policies
Write cache policies
none: disable write caching
Design Considerations
Flash Pool Capacity Limits
The sum of the usable capacity of all SSD data
drives in each Flash Pool on a system
counts toward the cache limit
SSD cache does not count towards the
maximum aggregate capacity but does count
towards the system's maximum spindle limit
Per node cache limit cannot be exceeded
− HA pair limits are listed to indicate the
remaining node in a failover can serve the full
cache capacity from both nodes
High-end Platform Limits (DOT 8.1.1)
Midrange/Entry Platform Limits (DOT 8.1.1)
*FAS/V3210, FAS/V3140 and FAS/V3070 are not supported with Flash Pool
Minimum SSDs
A recommended minimum number of SSDs should be
used in a Flash Pool to make sure the SSD does not
become a bottleneck
Flash Cache and Flash Pool
Flash Cache and Flash Pool can co-exist
on the same system
Any aggregate containing SSDs (Flash Pool or
homogeneous) will be excluded from Flash Cache
(including volumes with Flash Pool caching disabled)
Both products have unique attributes that should be
considered based on your workload requirements
The combined Flash Cache and Flash Pool (data
drives) capacity counts towards the maximum cache
capacity per node/controller
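The capacity accounting can be sketched as simple arithmetic. The function name and all numbers in the usage note are hypothetical; the real per-node limits come from the platform limit tables, which are not reproduced here.

```python
def cache_capacity_check(flash_pool_ssd_data_gib, flash_cache_gib, node_limit_gib):
    """Return (total_gib, within_limit) for one node (illustrative only).

    flash_pool_ssd_data_gib -- usable SSD data-drive capacity per Flash Pool
    flash_cache_gib         -- installed Flash Cache capacity on the node
    node_limit_gib          -- per-node cache limit for the platform (assumed input)
    """
    total = sum(flash_pool_ssd_data_gib) + flash_cache_gib
    return total, total <= node_limit_gib
```

With two Flash Pools of 400 and 600 GiB plus 512 GiB of Flash Cache, the total is 1512 GiB, which would fit under a hypothetical 2048 GiB limit but not under a 1024 GiB one.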
Flash Cache, Flash Pool or Both?
Snapshot Copies
Snapshots and read cache
− Read cache blocks are copies of the actual blocks that
are on the HDD
Volume snapshots only lock the block on HDD
Storage Efficiency
− Blocks that are deduplicated on HDD are cached as
deduplicated blocks
− Cloned blocks are cached
in the SSD cache
− Compressed blocks are not cached
− Blocks in compressed volumes that are …
POC Recommendations
Workload recommendations
− Write block size should be 16kB or smaller and contain a high
percentage of random writes
− Read block size can be any size and contain a high percentage
of random reads
System recommendations
− 20% CPU and memory headroom prior to deployment
− If the working set size fits completely in the cache you can look to
reduce spindle count or use slower drives
− If you are unsure or the working set size is larger than the cache
capacity, do not change media type or reduce spindle count more
than 30%, unless you are willing to experience longer response
times and lower throughput on cache misses
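The system recommendations can be sketched as two checks. This is a minimal sketch with hypothetical names; the deck gives no numeric cap on spindle reduction when the working set fits in the cache, so that case simply returns True.

```python
def has_headroom(cpu_util, mem_util, required_headroom=0.20):
    """True if both CPU and memory leave at least 20% headroom (fractions 0..1)."""
    ceiling = 1.0 - required_headroom
    return cpu_util <= ceiling and mem_util <= ceiling


def reduction_within_guidance(working_set_gib, cache_gib, proposed_reduction):
    """Check a proposed spindle-count reduction (fraction 0..1) against the guidance.

    Pass working_set_gib=None when the working set size is unknown.
    """
    if working_set_gib is not None and working_set_gib <= cache_gib:
        return True  # working set fits; no numeric cap is stated for this case
    return proposed_reduction <= 0.30
```

For a 300 GiB working set and a 200 GiB cache, a 50% spindle reduction falls outside the guidance, while 30% stays within it.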
Sizing and Analysis
Sizing Flash Pool Configurations
It is imperative that the working set size fits into the
available Flash Pool caching capacity
− Reductions in HDD drive count are based on the working
set fitting into the cache capacity
− If you overrun the cache capacity then you’ll become more
dependent on the HDD configuration
− While the cache is initially warming, the configuration will
be dependent on the HDD
Sizing Flash Pool Configurations
System Performance Modeler (SPM)
It is critical to understand the workload characteristics
− Read to write mix
− Random to sequential mix
− Transfer size
− Working set size
− Rate of change in the working set
Understanding these factors will aid in determining per
volume cache policies, cache size, and likelihood of
encountering small block random overwrites
Predictive Cache Statistics
PCS cannot be used if Flash Cache or Flash Pool already
exist on the system
Flash Pool caching estimation
− Assume +10% for the actual Flash Pool % replaced
− Set PCS capacity size to 75% of the actual SSD cache
(data drives) being implemented
− In workloads where there are heavy random overwrites
PCS is not as effective as it only accounts for reads,
therefore adjust the results down by 10-20%
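The PCS sizing arithmetic can be sketched as follows. The function names are hypothetical, and the 10-20% downward adjustment is interpreted here as a relative reduction of the PCS result, which is an assumption; it could also be read as percentage points.

```python
def pcs_cache_size(ssd_data_capacity_gib):
    """PCS capacity to configure: 75% of the planned SSD data-drive capacity."""
    return ssd_data_capacity_gib * 75 / 100


def overwrite_adjusted_range(pcs_result_pct):
    """Return (low, high) after reducing a PCS result by 20% and by 10%.

    Intended for workloads with heavy random overwrites, where PCS
    only accounts for reads.
    """
    return pcs_result_pct * 80 / 100, pcs_result_pct * 90 / 100
```

For a planned 800 GiB of SSD data drives, PCS would be set to 600 GiB; a PCS result of 50% would be read as roughly 40% to 45% for an overwrite-heavy workload.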
Monitoring Performance
A new preset for Flash Pool exists for the
“stats show” command
− “stats show -p hybrid_aggr”
Debug level command
Displays statistics specific to Flash Pool
Detailed information available in TR-4070
Key Takeaways
Size the SSD cache capacity to fit the
working set size
PB: Pure SSD and Flash Pool with DS424X disk shelf
© 2012 NetApp, Inc. All rights reserved. No portions of this document may be reproduced
without prior written consent of NetApp, Inc. Specifications are subject to change without notice.
NetApp, the NetApp logo, and Go further, faster, are trademarks or registered trademarks of
NetApp, Inc. in the United States and/or other countries. All other brands or products are
trademarks or registered trademarks of their respective holders and should be treated as such.
Additional Slides
Creating a Flash Pool
− The HDD 64-bit aggregate must already exist
− Two-step process:
1. Set the aggregate option for Flash Pool
– 7-Mode: aggr options <aggr_name> hybrid_enabled on
– Cluster-Mode: storage aggregate modify -aggregate <aggr_name> -hybrid-enabled true
2. Add SSDs into a new RAID group
SSD RAID groups cannot be removed once added (you
must destroy the aggregate to repurpose the drives)
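Step 1 of the process can be illustrated with a helper that fills in the aggregate name for either mode. The helper name and the aggregate name `aggr1` are hypothetical; the command text follows the two forms given above, and step 2 is not covered because its command is not shown in this deck.

```python
def enable_flash_pool_command(aggr_name, mode):
    """Return the step-1 command that marks an aggregate as Flash Pool capable."""
    if mode == "7-mode":
        return f"aggr options {aggr_name} hybrid_enabled on"
    if mode == "cluster-mode":
        return f"storage aggregate modify -aggregate {aggr_name} -hybrid-enabled true"
    raise ValueError("mode must be '7-mode' or 'cluster-mode'")
```

For example, `enable_flash_pool_command("aggr1", "7-mode")` yields `aggr options aggr1 hybrid_enabled on`.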
Expanding SSD RGs
A recommended minimum number of SSDs should be used to
expand any existing Flash Pool SSD RAID group(s)
System Family          Growth Increment (SSD Data Drives)
Entry-level Systems    1
Midrange Systems       3
High-end Systems       6
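The growth increments can be captured in a lookup to check a planned expansion. This is a sketch with hypothetical names; it treats the increment as the recommended minimum number of SSD data drives to add at one time.

```python
GROWTH_INCREMENT = {
    "entry-level": 1,
    "midrange": 3,
    "high-end": 6,
}


def meets_growth_increment(system_family, ssd_data_drives_added):
    """True if an expansion adds at least the recommended number of SSD data drives."""
    return ssd_data_drives_added >= GROWTH_INCREMENT[system_family]
```

Adding two SSD data drives would meet the recommendation on an entry-level system but not on a midrange or high-end system.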