VSAN Design and Sizing Guide
Table of Contents
Introduction
  1.1 VMware Virtual SAN
  1.2 Virtual SAN Datastore Characteristics and Sizing
    1.2.1 Disk Groups
    1.2.2 Virtual SAN Datastore
    1.2.3 Objects and Components
  1.3 Virtual SAN Datastore Sizing Considerations
    1.3.1 Number of Failures to Tolerate
  1.4 Design Considerations
    1.4.1 Multiple Disk Groups
    1.4.2 Flash Capacity Sizing
    1.4.3 Memory and CPU
    1.4.4 Network
    1.4.5 Installation Media
  1.5 Size-Calculating Formulas
    1.5.1 Cluster Capacity
    1.5.2 Objects
    1.5.3 Components
    1.5.4 Swap
    1.5.5 Usable Capacity
Conclusion
Acknowledgments
About the Author
Introduction
1.1 VMware Virtual SAN
VMware® Virtual SAN™ is a new hypervisor-converged, software-defined storage platform that is fully integrated
with VMware vSphere®. Virtual SAN aggregates locally attached disks of hosts that are members of a vSphere
cluster to create a distributed shared storage solution. Virtual SAN enables the rapid provisioning of storage
within VMware vCenter™ as part of virtual machine creation and deployment operations.
Virtual SAN is a hybrid disk system that leverages flash-based devices to provide optimal performance and
magnetic disks to provide capacity and persistent data storage. This combination delivers enterprise-level
performance and a resilient storage platform.
The distributed datastore of Virtual SAN is an object-store file system that leverages the vSphere Storage Policy
Based Management (SPBM) framework to deliver application-centric storage services and capabilities that are
centrally managed through vSphere virtual machine storage policies.
This document focuses on the definitions, sizing guidelines, and characteristics of the Virtual SAN
distributed datastore.
DEVICE TYPE                           MINIMUM PER DISK GROUP    MAXIMUM PER DISK GROUP
Flash devices (SAS, SATA, PCIe SSD)   One                       One
Magnetic disk devices (HDD)           One                       Seven
Components
In Virtual SAN, objects comprise components that are distributed across hosts in a vSphere cluster. These
components are stored in distinct combinations of disk groups within the Virtual SAN distributed datastore.
Components are transparently assigned caching and buffering capacity from flash-based devices, with their
data “at rest” on the magnetic disks. Virtual SAN 5.5 currently supports a maximum of 3,000 components
per host.
Objects greater than 255GB in capacity are automatically divided into multiple components. In addition, if the
number-of-disk-stripes-per-object capability is increased beyond the default value of one, each stripe is a
separate component. For every component created in Virtual SAN, an additional 2MB of disk capacity is
consumed for metadata.
Witness components—those that contain only object metadata—are part of every storage object. A witness
serves as a tiebreaker, to avoid split-brain behavior when availability decisions are made in the Virtual SAN
cluster. Each Virtual SAN witness component also consumes 2MB of capacity.
NOTE: Virtual SAN implements a system default policy with a number of failures to tolerate equal to 1 on all
virtual machine objects deployed on the Virtual SAN shared datastore.
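For illustration, the component arithmetic above can be expressed as a minimal Python sketch. It assumes only the rules stated in this section: components of at most 255GB, one component per stripe per replica, 2MB of metadata per component, and, as a simplification, a single witness per object (Virtual SAN may place additional witnesses depending on cluster layout). The function name and defaults are illustrative, not part of any VMware tooling.

import math

COMPONENT_MAX_GB = 255          # objects larger than this split into multiple components
METADATA_MB_PER_COMPONENT = 2   # each component, including witnesses, consumes 2MB

def estimate_object_components(size_gb, stripes=1, ftt=1, witnesses=1):
    """Rough per-object component and metadata estimate.

    witnesses=1 is a simplifying assumption; Virtual SAN may create
    additional witnesses depending on placement decisions.
    """
    splits = max(1, math.ceil(size_gb / COMPONENT_MAX_GB))
    replicas = ftt + 1                                 # FTT=1 means two replicas
    components = splits * stripes * replicas + witnesses
    metadata_gb = components * METADATA_MB_PER_COMPONENT / 1024
    return components, metadata_gb

# A 300GB object with the default policy (one stripe, FTT=1):
# 2 splits x 1 stripe x 2 replicas + 1 witness = 5 components, ~10MB of metadata
print(estimate_object_components(300))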
The general recommendation for sizing flash capacity for Virtual SAN is to use 10 percent of the anticipated
consumed storage capacity before the number of failures to tolerate is considered. For example, a user plans to
provision 1,000 virtual machines, each with 100GB of logical address space, thin provisioned. However, they
anticipate that over time, the consumed storage capacity per virtual machine will be an average of 20GB.
Table 3 shows a simple sizing scenario based on the general recommendation for flash capacity.
MEASUREMENT REQUIREMENTS                                      VALUES
Virtual machines to be provisioned                            1,000
Logical address space per virtual machine (thin provisioned)  100GB
Anticipated consumed capacity per virtual machine             20GB
Anticipated consumed capacity before replication              20TB
Flash capacity (10 percent of anticipated consumption)        2TB
So, in aggregate, the anticipated consumed storage, before replication, is 1,000 x 20GB = 20TB. If the virtual
machine’s availability factor is defined to support number of failures to tolerate equals 1 (FTT=1), this
configuration results in creating two replicas for each virtual machine—that is, a little more than 40TB of
consumed capacity, including replicated data. However, the flash sizing for this case is 10 percent x 20TB = 2TB
of aggregate flash capacity in the cluster where the virtual machines are provisioned.
The optimal value of the target flash capacity percentage is based upon actual workload characteristics, such as
the size of the working set of the data on disk. Ten percent is a general guideline to use as the initial basis for
further refinement.
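The same arithmetic can be captured in a short Python sketch; the 10 percent ratio is the general guideline above, exposed as a parameter so it can be refined against actual working-set measurements (the function name is illustrative):

def flash_capacity_gb(vm_count, consumed_gb_per_vm, flash_ratio=0.10):
    """Flash sizing guideline: a percentage of anticipated consumed
    capacity, before the number of failures to tolerate is considered."""
    return vm_count * consumed_gb_per_vm * flash_ratio

# The scenario above: 1,000 VMs, 20GB anticipated consumption each
print(flash_capacity_gb(1000, 20))  # 2000.0GB, i.e. 2TB of flash in the cluster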
1.4.3 Memory and CPU
The memory requirements for Virtual SAN are defined based on the number of disk groups and disks that are
managed by a hypervisor. Virtual SAN currently supports a maximum of five disk groups per host and a
maximum of eight disk devices per disk group: one flash-based device and seven magnetic disks.
vSphere hosts with at least 32GB of RAM can support the maximum disk group and disk configuration
supported in Virtual SAN. Because of the memory overhead requirement in Virtual SAN, consider memory
configurations greater than 32GB per host to take full advantage of the storage capacity and scalability
capabilities of Virtual SAN.
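As a quick illustration of these limits, a sketch using the maximums stated above:

MAX_DISK_GROUPS_PER_HOST = 5
FLASH_DEVICES_PER_GROUP = 1
MAGNETIC_DISKS_PER_GROUP = 7

# At the maximum supported configuration, a host manages 40 devices:
# 5 flash devices plus 35 magnetic disks
devices = MAX_DISK_GROUPS_PER_HOST * (FLASH_DEVICES_PER_GROUP + MAGNETIC_DISKS_PER_GROUP)
print(devices)  # 40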
Virtual SAN is designed to introduce no more than 10 percent of CPU overhead per host. Consider this fact in
Virtual SAN implementations with high consolidation ratios and CPU-intensive application requirements.
1.4.4 Network
Virtual SAN provides support for both vSphere standard switch and VMware vSphere Distributed Switch™, with
either 1GbE or 10GbE network uplinks. Although both vSphere switch types and network speeds work with
Virtual SAN, VMware recommends the use of the vSphere Distributed Switch with 10GbE network uplinks.
These recommendations are made because of the possible replication and synchronization activities that
Virtual SAN might impose on the network based on the number of virtual machines hosted in the system and
the number of active operations.
Virtual SAN network activity can potentially saturate and overwhelm the capacity of a 1GbE network,
particularly during rebuild and synchronization operations.
Whenever possible, consider the use of the vSphere Distributed Switch in combination with VMware vSphere
Network I/O Control to share the 10GbE interfaces. Separate the various traffic types—management,
VMware vSphere vMotion®, virtual machine, Virtual SAN—onto different VLANs and use shares as a quality of
service (QoS) mechanism to sustain the level of performance expected during possible contention scenarios.
Figure 4. vSphere Distributed Switch with vSphere Network I/O Control Configuration
For the best security and performance, use the same approach recommended for the vSphere vMotion network
and isolate Virtual SAN network traffic to its own layer 2 network segment. Virtual SAN requires that IP multicast
be enabled on the layer 2 physical network segment utilized for Virtual SAN intracluster communication. Layer 2
multicast traffic can be limited to specific port groups by using IGMP snooping. As a best practice, VMware
recommends against flooding multicast traffic across all ports. Virtual SAN does not require layer 3
multicast for any of its network communication requirements.
Teamed network adapter configurations are supported in Virtual SAN as an availability and redundancy
measure; Virtual SAN does not leverage teaming of network adapters for the purpose of bandwidth
aggregation. For a predictable level of performance, VMware recommends the use of multiple network
adapters in active–passive mode with an explicit failover order when using the Route Based on Originating
Virtual Port ID load-balancing mechanism. Active–active configurations are recommended when using
physical network adapters connecting
to Link Aggregation Control Protocol (LACP) port channels and using the following load-balancing algorithms:
• Route based on IP hash
• Route based on physical network adapter load
1.5.4 Swap
A certain amount of raw capacity will be consumed by virtual machine swap space. Virtual SAN always stores
swap space with two replicas, regardless of the Failures to Tolerate setting:
• Formula: ClusterCapacity – (VMs x vmSwp x 2)
• Example: 1,120,000GB – (800 x 10GB x 2) = 1,120,000 – 16,000 = 1,104,000GB Disk Capacity
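Expressed as a Python sketch using the same numbers (the function name is illustrative):

def capacity_after_swap_gb(cluster_capacity_gb, vm_count, swap_gb_per_vm):
    """Swap objects always carry two replicas, regardless of FTT."""
    return cluster_capacity_gb - vm_count * swap_gb_per_vm * 2

print(capacity_after_swap_gb(1_120_000, 800, 10))  # 1,104,000GB of disk capacity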
1.5.5 Usable Capacity
Virtual SAN usable capacity is the amount of capacity that can be used to store the VMDK files of all virtual
machines. It is determined by subtracting the Virtual SAN overhead from the disk capacity and then dividing the
remaining amount by the number of failures to tolerate plus 1:
• Formula: (DiskCapacity – DskGrp x DskPerDskGrp x Hst x VSANoverhead)/(ftt+1)
• Example: (1,104,000GB – 280GB)/(ftt+1) = 1,103,720GB/(2) = 551,860GB Usable Capacity
NOTE: As a general guideline, 1GB of storage capacity per disk is calculated as the combined
Virtual SAN component and VMFS metadata overhead (VSANoverhead).
So, of the approximately 1,120TB of raw capacity, users can create VMDKs that in total consume as much as 551TB.
The remainder is consumed primarily by replicas created for availability and virtual machine swap space. In this
case, for 800 virtual machines with a single virtual disk, each VMDK can be as large as 689GB.
In practice, no more than 80 percent of this capacity should be allocated to virtual machines, to allow for other
factors such as snapshots and working space. In addition, the total number of components, which depends on a
variety of factors, must remain within the limit of 3,000 per host. In this case, we have approximately 900
components per host, but an increase in the number of disks per virtual machine, stripes per object, or
snapshots will contribute to a higher component count.
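A minimal Python sketch that chains the calculations of sections 1.5.4 and 1.5.5 through the per-VMDK and 80 percent checks above (the helper names are illustrative, not part of any VMware tooling):

def usable_capacity_gb(disk_capacity_gb, disk_groups, disks_per_group,
                       hosts, ftt, overhead_gb_per_disk=1):
    """Subtract ~1GB of VSAN/VMFS overhead per disk, then divide by
    the number of replicas (ftt + 1)."""
    overhead = disk_groups * disks_per_group * hosts * overhead_gb_per_disk
    return (disk_capacity_gb - overhead) / (ftt + 1)

usable = usable_capacity_gb(1_104_000, disk_groups=5, disks_per_group=7,
                            hosts=8, ftt=1)
print(usable)              # 551,860GB of usable capacity
print(usable / 800)        # ~689GB maximum per VMDK for 800 single-disk VMs
print(usable * 0.8 / 800)  # ~551GB per VMDK at the 80 percent allocation guideline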
The following graphs illustrate the results of these calculations. They show the raw capacity split into three major
contributions. The virtual machine swap consumption is in blue, the space used for replicas is in orange, and the
space available to allocate for virtual disks is in green. The entire overhead consumed by Virtual SAN—VMFS
metadata and component metadata—is too small to be seen in the graph and can be considered negligible for
most calculations. The lower graph also shows in red the capacity that is lost after the failure of a single host in
this eight-host cluster.
[Graphs: raw capacity, 0 to 1,120,000GB, broken down into VM swap files, replicas, available space, failed-host capacity, and VSAN metadata]
Conclusion
VMware Virtual SAN is a hypervisor-converged platform that delivers a shared datastore by combining compute
and storage resources of VMware vSphere hosts in a vSphere cluster while providing a much simpler storage
management experience for the user. It is a storage solution designed by VMware to make software-defined
storage a reality for its customers. Because certain factors must be taken into account when sizing and
designing a Virtual SAN cluster, this paper has presented much of what must be considered to successfully
deploy a Virtual SAN configuration.
Acknowledgments
I would like to thank Jorge Guerra, Christian Dickmann, and Christos Karamanolis of VMware R&D, whose deep
knowledge and understanding of Virtual SAN were leveraged throughout this paper. I would also like to thank
Charu Chaubal, group manager of the Storage and Availability Technical Marketing team; Kiran Madnani, senior
product line manager of storage technologies products; and Wade Holmes, senior technical marketing architect
within the Storage and Availability Technical Marketing team, for their contributions and for reviewing this paper.