Day 6 HA - DRS
[Figure labels: NIC Teaming, Storage Multipathing. Caption fragment: "... called a cluster."]
vSphere HA Architecture: Network Heartbeats
[Figure: vCenter Server, Management Network 1, Management Network 2, and VMFS, VMFS, and NAS/NFS datastores]
Additional vSphere HA Failure Scenarios
o Slave host failure
o Master host failure
o Host isolation
o Virtual machine storage failure:
• Virtual Machine Component Protection
o All Paths Down
o Permanent Device Loss
o Network failures and isolation
Failed Slave Host
• When a slave host does not respond to the network heartbeat
issued by the master host, the master vSphere HA agent tries
to identify the cause.
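As a rough illustration of how the master could narrow down the cause, the sketch below classifies a slave host from two observations: whether network heartbeats arrive, and whether the host is still updating its datastore heartbeat. The names HostState and classify_slave and the exact decision order are assumptions made for this example; this is not the vSphere HA agent's actual code.

```python
from enum import Enum


class HostState(Enum):
    LIVE = "live"
    ISOLATED_OR_PARTITIONED = "isolated or partitioned"
    FAILED = "failed"


def classify_slave(network_heartbeat: bool, datastore_heartbeat: bool) -> HostState:
    """Illustrative decision order only (assumed, not the real agent logic)."""
    if network_heartbeat:
        return HostState.LIVE
    if datastore_heartbeat:
        # The host is still running and touching shared storage, but it
        # cannot be reached over the management network.
        return HostState.ISOLATED_OR_PARTITIONED
    # No network heartbeat and no datastore heartbeat: treat as failed.
    return HostState.FAILED
```

Only the last case would lead to restarting the host's virtual machines elsewhere in this simplified model.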
[Figure: datastore heartbeats. NAS/NFS (Lock File) and VMFS (Heartbeat Region), each with File Locks]
[Figure: VMCP on ESXi hosts. Callouts: Runs on cluster enabled for vSphere HA. VMCP detects and responds to failures. Application availability and remediation.]
Configuring vSphere HA
About Clusters
• A cluster is a collection of ESXi hosts and their associated virtual machines, configured to share their resources.
• vCenter Server manages cluster resources like a single pool of resources.
• Components such as vSphere HA and VMware vSphere® Distributed Resource Scheduler™ are configured on a cluster.
[Figure: Cluster]
vSphere HA Prerequisites
o All hosts must be licensed for vSphere HA.
o A cluster must contain at least two hosts.
o All hosts must be configured with static IP addresses. If you are using DHCP, you must ensure that the address for each
host persists across reboots.
o All hosts must have at least one management network in common.
o All hosts must have access to the same virtual machine networks and datastores.
o For Virtual Machine Monitoring to work, VMware Tools™ must be installed.
o Only vSphere HA clusters that contain ESXi 6 hosts can be used to enable VMCP.
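The first five prerequisites can be expressed as a simple pre-flight check. The sketch below is a plain-Python model, not a vSphere API call: the Host dataclass, its field names, and ha_prerequisite_problems are hypothetical names chosen for this example, and the inventory data would have to come from your own tooling.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    ha_licensed: bool
    stable_ip: bool  # static IP, or a DHCP address that persists across reboots
    mgmt_networks: set = field(default_factory=set)
    vm_networks: set = field(default_factory=set)
    datastores: set = field(default_factory=set)


def ha_prerequisite_problems(hosts):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if len(hosts) < 2:
        problems.append("cluster needs at least two hosts")
    for h in hosts:
        if not h.ha_licensed:
            problems.append(f"{h.name}: not licensed for vSphere HA")
        if not h.stable_ip:
            problems.append(f"{h.name}: IP address may change across reboots")
    if hosts:
        # At least one management network must be shared by every host.
        if not set.intersection(*(h.mgmt_networks for h in hosts)):
            problems.append("no management network common to all hosts")
        first = hosts[0]
        for h in hosts[1:]:
            if h.vm_networks != first.vm_networks:
                problems.append(f"{h.name}: VM networks differ from {first.name}")
            if h.datastores != first.datastores:
                problems.append(f"{h.name}: datastores differ from {first.name}")
    return problems
```

The VMware Tools and VMCP prerequisites are left out because they depend on per-VM and per-version data that this model does not carry.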
Configuring vSphere HA Settings
• When you create a vSphere HA cluster or configure a cluster, you must configure
settings that determine how the feature works.
Permanent Device Loss and All Paths Down Overview
• vSphere HA uses VMCP to move virtual machines in Permanent Device Loss and
All Paths Down situations to other fully connected hosts.
• Permanent Device Loss:
o The datastore appears as unavailable in the Storage view.
o A storage adapter indicates the operational state as loss of communication.
o All paths to the device are marked as dead.
• Before changing the networking settings on an ESXi host (adding port groups,
removing virtual switches, and so on), you must suspend the Host Monitoring feature
and place the host in maintenance mode.
• This practice prevents unwanted attempts to fail over virtual machines.
Cluster Resource Reservation
• The Resource Reservation tab reports total cluster CPU, memory, memory
overhead, storage capacity, the capacity reserved by virtual machines, and how
much capacity is still available.
Monitoring Cluster Status
• You can monitor the status of a vSphere HA cluster on the Monitor tab.
Introduction to vSphere Fault Tolerance
vSphere Fault Tolerance
• vSphere Fault Tolerance provides instantaneous failover and continuous availability:
o Zero downtime
o Zero data loss
o No loss of TCP connections
[Figure: Instantaneous Failover and Fast Checkpointing between ESXi hosts]
vSphere Fault Tolerance Features
• vSphere Fault Tolerance protects mission-critical, high-performance applications
regardless of the operating system used.
• vSphere Fault Tolerance:
o Supports up to four virtual CPUs
o Supports up to 64 GB of memory
o Supports VMware vSphere® vMotion® for primary and secondary virtual machines
o Creates a secondary copy of all virtual machine files, including disks
o Provides fast checkpoint copying to keep primary and secondary CPUs synchronized
o Supports thin-provisioned disks
o Supports memory virtualization hardware assist
o Supports Enhanced vMotion Compatibility clusters
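The two numeric limits in the list above lend themselves to a quick sizing check before turning on vSphere Fault Tolerance. This is a minimal sketch; the function name and constants are made up for this example, and the limits are simply the ones quoted in this list.

```python
MAX_FT_VCPUS = 4        # limit quoted in the feature list
MAX_FT_MEMORY_GB = 64   # limit quoted in the feature list


def ft_limit_problems(vcpus: int, memory_gb: float) -> list:
    """Return the sizing limits a virtual machine exceeds (empty list if none)."""
    problems = []
    if vcpus > MAX_FT_VCPUS:
        problems.append(f"{vcpus} vCPUs exceeds the limit of {MAX_FT_VCPUS}")
    if memory_gb > MAX_FT_MEMORY_GB:
        problems.append(f"{memory_gb} GB exceeds the limit of {MAX_FT_MEMORY_GB} GB")
    return problems
```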
How vSphere Fault Tolerance Works with vSphere HA and vSphere DRS
• vSphere Fault Tolerance works with vSphere HA and vSphere DRS.
• vSphere HA:
o Is required for vSphere Fault Tolerance
o Restarts failed virtual machines
o Is vSphere Fault Tolerance aware
• vSphere DRS:
o Selects the virtual machine’s location at power-on
o Does not load-balance fault-tolerant virtual machines
[Figure: Primary and Secondary virtual machines, each with its own .vmx file and vmdk files, on Datastore 1 and Datastore 2]
vSphere vMotion: Precopy
• During a vSphere vMotion migration, a second virtual machine is created on the
destination host. Then the memory of the source virtual machine is copied to the
destination.
[Figure: VM A on the source and destination hosts, Memory Bitmap, Virtual Machine End User]
vSphere vMotion: Memory Checkpoint
• In vSphere vMotion migration, checkpoint data is the last bit of memory that keeps
changing.
[Figure: VM A on the source and destination hosts, Memory Bitmap, Virtual Machine End User]
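The precopy and checkpoint phases can be pictured with a small simulation. The sketch below is a simplified model and not VMware's implementation: precopy_migrate, its parameters, and the stop threshold are assumptions for illustration. A set of page indexes plays the role of the memory bitmap, and guest writes that arrive during a copy pass mark pages dirty again.

```python
def precopy_migrate(source, writes_per_round, stop_threshold=2, max_rounds=10):
    """Copy `source` pages to a destination list while the guest keeps writing.

    source: list of page contents on the source host (mutated by writes).
    writes_per_round: list of {page_index: new_value} dicts, one per copy
        pass, standing in for guest writes that happen during that pass.
    Returns (destination_pages, rounds_used, checkpoint_size).
    """
    dest = [None] * len(source)
    dirty = set(range(len(source)))      # bitmap: every page starts dirty
    rounds = 0
    for writes in writes_per_round[:max_rounds]:
        if len(dirty) <= stop_threshold:
            break                        # few enough pages left to checkpoint
        to_copy, dirty = dirty, set()
        for i in to_copy:
            dest[i] = source[i]
        # Guest writes during this pass mark pages dirty again.
        for i, value in writes.items():
            source[i] = value
            dirty.add(i)
        rounds += 1
    # Final step: the guest is paused, so the remaining dirty pages
    # (the checkpoint data) are copied without changing again.
    checkpoint = len(dirty)
    for i in dirty:
        dest[i] = source[i]
    return dest, rounds, checkpoint
```

Each pass copies less than the one before as long as the guest dirties pages more slowly than they are copied, which is why the final checkpoint is small.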
Shared Files
• vSphere Fault Tolerance has shared files:
o shared.vmft prevents UUID change.
o .ftgeneration is for the split-brain condition.
shared.vmft File
• The shared.vmft file, which is found on a shared datastore, is the vSphere Fault
Tolerance metadata file and contains the primary and secondary instance UUIDs and
the primary and secondary vmx paths.
[Figure: instance UUIDs UUID-1 and UUID-2; the VM guest OS references UUID-1]
Enabling vSphere Fault Tolerance on a Virtual Machine
• You can turn on vSphere Fault Tolerance for a virtual machine through the VMware vSphere® Web Client.
vSphere Distributed Resource Scheduler
vSphere DRS Cluster Prerequisites
• vSphere DRS works best when the virtual machines meet VMware vSphere®
vMotion® migration requirements.
• To use vSphere DRS for load balancing, the hosts in the cluster must be part of a
vSphere vMotion migration network.
o If not, vSphere DRS can still make initial placement recommendations.
• A VM-Host affinity rule:
o Specifies an affinity relationship between a virtual machine DRS group and a host DRS group
o Other options: Must run on hosts in group, Must Not run on hosts in group, Should Not run on hosts in group
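To make the difference between the rule options concrete, the sketch below evaluates a proposed placement against one VM-Host rule. It is a plain-Python model, not a DRS API: RuleType and evaluate are hypothetical names, and "Should run on hosts in group" is included as the fourth option alongside the three listed above. "Must" rules are treated as hard constraints and "Should" rules as preferences.

```python
from enum import Enum


class RuleType(Enum):
    MUST_RUN_ON = "Must run on hosts in group"
    SHOULD_RUN_ON = "Should run on hosts in group"
    MUST_NOT_RUN_ON = "Must Not run on hosts in group"
    SHOULD_NOT_RUN_ON = "Should Not run on hosts in group"


def evaluate(rule: RuleType, host: str, host_group: set) -> str:
    """Return 'ok', 'violation' (hard rule broken), or 'discouraged' (soft rule broken)."""
    in_group = host in host_group
    wants_in_group = rule in (RuleType.MUST_RUN_ON, RuleType.SHOULD_RUN_ON)
    if in_group == wants_in_group:
        return "ok"
    hard = rule in (RuleType.MUST_RUN_ON, RuleType.MUST_NOT_RUN_ON)
    return "violation" if hard else "discouraged"
```

For example, with a host group {"esxi01", "esxi02"}, placing a VM on "esxi03" breaks a "Must run on" rule but is only discouraged under "Should run on".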
vSphere DRS Cluster Settings: Automation at the Virtual Machine Level
• You can customize the automation level for individual virtual machines in a cluster
to override the automation level set on the entire cluster.
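The override behaves like a lookup with a default. In the minimal sketch below, effective_level, the VM names, and the level strings are placeholders chosen for illustration, not values read from vCenter Server.

```python
def effective_level(vm_name: str, cluster_level: str, overrides: dict) -> str:
    """Per-VM setting wins; otherwise the VM inherits the cluster-wide level."""
    return overrides.get(vm_name, cluster_level)


# Hypothetical example data.
cluster_level = "fully automated"
overrides = {"db01": "manual"}

levels = {vm: effective_level(vm, cluster_level, overrides)
          for vm in ("db01", "web01")}
```

Here db01 uses its own setting while web01 follows the cluster.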
Viewing vSphere DRS Cluster Information
• The cluster Summary tab provides information specific to vSphere DRS.
• Clicking the vSphere DRS link on the Monitor tab displays CPU and memory
utilization per host.
Viewing vSphere DRS Recommendations
• The DRS tab displays information about the vSphere DRS recommendations made
for the cluster, the faults that occurred in applying such recommendations, and the
history of vSphere DRS actions.
[Figure callouts: Refresh recommendations. Apply a subset of recommendations. Apply all recommendations.]
Monitoring Cluster Status
• View the inventory hierarchy for the cluster state.
• You can view the cluster’s Tasks and Events tabs for more information.
Maintenance Mode and Standby Mode
• To service a host in a cluster (for example, to install more memory) or to remove a host from a cluster, you must place the host in maintenance mode:
o Virtual machines on the host should be migrated to another host or shut down.
o You cannot power on virtual machines or migrate virtual machines to a host entering maintenance mode.
o While in maintenance mode, the host does not allow you to deploy or power on a virtual machine.
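The maintenance mode rules above amount to a small state machine. The sketch below is a toy model with made-up class, method, and state names, not the vSphere API. It assumes a host stays in an "entering maintenance" state until its last running virtual machine has been migrated away or shut down.

```python
class Host:
    def __init__(self, name):
        self.name = name
        self.state = "connected"  # or "entering_maintenance", "maintenance"
        self.running_vms = set()

    def power_on(self, vm):
        # Also stands in for migrating a VM onto this host.
        if self.state != "connected":
            raise RuntimeError(f"{self.name} is {self.state}: cannot place {vm}")
        self.running_vms.add(vm)

    def enter_maintenance(self):
        self.state = "entering_maintenance"
        self._settle()

    def remove_vm(self, vm):
        # The VM was migrated to another host or shut down.
        self.running_vms.discard(vm)
        self._settle()

    def _settle(self):
        # The transition completes only when no VMs are left running.
        if self.state == "entering_maintenance" and not self.running_vms:
            self.state = "maintenance"
```

A host with running virtual machines therefore waits in the intermediate state until the last one has been evacuated.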
[Figure callout: Use network traffic shaping.]