- Properly configure BIOS and management settings to support DPM
- Test DPM to verify proper configuration
- Configure appropriate DPM Threshold to meet business requirements
- Configure EVC using appropriate baseline
- Change the EVC mode on an existing DRS cluster
- Create DRS and DPM alarms
- Configure applicable power management settings for ESXi hosts
- Properly size virtual machines and clusters for optimal DRS efficiency
- Properly apply virtual machine automation levels based upon application requirements
- Create and administer ESXi host and Datastore Clusters
- Administer DRS / Storage DRS
Properly configure BIOS and management settings to support DPM
vSphere Resource Management Guide, Chapter 10 “Using DRS Clusters to Manage Resources”, Section “Managing Power Resources”, page 67.
Some background on this subject.
The Distributed Power Management (DPM) feature allows a DRS cluster to reduce its power consumption by powering hosts on and off based on cluster resource utilization.
DPM can use one of three power management protocols to bring a host out of standby mode:
- Intelligent Platform Management Interface (IPMI)
- Hewlett-Packard Integrated Lights-Out (iLO)
- Wake-On-LAN (WOL)
If a host supports multiple protocols, they are used in the order presented above.
If a host does not support any of these protocols it cannot be put into standby mode by vSphere DPM.
Each protocol requires its own hardware support and configuration, hence BIOS and Management Settings will vary depending on the hardware (vendor).
Note: DPM is complementary to host power management policies (See Objective 3.1, Section on Tune ESXi host CPU configuration). Using DPM and host power management together can offer greater power savings than when either solution is used alone.
Example: configuring a Dell R710 server with an iDRAC (Dell's remote access solution) for DPM. The R710 also contains a BMC, which is required as well.
The iDRAC supports IPMI, but out-of-the-box, this feature is disabled.
So, log on to the iDRAC, go to “iDRAC settings”, section “Network Security” and enable IPMI Over LAN.
And while we are logged in, also create a user account. Go to the “Users” section and create a user. Make sure you grant enough privileges, in this case, Operator will do.
If you are not sure, read the documentation or do some trial and error, starting with the lowest level.
The remaining configuration steps take place in vCenter and are described in great detail for IPMI/iLO and WOL configuration.
For IPMI/iLO follow these steps:
- The following steps need to be performed on each host that is part of your DRS Cluster.
- In vCenter, select a Host, go to Configuration, Software and Power Management.
- Provide the Username, Password, IP address and MAC address of the BMC.
Configuration for WOL has a few prerequisites:
- Each host’s vMotion networking link must be working correctly.
- The vMotion network should also be a single IP subnet, not multiple subnets separated by routers.
- The vMotion NIC on each host must support WOL.
- To check for WOL support, first determine the name of the physical network adapter corresponding to the VMkernel port by selecting the host in the inventory panel of the vSphere Client, selecting the Configuration tab, and clicking Networking.
- After you have this information, click on Network Adapters and find the entry corresponding to the network adapter.
- The Wake On LAN Supported column for the relevant adapter should show Yes.
- The switch port that each WOL-supporting vMotion NIC is plugged into should be set to auto negotiate the link speed, and not set to a fixed speed (for example, 1000 Mb/s). Many NICs support WOL only if they can switch to 100 Mb/s or less when the host is powered off.
Note: My fellow Network Admins do not like to configure a Switchport as Auto Negotiate (Everything that goes automatically, automatically goes wrong…)
The final step is to enable DPM on the Cluster level.
- The Performance Best Practices for VMware vSphere 5.0 has a section on DPM
- VMware Whitepaper “VMware Distributed Power Management Concepts and Use”, although based on vSphere 4.x
- For troubleshooting DPM, read KB 2001651 “Failure to enter Standby mode on a vSphere ESX/ESXi host”
Test DPM to verify proper configuration
vSphere Resource Management Guide, Chapter 10 “Using DRS Clusters to Manage Resources”, Section “Test Wake-on-LAN for vSphere DPM”, page 68.
It is a good idea to test the functionality, not only while configuring Wake-on-LAN for DPM, but also while configuring IPMI or iLO. The idea is simple.
- Put a host in Standby, by selecting Enter Standby Mode.
The host should now power down.
- Try to get the host out of Standby, by selecting Power On.
If a host fails the procedure, disable the host in the Cluster Settings.
In this example host ml110g6 succeeded and ml110g5 failed and is disabled for DPM.
Configure appropriate DPM Threshold to meet business requirements
vSphere Resource Management Guide, Chapter 10 “Using DRS Clusters to Manage Resources”, Section “Enabling vSphere DPM for a DRS Cluster”, page 69.
After enabling DPM on the Cluster level, you must first choose the Automation level:
- Off, feature is disabled;
- Manual, recommendations are made, but not executed
- Automatic, Host power operations are automatically executed if related virtual machine migrations can all be executed automatically
Second, the desired DPM Threshold should be selected. 5 options are available, ranging from Conservative to Aggressive.
Note: the Conservative setting only generates Power On recommendations, no Power Off recommendations.
The book VMware vSphere 5 Clustering, Technical Deepdive presents an excellent explanation of DPM.
In a Nutshell:
- TargetUtilizationRange = DemandCapacityRatioTarget +/- DemandCapacityRatioToleranceHost
- DemandCapacityRatioTarget = utilization target of the ESXi host (Default is 63%)
- DemandCapacityRatioToleranceHost = tolerance around utilization target for each host (Default is 18%)
- This means, DPM attempts to keep the ESXi host resource utilization centered at 63% plus or minus 18%.
- Values of DemandCapacityRatioTarget and DemandCapacityRatioToleranceHost can be adjusted in the DRS advanced options section
- There are two kind of recommendations: Power-On and Power-Off.
- Power-On and Power-Off recommendations are assigned Priorities, ranging from Priority 1 to Priority 5.
- Priority level ratings are based on the resource utilization of the cluster and the improvement that is expected from the suggested recommendation.
- Example: A Power-Off recommendation with a higher priority level will result in more power savings. Note that Priority 2 is regarded as higher than Priority 5.
- Example: A Power-On Priority 2 is more urgent than a Priority level 3.
- Power-On priority ranges from 1-3
- Power-Off priority ranges from 2-5
- The one and only excellent book on this subject is VMware vSphere 5 Clustering, Technical Deepdive, by Duncan Epping and Frank Denneman.
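The target utilization arithmetic above can be sketched in a few lines of Python. This is a toy model of the numbers only, not VMware's actual algorithm; the function names are invented for illustration:

```python
# Toy model of the DPM target utilization range: target 63%,
# tolerance 18%, giving a band of 45%-81%. A host above the band
# suggests a Power-On recommendation (more capacity needed); a host
# below it is a candidate for consolidation and standby.

def target_utilization_range(target=63, tolerance=18):
    """Return the (low, high) utilization band DPM tries to stay in."""
    return (target - tolerance, target + tolerance)

def dpm_recommendation(host_utilization, target=63, tolerance=18):
    """Classify a host's utilization against the DPM band."""
    low, high = target_utilization_range(target, tolerance)
    if host_utilization > high:
        return "power-on"    # cluster needs more capacity
    if host_utilization < low:
        return "power-off"   # candidate for standby
    return "no action"

print(target_utilization_range())   # (45, 81)
print(dpm_recommendation(90))       # power-on
print(dpm_recommendation(30))       # power-off
```

Raising DemandCapacityRatioToleranceHost widens the band, making DPM less eager to act in either direction.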
Configure EVC using appropriate baseline
vCenter Server Host Management Guide, Chapter 12, “Migrating Virtual Machines”, Section “CPU Compatibility and EVC” and further, page 121.
- EVC (Enhanced vMotion Compatibility) overcomes incompatibility between a virtual machine’s CPU feature set and the features offered by the destination host. EVC does this by providing a “baseline” feature set for all virtual machines running in a cluster and hides the differences among the clustered hosts’ CPUs from the virtual machines.
- EVC ensures that all hosts in a cluster present the same CPU feature set to virtual machines, even if the actual CPUs on the hosts differ.
- EVC is configured on the Cluster level.
- When you configure EVC, you configure all host processors in the cluster to present the feature set of a baseline processor. This baseline feature set is called the EVC mode.
The EVC mode must be equivalent to, or a subset of, the feature set of the host with the smallest feature set in the cluster.
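The subset rule can be illustrated with Python sets. The CPU feature names below are examples only, not an authoritative list, and the helper is an illustration rather than how vCenter actually validates a baseline:

```python
# The EVC mode's feature set must be a subset of every clustered
# host's CPU features; otherwise the baseline cannot be presented.

def evc_mode_valid(evc_features, host_feature_sets):
    """True if every host can present the baseline feature set."""
    return all(evc_features <= host for host in host_feature_sets)

hosts = [
    {"SSE2", "SSE3", "SSSE3", "SSE4.1"},            # older CPU
    {"SSE2", "SSE3", "SSSE3", "SSE4.1", "AES-NI"},  # newer CPU
]

# A baseline equal to the smallest feature set works for both hosts...
print(evc_mode_valid({"SSE2", "SSE3", "SSSE3", "SSE4.1"}, hosts))  # True
# ...but a baseline that assumes AES-NI does not.
print(evc_mode_valid({"SSE4.1", "AES-NI"}, hosts))                 # False
```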
To enable EVC in a Cluster, you must meet these Requirements:
- All virtual machines in the cluster that are running on hosts with a feature set greater than the EVC mode you intend to enable must be powered off or migrated out of the cluster before EVC is enabled.
- All hosts in the cluster must have CPUs from a single vendor, either AMD or Intel.
- All hosts in the cluster must be running ESX/ESXi 3.5 Update 2 or later.
- All hosts in the cluster must be connected to the vCenter Server system.
- All hosts in the cluster must have advanced CPU features, such as hardware Virtualization support (AMD-V or Intel VT) and AMD No eXecute (NX) or Intel eXecute Disable (XD), enabled in the BIOS if they are available.
- All hosts in the cluster should be configured for vMotion.
- All hosts in the cluster must have supported CPUs for the EVC mode you want to enable. To check EVC support for a specific processor or server model, see the VMware Compatibility Guide at http://www.vmware.com/resources/compatibility/search.php
There are two methods to create an EVC cluster:
- Create an empty cluster, enable EVC, and move hosts into the cluster (Recommended method).
- Enable EVC on an existing cluster.
Note: While moving a host into a new EVC cluster or while enabling EVC on an existing cluster: if the host feature set is greater than the EVC mode that you have enabled for the EVC cluster, ensure that the cluster has no powered-on virtual machines.
- Power off all the virtual machines on the host.
- Migrate the host’s virtual machines to another host using vMotion
Example: in my home lab, I have two ESXi hosts with incompatible CPUs. I had to move my vCenter Server VM to the ESXi host with the smallest feature set, then enable EVC and add the second ESXi host (with the more advanced feature set) with all its VMs powered off.
- Identify Intel CPUs, Application Note 485: Intel® Processor Identification and the CPUID Instruction
- Identify AMD CPUs, CPUID Specification
- VMware KB “Detecting and Using New Features in CPUs”
- VMware KB “Enhanced vMotion Compatibility (EVC) processor support Details”
- VMware KB “EVC and CPU Compatibility FAQ”
Change the EVC mode on an existing DRS cluster
vCenter Server Host Management Guide, Chapter 12, “Migrating Virtual Machines”, Section “Change the EVC Mode for a Cluster”, page 125.
To raise the EVC mode from a CPU baseline with fewer features to one with more features, you do not need to turn off any running virtual machines in the cluster. Running virtual machines do not have access to the new features of the new EVC mode until they are powered off and powered back on. A full power cycle is required; rebooting the guest operating system or suspending and resuming the virtual machine is not sufficient.
To lower the EVC mode from a CPU baseline with more features to one with fewer features, you must first power off any virtual machines in the cluster that are running at a higher EVC mode than the one you intend to enable, and power them back on after the new mode has been enabled.
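These two rules can be sketched as a small Python helper. The ordered mode names below are shortened, illustrative examples (loosely based on Intel EVC baseline generations), and the function is a toy model, not vCenter's logic:

```python
# Raising the EVC mode needs no power-offs; lowering it requires
# powering off every VM running at a mode higher than the new one.

EVC_MODES = ["merom", "penryn", "nehalem", "westmere"]  # low -> high

def vms_to_power_off(current_mode, new_mode, vm_modes):
    """Return the VMs that must be power-cycled before the change."""
    if EVC_MODES.index(new_mode) >= EVC_MODES.index(current_mode):
        return []  # raising the mode: no power-off needed
    new_level = EVC_MODES.index(new_mode)
    return [vm for vm, mode in vm_modes.items()
            if EVC_MODES.index(mode) > new_level]

running = {"vm1": "penryn", "vm2": "nehalem"}
print(vms_to_power_off("nehalem", "westmere", running))  # []
print(vms_to_power_off("nehalem", "penryn", running))    # ['vm2']
```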
How to determine the EVC mode for a Virtual Machine?
Select a Cluster or Host and go to the Virtual Machines Tab.
Create DRS and DPM alarms
vSphere Resource Management Guide, Chapter 10 “Using DRS Clusters to Manage Resources”, Section “Monitoring vSphere DPM”, page 70.
If you want to create DRS related Alarms, on the General tab select Clusters from the list of available Event Triggers. On the Triggers tab, you can configure DRS related triggers.
You can use event-based alarms in vCenter Server to monitor vSphere DPM.
The most serious potential error you face when using vSphere DPM is the failure of a host to exit standby mode when its capacity is needed by the DRS cluster. You can monitor for instances when this error occurs by using the preconfigured Exit Standby Error alarm in vCenter Server.
Other available Events:
- Entering Standby mode (about to power off host): DrsEnteringStandbyModeEvent
- Successfully entered Standby mode (host power off succeeded): DrsEnteredStandbyModeEvent
- Exiting Standby mode (about to power on the host): DrsExitingStandbyModeEvent
- Successfully exited Standby mode (power on succeeded): DrsExitedStandbyModeEvent
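A monitoring script watching the vCenter event stream could key off these four event names. The lookup below is a hypothetical helper using only the event names from the table, not a vCenter API:

```python
# Map the four DPM standby event names to readable descriptions.

DPM_EVENTS = {
    "DrsEnteringStandbyModeEvent": "Entering Standby mode",
    "DrsEnteredStandbyModeEvent":  "Entered Standby mode",
    "DrsExitingStandbyModeEvent":  "Exiting Standby mode",
    "DrsExitedStandbyModeEvent":   "Exited Standby mode",
}

def describe(event_name):
    return DPM_EVENTS.get(event_name, "not a DPM standby event")

print(describe("DrsExitedStandbyModeEvent"))  # Exited Standby mode
print(describe("VmPoweredOnEvent"))           # not a DPM standby event
```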
Configure applicable power management settings for ESXi hosts
vSphere Resource Management Guide,
Chapter 4, “Administering CPU Resources”, Section “Host Power Management Policies”, Page 22.
See my notes on Objective 3.1, section “Tune ESXi host CPU configuration”.
Properly size virtual machines and clusters for optimal DRS efficiency
vSphere Resource Management Guide,
Chapter 10, “Using DRS Clusters to Manage Resources”, Section “DRS Cluster Validity”, Page 63.
The vSphere Client indicates whether a DRS cluster is valid, overcommitted (yellow), or invalid (red).
DRS clusters become overcommitted or invalid for several reasons.
- A cluster might become overcommitted if a host fails.
- A cluster becomes invalid if vCenter Server is unavailable and you power on virtual machines using a vSphere Client connected directly to a host.
- A cluster becomes invalid if the user reduces the reservation on a parent resource pool while a virtual machine is in the process of failing over.
- If changes are made to hosts or virtual machines using a vSphere Client connected to a host while vCenter Server is unavailable, those changes take effect. When vCenter Server becomes available again, you might find that clusters have turned red or yellow because cluster requirements are no longer met.
More information and examples can be found in the vSphere Resource Management Guide, starting from page 63.
DRS efficiency is also affected by DRS affinity rules. There are two types of these rules:
- VM-VM affinity rules
- VM-Host affinity rules
In case of conflicting VM-VM affinity rules:
- Older rules take precedence over younger rules;
- DRS gives higher precedence to preventing violations of anti-affinity rules than violations of affinity rules.
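The "older rule wins" principle can be modelled in a couple of lines. The rule structure is invented for illustration; in vSphere the newer conflicting rule is simply disabled:

```python
# Of two conflicting VM-VM rules, the one created first stays in
# effect. ISO-format date strings compare correctly as strings.

def effective_rule(conflicting_rules):
    """Return the rule DRS keeps: the oldest one."""
    return min(conflicting_rules, key=lambda r: r["created"])

rules = [
    {"name": "keep-web-apart",    "created": "2011-06-01"},
    {"name": "keep-web-together", "created": "2012-01-15"},
]
print(effective_rule(rules)["name"])  # keep-web-apart
```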
VM-Host affinity rules come in two flavours:
- Required rules (Must (not) Run)
- Preferential rules (Should (not) Run)
In case of VM-Host affinity rules, remember:
- VM-Host affinity rules are not ranked, but are applied equally;
- Older rules take precedence over younger rules
- DRS, vSphere HA, and vSphere DPM never take any action that results in the violation of required affinity rules (those where the virtual machine DRS group ‘must run on’ or ‘must not run on’ the host DRS group)
Note: a number of cluster functions are not performed if doing so would violate a required affinity rule.
- DRS does not evacuate virtual machines to place a host in maintenance mode.
- DRS does not place virtual machines for power-on or load balance virtual machines.
- vSphere HA does not perform failovers.
- vSphere DPM does not optimize power management by placing hosts into standby mode.
Good advice is to avoid using Required (Must Run) rules.
The chapter concludes with a useful tip:
You can create an event-based alarm that is triggered when a virtual machine violates a VM-Host affinity rule. In the vSphere Client, add a new alarm for the virtual machine and select VM is violating a DRS VM-Host Affinity Rule as the event trigger.
Finally, properly configure your virtual machines, do not “oversize”. See also Objective 3.2 section “Properly size a Virtual Machine based on application workload”
- VMware vSphere 5 Clustering, Technical Deepdive, by Duncan Epping and Frank Denneman
Properly apply virtual machine automation levels based upon application requirements
vSphere Resource Management Guide,
Chapter 10, “Creating a DRS Cluster”, Section “Set a Custom Automation Level for a Virtual Machine”, Page 57.
After you create a DRS cluster, you can customize the automation level for individual virtual machines to override the cluster’s default automation level.
A few examples:
- A VM can be set to Manual in a cluster with full automation;
- In a Manual cluster, a VM can be set to Partially Automated;
- If a VM is set to Disabled, vCenter Server does not migrate that virtual machine or provide migration recommendations for it.
Remember, DRS is about two functions:
- Migration (Recommendations are only executed in Fully Automated mode)
- Initial placement (Recommendations are executed in Partially Automated and Fully Automated mode)
NOTE: Some VMware products or features, such as vSphere vApp and vSphere Fault Tolerance, might override the automation levels of virtual machines in a DRS cluster.
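The override behaviour in the examples above can be sketched as follows. The VM names and the helper functions are hypothetical, not a vSphere API:

```python
# A per-VM automation level overrides the cluster default; with no
# override, the VM inherits the cluster setting.

CLUSTER_DEFAULT = "Fully Automated"
VM_OVERRIDES = {"db01": "Manual", "legacy01": "Disabled"}

def automation_level(vm_name):
    return VM_OVERRIDES.get(vm_name, CLUSTER_DEFAULT)

def drs_actions(vm_name):
    """Which of DRS's two functions apply at a given level."""
    level = automation_level(vm_name)
    if level == "Disabled":
        return []  # no migrations, no recommendations at all
    if level == "Manual":
        return ["recommend placement", "recommend migration"]
    if level == "Partially Automated":
        return ["place automatically", "recommend migration"]
    return ["place automatically", "migrate automatically"]

print(automation_level("web01"))  # Fully Automated (cluster default)
print(drs_actions("legacy01"))    # []
```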
Create and administer ESXi host and Datastore Clusters
ESXi host Clusters
vSphere Resource Management Guide,
Chapter 9, “Creating a DRS Cluster”, Page 51.
vSphere Resource Management Guide,
Chapter 10, “Creating a Datastore Cluster”, Page 77.
- A DRS cluster is a collection of ESXi hosts and associated virtual machines with shared resources and a shared management interface.
- Cluster-level resource management capabilities include:
- Load balancing (Migration and Initial Placement)
- Power Management (DPM)
- Affinity Rules
An important note when using Fault Tolerant (FT) VMs. Depending on whether EVC is enabled or not, DRS behaves differently.
- EVC enabled: DRS Load Balancing enabled (Primary and Secondary VMs); DRS Initial Placement enabled (Primary and Secondary VMs).
- EVC disabled: DRS Load Balancing disabled (Primary and Secondary VMs); DRS Initial Placement disabled (Primary VMs), Fully Automated (Secondary VMs).
A few Notes on DRS-Clusters:
- Initial placement recommendations only for VMs in DRS Cluster
(so, not for VMs on standalone hosts or non-DRS clusters).
- Admission control (vCenter Server checks that enough resources are available) is executed when you Power on a single VM or a group of VMs.
- VMs selected for a group Power On must reside in the same Datacenter.
- If placement-related actions for any of the virtual machines are in manual mode, the powering on of all of the virtual machines (including those that are in automatic mode) is manual.
- When a nonautomatic group power-on attempt is made, and virtual machines not subject to an initial placement recommendation (that is, those on standalone hosts or in non-DRS clusters) are included, vCenter Server attempts to power them on automatically.
- The DRS migration threshold allows you to specify which recommendations are generated and ranges from Conservative to Aggressive.
- Detailed information on how recommendations are calculated, resources are:
- Excellent, VMware vSphere 5 Clustering, Technical Deepdive, Chapter 14, by Duncan Epping and Frank Denneman
- DRS Deepdive, also by Duncan Epping
- Not so good, VMware KB “Calculating the priority level of a VMware DRS migration recommendation”
- ESXi hosts added to a DRS cluster must meet some requirements. In fact, DRS relies completely on vMotion for migration of VMs, so these requirements are very similar:
- Shared storage (SAN or NAS)
- Place the disks of all virtual machines on VMFS volumes that are accessible by source and destination hosts.
- Set access mode for the shared VMFS to public.
- Ensure the VMFS volume is sufficiently large to store all virtual disks for your virtual machines.
- Ensure all VMFS volumes on source and destination hosts use volume names, and all virtual machines use those volume names for specifying the virtual disks.
- Virtual machine swap files also need to be on a VMFS accessible to source and destination hosts (not necessary when running ESXi 3.5 or higher).
- Processor Compatibility Requirements.
Best practice is to have identical ESXi hosts in a Cluster. That goes not only for CPU compatibility, but also for the same amount of memory and number of NICs. The idea is that it should not matter on which ESXi host a VM is running at any given time.
Other Cluster topics, like DRS Validity, DRS Affinity rules and Power Management (DPM) have already been discussed.
Another Cluster feature, High Availability (HA), will be discussed in section 4.
Datastore Clusters have been discussed in objective 1.2, section “Configure Datastore Clusters”
- I keep on telling you, the ultimate resource is VMware vSphere 5 Clustering, Technical Deepdive, by Duncan Epping and Frank Denneman
Administer DRS / Storage DRS
vSphere Resource Management Guide,
Chapter 10, “Using DRS Clusters to Manage Resources”, Page 59.
vSphere Resource Management Guide,
Chapter 12, “Using Datastore Clusters to Manage Storage Resources”, Page 83.
See also the previous objective. Some specific tasks are:
- Adding Hosts to a Cluster
- Adding Virtual Machines to a Cluster
- Removing Virtual Machines from a Cluster
- Removing a Host from a Cluster
- Using DRS Affinity Rules
Adding Hosts to a Cluster
- You can add ESXi hosts already managed by the vCenter Server, or ESXi hosts that are not yet managed.
- The procedures are slightly different; the existence of Resource Pools also plays a role.
Adding Virtual Machines to a Cluster
- Adding a host to a cluster also adds all VMs on that host to the cluster
- By creating a new VM
- By migrating VMs from a standalone Host or another Cluster
Removing Virtual Machines from a Cluster
- Migrate VMs to another Cluster or Standalone Host
- When you remove a Host from the cluster (next topic), all powered-off VMs that remain on that Host are removed from the Cluster
- If a VM is a member of a DRS cluster rules group, vCenter Server displays a warning before it allows migration to another Cluster or Standalone Host.
Removing a Host from a Cluster
- The ESXi host must be in maintenance mode or in a disconnected state. If the Cluster is not in Fully Automated mode, apply the Recommendations.
- After the ESXi host is placed in maintenance mode, move the host to another cluster, or select Remove to completely remove an ESXi host from the Inventory
Using DRS Affinity Rules
- Should be familiar for a VCP.
- Two types of rules:
- VM-VM affinity (Keep VMs together and Separate VMs)
- VM-Host affinity (VMs to Hosts)
- To create a VM-Host affinity rule, first, create Hosts DRS Group and VM DRS Group.
- Create the desired VM-Host affinity rule
- A designation of whether the rule is a requirement (“must”) or a preference (“should”) and whether it is affinity (“run on”) or anti-affinity (“not run on”).
Be careful while selecting “must” rules.
- For VM-VM affinity rules, no Groups are required. Select the desired rule type and add VMs.
Administer Storage DRS
See objective 1.2, section “Configure Datastore Clusters”.