Objectives
- Troubleshoot ESXi host and Virtual Machine CPU performance issues using appropriate metrics
- Troubleshoot ESXi host and Virtual Machine memory performance issues using appropriate metrics
- Use Hot-Add functionality to resolve identified Virtual Machine CPU and memory performance issues
Troubleshoot ESXi host and Virtual Machine CPU and Memory performance issues using appropriate metrics
Official Documentation:
vSphere Monitoring and Performance Guide
Summary:
Both the CPU and the memory objective are discussed together in this section.
An ESXi host has four essential resources: CPU, memory, storage and network. The most critical resource on every ESXi host is memory.
Methods to view Performance data
- vSphere Client
  - Performance Tabs on nearly every level (Cluster, Host, VM)
  - Summary Tab on the Host level, Resource Usage
- CLI
  - Tools esxtop or resxtop (discussed in Objective 3.4)
Objective 3.4 discusses the usage of esxtop and presents some useful links. I encourage you to practice a lot with esxtop.
But that’s not all; the most important part is interpreting what you see. The VMware Communities document “Interpreting esxtop Statistics” is an excellent resource. Get familiar with Worlds, %RDY, %CSTP, %MLMTD, %USED, %SYS and %SWPWT.
CPU metrics, what do we monitor?
- Host level, CPU usage (time physical CPU is used)
- VM level, CPU usage (time vCPU is using the physical CPU)
- VM level, most important is CPU Ready (time vCPU is ready to execute but waiting for the physical CPU). In esxtop CPU Ready is represented by %RDY. Start worrying if %RDY > 10%
- Beware of SMP VMs: a %CSTP > 3 indicates that the VM is not using its assigned vCPUs in a balanced way; it can probably do with fewer vCPUs.
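The two thresholds above can be turned into a quick sanity check. The following is only a minimal sketch: the function name, its parameters and the sample values are hypothetical, and the percentages are assumed to have been read from esxtop by hand.

```python
# Thresholds come from the notes above; all names here are hypothetical.
RDY_THRESHOLD_PCT = 10.0   # start worrying when %RDY exceeds this
CSTP_THRESHOLD_PCT = 3.0   # on SMP VMs, %CSTP above this hints at too many vCPUs


def check_cpu_counters(name, rdy_pct, cstp_pct, num_vcpus):
    """Return a list of warnings for one VM, given esxtop-style percentages."""
    warnings = []
    if rdy_pct > RDY_THRESHOLD_PCT:
        warnings.append(
            f"{name}: %RDY {rdy_pct:.1f} is above {RDY_THRESHOLD_PCT:.0f}, "
            "vCPUs are waiting for a physical CPU"
        )
    # Co-stop only applies to VMs with more than one vCPU.
    if num_vcpus > 1 and cstp_pct > CSTP_THRESHOLD_PCT:
        warnings.append(
            f"{name}: %CSTP {cstp_pct:.1f} is above {CSTP_THRESHOLD_PCT:.0f}, "
            "consider assigning fewer vCPUs"
        )
    return warnings


if __name__ == "__main__":
    # Sample values are made up for illustration.
    for line in check_cpu_counters("vm-app01", rdy_pct=12.5, cstp_pct=4.0, num_vcpus=4):
        print(line)
```

A VM that stays under both thresholds returns an empty list, so the check can be run over a whole set of readings and only the outliers are printed.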
Memory metrics, what do we monitor?
You should understand how ESXi handles Memory and Memory Overcommit Techniques, see also Objective 3.1.
Recommended reading is “Understanding Memory Resource Management in VMware ESX 4.1”.
There are five memory performance metrics you must know when using the Performance tabs.
- Average memory active
  - Mem.active.average
  - Host or VM
  - Memory estimated to be in use, based on recently touched memory pages. Of active, consumed and granted, this is the smallest number.
    Granted = memory configured for the VMs (highest number),
    Active = the touched pages as seen by the ESXi server,
    Consumed = see next
- Average memory consumed
  - Mem.consumed.average
  - Host or VM
  - Amount of memory consumed by one or all virtual machines, calculated as memory granted less memory saved by sharing (Consumed = Granted – Savings)
Figure 2 – Active, Consumed and Granted on Host level
- Average memory swapped in or out
  - Mem.swapin.average or mem.swapout.average
  - Host or VM
  - Virtual memory swapped to or from disk
- Average memory swapped
  - Mem.swapped.average
  - Host or VM
  - Total amount of memory swapped out
- Average memory reclaimed by ballooning
  - Mem.vmmemctl.average
  - Host or VM
  - Memory reclaimed by using ballooning
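To make the relation between these metrics concrete, here is a small worked sketch. It assumes you have copied the values from a performance chart yourself; the function names and the sample numbers are hypothetical.

```python
def consumed_kb(granted_kb, shared_savings_kb):
    """Consumed = Granted - Savings, as described in the list above."""
    return granted_kb - shared_savings_kb


def summarize_memory(granted_kb, shared_savings_kb, active_kb, swapped_kb, balloon_kb):
    """Summarize one host or VM using the five metrics from the list above."""
    consumed = consumed_kb(granted_kb, shared_savings_kb)
    return {
        "consumed_kb": consumed,
        # The notes describe active as the smallest and granted as the highest number.
        "ordering_ok": active_kb <= consumed <= granted_kb,
        # Any swapped or ballooned memory means the host has been reclaiming memory.
        "swapped": swapped_kb > 0,
        "ballooned": balloon_kb > 0,
    }


if __name__ == "__main__":
    # Made-up sample: 4 GB granted, 512 MB saved by sharing, 1 GB active.
    print(summarize_memory(4194304, 524288, 1048576, swapped_kb=0, balloon_kb=0))
```

With these sample numbers the consumed value works out to 4194304 - 524288 = 3670016 KB, which sits between active and granted as expected.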
Another approach: VMware ESXi uses several techniques to reclaim memory. The related performance metrics indicate which technique is in use and can point you to a resolution.
- Transparent Page Sharing (TPS)
In esxtop, under global statistics, see PSHARE
- Ballooning
MCTL = Memory balloon driver installed Y/N
MCTLSZ = Amount of guest memory reclaimed by balloon driver
MCTLTGT = Amount of guest physical memory to be kept in balloon driver
If MCTLTGT < MCTLSZ, balloon driver deflates
MCTLMAX = Max. amount of reclaimable guest memory
- VMKernel swapping
SWCUR = Current Swap usage (should be < 1)
SWTGT = Expected swap usage
If SWTGT > SWCUR, then VMKernel can start/continue swapping
SWW/s = Rate at which memory is being swapped out to disk
- Memory Compression
CACHESZ = compression cache size (10% of VM memory)
CACHEUSD = compression cache in use
ZIP/s and UNZIP/s = (de)compressing actions per second
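The rules of thumb in this list can be sketched as a small helper that tells you which reclamation techniques appear to be active for a VM. This is an illustration only: the class and function names are hypothetical, and the values are assumed to have been read from the esxtop memory screen.

```python
from dataclasses import dataclass


@dataclass
class VmMemCounters:
    """Per-VM esxtop memory counters; the field names are hypothetical."""
    mctlsz_mb: float    # MCTLSZ: guest memory reclaimed by the balloon driver
    mctltgt_mb: float   # MCTLTGT: memory the balloon driver should hold
    swcur_mb: float     # SWCUR: current swap usage
    swtgt_mb: float     # SWTGT: expected swap usage
    cacheusd_mb: float  # CACHEUSD: compression cache in use


def describe_reclamation(c):
    """Apply the rules from the list above and return human-readable notes."""
    notes = []
    if c.mctlsz_mb > 0:
        notes.append("ballooning: the balloon driver holds guest memory")
    if c.mctltgt_mb < c.mctlsz_mb:
        notes.append("ballooning: target is below current size, balloon deflates")
    if c.swcur_mb > 0:
        notes.append("swapping: the VMkernel has swapped memory for this VM")
    if c.swtgt_mb > c.swcur_mb:
        notes.append("swapping: target is above current usage, swapping can start or continue")
    if c.cacheusd_mb > 0:
        notes.append("compression: the compression cache is in use")
    return notes


if __name__ == "__main__":
    # Made-up sample values for illustration.
    sample = VmMemCounters(mctlsz_mb=512, mctltgt_mb=256, swcur_mb=10, swtgt_mb=50, cacheusd_mb=5)
    for note in describe_reclamation(sample):
        print(note)
```

An empty result means none of the listed indicators fired for that VM.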
Also, know how to read the global memory statistics in esxtop. They contain a lot of useful information, although they are not easy to read.
Figure 6 – esxtop Global Statistics
PMEM = physical Memory
VMKMEM = VMKernel memory
PSHARE = Page Sharing (TPS) statistics
SWAP = Swap usage
ZIP = Memory Compression
MEMCTL = Memory Ballooning
Other references:
- Another great explanation in 3 posts: http://www.van-lieshout.com/2009/04/esx-memory-management-part-1/
- An excellent session on vSphere Advanced Troubleshooting by Eric Sloof during the Dutch VMUG 2010, unfortunately only in Dutch.
- Now you know everything about the statistics, but what are the thresholds? Read this excellent post on esxtop by Duncan Epping.
- A good read from vKernel on the Top 20 VMware Performance Metrics you should care about (registration required)
Use Hot-Add functionality to resolve identified Virtual Machine CPU and memory performance issues
Official Documentation:
vSphere Virtual Machine Administration, Chapter 8 “Configuring Virtual Machines”, Section “Change CPU Hot Plug Settings in the … Client”, page 94.
Summary:
This subject is briefly touched on in Objective 3.2.
Some conditions and requirements for CPU Hot Plug
- If possible, use hardware version 8 virtual machines.
- Hot-adding multicore virtual CPUs is supported only with hardware version 8 virtual machines.
- Not all guest operating systems support CPU hot add.
- To use the CPU hot-add feature with hardware version 7 virtual machines, set the Number of cores per socket to 1.
- Adding CPU resources to a running virtual machine with CPU hot plug enabled disconnects and reconnects all USB passthrough devices connected to that virtual machine.
- For Linux guest operating systems, VMware Tools must be installed. For Hot Add memory, VMware Tools must always be installed.
- The virtual machine must be powered off to configure the CPU Hot Plug settings.
- Hot remove of Memory is not supported.
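As a memory aid, some of the conditions above can be written down as a simple pre-check. This is a sketch under assumptions: the function and its parameters are hypothetical, it does not talk to vCenter, and it only covers the rules listed here (guest operating system support still has to be verified separately).

```python
def cpu_hot_plug_blockers(hardware_version, cores_per_socket, powered_on, hot_plug_enabled):
    """Return the reasons why a CPU hot add would not work, based on the list above."""
    blockers = []
    # Multicore vCPUs can only be hot-added with hardware version 8.
    if hardware_version < 8 and cores_per_socket > 1:
        blockers.append("set the number of cores per socket to 1 or upgrade to hardware version 8")
    # The hot plug setting itself can only be changed while the VM is powered off.
    if not hot_plug_enabled and powered_on:
        blockers.append("power off the VM to enable CPU hot plug")
    return blockers


if __name__ == "__main__":
    # Made-up example: a running hardware version 7 VM with 2 cores per socket.
    for reason in cpu_hot_plug_blockers(7, 2, powered_on=True, hot_plug_enabled=False):
        print(reason)
```

An empty list means none of the listed conditions stands in the way.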
Other bloggers have done a great job doing some testing to find out which Operating Systems do support the Hot Plug / Hot Add feature. See references below.
Other references: