vCSA, root partition is (almost) full

A short post on an issue that I recently ran into on vCenter Server Appliance (vCSA), version 6.0.
When you receive an alert that the root “/” partition is filling up, it is time to act quickly. When the root partition reaches 100% of its capacity, service disruption can occur.
The first step is to check the capacity of the vCSA partitions. Log in to the vCSA through SSH. If you land in the appliance shell, enable and access the Bash shell:

Command> shell.set --enabled true
Command> shell

In the Bash shell run this command to check the capacity of the partitions:

# df -h

The second line of the output (starting with /dev/sda3) shows the status of the root partition. If the value under Use% reaches 100%, you are in trouble. Also notice that the root partition is only 11 GB.
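To watch only the root partition instead of reading the full df output, a small helper can extract the Use% value and compare it with a threshold. This is a minimal sketch: the function names usage_pct and check_usage and the default threshold of 90% are my own assumptions, not something shipped with the vCSA.

```shell
# Print the Use% of the filesystem that holds the given path (default: /)
# as a bare number. -P forces POSIX output, one line per filesystem.
usage_pct() {
    df -P "${1:-/}" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Warn when usage reaches the threshold (hypothetical default: 90 percent).
check_usage() {
    local path="${1:-/}" threshold="${2:-90}" pct
    pct=$(usage_pct "$path")
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: $path is at ${pct}% (threshold ${threshold}%)"
        return 1
    fi
    echo "OK: $path is at ${pct}%"
}
```

Calling check_usage / 90 prints a single status line and returns a non-zero exit code once the threshold is reached, which makes it easy to hook into a cron job or a monitoring check.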
The second step is to determine the root cause of the full partition. A good strategy is to look for large consumers. The next command searches for files larger than 100 MB, only on the root partition (-xdev keeps find from descending into other mounted filesystems):

# find / -xdev -type f -size +100M

In my case, this returned some interesting results:

/usr/lib/vmware-sca/wrapper/bin/wrapper.log
/usr/lib/oracle/11.2/client64/lib/libociei.so
/var/log/dnsmasq.log-20180121
/var/log/dnsmasq.log-20180128
/var/log/dnsmasq.log-20180107
/var/log/dnsmasq.log-20180114
/var/log/dnsmasq.log
/etc/vmware-vpx/docRoot/client/Vmware-viclient.exe

The most eye-catching files are wrapper.log and the dnsmasq.log files.
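The find command above prints the paths only, not the sizes. As a sketch of how to rank the results, the helper below pipes the same search through du and sorts the output, largest file first. The function name large_files and its arguments are assumptions of mine.

```shell
# List regular files above a size threshold on a single filesystem,
# largest first. Output columns: size in KB, path.
large_files() {
    local dir="${1:-/}" min="${2:-+100M}"
    # -xdev stays on the filesystem of $dir; errors from unreadable
    # paths are discarded
    find "$dir" -xdev -type f -size "$min" -exec du -k {} + 2>/dev/null | sort -rn
}
```

On the appliance, large_files / +100M would reproduce the search above with the biggest consumer on the first line.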

Wrapper.log

The file /usr/lib/vmware-sca/wrapper/bin/wrapper.log grows without any limit. According to VMware support, this is a known issue related to the Java Service Wrapper from Tanuki Software, which was dropped in vSphere 6.5.
A closer look at wrapper.log reveals the root cause of this ever-growing log file. The following lines repeat endlessly:

FATAL | wrapper | 2018/02/16 17:40:28 | Unable to get the path for '/usr/lib/vmware-vapi/wrapper/bin/./wrapper'-No such file or directory
FATAL | wrapper | 2018/02/16 17:41:30 | Unable to get the path for '/usr/lib/vmware-vdcs/wrapper/bin/./wrapper'-No such file or directory
FATAL | wrapper | 2018/02/16 17:41:30 | Unable to get the path for '/usr/lib/vmware-vdcs/wrapper/bin/./wrapper'-No such file or directory
FATAL | wrapper | 2018/02/16 17:41:30 | Unable to get the path for '/usr/lib/vmware-vdcs/wrapper/bin/./wrapper'-No such file or directory
FATAL | wrapper | 2018/02/16 17:41:30 | Unable to get the path for '/usr/lib/vmware-vapi/wrapper/bin/./wrapper'-No such file or directory
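With a log this large, it helps to count which wrapper paths show up in the FATAL lines rather than scrolling through them. A minimal sketch, assuming the quoted path is the only single-quoted string on each line; the function name fatal_summary is hypothetical.

```shell
# Count FATAL messages per quoted wrapper path, most frequent first.
fatal_summary() {
    grep 'FATAL' "$1" | grep -o "'[^']*'" | sort | uniq -c | sort -rn
}
```

Running fatal_summary /usr/lib/vmware-sca/wrapper/bin/wrapper.log prints one line per affected wrapper with its number of occurrences.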

In vSphere 6.0, this issue can be resolved by running the following commands, which add the missing permissions to the wrappers:

# chmod 0555 /usr/lib/vmware-vdcs/wrapper/bin/wrapper
# chmod 0555 /usr/lib/vmware-vapi/wrapper/bin/wrapper

Now clean up the wrapper.log by truncating it:

# > /usr/lib/vmware-sca/wrapper/bin/wrapper.log
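To confirm that the log has stopped growing after the permission change, you can compare its size before and after a short wait. The helper below is a sketch; the name log_growth and the default interval of 60 seconds are assumptions.

```shell
# Print how many bytes a file grew during the given number of seconds.
log_growth() {
    local file="$1" seconds="${2:-60}" before after
    before=$(stat -c %s "$file")   # size in bytes before waiting
    sleep "$seconds"
    after=$(stat -c %s "$file")    # size in bytes after waiting
    echo $(( after - before ))
}
```

A result of 0 for log_growth /usr/lib/vmware-sca/wrapper/bin/wrapper.log 120 suggests that the FATAL messages are no longer being written. The messages in my log appeared roughly once a minute, so pick an interval longer than that.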

If you’re interested, the long explanation:
log file:
/usr/lib/vmware-sca/wrapper/bin/wrapper.log

logs output of script:
/usr/lib/vmware-sca/scripts/vmware-vdcs.sh

which calls:
/etc/init.d/vmware-vdcs

which is a symlink to:
/usr/lib/vmware-vdcs/wrapper/bin/vmware-vdcs

this script runs the wrapper:
/usr/lib/vmware-vdcs/wrapper/bin/wrapper

which apparently has insufficient permissions (no read or execute permission for others):

vcsa:/usr/lib/vmware-vdcs/wrapper/bin # ls -l
 -r-xr----- 1 vdcs cis 460392 Feb 23 2017 wrapper

This construct is used in other places as well. Compare with the permissions of the wrapper for the vmware-eam service:

/usr/lib/vmware-eam/wrapper/bin # ls -l
 -r-xr-xr-x 1 eam cis 460392 Apr 8 2017 wrapper
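Rather than checking each wrapper directory by hand, the comparison above can be sketched as a helper that lists only the files lacking the execute bit for others. The function name missing_other_exec is hypothetical, and the glob in the usage note assumes that the other services follow the same directory layout as vdcs, vapi and eam.

```shell
# From the files given as arguments, print mode, owner:group and path
# of those that lack the execute bit for "other".
missing_other_exec() {
    # -maxdepth 0 tests only the named files; "! -perm -o=x" selects
    # files where the other-execute bit is not set
    find "$@" -maxdepth 0 -type f ! -perm -o=x -exec stat -c '%a %U:%G %n' {} +
}
```

On the appliance, missing_other_exec /usr/lib/vmware-*/wrapper/bin/wrapper would be expected to list the vdcs wrapper shown above with mode 540 before the chmod, and to no longer list it afterwards.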

dnsmasq.log

The dnsmasq.log files are a well-known culprit for a full root partition, mainly because these logs are not compressed. VMware KB “Root partition on the vCenter Server Appliance is full due to dnsmasq.log files (52258)” outlines the steps for updating the logrotate configuration, which will compress the log files and also relocate them from /var/log to /var/log/vmware, which lives on another partition.
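Before and after applying the KB, it can be useful to see how much space the dnsmasq logs take in total. This is a rough sketch that is not part of the KB; the function name log_total_kb is an assumption.

```shell
# Print the combined size in KB of all files whose path starts with
# the given prefix (prints 0 when nothing matches).
log_total_kb() {
    du -kc "$1"* 2>/dev/null | awk 'END { print $1 }'
}
```

For example, log_total_kb /var/log/dnsmasq.log sums the current log and all rotated copies in /var/log.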

Final words

In case the root partition is full due to audit.log files, please refer to KB “vCenter Appliance root Partition 100% full due to Audit.log files not being rotated (2149278)”.

If nothing helps, and increasing disk space seems the only solution (after consulting VMware Support), please take a look at this simple solution, presented by William Lam, “Increasing disk capacity simplified with VCSA 6.0 using LVM autogrow”.

And do not forget to monitor your vCenter Server(s), so you can proactively respond to issues like these.

Thank you for reading.
