Running unmap on a large number of datastores

With the VMFS-6 filesystem came the option to automatically unmap datastores. In short, the unmap command is used to reclaim unused storage blocks on a VMFS datastore when thin provisioning is used.
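
On VMFS-6, you can check whether automatic space reclamation is enabled for a datastore; a minimal sketch (the volume label “VMFS01” is just an example):

# esxcli storage vmfs reclaim config get -l VMFS01

This shows the reclaim granularity and priority; a priority of “none” means automatic unmap is disabled.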

When datastores are still on VMFS-5, reclaiming disk space is a manual process. VMware KB “Using the esxcli storage vmfs unmap command to reclaim VMFS deleted blocks on thin-provisioned LUNs (2057513)” details how to use the unmap command.

The action is performed on an ESXi host; the basic command looks like this:

# esxcli storage vmfs unmap -l <Volume label>

Where <Volume label> is the human-readable name of a datastore, for example “VMFS01”.
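
If you are unsure about the exact labels, they can be listed on the host; a minimal sketch:

# esxcli storage filesystem list

The label of each mounted VMFS volume appears in the Volume Name column.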

Depending on the size of the datastore(s), running unmap can take quite some time. If you have only a few datastores, you run this command a couple of times and you are done. If your cluster(s) have dozens of datastores, the following workaround can help.

SSH to one of the ESXi hosts in the cluster, and cd to the folder /tmp.

First, create an input file named “datastores” with a sorted list of the datastores:

# find /vmfs/volumes/ -type l | cut -d "/" -f4 | sort > datastores
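
The find command lists the symbolic links under /vmfs/volumes/ (one per mounted datastore), cut extracts the volume label from the path, and sort orders the result. The file will contain something like this (the labels are purely illustrative):

VMFS01
VMFS02
VMFS03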

Next, create a script named unmap.sh, using the vi editor:

#!/bin/sh
# Unmap each datastore listed in the input file "datastores"
for ds in $(cat datastores); do
    echo "$(esxcli hardware clock get) Start unmap Datastore $ds" >> unmap.log
    # Reclaim unused blocks; abort the script if the unmap of a datastore fails
    esxcli storage vmfs unmap -l "$ds" -n 200 || exit 1
    echo "$(esxcli hardware clock get) Ready unmap Datastore $ds" >> unmap.log
done

The for loop reads the entries in the input file “datastores” and processes the datastores one at a time, using the variable “ds”.
The line starting with “esxcli storage …” does the actual job; the “|| exit 1” makes the script stop if an unmap fails. The echo lines write some basic information to a log file. The “esxcli hardware clock get” command returns the current date and time. This gives some insight into the duration, and if the script ends prematurely, we know how far it got.
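
The resulting entries in unmap.log will look roughly like this (timestamps and labels are illustrative):

2018-03-06T07:15:01Z Start unmap Datastore VMFS01
2018-03-06T09:42:13Z Ready unmap Datastore VMFS01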

To make the script executable, do:

# chmod 744 ./unmap.sh

To make sure the script continues to run even if we exit the shell or log out from the ESXi host, we use the nohup command.

To start the unmap, we do:

# nohup ./unmap.sh > unmap_err.log 2>&1 &
# echo $! > save_pid.txt

The first line starts the unmap script. All standard output and standard error are redirected to the logfile unmap_err.log.
Note: Do not forget the second & at the end of the line!

The second line stores the process ID of the first command in a text file. This can be useful in case the unmap script needs to be stopped.
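
Should the script ever need to be stopped, the saved process ID can be used; a minimal sketch:

# kill $(cat save_pid.txt)

Note that this stops the script itself; an unmap operation that is already in progress may still run to completion.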

After starting the unmap script, check if it works correctly:

# cat save_pid.txt

This must show the process ID (a decimal number).
To check if the process is running, do:

# ps | grep <process id>

This should return something like this:

37999 37999 sh

With the following command we can see which datastore is currently being unmapped:

# tail -f unmap.log

Unmap is also logged in the hostd.log:

# tail -f /var/log/hostd.log

shows many lines like this one:

2018-03-06T07:16:24.492Z info hostd[2E389B70] [Originator@6876 sub=Libs opID=esxcli-7-5219 user=root] Unmap: Async Unmapped 200 blocks from volume 5a0ec1c0-0233b277-ada-38eaa7171020

You can now log off from the ESXi host and, if necessary, repeat these actions in another cluster. The script continues to run. To check the progress, log on again and view the log:

# cd /tmp
# tail unmap.log

It is also possible to use PowerCLI to create an unmap script. The advantage is that you only need to connect to a single vCenter Server. Unfortunately, I ran into an issue where the script timed out after 30 minutes in all environments except my home lab. This behaviour has been reported by others as well.

Driven by the urgency to perform the unmap, I chose the approach described here, which worked well.

Update 16-03-2018

The script unmap.sh needed some improvement:

The parameter -n <value>, the reclaim unit, has been added. This is the number of VMFS blocks to unmap per iteration. The parameter is optional; the default value is 200. Have a look at the best practices for the storage in use; in some cases a much higher value is recommended by the vendor.
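
For example, if the storage vendor recommends a larger reclaim unit, the esxcli line in the script can be adjusted accordingly (the value 1200 is purely illustrative; check the vendor's documentation):

esxcli storage vmfs unmap -l "$ds" -n 1200 || exit 1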

The logging has been improved and now has the same layout as other vSphere log files.

 
