
Veeam Direct SAN mode and PernixData FVP

In larger environments, many users of Veeam Backup & Replication leverage Direct SAN mode for protecting their VMs. This mode provides the best and most predictable performance, as it reads data blocks directly from VMFS datastores via iSCSI or Fibre Channel (FC) without using the ESXi host’s VMkernel network interfaces or storage I/O stack. Direct SAN mode does, however, introduce some complexity when combined with host-based write-back caching such as PernixData FVP.

Write-through vs. write-back

The difference between write-through and write-back caching is the way write I/O is acknowledged by the storage subsystem. With write-through caching, a write is only acknowledged once it has been committed to the underlying storage; the cache merely keeps a copy to accelerate subsequent reads. With write-back caching, the write is acknowledged as soon as it arrives in the cache, and the block is committed to the underlying storage at a later point in time. This is exactly why most RAID controllers offering write-back caching are battery backed: in case of a power outage, the commit of cached blocks can be resumed later.
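To make the distinction concrete, here is a purely conceptual PowerShell sketch showing when each policy returns the acknowledgement. The two in-memory lists and the Invoke-* functions are illustrative stand-ins, not any real storage API:

$cache = @()   # stands in for the cache layer
$array = @()   # stands in for the underlying storage array

function Invoke-WriteThrough([string]$Block) {
    $script:cache += $Block    # keep a copy in cache to accelerate reads
    $script:array += $Block    # commit to the underlying storage first...
    Write-Output "ACK $Block"  # ...and only then acknowledge the write
}

function Invoke-WriteBack([string]$Block) {
    $script:cache += $Block    # the block lands in the cache only
    Write-Output "ACK $Block"  # acknowledged immediately - the array has not seen it
    # the commit to $array happens asynchronously at a later point in time
}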

PernixData FVP adds a layer of cache much closer to the application itself, and the underlying storage array is therefore completely unaware of such outstanding (cached) writes. If Veeam Backup & Replication reads the VMFS datastore before those writes have been committed, the result can ultimately be silent corruption.

By temporarily switching the caching method to “write-through”, the outstanding write-back cache is committed to the array, and for the duration of the backup, PernixData FVP keeps offloading the storage array by offering read caching.
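For a single VM, that temporary switch boils down to something like the sketch below. It assumes the PernixData FVP PowerShell snap-in; the management server name and credentials are placeholders from my lab, and the Connect-PrnxServer parameter names should be verified against your FVP release:

# Assumes the PernixData FVP PowerShell snap-in (PrnxCli) is installed
Add-PSSnapin PrnxCli

# Placeholder management server and credentials - adjust for your environment
Connect-PrnxServer -NameOrIPAddress 'fvp01.lab.local' -UserName 'admin' -Password 'secret'

# Flush outstanding writes; FVP keeps read caching during the backup window
Set-PrnxAccelerationPolicy -Name 'MyVM' -WriteThrough

# ... backup runs here ...

# Re-enable write-back caching with the desired number of peers
Set-PrnxAccelerationPolicy -Name 'MyVM' -WriteBack -NumWBPeers 1 -NumWBExternalPeers 1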

Backup using hot-add mode or network mode

If you are using virtual appliance mode (hot-add) or network mode (nbd) for backups, you are not affected by this issue. My colleague Luca Dell’Oca (@dellock6) has already blogged about the required settings for protecting write-back cached virtual machines with virtual proxy servers here > Veeam Backup with PernixData write-back caching.

In the same article, Luca touches on the issue I am discussing in this post, but I wanted to provide a reference script to make life easier for anyone using or evaluating PernixData FVP. If you are not already following Luca’s blog, I cannot recommend it enough.

VeeamPrnxCacheControl.ps1

The script has been tested with the following configuration:

  • VMware PowerCLI 5.5
  • PernixData FVP 2.5 (should work with FVP 2.0)
  • Veeam Backup & Replication 8.0 Update 1 (should also work with v7)

The script is available on my GitHub page > VeeamPrnxCacheControl.ps1
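For those who prefer reading code before downloading it, the core of the script looks roughly like the simplified outline below. It assumes the Veeam (VeeamPSSnapin) and PernixData (PrnxCli) snap-ins, reuses the placeholder FVP server from the earlier sketch, and omits the peer-settings handling and power-state checks described in the update that follows:

param(
    [Parameter(Mandatory = $true)][string]$JobName,
    [Parameter(Mandatory = $true)][ValidateSet('WriteThrough', 'WriteBack')]
    [string]$Mode
)

Add-PSSnapin VeeamPSSnapin
Add-PSSnapin PrnxCli
Connect-PrnxServer -NameOrIPAddress 'fvp01.lab.local' -UserName 'admin' -Password 'secret'

# Resolve the Veeam job and the names of the VMs it contains
$job = Get-VBRJob -Name $JobName
$vmNames = $job.GetObjectsInJob() | ForEach-Object { $_.Name }

# Apply the requested caching policy to every VM in the job
foreach ($name in $vmNames) {
    if ($Mode -eq 'WriteThrough') {
        Set-PrnxAccelerationPolicy -Name $name -WriteThrough
    }
    else {
        Set-PrnxAccelerationPolicy -Name $name -WriteBack -NumWBPeers 1 -NumWBExternalPeers 1
    }
}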

Update March 24, 2015

The initial feedback on the script and post was quite overwhelming, and best of all, I received some brilliant feature requests and suggestions as well. First, the script did not save the VM’s previous write-back peer configuration, so it would default to -NumWBPeers 1 and -NumWBExternalPeers 1, which is probably not a good fit for most environments. These settings are now stored in a temporary file in C:\Temp, so ensure this folder exists and is writable by your Veeam Backup & Replication service account.
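A sketch of how that save/restore could work, using Export-Clixml/Import-Clixml per VM. Get-PrnxVM and the property names are assumptions on my part, so check what the snap-in actually returns in your environment:

$settingsFile = "C:\Temp\$name.prnx.xml"

if ($Mode -eq 'WriteThrough') {
    # Persist the current peer configuration before flushing the cache
    $current = Get-PrnxVM -Name $name   # assumed cmdlet/properties - verify locally
    @{ Peers = $current.NumWBPeers; ExternalPeers = $current.NumWBExternalPeers } |
        Export-Clixml -Path $settingsFile
    Set-PrnxAccelerationPolicy -Name $name -WriteThrough
}
else {
    # Restore the saved configuration, falling back to 1/1 if none was stored
    $saved = if (Test-Path $settingsFile) { Import-Clixml -Path $settingsFile }
             else { @{ Peers = 1; ExternalPeers = 1 } }
    Set-PrnxAccelerationPolicy -Name $name -WriteBack `
        -NumWBPeers $saved.Peers -NumWBExternalPeers $saved.ExternalPeers
}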

Powered-off virtual machines are now simply skipped, and vCenter connectivity is bypassed entirely when using -Mode WriteBack. Both changes were implemented to speed up the script.
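The power-state check is the only part that needs vCenter, which is why the connection can be skipped for -Mode WriteBack. A sketch using PowerCLI, with a placeholder vCenter name:

if ($Mode -eq 'WriteThrough') {
    # vCenter is only needed to determine power state
    Connect-VIServer -Server 'vcenter.lab.local' | Out-Null
    $vmNames = Get-VM -Name $vmNames |
        Where-Object { $_.PowerState -eq 'PoweredOn' } |
        ForEach-Object { $_.Name }
}
# With -Mode WriteBack, the vCenter connection is bypassed entirely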

Each VM in the job needs to be explicitly added to the FVP console

When using the Set-PrnxAccelerationPolicy cmdlet, the PernixData snap-in will throw errors if a VM in the job has not been added under Cluster > PernixData tab > Configure > Datastores/VMs. You can see how my environment looks.
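If you want a run to continue past such VMs rather than abort, wrapping the call in try/catch is a simple approach. This is my own illustration, not necessarily how the published script handles it:

try {
    Set-PrnxAccelerationPolicy -Name $name -WriteThrough -ErrorAction Stop
}
catch {
    # Typically means the VM is missing under the PernixData tab in vSphere
    Write-Warning "Could not set policy for '$name': $_"
}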

Configuring Veeam Backup & Replication

All jobs that should make use of the script need to have the pre/post job settings configured. You can find these settings by editing your job > Storage > Advanced > Advanced tab. The dialog looks like this:

In these fields, use the following two command lines to launch the PowerShell script before the job starts processing and again after it completes.

Pre-job setting:

C:\windows\system32\WindowsPowerShell\v1.0\powershell.exe -Command C:\backup\VeeamPrnxCacheControl.ps1 -JobName 'My Backup Job' -Mode WriteThrough

Post-job setting:

C:\windows\system32\WindowsPowerShell\v1.0\powershell.exe -Command C:\backup\VeeamPrnxCacheControl.ps1 -JobName 'My Backup Job' -Mode WriteBack

While the job is running, you should see PernixData FVP changing the cache settings for each VM.

In the Veeam job session log, you can also see that the pre- and post-job scripts are executed.

Share it!

A special thanks to Frank Brix Pedersen (@frankbrix) for helping me configure PernixData FVP in my lab, and to Brandon Willmott (@bdwill) for sharing his scripts developed for NetApp Virtual Storage Console (VSC). They were a great inspiration for how the write-back peer settings could be stored.

Sharing is caring, so if you found this useful, I will be happy to see you Tweet about it!