Active Full Backup for Backup Copy Job

UPDATE: Starting version 9, this is no longer required. Please read more here > Ultimate Guide to Version 9

In Veeam Backup & Replication v7, the Backup Copy Job was introduced. By nature, this job type uses the “forever forward incremental” method, which version 8 later brought to primary backup jobs as well.

The “forever” aspect of these job types is implemented by transforming the oldest incremental backup file (VIB) in the chain once the desired retention is reached. The transform merges the data blocks of the VIB file into the full backup file (VBK). For example, with a retention of 7 restore points, creating the 8th point triggers a merge of the oldest VIB into the VBK, moving the full backup forward by one cycle. Such transforms are also referred to as synthetic operations.

The purpose of this blog post is to provide a workaround for users of Veeam Backup & Replication v7 or v8 who use a deduplication appliance as the backup repository for the Backup Copy Job, so they can avoid such synthetic operations.

Deduplication Storage and Synthetic Operations

So, why avoid synthetic operations? Synthetic full backups have been around for many years, and they were introduced to mitigate the issue of having to perform traditional active full backups with large amounts of data. In larger infrastructures, active full backups simply do not scale. For more information about how synthetic full backups are created, please refer to the Veeam User Guide > How Synthetic Full Backup Works

When using the Backup Copy Job, archive restore points (grandfather-father-son or GFS) will be created as synthetic full backups, while the daily transform process will also read and write existing data blocks in the backup repository. The transform process and creating “synthetic full” restore points are by nature random workloads (1x random read, 1x random write).

In addition to extremely efficient data reduction mechanisms, most deduplication appliances implement a level of read-ahead to speed up restore operations. Read-ahead provides a significant performance boost for sequential operations such as full VM restores, but it typically incurs a severe penalty for any synthetic operation, such as transforming incremental backups and creating synthetic full backups. Beyond synthetic operations, the impact of read-ahead extends to features such as Instant VM Recovery, file- and item-level recoveries with the Explorer products, and Virtual Lab, but I will leave that for a future blog post. Until then, I will simply refer to this post on the Veeam Community Forums (it applies to any version) > Version 7 and Deduplication devices.

Some deduplication appliances implement APIs to offload synthetic operations to the appliance itself, such as EMC DataDomain DDBoost. DDBoost is supported natively in Veeam Backup & Replication v8, so synthetic operations run much faster, while the penalty during restores remains unchanged. Veeam also has native support for ExaGrid, and while the implementation is a bit different, it can be considered as efficient as DDBoost for synthetic operations.

This blog post is aimed at deduplication appliances where such implementations cannot be leveraged natively in Veeam Backup & Replication, e.g. EMC DataDomain without DDBoost, HP StoreOnce, NetApp SteelStore, Quantum DXi, or Dell DR4000 and DR6000.

VeeamActiveBackupCopy.ps1

To prevent any synthetic operations from occurring, the following steps configure the Backup Copy Job so that it never exceeds its configured retention period for a single chain. Instead of continuing to work on the same backup chain, a new chain is created by forcing Veeam Backup & Replication to perform an active full backup, as if it were the very first cycle of the Backup Copy Job:

  1. Configure the repository as Rotated Media
  2. Use the PowerShell script provided below to relocate the chain of the Backup Copy Job once it has reached the desired retention

The script will look up the configured simple retention points for the Backup Copy Job and relocate them to a folder prefixed with “Archive” in the same location as the original Backup Copy Job target folder. Please note that no retention handling is implemented, except for the parameters explained later.
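For reference, the core of the relocation logic looks roughly like the sketch below. This is a simplified illustration and not the published script: the job name and UNC path are hypothetical placeholders (the real script detects the job name automatically), and an SMB repository referenced by UNC path is assumed.

param(
    [bool]$DeleteIncremental = $false,
    [bool]$DeleteOldChain = $false
)

# Load the Veeam snap-in (v7/v8) if it is not already loaded
Add-PSSnapin VeeamPSSnapin -ErrorAction SilentlyContinue

$job = Get-VBRJob -Name "Backup Copy Job 1"    # placeholder - the real script auto-detects this
$repoPath = "\\dedupe01\backups"               # hypothetical UNC path to the SMB repository

# The Backup Copy Job writes its chain into a folder named after the job
$chainDir = Join-Path $repoPath $job.Name
$archiveDir = Join-Path $repoPath ("Archive-{0}-{1}" -f $job.Name, (Get-Date -Format yyyyMMdd))

# Move the completed chain aside; on its next run the job finds no VBK
# and starts a new chain with an active full backup
New-Item -ItemType Directory -Path $archiveDir -Force | Out-Null
Get-ChildItem -Path (Join-Path $chainDir "*") -Include *.vbk, *.vib |
    Move-Item -Destination $archiveDir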

1. Rotated Media

Version 8

As described in the article from the Veeam manual (Backup Repositories with Rotated Drives), it is possible to force the software to recreate the backup chain if it no longer exists. We can leverage this feature to force an active full backup and thus avoid a synthetic operation. It is as simple as a checkbox in the advanced repository configuration:

Rotated Drives configuration

Other versions

For Veeam Backup & Replication older than v8, the following registry key is available: ForceCreateMissingVBK (DWORD). In version 7 Patch 3, an additional registry key was introduced: ForceDeleteBackupFiles (DWORD). Before setting these registry keys, please read more about their behaviour in the following knowledge base article, as they can end up wiping your entire repository if not handled with care. As the names imply, they were designed for rotated media > Veeam KB1154
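For example, the key can be created from an elevated PowerShell prompt as shown below. The registry path is the standard Veeam Backup & Replication location, but please verify the exact value names and data against the KB article before applying anything:

# See Veeam KB1154 before enabling this - it changes how missing backup files are handled
New-ItemProperty -Path "HKLM:\SOFTWARE\Veeam\Veeam Backup and Replication" `
    -Name "ForceCreateMissingVBK" -PropertyType DWord -Value 1 -Force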

2. Post Job Script

Get the script from my GitHub page > VeeamActiveBackupCopy.ps1

Save the script to C:\Veeam\VeeamActiveBackupCopy.ps1. Ensure that your Veeam service account has permission to access this folder, as well as full permission to access and rename directories in the backup repository. Configure the script as a post-job activity for your Backup Copy Job using this command:

C:\windows\system32\WindowsPowerShell\v1.0\powershell.exe -Command C:\Veeam\VeeamActiveBackupCopy.ps1 -DeleteIncremental $false -DeleteOldChain $false

The script takes two parameters: -DeleteIncremental and -DeleteOldChain. The default setting for both is $false, so the script will simply relocate the backup files to the designated “Archive” folder.

  • -DeleteIncremental $true will remove incremental backups from previous chains once a new chain is added. This is comparable to GFS-style retention (see the sketch below this list)
  • -DeleteOldChain $true will remove both previous full and incremental backups, preserving only the most recent chain
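Reusing the placeholders from the sketch earlier, the two switches could be implemented roughly as follows. Again, this is an illustration of the behaviour, not the published script:

# Sort the archive folders newest first and keep the most recent one untouched
$oldChains = Get-ChildItem -Path $repoPath -Filter "Archive*" |
    Where-Object { $_.PSIsContainer } |
    Sort-Object LastWriteTime -Descending |
    Select-Object -Skip 1

if ($DeleteOldChain) {
    # Remove older chains entirely (full and incremental backups)
    $oldChains | Remove-Item -Recurse -Force
}
elseif ($DeleteIncremental) {
    # GFS style: keep the full backups (VBK), remove only the incrementals (VIB)
    $oldChains | ForEach-Object {
        Get-ChildItem -Path $_.FullName -Filter *.vib | Remove-Item -Force
    }
}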

The default behaviour and -DeleteIncremental $true can be seen in the following two screenshots.

3. Verification

To verify your script works, simply validate that post job activity has completed successfully in your Backup Copy Job session:

Active Backup Copy Job session
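You can also confirm on the repository itself that the archive folders are being created, e.g. using the same hypothetical UNC path as in the sketch above:

Get-ChildItem -Path "\\dedupe01\backups" -Filter "Archive*" | Select-Object Name, LastWriteTime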


Kudos to my colleague Tom Sightler for providing the part of the PowerShell script which will automatically detect the name of the job. It really simplifies the usage of the script!

If you found this blog post to be helpful, I would be happy to see you sharing it with your social networks.

Comments

  • Paul Elson

    Hi Preben, thanks for the article.

    I have the same situation using a DR4100 to land remote BCJs. The GFS operations can take 10-12 hours for a 600 GB backup, and 2-3 days for our biggest jobs (2-3 TB).

    I am going to test this script, but I was wondering if your article assumes the dedupe appliance is on the LAN (1 Gbit or faster). I am not sure if it will trigger an active full to run, which could be a problem if it runs over the WAN.

    Thanks

    Paul

    • poulpreben

      Hi Paul,

      Thank you for the feedback – I am glad you liked it!

      What you are experiencing is the exact reason for creating this script. Creating synthetic full backups at only 12-15 MB/s (3 TB in 3 days) obviously does not scale very well.

      I can confirm that the script will create a new active full backup over the WAN if your appliance is in a remote location. This script was developed primarily for users who run Backup & Replication within a datacenter or over high-bandwidth links.

      Unless your WAN connection is able to transfer the active full backup (maybe you could consider WAN acceleration?), this script is probably not a good fit for you.

      Thanks,
      Preben

      • Paul Elson

        Thanks Preben, back to the drawing board for me and Veeam then.

        Cheers

        • poulpreben

          I have given this some thought. You mention creating GFS points take too long. What about the daily transform time – does that fit within your backup window?

          In case it does, you could potentially just disable GFS on the Backup Copy Job and copy the VBK file occasionally, rather than moving it as my script does. This way you will retain multiple full backups, while still being able to run incremental forever on the Backup Copy Job. A simple copy could look like the example below.
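          For example, something like this could run as a scheduled task (the paths are examples only, and the target folder must already exist):

          Copy-Item -Path "\\dedupe01\backups\Backup Copy Job 1\*.vbk" -Destination "\\dedupe01\backups\Archive-Manual\"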

          Thanks,
          Preben

          • Paul Elson

            Hi Preben, Thanks for your message.

            This certainly could be a solution. Currently I have only one BCJ, which is set to retain 14 restore points with no GFS. This job is quite small though, about 41 GB, and I can see the daily transform time is very quick.

            From “Starting full backup file merge” to “completed successfully” took around 20 seconds.

            I changed a job which is about 1.8 TB to a similar setting (no GFS) and moved the *M.vbk and *W.vbk files to an Archive folder (otherwise the job asks me if I want to delete these full backups).

            Assuming the daily transform process isn’t as intensive on the DR4100 as creating a synthetic full (GFS), it should be OK.

            This job has been disabled for about 2 weeks, so it’s recreating the fingerprints, and this could take all day. I will give you an update when it completes.

            Thanks again

            Paul

          • Andrey Vakhitov

            Hi.
            If you can change the job type from backup copy to a simple backup job, then I suggest another solution for v8:
            1) Create a backup job;
            2) In the advanced storage settings, enable “Active full backup”, for example every Monday;
            3) Schedule the job to run only on the same days as the active full (e.g. daily at 22:00, Mondays only).
            This way, every run of the job produces an active full backup.

  • Tomas

    Thanks for a great article!
    We have been thinking of using an HP StoreOnce in a datacenter and running backup copy jobs to it. We have 100 Mbit internet connections between 15 sites, so WAN accelerators are not recommended by Veeam as far as I know (only for links up to 60 Mbit). I don’t think it will work for us as described in the article, since these new full backups would have to go over the WAN, I assume.
    Thanks!

    • poulpreben

      Hi Tomas,

      Thank you for stopping by!

      For your use case, WAN acceleration will work great. You have 15 sites sharing 100 Mbit/s, and since you probably only have relatively small backup copy jobs, I am not even sure the issue described in this article will have any significant impact on your environment.

      This is a bit off-topic though. I would be happy to discuss the possibilities if you send me an e-mail at pberg at veeam.com.

      Thanks,
      Preben

  • Jeremy

    I have this implemented, but the archive folder never gets created. The job shows that the script ran fine, and I checked all of the permissions.

    • poulpreben

      Hi Jeremy. I am curious why the script is not working for you. Which commands did you specify in the job configuration wizard? Does the script work if you run it manually (remember you need to declare $job before doing that)? A couple of screenshots would be very helpful indeed.

      -Preben

      • Jeremy

        This is what I am seeing when running manually.

        • poulpreben

          Hi Jeremy. The $job variable gets overwritten by the script. You may want to comment out lines 9-11, and replace them with

          $job = Get-VBRJob -Name "Test Copy"

          -Preben

          • Jeremy

            The G drive does exist on the remote repository. Is it running against the local server even when the job is running? Should I drop the script on the remote repo and then change the post-script to point to the remote server?

          • poulpreben

            The script is only tested for SMB repositories, which is the protocol used for the vast majority of affected deduplication devices as described in this blog post.

            SMB repositories are referenced in the Veeam database as UNC paths, and thus the post-job script can reach out to this location for renaming files. It will not work if you use a Windows-based repository on another server, as the script is executed on the Veeam Backup & Replication server itself.

            If you want to make it work for such scenarios, please feel free to fork the script over at GitHub. Under normal circumstances, this should definitely not be required.

            Out of curiosity, could you please share your repository configuration? What make and model is your target storage, and how is it configured?

          • Jeremy

            This was all set up as a test, so I was able to blow away the repo and change it to CIFS. I am running it again over the next few days to see if it works as expected. Thanks for the help!

  • Joosef

    Tested this script, and it works great. Curious if there is a way to add a cleanup portion to this script for archive chains. Ideally I would like to keep a full week chain (7 restore points), and when the new chain completes and is archived, the older chain is purged.

    • Preben Berg

      Hi Joosef. You may look forward to using version 9 as the feature is 100% built-in there. I’ll be updating this post with the necessary steps soon.

      Thanks,
      Preben

    • poulpreben

      Hi Joosef. The functionality described in this blog post has been natively included in version 9. Please see this post > https://poulpreben.com/ultimate-guide-to-version-9/#h4-a-blast-from-the-past