Windows Azure Backups with Bacpac – Automating Exports with Data Sync enabled (Part 4)

Here comes the final installment of my Azure backup series.  This time we’ll be looking at backing up databases with Data Sync enabled.

Since I originally wrote the first post and planned out the four posts in the series, Microsoft has fixed the issue with restoring a database that has Data Sync enabled.  As best I can tell, the problem was with the merge statements added as triggers on all of the tables.  There are various reasons why you’d want to back up all of the Data Sync data, but the main one is that if you ever restore the database you’ll have a full set of data available (even if it is a bit stale) and won’t have to wait for Data Sync to repopulate it.

The solution presented below strips out all of the Data Sync data, so the resulting backup contains only your production data and not the data that is most likely already stored on-site (on your on-premises servers).  Removing that data can potentially make the backup/restore faster (it really depends on the amount of data) and, more importantly, makes re-enabling Data Sync much quicker.  In my tests with a bit less than 1 GB of data, performing the initial Data Sync against the stale data ran for more than 8 hours and still did not complete.  In comparison, the same database with the Data Sync data stripped out finished its initial sync in under an hour.  In the event of a catastrophic failure, an extra hour of delay in exchange for up-to-date information is my preference.

Backup With Data Sync

The function itself is quite long (~150 lines of code), but essentially it goes through and removes the tables, triggers, types, and schema that are added when you provision a database for Data Sync.  The link below will take you right to the line where the function starts, and a simplified sketch of the idea follows it.

https://github.com/anlai/AzureDbBackup/blob/master/AzureBackup/AzureStorageService.cs#L236
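
To give a sense of what that cleanup involves, here is a heavily simplified sketch (it skips the user-defined types and a few other objects).  The “DataSync” schema name and the “_dss_” trigger naming are assumptions about how provisioning names its artifacts; the linked RemoveDataSync function in the repository is the authoritative version.

using System.Collections.Generic;
using System.Data.SqlClient;

// Illustrative sketch only; the linked RemoveDataSync is the authoritative version.
// The "DataSync" schema name and the "_dss_" trigger naming are assumptions; verify
// them against the objects actually present in your provisioned database.
public static class DataSyncCleanupSketch
{
    public static void Run(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // build DROP statements for the sync triggers and for the objects in the sync schema
            var drops = new List<string>();
            const string findObjects =
                "SELECT 'DROP TRIGGER [' + OBJECT_SCHEMA_NAME(object_id) + '].[' + name + ']' " +
                "FROM sys.triggers WHERE name LIKE '%[_]dss[_]%' " +
                "UNION ALL " +
                "SELECT 'DROP ' + CASE type WHEN 'P' THEN 'PROCEDURE' ELSE 'TABLE' END + " +
                "' [DataSync].[' + name + ']' " +
                "FROM sys.objects WHERE SCHEMA_NAME(schema_id) = 'DataSync' AND type IN ('U', 'P')";

            using (var cmd = new SqlCommand(findObjects, conn))
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read()) drops.Add(reader.GetString(0));
            }

            // run each DROP, then remove the now-empty schema
            drops.Add("DROP SCHEMA [DataSync]");
            foreach (var drop in drops)
            {
                using (var cmd = new SqlCommand(drop, conn)) cmd.ExecuteNonQuery();
            }
        }
    }
}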

In the service I’ve written, the following function creates a copy of the database, removes the Data Sync objects, exports the copy to blob storage, and drops the temporary copy.  Doing this by hand is tedious because you have to sit and wait for the copy and export operations, which can take quite a while.

public string BackupDataSync(string database, out string filename)
{
    var srcDatabase = database;
    var tmpDatabase = database + "Backup";

    var masterDbConnectionString = string.Format(ConnString, _serverName, "master", _sqlUsername, _sqlPassword);
    var tmpDbConnectionString = string.Format(ConnString, _serverName, tmpDatabase, _sqlUsername, _sqlPassword);

    // make a copy that we can alter
    CreateCopy(srcDatabase, tmpDatabase, masterDbConnectionString);

    // remove the datasync tables/triggers/sprocs
    RemoveDataSync(tmpDbConnectionString);

    // export the copy to blob storage
    var reqId = Backup(tmpDatabase, null, out filename);

    // keep checking until the export is complete
    do
    {
        var response = GetStatus(reqId);

        if (response == "Completed") break;

        // wait 30 seconds before checking again
        Thread.Sleep(30000);

    } while (true);

    // drop the temporary database copy
    CleanupTmp(tmpDatabase, masterDbConnectionString);

    return reqId;
}
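
CreateCopy isn’t shown above, but it wraps the Azure SQL CREATE DATABASE ... AS COPY OF command.  Here is a rough sketch of what such a helper could look like, assuming you poll sys.databases until the copy comes online; the repository has the real implementation.

using System;
using System.Data.SqlClient;
using System.Threading;

// Rough sketch of a CreateCopy helper; not the repository's actual implementation.
// CREATE DATABASE ... AS COPY OF is the Azure SQL copy mechanism and must be issued
// against the master database; the copy runs asynchronously, so we poll for completion.
private static void CreateCopySketch(string srcDatabase, string tmpDatabase, string masterDbConnectionString)
{
    // kick off the asynchronous database copy
    using (var conn = new SqlConnection(masterDbConnectionString))
    {
        conn.Open();
        var copySql = string.Format("CREATE DATABASE [{0}] AS COPY OF [{1}]", tmpDatabase, srcDatabase);
        using (var cmd = new SqlCommand(copySql, conn)) cmd.ExecuteNonQuery();
    }

    // poll master until the copy reports ONLINE (it sits in a COPYING state while in progress)
    while (true)
    {
        using (var conn = new SqlConnection(masterDbConnectionString))
        {
            conn.Open();
            using (var check = new SqlCommand("SELECT state_desc FROM sys.databases WHERE name = @name", conn))
            {
                check.Parameters.AddWithValue("@name", tmpDatabase);
                var state = check.ExecuteScalar() as string;
                if (string.Equals(state, "ONLINE", StringComparison.OrdinalIgnoreCase)) return;
            }
        }

        // wait 30 seconds before checking again
        Thread.Sleep(30000);
    }
}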

The last bit of getting this to work is to initialize the class, call the export, and then perform the blob cleanup.  You will want to change the DAC service URL parameter to match the data center you’ll be using.  The current URLs are coded into the sample in the DacServiceUrls class, so you can use the following options (as long as the URLs don’t change):

  • DacServiceUrls.NorthCentralUs
  • DacServiceUrls.SouthCentralUs
  • DacServiceUrls.NorthEurope
  • DacServiceUrls.WestEurope
  • DacServiceUrls.EastAsia
  • DacServiceUrls.SoutheastAsia

var service = new AzureStorageService(
    ConfigurationManager.AppSettings["AzureServerName"],
    ConfigurationManager.AppSettings["AzureUserName"],
    ConfigurationManager.AppSettings["AzurePassword"],
    ConfigurationManager.AppSettings["AzureStorageAccountName"],
    ConfigurationManager.AppSettings["AzureStorageKey"],
    ConfigurationManager.AppSettings["AzureBlobContainer"],
    DacServiceUrls.NorthCentralUs
    );

// perform the backup
string filename;    // filename generated for the bacpac backup
service.BackupDataSync("DatabaseName", out filename);

// cleanup the storage container
service.BlobCleanup();

Azure Backup Sample Application

I’ve written a sample application, a basic console application for testing purposes.  You can download the source from my GitHub project AzureDbBackup.  It is, however, missing the App.config file (mostly because I didn’t trust myself not to check in the credentials for my Azure account).  You’ll need to add it yourself; you can copy and paste from below and fill in your information:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <appSettings>
    <add key="AzureServerName" value="server.database.windows.net" />
    <add key="AzureUserName" value="server user name" />
    <add key="AzurePassword" value="password" />
    <add key="AzureStorageAccountName" value="blank" />
    <add key="AzureStorageKey" value="blank" />
    <add key="AzureBlobContainer" value="blank" />
  </appSettings>
</configuration>

  • AzureServerName – The address of your Azure SQL server.
  • AzureUserName – Username to log into your Azure SQL server.
  • AzurePassword – Password to log into your Azure SQL server.
  • AzureStorageAccountName – Name of the storage account where the bacpac files will be written.
  • AzureStorageKey – Storage key (primary or secondary) used to access the storage account.
  • AzureBlobContainer – Container inside the above storage account.

My recommendation is to give the console application a run first, before integrating it into your own application.  If you have any suggestions/fixes, I’m happy to integrate them.

Integrating into Your Application

Sure, you can run the above console application on an on-premises server and use Windows Task Scheduler to run it every night.  Or you can integrate it into something running on Azure and have the code run on a schedule for you.  If your application runs on more than one instance, you probably don’t want to schedule the backup from the web application itself; a worker role is a better home for it.  Either way, essentially the same integration steps apply.

  1. Take the AzureStorageService.cs file and drop it into your application.  Or download the solution, build the class library, and pull the DLL into your project (doing it this way also gives you the DAC service URLs for the data centers).
  2. Next, set up a scheduler to perform the backups on a regular interval.  We used the Quartz.NET job scheduler to run a nightly backup job.  Below is a sample setup of a scheduled job using the Quartz scheduler.
  3. It should do its magic every night, and you can collect the kudos.
public class DatabaseBackup : Job
{
    public static void Schedule()
    {
        // create the job detail, pointing Quartz at this job class
        var jobDetails = JobBuilder.Create<DatabaseBackup>().Build();

        var storageAccountName = ConfigurationManager.AppSettings["AzureStorageAccountName"];
        var serverName = ConfigurationManager.AppSettings["AzureServerName"];
        var username = ConfigurationManager.AppSettings["AzureUserName"];
        var password = ConfigurationManager.AppSettings["AzurePassword"];
        var storageKey = ConfigurationManager.AppSettings["AzureStorageKey"];
        var blobContainer = ConfigurationManager.AppSettings["AzureBlobContainer"];

        var nightly =
            TriggerBuilder.Create()
                          .ForJob(jobDetails)
                          .WithSchedule(CronScheduleBuilder.DailyAtHourAndMinute(1, 30).InPacificTimeZone())
                          .UsingJobData("StorageAccountName", storageAccountName)
                          .UsingJobData("AzureServerName", serverName)
                          .UsingJobData("AzureUserName", username)
                          .UsingJobData("AzurePassword", password)
                          .UsingJobData("AzureStorageKey", storageKey)
                          .UsingJobData("AzureBlobContainer", blobContainer)
                          .StartNow()
                          .Build();

        var sched = StdSchedulerFactory.GetDefaultScheduler();
        sched.ScheduleJob(jobDetails, nightly);
        sched.Start();
    }

    public override void ExecuteJob(IJobExecutionContext context)
    {
        var storageAccountName = context.MergedJobDataMap["StorageAccountName"] as string;
        var serverName = context.MergedJobDataMap["AzureServerName"] as string;
        var username = context.MergedJobDataMap["AzureUserName"] as string;
        var password = context.MergedJobDataMap["AzurePassword"] as string;
        var storageKey = context.MergedJobDataMap["AzureStorageKey"] as string;
        var blobContainer = context.MergedJobDataMap["AzureBlobContainer"] as string;

        // initialize the service
        var azureService = new AzureStorageService(serverName, username, password, storageAccountName, storageKey, blobContainer, DacServiceUrls.NorthCentralUs);

        // make the call to perform the backup ("PrePurchasing" is our database name; use your own here)
        string filename;
        var reqId = azureService.BackupDataSync("PrePurchasing", out filename);

        // clean up the blob
        azureService.BlobCleanup();
    }
}
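
If you go the worker role route, one hypothetical way to wire it up is to register the job when the role starts.  The WorkerRole class below is just an illustration; it assumes the DatabaseBackup class above is in the role project and that the app settings are available to it.

using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

// Hypothetical worker role wiring; assumes the DatabaseBackup job class above is
// available in the role project and the app settings are in its configuration.
public class WorkerRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // register the nightly backup job with Quartz when the role spins up
        DatabaseBackup.Schedule();
        return base.OnStart();
    }

    public override void Run()
    {
        // keep the role alive; Quartz fires the job on its own schedule
        while (true)
        {
            Thread.Sleep(60000);
        }
    }
}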

Windows Azure Backups with Bacpac – Automating Cleanup (Part 3)

Welcome to part 3 of my Azure backup series.  We’ll be looking at automating cleanup of your blob storage.  If you use this method of doing backups to blob storage (taking snapshots of the database every XX hours/days), you’ll eventually have a large build-up of outdated backups.  So rather than logging in to the Azure console every few days, why not automate the cleanup of your outdated bacpac files?

The following is a list of settings that are needed to get the cleanup working.

  • _storageAccountName – The name of the storage service account
  • _storageKey – One of the access keys for the storage account (primary or secondary, doesn’t matter).  Just make sure you don’t check it in to a repository!
  • _storageContainer – Name of the storage container you’d like your bacpac files to be put into.
  • _cleanupThreshold – Number of days before a backup is considered outdated and deleted.  This needs to be a negative number; I do the conversion in my constructor to ensure it’s negative (see the sketch just after this list).
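
For illustration, the constructor guard could look something like this (a sketch only; the real constructor also takes the server, credential, and storage settings, and the class and parameter names here are just illustrative):

using System;

// Sketch of the threshold guard; the real constructor takes several more parameters.
public class CleanupThresholdExample
{
    private readonly int _cleanupThreshold;

    public CleanupThresholdExample(int cleanupDays)
    {
        // store the threshold as a negative number so it can be passed
        // straight to DateTime.Now.AddDays() when filtering out old blobs
        _cleanupThreshold = -Math.Abs(cleanupDays);
    }
}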

To do the overall cleanup, I take advantage of the managed Azure Storage client APIs to get a list of all the blobs in storage and then delete the old ones.

First we want to set up a client.  It’s fairly straightforward: we just pass in a connection string and it does the rest.

// CloudStorageconnectionString is a format string along the lines of
// "DefaultEndpointsProtocol=https;AccountName={0};AccountKey={1}"
var storageAccount = CloudStorageAccount.Parse(
     string.Format(CloudStorageconnectionString, _storageAccountName, _storageKey));
var client = storageAccount.CreateCloudBlobClient();

Next, we need to tell the client which container we want to look at and get a list of all its contents.  Then we filter the list down to only the blobs that are past the threshold (the ones we want to delete).

var container = client.GetContainerReference(_storageContainer);
var blobs = container.ListBlobs(null, true);
var filtered = blobs.Where(a => a is CloudBlockBlob && 
     ((CloudBlockBlob)a).Properties.LastModified < DateTime.Now.AddDays(_cleanupThreshold))
     .ToList();

Finally, we make the calls to do the deletions.  Below is the basic call used to delete the files; in my full code I also return a list of the blobs that were deleted.

// deleted collects the names of the blobs we remove (the full method returns this list)
var deleted = new List<string>();

foreach (var item in filtered)
{
     var blob = (CloudBlockBlob)item;
     deleted.Add(blob.Name);
     blob.Delete();
}

That’s fairly straightforward, right?  Here is the code as a whole; it’s an excerpt of the full class, but it’s everything necessary to do the cleanup.
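
This is a sketch stitched together from the snippets above.  The connection string format is the standard storage account format and is an assumption about the CloudStorageconnectionString constant, and the private fields are the settings listed earlier in this post.

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

// The three snippets above assembled into one method, as a sketch.  The connection
// string format and field names are assumptions based on the snippets; the repository
// has the full class.
public List<string> BlobCleanup()
{
    const string CloudStorageconnectionString =
        "DefaultEndpointsProtocol=https;AccountName={0};AccountKey={1}";

    var deleted = new List<string>();

    // connect to the storage account and get a blob client
    var storageAccount = CloudStorageAccount.Parse(
        string.Format(CloudStorageconnectionString, _storageAccountName, _storageKey));
    var client = storageAccount.CreateCloudBlobClient();

    // list everything in the container and keep only blobs older than the threshold
    var container = client.GetContainerReference(_storageContainer);
    var blobs = container.ListBlobs(null, true);
    var filtered = blobs.Where(a => a is CloudBlockBlob &&
        ((CloudBlockBlob)a).Properties.LastModified < DateTime.Now.AddDays(_cleanupThreshold))
        .ToList();

    // delete the stale backups and report which ones were removed
    foreach (var item in filtered)
    {
        var blob = (CloudBlockBlob)item;
        deleted.Add(blob.Name);
        blob.Delete();
    }

    return deleted;
}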

Now sit back and pretend you still have to do your cleanups every day to save the company money 🙂

Windows Azure Backups with Bacpac – Automating Exports (Part 2)

This is part 2 of my Azure backups “series”.  This post goes into exporting an Azure database to a bacpac file in blob storage.  Part one just went over a couple of options for backups and how to do it manually.

Today we’ll be looking at the simple case for exporting your databases.  By simple I mean you just have your database, with no DataSync enabled on it.  If you are using DataSync, I’ve got a solution; it’s ugly, but it works (you’ll have to wait for part 4).

The basic idea is that we’re going to use the API to command Azure to export the database to a storage container.

I based my code on the following blog by Colin Farr:

http://www.britishdeveloper.co.uk/2012/05/export-and-back-up-your-sql-azure.html

It’s a great article that covers how to do it.  Honestly, my code doesn’t change a whole lot, so a lot of the credit goes to Colin for his excellent example.  I do add the option of doing a selective export, but that’s very minor.

The code I’m sharing here is only the function for performing the backup; I have rolled it into a whole class along with some cleanup code.

You basically pass in the name of the database and (optionally) a list of tables you’d like to export, and it gives you back a request id and the filename that the backup was saved to.  All of the credentials are set up through the constructor (not shown), but you could easily hard-code them into the function if you wanted.
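
To put it in context, a typical call looks something like the sketch below, using the same AzureStorageService class from the Part 4 post.  The table names are purely illustrative, and the exact type of the table-list parameter may differ from what’s shown here; check the class in the repository.

// initialize the service (same constructor shown in the Part 4 post)
var service = new AzureStorageService(
    ConfigurationManager.AppSettings["AzureServerName"],
    ConfigurationManager.AppSettings["AzureUserName"],
    ConfigurationManager.AppSettings["AzurePassword"],
    ConfigurationManager.AppSettings["AzureStorageAccountName"],
    ConfigurationManager.AppSettings["AzureStorageKey"],
    ConfigurationManager.AppSettings["AzureBlobContainer"],
    DacServiceUrls.NorthCentralUs);

// request the export; reqId is used to poll for status, filename is the bacpac name
// ("Orders" and "OrderLines" are illustrative table names for a selective export)
string filename;
var reqId = service.Backup("DatabaseName", new[] { "Orders", "OrderLines" }, out filename);

// poll until the DAC service reports the export is done
while (service.GetStatus(reqId) != "Completed")
{
    Thread.Sleep(30000);
}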

That’s it for now.

Windows Azure Backups with Bacpac (Part 1)

Backing up or snapshotting a Windows Azure database seems like it should be a pretty straightforward affair, and in reality it is, unless you have a more complicated situation.  Whether you just want a copy of your data locally or snapshots in time, it shouldn’t cost you an arm and a leg to maintain.

There are several solutions floating around out there for how to accomplish backups.  Here are just a few you can look into if you care to (these are only the ones that don’t involve paying a 3rd party vendor):

  1. Database Copy
  2. Azure Data Sync
  3. Export to Bacpac
You can read up on Microsoft’s official suggestions here.

Personally, I sort of have a problem with options 1 and 2.  They both require you to have another SQL server running (either another Azure database or a local one); the bottom line is that if you are using an Azure database, you are essentially doubling your costs.

That leaves us with exporting to Bacpac file.  I like this option since it relies on the relatively cheap blob storage and gives you the option to just download the file and save it locally.

Manual Method

The simple way to export your database to a bacpac file is to open up SQL Server Management Studio and follow these steps:

  1. Connect to your Windows Azure database server
  2. Right click on the database
  3. Select “Tasks”
  4. Select “Export Data-tier Application”
  5. Under the Export Settings, select Windows Azure and fill out the credentials
  6. Celebrate!

You can also perform the export from the Azure Management Console:

  1. Log in to the Azure Portal http://windowsazure.com
  2. Select Database
  3. Click on Export and fill out the credentials.

That was easy, right?  But who wants to actually do this manually every day or every 6 hours?

DataSync and Bacpac

There is one case that has been driving me up a wall.  Part of our database uses Data Sync to get some on-premises lookup data into our Azure database.  This brings up a problem, since DataSync adds miscellaneous tables, stored procedures, and triggers.  The triggers added to your tables are incompatible with the bacpac format and cause an error on import.  The export doesn’t give an error, but the second you try to import, it throws an error about missing semi-colons.

Microsoft seems to be aware of this issue, but I’m not seeing any traction on a fix.  There is an open ticket about it.

Update 1/11/2013:  It would appear the issue with restoring a bacpac file from a database that previously had DataSync enabled has been resolved.  However, re-enabling DataSync on the restored database can be extremely slow depending on the amount of data being synced.  With approximately 1 GB of sync data, it ran for more than 4 hours before I gave up.  It really depends on your recovery situation.  If you must have the database restored ASAP and are not concerned about slightly stale data, you can do a plain restore and the original tables, data included, will still be there.  However, I prefer to have the latest DataSync data available, so in Part 4 I have a solution for that.

Automating Bacpac Export

Doing the exports manually just isn’t an option for most of us, so in the coming posts I’ll be covering the following:

  • Part 1 : Basic Introduction to Azure Backups
  • Part 2 : Automating Exports
  • Part 3 : Automating Cleanup
  • Part 4 : Automating Exports with Data Sync enabled