So you want a rollback script?

So you’ve got a new release coming up, including upgrading your production SQL database. You’re writing great upgrade scripts, but now your DBAs are demanding rollback scripts as well. How do you approach this?

Do we really need them?

Rollback scripts are an expensive overhead to development because of the time and effort involved in writing and testing them. There are only two valid reasons you should ever need to provide them:

  1. Your database is very large, and quickly restoring a backup is not practical.
  2. You have an agreed non-functional requirement to support downgrades.

For everyone else, just restore a backup.

The naive approach (don’t do this)

So we really do need a rollback script. Okay. Here’s one that might be generated by a tool:

-- upgrade.sql
ALTER TABLE Customer DROP COLUMN CreditCardNumber;

-- rollback.sql
ALTER TABLE Customer ADD COLUMN CreditCardNumber VARCHAR(24);

What’s wrong with this? A rollback script is supposed to return you to a previous point in time. This example restores the schema version, but not the data: the upgrade dropped the CreditCardNumber column, and the rollback recreates it empty, so all the data in it is gone. It is not a proper rollback!

Rolling back the data

In order for data to be rolled back, it has to be kept somewhere as a backup. Rather than dropping old objects outright, we detach them from the schema during the upgrade, and reattach them later if a rollback is needed.

During the upgrade we need to:

  1. Detach any old tables or columns from the schema: rename them to e.g. “_backup_CreditCardNumber”, unlink any foreign keys, and remove any constraints or triggers. But don’t drop them; we need to keep them as a backup in case a rollback is required.
  2. Apply the new schema changes, e.g. create a new table and populate it from the backup data (see the sketch after this list).
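
For the CreditCardNumber example above, the upgrade might look something like this. It’s only a sketch: rename syntax varies between databases (SQL Server uses sp_rename, for example), and the CustomerPayment table and CustomerId column are made-up names standing in for whatever your release actually introduces.

-- upgrade.sql (detach, don't drop)
-- 1. Detach the old column: rename it, keeping its data.
--    (Also unlink any foreign keys and drop any constraints or triggers on it.)
ALTER TABLE Customer RENAME COLUMN CreditCardNumber TO _backup_CreditCardNumber;

-- 2. Apply the new schema changes, populating them from the backup data.
CREATE TABLE CustomerPayment (
    CustomerId INT NOT NULL REFERENCES Customer (CustomerId),
    CreditCardNumber VARCHAR(24)
);
INSERT INTO CustomerPayment (CustomerId, CreditCardNumber)
SELECT CustomerId, _backup_CreditCardNumber FROM Customer;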

After the upgrade, the backup tables and columns are still present in the schema if we need them, but are no longer active.

To roll it back we need to reverse any changes and reattach the objects:

  1. Drop any new schema objects.
  2. Reattach each backup table or column to the schema: rename it back to its original name, and recreate the old foreign keys, constraints and triggers (see the sketch below).
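
A matching rollback script, continuing the same hypothetical example, reverses those steps:

-- rollback.sql
-- 1. Drop the new schema objects created by the upgrade.
DROP TABLE CustomerPayment;

-- 2. Reattach the backup column under its original name.
--    (Also recreate the old foreign keys, constraints and triggers here.)
ALTER TABLE Customer RENAME COLUMN _backup_CreditCardNumber TO CreditCardNumber;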

Backup tables can then be dropped at any point after the release, when they are no longer required. This can be done as a DBA task, or during the following release.
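
The eventual cleanup is then just a drop of the backup objects, for example:

-- cleanup.sql (run once the rollback window has passed)
ALTER TABLE Customer DROP COLUMN _backup_CreditCardNumber;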

Testing your Upgrade/Rollback Scripts

At work we have a build set up in TeamCity that replays all our upgrade and downgrade scripts against a test database every time one is checked in — we are continuously integrating and testing our SQL scripts. It performs the following steps, and fails if any SQL errors are raised:

  1. Create an empty database.
  2. Run all the upgrade scripts, from version 1 up to current.
  3. Run all the downgrade scripts, from current back down to version 1.
  4. Run all the upgrade scripts again.

The second upgrade pass is important because it detects any objects that were missed in a rollback script: if you forgot to drop something, it will fail with an “object already exists” error. This saves a lot of time, and reduces the chance of nasty surprises on release day.

We have a big market data risk analysis database at work and this detach/reattach rollback pattern (with continuous testing) has worked well for us so far. We expect it to continue scaling as we grow, although if inserting/updating hundreds of millions of rows becomes too slow for a single release window, we may need to use streaming to populate new tables/columns offline before switching them over during an outage. But the same attach/detach principles will still apply.

September 23, 2012

6 Comments

TimR on October 3, 2012 at 1:46 pm.

Nice post! One question – do you put test data in the tables for the upgrade/downgrade/upgrade?

If so, how do you keep track of this test data to populate the tables?

If not – any suggestions :) ?

Richard on October 4, 2012 at 1:43 pm.

@Tim no we don’t. Some of the upgrade/downgrade scripts insert data into reference/lookup tables, but there is no user data, so bugs like FK violations might get missed. This is unfortunate but could be very difficult to test.

If we wanted to continuously integrate against databases with proper user data in them, I can think of two possible easy ways to do it:

1. Restore a backup of a live system (containing real data) and test the latest upgrade scripts against it. Or,
2. Fire up a new empty instance of the last production version of the app, call your API to simulate some activity and populate some data, and then test the latest upgrade scripts against it.

Otherwise juggling test data in SQL scripts could get very fiddly and time-consuming – and possibly diverge from what the application is actually doing.

TimR on October 9, 2012 at 2:02 pm.

@Richard – right yeah we do #1 but it isn’t continuous.

Guess you could do #2 and have a test that ensures there is at least one row in every table, to make sure you are calling enough of your API to populate the DB. Not bulletproof, but should catch more issues than having empty tables.

GeorgeA on November 1, 2012 at 9:50 am.

Nice post, and screenshots ;) There are actually some very good tools, like Red Gate, or even the latest SQL Server 2012 Data Tools, that can autogenerate these scripts for you, and execute them as well (if you’re not in a locked-down prod environment).

Rodger on February 2, 2013 at 5:10 pm.

Interesting. Just rename the column.
I assume that you move the data to another field also.

I’ve made backup tables when they were small.

Create table tab_backup_date as
(
  select *
  from tab
);

Naren on May 2, 2014 at 5:58 am.

Is there any risk to table records when we roll back the database to a previous version?
