So you want a rollback script?

So you’ve got a new release coming up, including upgrading your production SQL database. You’re writing great upgrade scripts, but now your DBAs are demanding rollback scripts as well. How do you approach this?

Do we really need them?

Rollback scripts are an expensive overhead to development because of the time and effort involved in writing and testing them. There are only two valid reasons you should ever need to provide them:

  1. Your database is very large, and quickly restoring a backup is not practical.
  2. Your have an agreed non-functional requirement to support downgrades.

For everyone else, just restore a backup.

The naive approach (don’t do this)

So we really do need a rollback script. Okay. Here’s one that might be generated by a tool:

-- upgrade.sqlALTER TABLE Customer DROP COLUMN CreditCardNumber;-- rollback.sqlALTER TABLE Customer ADD COLUMN CreditCardNumber VARCHAR(24);

What’s wrong with this? A rollback script is supposed to return you to a previous point in time. This example restores the schema version, but not the data — if you run it you’ll lose all the data in the CreditCardNumber column. It is not a proper rollback!

Rolling back the data

In order for data to be rolled back, it has to be kept somewhere as a backup. We have to detach and reattach it from our schema.

During the upgrade we need to:

  1. Detach any old table or columns from the schema. Rename it to e.g. “_backup_CreditCardNumber”, unlink any foreign keys, and remove any constraints or triggers. But don’t drop it — we need to keep it as a backup in case rollback is required.
  2. Apply new schema changes e.g. create new table and populate it from the backup table.

Your schema should now look something like this. The tables and columns are still present, if we need them, but no longer active.

To roll it back we need to reverse any changes and reattach the objects:

  1. Drop any new schema objects.
  2. Reattach each backup table or column to the schema. Rename it back to its original name, and recreate the old foreign keys, constraints and triggers.

Backup tables can then be dropped at any point after the release, when they are no longer required. This can be done as a DBA task, or during the following release.

Testing your Upgrade/Rollback Scripts

At work we have a build set up in TeamCity that replays all our upgrade and downgrade scripts against a test database every time one is checked in — we are continuously integrating and testing our SQL scripts. It performs the following steps, and fails if any SQL errors are raised:

  1. Create an empty database.
  2. Run all the upgrade scripts, from version 1 up to current.
  3. Run all the downgrade scripts, from current back down to version 1.
  4. Run all the upgrade scripts again.

The second upgrade pass is important because it detects any objects that were missed in a rollback script: if you forgot to drop something, it will fail with an “object already exists” error. This saves a lot of time, and reduces the chance of nasty surprises on release day.

We have a big market data risk analysis database at work and this detach/reattach rollback pattern (with continuous testing) has worked well for us so far. We expect it to continue scaling as we grow, although if inserting/updating hundreds of millions of rows becomes too slow for a single release window, we may need to use streaming to populate new tables/columns offline before switching them over during an outage. But the same attach/detach principles will still apply.