The road to automated database deployment

Working in an agile team means delivering software early and often, sometimes several times a day. The only way to achieve this without going crazy doing releases all the time is to eliminate all the manual steps in your release process, until you’re left with a single mouse-click to get your code into production.

One of the biggest challenges here is how to handle your app’s back-end database. Application files can be deleted and restored in seconds, and websites rolled back and replaced a thousand times, but the database is a living, breathing beast, closely guarded by a separate team of DBAs who usually won’t even let you see what state it’s in, let alone make changes. If you screw up a database script, it could cost hours of downtime restoring from a backup. So how do you automate changes, then? Here’s how we did it on a recent project.

Pre-requisite #1: portable script shipping

I’ll never forget the deployment process on my first major commercial project. During development, everyone shared a ‘working copy’ database, directly adding columns and tables via GUI tools. On release day, a designated person would open up SQL Compare, connect to the dev database, connect to the live database, and generate a script copying all the changes from dev to production.

Needless to say, this was a fairly sub-optimal solution. Here are some of the problems we developers ran into on a regular basis:

  • The diff scripts only worked between specific database versions. If UAT was on a faster development cycle than PROD, we couldn’t use the same scripts for both.
  • Only the final diff script was put in source control, so if the development database crashed or was accidentally wiped in between release cycles, we lost all our changes.
  • SQL Compare can reverse-engineer schema changes, but it can’t reverse-engineer data changes or migrations between tables, so it isn’t even a complete solution.

The last one was the real nail in the coffin for us. If you already have to script data changes by hand, why not go the whole hog and script the ALTER/CREATE statements too? That way you can just check each change into source control as a separate SQL script, numbered so you know which order they run in. No more generating scripts, no more losing our work in the dev database. No more SQL Compare.
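
Concretely, the change scripts in source control end up looking something like this (the file names here are made up for illustration, apart from the one mentioned later in this post):

scripts/
  0001_create_customer_table.sql
  0002_add_customer_email_column.sql
  ...
  0016_customer_info.sql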

Pre-requisite #2: database version tracking table

Given a live database and a stack of SQL scripts, how do you know which ones need to be run? You could start reading through them, statement by statement, until you find a change that doesn’t appear to have been applied, and start there. Or you could simply have each script append a row to a changelog table like this:

ALTER TABLE ...;
DROP TABLE ...;
-- other DDL changes

-- record that this script has been applied
INSERT INTO CHANGELOG VALUES (16, '0016_customer_info.sql', GETDATE(), USER);

That way, it becomes trivial to find out which scripts have been run against a database and which still need to be run. And putting the special INSERT at the end of each change script ensures a row isn’t added unless all the preceding statements executed successfully.
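
For reference, a minimal changelog table to back that INSERT might look something like this. This is only a sketch; the column names are illustrative rather than the exact ones we used:

CREATE TABLE CHANGELOG (
    change_number INT NOT NULL PRIMARY KEY,   -- script sequence number, e.g. 16
    script_name   VARCHAR(255) NOT NULL,      -- e.g. '0016_customer_info.sql'
    date_applied  DATETIME NOT NULL,          -- when the script was run
    applied_by    VARCHAR(128) NOT NULL       -- database user that ran it
);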

Pre-requisite #3: database version awareness

When is increasing coupling in your application a good thing? When it’s between the database schema version and the application running over the top. Your physical data model and your application’s persistence model (ORM mappings etc) have to move in lockstep, so it’s a good idea for the application to refuse to start if the database version isn’t the one it expects.

You can do this quite easily by checking the highest-run script number in the CHANGELOG table, or by introducing a separate VERSION table. Whatever strategy you use, a loud error on startup is much easier to diagnose than messy runtime errors later when a column turns out to be missing (fail-fast).
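
For example, assuming the illustrative changelog columns sketched earlier, the startup check can be as simple as comparing the version the build expects against the highest change number recorded so far:

-- Run by the application at startup; if the result doesn't match the schema
-- version this build was written against, refuse to start.
SELECT MAX(change_number) AS current_schema_version FROM CHANGELOG;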

Pre-requisite #4: script privileges

To make our lives easier, we introduced a new rule: all change scripts must be runnable under the same database user and privileges as the application itself. This means we can use the application’s own connection string to run scripts, and it eliminates the need to involve DBAs or share the database administrator password.

Running scripts with the application account means we have to grant our services rights to create and modify tables, views, stored procedures and so on (i.e. Oracle’s RESOURCE role or MS SQL’s db_ddladmin role). Before you freak out about applications running with elevated privileges, remember two things: one, this is not a shared database; it exists for the exclusive use of our application, and there is only one user. Two, this application does not run on desktops, so there is no risk of users uncovering credentials and logging in themselves.
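
On MS SQL, for example, granting the application account DDL rights is a one-liner; the user name below is a placeholder for the application’s own login:

-- Add the application's database user to the db_ddladmin role
-- ('myapp_user' is a placeholder)
EXEC sp_addrolemember 'db_ddladmin', 'myapp_user';

-- Oracle equivalent: GRANT RESOURCE TO myapp_user;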

This is the same self-upgrading database model WordPress uses, and we reasoned that if it’s good enough for the 12 million WordPress sites out there on the greater Internet, then it’s good enough for one little Windows Service running in a locked-down server network.

Anyway. If a script does require system privileges to run, it probably means it has something to do with storage or security (storage files or table partitioning, for example) that isn’t actually required for the application to run, and so it doesn’t need to be part of the official database version history. Such scripts can be run independently of the automatic-upgrade process, by DBAs as required.

Pre-requisite #5: clean up existing installations

We have a number of copies of our application running in the wild, in various states of disrepair. Scripts that were never run or that ran with ‘continue on error’ enabled, ad hoc customizations by DBAs, and missing entries in the CHANGELOG table all needed to be smoothed over to provide a solid platform before any automated process could take over. In this case we did it by hand (ironically, using SQL Compare), but depending on the number of installations out there and the effort required, you may wish to incorporate these clean-up steps into your automated scripts.
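
Part of that smoothing over is simply baselining the changelog: recording scripts that are known to have been applied already, so the upgrader won’t try to run them again. A sketch, using the illustrative columns from earlier:

-- Backfill a CHANGELOG entry for a script that was applied by hand in the past,
-- without re-running it (the number and file name are illustrative).
INSERT INTO CHANGELOG VALUES (12, '0012_applied_manually.sql', GETDATE(), USER);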

End goal: self-upgrading application

From this point, it’s relatively easy to implement an automated upgrade process that checks the CHANGELOG table to find out which scripts haven’t been run yet, and runs them. That way it works against any previous database version. Our upgrader tool also performs a number of preparatory and clean-up steps, such as creating the CHANGELOG table if required, and cleaning up any vestigial remains of the previous database deployment process.
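
A rough sketch of what such an upgrader does when it starts, using the same illustrative CHANGELOG shape as earlier (T-SQL):

-- 1. Create the changelog table if this database has never been through the
--    automated process before.
IF OBJECT_ID('CHANGELOG', 'U') IS NULL
    CREATE TABLE CHANGELOG (
        change_number INT NOT NULL PRIMARY KEY,
        script_name   VARCHAR(255) NOT NULL,
        date_applied  DATETIME NOT NULL,
        applied_by    VARCHAR(128) NOT NULL
    );

-- 2. Find the last script applied; the upgrader then runs every bundled script
--    with a higher number, in numerical order.
SELECT ISNULL(MAX(change_number), 0) AS last_applied FROM CHANGELOG;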

This process can be designed, unit tested, and audited to a greater degree than any human process could ever achieve. You can run it hundreds of times over to tweak it, and test it against all sorts of different database configurations with minimal effort. It can, and should, become part of your continuous integration and automated build-and-test process.

Failure and backups

Unfortunately, even with all the extra confidence you get from scripting, there is always a risk of something going wrong on release day. One part of the manual process we’re still missing is backups, so we’re currently looking at adding an automated backup step before the upgrade kicks off (probably a lightweight one, using something like a database snapshot in MS SQL or Flashback in Oracle). That way, we can eventually aim to provide automatic rollback capabilities in the event that a script fails.
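
In MS SQL, for example, a database snapshot taken just before the upgrade gives a cheap rollback point. The database, logical file, and path names below are placeholders:

-- Take a snapshot before running any change scripts
CREATE DATABASE MyAppDb_PreUpgrade
ON (NAME = MyAppDb_Data, FILENAME = 'D:\Snapshots\MyAppDb_PreUpgrade.ss')
AS SNAPSHOT OF MyAppDb;

-- If a script fails, roll the whole database back to the snapshot
RESTORE DATABASE MyAppDb FROM DATABASE_SNAPSHOT = 'MyAppDb_PreUpgrade';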

Final words

I’m not saying that ours is the only way of achieving automated database deployment, or that this is necessarily the best way. We’re still working out the kinks, but I’d say it’s looking pretty good so far.

February 9, 2011

9 Comments

Nathan Evans on February 15, 2011 at 10:40 pm.

Great post. There really needs to be more mindshare on this issue.

It’s not hard to do. The vast majority of the work is simply in maintaining the SQL upgrade scripts (which are effectively acting as diffs). It requires discipline.

My co-workers at the time thought I was absolutely barmy when I suggested that our application should upgrade the database itself. They wanted some glorified external tool for that job. Why make things hard for yourself? Keep It Simple, Stupid. KISS.

Ioannis Cherouvim on February 16, 2011 at 7:19 am.

Thanks for the mention. I’ve been using this technique for 3 years now and I’ll never look back.

Adam D. on February 17, 2011 at 2:59 am.

I’ve been using this method of version scripts for quite some time now. Works great. Curious to see if you end up doing what I’m doing in the end.

I’m also using a similar approach with “meta events” in cqrs to embed schema changes in the stream of regular events. This enables schema changes to be pinned in time with the regular events. I wrote about this in a blog post called “A Time Machine For Your Application”

Cheers, good to see I was onto something that others would do as well.

Kevin Berridge on February 19, 2011 at 2:53 am.

Have you ever looked at Fluent Migrator? https://github.com/schambers/fluentmigrator/

It automates a large amount of what you have to do manually when dealing with SQL scripts directly. Plus you can create extension methods to encapsulate common columns or scripts!

Curious to get your thoughts on that approach vs. the SQL Script approach.

Richard on February 20, 2011 at 9:13 pm.

Kevin: I haven’t looked too closely for a couple of years now – on the first project we did check out fluent migrator, migrator.net, rikmigrations etc but found they were pretty limited in terms of indexes, views, stored procs etc (which were big in that system because it wasn’t ORM powered). So we figured we’d just stick to the incumbent SQL files for the time being. (And when we did start using NHibernate, we used the hbm2ddl tool to generate SQL scripts anyway).

However it looks like Fluent Migrator does support indexes now so may be worth another look!

deadalnix on August 17, 2011 at 10:59 am.

I love you.
