Working in an agile team means delivering software early and often — several times a day even. The only way to achieve this without going crazy doing releases all the time is to eliminate all the manual steps in your release process, until you’re leave with a single mouse-click to get your code into to production.
One of the biggest challenges here is how to handle your app’s back-end database. Application files can be deleted and restored in seconds, websites rolled and and replaced a thousand times, but the database is a living, breathing beast, closely guarded by a separate team of DBAs who usually won’t even let you see what state its in, let alone make changes. If you screw up a database script, it could cost hours of downtime restoring from a backup. So how do you automate changes then? Here’s how we did it on a recent project.
Pre-requisite #1: portable script shipping
I’ll never forget the deployment process on my first major commercial project. During development, everyone shared a ‘working copy’ database, directly adding columns and tables via GUI tools. On release day, a designated person would open up SQL Compare, connect to the dev database, connect to the live database, and generate a script copying all the changes from dev to production.
Needless to say, this was a fairly sub-optimal solution. Here’s some of the problems we developers incurred on a regular basis:
- The diff scripts only work between specific database versions. If UAT was on a faster development cycle than PROD, we couldn’t use the same scripts between them.
- Only the final diff script was put in source control, so if the development database crashed or was accidentally wiped in between release cycles, we lost all our changes.
- SQL Compare can reverse-engineer schema changes, but it can’t reverse-engineer data changes/migration between tables. It’s not even a complete solution.
The last one was the real nail in the coffin for us. If you already have to script data changes by hand… why not go the whole hog and do ALTER/CREATE statements too? That way you can just check in each change as a separate sql script into source control — numbered, so you know which order they go in. No more generating scripts, no more losing our work in the dev database. No more SQL Compare.
Pre-requisite #2: database version tracking table
Given a live database and a stack of SQL scripts, how do you know which ones need to be run? You could start reading going through them, statement by statement, until you find a change that doesn’t appear to be applied and start there — or you could simply have each script write a row to the end of a changelog table like this:
ALTER TABLE ...; DROP TABLE ...; -- other DDL changes INSERT INTO CHANGELOG (16, '0016_customer_info.sql', GETDATE(), USER);
That way, it becomes trivial to find out what scripts have run/need to be run against a database. And putting the special INSERT at the end of each change script ensures it doesn’t add a row unless all previous statements executed successfully.
Pre-requisite #3: database version awareness
When is increasing coupling in your application a good thing? When it’s between the database schema version, and the application running over the top. Your physical data model and application’s persistence model (ORM mappings etc) have to move in lockstep, so it’s a good idea to refuse to start if the database version isn’t correct.
You can do this quite easily by checking the highest-run script number in the CHANGELOG table, or introducing a separate VERSION table. Whatever the strategy used, a loud error on startup is much easier to diagnose than messy runtime errors when a column is missing (fail-fast).
Pre-requisite #4: script privileges
To make our lives easier, we introduced a new rule: all change scripts must be able to be run under the same database user/privileges as the application itself. This means we can use the application’s own connection string to run scripts, and eliminates the need for DBAs or sharing the database administrator password.
Running scripts with the application account means we have to grant our services rights to create and modify tables, views, stored procedures etc (i.e. Oracle’s RESOURCE or MS SQL’s db_ddladmin role). Before you freak out about applications running with elevated privileges, please remember one, this is not a shared database — it’s for exclusive use of our application, there is only one user — and two, this application does not run on desktops, so there is no risk of users uncovering credentials and logging in themselves.
This is the same self-upgrading database model WordPress uses, and we justified that if it’s good enough for the 12 million WordPress sites out there on the greater Internet, then it’s good enough for one little Windows Service running in a locked-down server network.
Anyway. If a script does requires system privileges to run, it probably means it has something to do with storage or security — e.g. storage files or table partitioning — that isn’t actually required for the application to run and thus doesn’t need to be part of the official database version history. Those scripts can thus be run independently of the automatic-upgrade process, as required by DBAs.
Pre-requisite #5: clean up existing installations
We have a number of copies of our applications running in the wild, in various states of disrepair. Scripts that were never run, or ran with ‘continue on error’ enabled, and ad hoc customizations by DBAs, and missing entries from the CHANGELOG table all needed to be smoothed over to make a solid platform before any automated process can take over. In this case, we did it by hand (ironically using SQL Compare), but depending on the number of installations out there, and effort required, you may wish to incorporate these clean up steps in your automated scripts.
End goal: self-upgrading application
From this point, it’s relatively easy to implement an automated upgrade process that checks the CHANGELOG table to find out which scripts haven’t been run yet, and run them. That way it works against any previous database version. Our upgrader tool also performs a number of prepatory and clean-up steps, such as creating the CHANGELOG table if required, and cleaning up any vestigal remains of the previous database deployment process.
This process can be designed, and unit tested, and audited to a greater degree than any human process could ever achieve. You can run it hundreds of times over to tweak it, and test it against all sorts of different database configurations with minimal effort. It can and should become part of your continuous integration and automated build and tests.
Failure and backups
Unfortunately, even with all the extra confidence you get from scripting, there is always a risk of something going wrong on release day. One part of the manual process we’re missing is backups, so we’re currently looking into to add an automated backup step before kicking off (probably a lightweight one, using something like a Snapshot in MS SQL or Flashback in Oracle). That way, we can eventually aim to provide automatic rollback capabilities in the event a script fails.
I’m not saying that the ours is the only way of achieveing automated database deployment, or that this is neccessarily the best way. We’re still working out the kinks, but I’d say it’s looking pretty good so far.February 9, 2011