Back to basics: good source control check-in habits
The other day at work I went over a few good source control habits (TFS-centric) for new developers, and why they’re worth doing. Here are some tips:
Check-ins are coarse-grained
When I first started using source control, a lot of my check-in logs looked like this:
- Added useful function foo() (that will be ultimately required for feature X)
- Added feature X
I was trying to be helpful by making foo() available for use as early as possible, but most of the time I checked it in too early, and needed to change/fix it later — resulting in additional check-ins. In reality though, all this did was clutter up the change history with signal-to-noise, making it harder to establish what changes were actually required to effect a change.
It also of course violates the YAGNI (you ain’t gonna need it) principle – no one ever needed foo() badly enough to make it worth checking it in early.
Check-ins should be coarsely grained — instead of staggering code changes over a period of time, save it all for one big changeset that contains everything. This actually improves YAGNI — instead of adding endless low-level functions ‘because they might be useful’, you’re forced to associate them with a higher application-level change.
Never check in unfinished work
In the main (trunk) branch, there are two cardinal rules:
- Never check in anything that breaks the build
- Never check in an unfinished feature
The main branch should always be in a state where it’s ready for release. Features have been entirely added or don’t exist yet; complete changes have been made, or not even started. There is no middle ground.
Having to worry about the possibly unfinished state of the ‘master copy’ is just another burden on your team. Even worse, if you check in code that doesn’t compile, you can interrupt other people’s work and waste their time commenting out bad blocks and working around it.
Sometimes, however, you’ll get a piece of work that doesn’t fit in one check-in.
- It could take a long time to implement (needs more than one check-in).
- Multiple developers might need to work on it.
- The change isn’t going to be released (committed to the main branch) until a later date.
You have two options: fork a copy of the code for a new development branch, or shelve your changes so you or someone else can work on them later. The basic rule for branches is if you need more than one check-in to make a change, you need to branch and merge back in later.
Source control is not for back ups
This follows from the previous two tips. Check in your work because you’re completely done and finished — not because it’s the end of the day, or because it’s the weekend etc.
I think this habit often begins when using reserved-checkout source control systems like VSS, where having a file checked out explicitly blocks other people from editing it. Everyone who’s used VSS knows from experience that hilarious situation where you need to edit a checked-out file, but the person who has it is on holiday or otherwise unreachable.
To avoid this, teams adapt their work habits to suit VSS, checking in frequently so as not to lock files for extended periods. This practice is continued for some reason (cargo cult programming?) when moving to less brain-damaged source control tools like TFS or subversion.
Anyway, if you want nightly backups of your local working source code, get some backup software. Don’t use source control — change tracking becomes difficult when you check in based on calendar events. It gets even more ridiculous if you have changes that don’t compile yet, and have to comment/uncomment code when checking it in.
Use a separate check-in when reformatting code
Say you go to edit a file, but the formatting is all messed up — the indenting is wrong. So you fix it up while making your changes. But then, if someone comes along later to see what your change entailed, a diff tool will highlight differences on every line, making it very hard to see what the actual code change was. Instead, use a separate check-in for mass code reformatting.
Write useful check-in comments
If you’re looking over the history of a file in source control, comments like ‘css fix’ or ‘#12345′ won’t be very useful. Instead, good check-in comments need three things:
- What was changed
- Why it was changed
- Any bug/issue tracking numbers (if applicable)
Describe what you changed, and why it was necessary. You don’t have to write an essay — I find one sentence of ‘did X because Y’ is usually sufficient.
You should also list any bug/issue tracking numbers, or make reference to any relevant project documentation (e.g. list of requirements) to provide a higher-level context. This is a good way of establishing traceability on tightly-controlled source repositories — if you do it right, you’ll find almost every change can be traceable back to some external identifier.
I also like to paste my changeset number when I update an issue as being completed (ready for user testing or peer review) — the easier it is to find my changes, the better! (Note that some source control tools can do this automatically from your check-in comment).
Real-life examples
You can see good check-in habits in action in most popular open source projects. Here are a few of examples:
- Boost C++ Libraries revision log
- Ruby on Rails commit history
- Subtext revision log