Deployment readiness

Deployment readiness

Have you ever seen this development cycle?

  1. Install new build
  2. App doesn’t start, doesn’t log any error why
  3. Go back to source code and add improved logging
  4. Install new build
  5. Read logs
  6. Fix the actual problem (usually some stupid configuration thing)
  7. Repeat steps 1-6 until all bugs are gone

Was it because of your code, or someone else’s? This sort of cycle is time consuming, frustrating, stressful, and makes you look helpless (or like a cowboy) in front of project managers and support staff.

If your app can’t produce logs your infrastructure staff can understand, then your app is not even close to production-readiness. If it doesn’t log clear errors, how can anyone hope to deploy and support it?

libcheck – quick and dirty assembly compatibility debugging

libcheck – quick and dirty assembly compatibility debugging

Today I wrote a little command-line tool for helping track down .NET assembly compatibility issues:

System.IO.FileLoadException: Could not load file or assembly {0} or one of its dependencies. The located assembly's manifest definition does not match the assembly reference.

My team encounters these problems very regularly — we have a build/dependency tree that easily rivals Castle Project in complexity, and it gets out of sync easily while waiting for downstream projects to build. Normally, your only option in this situation is to fire up reflector and poke through all your references manually — a very tedious process.

So, to use this tool, you point it at a directory (e.g. your lib folder) and provide a list of patterns for assemblies you want to check:

libcheck.exe C:devyourapplib *NCover*

Then it’ll tell you any that are broken.

Stick to the paradigm (even if it sucks)

Stick to the paradigm (even if it sucks)

Today I had the pleasure of fixing a bug in an unashamedly-procedural ASP.NET application, comprised almost entirely of static methods with anywhere from 5-20 parameters each (yuck yuck yuck).

After locating the bug and devising a fix, I hit a wall. I needed some additional information that wasn’t in scope. There were three places I could get it from:

  • The database
  • Form variables
  • The query string

The problem was knowing which to choose, because it depended on the lifecycle of the page — is it the first hit, a reopened item, or a post back? Of course that information wasn’t in scope either.

As I contemplated how to figure the state of the page using HttpContext.Current, my spidey sense told me to stop and reconsider.

Let’s go right back to basics. How did we use to manage this problem in procedural languages like C? There is a simple rule — always pass everything you need into the function. Global variables and hidden state may be convenient in the short-term, but only serve to confuse later on.

To fix the problem, I had to forget about “this page” as an object instance. I had to forget about private class state. I had to forget that C# was an object-oriented language. Those concepts were totally incompatible with this procedural code base, and any implementation would likely result in a big mess.

In fact, it turns out to avoid a DRY violation working out the page state again, and adding hidden dependencies on HttpContext, the most elegant solution was simply to add an extra parameter to every method in the stack. So wrong from a OO/.NET standpoint, but so right for procedural paradigm in place.

A wee test helper for setting private ID fields

A wee test helper for setting private ID fields

Every now and then I need to write tests that depend on the ID of a persistent object. IDs are usually private and read-only, assigned internally by your ORM, so I use a little fluent extension method to help set them via reflection.

public static class ObjectExtensions{    // Helper to set private ID    public static T WithId<T>(this T obj, object id)    {        typeof(T)            .GetField("id", BindingFlags.Instance | BindingFlags.NonPublic)            .SetValue(obj, id);        return obj;    }}

Tweak as desired to suit your naming conventions or base classes. Usage is as follows:

[TestFixture]public class When_comparing_two_foos{    [Test]    public void Should_not_be_the_same_if_they_have_different_ids()    {        var a = new Foo().WithId(4);        var b = new Foo().WithId(42);        a.Should().Not.Be.EqualTo(b);    }}

Remove content from files with Powershell

Remove content from files with Powershell

I love Windows PowerShell. It’s the missing < a> shell/scripting environment that could never be properly satisfied with just DOS commands and batch files. I know it’s been out for a while now, but I’ve only recently started getting on top of it, and it’s been fantastic so far.

Quite often I find myself needing to do little maintenance jobs involving iterating through files and tranforming them — C# source code, SQL scripts, my MP3 collection etc. Powershell is the perfect tool for these sorts of tasks. Here’s a very simple (and probably not that efficient) snippet I wrote last week to strip out SQL Server Management Studio’s useless ‘descriptive headers’ comments from a nested hierarchy of 2,000 generated SQL scripts:

/****** Object:  Table [dbo].[hibernate_unique_key]    Script Date: 04/12/2009 11:35:29 ******/CREATE TABLE [dbo].[hibernate_unique_key](        [next_hi] [int] NULL) ON [PRIMARY]GO
dir -recurse -filter *.sql $src | foreach ($_) {        $file = $_.fullname        echo $file        (get-content $file) | where {$_ -notmatch "^s?/****** Object:.*$" } | out-file $file}

How easy was that? I definitely need to learn more of this stuff!

Best practice DDD/TDD ASP.NET MVC example applications

Best practice DDD/TDD ASP.NET MVC example applications

It’s no secret that I’m a big fan of the ASP.NET MVC Framework. One of the reasons I like it is because it’s opinionated – instead of leaving developers free to fumble things as they choose, it recognises that most projects will be more-or-less exactly the same, and imposes some conventions to keep everyone on the same path.

This is the same philosophy used by many other application frameworks, and it saves everyone time and confusion by providing well-thought-out standards that everyone’s familiar with. However, because ASP.NET MVC is a general purpose web framework, its conventions can only cover so much — guidance for other frameworks and patterns is left to the community.

One very important area (that inspired this article) is around object-relational mappers (ORM). If I’m using the ASP.NET MVC framework:

  • How do I reference a repository from controller?
  • What’s the best way of applying a per-http-request unit of work pattern?
  • What should it look like if I’m using NHibernate? LINQ-to-SQL? etc.

Thankfully, there are a few example applications available that demonstrate solutions to these questions (and many more):

Kym NHibernate, Rhino-Commons, Castle Windsor, NUnit, Rhino Mocks, jQuery, Json.NET
S#arp Architecture NHibernate, Ninject, MvcContrib, NUnit, Rhino Mocks, Json.NET
CarTrackr Linq to SQL, Unity, MSTest, Moq, Visifire Silverlight charts, OpenID
MVC Storefront LINQ to SQL, StructureMap, MSTest, Moq, NLog, CardSpace, Windows Workflow, OpenID
Suteki Shop LINQ to SQL, Castle Windsor, NUnit, Moq, log4net, jQuery

It’s really good to see some different opinions in implementation and design, across a range of frameworks. MVC Storefront is of particular note – everything is explained over a series of 20-or-so podcasts that include interviews with well-known .NET developers like Jeff Atwood (Coding Horror/StackOverflow), and Ayende (NHibernate).

Are your applications ‘legacy code’ before they even hit production?

Are your applications ‘legacy code’ before they even hit production?

When a business has custom software developed, it expects that software to last for a long time. To do this, software must A) continue operating — by adapting to changing technology needs, and B) continue to be useful — by adapting to changing business needs.

Businesses’ needs change all the time, and all but the most trivial of software will require modification beyond its original state at some point in its life, in order to to remain useful. This is sometimes known as brownfield development (as opposed to greenfield), and from my experience, accounts for a good two thirds to three quarters of all enterprise software development.

Anyway, time for a story. A company has a flash new business intelligence system developed (in .NET) to replace an aging mainframe system. Everything goes more-or-less to plan, and the system is delivered complete and on time. It’s robust and works very well — everyone is satisfied.

Now, six months have passed since the original go-live date, and the owners have a list of new new features they want added, plus a few tweaks to existing functionality.

Development begins again. After a few test releases, however, new bugs start to appear in old features that weren’t even part of phase two. Delivery dates slip, and time estimates are thrown out the window as search-and-destroy-style bug fixing takes over. Developers scramble to patch things up, guided only by the most recent round of user testing. Even the simplest of changes take weeks of development, and endless manual testing iterations to get right. The development team looks incompetent, testers are fed up, and stakeholders are left scratching their heads wondering how you could possibly break something that was working fine before. What happened?

The illusion of robustness

The robustness of many systems like this is an illusion — a temporary spike in value that’s only reflective of the fact it has been tested, and is known to work in its current state. Because everything looks fine on the surface, the quality of source code is assumed to be fine as well.

From my experience, a lot of business source code is structured like a house of cards. The software looks good from the outside and works perfectly fine, but if you want to modify or replace a central component — i.e. a card on which other cards depend — you pretty much have to re-build and re-test everything above it from scratch.

Such systems are written as legacy code from the very first day. By legacy, I mean code that is difficult to change without introducing bugs. This is most commonly caused by:

  • Low cohesion – classes and methods with several, unrelated responsibilities
  • High coupling – classes with tight dependencies on each other
  • Lack of unit tests

Low cohesion and high coupling are typical symptoms of code written under time-strapped, feature-by-feature development, where the first version that works is the final version that goes to production.

Once it’s up and running, there’s little motivation for refactoring or improving the design, no matter how ugly the code is — especially if it will require manual re-testing again. This is the value spike trap!

The tragedy of legacy code

Back to the story. If a developer on the project realises the hole they’ve fallen into, they can’t talk about it. This late in the process, such a revelation could only be taken in one of two ways — as an excuse, blaming the original designers of the software for mistakes made today — or as a very serious admission of incompetence in the development team from day one.

The other developers can’t identify any clear problems with the code, and simply assume that all brownfield development must be painful. It’s the classic story — when granted the opportunity to work on a greenfield project, you’re much happier ó unconstrained by years of cruft, you’re free to write clean, good code. Right? But if you don’t understand what was wrong with the last project you worked on, you’ll be doomed to repeat all of its mistakes. Even with the best of intentions, new legacy code is written, and without knowing it, you’ve created another maintenance nightmare just like the one before it.

Clearly, this is not a good place to be in — an endless cycle of hard-to-maintain software. So how can we break it?

Breaking the cycle

I believe that Test Driven Development (TDD) is the only way to keep your software easy-to-modify over the long-term. Why? Because:

  1. You can’t write unit tests for tightly-coupled, low cohesive code. TDD forces you to break dependencies and improve your design, simply in order to get your code into a test harness.
  2. It establishes a base line of confidence — a quantifiable percentage of code that is known to work. You can then rely on this to assert that new changes haven’t broken any existing functionality. This raises the value of code between ‘human tested’ spikes, and will catch a lot of problems before anyone outside the development team ever sees them.

TDD gives you confidence in your code by proving that it works as intended. But object-oriented languages give you a lot freedom, and if you’re not careful, your code can end up as a huge unreadable mess.

“Any fool can write code that a computer can understand. Good programmers write code that humans can understand.” – Martin Fowler

So how do you write code that humans can understand? Enter TDD’s partner in crime, Domain Driven Design (DDD).

DDD establishes a very clear model of the domain (the subject area that the software was written to address), and puts it above all technological concerns that can put barriers of obfuscation between the business concepts and the code used to implement them (like the limitations of a relational database schema). The end result is very expressive entities, repositories and service classes that are based on real life concepts and can be easily explained to non-technical business stakeholders.

Plus, DDD provides a clear set rules around all the fiddly stuff like relationships and responsibilities between entities. In short, DDD is simply object-oriented code done right.

Putting it to practice

So how do you get on to a point where you can use all this? There’s a lot to learn — an entire universe of information on top of your traditional computer science/software development education. I’ve only scratched the surface here in trying to explain why I think it’s important, but if you want to learn more:

  • Read books. How many books on software development have you read this year? Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans is pretty much the DDD bible, and, just like GoF, should be mandatory reading for any serious developer. If you’re stuck in legacy code hell right now though, have a read of Michael Feathers’ Working Effectively with Legacy Code. It’s all about breaking dependencies and beating existing code into test harnesses so you can get on with improving it without fear of unintended consequences.
  • Read blogs. TDD/DDD is a hot subject right now, and blogs are great for staying on top of all the latest tools and methods. They’re also a good place for learning about TDD/DDD’s support concepts like dependency injection, inversion of control, mocking, object/relational mapping, etc.
  • Join mailing lists — DomainDrivenDesign and ALT.NET (if you’re that way inclined) are two good ones with a reasonably high volume of great intellectual discussion. You can gain a lot of insight reading other people’s real-life problems, and watching the discussion evolve as potential solutions are debated.
  • Practise. Grab an ORM, a unit test framework/runner, an IoC container and start playing around.

I believe the only way to make software truly maintainable long-term (without becoming legacy code) is to use TDD and DDD. Together, they’re all about building confidence — confidence that you can make changes to a system without introducing any bugs. Confidence that other people can understand your code and the concepts it represents. And most importantly, confidence that you’re learning, and aren’t making all the same mistakes every time you start a new project.

T-SQL equality operator ignores trailing spaces

T-SQL equality operator ignores trailing spaces

Today I discovered something new about SQL Server while debugging an application. T-SQL’s equality operator ignores any trailing spaces when comparing strings. Thus, these two statements are functionally equivalent:

SELECT * FROM Territories WHERE TerritoryDescription = 'Savannah'SELECT * FROM Territories WHERE TerritoryDescription = 'Savannah         '

When executed against the Northwind database included with SQL Server they both return the same row, which has no trailing spaces after its TerritoryDescription.

TerritoryID          TerritoryDescription                               RegionID    -------------------- -------------------------------------------------- ----------- 31406                Savannah                                           4(1 row(s) affected)

This behaviour isn’t immediately obvious from the offset, and isn’t mentioned on the MSDN entry.

To avoid this problem, you should use LIKE instead:

SELECT * FROM Territories WHERE TerritoryDescription LIKE 'Savannah         '

When comparing strings with LIKE all characters are significant, including trailing spaces.

Update: a co-worker discovered yesterday that using LIKE in T-SQL JOINs doesn’t use indices in the same way that the equals operator does. This can have a significant impact on performance. Be warned!