Deployment and source-control-friendly database versioning scripts

One of the projects I’m on has over 2,200 stored procedures. Until I arrived, none of these were under source control — the ‘stable’ version was whatever was on the production database server, and the ‘trunk’ version was whatever was in the development database server (which incidentally wasn’t even backed up).

Anyway, I set about exporting all the stored procedures and checking them into source control. But as easy as it sounds, it actually took a few goes to get it all right. You see, when generating a CREATE script with Include IF NOT EXISTS enabled, SQL Management Studio has a habit of emitting the entire script as one giant EXEC sp_executesql string instead of just putting a GO in between (CREATE PROCEDURE must be the first statement in a batch, so SSMS wraps it in dynamic SQL to squeeze it inside the IF). This isn’t very good for development — you don’t get IntelliSense, syntax highlighting, etc. inside a string.
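The script it generates looks something like this (roughly — from memory, so the exact details may vary between SSMS versions):

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_Employee_Get]') AND type in (N'P', N'PC'))
BEGIN
EXEC dbo.sp_executesql @statement = N'CREATE PROCEDURE [dbo].[usp_Employee_Get]
AS
    SELECT EmployeeID, FirstName, LastName FROM Employee
'
END
GO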

To get around this problem, I ended up generating two separate scripts — first a DROP (guarded by an IF EXISTS check), then a CREATE (without one) — and set SSMS to put both in the same file (Append to File). This gave me one script per stored procedure, each with an IF EXISTS … DROP, a GO, then the CREATE statement — perfect for source control.

Here are some other tips for configuring SQL Management Studio (via Options > SQL Server Object Explorer > Scripting) to make it automatically generate friendly scripts:

  • Disable Include descriptive headers. They sound helpful, but all they do is add the script date and repeat the object’s name. This is really annoying because they clutter up the results whenever I do a source-code text search in Visual Studio.
  • Disable Script USE <database>. We often have several copies of the same database running with different names on the same SQL box (e.g. for different development branches). Hard-coding the name of the database in all your scripts means you have to manually change all the USE statements every time you run them. And trust me — the risk of having your changes leak over and get applied to the wrong database is not really something you want to be worrying about.
  • Enable Include IF NOT EXISTS clause. Stored procedures, triggers, functions — it’s much easier to just nuke and re-create them every time instead of worrying about the semantics of CREATE vs ALTER. This also means you can just run each script over and over and over again.

If you get it right, SQL Management Studio 2008’s Script DROP and CREATE should produce something that looks like:

IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_Employee_Get]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[usp_Employee_Get]
GO

CREATE PROCEDURE [dbo].[usp_Employee_Get]
AS
    SELECT
        EmployeeID,
        FirstName,
        LastName
    FROM
        Employee
GO

We also put permissions (e.g. GRANT EXEC) at the end of each script, so they are restored each time the stored procedure is created.
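For example, at the bottom of the script above (the role name here is made up — use whatever your application connects as):

GRANT EXECUTE ON [dbo].[usp_Employee_Get] TO [WebAppUser]
GO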

Automatically log all INSERT, UPDATE and DELETE commands on a table in SQL Server

Just before Christmas we had a problem with an application database where rows were being mysteriously deleted. Unfortunately the table in question is written to by 20-30 stored procedures, so there was no single point in the application where I could put a trace.

Instead, I wrote a quick generic script that would automatically create the necessary triggers to log all INSERT, UPDATE and DELETE commands on a table of your choosing (or just generate the script).

For example, every time someone modified data in the HR.Employees table, a row would be written to HR._Employees_Audit with all the new values, time stamp, command, and name of the user who did it.
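The generated triggers look roughly like this sketch — the real script builds the column list dynamically from the table’s schema, so the names here are just illustrative:

CREATE TRIGGER HR.Employees_Audit
ON HR.Employees
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- INSERTs and UPDATEs: log the new values from the 'inserted' pseudo-table
    INSERT INTO HR._Employees_Audit
        (EmployeeID, FirstName, LastName, AuditCommand, AuditDate, AuditUser)
    SELECT EmployeeID, FirstName, LastName,
        CASE WHEN EXISTS (SELECT * FROM deleted) THEN 'UPDATE' ELSE 'INSERT' END,
        GETDATE(), SUSER_SNAME()
    FROM inserted;

    -- DELETEs: log the old values from the 'deleted' pseudo-table
    IF NOT EXISTS (SELECT * FROM inserted)
        INSERT INTO HR._Employees_Audit
            (EmployeeID, FirstName, LastName, AuditCommand, AuditDate, AuditUser)
        SELECT EmployeeID, FirstName, LastName, 'DELETE', GETDATE(), SUSER_SNAME()
        FROM deleted;
END
GO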

You can download it here: sql-server-auto-audit.sql

Refactoring insight: should I really have a separate domain service for that, or is it the responsibility of an existing entity?

This insight follows on from an earlier article where I identified that the business rule “when an employee leaves the organisation, all their assets must be returned” was part of the domain model, and not just a matter of cleaning up foreign keys in the database.

I originally implemented this rule as part of a dedicated domain service called IPersonTerminatorService. Later on, I refactored it out into a standalone IAssetReturnerService to satisfy the single-responsibility principle.

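It looked something like this (a reconstruction — the exact signature is an assumption):

public interface IAssetReturnerService
{
    // Returns every asset currently issued to the given employee.
    void ReturnAllAssetsIssuedTo(Employee employee);
}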

But this morning I realised something: in real life, it’s the employee himself who returns all his stuff — not some other process. By taking this responsibility away from the Employee class I’ve committed two sins: creating a fictional concept in the domain, and reducing the capabilities of the Employee (leading to an anemic data-only domain model).

Time to give this responsibility back to the employee:

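Something like this sketch (the method and property names are my assumptions):

public class Employee
{
    public virtual ICollection<Asset> IssuedAssets { get; protected set; }

    // When an employee leaves the organisation, they return all
    // their own assets — no separate service required.
    public virtual void Terminate()
    {
        // Copy the collection first, in case Asset.Return() mutates it.
        foreach (Asset asset in this.IssuedAssets.ToList())
            asset.Return();
    }
}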

Capture the Output from a Scheduled Task

Today I had to diagnose a problem with a Windows Scheduled Task that was sporadically failing with a non-zero return code. The exe in question was a .NET console application that was throwing an exception before Main() was even called — outside our try/catch block.

Anyway, if you ran the .exe from a command line yourself, you’d see the error written to stderr. But when it ran as a Scheduled Task, the error wasn’t logged anywhere.

To capture the output of the scheduled task, I redirected it to a text file with the following command:

before: NightlyBatchJob.exe
after: cmd /C NightlyBatchJob.exe >> NightlyBatchJob.output.txt 2>&1

The > symbol redirects the output to a file; >> makes it append instead of creating a new blank file each time it runs. 2>&1 makes it include the output from stderr with stdout — without it you won’t see any errors in your logs.

The whole command is run in a new cmd.exe instance, because just running an .exe directly from a scheduled task doesn’t seem to produce any console output at all.

Refactoring insight: repositories gone wild!

Here’s a snippet from my prototype NHibernateRepository<T> I mentioned the other day. Can you see what I’ve done wrong here?

public abstract class NHibernateRepository<T>
{
    protected ISession unitOfWork;
    protected IEntityValidator<T> validator;

    protected NHibernateRepository(ISession unitOfWork,
        IEntityValidator<T> validator)
    {
        if (unitOfWork == null)
            throw new ArgumentNullException("unitOfWork");

        if (validator == null)
            throw new ArgumentNullException("validator");

        this.validator = validator;
        this.unitOfWork = unitOfWork;
    }

    protected virtual void Validate(T item)
    {
        IEnumerable<RuleViolation> ruleViolations =
            this.validator.GetAllRuleViolations(item);

        if (ruleViolations.Count() == 0)
            return;

        string message = String.Format(
            "The {0} '{1}' could not be persisted because it failed validation.",
            typeof(T).Name, item);

        throw new RuleViolationException(message, ruleViolations);
    }

    public virtual void Save(T item)
    {
        Validate(item);

        using (ITransaction transaction = this.unitOfWork.BeginTransaction())
        {
            this.unitOfWork.SaveOrUpdate(item);
            transaction.Commit();
        }
    }

    // ...
}

See that Validate() method? It’s very clever, automatically checking entities are valid before it commits them to the database. Unfortunately, validating entities is not the responsibility of a repository. This is a big fat violation of the single responsibility principle (SRP), and should be moved to a higher-level domain service instead.
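For example, the validation could move up into something like this (a sketch — the service name and shape are my illustration, not a final design):

public class PersonService
{
    readonly IEntityValidator<Person> validator;
    readonly IPersonRepository repository;

    public PersonService(IEntityValidator<Person> validator,
        IPersonRepository repository)
    {
        this.validator = validator;
        this.repository = repository;
    }

    public void Save(Person person)
    {
        // Validation now happens at the domain-service level...
        IEnumerable<RuleViolation> ruleViolations =
            this.validator.GetAllRuleViolations(person);

        if (ruleViolations.Any())
            throw new RuleViolationException(
                "The Person could not be persisted because it failed validation.",
                ruleViolations);

        // ...leaving the repository with a single responsibility: persistence.
        this.repository.Save(person);
    }
}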

Refactoring insight: simple foreign key cleanup or a domain service?

On my quest for DDD/TDD nirvana, I’m going to start documenting moments of insight on real projects when design problems are solved, principles suddenly make sense for the first time, or pitfalls are avoided. I want to do it partly for posterity, and partly to help others who are also learning. Here’s the first one.

One of my projects at the moment is developing a simple asset management system. It’s basically just a catalog of assets (computers, devices and office furniture) with various details (specs, serial numbers, notes on wear and tear etc). Each asset may be issued to a person.

I was writing some tests for my concrete repositories (mainly in order to check the database mappings are correct), when the question arose: what should happen to a person’s assets when the person is deleted?

My first thought was simply to edit the mapping XML and change a cascade option so Asset.AssignedToPersonID would be set back to NULL. However, I realised this is more than just a foreign key issue for NHibernate — returning a person’s assets when they leave the organisation is a significant domain concern.

If I take the shortcut and leave it implemented as a cascade option:

  • It’ll work fine, but the business rule will be implicit, rather than explicitly expressed somewhere as something that happens when a person is terminated. Other developers probably won’t even notice it at all.
  • The implementation won’t be testable without hitting a live SQL instance.
  • The business rule won’t be observable in other test cases unless they hit a live SQL instance, because mock repositories likely won’t account for it.
  • If we ever move away from NHibernate, or otherwise need to recreate the database mappings, this feature could be lost, and will probably only be rediscovered after FK constraint errors arise.

That’s no good. Let’s make this business rule an official part of the domain model instead, with a fully-fledged domain service:

public interface IPersonTerminatorService
{
    void TerminatePerson(Person person);
}

public class PersonTerminatorService : IPersonTerminatorService
{
    IPersonRepository personRepository;
    IAssetRepository assetRepository;

    public PersonTerminatorService(IPersonRepository personRepository,
        IAssetRepository assetRepository)
    {
        DesignByContract.Require(personRepository != null,
            "personRepository cannot be null.");

        DesignByContract.Require(assetRepository != null,
            "assetRepository cannot be null.");

        this.personRepository = personRepository;
        this.assetRepository = assetRepository;
    }

    public void TerminatePerson(Person person)
    {
        ReturnAllAssetsIssuedToPerson(person);
        this.personRepository.Remove(person);
    }

    public void ReturnAllAssetsIssuedToPerson(Person person)
    {
        IEnumerable<Asset> assets =
            this.assetRepository.GetAssetsIssuedToPerson(person);

        foreach (Asset asset in assets)
        {
            asset.Return();
            this.assetRepository.Save(asset);
        }
    }
}

The implementation is not tested yet (and will likely be refactored later because I think it does too much stuff) but here’s what it’s going to satisfy:

[Test]
public void Terminating_a_person_returns_all_assets_issued_to_them()
{
    Asset a = new Computer() { Name = "Apple Powerbook" };
    Asset b = new Computer() { Name = "Dell Dimension" };
    Asset c = new Computer() { Name = "IBM ThinkPad" };

    Person p = new Person("Richard");

    a.IssueTo(p);
    b.IssueTo(p);
    c.IssueTo(p);

    IPersonTerminatorService terminatorService =
        ServiceLocator.Current.GetInstance<IPersonTerminatorService>();

    terminatorService.TerminatePerson(p);

    Assert.IsNull(a.IssuedTo);
    Assert.IsNull(b.IssuedTo);
    Assert.IsNull(c.IssuedTo);
}

It’s a start!

Refactoring insight: base project for .NET domain-driven design applications


In the past couple of weeks, I’ve started working on a new framework for my team for developing domain-driven, test-driven web applications with ASP.NET MVC. Actually, it’s more of a meta-framework: a pattern for application development that leans on as many community-backed, best-of-class tools as possible — NHibernate, NUnit, Rhino.Mocks, Json.NET etc.

From time to time, however, I need to write my own utility class or interface, because it’s too specialised or not available elsewhere. To promote reuse and consistency between applications, I started putting them in a shared Foo.Core project (plus Foo.Core.Tests of course), similar to SharpArch.Core. Unfortunately, it began to turn into a bit of a mess; I had DDD-specific stuff like concept base classes and repository traits mixed in alongside generic LINQ extension methods and other random .NET utility classes.

It’s quite likely that some of the more generic stuff will be used in non-DDD projects, like SharePoint components and add-ins for Microsoft Office. But to a non-DDD developer, Foo.Core contains a lot of mysterious and scary stuff whose purpose isn’t clear. This is not going to help adoption within my team.

To solve this problem, I decided to split the project in two. I now have a Foo.DomainDrivenDesign project that so far includes:

  • Repository Traits
  • Specification bases
  • DomainException
  • IEntityValidator<T>, RuleViolation and RuleViolationException
  • ObjectWithId<T>

The project name makes it immediately obvious what all this stuff is for. The rest has been dumped in Foo.Utilities:

  • DesignByContract
  • LINQ extensions

Hopefully in future, the Utilities project can be eliminated completely and replaced by Umbrella or Utilities.NET so we don’t have to maintain it.

IRepository: one size does not fit all

I’ve been spending a lot of time putting TDD/DDD into practice lately, and obsessing over all sorts of small-yet-important details as I try concepts and patterns for myself.

One pattern that has recently become popular in mainstream .NET is IRepository<T>. Originally documented in PoEAA, a repository is an adapter that can read and write objects from the database as if it were an in-memory collection like an array. This is called persistence ignorance (PI), and lets you forget about underlying database/ORM semantics.

Somewhere along the line however, someone realized that with generics you can create a single, shared interface that will support pretty much any operation you could ever want to do with any sort of object. They usually look something like this:

public interface IRepository<T>
{
    T GetById(int id);
    IEnumerable<T> GetAll();
    IEnumerable<T> FindAll(IDictionary<string, object> propertyValuePairs);
    T FindOne(IDictionary<string, object> propertyValuePairs);
    T SaveOrUpdate(T entity);
    void Delete(T entity);
}

If you need to add any custom behaviour, you can simply subclass it — e.g. IProjectRepository : IRepository<Project> — and add new methods there. That’s pretty handy for a quick forms-over-LINQ-to-SQL application.
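For example (borrowing the custom query that appears later in this post):

public interface IProjectRepository : IRepository<Project>
{
    // Aggregate-specific queries live in the subclassed interface.
    IEnumerable<Project> GetProjectsForUser(User user);
}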

However, I don’t believe this is satisfactory for applications using domain-driven design (DDD), as repositories can vary greatly in their capabilities. For example, some repositories will allow aggregates to be added, but not deleted (like legal records). Others might be completely read-only. Trying to shoehorn them all into a one-size-fits-all IRepository<T> interface will simply result in a lot of leaky abstractions: you could end up with a Remove() method that is available but always throws InvalidOperationException, or team rules like “never call Save() on RepositoryX”. That would be pretty bad, so what else can we do instead?

Possibility #2: no common repository interface at all

The first alternative is simply to drop the IRepository<T> base and make a totally custom repository for each aggregate.

public interface IProjectRepository
{
    Project GetById(int id);
    void Delete(Project entity);
    // ... etc
}

This is good, because we won’t inherit a bunch of crap we don’t want, but similar repositories will have a lot of duplicate code. Making method names consistent is now left to the developer — we could end up with one repository having Load(), another Get(), another Retrieve() etc. This isn’t very good.

Thinking about it, a lot of repositories are going to fit into one or two broad categories that share a few common methods — those that support reading and writing, and those that only support reading. What if we extracted these categories out into semi-specialized base interfaces?

Possibility #3: IReadOnlyRepository and IMutableRepository

What if we provided base interfaces for common repository types like this:
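Something like this, perhaps (a sketch — the method names follow the traits shown later):

public interface IReadOnlyRepository<TEntity, TKey>
{
    TEntity GetById(TKey id);
    IEnumerable<TEntity> GetAll();
}

public interface IMutableRepository<TEntity, TKey> :
    IReadOnlyRepository<TEntity, TKey>
{
    void Save(TEntity entity);
    void Remove(TEntity entity);
}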

This is better, but still doesn’t satisfy all needs. Providing a GetAll() method might be helpful for a small collection of aggregates that we enumerate over often, but it wouldn’t make so much sense for a database of a million customers. We still need to be able to include/exclude standard capabilities at a more granular level.

Possibility #4: Repository Traits

Let’s create a whole load of little fine-grained interfaces — one for each standard method a repository might have.

public interface ICanGetAll<T>
{
    IEnumerable<T> GetAll();
}

public interface ICanGetById<TEntity, TKey>
{
    TEntity GetById(TKey id);
}

public interface ICanRemove<T>
{
    void Remove(T entity);
}

public interface ICanSave<T>
{
    void Save(T entity);
}

public interface ICanGetCount
{
    int GetCount();
}

// ...etc

I am calling these Repository Traits. When defining a repository interface for a particular aggregate, you can explicitly pick and choose which methods it should have, as well as adding your own custom ones:

public interface IProjectRepository :
    ICanSave<Project>,
    ICanRemove<Project>,
    ICanGetById<Project, int>,
    ICanGetByName<Project>
{
    IEnumerable<Project> GetProjectsForUser(User user);
}

This lets you define repositories that can do as little or as much as they need to, and no more. If you recognize a new trait that may be shared by several repositories — e.g., ICanDeleteAll — all you need to do is define and implement a new interface.
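The new trait would be as small as the others:

public interface ICanDeleteAll
{
    // Hypothetical new trait: remove every aggregate in the repository.
    void DeleteAll();
}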

Side note: concrete repositories can still have generic bases

Out of interest, here’s what my concrete PersonRepository looks like:
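Something like this (reconstructed — the query details are assumptions, but the shape is right):

public class PersonRepository : NHibernateRepository<Person>, IPersonRepository
{
    public PersonRepository(ISession unitOfWork,
        IEntityValidator<Person> validator)
        : base(unitOfWork, validator)
    {
    }

    // The only genuinely new code: queries specific to Person.
    public Person GetByName(string name)
    {
        return this.unitOfWork
            .CreateCriteria(typeof(Person))
            .Add(Restrictions.Eq("Name", name))
            .UniqueResult<Person>();
    }
}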

There’s very little new code in it, because all of the common repository traits are already satisfied by a generic, one-size-fits-all NHibernateRepository base class (which must be completely hidden from external callers!). IPersonRepository just defines which subset of its methods is available to the domain layer.

Cargo-cult commenting

Currently I am responsible for maintaining a big legacy .NET ERP system. It’s plagued by something I like to call cargo-cult commenting — source code comments that are ritually included, for some reason, but serve no real purpose at all. These are the sort of comments that:

  • Are written for aesthetic reasons more than anything else
  • Usually just repeat the next line of C# in sentence form
  • Tell you what the code does, not the reasons why
  • Don’t add any value whatsoever
  • Just make the code base even more bloated and crappy than it was before

Here are a few examples to demonstrate exactly what I mean.

/// <summary>
/// POST
/// </summary>
/// <param name="oSpace"></param>
public override void POST(ref Space oSpace)

//Create a new attribute.
XmlAttribute newAttr = xmlDocuments.CreateAttribute("Directory");
newAttr.Value = "CompanyCorrespondenceDocs";

// read-only?
if (bVehicleEdit == false) oField.SetReadWrite(ref oSpace, true);

// return the vehicle risk
return DataAccess.ExecuteXml("Result", "Row", "Vehicle_VehicleRisk_Get",
    iVehicleID).SelectSingleNode("Result/Row");

This sort of rubbish gets into the system because people are taught the stupid rule that comments are always good, no matter what their purpose is. One developer sees another developer writing a line of comments for every 2-3 lines of code, copies it, and pretty soon sections without any green look naked and unfinished. If only unit tests caught on this quickly!

Anyway, my golden rules for code commenting are:

  1. There are only two valid reasons for writing comments — to explain why code was written (e.g. original business motivation), or as a warning when something unexpected happens (here be dragons).
  2. Write the reasons why something is done, not how.
  3. If you find yourself putting comments around little blocks of 3-6 lines of related code, each of these blocks should probably be a separate method with a descriptive name instead.
  4. If you can’t write anything useful (i.e. anything that couldn’t be determined by a quick glance at the code), don’t write anything at all.
  5. BUT if you find yourself regularly needing to write comments because it is not clear what’s going on from the code, then your code probably isn’t expressive enough.

Writing good comments is hard. One habit I’ve recently adopted is to paste entire paragraphs of the relevant user requirements into my code to explain why it was written. It’s working really well — it saves me time, and produces comments of very high quality, straight from the horse’s mouth. I encourage you to try it too.
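For example (the requirement text here is invented, just to show the shape):

// From the signed-off requirements document, pasted verbatim:
//
//   "When an employee leaves the organisation, all assets issued to them
//    must be returned before the termination can be completed."
//
public void TerminatePerson(Person person)
{
    ReturnAllAssetsIssuedToPerson(person);
    this.personRepository.Remove(person);
}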

Entity validation and LINQ: Using yield return to optimize IsValid over a list of broken rules


A common pattern for checking an entity is valid involves testing a number of rules on it. After all tests have been performed, a list of broken rules is returned.

Consider this example for validating instances of a simple Customer class:

class CustomerValidator
{
    public IEnumerable<RuleViolation> GetAllRuleViolations(Customer c)
    {
        IList<RuleViolation> ruleViolations = new List<RuleViolation>();

        if (String.IsNullOrEmpty(c.FirstName))
        {
            ruleViolations.Add(new RuleViolation("FirstName",
                "First name cannot be empty."));
        }

        if (String.IsNullOrEmpty(c.LastName))
        {
            ruleViolations.Add(new RuleViolation("LastName",
                "Last name cannot be empty."));
        }

        if (!Regex.IsMatch(c.PhoneNumber, @"[\d ()+-]+"))
        {
            ruleViolations.Add(new RuleViolation("PhoneNumber",
                "Invalid phone number."));
        }

        return ruleViolations;
    }
}

Quite often though, we don’t care about the full list of broken rules — we only care if the object is valid or not. Instead of…

IEnumerable<RuleViolation> brokenRules =
    customerValidator.GetAllRuleViolations(customer);

if (brokenRules.Count() > 0)
    // do stuff

…we would rather have:

if (!customerValidator.IsValid(customer))
    // do stuff

So what’s the difference between checking if an entity is valid and getting a detailed list of validation errors?

For starters, the only way to find out whether an entity is valid is to test all the rules against it. Let’s assume this is a reasonably expensive operation — say you have a lot of rules, or some checks involve external systems (checking a username doesn’t already exist in the database, for example).

If all we’re doing is checking if the entity is valid, we want to do as little work as possible. This means stopping as soon as we hit a broken rule.

The easiest way to do this is with the yield return keyword. Yield return is kind of strange — it lets callers iterate over objects as a method returns them, one at a time. It’s the same mechanism behind LINQ to Objects’ deferred execution: instead of filtering and reducing a collection one criterion at a time — e.g. testing 1000 objects, then re-testing the 500 that passed, etc. — each object streams through all the criteria at once.

In this case, we simply want to return as soon as a broken rule is encountered.

class CustomerValidator
{
    public IEnumerable<RuleViolation> GetAllRuleViolations(Customer c)
    {
        if (String.IsNullOrEmpty(c.FirstName))
        {
            yield return new RuleViolation("FirstName",
                "First name cannot be empty.");
        }

        if (String.IsNullOrEmpty(c.LastName))
        {
            yield return new RuleViolation("LastName",
                "Last name cannot be empty.");
        }

        if (!Regex.IsMatch(c.PhoneNumber, @"[\d ()+-]+"))
        {
            yield return new RuleViolation("PhoneNumber",
                "Invalid phone number.");
        }
    }
}

See that? The method is still defined as returning a collection, but it now has three yield return statements, each returning a single object. With this, we can use a little bit of LINQ to break out of the method early:

public bool IsValid(Customer c)
{
    // are there any rule violations?
    return !GetAllRuleViolations(c).Any();
}

I whipped up a little test app to prove this — IsValid vs GetAllRuleViolations.

CustomerValidator validator = new CustomerValidator();

// An invalid customer
Customer customer = new Customer()
{
    FirstName = "",
    LastName = "",
    PhoneNumber = "asdsd"
};

// do as little work as possible to work out if the entity is valid or not
Console.WriteLine("IsValid = ");
Console.WriteLine(validator.IsValid(customer));

// check everything for a full report
Console.WriteLine("GetAllRuleViolations().Count =");
Console.WriteLine(validator.GetAllRuleViolations(customer).Count());

Here’s what it produces. Note that IsValid returns immediately after the first rule fails.

IsValid =
        Testing first name
True
GetAllRuleViolations().Count =
        Testing first name
        Testing last name
        Testing phone number
3