
Popular Software Development Methodologies to Put in Practice

Living in the era of technology, we can reach almost anything with a single click. Thanks to the software development process, technology has reached a point where you can build platforms that do whatever you imagine.

We came across some of the most popular software development methodologies, so today we are going to share their advantages and disadvantages.

Agile Software Development Methodology

This software development methodology is an adaptive approach that is highly flexible to change. It enables direct communication and improves the quality of the whole software development process by catching defects in the early stages and fixing them quickly.

However, the main disadvantage of this software development methodology is that it places little emphasis on documentation.

Dynamic System Development Methodology

This model tries to involve the user as much as possible throughout the whole process, and its main purpose is to deliver on time while using only the required resources.

Compared to the other software development methodologies, the dynamic model is very expensive and is not recommended for small companies.

Spiral Software Development Methodology

The advantages of this model are compelling: it reduces risk factors in the early stages of the process, and it suits complex, high-risk projects and companies with varying requirements.

Nevertheless, this is a very costly software development methodology, and a failure in the risk analysis can sink the whole project.

If you have any thoughts on this topic, you can contact me, and we can discuss our approaches.

PowerShell to recursively strip C# regions from files

Here’s a super quick little PowerShell snippet to strip regions out of all C# files in a directory tree. Useful for legacy code where people hide long blocks in regions rather than encapsulating them in smaller methods/objects.

dir $src -Recurse -Filter *.cs | ForEach-Object {
    $file = $_.FullName
    echo $file
    # Keep every line that isn't a #region/#endregion directive
    (Get-Content $file) | Where-Object { $_ -notmatch "^\s*#(end)?region" } | Out-File $file
}

Run this in your solution folder and support the movement against C# regions!

Dogfooding: how to build a great API

It’s common these days for web applications to provide some sort of RESTful API for third-party integration, but too many of them are built after the fact, as a lacklustre afterthought — with holes and limitations that make them useless for all but the most trivial applications. Here are some common pitfalls I’ve seen (and been partially responsible for!) arising from this style of development:

  • Only a narrow set of features supported: The application may have a hundred different tasks a user can do in the GUI, but only a handful are supported through the API.
  • Impractical for real-world use: APIs that fail to consider things like minimizing the number of requests through batching and bulk operations (see the sketch after this list). Just think: how would you implement Gmail’s “mark all as read” button if you could only update one conversation at a time?
  • API features lag UI features: For example a new field might be added to the UI, but there is no way to access it in the API until someone requests it.
  • Weird/buggy APIs: Bad APIs don’t get much developer attention if no one’s using them, making them a natural place for bugs and design quirks to lurk.
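
To make the batching point above concrete, here’s a minimal sketch of the two shapes an operation can take. The names (IMailboxApi, MarkConversationAsRead and so on) are hypothetical and not taken from any real API:

using System;
using System.Collections.Generic;

// Hypothetical contract, for illustration only.
public interface IMailboxApi
{
    // One-at-a-time: "mark all as read" over thousands of conversations
    // would cost thousands of round trips.
    void MarkConversationAsRead(Guid conversationId);

    // Batch equivalent: the same use case becomes a single request.
    void MarkConversationsAsRead(IEnumerable<Guid> conversationIds);
}

If the UI itself has to be built on these operations (as argued below), the batch version tends to appear as soon as the first screen needs it.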

Instead of restricting developers to a limited, locked-down set of operations, the goal is to empower developers to be as creative and productive as possible with your API.

A better way to design an API

Don’t try to build the sort of API you think people will want. Flip the problem on its head and eat your own dog food: make a rule that, from now on, all new UI features can only be based on the public API. Abandon your traditional, internal back-end interfaces and start using the public API as an interface for all your front-end applications.

“All Google tools use the API. There is no backdoor. The web UI is built on Google App Engine, for example.”

This quote from High Scalability beautifully summarizes the design of their API. There is no backdoor, no privileged internal interface — Google developers have to use the exact same API as you at home.

Twitter is another great example of an application that dogfoods its own API – click view source on twitter.com and you’ll see, apart from tracking and advertising, it’s mostly AJAX calls to their public developer API – the very same API all the thousands of third party apps use.

Running Mocha browser tests in TeamCity

Mocha is a great javascript testing framework that supports TeamCity out-of-the-box for testing node.js-based apps on your build server. Here’s a quick guide on how to get it running in TeamCity for browser-based apps as well.

Configuring Mocha’s TeamCity reporter

First we need to configure Mocha to emit specially formatted messages to console.log() that TeamCity can detect and parse. This is easy because Mocha supports pluggable reporters for displaying test progress, and it provides a TeamCity reporter out of the box. Set this up in your mocha.html page:

<script type="text/javascript">
	mocha.setup({
		ui: 'bdd',
		reporter: function(runner) {
			// Note I am registering both an HTML and a TeamCity reporter here
			// so I can use the same HTML page for local browser development and
			// TeamCity.
			new mocha.reporters.HTML(runner);
			new mocha.reporters.Teamcity(runner);
		},
		ignoreLeaks: true,
		timeout: 5000 // ms
	});

	$(function(){
		mocha.run();
	})
</script>

Executing Mocha tests from the command line

For running our tests we will use a special browser, PhantomJS, which is “headless” and has no GUI — you simply invoke it from the command line and interact with it via JavaScript. Here’s how you can load an HTML page containing Mocha tests — from either the local file system or a web server — passing the URL as a command-line argument:

(function () {
    "use strict";
    var system = require("system");
    var url = system.args[1];

    phantom.viewportSize = {width: 800, height: 600};

    console.log("Opening " + url);

    var page = new WebPage();

    // This is required because PhantomJS sandboxes the website and does not
    // surface console messages from that page by default
    page.onConsoleMessage = function (msg) {
        console.log(msg);

        // Exit as soon as the last test finishes.
        if (msg && msg.indexOf("##teamcity[testSuiteFinished name='mocha.suite'") !== -1) {
            phantom.exit();
        }
    };

    page.open(url, function (status) {
        if (status !== 'success') {
            console.log('Unable to load the address!');
            phantom.exit(-1);
        } else {
            // Timeout - kill PhantomJS if still not done after 2 minutes.
            window.setTimeout(function () {
                phantom.exit();
            }, 120 * 1000);
        }
    });
}());

You can then invoke your tests from the command line, and you should see a bunch of TeamCity messages scroll past.

phantomjs.exe phantomjs-tests.js http://localhost:88/jstests/mocha.html

Note in my example I am running tests on a local web server on port 88, but PhantomJS also supports local file:// URLs.

Setting up a build step for PhantomJS

Finally, the last step is to set up a new Command Line build step in TeamCity to run PhantomJS.

And that’s it! You should now see your passing and failing tests showing up in TeamCity.

Acknowledgements

Thanks to Dan Merino for his original article on running Jasmine tests under TeamCity, which formed the basis for most of this post.

One NHibernate session per WCF operation, the easy way

This week I’ve been working on a brownfield Castle-powered WCF service that was creating a separate NHibernate session on every call to a repository object.

Abusing NHibernate like this was playing all sorts of havoc with our app (e.g. TransientObjectExceptions), and prevented us from using transactions that matched a logical unit of work, so I set about refactoring it.

Goals

  • One session per WCF operation
  • One transaction per WCF operation
  • Direct access to ISession in my services
  • Rely on Castle facilities as much as possible
  • No hand-rolled code to plug everything together

There are a plethora of blog posts out there to tackle this problem, but most of them require lots of hand-rolled code. Here are a couple of good ones — they both create a custom WCF context extension to hold the NHibernate session, and initialize/dispose it via WCF behaviours:

  • NHibernate Session Per Request Using Castles WcfFacility
  • NHibernate’s ISession, scoped for a single WCF-call

These work well, but actually, there is a much simpler way that only requires the NHibernate and WCF Integration facilities.

Option one: manual Session.BeginTransaction() / Commit()

The easiest way to do this is to register NHibernate’s ISession in the container, with a per WCF operation lifestyle:

windsor.AddFacility<WcfFacility>();
windsor.AddFacility("nhibernate"new NHibernateFacility(...));
windsor.Register(
    Component.For<ISession>().LifeStyle.PerWcfOperation()
        .UsingFactoryMethod(x => windsor.Resolve<ISessionManager>().OpenSession()),
    Component.For<MyWcfService>().LifeStyle.PerWcfOperation());

If you want a transaction, you have to manually open and commit it. (You don’t need to worry about anything else because NHibernate’s ITransaction rolls back automatically on dispose):

[ServiceBehavior]
public class MyWcfService : IMyWcfService
{
    readonly ISession session;
    public MyWcfService(ISession session)
    {
        this.session = session;
    }
    public void DoSomething()
    {
        using (var tx = session.BeginTransaction())
        {
            // do stuff
            session.Save(...);
            tx.Commit();
        }
    }
}

(Note of course we are using WindsorServiceHostFactory so Castle acts as a factory for our WCF services. And disclaimer: I am not advocating putting data access and persistence directly in your WCF services here; in reality ISession would more likely be injected into query objects and repositories each with a per WCF operation lifestyle (you can use this to check for lifestyle conflicts). It is just an example for this post.)

Anyway, that’s pretty good, and allows a great deal of control. But developers must remember to use a transaction, or remember to flush the session, or else changes won’t be saved to the database. How about some help from Castle here?

Option two: automatic [Transaction] wrapper

Castle’s Automatic Transaction Facility allows you to decorate methods as [Transaction] and it will automatically wrap a transaction around it. IoC registration becomes simpler:

windsor.AddFacility<WcfFacility>();
windsor.AddFacility("nhibernate"new NHibernateFacility(...));
windsor.AddFacility<TransactionFacility>();
windsor.Register(
    Component.For<MyWcfService>().LifeStyle.PerWcfOperation());

And using it:

[ServiceBehavior, Transactional]
public class MyWcfService : IMyWcfService
{
    readonly ISessionManager sessionManager;
    public MyWcfService(ISessionManager sessionManager)
    {
        this.sessionManager = sessionManager;
    }
    [Transaction]
    public virtual void DoSomething()
    {
        // do stuff
        sessionManager.OpenSession().Save(...);
    }
}

What are we doing here?

  • We decorate methods with [Transaction] (remember to make them virtual!) instead of manually opening/closing transactions. I put this attribute on the service method itself, but you could put it anywhere — for example on a CQRS command handler, or domain event handler etc. Of course this requires that the class with the [Transactional] attribute is instantiated via Windsor so it can proxy it.
  • Nothing in the NHibernateFacility needs to be registered per WCF operation lifestyle. I believe this is because NHibernateFacility uses the CallContextSessionStore by default, which in a WCF service happens to be scoped to the duration of a WCF operation.
  • Callers must not dispose the session — that will be done by Castle after the transaction is committed. To discourage this I am using it as a method chain — sessionManager.OpenSession().Save() etc.
  • Inject ISessionManager, not ISession. The reason for this is related to transactions: NHibernateFacility must construct the session after the transaction is opened, otherwise it won’t know to enlist it. (NHibernateFacility knows about ITransactionManager, but ITransactionManager doesn’t know about NHibernateFacility.) If your service depends on ISession, Castle will construct the session when MyWcfService and its dependencies are resolved (time of object creation), before the transaction has started (time of method dispatch). Using ISessionManager allows you to lazily construct the session after the transaction is opened.
  • In fact, for this reason, ISession is not registered in the container at all — it is only accessible via ISessionManager (which is automatically registered by the NHibernate Integration Facility).

This gives us an NHibernate session per WCF operation, with automatic transaction support, without the need for any additional code.

Update: there is one situation where this doesn’t work — if your WCF service returns a stream that invokes NHibernate, or otherwise causes NHibernate to load data after the end of the method. A workaround for these methods is simply to omit the [Transaction] attribute (hopefully you’re following CQS and not writing to the DB in your queries!).

Brownfield CQRS Part 1 – Commands

One question that came up several times at DDD eXchange last week was CQRS: now we understand all the benefits, how do we begin migrating our existing applications towards this sort of architecture?

It’s something we’ve been chipping away at recently at work, and over a short series of posts I’d like to share some of the conventions and patterns we’ve been using to migrate traditional WPF client/WCF server systems towards a CQRS-aware architecture.

Note that WCF isn’t strictly required here — these patterns are equally applicable to most other RPC-based services (e.g. old ASMX web services).

CQRS recap

Command-Query Responsibility Segregation (CQRS) is based around a few key assumptions:

  • All system behaviour is either a command that changes the state of the system, or a query that provides a view of that state (e.g. to display on screen).
  • In most systems, the number of reads (queries) is typically an order of magnitude higher than the number of writes (commands) — particularly on the web. It is therefore useful to be able to scale each side independently.
  • Commands and queries have fundamentally different needs — commands favour a domain model, consistency and normalization, where reads are faster when highly denormalized e.g. table-per-screen with as little processing as possible in between. They often also portray the same objects differently — commands are typically small, driven by the needs of business transactions, where queries are larger, driven by UX requirements and sometimes including projections, flattening and aggregation.
  • Using the same underlying model and storage mechanism for reads and writes couples the two sides together, and ensures at least one will suffer as a result.

CQRS completely separates (segregates) commands and queries at an architectural level, so each side can be designed and scaled independently of the other.

CQRS and WCF

The best way to begin refactoring your architecture is to define clear interfaces — contracts — between components. Even if, secretly, the components are huge messes on the inside, getting their interfaces (commands and queries) nailed down first sets the tone of the system, and allows you to begin refactoring each component at your discretion, without affecting others.

Each method on our WCF service must be either a command or a query. Let’s start with commands.

Commands

Each command method on your service must:

  • Take a single command argument — a simple DTO/parameter object that encapsulates all the information required to execute the command.
  • Return void — commands do not have return values. Only fault contracts may be used to throw an exception when a command fails.

Here’s an example:

[DataContract]
public class BookTableCommand : ICommand
{
    [DataMember]
    public DateTime TimeAndDay { get; set; }

    [DataMember]
    public int PartySize { get; set; }

    [DataMember]
    public string PartyName { get; set; }

    [DataMember]
    public string ContactPhoneNumber { get; set; }

    [DataMember]
    public string SpecialRequests { get; set; }
}

Commands carry all the information they need for someone to execute them — e.g. a command for booking a restaurant would tell us who is coming, when the booking is for, contact details, and any special requests (e.g. someone’s birthday). Commands like these are a special case of the Parameter Object pattern.

Now we’ve got our command defined, here’s the corresponding method on the WCF service endpoint:

[ServiceContract]
public interface IBookingService
{
    [OperationContract]
    void BookTable(BookTableCommand command);
}

One class per command

Command DTO classes are never re-used outside their single use case. For example, in the situation a customer wishes to change their booking (e.g. change it to a different day, or invite more friends), you would create a whole new ChangeBookingCommand, even though it may have all the same properties as the original BookTableCommand.

Why bother? Why not just create a single, general-purpose booking DTO and use it everywhere? Because:

  1. Commands are more than just the data they carry. The type communicates intent, and its name describes the context under which the command would be sent. This information would be lost with a general-purpose object.
  2. Using the same command DTO for two use cases couples them together. You couldn’t add a parameter for one use case without adding it for the other, for example.

What if I only need one parameter? Do I need a whole command DTO for that?

Say you had a command that only carried a reference to an object — its ID:

[DataContract]
public class CancelBookingCommand : ICommand
{
    [DataMember]
    public Guid BookingReference { get; set; }
}

Is it still worth creating an entire command DTO here? Why not just pass the GUID directly?

Actually, it doesn’t matter how many parameters there are:

  • The intent of the command (in this case, cancelling a restaurant booking) is more important than the data it carries.
  • Having a named command object makes this intent explicit. Not possible just with a GUID argument on your operation contract.
  • Adding another parameter (say, for example, an optional reason for cancelling) would require you to change the signature of the service contract if it took raw arguments. Adding another property to a command object would not.
  • Command objects are much easier to pass around than a bunch of loose variables (as we will see in the next post). For example, you can queue commands on a message bus to be processed later, or dispatch them out to a cluster of machines (see the sketch below).
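
As a rough illustration of that last point, here’s a minimal sketch of dispatching command objects. ICommandHandler<TCommand> and InProcessCommandDispatcher are invented names for this example, not part of the WCF contract above:

using System;
using System.Collections.Generic;

// Hypothetical names for illustration; the next post goes into more detail.
public interface ICommand { }

public interface ICommandHandler<TCommand> where TCommand : ICommand
{
    void Handle(TCommand command);
}

// A trivial in-process dispatcher. Because each command is a single serializable
// object, the same Dispatch call could just as easily hand it to a message bus.
public class InProcessCommandDispatcher
{
    readonly IDictionary<Type, Action<ICommand>> handlers =
        new Dictionary<Type, Action<ICommand>>();

    public void Register<TCommand>(ICommandHandler<TCommand> handler)
        where TCommand : ICommand
    {
        handlers[typeof(TCommand)] = c => handler.Handle((TCommand)c);
    }

    public void Dispatch(ICommand command)
    {
        handlers[command.GetType()](command);
    }
}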

Why not just one overloaded Execute() method?

Instead of having one operation contract per command, why don’t you just use a single overloaded method like this?

[ServiceContract]
public interface IBookingService
{
    [OperationContract]
    void Execute<T>(T command) where T : ICommand;
}

You can, but I wouldn’t recommend it. We’re still doing SOA here — a totally generic contract like this makes it much harder for things like service discovery, and for other clients to see the endpoint’s capabilities.

Correctness vs Robustness

In programming, correctness and robustness are two high-level principles to which a number of other practices can be traced back.

Correctness:

  • Design by Contract
  • Assertions
  • Invariants
  • Fail Fast

Robustness:

  • Defensive programming
  • Populating missing parameters
  • Sensible defaults
  • Getting out of the user's way
  • Anticorruption Layer
  • Backwards-compatible APIs

Robustness adds built-in tolerance for common and non-critical mistakes, while correctness throws an error when it encounters anything less than perfect input. Although they are contradictory, both principles play an important role in most software, so it’s important to know when each is appropriate, and why.

Robustness

“Robustness” is well known as one of the founding principles of the internet, and is probably one of the major contributing factors to its success. Postel’s Law summarizes it simply:

Be conservative in what you do; be liberal in what you accept from others.

Postel’s Law originally referred to other computers on a network, but these days, it can be applied to files, configuration, third party code, other developers, and even users themselves. For example:

  • Problem: a rogue web browser adds trailing whitespace to HTTP headers. Robust approach: strip the whitespace and process the request as normal. Correct approach: return an HTTP 400 Bad Request status to the client.
  • Problem: a video file has corrupt frames. Robust approach: skip over the corrupt area to the next playable section. Correct approach: stop playback and raise a “Corrupt video file” error.
  • Problem: a config file has lines commented out using the wrong character. Robust approach: internally recognize the most common comment prefixes and ignore them. Correct approach: terminate on startup with a “bad configuration” error.
  • Problem: a user enters dates in a strange format. Robust approach: try parsing the string against a number of different date formats, then render the correct format back to the user. Correct approach: show an invalid date error.

(Important side note: Postel’s law doesn’t suggest skipping validation entirely, but that errors in non-essential items simply be logged and/or warned about instead of throwing fatal exceptions.)

In many cases, the relentless pursuit of correctness results in a pretty bad user experience. One recent annoyance of mine is companies that can’t handle spaces in credit card numbers. Computers are pretty good at text processing, so wasting the user’s time by forcing them to retype their credit card number in a specific format is pure laziness on the part of the developer. Especially when you consider that the validation error probably took more code than simply stripping the spaces out would have.
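
For example, stripping the separators before validating takes only a line or two. A rough sketch (the class and method names are mine, and the real validation is elided):

using System.Linq;

public static class CardNumberInput
{
    // Robust approach: accept "4111 1111 1111 1111", "4111-1111-1111-1111", etc.
    // by throwing away everything that isn't a digit before validating.
    public static string Normalize(string input)
    {
        return new string(input.Where(char.IsDigit).ToArray());
    }

    public static bool LooksValid(string input)
    {
        var digits = Normalize(input);
        // Real validation (length by card type, Luhn checksum) would go here.
        return digits.Length >= 13 && digits.Length <= 19;
    }
}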

Compare this to Google Maps, where you can enter just about anything in the search box and it will figure out a street address. I know which I prefer using.

Robustness makes users’ lives easier, and by doing so promotes uptake. Not only amongst end users, but also developers — if you’re developing a standard/library/service/whatever, and can build in a bit of well thought-out flexibility, it’s going to grant a second chance for users with clients that aren’t quite compliant, instead of kicking them out cold.

Adam Bosworth noted a couple of examples of this in a recent post on what makes successful standards:

If there is something in HTTP that the receiver doesn’t understand it ignores it. It doesn’t break. If there is something in HTML that the browser doesn’t understand, it ignores it. It doesn’t break. See Postel’s law. Assume the unexpected. False precision is the graveyard of successful standards. XML Schema did very badly in this regard.

HTTP and HTML are displaying robustness here; if they see anything they don’t recognize they simply skip past it. XML Schema on the other hand fails validation if there are any extra elements/attributes present other than precisely those specified in the XSD.

Correctness

So robustness makes life easier for users and third-party developers. Correctness, on the other hand, makes life easier for your developers — instead of getting bogged down checking and fixing parameters and working around strange edge cases, they can focus on a single model in which all assumptions are guaranteed. Any state outside the main success path can thus be ignored (by failing noisily) — producing code that is briefer, easier to understand, and easier to maintain.

For example, consider parsing XHTML vs regular HTML. If XHTML isn’t well formed, you can simply throw a parser error and give up. HTML on the other hand has thoroughly documented graceful error handling and recovery procedures that you need to implement — much harder than simply validating it as XML.

Which should you use then?

We have a conflict here between robustness and correctness as guiding principles. I remember one paralyzing moment of indecision I had with this very question when I was a bit younger. The specific details aren’t important, but the 50/50 problem basically boiled down to this: if my code doesn’t technically require this value to be provided, should I check it anyway?

To answer this question, you need to be aware of where you are in the code base, and who it is serving.

  • External interfaces (UI, input files, configuration, API etc) exist primarily to serve users and third parties. Make them robust, and as accommodating as possible, with the expectation that people will input garbage.
  • An application’s internal model (i.e. domain model) should be as simple as possible, and always be in a 100% valid state. Use invariants and assertions to make safe assumptions, and just throw a big fat exception whenever you encounter anything that isn’t right.
  • Protect the internal model from external interfaces with an anti-corruption layer which maps and corrects invalid input where possible, before passing it to the internal model.

Remember if you ignore users’ needs, no one will want to use your software. And if you ignore programmers’ needs, there won’t be any software. So make your external interfaces robust. Make your internal model correct.

Or in other words: internally, seek correctness; externally, seek robustness. A successful application needs both.
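
Here’s a rough sketch of how those guidelines fit together; the Booking class and BookingRequestMapper are invented for this example, and the parsing rules are only a placeholder:

using System;
using System.Globalization;

// Internal model: always valid, fails fast on anything that isn't.
public class Booking
{
    public DateTime TimeAndDay { get; private set; }
    public int PartySize { get; private set; }

    public Booking(DateTime timeAndDay, int partySize)
    {
        if (partySize <= 0)
            throw new ArgumentOutOfRangeException("partySize", "A booking needs at least one guest.");
        TimeAndDay = timeAndDay;
        PartySize = partySize;
    }
}

// Anti-corruption layer: liberal with the outside world, strict with the model.
public class BookingRequestMapper
{
    static readonly string[] AcceptedDateFormats =
        { "yyyy-MM-dd HH:mm", "d/M/yyyy h:mmtt", "d MMM yyyy HH:mm" };

    public Booking Map(string timeAndDay, string partySize)
    {
        // Tolerate sloppy input: trim it, try several date formats, etc.
        var when = DateTime.ParseExact(timeAndDay.Trim(), AcceptedDateFormats,
            CultureInfo.InvariantCulture, DateTimeStyles.None);
        var size = int.Parse(partySize.Trim());

        // Only clean, checked values ever reach the domain object.
        return new Booking(when, size);
    }
}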

Domain-Driven Documentation

Here’s a couple of real-life documentation examples from a system I’ve been building for a client:

Monitored Individual is a role played by certain Employees. Each Monitored Individual is required to be proficient in a number of Competencies, according to [among other things] what District they’re stationed in.

Training Programme is comprised of Skills, arranged in Skill Groups. Skill Groups can contain Sub Groups nested as many levels deep as you like. Skills can be used for multiple Training Programmes, but you can’t have the same Skill twice under the same Training Programme. When a Skill is removed from a Training Programme, Individuals should no longer have to practice it.

This is the same style Evans uses himself in the blue DDD book. A colleague jokingly called it Domain-Driven Documentation.

I adopted it after noticing a couple of problems with my documentation:

  • I was using synonyms — different words with the same meaning — interchangeably to refer to the same thing in different places.
  • Sentences talking about the code itself looked messy and inconsistent when mixing class names with higher-level concepts.

It’s a pretty simple system. There are only three rules to remember: when referring to domain concepts, use capital letters, write them in full, and write them in bold.

Highlighting the names of domain concepts like this is a fantastic way to hammer down the ubiquitous language — the vocabulary shared between business and developers.

Since adopting it, I’ve noticed improvements in both the quality of my documentation, and of the communication in our project meetings — non-technical business stakeholders are starting to stick to the ubiquitous language now, where in the past they would fall back to talking about purely UI artifacts. This is really encouraging to see — definitely a success for DDD.

The Trouble with Soft Delete

Soft delete is a commonly-used pattern amongst database-driven business applications. In my experience, however, it usually ends up causing more harm than good. Here are a few reasons why it can fail in bigger applications, and some less painful alternatives to consider.

Tomato, Tomato

I’ve seen a few different implementations of this pattern in action. First is the standard deleted flag to indicate an item should be ignored:

SELECT * FROM Product WHERE IsDeleted = 0

Another style uses meaningful status codes:

SELECT * FROM Task WHERE Status = 'Pending'

You can even give an item a fixed lifetime that starts and ends at a specific time (it might not have started yet):

SELECT * FROM Policy WHERE GETDATE() BETWEEN StartDate AND EndDate

All of these styles are flavors of the same concept: instead of pulling dead or infrequently used items out of the active set, you simply mark them and change queries to step over the corpses at runtime.

This is a trade-off: soft delete columns are easy to implement, but incur a cost to query complexity and database performance later down the track.

Complexity

To prevent mixing active and inactive data in results, all queries must be made aware of the soft delete columns so they can explicitly exclude deleted rows. It’s like a tax: a mandatory WHERE clause to ensure you don’t return any deleted rows.

This extra WHERE clause is similar to checking return codes in programming languages that don’t throw exceptions (like C). It’s very simple to do, but if you forget it in even one place, bugs can creep in very fast. And it is background noise that detracts from the real intention of the query.

Performance

At first glance you might think evaluating soft delete columns in every query would have a noticeable impact on performance.

However, I’ve found that most RDBMSs are actually pretty good at recognizing soft delete columns (probably because they are so commonly used) and do a good job of optimizing queries that use them. In practice, filtering out inactive rows doesn’t cost too much by itself.

Instead, the performance hit comes simply from the volume of data that builds up when you never clear out old rows. For example, we have a table in a system at work that records an organisation’s day-to-day tasks: pending, planned, and completed. It has around five million rows in total, but only a very small percentage (2%) are still active and interesting to the application. The rest are all historical: rarely used, and kept only to maintain foreign key integrity and for reporting purposes.

Interestingly, the biggest problem we have with this table is not slow read performance but writes. Due to its high use, we index the table heavily to improve query performance. But with the number of rows in the table, it takes so long to update these indexes that the application frequently times out waiting for DML commands to finish.

This table is becoming an increasing concern for us — it represents a major portion of the application, and with around a million new rows being added each year, the performance issues are only going to get worse.

Back to the original problem

The trouble with implementing soft delete via a column is that it simply doesn’t scale well for queries targeting multiple tables — we need a different strategy for larger data models.

Let’s take a step back and examine the reasons why you might want to implement soft deletes in a database. If you think about it, there really are only four categories:

  1. To provide an ‘undelete’ feature.
  2. Auditing.
  3. For soft create.
  4. To keep historical items.

Let’s look at each of these and explore what other options are available.

Soft delete to enable undo

Human error is inevitable, so it’s common practice to give users the ability to bring something back if they delete it by accident. But this functionality can be tricky to implement in an RDBMS, so first you need to ask an important question — do you really need it?

There are two styles of undelete I have encountered, both typically achieved via soft delete:

  1. There is an undelete feature available somewhere in the UI, or
  2. Undelete requires running commands directly against the database.

If there is an undo delete button available somewhere for users, then it is an important use case that needs to be factored into your code.

But if you’re just putting soft delete columns on out of habit, and undelete still requires a developer or DBA to run a command against the database to toggle the flags back, then this is a maintenance scenario, not a use case. Implementing it will take time and add significant complexity to your data model, and add very little benefit for end users, so why bother? It’s a pretty clear YAGNI violation — in the rare case you really do need to restore deleted rows, you can just get them from the previous night’s backup.

Otherwise, if there really is a requirement in your application for users to be able to undo deletes, there is already a well-known pattern specifically designed to take care of all your undo-related scenarios.

The memento pattern

Soft delete only supports undoing deletes, but the memento pattern provides a standard means of handling all undo scenarios your application might require.

It works by taking a snapshot of an item just before a change is made, and putting it aside in a separate store, in case a user wants to restore or rollback later. For example, in a job board application, you might have two tables: one transactional for live jobs, and an undo log that stores snapshots of jobs at previous points in time.

If you want to restore one, you simply deserialize it and insert it back in. This is much cleaner than messing up your transactional tables with ghost items, and lets you handle all undo operations together using the same pattern.
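
A rough sketch of what that might look like in code; the Job class, the IUndoLog interface and the naive serialization are all assumptions for this example:

using System;

public class Job
{
    public Guid Id { get; set; }
    public string Title { get; set; }
}

// Memento store: holds opaque snapshots, and knows nothing about the live Job table.
public interface IUndoLog
{
    void Snapshot(Guid jobId, string serializedJob, DateTime takenAt);
    string GetLatestSnapshot(Guid jobId);
}

public class JobService
{
    readonly IUndoLog undoLog;

    public JobService(IUndoLog undoLog)
    {
        this.undoLog = undoLog;
    }

    public void Delete(Job job)
    {
        // Take the memento just before the destructive change...
        undoLog.Snapshot(job.Id, Serialize(job), DateTime.UtcNow);

        // ...then really delete the row from the transactional table.
    }

    public Job Restore(Guid jobId)
    {
        // Undo is just deserializing the snapshot (re-inserting the row
        // into the live table is elided here).
        return Deserialize(undoLog.GetLatestSnapshot(jobId));
    }

    // Naive serialization, purely for the sketch -- JSON or XML would be more realistic.
    static string Serialize(Job job) { return job.Id + "|" + job.Title; }
    static Job Deserialize(string snapshot)
    {
        var parts = snapshot.Split('|');
        return new Job { Id = new Guid(parts[0]), Title = parts[1] };
    }
}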

Soft delete for auditing

Another common practice I have seen is using soft delete as a means of auditing: to keep a record of when an item was deleted, and who deleted it. Usually additional columns are added to store this information.

As with undo, you should ask a couple of questions before you implement soft delete for auditing reasons:

  • Is there a requirement to log when an item is deleted?
  • Is there a requirement to log any other significant application events?

In the past I have seen developers (myself included) automatically adding delete auditing columns like this as a convention, without questioning why it’s needed (aka cargo cult programming).

Deleting an object is only one event we might be interested in logging. If a rogue user performs some malicious acts in your application, updates could be just as destructive as deletes, so we should know about those too.

One possible conclusion from this thought is that you should simply log all DML operations on the table. You can do this pretty easily with triggers:

-- Log all DELETE operations on the Product table
CREATE TRIGGER tg_Product_Delete ON Product AFTER DELETE
AS
    INSERT INTO [Log]
    (
        [Timestamp],
        [Table],
        Command,
        ID
    )
    SELECT
        GETDATE(),
        'Product',
        'DELETE',
        ProductID
    FROM
        Deleted
GO

-- Log all UPDATE operations on the Product table
CREATE TRIGGER tg_Product_Update ON Product AFTER UPDATE
AS
-- ...etc

Contextual logging

If something goes wrong in the application and we want to retrace the series of steps that led to it, CREATES, UPDATES and DELETES by themselves don’t really explain much of what the user was trying to achieve. Getting useful audit logs from DML statements alone is like trying to figure out what you did last night from just your credit card bill.

It would be more useful if the logs were expressed in the context of the use case, not just the database commands that resulted from it. For example, if you were tracking down a bug, would you rather read this:

[09:30:24] DELETE ProductCategory 142 13 dingwallr

… or this?

[09:30:24] Product 'iPhone 3GS' (#142) was removed from the catalog by user dingwallr. Categories: 'Smart Phones' (#13), 'Apple' (#15).

Logging at the row level simply cannot provide enough context to give a true picture of what the user is doing. Instead it should be done at a higher level where you know the full use-case (e.g. application services), and the logs should be kept out of the transactional database so they can be managed separately (e.g. rolling files for each month).
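
In code, that might look something like the following; ProductCatalogService, IAuditLog and the message format are invented for this example:

using System;

public interface IAuditLog
{
    void Record(string message);
}

// The application service knows the whole use case, so it can log in those terms,
// and the log itself can live outside the transactional database (e.g. rolling files).
public class ProductCatalogService
{
    readonly IAuditLog auditLog;

    public ProductCatalogService(IAuditLog auditLog)
    {
        this.auditLog = auditLog;
    }

    public void RemoveProductFromCatalog(int productId, string productName, string userName)
    {
        // ... remove the product and its category links here ...

        auditLog.Record(string.Format(
            "Product '{0}' (#{1}) was removed from the catalog by user {2}.",
            productName, productId, userName));
    }
}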

Soft create

The soft delete pattern can also be extended for item activation — instead of simply creating an item as part of the active set, you create it in an inactive state and flick a switch or set a date for when it should become active.

For example:

-- Get employees who haven't started work yet
SELECT * FROM Employee WHERE GETDATE() < StartDate

This pattern is most commonly seen in publishing systems like blogs and CMSs, but I’ve also seen it used as an important part of an ERP system for scheduling changes to policy before it comes into effect.

Soft delete to retain historical items

Many database-backed business applications are required to keep track of old items for historical purposes — so users can go back and see what the state of the business was six months ago, for example.

(Alternatively, historical data is kept because the developer can’t figure out how to delete something without breaking foreign key constraints, but this really amounts to the same thing.)

We need to keep this data somewhere, but it’s no longer immediately interesting to the application because either:

  • The item explicitly entered a dormant state – e.g. an expired eBay listing or deactivated Windows account, or
  • The item was implicitly dropped from the active set – e.g. my Google Calendar appointments from last week that I will probably never look at again

I haven’t included deleted items here because there are very few cases I can think of in business applications where data is simply deleted (unless it was entered in error) — usually it is just transformed from one state to another. Udi Dahan explains this well in his article Don’t Delete – Just Don’t:

Orders aren’t deleted – they’re cancelled. There may also be fees incurred if the order is canceled too late.

Employees aren’t deleted – they’re fired (or possibly retired). A compensation package often needs to be handled.

Jobs aren’t deleted – they’re filled (or their requisition is revoked).

Unless your application is pure CRUD (e.g. data grids), these states most likely represent totally different use cases. For example, in a timesheet application, complete tasks may be used for invoicing purposes, while incomplete tasks comprise your todo list.

Different use cases, different query needs

Each use case has different query requirements, as the information the application is interested in depends on the context. To achieve optimal performance, the database should reflect this — instead of lumping differently-used sets of items together in one huge table with a flag or status code as a discriminator, consider splitting them up into separate tables.

For example, in our job board application, we might store open, expired and filled listings using a table-per-class strategy.

Physically separating job listings by state allows us to optimize them for different use cases — focusing on write performance for active items, and read performance for past ones, with different columns, indexes and levels of (de)normalization for each.

Isn’t this all overkill?

Probably. If you’re already using soft delete and haven’t had any problems then you don’t need to worry — soft delete was a sensible trade-off for your application that hasn’t caused any serious issues so far.

But if you’re anticipating growth, or already encountering scalability problems as dead bodies pile up in your database, you might like to look at alternatives that better satisfy your application’s requirements.

The truth is soft delete is a poor solution for most of the problems it promises to solve. Instead, focus on what you’re actually trying to achieve. Keep everything simple and follow these guidelines:

  • Primary transactional tables should only contain data that is valid and active right now.
  • Do you really need to be able to undo deletes? If so, there are dedicated patterns to handle this.
  • Audit logging at the row level sucks. Do it higher up where you know the full story.
  • If a row doesn’t apply yet, put it in a queue until it does.
  • Physically separate items in different states based on their query usage.

Above all, make sure you’re not gold plating tables with soft delete simply out of habit!

Repositories Don’t Have Save Methods

Here’s a repository from an application I’ve been working on recently. It has a pretty significant leaky abstraction problem that I shall be fixing tomorrow:

public interface IEmployeeRepository
{
    void Add(Employee employee);
    void Remove(Employee employee);
    Employee GetById(int id);
    void Save(Employee employee);
}

What’s Wrong with this Picture?

Let me quote the DDD step by step wiki on what exactly a repository is:

Repositories behave like a collection of an Aggregate Root, and act as a facade between your Domain and your Persistence mechanism.

The Add and Remove methods are cool — they provide the collection semantics. GetById is cool too — it enables the lookup of an entity by a special handle that external parties can use to refer to it.

Save on the other hand signals that an object’s state has changed (dirty), and these changes need to be persisted.

What? Dirty tracking? That’s a persistence concern, nothing to do with the domain. Dirty tracking is the exclusive responsibility of a Unit of Work — an application-level concept that most good ORMs provide for free. Don’t let it leak into your domain model!
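
For the record, here’s roughly where I expect the interface to end up: pure collection semantics, with persistence left to a unit of work. The IUnitOfWork shape below is only a sketch; with NHibernate, ISession and ITransaction already play that role.

using System;

public class Employee
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Collection semantics only -- no Save, no dirty tracking.
public interface IEmployeeRepository
{
    void Add(Employee employee);
    void Remove(Employee employee);
    Employee GetById(int id);
}

// Dirty tracking and flushing belong here instead (hypothetical shape;
// an ORM session such as NHibernate's ISession typically provides it for free).
public interface IUnitOfWork : IDisposable
{
    void Commit();
}

Application code loads and mutates Employees through the repository, and the unit of work persists whatever changed when Commit() is called at the end of the operation.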