Object-oriented basics: single object or collection scope?

Here is a contrived example of a common SOLID violation you might see. Can you spot it?

class Mp3Encoder : IMp3Encoder
{
    public void Encode(IEnumerable<string> wavFiles)
    {
        foreach (var wavFile in wavFiles)
        {
            var outputFile = /* create output file */;

            while (/* blocks remaining... */)
            {
                var buffer = /* read block */;
                var encoded = /* encode wav block as MP3 */;
                /* write block to output file */;
            }

            /* write ID3 trailing header */;
        }
    }
}

Except in trivially simple cases, there should always be a class boundary when shifting context from coordinating a collection versus performing actions on a single object.

The class above is violating this rule — it knows how to perform collection-level responsibilities as well as single-object responsibilities. It needs to be broken into two classes; one for encoding a single file and one for coordinating the group.

This rule is a form of the Single Responsibility Principle. For example:

Collection-scoped class responsibilities
  • Coordinating ‘before all’ and ‘after all’ actions
  • Looping through items
  • Maintaining shared state (counting, accumulating etc)
Single object-scoped class responsibilities
  • Coordinating ‘before each’ and ‘after each’ actions
  • Performing actions on item

If you ignore this collection-vs-single-object contextual boundary, your classes will become messes of nested procedural code — especially when different behaviour is required for each item in the collection. Your classes will be that much harder to unit test, and you won’t easily be able to re-use them in single-object scenarios.


December 30, 2011 | 2 comments

Subscribing to NuGet package updates via RSS

Just a quick tip I found today – If you’re a NuGet package author and want to be notified when updates are published for upstream packages you depend, you can do so by subscribing to an OData query in an RSS reader.

For example, in order to keep protobuf-net-data in sync with the latest protobuf-net, I need to publish a new package rebuilt against the latest protobuf-net data every time they release a new version. For this I subscribed to the following URL in Google Reader:

http://packages.nuget.org/v1/FeedService.svc/Packages()?$filter=Id eq ‘protobuf-net’

Matt Wrock has a few more advanced examples of this using ifttt.com to orchestrate sending emails and tracking package downloads etc.


December 18, 2011 | Leave your comment

Protocol Buffers DataReader Extensions for .NET

.NET, as a mostly-statically typed language, has a lot of really good options for serializing statically-typed objects. Protocol Buffers, MessagePack, JSON, BSON, XML, SOAP, and the BCL’s own proprietary binary serialization are all great for CLR objects, where the fields can be determined at runtime.

However, for data that is tabular in nature, there aren’t so many options. In my past two jobs I’ve had a need to serialize data:

  • That is tabular – not necessarily CLR DTOs.
  • Where the schema is unknown before it is deserialized – each data set can have totally different columns.
  • In a way that is streamable, so entire entire data sets do not have to be buffered in memory at once.
  • That can be as large as hundreds of thousands of rows/columns.
  • In a reasonably performant manner.
  • In a way that could potentially be read by different platforms.
  • Into as small a number of bytes as possible.

Protocol Buffers DataReader Extensions for .NET was born out of these needs. It’s powered by Marc Gravell’s excellent Google Protocol Buffers library, protobuf-net, and it packs data faster and smaller than the equivalent DataTable.Save/Write XML:

Usage is very easy. Serializing a data reader to a stream:

DataTable dt = ...;

using (Stream stream = File.OpenWrite("C:\foo.dat"))
using (IDataReader reader = dt.CreateDataReader())
{
    DataSerializer.Serialize(stream, reader);
}

Loading a data table from a stream:

DataTable dt = new DataTable();

using (Stream stream = File.OpenRead("C:\foo.dat"))
using (IDataReader reader = DataSerializer.Deserialize(stream))
{
    dt.Load(reader);
}

It works with IDataReaders, DataTables and DataSets (even nested DataTables). You can download the protobuf-net-data from NuGet, or grab the source from the GitHub project page.


November 1, 2011 | Leave your comment

Failing the tube test

As a Londoner working in the city, I (along with hundreds of thousands of others) catch the tube to work every day. Although you’re typically crammed in a carriage with hundreds of other commuters, the journey is a solitary one, and many passengers turn to their iPhones, iPads and Kindles to pass the time.

For me, it’s a pretty long journey, up to an hour each way to get from Parsons Green to Canary Wharf — a large portion of which is spent deep underground, with no mobile or 3G coverage on my iPhone 4. During this time you can really tell which apps’ data strategies have been properly thought out and designed, and which ones have been hacked together in a hurry.

I’m not going to mention any offending games or apps by name, but these are the sort of things developers should be shot for:

Apps that don’t cache downloaded data for later viewing

Except in special cases (e.g. streaming audio/video, realtime or sensitive content), all downloaded data must be cached locally, and able to be viewed again without an internet connection. Apps that don’t store downloaded content are just crippled web browsers.

Offline App Store reviews

It is unacceptable to nag me to rate your app in the App Store when there is no internet connection. Honestly — don’t ask users to do the impossible.

Repeated connection attempts and repeated error diaglogs

It is unacceptable to repeatedly show error dialogs when there is no internet connection. By all means, try to continuously download updates, but do it silently and in the background. I do not need to be told every 30 seconds that it (still) couldn’t connect.

Locking the UI when checking for updates on startup

When launched, It is unacceptable to block the user in the UI from reading previously-cached data while downloading updated content. By all means, check for updates on start up, but while that is happening, the user must be able to read the previous cached copy, unless they explicitly asked for a refresh from-scratch.

Throwing away cached data then failing to update

It is unacceptable to leave the user with a blank screen because you threw away previously-cached data THEN failed to update. Local data caches should not be cleared until new data has been 100% retrieved successfully and loaded.

Lost form data, please retype

t is unacceptable to ‘lose’ form values and force the user to retype a message when a submit error occurs. Form values must be saved somewhere — either automatically requeued for resubmission, or as a draft that can be edited or cancelled by the user.

Ad priorities

It is unacceptable for an ad-supported app to crash, refuse to start or behave weirdly because it couldn’t connect to the ad server. Are you worried about people using your app, or seeing ads?

Occasionally connected apps

It is unacceptable for any app to refuse to start without an internet connection. Users should always be able to change settings, view and edit previously-cached data and enqueue new requests where possible (to be submitted when a connection is available).

Final thoughts

In general, occasionally-connected computing is a good model for iPhone/Android apps that push/pull data from the Internet. It just needs to be implemented correctly. Like web designers assuming no javascript and using progressive-enhancement to add improved behaviour on top of basic functionality, mobile developers should assume no internet connection is available most of the time — jump at the opportunity to sync when possible, but don’t interrupt the user or assume it won’t fail.

So please, if If you’re a mobile developer, think about us poor Londoners before you deploy your next version!


October 1, 2011 | 1 comment

SqlDropDatabase, SqlCreateDatabase MSBuild tasks

When working with SQL, I often find I need to quickly spin up/tear down local developer database instances – for example, setting up a clean environment for integration tests on the build server, or blowing away test data.

To help make this easier, here are a couple of MSBuild tasks I wrote that allow you to drop an existing database:

<SqlDropDatabase ConnectionString="Server=.\SQLEXPRESS;Database=AdventureWorks;Integrated Security=SSPI;"
                 Database="AdventureWorks" />

… and to create a new (empty) one.

<SqlCreateDatabase ConnectionString="Server=.\SQLEXPRESS;Database=AdventureWorks;Integrated Security=SSPI;"
                   Database="AdventureWorks" />

It’s also sometimes helpful to be able to parse individual individual keys out of a connection string (e.g. the the database name). This can be very tedious with RegexMatch/RegexReplace, so I wrote a separate MSBuild task to do it:

<SqlParseConnectionString ConnectionString="Server=.\SQLEXPRESS;Database=AdventureWorks;Integrated Security=SSPI;">
    <Output PropertyName="myDb" TaskParameter="InitialCatalog" />
    <Output PropertyName="myServer" TaskParameter="DataSource" />
    <Output PropertyName="myTimeout" TaskParameter="ConnectTimeout" />
</SqlParseConnectionString>
<Message Text="Parsed the $(myDb) database on server $(myServer) with timeout = $(myTimeout)." />

You can grab the code here: http://github.com/rdingwall/sqlmsbuildtasks


August 22, 2011 | Leave your comment