Parallel vs serial javascript async tests

If you’re writing tests for a javascript web application, sooner or later you’ll need to be aware of whether you are using a parallel or serial test runner.

Parallel
  • How it works: Start all async tests at once and wait until they have all completed.
  • Implications: Callbacks execute in the order they return, so race conditions are possible.
  • Examples: Jasmine, Vows

Serial
  • How it works: Wait for each test to complete before starting the next one.
  • Implications: Tests are isolated, but take much longer to run.
  • Examples: QUnit, Mocha

I discovered this last week while porting all our tests to a different framework. Under QUnit, everything was green, but under Jasmine, most of the end-to-end tests (that exercise our whole app, simulating user actions via the DOM down to a live API and back) were failing with all sorts of weird errors.

The problem? Unlike our regular tests, the end-to-end tests run the whole application at once, using the entire DOM, plus a globally-scoped event aggregator.

In QUnit all the tests ran fine, because they were completely isolated from one another. But in Jasmine they were running concurrently, all racing to update the DOM at the same time, and publishing all sorts of events that were being picked up by other tests. We fixed the problem by switching to Mocha instead… but definitely something to watch out for!

Turning a case-sensitive string into a non-case-sensitive string

Here’s a trick I recently picked up when dealing with document databases. Say you need to save objects that have IDs that only differ by case, but you’re using a document DB like Raven where keys are not case sensitive. In Google Books for example, oT7wAAAAIAAJ is an article in Spanish from a Brazilian journal, but OT7WAAAAIAAJ is a book about ghosts. RavenDB would not be able to recognize that these are two different IDs — so attempting to store them would result in a single document that gets overwritten each time. What can you do?

If it were the other way around — database is case sensitive, app is not — simply discarding the case information by converting everything to a common lowercase representation (a lossy transformation) would do the trick.

Our situation is a bit harder, however. We somehow need to represent the key as a string that preserves each letter and whether or not it was uppercase. You could write a custom converter for this (maybe using special escape characters to indicate uppercase letters)… but a much easier way is simply to convert it to Base32.

Why Base32? Using Base64 would produce shorter strings (it’s a more efficient encoding), but it encodes data using both upper- and lower-case characters, so you are still at risk of collisions. Base32, on the other hand, only uses uppercase, so it is safe to use for a case-insensitive key-value store.

Hex would work too (digits 0-9 plus the letters A-F, which can all be kept uppercase), but it would need even more space to do so.
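
As far as I know .NET doesn’t ship with a Base32 encoder, but writing one only takes a few lines. Here’s a minimal sketch (RFC 4648 alphabet, with the trailing ‘=’ padding omitted since a document key doesn’t need it):

using System.Text;

public static class Base32
{
    private const string Alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    // Encodes arbitrary bytes using only A-Z and 2-7, so the result
    // survives a round-trip through a case-insensitive key-value store.
    public static string Encode(byte[] data)
    {
        var output = new StringBuilder();
        int buffer = 0, bitsInBuffer = 0;

        foreach (var b in data)
        {
            buffer = (buffer << 8) | b;
            bitsInBuffer += 8;

            // Emit a character for every complete 5-bit group.
            while (bitsInBuffer >= 5)
            {
                bitsInBuffer -= 5;
                output.Append(Alphabet[(buffer >> bitsInBuffer) & 31]);
            }
        }

        // Flush any remaining bits, zero-padded to a 5-bit group.
        if (bitsInBuffer > 0)
            output.Append(Alphabet[(buffer << (5 - bitsInBuffer)) & 31]);

        return output.ToString();
    }
}

// Usage: Base32.Encode(Encoding.UTF8.GetBytes("oT7wAAAAIAAJ")) and
// Base32.Encode(Encoding.UTF8.GetBytes("OT7WAAAAIAAJ")) now produce two
// distinct keys that differ in more than just case.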

Mini book review: Recipes With Backbone

Disclaimer: this is my first-ever book review on this blog! I’m not a javascript/HTML developer by trade, so it’s not going to be a critical review or anything — just how I found it.

I’ve worked on a few web applications over the years, and I’m familiar with at least a couple of other MVC-style frameworks, but it’s all been .NET, and recently, all on the desktop. Only in the past few months have I attempted my first full-on browser-based javascript application, using the fantastic Backbone.js. The journey so far has been pretty bumpy, but I’m finally getting to the point where I’m comfortable enough with Backbone (and the javascript language itself) to really start being productive. Recently, though, I’ve started hitting some complex situations that I’m not sure how to implement, and that I’m struggling to find guidance for online. I’m also always on the lookout for tips confirming that the code I’ve already written is on the right track.

I stumbled across Recipes With Backbone by chance, via a blog post by one of the authors discussing how to implement default routes. I was a bit hesitant at first when I couldn’t find many reviews of it, but ‘advanced Backbone’ is a pretty new subject area (even on blogs) and it was cheap, so I bought a copy. I’m glad to say I was very satisfied with my purchase.

The book has 152 pages, and it took me a day to read on my Kindle. It covers:

  1. Writing Client Side Apps (Without Backbone)
  2. Writing Backbone Applications
  3. Namespacing
  4. Organizing with Require.js
  5. View Templates with Underscore.js
  6. Instantiated View
  7. Collection View (Excerpt)
  8. View Signature
  9. Fill-In Rendering
  10. Actions and Animations
  11. Reduced Models and Collections
  12. Non-REST Models
  13. Changes Feed (Excerpt)
  14. Pagination and Search
  15. Constructor Route
  16. Router Redirection
  17. Evented Routers
  18. Object References in Backbone
  19. Custom Events
  20. Testing with Jasmine

The book is based around a hypothetical online calendar application, analogous to Google Calendar. I use GCal a lot so it was nice to have familiar examples — some books use made-up applications that are unfamiliar or too vague (even a blog or a todo list could be designed in many different ways). But when examples are based off a specific real-world application like GCal or Twitter, you already know how it’s supposed to behave, and it simply becomes a matter of mapping that behaviour to the code examples on the page.

The book is structured as a series of enhancements to the calendar — starting with a plain jQuery $.getJSON() on a server-generated HTML page, just like we used to write in 2008. This was a good familiar starting point for me, before leaping into a basic Backbone structure and then refactoring it and adding more advanced behaviour.

The twenty chapters are each presented in a problem/solution format. They are very brief, but I really liked this — they are clear, succinct, get straight to the point, and are very readable. Code is explained in 3-4 line chunks at a time, which fit well on my Kindle (which has a small screen and no colors). Overall I thought it was very good value for the time spent reading.

For me, the book was valuable because:

  • It corrected some things I thought I already knew 🙂 like $(el) injection
  • It showed patterns for dealing with things like view deactivation and dangling event references (as a WPF developer this stuff gives me nightmares)
  • It showed how simply and elegantly difficult things like paging could be implemented
  • It suggested some neat things I hadn’t even thought about yet e.g. constructor routing

The book also includes a chapter on Require.js, which it presents early on as a foundation — chapter four, right after namespacing. I’m a fan of Require.js and have no doubt it will soon become a standard part of any serious JS development. But right now it’s not widely supported and — in my experience — it can cause a lot more problems than it solves. Unless you’re totally comfortable accepting that you’ll probably have to hack a lot of third-party jQuery plugins just to make them work, I would not recommend Require.js to beginners until the rest of the community catches up.

Regardless, Recipes With Backbone is still the single best text I’ve read on Backbone (aside from the Backbone docs themselves), and I’m looking forward to seeing what else these guys write. I would definitely recommend it to anyone wanting to take their Backbone skills to the next level. Go check it out and read the sample chapters here: http://recipeswithbackbone.com

Returning camelCase JSON from ASP.NET Web API

Loving ASP.NET Web API but not loving the .NET-centric PascalCase JSON it produces?

// a .NET class like this...
public class Book
{
    public int NumberOfPages { get; set; }
    public string Author { get; set; }
}

// ... should be serialized into JSON like this
{
    "numberOfPages": 42,
    "author": "JK Rowling"
}

Luckily this is quite easy to fix with a custom formatter. I started with Henrik Nielsen’s custom Json.NET formatter and changed it slightly to use a camelCase resolver and also indent the JSON output (just for developers; we would turn this off once the app was deployed). You can grab the code here: https://gist.github.com/2012642
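
At its heart, the formatter is just a couple of Json.NET settings. Here’s a standalone sketch of the same idea (reusing the Book class from above, rather than the actual gist code):

using Newtonsoft.Json;
using Newtonsoft.Json.Serialization;

// CamelCasePropertyNamesContractResolver maps PascalCase .NET property
// names to camelCase JSON names; Formatting.Indented pretty-prints the
// output (handy for developers, off in production).
var settings = new JsonSerializerSettings
{
    ContractResolver = new CamelCasePropertyNamesContractResolver()
};

var json = JsonConvert.SerializeObject(
    new Book { NumberOfPages = 42, Author = "JK Rowling" },
    Formatting.Indented,
    settings);

// => {
//      "numberOfPages": 42,
//      "author": "JK Rowling"
//    }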

Then just swap out the default JSON formatter in your global.asax.cs:

var config = GlobalConfiguration.Configuration;

// Replace the default JsonFormatter with our custom one
var index = config.Formatters.IndexOf(config.Formatters.JsonFormatter);
config.Formatters[index] = new JsonCamelCaseFormatter();

RavenDB Includes: much simpler than you think

Here’s something I’ve been struggling to get my head around over the past few days as I’ve been getting deeper into RavenDB. The example usage from their help page on pre-fetching related documents:

var order = session.Include<Order>(x => x.CustomerId)
    .Load("orders/1234");

// this will not require querying the server!
var cust = session.Load<Customer>(order.CustomerId);

That doesn’t look too difficult at first glance – it looks pretty similar to futures in NHibernate, which I’ve used plenty of times before. But hang on. The first line instructs RavenDB to preload a second object behind the scenes, but how does it know that object is specifically a Customer?

session.Include<Order>(x => x.CustomerId)

At first I thought there should be a Customer type argument somewhere. Something like this:

// WRONG
var order = session.Include<Customer>(...)
    .Load<Order>(...)

Otherwise how can RavenDB know I want a Customer and not some other type? Maybe the Order.CustomerId property is actually stored in RavenDB as a sort of strongly-typed weak reference to another object? Maybe the order returned is some sort of runtime proxy object with referencing metadata built in?

No no no. It is much simpler than that. Let’s take a step back.

In a traditional SQL database, you need both the type (table name) and ID to locate an object. The ID alone is not enough, because two differently typed objects may have the same ID (but in different tables). So you need the type as well.

In RavenDB, you only need the ID. This is possible because RavenDB does not have the concept of types – under the covers it’s effectively just one big key-value store of JSON strings which get deserialized by the client into different CLR types. Even though the RavenDB .NET client is strongly typed (Customers and Orders), the server has no awareness of the different types stored within it.*

This is what makes includes work. Raven doesn’t need to know the type of the document being included, it’s just another chunk of JSON (which could be deserialized as anything). And the ID can only point to exactly one document because all documents are stored in the same single key-value store. So, back to the original example:

// This line instructs RavenDB to preload the JSON document
// that has an ID == x.CustomerId
var order = session.Include<Order>(x => x.CustomerId)
    .Load("orders/1234");

// This line accesses the previously-loaded JSON document and
// deserializes it as a Customer.
var cust = session.Load<Customer>(order.CustomerId);

That makes much more sense now.

* actually RavenDB does keep metadata about the CLR type, but for unrelated purposes.

Yet another reason to love REST

There are a lot of reasons why you should love REST. It’s fast, simple, stateless, and easy to debug. This makes it absolutely fantastic to test against.

REST APIs get you great end-to-end test coverage

Line for line, an end-to-end system test covers a lot more code than a deep, class-level unit test. End-to-end tests also more closely simulate a user’s actions, exercising realistic paths through the scenarios that need to be verified. So for the most bang for your buck, you want to be testing at the outermost level possible.

The ultimate level here is testing through UI automation — simulating clicks and looking for responses on screen, as a real user would. However, although UI automation libraries are improving, UI tests still tend to be very complex to write and are often brittle to things like positioning/layout changes. A public API provides a fantastic alternative ‘hooking’ point where application behaviour can be invoked from external code without involving the UI, while still (if the API is well designed to keep clients fast and lightweight) mirroring the fields and actions on screen pretty closely.

So testing against an API avoids the complexity of UI testing but still gives the system a thorough end-to-end workout.

Tests written against REST APIs aren’t brittle

Although the back-end behind your API may change considerably, a public API must continue to work in exactly the same way or you’ll break all your clients and have a lot of very angry users. So as well as covering more lines of code for less effort, tests written against a public API won’t be as brittle, and won’t need updating as often, as ‘subcutaneous’ tests written against internal code structures.

Everyone can connect to a REST API

If you had to choose a protocol for interacting with your application, you couldn’t find a much simpler one than REST. Stateless HTTP, verbs, status codes and JSON — 99% of the time you need nothing but a browser to debug it, and your tests can be written in a completely different language from the API back-end and still be perfectly understandable.

Compare this to previous ‘standard’ HTTP protocols that were often anything but understandable – even with things like service discovery and WSDLs (designed to help developers), debugging SOAP XML mismatches between a Java Metro client and a .NET WCF service is one personal hell I hope never to have to relive.

Not just for big APIs

Remember you don’t need a formal public API like Twitter or Facebook to realise the testing benefits mentioned here. Pretty much any sort of REST endpoint will do. The JSON handlers powering your single-page web app would be a great entry point to get started testing underlying behaviour, for example.
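
For example, here’s what a minimal end-to-end test against such a JSON handler might look like (a sketch using NUnit and HttpClient; the /api/books endpoint, URL and names are all made up):

using System.Net;
using System.Net.Http;
using NUnit.Framework;

[TestFixture]
public class BooksApiTests
{
    // Hypothetical endpoint; point this at a running instance of your app.
    private const string BaseUrl = "http://localhost:8080";

    [Test]
    public void Getting_all_books_should_return_OK_with_json()
    {
        using (var client = new HttpClient())
        {
            var response = client.GetAsync(BaseUrl + "/api/books").Result;

            // Plain HTTP status codes and content types make assertions
            // trivial, no matter what stack the server is written in.
            Assert.AreEqual(HttpStatusCode.OK, response.StatusCode);
            Assert.AreEqual("application/json",
                response.Content.Headers.ContentType.MediaType);
        }
    }
}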

Quick-and-dirty unique constraints in Raven DB

Raven DB, like most NoSQL databases designed to be fast and scale out, does not readily support enforcing uniqueness of fields between documents. If two users must not share the same email address or Facebook ID, you cannot simply add a unique constraint for it.

However, Raven DB does guarantee uniqueness in one place: each document must have a unique ID, or a ConcurrencyException will be thrown. By creating dummy documents with, say, an email address for an ID, this feature can be exploited to achieve unique constraints in Raven DB.

You can easily add this behaviour to your database if you install the Raven DB UniqueConstraints Bundle, which will enforce uniqueness on updates as well as inserts. However… if the field is immutable and you just want something quick and dirty you can use this: RavenUniqueInserter.cs 🙂

using (var session = documentStore.OpenSession())
{
    var user = new User
        {
            EmailAddress = "rdingwall@gmail.com",
            Name = "Richard Dingwall"
        };

    try
    {
        new RavenUniqueInserter()
            .StoreUnique(session, user, p => p.EmailAddress);
    }
    catch (ConcurrencyException)
    {
        // email address already in use
    }
}

It works by simply wrapping the call to DocumentSession.Store() and storing a second, dummy document alongside your entity – in this case, one with the ID UniqueConstraints/MyApp.User/rdingwall@gmail.com, which is guaranteed to be unique.
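
Under the hood, the idea boils down to something like this (a simplified sketch; the real implementation is in the gist below):

using System;
using Raven.Client;

public class RavenUniqueInserter
{
    // Stores the entity plus a dummy 'constraint' document whose ID
    // embeds the unique value. If that ID already exists, SaveChanges()
    // throws a ConcurrencyException instead of overwriting it.
    public void StoreUnique<T>(IDocumentSession session, T entity,
        Func<T, string> uniqueProperty)
    {
        var constraint = new UniqueConstraintDocument
        {
            Id = string.Format("UniqueConstraints/{0}/{1}",
                typeof(T).FullName, uniqueProperty(entity))
        };

        // Optimistic concurrency makes Raven reject any write that
        // would clobber an existing document with the same ID.
        session.Advanced.UseOptimisticConcurrency = true;
        session.Store(constraint);
        session.Store(entity);
        session.SaveChanges();
    }

    private class UniqueConstraintDocument
    {
        public string Id { get; set; }
    }
}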

You can grab it here: https://gist.github.com/1950991

Fast empty Raven DB sandbox databases for unit tests

Say you have some NUnit/xUnit/MSpec tests that require a live Raven DB instance. Specifically:

  • You do not want your test to be affected by any existing documents, so ideally the Raven DB database would be completely empty.
  • Your test may span multiple document sessions, so doing it all within a single transaction and rolling it back is not an option.
  • You want your tests to run fast.

What are your options?

Raven DB currently has no DROP DATABASE or equivalent command. The recommended method is simply to delete Raven DB’s ServerData or ServerTenants directories, but this requires restarting the Raven DB service (expensive). Also any live document stores may throw an exception at this point.

Multi-tenanting

One option that Raven DB makes very cheap, however, is spinning up new database instances (aka tenants). In fact all you need to do is specify a new DefaultDatabase and the document store will spin a new database up for you. For example:

var store = new DocumentStore
    {
        Url = "http://localhost:8080",
        DefaultDatabase = "MyAppTests-" + DateTime.Now.Ticks
    };
store.Initialize();

// now you have an empty database!

Pretty easy huh? I wrote a little test helper to manage these sandbox databases, stores and sessions. Here’s how you would use it in a tenant-per-fixture test:

[TestFixture]
public class When_doing_something
{
    [TestFixtureSetUp]
    public void SetUp()
    {
        RavenDB.SpinUpNewDatabase();

        using (var session = RavenDB.OpenSession())
        {
            // insert test data
        }
    }

    [Test]
    public void It_should_foo()
    {
        using (var session = RavenDB.OpenSession())
        {
            // run tests
        }
    }
}
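
The helper itself boils down to something like this (a simplified sketch of what it might look like; the gist linked below has the full version):

using System;
using Raven.Client;
using Raven.Client.Document;

public static class RavenDB
{
    private static IDocumentStore store;

    // Points the shared document store at a fresh, uniquely-named
    // tenant database. Raven creates the tenant on first use.
    public static void SpinUpNewDatabase()
    {
        if (store != null)
            store.Dispose();

        store = new DocumentStore
            {
                Url = "http://localhost:8080",
                DefaultDatabase = "MyAppTests-" + DateTime.Now.Ticks
            }.Initialize();
    }

    public static IDocumentSession OpenSession()
    {
        return store.OpenSession();
    }
}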

You can grab it here as a gist on Github: https://gist.github.com/1940759.

Note that if you use this method, a number of sandbox databases will (of course) build up over time. You can clean these up by simply deleting the Raven DB data directories. (See the gist for an example batch file you can throw in your source control to do this.)

Object-oriented basics: single object or collection scope?

Here is a contrived example of a common SOLID violation you might see. Can you spot it?

class Mp3Encoder : IMp3Encoder
{
    public void Encode(IEnumerable<string> wavFiles)
    {
        foreach (var wavFile in wavFiles)
        {
            var outputFile = /* create output file */;

            while (/* blocks remaining... */)
            {
                var buffer = /* read block */;
                var encoded = /* encode wav block as MP3 */;
                /* write block to output file */;
            }

            /* write ID3 trailing header */;
        }
    }
}

Except in trivially simple cases, there should always be a class boundary between code that coordinates a collection and code that performs actions on a single object.

The class above violates this rule — it takes on collection-level responsibilities as well as single-object responsibilities. It needs to be broken into two classes: one for encoding a single file, and one for coordinating the group.
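
Here’s roughly how that split might look (with a hypothetical WavFileEncoder class, keeping the pseudocode placeholders from above):

// Single-object scope: everything needed to encode one file.
class WavFileEncoder
{
    public void Encode(string wavFile)
    {
        var outputFile = /* create output file */;

        while (/* blocks remaining... */)
        {
            var buffer = /* read block */;
            var encoded = /* encode wav block as MP3 */;
            /* write block to output file */;
        }

        /* write ID3 trailing header */;
    }
}

// Collection scope: nothing left here but coordination.
class Mp3Encoder : IMp3Encoder
{
    private readonly WavFileEncoder fileEncoder = new WavFileEncoder();

    public void Encode(IEnumerable<string> wavFiles)
    {
        foreach (var wavFile in wavFiles)
            fileEncoder.Encode(wavFile);
    }
}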

This rule is a form of the Single Responsibility Principle. For example:

Collection-scoped class responsibilities:
  • Coordinating ‘before all’ and ‘after all’ actions
  • Looping through items
  • Maintaining shared state (counting, accumulating etc)

Single object-scoped class responsibilities:
  • Coordinating ‘before each’ and ‘after each’ actions
  • Performing actions on an item

If you ignore this collection-vs-single-object contextual boundary, your classes will become messes of nested procedural code — especially when different behaviour is required for each item in the collection. Your classes will be that much harder to unit test, and you won’t easily be able to re-use them in single-object scenarios.

Subscribing to NuGet package updates via RSS

Update – check out nugetfeed.org for a more polished way to subscribe to NuGet package updates in your RSS reader!

Just a quick tip I found today – if you’re a NuGet package author and want to be notified when updates are published for upstream packages you depend on, you can do so by subscribing to an OData query in an RSS reader.

For example, in order to keep protobuf-net-data in sync with the latest protobuf-net, I need to publish a new package rebuilt against the latest protobuf-net every time they release a new version. For this I subscribed to the following URL in Google Reader:

http://packages.nuget.org/v1/FeedService.svc/Packages()?$filter=Id eq 'protobuf-net'

Matt Wrock has a few more advanced examples of this, using ifttt.com to orchestrate sending emails, tracking package downloads and more.
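
And if you’d rather poll the feed from code than from a feed reader, the same URL is just an Atom feed. Here’s a quick sketch using .NET’s built-in SyndicationFeed support (assuming the feed stays Atom-formatted):

using System;
using System.ServiceModel.Syndication;
using System.Xml;

// Load the same OData query as an Atom feed and list recent entries.
var url = "http://packages.nuget.org/v1/FeedService.svc/Packages()" +
          "?$filter=Id eq 'protobuf-net'";

using (var reader = XmlReader.Create(url))
{
    var feed = SyndicationFeed.Load(reader);
    foreach (var item in feed.Items)
        Console.WriteLine("{0} ({1:d})", item.Title.Text,
            item.LastUpdatedTime);
}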