Quick-and-dirty unique constraints in Raven DB

Quick-and-dirty unique constraints in Raven DB

Raven DB, like most NoSQL databases designed to be fast and scale out, does not readily support enforcing uniqueness of fields between documents. If two users must not share the same email address or Facebook ID, you cannot simply add a unique constraint for it.

However, Raven DB does guarantee uniqueness in one place: each document must have a unique ID, or a ConcurrencyException will be thrown. By creating dummy documents with say, an email address for an ID, this feature can be effectively exploited to achieve unique constraints in Raven DB.

You can easily add this behaviour to your database if you install the Raven DB UniqueConstraints Bundle, which will enforce uniqueness on updates as well as inserts. However… if the field is immutable and you just want something quick and dirty you can use this: RavenUniqueInserter.cs 🙂

using (var session = documentStore.OpenSession()){    var user = new User                   {                       EmailAddress = "rdingwall@gmail.com",                       Name = "Richard Dingwall"                   };    try    {        new RavenUniqueInserter()            .StoreUnique(session, user, p => p.EmailAddress);    }    catch (ConcurrencyException)    {        // email address already in use    }}

It works by simply wrapping the call to DocumentSession.Store() with another document – in this case, it would also create a document with ID UniqueConstraints/MyApp.User/rdingwall@gmail.com, guaranteed to be unique.

You can grab it here: https://gist.github.com/1950991

Deployment readiness

Deployment readiness

Have you ever seen this development cycle?

  1. Install new build
  2. App doesn’t start, doesn’t log any error why
  3. Go back to source code and add improved logging
  4. Install new build
  5. Read logs
  6. Fix the actual problem (usually some stupid configuration thing)
  7. Repeat steps 1-6 until all bugs are gone

Was it because of your code, or someone else’s? This sort of cycle is time consuming, frustrating, stressful, and makes you look helpless (or like a cowboy) in front of project managers and support staff.

If your app can’t produce logs your infrastructure staff can understand, then your app is not even close to production-readiness. If it doesn’t log clear errors, how can anyone hope to deploy and support it?

Mandatory overloads

Mandatory overloads

Yesterday I read a pretty sensible tweet by Jeremy D. Miller:

(args), you almost always need a Method(Type, args) too.’ width=’550′ height=’292′>

In spirit, I would like to propose another:

When you have a method like Method(IEnumerable<T> items), you should always provide a Method(params T[] items) too.

Access invocation arguments when returning a value with Rhino Mocks

Access invocation arguments when returning a value with Rhino Mocks

My team use Rhino Mocks at work, and as a Moq fan, one of my most missed features is the ability to access invocation arguments when returning a value. For example:

mock.Setup(x => x.Execute(It.IsAny<string>()))    .Returns((string s) => s.ToLower());

Rhino lacks this feature out of the box. It is possible, but pretty ugly:

mock.Stub(x => x.Execute(Arg<string>.Is.Anything))    .WhenCalled(invocation =>        invocation.ReturnValue =             ((string) invocation.Arguments[0]).ToLower());

Today I wrote some quick extensions for Rhino to make it behave a bit more like Moq.

mock.Stub(x => x.Execute(Arg<string>.Is.Anything))    .Return<string, string>(s => s.ToLower());

Grab them here: RhinoExtensions.cs

SQL Notifications: not very practical for large data sets

SQL Notifications: not very practical for large data sets

I ran into an interesting problem today with command-based SQL Notifications. We’ve recently introduced SysCache2 on a project as NHibernate’s level 2 cache provider, because of it’s ability to invalidate regions when underlying data changes, and I already wrote about some issues we had with it. Unfortunately we hit another road block today, this time with the queries for notification themselves.

Here’s the offending config:

<syscache2>    <cacheRegion name="Tasks" relativeExpiration="9999999">        <dependencies>          <commands>              <add command="SELECT ID FROM Task" connectionName="Xyz" />              <add command="SELECT ID FROM TaskUser" connectionName="Xyz" />              ...          </commands>      </dependencies>    </cacheRegion></syscache2>

Can you spot the bug here that will result in an unhandled exception in ISession.Get()?

Nope? Neither could I for most of this afternoon.

Query for notification timeout errors

The problem is that the Task and TaskUser tables have four and six million rows respectively. SELECT ID from TaskUser takes over 90 seconds to execute. At this speed, by the time we have re-subscribed to the query notification, new data would have already been written by other users.

Depending on your exact scenario, you have several options:

  1. Refactor the database schema to remove rows from these tables that aren’t likely to change.
  2. Accept the slow query subscription.
  3. Enable caching, but ignore changes from these tables.
  4. Limit the command to only cover rows that are likely to change, e.g. SELECT ID FROM Task WHERE YEAR(DueDate) = 2009.
  5. Disable level 2 cache for these entities entirely.

Accepting the slow query subscription only works if you have very infrequent writes to the table, where it is worth caching rows for a long time.

For us, the high frequency of writes to these tables means that we would be invalidating the cache region all the time, and limited sharing of data between users doesn’t give much benefit in caching. Also blocking a HTTP request thread for 90 seconds is not feasible. So we chose the last option and now don’t bother caching these tables at all.

By the way, while working on this problem, I submitted my first patch to NH Contrib that adds a commandTimeout setting to SysCache2.

Back to basics: exception handling in .NET

Back to basics: exception handling in .NET

Following on from my article on good source control check-in habits, I’ve got a few tips I’d like to share on exceptions in .NET. These are basically all responses to things I’ve seen done in production code before.

Write a method that throws — don’t return success/failure codes

Here’s a C function I wrote a couple of years ago for an application a couple of years back that listens for messages via UDP:

int bind_udp_server(struct udp_server * server){        server->socket_descriptor = socket(AF_INET, SOCK_DGRAM, 0);        if (-1 == server->socket_descriptor)                return -1; /* bail out */                if (!set_non_blocking(server->socket_descriptor))                return -1;  /* bail out */                ...                if (-1 == bind(server->socket_descriptor,                        (struct sockaddr *) &server_addr,                        sizeof(server_addr)))                return -1; /* bail out */                return 0; /* socket bound successfully */}

C has no formal mechanism for functions to return to their caller when exceptional behaviour is encountered. Instead they typically return an integer status code (e.g. 1 or 0) to indicate success or failure.

Clients using these functions must check for errors after every call. Because of this:

  • Your real application code can be much harder to follow when it’s hidden amongst all the error handling code (which probably only executes in rare circumstances anyway).
  • If you accidentally omit a return code check and something fails, your program will continue happily executing as if nothing happened. Problems might not become apparent until several operations later, making it much harder to track down the original issue.
  • Because you have to check for errors the whole way (and at different levels), there will be a lot of duplicated code, violating the DRY principle.
  • Error codes can overlap with legitimate return values, making it hard to indicate when an actual error has occurred. This is known as the semipredicate problem.
  • Sometimes things happen that are so catastrophic, you don’t even bother with a strategy for trying to cope with them. For example, it might be perfectly acceptable to die if malloc() fails, unless you’re fully equipped to keep your program running when the machine is out of memory.

Languages like .NET are free from these worries because they use a different strategy: assume everything will always succeed (try), and handle any problems later in one single location (catch).

public void Foo(){    try    {        DoSomething();        DoSomethingElse();        DoThirdThing();    }    catch    {        // bail out    }}

This lets you focus on the success case (the one that actually matters), instead of cluttering your code up with error handling.

Unfortunately, it seems a lot of .NET developers only think of exceptions as things you need to catch when calling the Base Class Library, and shy away from throwing and catching their own exceptions. Instead, I see kludges like:

if (!DoSomething())   // bail out

Or even this one, which simulates Exception.Message:

string errorMessage = DoSomething();if (errorMessage != "")   // bail out

The golden rule is, if your method encounters a condition that prevents it from achieving its intended purpose, it should throw an exception.

Re-throwing exceptions

Do you know the difference between the two following code snippets?

catch (Exception ex){        // log the exception        ...                // re-throw        throw ex;}catch (Exception ex){        // log the exception        ...                // re-throw        throw;}

The first example resets ex’s stack trace to the current location. This is fine when you’re throwing a brand new exception, but not so good if you’re just passing one along — you’ll lose the stack trace telling you where it originated from.

To preserve the stack trace, just use “throw” by itself with no argument, as in the second example.

Use parameterless catch/throw if you don’t care about the details

You don’t have to specify any parameter name in your catch block if you don’t need the exception object itself:

catch (SqlException){        // rollback transaction        // rethrow        throw;}

This is handy for eliminating that “variable ‘ex’ is not used” compiler warning. In fact, if you don’t care about the type, you can get rid of the brackets altogether:

catch // anything{        // handle the exception}

But be careful with one — in 99% of situations, we do care what the type is.

Only catch exceptions you can handle

Imagine you are working on some code that tries to parse some user input as an integer. If it’s invalid (e.g. not a number), we’ll display an error message.

int value = 0;try{        // forget about TryParse() for the moment :)        value = Int32.Parse(someUserInput);}catch (Exception ex){        // invalid input}

What’s wrong with this? We’ve specified that this catch block is capable of handling any type of exception, no matter how severe or unrelated. They’ll all be interpreted as validation errors — OutOfMemoryException, ThreadAbortException — even NullReferenceException (which might indicate an actual bug in the code).

In reality, however, all we really care about is a very narrow subset of exceptions: those that indicate the number couldn’t be parsed correctly.

int value = 0;try{        value = Int32.Parse(someUserInput);}catch (FormatException ex){        // invalid input}

Only catch exceptions you anticipate could be thrown as a consequence of the code within your try block. Don’t try to handle anything outside that window, unless you have a specific strategy to deal with it.

Multiple catch blocks

Here’s an example of some code I saw at my last job, that did different things depending on the type of exception:

try{        ...}catch (Exception ex){        if (ex is FileNotFoundException)                // do stuff        else if (ex is IOException)                // do stuff        else                // ???}

Yuck — using reflection and a big if-statement to differentiate types is generally a sign of bad code in .NET (or any OO language), and catch blocks are no exception (ba-dum-psh). Instead, use multiple catch blocks with overloads for different exception types. At runtime, the .NET framework will call the first matching catch block, so you need to specify them in most-to-least specific order:

try{        ...}catch (FileNotFoundException ex){        // file not found (subclass of IOException)}catch (IOException ex){        // some other file-related error}

Inner exceptions

In .NET, low-level exceptions can be wrapped up into a higher-level context that makes more sense in the grand scheme of things. For example, a CouldNotOpenDocumentException might be caused by a FileNotFoundException. This is reflected in the Exception.InnerException property, and inner exception each is complete with its own error message and stack trace.

Here’s some code I saw once for unwrapping them, via a loop:

try{        ...}catch (Exception ex){        // get exception details        string errMessage = ex.Message + ex.StackTrace;        Exception innerException = ex.InnerException;        while (innerException != null)        {                errMessage += innerException.StackTrace;                innerException = innerException.InnerException;        }        // log errMessage}

This code is pretty redundant (and ugly), as the .NET framework will do this for you. Exception.ToString() will print out the exception’s message and stack trace, then call itself on the inner exception:

try{        ...}catch (Exception ex){        // get full stack trace        string errMessage = ex.ToString();        // log errMessage}

This will return a complete dump of the inner exception tree, producing a message like:

Could not get OSM Status menu item status. ---> System.NullReferenceException: Object reference not set to an instance of an object.   at SMS.Common.GetOSMMenuStatus(Space oSpace) in D:DevXYZSourceXYZ.WebsiteCodeCommonCommon.cs:line 1051   --- End of inner exception stack trace ---   at SMS.Common.GetOSMMenuStatus(Space oSpace) in D:DevXYZSourceXYZ.WebsiteCodeCommonCommon.cs:line 1058   at SMS.Common.AppendSMSNav(Space& oSpace, DateTime& dtSMSDate, DateTime& dtStartDate, DateTime& dtEndDate) in D:DevXYZSourceXYZ.WebsiteCodeCommonCommon.cs:line 837

Another tip for is finding the root inner exception — the original cause. The developers from the previous example chose to drill through InnerExceptions in a loop until they reached the bottom (null). An easier way would just be to call Exception.GetBaseException().

Defensive coding

To write robust and idiot-proof code, you have to assume people are going to try to break it. This means checking input parameters are valid at the start of every method.

You should continue this habit all throughout internal application methods as well, so bad data gets stopped short at the point of origin, instead of trickling through and causing problems later on.

public class ProjectsController : Controller{    IProjectRepository projectRepository;        public ProjectsController(IProjectRepository projectRepository)    {        // better to have an exception here, when the controller is constructed...        if (projectRepository == null)            throw new ArgumentNullException("projectRepository");        this.projectRepository = projectRepository;    }    public ActionResult Detail(string name)    {        // ...than down here, when an action gets called.        Project p = this.projectRepository.GetByName(name);        return View(p);    }    ...}

Note that Microsoft’s Spec# contracts will make these sorts of checks much easier in future, with built-in syntax for not-nullable parameters:

public ProjectsController(IProjectRepository! projectRepository){    // method will not be entered if projectRepository is null    this.projectRepository = projectRepository;}

Plus these rules are enforced at compile time as well! So the following line would not build:

ProjectsController controller = new ProjectsController(null); // error

See Microsoft’s article on Best Practices for Handling Exceptions for more .NET exception tips.

T-SQL equality operator ignores trailing spaces

T-SQL equality operator ignores trailing spaces

Today I discovered something new about SQL Server while debugging an application. T-SQL’s equality operator ignores any trailing spaces when comparing strings. Thus, these two statements are functionally equivalent:

SELECT * FROM Territories WHERE TerritoryDescription = 'Savannah'SELECT * FROM Territories WHERE TerritoryDescription = 'Savannah         '

When executed against the Northwind database included with SQL Server they both return the same row, which has no trailing spaces after its TerritoryDescription.

TerritoryID          TerritoryDescription                               RegionID    -------------------- -------------------------------------------------- ----------- 31406                Savannah                                           4(1 row(s) affected)

This behaviour isn’t immediately obvious from the offset, and isn’t mentioned on the MSDN entry.

To avoid this problem, you should use LIKE instead:

SELECT * FROM Territories WHERE TerritoryDescription LIKE 'Savannah         '

When comparing strings with LIKE all characters are significant, including trailing spaces.

Update: a co-worker discovered yesterday that using LIKE in T-SQL JOINs doesn’t use indices in the same way that the equals operator does. This can have a significant impact on performance. Be warned!