RavenDB Includes are much simpler than you think

Here’s something I’ve been struggling to get my head around over the past few days as I’ve been getting deeper into RavenDB. Take the example usage from their help page on pre-fetching related documents:

var order = session.Include<Order>(x => x.CustomerId)
    .Load("orders/1234");

// this will not require querying the server!
var cust = session.Load<Customer>(order.CustomerId);

That doesn’t look too difficult at first glance – it looks pretty similar to futures in NHibernate, which I’ve used plenty of times before. But hang on. The first line instructs RavenDB to preload a second object behind the scenes, but how does it know that object is specifically a Customer?

session.Include<Order>(x => x.CustomerId)

At first I thought there should be a Customer type argument somewhere. Something like this:

// WRONG
var order = session.Include<Customer>(...)
    .Load<Order>(...)

Otherwise, how can RavenDB know I want a Customer and not some other type? Maybe the Order.CustomerId property is actually stored in RavenDB as some sort of strongly-typed weak reference to another object? Maybe the order returned is some sort of runtime proxy object with referencing metadata built in?

No no no. It is much simpler than that. Let’s take a step back.

In a traditional SQL database, you need both the type (table name) and ID to locate an object. The ID alone is not enough, because two differently typed objects may have the same ID (but in different tables). So you need the type as well.

In RavenDB, you only need the ID. This is possible because RavenDB does not have the concept of types – under the covers it’s effectively just one big key-value store of JSON strings which get deserialized by the client into different CLR types. Even though the RavenDB .NET client is strongly typed (Customers and Orders), the server has no awareness of the different types stored within it.*
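To convince myself, here is a minimal sketch of that idea. It is not RavenDB code: a plain Dictionary stands in for the server, System.Text.Json stands in for the client's serializer, and the Order and Customer classes and the sample documents are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Toy stand-in for the server: one flat map from document ID to raw JSON.
// A lookup needs only the ID; no table or type name is involved.
var store = new Dictionary<string, string>
{
    ["orders/1234"] = "{\"CustomerId\":\"customers/55\"}",
    ["customers/55"] = "{\"Name\":\"Ada\"}",
};

// The caller, not the store, decides which CLR type the JSON becomes.
var order = JsonSerializer.Deserialize<Order>(store["orders/1234"])!;
var customer = JsonSerializer.Deserialize<Customer>(store[order.CustomerId])!;

Console.WriteLine(customer.Name); // prints "Ada"

// Hypothetical document classes for this sketch.
public class Order
{
    public string CustomerId { get; set; } = "";
}

public class Customer
{
    public string Name { get; set; } = "";
}
```

Nothing in the dictionary says that "customers/55" is a Customer. The type only appears at the moment the client deserializes the JSON.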

This is what makes includes work. Raven doesn’t need to know the type of the document being included; it’s just another chunk of JSON (which could be deserialized as anything). And the ID can only point to exactly one document, because all documents are stored in the same single key-value store. So, back to the original example:

// This line instructs RavenDB to preload the JSON document
// that has an ID == x.CustomerId
var order = session.Include<Order>(x => x.CustomerId)
    .Load("orders/1234");

// This line accesses the previously-loaded JSON document and
// deserializes it as a Customer.
var cust = session.Load<Customer>(order.CustomerId);

That makes much more sense now.
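The same mechanism can be mimicked with a toy model. This is only a sketch under my own assumptions, not how the RavenDB client is implemented: ToyServer, ToySession, LoadWithInclude and the sample documents are all hypothetical names invented here. The server follows an ID it finds inside the JSON and the session caches whatever comes back, so the second Load never hits the server.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

var server = new ToyServer();
var session = new ToySession(server);

// One round trip: the order plus whatever document its CustomerId names.
var order = session.LoadWithInclude<Order>("orders/1234", nameof(Order.CustomerId));

// Served from the session cache, so the request count stays at 1.
var cust = session.Load<Customer>(order.CustomerId);

Console.WriteLine($"{cust.Name}, requests: {server.Requests}"); // prints "Ada, requests: 1"

// Hypothetical server: a flat ID -> JSON map that counts round trips.
public class ToyServer
{
    private readonly Dictionary<string, string> docs = new()
    {
        ["orders/1234"] = "{\"CustomerId\":\"customers/55\"}",
        ["customers/55"] = "{\"Name\":\"Ada\"}",
    };

    public int Requests { get; private set; }

    public string Get(string id)
    {
        Requests++;
        return docs[id];
    }

    // Returns the requested document and the one whose ID is stored in
    // the given property. The server reads a string from the JSON and
    // looks it up; it never learns what CLR type either document is.
    public Dictionary<string, string> GetWithInclude(string id, string includePath)
    {
        Requests++;
        var result = new Dictionary<string, string> { [id] = docs[id] };
        using var parsed = JsonDocument.Parse(docs[id]);
        var includedId = parsed.RootElement.GetProperty(includePath).GetString()!;
        result[includedId] = docs[includedId];
        return result;
    }
}

// Hypothetical session: caches raw JSON by ID, deserializes on demand.
public class ToySession
{
    private readonly ToyServer server;
    private readonly Dictionary<string, string> cache = new();

    public ToySession(ToyServer server) => this.server = server;

    public T LoadWithInclude<T>(string id, string includePath)
    {
        foreach (var (docId, json) in server.GetWithInclude(id, includePath))
            cache[docId] = json;
        return Load<T>(id);
    }

    public T Load<T>(string id)
    {
        if (!cache.TryGetValue(id, out var json))
        {
            json = server.Get(id);
            cache[id] = json;
        }
        // The type is chosen here, by the caller, and nowhere else.
        return JsonSerializer.Deserialize<T>(json)!;
    }
}

public class Order { public string CustomerId { get; set; } = ""; }
public class Customer { public string Name { get; set; } = ""; }
```

The Customer type shows up exactly once, in the final Load call, which is why the include itself never needs it.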

* Actually, RavenDB does keep metadata about the CLR type, but for unrelated purposes.