SqlDropDatabase, SqlCreateDatabase MSBuild tasks

SqlDropDatabase, SqlCreateDatabase MSBuild tasks

When working with SQL, I often find I need to quickly spin up/tear down local developer database instances – for example, setting up a clean environment for integration tests on the build server, or blowing away test data.

To help make this easier, here are a couple of MSBuild tasks I wrote that allow you to drop an existing database:

<SqlDropDatabase ConnectionString="Server=.SQLEXPRESS;Database=AdventureWorks;Integrated Security=SSPI;"                 Database="AdventureWorks" />

… and to create a new (empty) one.

<SqlCreateDatabase ConnectionString="Server=.SQLEXPRESS;Database=AdventureWorks;Integrated Security=SSPI;"                   Database="AdventureWorks" />

It’s also sometimes helpful to be able to parse individual individual keys out of a connection string (e.g. the the database name). This can be very tedious with RegexMatch/RegexReplace, so I wrote a separate MSBuild task to do it:

<SqlParseConnectionString ConnectionString="Server=.SQLEXPRESS;Database=AdventureWorks;Integrated Security=SSPI;">    <Output PropertyName="myDb" TaskParameter="InitialCatalog" />    <Output PropertyName="myServer" TaskParameter="DataSource" />    <Output PropertyName="myTimeout" TaskParameter="ConnectTimeout" /></SqlParseConnectionString><Message Text="Parsed the $(myDb) database on server $(myServer) with timeout = $(myTimeout)." />

Entity validation and LINQ: Using yield return to optimize IsValid over a list of broken rules

Entity validation and LINQ: Using yield return to optimize IsValid over a list of broken rules

A common pattern for checking an entity is valid involves testing a number of rules on it. After all tests have been performed, a list of broken rules is returned.

Consider this example for validating instances of a simple Customer class:

class CustomerValidator{    public IEnumerable<RuleViolation> GetAllRuleViolations(Customer c)    {        IList<RuleViolation> ruleViolations = new List<RuleViolation>();        if (String.IsNullOrEmpty(c.FirstName))        {            ruleViolations.Add(new RuleViolation("FirstName",                 "First name cannot be empty."));        }        if (String.IsNullOrEmpty(c.LastName))        {            ruleViolations.Add(new RuleViolation("LastName",                 "Last name cannot be empty."));        }        if (!Regex.IsMatch(c.PhoneNumber, @"[d ()+-]+"))        {            ruleViolations.Add(new RuleViolation("PhoneNumber",                 "Invalid phone number."));        }        return ruleViolations;    }}

Quite often though, we don’t care about the full list of broken rules — we only care if the object is valid or not. Instead of…

IEnumerable<RuleViolation> brokenRules =     customerValidator.GetAllRuleViolations(customer);if (brokenRules.Count() > 0)    // do stuff</pre>...we would rather have:</p><pre lang="csharp">if (!customerValidator.IsValid(customer))    // do stuff

So what’s the difference between checking if an entity is valid and getting a detailed list of validation errors?

For starters, the only way of finding out if an entity is valid is by testing all the rules against it. Let’s assume this is a reasonably intensive operation — if you have a lot of rules, or need to check things with external systems (checking a username doesn’t already exist in the database, for example).

If all we’re doing is checking if the entity is valid, we want to do as little work as possible. This means stopping as soon as we hit a broken rule.

The easiest way to do this is with the yield return keyword. Yield return is kind of strange — it lets you iterate over objects as they are returned from a method. This is used for evaluating LINQ expression trees. Instead of filtering and reducing a collection one criteria at a time — e.g. testing 1000 objects, then re-testing the 500 objects that passed, etc — it tests each object against all the criteria at once.

In this case, we simply want to return as soon as a broken rule is encountered.

class CustomerValidator{    public IEnumerable<RuleViolation> GetAllRuleViolations(Customer c)    {        if (String.IsNullOrEmpty(c.FirstName))        {            yield return new RuleViolation("FirstName",                 "First name cannot be empty.");        }                if (String.IsNullOrEmpty(c.LastName))        {            yield return new RuleViolation("LastName",                 "Last name cannot be empty.");        }        if (!Regex.IsMatch(c.PhoneNumber, @"[d ()+-]+"))        {            yield return new RuleViolation("PhoneNumber",                 "Invalid phone number.");        }    }}

See that? The method is still defined as returning a collection, but it has three return statements with a single object each. With this, we can use a little bit of LINQ to break out of the method early:

    public bool IsValid(Customer c)    {        // are there any rule violations?        return !GetAllRuleViolations(c).Any();    }

I whipped up a little test app to prove this — IsValid vs GetAllRuleViolations.

CustomerValidator validator = new CustomerValidator();// An invalid customerCustomer customer = new Customer(){    FirstName = "",    LastName = "",    PhoneNumber = "asdsd"};// do as little work as possible to work out if the entity is valid or notConsole.WriteLine("IsValid = ");Console.WriteLine(validator.IsValid(customer));// check everything for a full reportConsole.WriteLine("GetAllRuleViolations().Count =");Console.WriteLine(validator.GetAllRuleViolations(customer).Count());

Here’s what it produces. Note that IsValid returns immediately after the first rule fails.

IsValid =        Testing first nameTrueGetAllRuleViolations().Count =        Testing first name        Testing last name        Testing phone number3

Find missing foreign/primary keys in SQL Server

Find missing foreign/primary keys in SQL Server

Last week I wrote a SQL query to estimate how many columns are missing from foreign or primary keys. This works because of our naming convention for database keys:

  • We use a Code suffix for natural keys e.g. CountryCode = NZ
  • We use an ID suffix for surrogate keys e.g. EmployeeID = 32491

This script looks for any columns that match this naming pattern, but aren’t part of a primary or foreign key relationship.

-- Find columns on tables with names like FooID or FooCode which should-- be part of primary or foreign keys, but aren't.SELECT        t.name AS [Table],        c.name AS [Column]FROM        sys.tables t        INNER JOIN sys.syscolumns c ON                c.id = t.object_id                -- Join on foreign key columns        LEFT JOIN sys.foreign_key_columns fkc ON                (fkc.parent_object_id = t.object_id                AND c.colid = fkc.parent_column_id)                OR (fkc.referenced_object_id = t.object_id                AND c.colid = fkc.referenced_column_id)                -- Join on primary key columns        LEFT JOIN sys.indexes i ON                i.object_id = t.object_id                and i.is_primary_key = 1        LEFT JOIN sys.index_columns ic ON                ic.object_id = t.object_id                AND ic.index_id = i.index_id                AND ic.column_id = c.colidWHERE        t.is_ms_shipped = 0        AND (c.name LIKE '%ID' OR c.name LIKE '%Code')        AND        (                fkc.constraint_object_id IS NULL -- Not part of a foreign key                 AND ic.object_id IS NULL -- Not part of a primary key        )        AND        (                -- Ignore some tables                t.name != 'sysdiagrams'                AND t.name NOT LIKE '[_]%' -- temp tables                AND t.name NOT LIKE '%temp%'                AND t.name NOT LIKE '%Log%' -- log tables                                -- Ignore some columns                AND c.name NOT IN ('GLCode', 'EID', 'AID') -- external keys        )ORDER BY        t.name,        c.name

Using this script, I found over 200 missing foreign keys in one production database!