Wednesday, July 29, 2009

The Sands of Time

One of the things they talk about in Business Intelligence is the Grain Statement. This is essentially a statement about the level of granularity of a particular fact table and associated dimensions. For instance, you might have monthly sales goals per store, but not per employee, so the grain statement would have to specify that. On the other hand, actual sales may be recorded by employee by day, so the grain statement for the actual sales would be different from the grain statement for the goals.

The grain statement can also be a useful concept in other areas of data querying and reporting. I have worked for and with non-profits for almost twenty years, so one of the most common places I see this applying is the Fiscal Year. Many non-profits have a fiscal year that runs from July 1 to June 30. Since many programs have date/time functions centered around the calendar year, getting fiscal year data can be a real hassle.

One of the problems that people run into with fiscal years is forgetting to take the grain statement into account. The granularity of the fiscal year is the year. A common formula I see to calculate the fiscal year is something along the lines of the following:
CASE
WHEN Month(ShipDate) > 6 THEN Year(ShipDate) + 1
ELSE Year(ShipDate)
END


The problem here is that they are carefully preserving the month and day for the intermediate values when the granularity of the final result is the year. Any formula that maps the date to the right year will work. There simply is no need to preserve the month and day information.

In fact, preserving the month information, at least, is wrong. For the fiscal year July-June, the fiscal period for July is 1. There is no straightforward way to calculate this when you preserve the month information.

Here is the code that I prefer to use to calculate fiscal years.
Year(DateAdd(qq, 2, ItemDate)). This code has the advantage that it treats all dates the same and it is easy to modify it to produce the correct fiscal period by replacing Year() with Month().

Another place that the grain statement comes into play is with DateDiff(). Recently on one of the forums, one poster was calculating the difference in minutes between two complicated date/time expressions. One of the complicated date/time expressions served the sole purpose of truncating the milliseconds from a datetime field. Since the granularity of the DateDiff() function was to the minute, the complicated date/time expression served no useful purpose. The programmer was unnecessarily complicating his code, because he forgot to take the grain statement of his code into account.

So the next time you're writing code, take a minute to consider the grain statement of your code. You might find that it might lead to a different approach.