Whether you’re creating a report, chart, KPI, or calculated measure, at some point you’ll probably need to add year-to-date calculations. It doesn’t matter if you’re using SQL or MDX, the technique to generate the From and To dates for a YTD calculation remains the same.
The general concept goes like this:
Step 1: Get the current year and prefix it with “01/01/”. This is the From date.
Step 2: Get today’s date. This is the To date.
It seems simple enough but if your ETL process runs daily and pulls “yesterday’s” data then you’ll need to account for a 1 day lag on the To date. For example, if today is June 15th, 2011 then your warehouse will only contain data up to June 14th. Using the logic described above, the YTD calculation will select data from “01/01/2011” to “06/15/2011”, even though data for June 15th hasn’t yet been loaded into the warehouse. Although not entirely accurate, it won’t hurt anything, the YTD calculation just won’t return any data for the 15th.
Now let’s say today is December 31st, 2011. On this day the warehouse will only contain data up to December 30th and the YTD calculation will select data from “01/01/2011” to “12/31/2011”. Are you beginning to see the problem? Here’s one last example to drive the point home: Lets advance one day to January 1st, 2012. On this day the warehouse will only contain data up to December 31st, 2011 but the YTD calculation will select data from “01/01/2012” to “01/01/2012”. In this case the YTD calculation will return an empty resultset and the business users will never get a full 365-day view of the data.
Ok, so we know we have to update the YTD calculation to handle the one-day lag in the ETL. The logic now looks like this:
Step 1: Get the current year of yesterday and prefix it with “01/01/”. This is the From date.
Step 2: Get yesterday’s date. This is the To date.
To summarize, here’s what the YTD calculation will select using the original logic:
- June 15th, 2011: “01/01/2011” through “06/15/2011”
- December 31st, 2011: “01/01/2011” through “12/31/2011”
- January 1st, 2012: “01/01/2012” through “01/01/2012”
And here’s what the YTD calculation will select using the new logic:
- June 15th, 2011: “01/01/2011” through “06/14/2011”
- December 31st, 2011: “01/01/2011” through “12/30/2011”
- January 1st, 2012: “01/01/2011” through “12/31/2011”
As you can see, given the 1-day ETL lag, the new YTD calculation will always select the correct range of data, even on the first of the year.
But lets take it a step further. I propose the idea of working with the “last completed month” or even the “last completed week”. This way calculations are handled more gracefully. In other words, the YTD calculation is based on January through the last completed month or week. Keep in mind that we still have to account for the 1-day ETL lag.
This is what would happen if we decided to operate on the last completed month:
- June 15th, 2011: “01/01/2011” through “05/31/2011”
- December 31st, 2011: “01/01/2011” through “11/30/2011”
- January 1st, 2012: “01/01/2011” through “12/31/2011”
In order to achieve this behavior we must implement the following logic:
Step 1: Get the current year of yesterday and prefix it with “01/01/”. This is the From date.
Step 2: Get the month of yesterday’s date, append “/01/” for the day, append the year of yesterday’s date, and subtract one day. This is the To date.
The benefit of this approach is that business users have up to a month to analyze data produced by year-to-date calculations. However, it’s a double edged sword because business users have to wait up to a month. Admittedly this approach may be a better choice for “rolling” calculations but this will depend on what works best for the business. In some cases using the last completed week may be preferred over using the last completed month since users still have time to analyze data but don’t have to wait long for new data to be included in the calculation.