Jamie posted a question on ragged hierarchies to his blog today: Are you balanced or unbalanced? I started thinking about it this evening, and decided to post my thoughts here. I think that, for the most part, business requirements should drive the decision between a balanced and an unbalanced hierarchy. But there is a gray area. You can often force an unbalanced hierarchy into a balanced one by filling in the missing levels between the top and bottom. This is typically implemented by either leaving the intervening levels blank or copying the parent value down through each intervening level.
For example, say I have the following unbalanced hierarchy, where the DC branch rolls up directly to the North region, and the California (CA) territory has no branches:
| Corporate | Region | Territory | Branch |
|---|---|---|---|
| HQ | North | NY | NYC |
| HQ | North | | DC |
| HQ | North | MA | Boston |
| HQ | South | NC | Charlotte |
| HQ | West | CA | |
I can implement a balanced hierarchy by filling in the blanks, like this:
| Corporate | Region | Territory | Branch |
|---|---|---|---|
| HQ | North | NY | NYC |
| HQ | North | North | DC |
| HQ | North | MA | Boston |
| HQ | South | NC | Charlotte |
| HQ | West | CA | CA |
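To make the fill-down approach concrete, here is a minimal sketch in Python. The names `LEVELS`, `ragged`, and `balance` are my own for illustration, and in a real data mart this would more likely be done in SQL or in the ETL tool. It assumes the top level (Corporate) is always populated and that a missing level is stored as `None`.

```python
# Level names, ordered from the top of the hierarchy to the bottom.
LEVELS = ["Corporate", "Region", "Territory", "Branch"]

# The unbalanced hierarchy from the example; None marks a missing level.
ragged = [
    {"Corporate": "HQ", "Region": "North", "Territory": "NY", "Branch": "NYC"},
    {"Corporate": "HQ", "Region": "North", "Territory": None, "Branch": "DC"},
    {"Corporate": "HQ", "Region": "North", "Territory": "MA", "Branch": "Boston"},
    {"Corporate": "HQ", "Region": "South", "Territory": "NC", "Branch": "Charlotte"},
    {"Corporate": "HQ", "Region": "West", "Territory": "CA", "Branch": None},
]


def balance(row, levels=LEVELS):
    """Return a copy of row with each missing level copied down from its parent.

    Assumes the first (top) level is always present.
    """
    filled = dict(row)
    # Walk adjacent (parent, child) level pairs from the top down, so a value
    # copied into one level can be copied again into the level below it.
    for parent, child in zip(levels, levels[1:]):
        if not filled.get(child):
            filled[child] = filled[parent]
    return filled


balanced = [balance(row) for row in ragged]

for row in balanced:
    print(" | ".join(row[level] for level in LEVELS))
```

Run against the sample rows, this gives DC a Territory of North and gives CA a Branch of CA, matching the balanced table. Leaving the levels blank instead would simply mean skipping this step.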
Sometimes this is perfectly acceptable, but it may not accurately represent the business information. That is a call that has to be made on a case-by-case basis. However, there are some reasons why balancing the hierarchy may be desirable:
- Parent-child hierarchies, when used in Analysis Services, only have aggregates created for the key attribute and the All level (or top attribute, if you disable the All level). That has a significant performance impact on large dimensions.
- If you are using Analysis Services, you can still simulate an unbalanced hierarchy by using the HideMemberIf property, which allows you to hide a level if it is blank or contains the same value as its parent.
- Parent-child table relationships, while being simple to set up at the relational database level, are difficult to query. If you are developing a data mart that end users will be creating queries against, parent-child relationships are likely to cause problems.
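To illustrate the last point, here is a rough sketch of what a parent-child table forces a query to do. It is written in Python rather than SQL, and `parent_child`, `path_to_root`, and `rolls_up_to` are hypothetical names. It also assumes member names are unique, which a real dimension would handle with surrogate keys.

```python
# Hypothetical parent-child table as (member, parent) rows.
# The root member has no parent.
parent_child = [
    ("HQ", None),
    ("North", "HQ"), ("South", "HQ"), ("West", "HQ"),
    ("NY", "North"), ("MA", "North"), ("NC", "South"), ("CA", "West"),
    ("NYC", "NY"), ("DC", "North"), ("Boston", "MA"), ("Charlotte", "NC"),
]
parent_of = dict(parent_child)


def path_to_root(member):
    """Follow parent links upward and return the path from the root to member.

    The number of steps depends on how deep the member sits, which is why
    the SQL equivalent needs a recursive query or one self-join per level.
    """
    path = [member]
    while parent_of[path[-1]] is not None:
        path.append(parent_of[path[-1]])
    return list(reversed(path))


def rolls_up_to(member, ancestor):
    """True if ancestor appears anywhere above member (or is member itself)."""
    return ancestor in path_to_root(member)


# "Everything under North" requires walking the chain for every member.
north_members = sorted(
    m for m in parent_of if m != "North" and rolls_up_to(m, "North")
)

print(path_to_root("DC"))
print(north_members)
```

With the balanced table, the same question is a plain filter on the Region column, which is the kind of query an end user can reasonably be expected to write.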
In general, I prefer balanced hierarchies, but mostly for technical reasons. There are certainly business cases where they just don’t apply.