I’m balanced, except when I’m not

Jamie posted a question on ragged hierarchies to his blog today:  Are you balanced or unbalanced? I started thinking about it this evening, and decided to post my thoughts here. I think that, for the most part, business requirements drive should drive the decision between a balanced and unbalanced hierarchy. But there is a gray area. You can often force an unbalanced hierarchy into a balanced hierarchy by filling in the missing levels between the top and bottom. This would typically be implemented by either leaving the intervening levels blank, or copying the parent value down through each intervening level.

For example, say I have the following unbalanced hierarchy, where the DC branch rolls up directly to the North region, and the California (CA) territory has no branches:

Corporate Region Territory Branch
HQ North NY NYC
  North   DC
  North MA Boston
  South NC Charlotte
  West CA  

I can implement a balanced hierarchy by filling in the blanks, like this:

Corporate Region Territory Branch
HQ North NY NYC
  North North DC
  North MA Boston
  South NC Charlotte
  West CA CA

Sometimes this is perfectly acceptable, but it may not accurately represent the business information. That is a call that has to be made on a case by case basis. However, there are some reasons that balancing the hierarchy may be desirable:

  • Parent-child hierarchies, when used in Analysis Services, only have aggregates created for the key attribute and the All level (or top attribute, if you disable the All level). That has a significant performance impact on large dimensions.
  • If you using Analysis Services, you can still simulate an unbalanced hierarchy by using the Hide Member If property, which allows you to hide a level if it is blank, or contains the same value as its parent.
  • Parent-child table relationships, while being simple to set up at the relational database level, are difficult to query. If you are developing a data mart that end users will be creating queries against, parent-child relationships are likely to cause problems.

In general, I prefer balanced hierarchies, but mostly for technical reasons. There are certainly business cases where they just don’t apply.

2 Comments

  1. Nick says:

    I know this is an old post, but wouldn’t the business want to see:

    HQ>North>DC

    instead of

    HQ>North>North>DC

    ?

  2. jwelch says:

    Probably, which is why I’d want to expose this data through Analysis Services. Using AS, I can use the Hide Member If property mentioned above to prevent North from showing up twice.

    If this is a purely relational system, well, I can still hide it by using something like:

    SELECT
    Corporate
    , Region
    , CASE WHEN Territory = Region THEN Branch ELSE Region END AS Region

Leave a Reply