Posts tagged ‘Agile Practices’

Developer Gaps

I’ve been meaning to post about this ever since I saw Jamie’s post on “The SQL Developer Gap”. I couldn’t agree more with what Jamie expressed. Prior to getting into serious BI development, I was primarily an application developer. This was at a time when agile development was becoming very popular. As part of that, refactoring, test driven development, continuous integration, and automated unit testing were becoming accepted as good software engineering practices, and tool support was coming along very quickly.

Then I switched over to BI development in the SQL Server 7 time frame. In a lot of ways, it was like going back to the dark ages. No refactoring support, no automated testing, no concept of builds. Nothing significant changed until SQL Server 2005, when tools like SSIS and SSAS took their first steps toward becoming more “developer friendly” by leveraging Visual Studio to easily integrate into source control and the beginnings of multi-developer support. However, there haven’t really been any improvements in this since 2005. Refactoring, automated testing, automated builds, etc., can all be done, but they are painful and time consuming to set up, and require a fair amount of specialized knowledge to do correctly. In addition, these are all skills that the average BI developer usually doesn’t posses.

To join in Jamie’s rant, this is something that has aggravated me increasingly over the last few years. In many ways, BI is ideally suited to an agile approach and developer tools that increase productivity – requirements shift on the whim of the business, you need to deliver quickly and often, and you need easy mechanisms to confirm that what you are delivering provides the correct results. There are many tasks in developing BI solutions that are repetitive and could be easily automated, if only the tools provided better support for it. And developer productivity using the SQL Server BI tools hasn’t seen a significant increase since 2005.

I’m spending a fair amount of my time these days working in Visual Studio, where I have the luxury of a built in unit testing tool, the capability to switch between visual editing and text editing depending on which makes the most sense, the ability to easily do a diff between two versions in source control, a full undo-redo stack, etc. And I get to use add-ins like ReSharper (a fantastic tool that I can’t recommend enough). It really highlights the difference between developing traditional applications and BI applications these days.

That’s part of the reason I joined Varigence, where I have the opportunity to actually help developers deliver BI solutions faster and better. Our approach makes it much easier to support the same features that you see in traditional application development tools. I’ve been pretty pleased to see how easy it is for us to add productivity features to our tools – honestly, it makes me wonder why BI developers had to wait this long for these features to be available in the tools we use on a daily basis.

My New Blog

When I originally started this blog, I wanted to cover agile development in BI/DW projects, as well as technical topics about BI. However, it didn’t work out quite that way. I’ve posted a few thing here and there about agile development on this blog, but for the most part it’s been focused on technical topics (and most of those about SSIS). So, to avoid any confusion on this blog, I’ve started a second one named BIpartisan (no, it’s not about politics :) ). On BIpartisan, I’ll be focusing a bit more on higher level topics around agile development and delivering value with BI, and not so much on technical ones. If you are interested in that, please check it out.

I will continue to post here as well on the more technical topics. I’ve got several posts planned, but it’s been tough finding time to put them together. Hopefully, things will settle down a bit and I’ll have more free time for blogging.

ssisUnit – A Unit Testing Tool for SSIS

I’ve been a bit lax posting on my blogs and the MSDN forums recently. Fortunately, I have a good reason (at least I think it’s a good one). :) My employer, Mariner, has graciously given me permission to open source a unit testing framework for SSIS packages. Preparing it for release has taken a bit more time than I expected, as I wanted to polish up a few items, and that led to a few more changes, etc. The framework, as we were using it, was definitely functional, but I wanted to make a few changes for ease of use. Now I have those changes in, and an alpha (but functional) version is available on Codeplex, under the ssisUnit project.

I’ve posted previously about unit testing for SSIS, and how I really missed the automated unit testing capability that I’d taken for granted in more traditional application development. Since the team currently has no plans for unit testing for SSIS, it seemed like a good time to get this out and available to the public. There are currently methods of testing SSIS, but most of them involve testing the package as a whole. One of our goals for unit testing SSIS, though, was to enable testing at a more granular level. ssisUnit enables testing down to the individual task level in the control flow. In future iterations, I’d like to expand the functionality to include testing individual components in the data flow.

We have additional plans for ssisUnit in the future, including Visual Studio integration, additional command capabilities, and a GUI for creating the test cases. The current version is v0.50, and I hope to have another release by mid-April. Please download it, give it a whirl, and provide feedback and suggestions for improvement. If you’re interested in contributing to the project, please leave a comment on this post, or email me at

Iterative Development In BI

One of the staples in agile (and many other methodologies) development is the idea of iterative development. That’s the idea that, rather than doing an entire project in one big pass, it is broken up into smaller iterations. Each iteration ideally consists of a specific set of deliverables with value to the end user (ideally working code), and is a small enough slice of the project that the deliverables can be met in a short time frame*. So in a traditional application development project, the first iteration might deliver a working data entry form to the end user. The second iteration might add an additional form and a background validation process, and so on until all the application requirements are met.

Another feature of iterative development is that each iteration involves some aspects of the full software development lifecycle. For each iteration, you expect to go though some requirements, design, development, and testing. But these activities are focused on just what is required for the current iteration. There usually is a small amount of up front time spent mapping out the entire project, but not in an extreme amount of detail.

Most of the BI projects I have been involved in, however, follow much more of a waterfall approach. There usually are intermediate milestones defined, but it’s usually not working code, it’s a requirement document, or an architecture diagram. What’s really depressing about this is the look of incomprehension I get when I suggest breaking things into smaller iterations.

Here’s some of the common objections I hear to doing iterative development, and my responses to them.

  • “We have to gather all the requirements up front so we can make sure we don’t miss anything.”
    This is one of the most common, and one of the more absurd, objectives. The chances that you can gather all or even most of the requirements up front on a project of any complexity are small. There’s even less chance that the requirements won’t change over the course of the project. I much prefer to do detailed requirements in a just-in-time fashion.
  • “If we don’t know all the requirements at the time we begin design, we might build something that can’t be extended.”
    I agree that designing a BI solution is easier if you have all the pieces in front of you when you begin. However, it is certainly possible to create a flexible design without knowing every detail about how it will be used. In fact, since BI systems encourage emergent behavior, you can count on new requirements arising over the course of the project, and after it is completed. That means a flexible design is necessary, even if you think you have all the requirements up front.
  • “It takes too long to produce working code to deliver iterations that quickly.”
    This is one argument that holds some weight with me. If you are starting from scratch, and you need to build a star schema, load it with data, and create reports on that to deliver to end users, it can be difficult to accomplish that in four to eight weeks. However, that’s where I feel that you may have to narrow the focus of the iteration – instead of building all 8 needed dimensions, you build 4 instead. Then you build the remaining 4 in the next iteration. This does require some creativity to make sure you are building something of use to the end user, while still being able to complete it in the iteration’s timeframe.

That covers some of the common objections. I have done short iterations on BI projects, with a high degree of success, so I know that not only is it possible, it works well. In a future post, I will list some of the benefits I find in doing iterative development on BI projects.

*What is a short time frame? This is somewhat subjective, and you will find opinions ranging from several days to several months. My personal feeling is that an iteration shouldn’t go past 8 weeks, and I think 4 weeks is a much more manageable size.