Posts tagged ‘Unit Testing’

Q & A From “Unit Tests for SSIS Packages”

I did a webinar this week for Pragmatic Works’ “Free Training on the T’s”. The topic was “Unit Tests for SSIS Packages”. If you attended, thanks for taking the time! If not, the recording is now available at the link above. You can also download the slides from my SkyDrive, and ssisUnit can be downloaded from CodePlex.

We had great turnout for the session, and a lot of great questions. I didn’t have time to address all of them during the webinar, and I had a number of requests to share my answers with all the attendees, so I thought I’d write up a blog post on the questions that I didn’t get a chance to answer.

Testing Practices

“For test drive development, should a “final test” (of sorts) be performed after the “refactor” phase to ensure that the act of refactoring didn’t negatively alter the code?”

Yes, absolutely. You want to run your tests after any code changes. This helps ensure that code continues working as expected, and that you don’t have hidden “side effects” associated with your changes.

“How would you unit test a package that is performing transforms in the source query?”

I would create a known set of data in a test version of the source database. I would then create a unit test for the data flow task that checks that the output includes the data with the proper transformations. This will get much easier in a future release, when it will support data flow component testing in the unit test harness, allowing you to test the results of a source component directly.

“What is the best way to incorporate SSIS testing on existing packages to automate..as this also needs requirement breakdown?”

I approach this in the same way that I approach adding tests to an existing .NET application. Start by identifying a subset of existing packages  to work with, and add tests to those. Once those have adequate coverage, move on to the next set. As far as requirements breakdown – yes, you do need to understand what the package is supposed to do in order to test it properly. You can add some simple test cases without a great deal of analysis (for example, did the task execute successfully?) but to get real value out of the tests, they do need to check that the task carries out the requirements as intended.

There is the option of generating unit tests automatically for existing packages via the ssisUnit API. While this can improve code coverage, I would caution you not to rely on it to verify real functionality.

“Sorry I missed the first part of the talk. so not sure if you already talked about this. But how do you recommend creating test data for dimensions if you need to use synthetic data? Are there any tools you recommend using?”

Remember that unit and integration testing is more about verifying functionality, and less about performance. So I like to create a small amount of handcrafted test data that hits the specific scenarios that I need to validate. I find that most of the tools out there for generating data tend to work well if I want large volumes of test data, but not so well for concrete examples. For example, when I need to validate that the package handles a Type 2 SCD update that affects multiple rows correctly, I need 3 to 5 rows of test data that are very specific. Data generators don’t do that very well.

Automating ssisUnit in Builds

“Can we automate the ssisUnit with build & deployments in TFS?” and “How would you enable automated testing with ssisUnit for Continuous Integration?”

The simplest way to to incorporate ssisUnit into your builds is by calling the command line test execution tool. It’s called ssisUnitTestRunner2008.exe (substitute the appropriate version number for your version of SSIS) and you can find it in the folder where you installed ssisUnit.

Another approach is to use the ssisUnit API, but this requires some level of .NET or PowerShell coding.

Compatibility

“Is ssisUnit backward compatible with VS2008?” and “Is this framework compatible with MS SQL Server 2012?”

It does work against 2008 and 2012. You can download the 2008 version directly from the CodePlex site. For 2012, you need to download and compile the source currently, which can be done with the free Express version of Visual Studio. The next release will have separate setups for each version.

“Does the SSIS unit testing work with evaluation edition of BIDS 2008 R2?”

Yes, it does.

“Will this connect with the Microsoft Parallel Data Warehouse?”

I have not tested this myself. However, because it uses standard OLEDB and ADO.NET connection technology for database access, I don’t see any reason why it wouldn’t work.

“I am a SQL 2008 R2 user.  What version of VS is John running?  What were those testing menu options?”

I was running SQL 2012 for the demo. The menu option for running ssisUnit was created using the External Tools menu option under the Tools menu in VS.

Setting Up and Using Tests

“I’m lost.  How did you set up the tests using the GUI and then link it to the SSIS package?  Did i miss that?”

During the demo, I showed a prebuilt ssisUnit test, that I opened in the ssisUnit UI tool. The unit tests are linked to packages through Package Reference objects, which basically refer to a package by it’s file path, or location in SQL Server.

“Can you use a different operator then equal to? Like greater than, less than, different than?”

Yes, the Asserts in ssisUnit can use expressions. These are C#-based expressions that evaluate to True or False. You can use an expression like “((DateTime)result).Date==DateTime.Now.Date” to check that the result of the Assert command is equal to today’s date. “result”, in the expression, represents the object returned by the command associated with the Assert. You can apply pretty much any C# / .NET operation to the result object. For more examples, check out this page.

“When performing SCD operations, the details of the row need to be inspected rather than just count of rows. How would you test this? Within ssisUnit or another tool?”

Well, I’d use ssisUnit, but I’m not exactly unbiased Smile. Currently, you can do this using the SQL Command, which enables you to retrieve specific row details from the database. In the future, you will be able to do this more directly by testing it in the data flow.

“Did you say the file command can accommodate various file types such as xml, csv, pipe delimited, etc.?”

I didn’t say that (I don’t think), but it will work with pretty much any file. Internally, it uses standard .NET file operations, so it’s very similar to the capabilities of the File System task in SSIS. It doesn’t actually care what format the file is in, it can copy, move, and delete it regardless. For line counts, it counts the number of carriage return / line feeds in the file.

“Do you have a guide on how to setup ssisUnit with SSIS? We have tried using this before, but we couldn’t successfully run the test.”

I’m sorry to hear you had problems with it. The Getting Started guide is admittedly light on content, and I really need to provide a more detailed walkthrough. Please vote for this here if you would like this prioritized over the other changes in the queue.

“I noticed in the code that ssisUnit doesn’t handle password protected packages. When will this be supported?”

I’ve added an issue to track this here. Please vote for it if you use password protected packages, as it will help me prioritize how quickly to address it.

“If possible, can you demo if a container can be executed? Especially a For loop or For Each loop?”

I didn’t have time to demo this during the presentation. Good thing too, because there was an error around handling of containers. This has now been fixed in the source code version of the project.

“When testing in a 64bit environment, is there a specific way to execute ssisUnit when data sources are 32bit?”

Currently, this requires that you compile a 32bit version of the source. In the next release, I will provide both 32-bit and 64-bit versions of the test execution utility.

“Will the new version of ssisUnit for 2012 actually include some bug fixes? We’ve tried the current ssisUnit and it’s pretty buggy.” and “Not a question, just feedback: So there are over 30 bug reports and enhancement requests on CodePlex (including UI bugs, missing parts like handling complex Data Flow logic unit testing, etc.) posted since the posting date in 2008. If you address some of these (particularly the UI bugs) in a new release we might try it again. The biggest lack my organization found is that there’s no support for Data Flow tasks, which are 80% or more of our ETL testing. Just some feedback to keep in mind if you make any updates in the future. I’ll gather any additional bugs that we found in our evaluation back in 2011 and add them to the site when I have time.”

Again, sorry that you have had difficulties with it, and I’d definitely appreciate any feedback you can add to the site. For data flow testing, I do use that fairly successfully today, and a number of other users do as well. Admittedly, component level testing for the data flow would be nice (and it is being worked on) but I’m curious about what is blocking you from using it today. If you can submit your issues to the site, I will look at how to address them.

As far as the number of outstanding requests and UI bugs, well. unfortunately, I don’t get paid for working on open source. So focus on this project tends to take a backseat to demands from paying work. That being said, I do want to address as many of the issues as possible, but I have to prioritize my time pretty heavily these days. If there are items that particularly annoy you, please vote for them, as I do use the votes to determine what I should work on. For the UI, I really don’t enjoy UI work (nor, as evidenced by the current GUI, is it my strongest skill as a developer), so you are unlikely to see any significant updates on that front on the open source project. However, Pragmatic Works has taken an interest in unit testing as of late, and we are investigating offering an enhanced UI that’s integrated with Visual Studio as part of our BI xPress product.

“This makes sense to me, but I don’t understand how I setup a package to match up to the unit test.”

Each test case can reference tasks by name or GUID. GUID is more accurate, as names can conflict if you have multiple tasks with the same name in different containers.

“In package 1, you were getting a table count as a task and then writing a unit test to check the results of that test.  Are you returning the result of the SELECT to a variable and then checking the value stored in the variable or are you directly testing the return value from the execute SQL task?”

The Execute SQL task in that test was designed to store the results of the SQL statement in a variable, so the test is written to check the value of the variable (using a VariableCommand) after the task executes. If you need to get a value directly from the database as part of the test, you can use a SqlCommand.

“Do unit tests always fall under Miscellaneous?”

Yes – the SSIS project structure doesn’t allow for custom folders.

“Would you have a more complex example for the test than the result of a table count?”

The ProductSample test on the website illustrates a few other test scenarios. If there are additional scenarios that you’d like to see examples for, please add an issue or discussion item on the site.

Miscellaneous

“Where I can get some xml to parse my .dtsx packages and only extract all the SQL code and associated SSIS task name?”

I’m not really sure how this relates, honestly. However, to do this, you’d need to either write a fairly complex XSLT transform, or use the SSIS API. Unfortunately, I don’t know of any public example code that illustrates this.

Sharing the Presentation

Many, many variations of “Can I get the slides?”

I mentioned this at the top of the post, but just in case: The recording of the webinar is available here: “Unit Tests for SSIS Packages”. You can  download the slides from my SkyDrive, and ssisUnit can be downloaded from CodePlex.

Thanks again to everyone who attended, and thanks for all the great questions.

Speaking at the Carolina Code Camp

I’ll be presenting a couple of sessions at the Carolina Code Camp on Saturday, May 15th. This will be my second time presenting at the Carolina Code Camp, and I’m looking forward to it. Last time, I had great audience participation, and the event was well organized and fun.

I’ll be presenting on “Processing Flat Files with SSIS” at 8:45 AM and “Unit Testing SSIS Packages” at 12:30 PM. Looking forward to both – it should be fun.

If you are in the area, feel free to drop in – there is no charge for the Code Camp.

Developer Gaps

I’ve been meaning to post about this ever since I saw Jamie’s post on “The SQL Developer Gap”. I couldn’t agree more with what Jamie expressed. Prior to getting into serious BI development, I was primarily an application developer. This was at a time when agile development was becoming very popular. As part of that, refactoring, test driven development, continuous integration, and automated unit testing were becoming accepted as good software engineering practices, and tool support was coming along very quickly.

Then I switched over to BI development in the SQL Server 7 time frame. In a lot of ways, it was like going back to the dark ages. No refactoring support, no automated testing, no concept of builds. Nothing significant changed until SQL Server 2005, when tools like SSIS and SSAS took their first steps toward becoming more “developer friendly” by leveraging Visual Studio to easily integrate into source control and the beginnings of multi-developer support. However, there haven’t really been any improvements in this since 2005. Refactoring, automated testing, automated builds, etc., can all be done, but they are painful and time consuming to set up, and require a fair amount of specialized knowledge to do correctly. In addition, these are all skills that the average BI developer usually doesn’t posses.

To join in Jamie’s rant, this is something that has aggravated me increasingly over the last few years. In many ways, BI is ideally suited to an agile approach and developer tools that increase productivity – requirements shift on the whim of the business, you need to deliver quickly and often, and you need easy mechanisms to confirm that what you are delivering provides the correct results. There are many tasks in developing BI solutions that are repetitive and could be easily automated, if only the tools provided better support for it. And developer productivity using the SQL Server BI tools hasn’t seen a significant increase since 2005.

I’m spending a fair amount of my time these days working in Visual Studio, where I have the luxury of a built in unit testing tool, the capability to switch between visual editing and text editing depending on which makes the most sense, the ability to easily do a diff between two versions in source control, a full undo-redo stack, etc. And I get to use add-ins like ReSharper (a fantastic tool that I can’t recommend enough). It really highlights the difference between developing traditional applications and BI applications these days.

That’s part of the reason I joined Varigence, where I have the opportunity to actually help developers deliver BI solutions faster and better. Our approach makes it much easier to support the same features that you see in traditional application development tools. I’ve been pretty pleased to see how easy it is for us to add productivity features to our tools – honestly, it makes me wonder why BI developers had to wait this long for these features to be available in the tools we use on a daily basis.

Presenting at the SQL Server Best Practices Conference

I’m going to be presenting two sessions at the SQL Server Best Practices Conference on August 24-26 in Washington, DC. This conference has a different focus than many of the technical conferences that I’ve presented at in the past. Instead of technical deep dives, the focus is on providing attendees with information on the decision points they are likely to encounter in their projects, and guidance on how to choose the best course of action at those points. It promises to be a very interesting conference.

I’ll be presenting on two topics. The first, Unit Testing SSIS Packages, is a topic I’ve presented on before, but not with this focus. I’m a big fan of unit testing and the test driven development (TDD) model, but I’ve found that there are numerous challenges in applying this approach to SSIS. So my session will focus on the best practices you can use to make this successful.

My other topic, SQL Server BI in the Cloud, is a new one for me (and for pretty much everyone, considering that SQL Azure is still pre-CTP). However, I’ve been doing some work with it and other cloud based models for BI, and there are some clear key decision points that you need to consider when embarking on this type of project. I’m really looking forward to this one, as it’s a new area, and that always prompts good discussions.

If you are attending, please consider dropping by for my sessions. If you’re not registered for the conference, there’s still time, and there’s a great lineup of speakers.

 

BestPractices_banner640x96

SQL Heroes Contest

David Reed, who manages the SQL Server community samples on CodePlex, has been running a SQL Heroes contest. It’s for open source applications that help improve the SQL Server experience. You can read more about it on the SQL Heroes blog.

I find myself in the interesting position of having two horses in the race: BIDS Helper (with Greg Galloway and Darren Gosbell) and ssisUnit. If you use either (or both) of these, and like them, please vote in the Finalists Survey.

If you don’t use them, why not give them a try? :)

Testing the SQL Database at TechED

While at TechEd this summer, I participated in a panel discussion on testing the database and related technologies. The panel also included Gert Drapers and Jamie Laflen of the Visual Studio Team System – Database Edition team, David Reed (who manages the SQL Server community samples on CodePlex), and Adam Machanic as the moderator. The discussion covered some of the pros and cons of testing data, and some of the common issues encountered. It was also a chance to talk about testing related areas (like SSIS packages), of which I am a big fan. If you are interested in seeing it, it was just put online.

If you are interested in unit testing SSIS packages, keep an eye on www.codeplex.com/ssisunit. I should have a new release going up soon, which adds some much needed features, and a much more extensible framework for adding new functionality.

Presenting at the SSWUG Virtual BI Conference

I’m going to be presenting a few sessions at the upcoming SSWUG Virtual Business Intelligence Conference. It’s occurring from September 24-26, 2008. It has a very impressive list of speakers (I’m really not sure how I made it on the list – it’s quite an honor). I’ll be presenting on configurations in SSIS, unit testing SSIS (one of my favorite topics), and the new Report Builder functionality in SQL Server 2008.

I’m looking forward to it, but it will be a new experience for me. The sessions will be recorded in advance, and then the speakers will be online during their presentation time slot to answer questions from the audience. It sounds interesting, and since it’s being broadcast online, there’s potential for a much wider range of attendees.

If you’re interested in the conference, please check out the web site.

ssisUnit – A Unit Testing Tool for SSIS

I’ve been a bit lax posting on my blogs and the MSDN forums recently. Fortunately, I have a good reason (at least I think it’s a good one). :) My employer, Mariner, has graciously given me permission to open source a unit testing framework for SSIS packages. Preparing it for release has taken a bit more time than I expected, as I wanted to polish up a few items, and that led to a few more changes, etc. The framework, as we were using it, was definitely functional, but I wanted to make a few changes for ease of use. Now I have those changes in, and an alpha (but functional) version is available on Codeplex, under the ssisUnit project.

I’ve posted previously about unit testing for SSIS, and how I really missed the automated unit testing capability that I’d taken for granted in more traditional application development. Since the team currently has no plans for unit testing for SSIS, it seemed like a good time to get this out and available to the public. There are currently methods of testing SSIS, but most of them involve testing the package as a whole. One of our goals for unit testing SSIS, though, was to enable testing at a more granular level. ssisUnit enables testing down to the individual task level in the control flow. In future iterations, I’d like to expand the functionality to include testing individual components in the data flow.

We have additional plans for ssisUnit in the future, including Visual Studio integration, additional command capabilities, and a GUI for creating the test cases. The current version is v0.50, and I hope to have another release by mid-April. Please download it, give it a whirl, and provide feedback and suggestions for improvement. If you’re interested in contributing to the project, please leave a comment on this post, or email me at john.welch@mariner-usa.com.

Unit Testing In SSIS

I’d put in a request for additional support for unit testing in Integration Services on Connect (https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=276292). Unfortunately, it doesn’t look like it will make the cut for SQL Server 2008.

I used automated unit testing extensively when I was more focused on application development, and I really miss having it available in SSIS. In my experience, it made a significant difference in both quality and speed of development. Having a suite of test cases gave developers a safety net and being able to quickly test small units of work made progress much more tangible. (For more on test-driven development and some of the benefits, see http://en.wikipedia.org/wiki/Test-driven_development.)

Given that, maybe it would be worth consider incorporating some unit testing capabilities into the BIDSHelper project (http://www.codeplex.com/bidshelper) or starting another community project to add those capabilities. Any thoughts on what would be good features to include? My list includes:

  • Ability to test tasks individually
  • Ideally, ability to test pipeline components individually
  • Standard xUnit functionality (the ability to set up and tear down a test)
  • Some support for setting the database to a known state prior to the test would be valuable

Any other thoughts would be welcome.