Posts tagged ‘Custom Tasks and Components’

Data Cleaner Component Published on CodePlex

If you saw my talk “Implementing a Data Cleansing Component for SSIS” at the PASS Summit or SQLSaturday #56 (#sqlsat56), you might be wondering where the source for the sample component is located. It’s published on CodePlex, and downloadable from the Source Code page. As time permits, I’ll create an installer for it, and there’s some updates I’d like to make to add some more functionality. But it’s usable right now, and serves as another example component if you are looking to create your own.

If you have suggestions for improvements (and there are many that can be made), please post them to the CodePlex project’s Issue Tracker page.

Presenting at the Triad Developers Guild

I’ll be presenting at the Triad Developers Guild on Tuesday. The presentation will be on Developing Custom Components for SSIS, a topic I’ve spoken on a few times before. Here’s the abstract:

“SSIS data flows are great tools for moving data. But what if you need to go beyond the out-of-the-box components provided with SSIS? John Welch will describe how custom components are a great way to encapsulate and reuse functionality for the data flow in SSIS. He will discuss what it takes to create and deploy custom components in SSIS, review the pros and cons of using custom components instead of scripts, and discuss some of the common challenges and issues with creating them.”

I’m looking forward to the presentation – it usually generates some good discussion, and I like talking about developing for SSIS. If you are in the area, please drop by.

Implementing PerformUpgrade in Custom Components

If you develop custom components for SSIS, you may have the need to update them as you add new functionality. If you are just upgrading the functionality, but not changing the metadata, then you can simply recompile and redeploy the component. An example of this type of update would be changing the component to do additional warning or informational logging. The code has to be updated, but the metadata (the properties of the component, the settings for the inputs and outputs) was not modified.

The other type of update involves changing the component’s metadata. Examples of this would be adding a new property to the component or adding new inputs or outputs. In this case, you could increment the assembly version of your component, but then you would have to remove the old one from any data flows, and then add the new one back in and reconnect it. Rather than forcing users of the component to go through that effort for every package that uses the component, you can implement the PerformUpgrade method on your component. The PerformUpgrade method will be called when the package is loaded and the current version of the component does not match the version stored in the package’s metadata. You can use this method to compare the current version of the component to the expected version, and adjust the metadata appropriately.

Setting the CurrentVersion

To use this, you have to tell SSIS what the current version of your component is. You do this by setting the CurrentVersion property in the DtsPipelineComponent attribute that can be set on the PipelineComponent class:

[DtsPipelineComponent(
    DisplayName = "Test Component",
    ComponentType = ComponentType.Transform,
    CurrentVersion = 1,
    NoEditor = true)]
public class TestComponent : PipelineComponent

The CurrentVersion property defaults to zero, so a value of 1 indicates that this component is now on it’s second version.

Performing the Upgrade

Next, you need to implement some code in the PerformUpgrade method. This consists of first getting the value of the CurrentVersion property, and at the end of the method, setting the version in the component’s metadata to the current version.

public override void PerformUpgrade(int pipelineVersion)
{
    // Obtain the current component version from the attribute.
    DtsPipelineComponentAttribute componentAttribute =
      (DtsPipelineComponentAttribute)Attribute.GetCustomAttribute(this.GetType(), typeof(DtsPipelineComponentAttribute), false);
    int currentVersion = componentAttribute.CurrentVersion;

    if (ComponentMetaData.Version < currentVersion)
    {
        //Do the upgrade here
    }

    // Update the saved component version metadata to the current version.
    ComponentMetaData.Version = currentVersion;
}

The actual upgrade code can vary a good bit, from adding custom properties, adjusting the data types of outputs, or adding / deleting inputs or outputs. I won’t show the logic for these things here, but it’s pretty similar to the same code you’d use in ProvideComponentProperties.

Handling Multiple Upgrades

The code above is based on the sample in Books Online, but there’s a slight issue. Determining what upgrades need to be applied can be more complicated than simply comparing the current version to the ComponentMetaData version. Imagine that you have already upgraded the component from version 0 to version 1, by adding a new property. Now, you discover a need to add another new property, which will result in version 2. What do you do about the property added in version 1? You don’t want to add it twice for components that have already been upgraded to version 1. But it’s also possibly that not all packages have been upgraded from version 0 yet, so for those you need to add both properties. By altering to version check logic a little, you can accommodate upgrading from multiple versions pretty easily:

if (ComponentMetaData.Version < 1)
{
    //Perform upgrade for V1
}

if (ComponentMetaData.Version < 2)
{
    //Perform upgrade for V2
}

This change will ensure that the appropriate upgrade steps are taken for each version.

Some Other Thoughts

There’s a few things to be aware of with PerformUpgrade. One, it’s called only when the package is loaded, and the version stored in the package’s metadata is different than the binary component. This can occur both at design time (when the package is opened in Visual Studio), or at runtime (when executing the package from DTEXEC, etc).

Two, when you update the CurrentVersion property, and then add the component to a new package, the version number in the package metadata will initially be set to 0. So the next time the package is opened, the PerformUpgrade will be performed. Since the ProvideComponentProperties would have already set the metadata appropriately for new version of the component, the PerformUpgrade can cause errors by attempting to add the same metadata again. This appears to be a bug in the behavior when adding the component to the data flow, and it occurs under both 2005 and 2008. The workaround is code the PerformUpgrade method to check before altering any metadata, to make sure that it doesn’t already exist.

Three, due to what looks like another bug, when the package is opened the second time after the component is initially added to the package, the version will be incremented at the end of PerformUpgrade (assuming you use the code above that updates the version). However, this change does not mark the package as dirty in the designer, so the updated version number will not be saved unless some other property in the package is modified, and then the package is saved. This isn’t a huge problem – though you do need to make sure that the code in PerformUpgrade can be run repeatedly to avoid issues.

That’s pretty much it. Hopefully this will be helpful if you are developing custom components for SSIS.

Updating Custom Assembly References in SSIS

Just a quick tip related to a recent problem I had with a custom task I developed for SSIS. I had to recompile it recently, and in the process, managed to change the public key associated with the assembly. Since SSIS uses strong names (which include the public key) to reference custom tasks, this caused a bit of a problem. None of the packages that used the custom task would execute properly, and I really didn’t want to go through 20+ packages and delete the task, then add it back and try to remember how it was set up in each package. Fortunately, fixing this was much simpler. I opened one of packages in the XML view, located the original strong name (which looks something like this “Mariner.Pamlico.ETL.ControlFlowTasks.CustomTask, Mariner.Pamlico.ETL.ControlFlowTasks, Version=1.0.0.0, Culture=neutral, PublicKeyToken=73ea249dd43ab42e”), and did a search and replace on all the packages to update the reference with the new public key. I could then execute all my packages without any problems, and they all retained their original settings.

Like I said, just a quick tip if you ever have to update multiple packages to reference a new version of a custom task or component. A few words of warning – before updating the XML of a package directly, it’s always a good idea to make a backup. Also, this works as long as you haven’t changed the public interface for the task. If you have removed properties, it’s likely to cause new and interesting errors when you attempt to open the package in BIDS.