Posts tagged ‘Components’

Implementing PerformUpgrade in Custom Components

If you develop custom components for SSIS, you may have the need to update them as you add new functionality. If you are just upgrading the functionality, but not changing the metadata, then you can simply recompile and redeploy the component. An example of this type of update would be changing the component to do additional warning or informational logging. The code has to be updated, but the metadata (the properties of the component, the settings for the inputs and outputs) was not modified.

The other type of update involves changing the component’s metadata. Examples of this would be adding a new property to the component or adding new inputs or outputs. In this case, you could increment the assembly version of your component, but then you would have to remove the old one from any data flows, and then add the new one back in and reconnect it. Rather than forcing users of the component to go through that effort for every package that uses the component, you can implement the PerformUpgrade method on your component. The PerformUpgrade method will be called when the package is loaded and the current version of the component does not match the version stored in the package’s metadata. You can use this method to compare the current version of the component to the expected version, and adjust the metadata appropriately.

Setting the CurrentVersion

To use this, you have to tell SSIS what the current version of your component is. You do this by setting the CurrentVersion property in the DtsPipelineComponent attribute that can be set on the PipelineComponent class:

    DisplayName = "Test Component",
    ComponentType = ComponentType.Transform,
    CurrentVersion = 1,
    NoEditor = true)]
public class TestComponent : PipelineComponent

The CurrentVersion property defaults to zero, so a value of 1 indicates that this component is now on it’s second version.

Performing the Upgrade

Next, you need to implement some code in the PerformUpgrade method. This consists of first getting the value of the CurrentVersion property, and at the end of the method, setting the version in the component’s metadata to the current version.

public override void PerformUpgrade(int pipelineVersion)
    // Obtain the current component version from the attribute.
    DtsPipelineComponentAttribute componentAttribute =
      (DtsPipelineComponentAttribute)Attribute.GetCustomAttribute(this.GetType(), typeof(DtsPipelineComponentAttribute), false);
    int currentVersion = componentAttribute.CurrentVersion;

    if (ComponentMetaData.Version < currentVersion)
        //Do the upgrade here

    // Update the saved component version metadata to the current version.
    ComponentMetaData.Version = currentVersion;

The actual upgrade code can vary a good bit, from adding custom properties, adjusting the data types of outputs, or adding / deleting inputs or outputs. I won’t show the logic for these things here, but it’s pretty similar to the same code you’d use in ProvideComponentProperties.

Handling Multiple Upgrades

The code above is based on the sample in Books Online, but there’s a slight issue. Determining what upgrades need to be applied can be more complicated than simply comparing the current version to the ComponentMetaData version. Imagine that you have already upgraded the component from version 0 to version 1, by adding a new property. Now, you discover a need to add another new property, which will result in version 2. What do you do about the property added in version 1? You don’t want to add it twice for components that have already been upgraded to version 1. But it’s also possibly that not all packages have been upgraded from version 0 yet, so for those you need to add both properties. By altering to version check logic a little, you can accommodate upgrading from multiple versions pretty easily:

if (ComponentMetaData.Version < 1)
    //Perform upgrade for V1

if (ComponentMetaData.Version < 2)
    //Perform upgrade for V2

This change will ensure that the appropriate upgrade steps are taken for each version.

Some Other Thoughts

There’s a few things to be aware of with PerformUpgrade. One, it’s called only when the package is loaded, and the version stored in the package’s metadata is different than the binary component. This can occur both at design time (when the package is opened in Visual Studio), or at runtime (when executing the package from DTEXEC, etc).

Two, when you update the CurrentVersion property, and then add the component to a new package, the version number in the package metadata will initially be set to 0. So the next time the package is opened, the PerformUpgrade will be performed. Since the ProvideComponentProperties would have already set the metadata appropriately for new version of the component, the PerformUpgrade can cause errors by attempting to add the same metadata again. This appears to be a bug in the behavior when adding the component to the data flow, and it occurs under both 2005 and 2008. The workaround is code the PerformUpgrade method to check before altering any metadata, to make sure that it doesn’t already exist.

Three, due to what looks like another bug, when the package is opened the second time after the component is initially added to the package, the version will be incremented at the end of PerformUpgrade (assuming you use the code above that updates the version). However, this change does not mark the package as dirty in the designer, so the updated version number will not be saved unless some other property in the package is modified, and then the package is saved. This isn’t a huge problem – though you do need to make sure that the code in PerformUpgrade can be run repeatedly to avoid issues.

That’s pretty much it. Hopefully this will be helpful if you are developing custom components for SSIS.

Batch Destination and the Merge Destination

After I created the Batch Destination for my presentation at the MS BI Conference, I was talking with Matt Masson, who let me know that they’d been working on a MERGE Destination sample component, and would hopefully be releasing it soon. It’s out now on the CodePlex site for the Integration Services team samples. I’ve taken a look at it, and it is very nicely done.

What’s the difference between the two? The MERGE Destination has a much nicer interface, that lets you graphically build a MERGE statement. And, of course it uses the MERGE statement to perform updates and inserts in a single operation. This is nice, but limits you to using a SQL Server 2008 database as a destination. The Batch Destination executes any SQL command you give it, so it can be used on SQL Server 2005 or 2008.

I haven’t done a performance comparison between the two, so I’m not positive which one is faster. However, the MERGE Destination stores the working data in memory, so it should be more efficient at loading the data. This does require the MERGE Destination to create a type and a stored procedure in the destination database, so the appropriate permissions need to be provided. The Batch Destination uses the ADO.NET Bulk Copy functionality, so the data is persisted in a working or temporary table in the target database, and cleaned up afterward, which is likely to be slower. However, you don’t need any special permissions to create a temporary table.

This isn’t intended to be a “this one is better than that one”. I can see using either or both as circumstances dictate. I just think it’s nice to have options.

SSIS Tasks and Components

Todd McDermid (a frequent poster on the MSDN SSIS forums, and creator of the Kimball SCD component) and I have started a new project on CodePlex along with Todd McDermid, the SSIS Community Tasks and Components (ssisCTC) project. This project is intended to be a umbrella project for custom SSIS tasks and components that are released as open source, as well as provide a listing of other open components for SSIS. I’ve published the Batch Destination component that I created to the site, and intend to add some more over the coming months.

Why set up a community project around this? Primarily, because I’ve found that many single person projects get one release, and then are never touched again. Having multiple people involved helps insure that if one person gets busy, or moves on to other interests, there are still people around to maintain and enhance what’s been put out there. So, if you are interested in joining or contributing, please let me know, or get in touch with use through the CodePlex site. Also, if you have or know of a freely available SSIS component or task, let us know, and we’ll be happy to add it to the list.