I recently had a need to scan through an SSIS package to find all instances of data flow tasks within the package. This is to support a future enhancement in BIDSHelper. However, I ended up making the code a little more general, so it can be used to get a list of all instances of any type of tasks within a package.
Before looking at the code, there is a concept that it is helpful to understand. SSIS packages have a nested structure. Everything is held within a container. Some containers, like the Sequence or ForEach Loop containers, can hold groups of other containers. The package itself is a container. All containers that can hold multiple containers implement the IDTSSequence interface. The TaskHost container, however, holds a single task, and does not implement the IDTSSequence interface. The TaskHost is what we have to find to identify individual instances of tasks.
The function below is designed to be called recursively. It uses the IDTSSequence interface to iterate through the Executables collection, which can hold either TaskHosts or containers that implement IDTSSequence themselves, which is where the recursion comes in. It also takes the type name as a string parameter. The code checks the type name string against the name of the type as returned by the GetType function. It’s simply checking to see if the typeName string exists within the full string returned, so it doesn’t have to be an complete match. This means you don’t have to specify the full namespace, like “Microsoft.SqlServer.Dts.Tasks.ExecuteSQLTask”. Instead, you can just specify “ExecuteSQL”.
List<DtsContainer> FindExecutablesByType(IDTSSequence sequence, string typeName)
{
string typeNameUpper = typeName.ToUpper();
List<DtsContainer> matchingExecutable = new List<DtsContainer>();
foreach (Executable e in sequence.Executables)
{
if (e.GetType().ToString().ToUpper().Contains(typeNameUpper))
{
matchingExecutable.Add((DtsContainer)e);
}
if (e is TaskHost)
{
TaskHost taskHost = (TaskHost)e;if ((typeNameUpper.Contains(“DATA FLOW”)
|| typeNameUpper.Contains(“DATAFLOW”)
|| typeNameUpper.Contains(“MAINPIPE”)
|| typeNameUpper.Contains(“PIPELINE”))
&& taskHost.InnerObject is IDTSPipeline90)
{
matchingExecutable.Add((DtsContainer)e);
}
else if (taskHost.InnerObject.GetType().ToString().ToUpper().Contains(typeNameUpper))
{
matchingExecutable.Add((DtsContainer)e);
}
}
if (e is IDTSSequence)
{
matchingExecutable.AddRange(FindExecutablesByType((IDTSSequence)e, typeNameUpper));
}}
return matchingExecutable;
}
The function returns a list of DtsContainer objects, which is a base class for all control flow objects. The DtsContainer gives access to the name and description for each object. There is some special handling for Data Flow objects. These are actually implemented as COM objects, so a call to GetType on them returns “ComObject” rather than the .NET class name. To work around this, I have hard-coded a few values for the data flow objects, and a specific check for the IDTSPipeline object. There may be a better way to accomplish this, but I haven’t located it yet. Another limitation of this code is that it does not check event handlers. Any tasks inside of event handlers will not be found. This could be added fairly easily, but wasn’t necessary for my current purposes, so it didn’t make the cut.
To call this function and iterate the entire package, you can use this code:
Microsoft.SqlServer.Dts.Runtime.Application ssisApp = new Microsoft.SqlServer.Dts.Runtime.Application();
Package pkg = ssisApp.LoadPackage(“C:\Temp\MyPackage.dtsx”, null);
List<DtsContainer> containers = FindExecutablesByType((IDTSSequence)pkg, “ExecuteSQL”);
foreach (DtsContainer exec in containers)
{
//Do Something
}
I’ve put together a small GUI around this, posted on my SkyDrive.
If you are interested in learning more about working with SSIS programmatically, please post in the comments. I’m considering doing a few more posts on this, but only if there is interest.