Posts tagged ‘SSIS 2008’

Moving SSIS Packages with ADO.NET Destinations Between 2008 R2 and 2008

As noted by Matt Masson from the Integration Services team, not much has changed in SSIS 2008 R2. In fact, R2 is pretty much identical to 2008, with the exception of the ADO.NET Destination. So if you are developing packages, you can build them in either the 2008 or the 2008 R2 version of BIDS and use them in either environment successfully, as long as the package doesn't contain an ADO.NET Destination. If it does, a package developed in the 2008 R2 version of BIDS will give errors when you open it on a machine with the 2008 version of SSIS installed. This is because the 2008 version of the ADO.NET Destination doesn't know what to do with the new property added in R2.

This wouldn't be a big deal if you could have side-by-side installs of BIDS 2008 and BIDS 2008 R2, but the R2 install replaces the 2008 version of BIDS. So, if you need to move packages developed in 2008 R2 back to 2008 (a common scenario for me), you have to do a little extra work. There are a few values that need to be removed or changed by editing the package XML directly.

As always, it’s a good idea to make a backup of your package before editing the XML directly.

Open the package in your favorite text or XML editor, and look for the section that contains <components>. Underneath that, locate the <component> tag that relates to your ADO NET Destination. The sample below is easy, since the component is named "ADO NET Destination", but that's not always the case (you do give your components meaningful names, right?).

<component id="16" name="ADO NET Destination" componentClassID="{2E42D45B-F83C-400F-8D77-61DDE6A7DF29}" description="Writes to a database using ADO.NET provider." localeId="-1" usesDispositions="true" validateExternalMetadata="True" version="1" pipelineVersion="0" contactInfo="Writes to a database using ADO.NET provider.;Microsoft Corporation; Microsoft SqlServer v10; © 2007 Microsoft Corporation; All Rights Reserved; http://www.microsoft.com/sql/support;1">
  <properties>
    <property id="23" name="TableOrViewName" dataType="System.String" state="default" isArray="false" description="The Destination database table name." typeConverter="" UITypeEditor="Microsoft.DataTransformationServices.Controls.ModalMultilineStringEditor, Microsoft.DataTransformationServices.Controls, Version= 10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" containsID="false" expressionType="Notify">"sample"</property>
    <property id="24" name="BatchSize" dataType="System.Int32" state="default" isArray="false" description="Specify the number of rows per batch. The default is 0 which uses the same size as SSIS internal buffer. " typeConverter="" UITypeEditor="" containsID="false" expressionType="Notify">0</property>
    <property id="25" name="CommandTimeout" dataType="System.Int32" state="default" isArray="false" description="The number of seconds before a command times out. A value of 0 indicates infinite time-out. " typeConverter="" UITypeEditor="" containsID="false" expressionType="Notify">30</property>
    <property id="26" name="UseBulkInsertWhenPossible" dataType="System.Boolean" state="default" isArray="false" description="Specifies whether to use the SqlBulkCopy interface to improve the performance of bulk insert operations. Only certain providers support this interface." typeConverter="" UITypeEditor="" containsID="false" expressionType="Notify">true</property>
    <property id="27" name="UserComponentTypeName" dataType="System.String" state="default" isArray="false" description="" typeConverter="" UITypeEditor="" containsID="false" expressionType="None">Microsoft.SqlServer.Dts.Pipeline.ADONETDestination, Microsoft.SqlServer.ADONETDest, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91</property>
  </properties>

Once you’ve located the right <component> tag, you need to make two changes:

First, change the version="1" attribute in the <component> tag to version="0".

<component id="16"
           name="ADO NET Destination"
           componentClassID="{2E42D45B-F83C-400F-8D77-61DDE6A7DF29}"
           description="Writes to a database using ADO.NET provider."
           localeId="-1"
           usesDispositions="true"
           validateExternalMetadata="True"
           version="0"
           pipelineVersion="0"
           contactInfo="Writes to a database using ADO.NET provider.;Microsoft Corporation; Microsoft SqlServer v10; © 2007 Microsoft Corporation; All Rights Reserved; http://www.microsoft.com/sql/support;1">

Second, remove the entire <property name="UseBulkInsertWhenPossible"> element. You can comment it out, as shown below, or just delete it.

    <property id="25" name="CommandTimeout" dataType="System.Int32" state="default" isArray="false" description="The number of seconds before a command times out. A value of 0 indicates infinite time-out. " typeConverter="" UITypeEditor="" containsID="false" expressionType="Notify">30</property>
    <!--<property id="26" name="UseBulkInsertWhenPossible" dataType="System.Boolean" state="default" isArray="false" description="Specifies whether to use the SqlBulkCopy interface to improve the performance of bulk insert operations. Only certain providers support this interface." typeConverter="" UITypeEditor="" containsID="false" expressionType="Notify">true</property>-->
    <property id="27" name="UserComponentTypeName" dataType="System.String" state="default" isArray="false" description="" typeConverter="" UITypeEditor="" containsID="false" expressionType="None">Microsoft.SqlServer.Dts.Pipeline.ADONETDestination, Microsoft.SqlServer.ADONETDest, Version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91</property>

If you have a lot of packages to move back and forth, you probably don’t want to hand edit this each time. I wrote a little application that will process a folder of .dtsx files, and strip the 2008 R2 information out. This is done using an XSLT transform (available in the source code). The application and source are available to download from my SkyDrive.
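If you'd rather script this yourself, the heart of such a tool is just a loop that applies an XSLT transform to each .dtsx file in a folder. Here's a minimal sketch (this is not the actual application; the Downgrade2008R2.xslt file name and the backup convention are my own assumptions):

using System;
using System.IO;
using System.Xml.Xsl;

class DowngradePackages
{
    static void Main(string[] args)
    {
        string folder = args[0];

        // Load the XSLT that strips the R2-only attributes and properties.
        XslCompiledTransform transform = new XslCompiledTransform();
        transform.Load("Downgrade2008R2.xslt");

        foreach (string package in Directory.GetFiles(folder, "*.dtsx"))
        {
            // Keep a backup of the original package before overwriting it.
            File.Copy(package, package + ".bak", true);

            string temp = package + ".tmp";
            transform.Transform(package + ".bak", temp);
            File.Copy(temp, package, true);
            File.Delete(temp);

            Console.WriteLine("Downgraded " + package);
        }
    }
}

The transform overwrites each package in place, so the .bak copy is your safety net if something goes wrong.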

Passing an Object from a Parent Package To a Child

Occasionally, you may run into the need to pass values between packages. In most cases, you can use a Parent Package Variable configuration to pass the value of a variable in the calling (or parent) package to the called (or child) package. However, Parent Package Variable configurations can’t be set up on variables of type Object. Fortunately, you can work around this pretty easily, thanks to the fact that the calling package variables are inherently accessible in the called packages. 

I’ve set up a sample parent and child package to illustrate this. The parent package is straightforward. It uses an Execute SQL task to populate an object variable named TestVar with a recordset object, and then calls the child package.

[Screenshots: the parent package]

The child package has a Foreach Loop Container to iterate over the recordset object. It has a Script task that is used to copy the parent package’s variable (TestVar) to a local variable named LocalVar. This is the variable that the Foreach Loop is configured to use. Why copy the value? If you don’t have a local variable to reference in the Foreach Loop, it won’t validate properly.

[Screenshots: the child package]

The script in Copy Variable is pretty simple. It relies on the fact that you can reference parent package variables inherently, as they are included in the collection of variables accessible in the local package. The script just copies the value from one variable to the other, so that the Foreach Loop will have something to do.

public void Main()
{
    Variables vars = null;
    Dts.VariableDispenser.LockForWrite("User::LocalVar");
    Dts.VariableDispenser.LockForRead("User::TestVar");
    Dts.VariableDispenser.GetVariables(ref vars);

    vars["User::LocalVar"].Value = vars["User::TestVar"].Value;
    vars.Unlock();

    Dts.TaskResult = (int)ScriptResults.Success;
}

Please note that for this to work, you cannot have a variable in the child package with the same name as the variable in the parent package. If you do, the local variable will hide the parent variable. Outside of that, this works really well for passing object values between packages. The same technique can also be used in reverse to send values back to the parent package, if you have that need.
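For the reverse direction, the script is the same pattern with the locks and the assignment flipped; something like this:

// Send a value from the child back to the parent package's variable.
Variables vars = null;
Dts.VariableDispenser.LockForRead("User::LocalVar");
Dts.VariableDispenser.LockForWrite("User::TestVar");
Dts.VariableDispenser.GetVariables(ref vars);

vars["User::TestVar"].Value = vars["User::LocalVar"].Value;
vars.Unlock();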

The sample has been uploaded to my SkyDrive. Let me know if you have any questions.

SSIS and SQL Azure – Getting Started

Since SQL Azure is currently in a Community Technology Preview, the technology and the information provided below are subject to change. This post is based on the August 18th CTP.

Now that I've been working with SSIS against SQL Azure for a few days, I thought I'd post about my experiences. Overall, I'm pretty happy with it, considering that it is a pre-release product. I've had some good and some bad experiences, but with what I am seeing right now, and the direction it's heading in, I think it has a good future.

Prior to the CTP, people wanting to get an early start with SQL Azure were advised to develop locally against SQL Express. Theoretically, you could then simply change your connection strings to point to SQL Azure, and away you go. In practice, that's not exactly how it worked for me with SSIS (your mileage may vary – .NET apps are probably much easier to port).

Make sure you read through the documentation first – there’s a lot of good information there, and some of it is pretty important. The first thing to note is that SQL Azure currently does not support OLE DB. The normal recommendation for SSIS is to use the OLE DB Source or Destination to access SQL Server. However, if you want to port your packages to SQL Azure, you must use the ADO.NET Source and Destinations. This is fine for 2008, but if you are using SSIS 2005, there is no ADO.NET Destination, so you would have to implement your own through a script component.

The second thing to be aware of is that bulk insert operations are not currently supported (though it’s been said they will be available in a later CTP). Since the ADO.NET Destination doesn’t support bulk inserts anyway, this isn’t a huge issue. However, if you are writing your own destination (in a script component or custom component), you can’t currently use the ADO.NET SqlBulkCopy class.
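If you do find yourself writing a script component destination for SQL Azure today, the fallback is plain parameterized inserts. Here's a rough sketch of that approach (the table, the columns, and the AzureConnString variable are placeholder names, not from a real package):

using System.Data;
using System.Data.SqlClient;

public class ScriptMain : UserComponent
{
    private SqlConnection connection;
    private SqlCommand command;

    public override void PreExecute()
    {
        base.PreExecute();

        // AzureConnString is a read-only package variable holding the connection string.
        connection = new SqlConnection(Variables.AzureConnString);
        connection.Open();

        command = new SqlCommand(
            "INSERT INTO [dbo].[SampleTable] ([SampleTableID], [Name]) VALUES (@id, @name)",
            connection);
        command.Parameters.Add("@id", SqlDbType.Int);
        command.Parameters.Add("@name", SqlDbType.VarChar, 50);
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // One round trip per row: slow, but it works without bulk insert support.
        command.Parameters["@id"].Value = Row.SampleTableID;
        command.Parameters["@name"].Value = Row.Name;
        command.ExecuteNonQuery();
    }

    public override void PostExecute()
    {
        command.Dispose();
        connection.Close();
        base.PostExecute();
    }
}

It won't win any performance awards, but it gets data into SQL Azure until bulk insert support shows up.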

So, with those two caveats out of the way, it should be pretty much like creating any data flow in SSIS – add a source, add a destination, and you are ready to go. However, I got the following error when using the ADO.NET Source and Destination:

[Screenshot: the error message]

This error appears to come up because SQL Azure does not currently support the system catalog tables that ADO.NET calls to retrieve table information. For the ADO.NET Source, since you can't type the table name in, the simplest way to work around this is to use the SQL Command option and specify a SQL statement, instead of using the Table or View option.

[Screenshot: the ADO.NET Source configured with the SQL Command option]

For the ADO.NET Destination, your only choice is to use the Table or View option, so you can just type the table name in. The table name must be provided in the following format: "schema"."tablename".

Once this is done, you can run the package, and watch your data move. Once or twice, I saw validation warnings that prevented the package from running, but these all went away the next time I ran it, so I’m guessing it was a momentary connectivity issue. I’m on the road right now, so I don’t have the most stable internet connection available.

I’ll be posting a follow up to this soon that talks about performance, and how you can tune your packages to move data in and out more quickly. I should also have a few performance test results to share.

Warning about Using String Variables in Derived Column Expressions

I ran across an interesting behavior in SSIS this week, and thought it was worth sharing. Occasionally, I’ll have the need to use the value of a variable (set in the control flow) in the data flow. The typical way to accomplish this is to use a Derived Column transformation to introduce the variable’s value into the data flow. However, you need to be cautious if you are doing this with a String variable.

When you are working with Derived Column expressions, the output type of the expression is determined from the input values. For example, an expression like 1+2 will result in an output column data type of DT_I4 (an integer). Basically, the editor guesses the data type based on the inputs. Generally, this is OK: if you are referencing other data flow columns, or variables that aren't of type String, the lengths of the values are static. However, when the editor calculates the length of a string variable, it uses the variable's current value. So, if your variable currently holds 3 characters, the output data type only allows for 3 characters. If the value in the variable is static and never changes, this is fine. But if the value is updated regularly (perhaps it's a variable in a template package), or the value is set through an expression (so that it changes at runtime), this can cause truncation errors in the package.

I created a sample package that demonstrates this. The package is pretty simple: a single Data Flow task, with an OLE DB Source, a Derived Column transformation, and a Union All (just there to terminate the data flow).

[Screenshot: the data flow]

There is a single variable named TestVar that contains the value “1234”.

[Screenshot: the TestVar variable]

In the Derived Column transformation, there is a single new column added. Note that the calculated length for the output column is 4, matching the value “1234” in the variable.

[Screenshot: the Derived Column transformation]

If the package is run right now, everything works fine. But that’s not the case if you change the value in the variable to “123456”.

[Screenshot: the variable with the new value]

Running the package after changing the value results in the following error:

[Derived Column [91]] Error: The "component "Derived Column" (91)" failed because truncation occurred, and the truncation row disposition on "output column "ColumnFromStringVar" (289)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component.

This occurs because the metadata in the Derived Column transformation isn’t updated when the variable’s value is changed. So, to avoid seeing this error occur in your packages, you need to explicitly set the output column’s length.

In 2005, you could change the calculated data type by editing the data type, length, precision, and scale fields. In 2008, however, those values are locked. You can change the data type by going into the advanced editor for the Derived Column, but it's easier to simply cast the string value, which forces the Derived Column editor to treat it as if it has a constant length.

[Screenshot: the cast expression in the Derived Column editor]
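For example, assuming the variable is User::TestVar and the destination column holds 50 characters, the expression would look something like this:

(DT_WSTR, 50)@[User::TestVar]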

With this approach, as long as the string variable's value is no longer than 50 characters, the Derived Column will continue to work. It's best to set the length of the cast to match the destination column's length.

There’s a Connect submission on improving this behavior, either by updating the pipeline’s metadata as the string variable’s value changes, or by throwing a validation warning or error if the current value of the variable exceeds the output length in the Derived Column transformation. If you agree that this could use some improvement, you can vote here: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=470995

In the meantime, I’d highly recommend performing an explicit cast in any Derived Column expression that uses a string variable, particularly if the value is subject to change.

The sample package is on my SkyDrive, if you’d like to see the error in action.

Creating Multiple Rows in a Text File from a Single Row

A recent post on the SSIS forums was asking about creating multiple rows in a text output file from a single row in the data flow. Given a set of rows like this:

John Smith 1/1/1900 Value A
Jane Smith 12/1/1900 Value B

the poster wanted this output in the text file:

John Smith  
1/1/1900 Value A
Jane Smith  
12/1/1900 Value B

Basically, the poster wanted a line break in the middle of a row of data, while keeping a line break at the end.

There are a few ways to accomplish this in SSIS. One is to use a Script task to create the file, which gives you complete control over the format. There are also a couple of ways to do it directly in the data flow. The first is to use a Multicast transform to create two copies of each row, perform some string concatenation, and then combine them using a Union All or a Merge.

[Screenshot: the data flow]

The Derived Column transforms are used to put the multiple columns into a single column, so that a variable-length record can be written to the flat file. The Sort transforms and the Merge combine the rows into the proper order before sending them to the flat file.
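As a rough illustration (the column names here are made up, and I'm assuming they are already strings), the first Derived Column might output just the name, and the second the rest of the row, each as a single column:

Name
BirthDate + " " + ValueColumn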

The other option (and one that probably falls under the category of stupid SSIS tricks) is to hack the flat file connection manager a little bit. You can set the column delimiters so that a carriage return/line feed is inserted in the middle of the row. However, this isn't as simple as just choosing {CR}{LF} as the column delimiter. SSIS checks to make sure that none of the column delimiters are the same as the row delimiter. Why it does that check, I don't know, given the way it parses flat files. Regardless, you have to work around it. So, you can simply select the column where you want to introduce the break, and set its delimiter to {CR}.

[Screenshot: the flat file connection manager columns]

Then insert a new column immediately following that column, set the output width to 0, and set the column delimiter to {LF}.

[Screenshot: the added zero-width column]

Now the output will include a carriage return / linefeed between the columns.

The sample package for this is located here. It is SSIS 2008, but the concepts are the same for 2005.

Code Snippets for SSIS 2008

If you develop in Visual Studio using C# or VB, then you are probably familiar with code snippets. They are little chunks of code that you can insert into your application and save you some typing. For example, if you are writing a C# application, and you type “if” and hit TAB twice, the following will be inserted into your code:

[Screenshot: the expanded if snippet]

You can enter in the value to be evaluated, and then type in the rest of your code. I like code snippets because they save me from typing the same things over and over again. However, in SSIS 2005 scripting, code snippets weren’t available, due to the VSA environment used by the script objects. Now that we have a real scripting environment in SSIS 2008, code snippets can be used, although there are a few things you need to know.

Two of the most common activities I do in SSIS scripts are reading from and writing to variables. So, I put together a few code snippets that do this. Here’s what the snippet for reading a value from a variable looks like:

[Screenshot: the read-variable snippet]
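Based on the standard variable-locking pattern shown earlier in this post, the code that the read snippet expands to looks roughly like this (an approximation of the shape, not the exact snippet contents):

// Read a value from an SSIS variable (approximate snippet expansion).
Variables vars = null;
Dts.VariableDispenser.LockForRead("User::MyVariable");
Dts.VariableDispenser.GetVariables(ref vars);
object value = vars["User::MyVariable"].Value;
vars.Unlock();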

To create a snippet, you define the code snippet in an XML format, and save it to a file with a .snippet extension. I'm not going to post the code for the snippet here, as it is fairly verbose, but you can look at the .snippet file in any text editor. Once you have the file, you need to put it in a specific location for the Visual Studio environment to recognize it. This is where it can get a little confusing. Normally, snippets go into the "C:\Users\<user>\Documents\Visual Studio 2008\Code Snippets\<language>\My Code Snippets" folder. However, this is the location used for the full Visual Studio environment. Since the script editors in SSIS use the Visual Studio Tools for Applications environment, they get their own folders, one for the Script Task and one for the Script Component:

  Script Task: "C:\Users\<user>\Documents\Integration Services Script Task\Code Snippets\<language>\My Code Snippets"

  Script Component: "C:\Users\<user>\Documents\Integration Services Script Component\Code Snippets\<language>\My Code Snippets"

This actually works out well, because the code for the snippets is slightly different between the Script Task and Script Component. This is due to the object references being slightly different between the two.

I've provided two C# snippets for reading from and writing to variables in the Script Task, and the same two snippets altered to work in the Script Component. To use them, simply copy them to the folders above, then type "SSISReadOneVariable" or "SSISWriteOneVariable" as a shortcut, followed by two tabs, to insert them. You can change the shortcut in the snippet file if you'd like to use something else. It also wouldn't be difficult to convert them to VB, if that's your preference. The snippets are located on my SkyDrive. If you have ideas for other useful snippets, post a comment. Enjoy!

What’s Happening on SQL Server When My Package Fails?

We encountered an issue recently in an ETL process during the extract portion. When extracting from one very large table using a NOLOCK hint, we were getting errors indicating that the data was changing while we were extracting it. Since no other processes were supposed to be using the table during the same time window as the ETL, this was a source of some confusion. With nothing to indicate what, if anything, was changing the data, we needed a quick way to see what was going on in the database at the point the package failed.

Normally, when troubleshooting database activity problems, I'd just run sp_who or sp_who2. However, since the problem was occurring between 2 and 3 AM, nobody was terribly excited about getting up to monitor the server. We didn't really want to leave Profiler running all night, either. So, we set up the package to log the current database activity whenever an error occurred. Basically, we just added a data flow to the OnError event handler. The data flow runs the sp_who stored procedure to get the current activity on the database, and saves the results to a text file.

This was pretty simple, and I’ve created a sample package to show it. The main control flow has a script task that generates an error, purely for the purpose of firing the OnError event handler.

[Screenshot: the control flow]

The event handler does a check to make sure this is the first time it has fired, since we didn't want to log the activity multiple times. It then runs a data flow, and once that completes, sets a flag to ensure it doesn't fire again.

[Screenshot: the OnError event handler]
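One simple way to implement the fire-once check (the User::ErrorLogged variable name and this particular wiring are my assumptions, not necessarily what the sample package does) is a Boolean variable, an expression on the precedence constraint leading into the data flow, and a one-line Script task after it:

// Precedence constraint expression into the data flow: !@[User::ErrorLogged]
// Script task after the data flow flags that the activity has been logged:
Dts.Variables["User::ErrorLogged"].Value = true;
Dts.TaskResult = (int)ScriptResults.Success;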

The data flow runs sp_who. While sp_who2 gives better information, it doesn't provide metadata in a format that SSIS can easily consume. sp_who, on the other hand, works right out of the box, so we went with it for simplicity. The output is sent to a text file.

[Screenshot: the data flow]

It was pretty simple to implement, and it served the purpose. By reviewing the logged information, we could tell what other processes were running and which computers they originated from. If you ever encounter a similar situation, hopefully this can help. The template package is located on my SkyDrive.

Batch Destination and the Merge Destination

After I created the Batch Destination for my presentation at the MS BI Conference, I was talking with Matt Masson, who let me know that they’d been working on a MERGE Destination sample component, and would hopefully be releasing it soon. It’s out now on the CodePlex site for the Integration Services team samples. I’ve taken a look at it, and it is very nicely done.

What's the difference between the two? The MERGE Destination has a much nicer interface that lets you graphically build a MERGE statement. And, of course, it uses the MERGE statement to perform updates and inserts in a single operation. This is nice, but it limits you to using a SQL Server 2008 database as the destination. The Batch Destination executes any SQL command you give it, so it can be used against SQL Server 2005 or 2008.

I haven’t done a performance comparison between the two, so I’m not positive which one is faster. However, the MERGE Destination stores the working data in memory, so it should be more efficient at loading the data. This does require the MERGE Destination to create a type and a stored procedure in the destination database, so the appropriate permissions need to be provided. The Batch Destination uses the ADO.NET Bulk Copy functionality, so the data is persisted in a working or temporary table in the target database, and cleaned up afterward, which is likely to be slower. However, you don’t need any special permissions to create a temporary table.

This isn't intended to be a "this one is better than that one" comparison. I can see using either or both as circumstances dictate. I just think it's nice to have options.

Error Redirection with the OLE DB Destination

A question that comes up occasionally is how to handle errors that occur on the OLE DB Destination in the data flow. Errors that occur when using the OLE DB Destination to insert into a table are usually caused by primary key, foreign key, or check constraints violations. When one of these errors occurs, the default behavior of the OLE DB Destination is to fail and not insert any rows, even the good ones. So the question is, "How can you insert the good rows, while redirecting the bad rows somewhere else?"

There is always the option of preprocessing the data prior to sending it to the OLE DB Destination, to ensure that no constraints will be violated. However, this usually means that you are incurring the validation cost twice – once up front, and then again before you insert into the table. You can drop and recreate the constraints on the table, but this incurs some performance penalties itself. As usual with anything dealing with performance, your mileage may vary, so the best way to see which performs best for you is to test it in your environment.

Another approach is to leave the constraints intact on the destination table, and handle the constraint violations through error redirection. You can enable error redirection on the OLE DB Destination, but you have to change the "Maximum insert commit size" property to make it work. By default, this value is set so that the entire set of rows going to the database is committed at once. A failure to insert one row means that no rows will be inserted.

[Screenshot: the Maximum insert commit size property]

If you set this to a value of one, the OLE DB Destination will only try to commit a single row at a time. The problem with this is that single-row inserts are painfully slow. So you probably want to set this value considerably higher, somewhere between 10,000 and 100,000 rows. You may have to try a few different values to determine what works best in your environment.

Now that you have the OLE DB Destination set up to commit multiple, smaller batches instead of one huge batch, you can enable error redirection. However, it will still redirect the entire batch of records that contain an error, not just the individual rows in error. For example, if you defined a batch size of 10,000 rows, and one row in the 10,000 row batch has an error, they will all be redirected to the error output. So how do you set it up to get all the good rows inserted, and get down to just the rows in error? We really want it to work on single row batches for error handling purposes, but as pointed out earlier, that can be very slow.

There is an approach that allows you to get the best of both worlds – single row error handling with good performance. It involves staging the inserts through multiple OLE DB Destinations, each with smaller batch sizes. The last one in the process should have a commit size of 1, so that you get single rows on error redirection.

To illustrate this technique, I've created a sample project. It has a few external dependencies that need to be set up before it will run. You need a database with three tables in it. One is a Numbers table, which is incredibly useful for quickly generating sample data. If you don't have one, please take a look at this post from Adam Machanic to see why you need one. The second table is used to test inserting data into a table with constraints.

CREATE TABLE [dbo].[SampleTable](
    [SampleTableID] [int] NOT NULL,
    [Name] [varchar](50) NULL,
    [Description] [varchar](250) NULL,
    [CheckValue] [int] NULL,
 CONSTRAINT [PK_SampleTable] PRIMARY KEY CLUSTERED 
([SampleTableID] ASC )) ON [PRIMARY]
GO
 
ALTER TABLE [dbo].[SampleTable]  WITH CHECK ADD  CONSTRAINT [CK_SampleTable] CHECK  (([CheckValue]>=(0) AND [CheckValue]<=(9)))
GO

The third table is a duplicate of the second table, but with no constraints. This table is used to catch the error rows.

CREATE TABLE [dbo].[SampleTableError](
    [SampleTableID] [int] NOT NULL,
    [Name] [varchar](50) NULL,
    [Description] [varchar](250) NULL,
    [CheckValue] [int] NULL
) ON [PRIMARY]
GO

The package control flow truncates the destination table (SampleTable), and then runs a data flow.

[Screenshot: the control flow]

The data flow consists of a source that generates 5,000 rows of data, with two of the rows violating constraints on the destination table. The OLE DB Source is connected to an OLE DB Destination that points to SampleTable, with the Maximum Insert Commit Size set to 1,000 rows. The error output is connected to a second OLE DB Destination that inserts into the same table (SampleTable). The second destination has its Maximum Insert Commit Size set to 1. Finally, the error output from that OLE DB Destination is connected to a third one, which inserts the error rows into SampleTableError.

[Screenshot: the data flow]
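If you want to recreate the source query yourself, something along these lines will generate the test data (assuming a Numbers table with a column named n; the two rows chosen to violate the check constraint are arbitrary):

SELECT n AS SampleTableID,
       'Name ' + CAST(n AS varchar(10)) AS Name,
       'Description ' + CAST(n AS varchar(10)) AS Description,
       -- Two rows get a CheckValue outside the 0-9 range allowed by CK_SampleTable.
       CASE WHEN n IN (1000, 3000) THEN 99 ELSE n % 10 END AS CheckValue
FROM dbo.Numbers
WHERE n BETWEEN 1 AND 5000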

The first OLE DB Destination attempts to insert a full batch, giving us good performance. If the batch succeeds, it moves on to the next one, which keeps the inserts happening quickly. However, if the batch has one or more error rows, the entire batch is redirected to the second OLE DB Destination. This one inserts rows one at a time, giving us the single-row error handling that we wanted. Good rows are inserted, and the error rows are redirected to the third OLE DB Destination to be written to an error table. It doesn't have to be an OLE DB Destination that handles the error rows; you could use a Flat File Destination or a script component to process them.

You can modify this technique by using additional stages of OLE DB Destinations (for example, to go from 50,000 rows to 10,000 to 1) but I try not to go beyond three levels to keep things understandable. Your needs may vary, depending on the performance and number of rows you are processing.

I’ve uploaded the sample package to my SkyDrive. This one is done with SSIS 2008, but the same technique works in 2005.

Configuration Changes in SSIS 2008

In a previous post, I detailed some of the challenges that I had encountered when trying to modify the connection string used for SQL Server configurations from the command line. In SSIS 2005, command line options were applied after the configurations were loaded (with the exception of parent package variables). Effectively, this meant that you could not change the location a configuration pointed to from the command line.

This has been changed in SSIS 2008. Configurations are now loaded twice, once when the package is originally loaded, and then again after any command line options are applied. The order of events is:

  1. The package file is loaded.
  2. The configurations specified in the package at design time are applied in the order specified in the package (except for Parent Package Variables).
  3. Command line values are applied.
  4. The configurations specified in the package at design time are reloaded in the order specified in the package (except for Parent Package Variables). The configurations will use any updated values from the command line when they are reloaded, so it is possible to change the connection string for a SQL Server configuration.
  5. Parent Package Variable Configurations are applied.
  6. The package is run.

This is a nice improvement, since you can now set up a package with SQL Server configurations, and point the package at the correct database at runtime simply by using the /CONN switch of DTEXEC. However, it still isn't perfect. Since the design-time configuration is applied twice, a value that you apply using /SET on the command line can be overridden by a preexisting configuration.
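For example, a command along these lines redirects a connection manager before the configurations are reloaded in step 4 (the package, connection manager, and server names are placeholders):

dtexec /file MyPackage.dtsx /CONN ConfigurationDB;"Data Source=ProdServer;Initial Catalog=SSIS_Config;Integrated Security=SSPI;" /REP EWI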

If you’d like to see this behavior, I’ve provided a simple package. The package has a single variable (“TestVar”) defined. It has an XML configuration enabled that sets the value of TestVar to “Dev”. There is a single Script Task that fires an information event that contains the value of the variable.

public void Main()
{
    bool fireAgain = false;
    Variables vars = null;
    Dts.VariableDispenser.LockOneForRead("TestVar", ref vars);
    Dts.Events.FireInformation(0, "", "Value is: " + vars["TestVar"].Value, "", 0, ref fireAgain);
    vars.Unlock();
    Dts.TaskResult = (int)ScriptResults.Success;
}

If I run this package using this command:

dtexec /file Configuration_Overwrite.dtsx /SET \Package.Variables[User::TestVar].Properties[Value];Prod /REP EWI

I’d expect to see the value “Prod” reported in the output. Instead, “Dev” is displayed. If you disable the package configurations and re-run it using the same command, it does report “Prod”.

Overall, I prefer the new behavior, but it does introduce a new issue to be aware of. In general, you need to be careful to not /SET a value that is also specified in a configuration, as it will be overwritten. I’ve filed a Connect suggestion for the addition of /SetBefore and /SetAfter switches to DTEXEC, so that you can explicitly define when you want the /SET applied. If you think it’s a good idea, please vote for it.

The sample package is on my SkyDrive. You may need to update the location of the XML configuration, but it has no other dependencies.

Thanks to Cho Yeung on the SSIS team for clarifying the load order when the package is executed.