Warning about Using String Variables in Derived Column Expressions

I ran across an interesting behavior in SSIS this week, and thought it was worth sharing. Occasionally, I’ll have the need to use the value of a variable (set in the control flow) in the data flow. The typical way to accomplish this is to use a Derived Column transformation to introduce the variable’s value into the data flow. However, you need to be cautious if you are doing this with a String variable.

When you are working with Derived Column expressions, the output type of the Derived Column expression is based on the input values. For example, inputting an expression like 1+2 will result in an output column data type of DT_I4 (an integer). Basically, the editor guesses the data type based on the input values. Generally, this is OK – if you are referencing other data flow columns, or variables that aren’t of type string, the lengths of the values are static. However, when the editor calculates the length of string variables, it uses the current value in the variable. So, if your variable has 3 characters, the output data type only expects three characters. If the value in the variable is static and never changes, this is fine. But if the value is updated regularly (perhaps it’s a variable in a template package) or the value is set through an expression (so that the value changes at runtime), this can cause truncation errors in the package.

I created a sample package that demonstrates this. The package is pretty simple: a single Data Flow task, with an OLE DB Source, a Derived Column transformation, and a Union All (just there to terminate the data flow).

image

There is a single variable named TestVar that contains the value “1234”.

image

In the Derived Column transformation, there is a single new column added. Note that the calculated length for the output column is 4, matching the value “1234” in the variable.

image

If the package is run right now, everything works fine. But that’s not the case if you change the value in the variable to “123456”.

image

Running the package after changing the value results in the following error:

[Derived Column [91]] Error: The "component "Derived Column" (91)" failed because truncation occurred, and the truncation row disposition on "output column "ColumnFromStringVar" (289)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component.

This occurs because the metadata in the Derived Column transformation isn’t updated when the variable’s value is changed. So, to avoid seeing this error occur in your packages, you need to explicitly set the output column’s length.

In 2005, you could change the calculated data type by editing the data type, length, precision, and scale fields. In 2008, however, the values are locked. You can change the data type by going into the advanced editor for the Derived Column. However, it’s easier to simply cast the string value, to force the Derived Column editor to treat it as if it has constant length.

image

By using this approach, as long as your string variable’s value is less than 50 characters, the Derived Column will continue to work. It’s best to set the length of the cast to the same value as the destination column’s length.

There’s a Connect submission on improving this behavior, either by updating the pipeline’s metadata as the string variable’s value changes, or by throwing a validation warning or error if the current value of the variable exceeds the output length in the Derived Column transformation. If you agree that this could use some improvement, you can vote here: https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=470995

In the meantime, I’d highly recommend performing an explicit cast in any Derived Column expression that uses a string variable, particularly if the value is subject to change.

The sample package is on my SkyDrive, if you’d like to see the error in action.

This entry was posted in Uncategorized. Bookmark the permalink.

Comments are closed.