There’s been a few questions on the forums recently about taking column values and pivoting them to rows. For example, I might have an input file with 5 columns: a key column and 4 category columns.
Key; C1; C2; C3; C4
1;1;2;3;4
2;1;2;3;4
3;1;2;3;4
4;1;2;3;4
and I’d like an output that has 2 columns (the key and the category value):
1;1
1;2
1;3
1;4
2;1
2;2
2;3
…
Since my input has a set number of columns, I can use the Unpivot Transform to get the desired result.
But what if I have a variable number of columns to pivot? In that case the input file might look like this:
1;1;2;3
2;1;2;3;4;5
3;1
4;1;2;3;4;5;6;7;8
5;1;2;3;4;5;6;7;8;9
6;1;2;3
The first column still indicates the key, but there are a variable number of categories. In Handling Flat Files with Varying Numbers of Columns, I showed how to handle the varying number of columns by treating each row as a single column. This post is going to expand on that technique and show how to incorporate a pivot into the script.
The connection manager and the flat file source should be set up the same way as in the previous post, so I won’t cover that again. The difference is in the script component. Since I want to output more rows than I’m getting as inputs, I’m setting the script up with an asynchronous output. That is done by setting the SynchronousInputID property to None. Two columns were added to the output, one for the key, and one for the category value.
This is the code inside the script task:
Dim Values() As String
Dim i As Integer
Values = Row.Column0.Split(CChar(“;”))
‘Array is zero based – but zero index is the key value,
‘so start with 1
For i = 1 To Values.GetUpperBound(0)
Output0Buffer.AddRow()
Output0Buffer.Key = CInt(Values(0))
Output0Buffer.Value = CInt(Values(i))
Next
The Split function is used to break the string up into individual columns in an array. The code loops through the array, starting with the second column, since the first one represents the key. For each iteration through the loop, a row is added to the output buffer, and the key and and value are added to the row.
That’s pretty much all there is to it. This technique can be used to handle flat files and columns that contain multiple items (in XML, or as a delimited list).