Yesterday I posted about using a destination script component to write XML files from a data flow. Today, I'm continuing in the same vein with a post about writing multiple destination files from a destination script component. The inspiration for this was a post on the MSDN forums, where a developer had a single source file and needed to split it into multiple output files based on some conditional logic. Normally, this could be handled with a conditional split and multiple destinations, but the catch in his scenario was that he didn't know how many destination files he would need.
Based on that, I worked up a quick script destination to accomplish his goal. Later, I’m also going to post an alternative approach that does not use any script.
The data flow is fairly straightforward. The flat file source retrieves a flat file with the following data:
NAME;VALUE;DATE
A;1;1/1/2000
A;2;1/2/2000
A;3;1/3/2000
A;4;1/4/2000
A;5;1/5/2000
B;1;1/1/2000
B;2;1/2/2000
B;3;1/3/2000
B;4;1/4/2000
B;5;1/5/2000
C;1;1/1/2000
C;2;1/2/2000
C;3;1/3/2000
C;4;1/4/2000
C;5;1/5/2000
The goal is to output three files, one each for the A, B, and C rows. I used a folder connection rather than a specific flat file connection. The folder connection specifies the output directory to place the files in.
On the script component (created as a destination), all the available input columns were selected.
The destination folder connection is specified on the Connection Managers page. It is named Destination, which is how the script references it.
The script is as follows:
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports System.IO

Public Class ScriptMain
    Inherits UserComponent

    Private Const fileName As String = "Output"
    Private Const fileExt As String = ".txt"
    Private Const delimiter As String = ";"
    Private targetFolder As String
    Private fileWriter As StreamWriter
    Private fileCount As Int32 = 1
    Private previousValue As String = String.Empty

    Public Overrides Sub AcquireConnections(ByVal Transaction As Object)
        ' The folder connection returns the destination directory as a string.
        targetFolder = CType(Me.Connections.Destination.AcquireConnection(Nothing), String) + "\"
    End Sub

    Public Overrides Sub PostExecute()
        ' Close the last file once all rows have been processed.
        If Not fileWriter Is Nothing Then
            fileWriter.Close()
        End If
    End Sub

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ' When the NAME value changes, close the current file and start a new one.
        If Not Row.NAME = previousValue Then
            If Not fileWriter Is Nothing Then
                fileWriter.Close()
            End If
            fileWriter = New StreamWriter(targetFolder + fileName + fileCount.ToString() + fileExt, False)
            fileCount += 1
            previousValue = Row.NAME
        End If

        ' Write the current row's values to the open file.
        With fileWriter
            .Write(Row.NAME + delimiter)
            .Write(Row.VALUE.ToString() + delimiter)
            .WriteLine(Row.DATE.ToString() + delimiter)
        End With
    End Sub

End Class
The most interesting portion of this script is the ProcessInputRow method. The initial section of the method checks whether the current row's Name value is the same as the previous row's. If it is different, it closes the current file (if one is open), opens a new one, and increments the counter used in the file name. The current row's values are then written out to the file.
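With the sample data above, the result is three files in the destination folder: Output1.txt containing the A rows, Output2.txt the B rows, and Output3.txt the C rows (the names come from the fileName, fileCount, and fileExt values in the script). Note that each line is written with a trailing delimiter, and the exact formatting of the date depends on the data type assigned to the DATE column in the flat file source.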
This script works well for outputting multiple files. In the script, I’m using a simple previous value comparison to see if I need to start a new file, but the logic could be much more complex if necessary.
An advantage of this approach over the non-script approach is that it only requires one pass through the source file. Another advantage is that the comparison logic can be fairly complex and can easily handle triggering new file creation based on number of rows or a trigger row. The pure SSIS approach can’t handle that as easily. However, this approach does require that your rows are sorted properly before being fed into the destination, whereas the alternate approach can handle the rows in any order.
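To illustrate the more complex logic mentioned above, here is a minimal sketch (not part of the original package) of a variation that starts a new file after a fixed number of rows instead of when the NAME value changes. The rowsPerFile constant and rowCount field are hypothetical additions to the class-level declarations; the rest of the script stays the same.

    Private Const rowsPerFile As Int32 = 1000   ' hypothetical: rows to write per output file
    Private rowCount As Int32 = 0               ' hypothetical: rows written so far

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ' Start a new file every rowsPerFile rows rather than on a NAME change.
        If rowCount Mod rowsPerFile = 0 Then
            If Not fileWriter Is Nothing Then
                fileWriter.Close()
            End If
            fileWriter = New StreamWriter(targetFolder + fileName + fileCount.ToString() + fileExt, False)
            fileCount += 1
        End If
        rowCount += 1

        ' Write the current row's values to the open file.
        With fileWriter
            .Write(Row.NAME + delimiter)
            .Write(Row.VALUE.ToString() + delimiter)
            .WriteLine(Row.DATE.ToString() + delimiter)
        End With
    End Sub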