XML Destination Script Component

I often hear comments along the lines of “Why doesn’t SSIS have an XML destination?”. Unfortunately, I don’t know the answer to that, but I can show you how to implement your own XML destination fairly easily. Thanks to the magic of the script component, it doesn’t take much work at all. This was also a good opportunity for me to use a destination script component, as I haven’t had much need for this in my work.


One caveat to this is that I have kept the XML format very simple and very generic. You can customize the script to handle more complex scenarios, but I’d imagine there would be a lot of work in producing a component that can handle all the different ways of formatting data in XML. Maybe that’s why MS didn’t include it out of the box :)


As an example, I have a very simple package with a single data flow. The data flow has a flat file source, and a script destination component.


XML Dest


The flat file source is standard, nothing interesting there. The work is all done in the script. When adding the script component, make sure to specify that it is a destination component.


When configuring the script component destination, I chose all the input columns that I wanted to include in the XML (in this case, all of the columns from my flat file).


DestScriptColumns


There is no need to alter anything on the Inputs and Outputs page. Since this is a destination component, no output is necessary. On the Connection Managers page, I added a reference to a connection manager of type ”File”, and named it Destination.


DestScriptCM


This is the complete script for the component. I’ll explain it section by section below.

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports System.IO
Imports System.Reflection

Public Class ScriptMain
Inherits UserComponent

Private targetFile As String
Private xmlWriter As StreamWriter
Private rootElement As String = “Root”
Private rowElement As String = “Row”

Public Overrides Sub AcquireConnections(ByVal Transaction As Object)
targetFile = CType(Me.Connections.Destination.AcquireConnection(Nothing), String)
End Sub

Public Overrides Sub PreExecute()
xmlWriter = New StreamWriter(targetFile, False)
xmlWriter.WriteLine(FormatElement(rootElement))
End Sub

Public Overrides Sub PostExecute()
xmlWriter.WriteLine(FormatElement(rootElement, True))
xmlWriter.Close()
End Sub

Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Dim column As IDTSInputColumn90
Dim rowType As Type = Row.GetType()
Dim columnValue As PropertyInfo

With xmlWriter
.Write(FormatElement(rowElement))
For Each column In Me.ComponentMetaData.InputCollection(0).InputColumnCollection
columnValue = rowType.GetProperty(column.Name)
.Write(FormatElement(column.Name) + columnValue.GetValue(Row, Nothing).ToString() + FormatElement(column.Name, True))
Next
.WriteLine(FormatElement(rowElement, True))
End With
End Sub

Private Function FormatElement(ByVal elementName As String) As String
Return FormatElement(elementName, False)
End Function

Private Function FormatElement(ByVal elementName As String, ByVal closingTag As Boolean) As String
Dim returnValue As String
If closingTag Then
returnValue = “</”
Else
returnValue = “<”
End If
returnValue += elementName + “>”

Return returnValue
End Function

End Class


I added the System.IO and System.Reflection to the Imports as I am using objects from both namespaces. There are 4 class level variables defined:

    Private targetFile As String
Private xmlWriter As StreamWriter
Private rootElement As String = “Root”
Private rowElement As String = “Row”

xmlWriter is a StreamWriter, the .NET Framework object used to create the XML file. For more information on this, please see the MSDN documentation. rootElement is the value to enclose the entire XML document with, and rowElement defines what to enclose each individual row in. By changing the values in these variables, you affect what the final XML output looks like.


targetFile holds the path and file name for the destination file. It is set in the AcquireConnections method. Interesting note: when you are working with a file connection manager, calling AcquireConnection simply returns the path and file name of the file. Why not just use a variable or set it explicitly? Doing it this way makes the script behave more like a standard SSIS destination, and makes it clearer what you’d need to change to put the file in a new destination. As a plus, if you set an expression or configuration on the ConnectionString property of the file connection manager, the script will use it automatically.

    Public Overrides Sub AcquireConnections(ByVal Transaction As Object)
targetFile = CType(Me.Connections.Destination.AcquireConnection(Nothing), String)
End Sub

In the PreExecute method, the code creates a new instance of the StreamWriter, and writes the opening tag of the XML file.

    Public Overrides Sub PreExecute()
xmlWriter = New StreamWriter(targetFile, False)
xmlWriter.WriteLine(FormatElement(rootElement))
End Sub

The ProcessInputRow method is where most of the work occurs. I’m using the System.Reflection and the ComponentMetaData objects to process the Row object dynamically. Essentially, this code determines what input columns are available at runtime, based on what was selected in the Script component’s property pages. It then writes each of those columns to the file, surrounding it with an XML tag based on the column name.

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
Dim column As IDTSInputColumn90
Dim rowType As Type = Row.GetType()
Dim columnValue As PropertyInfo

With xmlWriter
.Write(FormatElement(rowElement))
For Each column In Me.ComponentMetaData.InputCollection(0).InputColumnCollection
columnValue = rowType.GetProperty(column.Name)
.Write(FormatElement(column.Name) + columnValue.GetValue(Row, Nothing).ToString() + FormatElement(column.Name, True))
Next
.WriteLine(FormatElement(rowElement, True))
End With
End Sub


 Finally, in the PostExecute method, the closing tag is written, and the file is closed.

    Public Overrides Sub PostExecute()
xmlWriter.WriteLine(FormatElement(rootElement, True))
xmlWriter.Close()
End Sub

 With the script component in place, if I read in a text file with these values:


NAME;VALUE;DATE
A;1;1/1/2000
A;2;1/2/2000
A;3;1/3/2000
A;4;1/4/2000
A;5;1/5/2000

I get an output that looks like this:

<Root>
<Row><NAME>A</NAME><VALUE>1</VALUE><DATE>1/1/2000 12:00:00 AM</DATE></Row>
<Row><NAME>A</NAME><VALUE>2</VALUE><DATE>1/2/2000 12:00:00 AM</DATE></Row>
<Row><NAME>A</NAME><VALUE>3</VALUE><DATE>1/3/2000 12:00:00 AM</DATE></Row>
<Row><NAME>A</NAME><VALUE>4</VALUE><DATE>1/4/2000 12:00:00 AM</DATE></Row>
<Row><NAME>A</NAME><VALUE>5</VALUE><DATE>1/5/2000 12:00:00 AM</DATE></Row>
</Root>

The nicest thing about this script, in my opinion, is that it does not have to be modified if the input columns change. The use of the System.Reflection classes allows the code to process the Row object at runtime. This does come at the expense of performance, though, so I wouldn’t recommend using this in high volume scenarios without some tweaks.


 


 


 


 

18 Comments

  1. jamiet says:

    Great stuff John. I’ve written something similar before but I’ve never used Reflection to do it. Very cool indeed.

    -Jamie

  2. jwelch says:

    I was looking for a way to create a generic script, and figured that I could use the metadata to make it a bit more dynamic. However, I haven’t done a lot of testing on it, so I’m not sure if there are any hidden problems.

  3. Joost says:

    Great stuff! Just what I need!
    I got it working for a file with a few columns. But I recieve a .Net error when running it with 45 columns (711 chars record length).

    The error is within this part op de code: “columnValue.GetValue(Row, Nothing).ToString()” When I remove that part is ‘works’ without errors. To bad you cann’t debug. Any suggestions?

    Compleet Error:

    at XmlScriptComponent.ScriptMain.Input0_ProcessInputRow(Input0Buffer Row) in dts://Scripts/ScriptComponent_e23d01095c0747ac9e08095ca648fb1d/ScriptMain:line 45
    at XmlScriptComponent.UserComponent.Input0_ProcessInput(Input0Buffer Buffer) in dts://Scripts/ScriptComponent_e23d01095c0747ac9e08095ca648fb1d/ComponentWrapper:line 29
    at XmlScriptComponent.UserComponent.ProcessInput(Int32 InputID, PipelineBuffer Buffer) in dts://Scripts/ScriptComponent_e23d01095c0747ac9e08095ca648fb1d/ComponentWrapper:line 21
    at Microsoft.SqlServer.Dts.Pipeline.ScriptComponentHost.ProcessInput(Int32 inputID, PipelineBuffer buffer)

  4. jwelch says:

    Is the column value NULL? That might cause a problem.

  5. Joost says:

    > Is the column value NULL? That might cause a problem.
    I thought that was the problem too… so I filled a textfile with 10 rows with 711 times an ‘X’, but that didn’t fix the problem. I also added a Dataviewer between the source and the script destination and saw that all columns (all are dt_str) where filled.

    After that I wrote a try catch around the “columnValue.GetValue(Row, Nothing).ToString()”:

    Try
    .Write(FormatElement(column.Name) + columnValue.GetValue(Row, Nothing).ToString() + FormatElement(column.Name, True))
    Catch ex As Exception
    .Write(FormatElement(column.Name) + ex.Message.ToString() + FormatElement(column.Name, True))
    End Try

    The result is that some columns have the correct value and others show “Object reference not set to an instance of an object.” as value:
    XXX
    Object reference not set to an instance of an object.
    Object reference not set to an instance of an object.
    X

    After that I changed the data in de textfile and saw that the same rows failed. But all rows have the same definition. The only difference is the length of the column. I’m totally flabbergasted…

  6. Joost says:

    Hi Again,

    I rewrote your script a little bit en used the ‘ProcessInput’ instead of ‘Input0_ProcessInputRow’ and now it works fine again. Don’t know what went wrong. Thanks for the help. Let me know if you want the source code.

  7. jwelch says:

    Yes, I’d like to see what you ended up with. I was working with some similar code recently, and getting some very unexpected results. I’m wondering if we weren’t seeing a related problem.

  8. Joost says:

    ‘the modified code:

    Imports System
    Imports System.Data
    Imports System.Math
    Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
    Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
    Imports System.IO
    Imports System.Reflection

    Public Class ScriptMain
    Inherits UserComponent

    Private targetFile As String
    Private xmlWriter As StreamWriter
    Private rootElement As String = “Root”
    Private rowElement As String = “Row”

    Private columns As Integer()
    Private columnames As String()

    Public Overrides Sub AcquireConnections(ByVal Transaction As Object)
    targetFile = CType(Me.Connections.Destination.AcquireConnection(Nothing), String)
    End Sub

    Public Overrides Sub PreExecute()
    xmlWriter = New StreamWriter(targetFile, False)
    xmlWriter.WriteLine(FormatElement(rootElement))

    Dim input As IDTSInput90 = ComponentMetaData.InputCollection(0)
    ReDim columns(input.InputColumnCollection.Count)
    columns = Me.GetColumnIndexes(input.ID)

    Dim column As IDTSInputColumn90

    ReDim columnames(input.InputColumnCollection.Count)
    Dim counter As Integer
    counter = 0
    For Each column In Me.ComponentMetaData.InputCollection(0).InputColumnCollection
    columnames(counter) = column.Name
    counter = counter + 1
    Next

    End Sub

    Public Overrides Sub PostExecute()
    xmlWriter.WriteLine(FormatElement(rootElement, True))
    xmlWriter.Close()
    End Sub

    ‘Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    ‘ Dim column As IDTSInputColumn90
    ‘ Dim rowType As Type = Row.GetType()
    ‘ Dim columnValue As PropertyInfo

    ‘ With xmlWriter
    ‘ .Write(FormatElement(rowElement))
    ‘ For Each column In Me.ComponentMetaData.InputCollection(0).InputColumnCollection
    ‘ columnValue = rowType.GetProperty(column.Name)
    ‘ Try
    ‘ .Write(FormatElement(column.Name) + columnValue.GetValue(Row, Nothing).ToString() + FormatElement(column.Name, True))
    ‘ Catch ex As Exception
    ‘ .Write(FormatElement(column.Name) + “” + FormatElement(column.Name, True))
    ‘ End Try
    ‘ Next
    ‘ .WriteLine(FormatElement(rowElement, True))
    ‘ End With
    ‘End Sub

    Public Overrides Sub ProcessInput(ByVal InputID As Integer, ByVal Buffer As Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer)
    While Buffer.NextRow()
    ‘Dim values As New System.Text.StringBuilder
    xmlWriter.Write(FormatElement(rowElement))
    Dim counter As Integer
    counter = 0

    For Each index As Integer In columns
    Dim value As Object = Buffer(index)
    xmlWriter.Write(FormatElement(columnames(counter).ToString()) + value.ToString() + FormatElement(columnames(counter).ToString(), True))
    counter = counter + 1

    Next
    xmlWriter.WriteLine(FormatElement(rowElement, True))
    End While
    End Sub

    Private Function FormatElement(ByVal elementName As String) As String
    Return FormatElement(elementName, False)
    End Function

    Private Function FormatElement(ByVal elementName As String, ByVal closingTag As Boolean) As String
    Dim returnValue As String
    If closingTag Then
    returnValue = “”

    Return returnValue
    End Function

    End Class

  9. http:// says:

    This line still causes Problems when NULL Values are processed:

    xmlWriter.Write(FormatElement(columnames(counter).ToString()) + value.ToString() + FormatElement(columnames(counter).ToString(), True))

    I used an IF Clause to check this case and write “” instead of value.ToString().

    But this is a great script, thank you!

  10. http:// says:

    I used a similar approach when I needed to output an XML file this time last year. However, I did reflection on the rowType class so I could detect and skip null values. I also used the overloaded WriteValue method so values like dates are written out in the ISO 8601 format (2000-01-01T00:00:00 instead of 1/1/2000 12:00:00 AM).

    Dim writer As XmlTextWriter

    Public Overrides Sub PreExecute()
    writer = New XmlTextWriter(“data.xml”, Nothing)
    writer.Formatting = Formatting.Indented
    writer.WriteStartDocument()
    writer.WriteStartElement(“Rows”)
    End Sub

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    writer.WriteStartElement(“Row”)

    For Each prop As PropertyInfo In rowType.GetProperties()
    If Not prop.Name.EndsWith(“IsNull”) Then
    If CBool(rowType.GetProperty(prop.Name & _
    “_IsNull”).GetValue(Row, Nothing)) = False Then
    writer.WriteStartElement(prop.Name)
    writer.WriteValue(prop.GetValue(Row, Nothing))
    writer.WriteEndElement()
    End If
    End If
    Next

    writer.WriteEndElement()
    End Sub

    Public Overrides Sub PostExecute()
    writer.WriteEndElement()
    writer.WriteEndDocument()
    writer.Close()
    End Sub

  11. julie smith says:

    Thanks for this solution. Very helpful as always :)

  12. julie smith says:

    For anyone using SSIS 2008, the IDTSInputColumn90 needs to be changed to IDTSInputColumn100 .

  13. Sean says:

    SSIS removes any “.” and “_” in your field names when it creates the objects in the script component, so “Some_Field” becomes “SomeField”. You need to handle this when getting the columnvalue object:

    columnValue = rowType.GetProperty(column.Name.Replace(“_”, “”).Replace(“.”, “”))

    Hopefully this saves someone else some time!

    Sean

  14. Cybrduck says:

    Thanks for the handy script. It really saved me time and reduced my stress level over the suspiciously absent XML Destination.

  15. Umar says:

    I am newbie to SSIS and C#
    when I am trying to execute it it is giving me following warnings, Can any one please guide me please how to solve this issue and make it working

    Warning 1 Return type of function ‘Destination’ is not CLS-compliant. C:\Documents and Settings\umemon01\Local Settings\Temp\SSIS\SC_dba966fa656444d3b5d08a765ea50f03\ComponentWrapper.vb 50 30 SC_dba966fa656444d3b5d08a765ea50f03

    Warning 2 ‘ScriptMain’ is not CLS-compliant because it derives from ‘UserComponent’, which is not CLS-compliant. C:\Documents and Settings\umemon01\Local Settings\Temp\SSIS\SC_dba966fa656444d3b5d08a765ea50f03\main.vb 8 14 SC_dba966fa656444d3b5d08a765ea50f03

    Warning 3 Type of parameter ‘Row’ is not CLS-compliant. C:\Documents and Settings\umemon01\Local Settings\Temp\SSIS\SC_dba966fa656444d3b5d08a765ea50f03\main.vb 25 55 SC_dba966fa656444d3b5d08a765ea50f03

Leave a Reply