Bundling Errors in the Data Flow

I’ve posted about handling multiple errors in SSIS before, but that post was addressing it in the context of the control flow. What if you want to capture a series of data errors from the data flow? An example of this might be a requirement to capture and email all the rows that fail in a Lookup transform, so that the reference table can be updated with new values.

There are a number of ways to do this. You can send the error output of the Lookup to a Flat File destination, and then attach that to an email message using the Send Mail task. If you want to avoid writing anything to disk, you could send the error rows to a Recordset destination, and then use a Script Task to put them into a string variable to send. Or you could just write them directly to a string, using a Script destination.

I’ve put together a sample package that shows how to send the error rows to either a flat file or a string variable, and then how to email both. The control flow consists of a Data Flow task, and two Send Mail tasks (you won’t need both, they are just there for illustration purposes).

image

The data flow is fairly basic. The OLE DB Source uses a hard-coded query to return the numbers 1 through 5. The Lookup uses a hard-coded query to return a reference table with 1 and 2, so we expect 3, 4, and 5 to fail. The Row Count transform is used to capture the number of rows in error, so that we know if an email needs to be sent. The Multicast is used so that we can show both methods of handling the error rows. The Write Errors to File destination simply sends the error rows to a flat file.

image

The Write Errors to String Script Destination uses a StringBuilder class to capture each row, then writes it to the errorRows variable at the end of data flow processing. The StringBuilder is more efficient for handling large strings, which is why it is used here. If the package was processing a large number of rows, the error row string could get quite large.

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports System.Text
 
Public Class ScriptMain
    Inherits UserComponent
    Dim errorMessages As StringBuilder = New StringBuilder()
 
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        errorMessages.AppendLine(String.Format("Key value {0} not found in the reference table.", Row.KEYVALUE))
    End Sub
 
    Public Overrides Sub PostExecute()
        Dim vars As IDTSVariables90
 
        Me.VariableDispenser.LockOneForWrite("errorRows", vars)
        vars("errorRows").Value = errorMessages.ToString()
        vars.Unlock()
    End Sub
 
End Class

Back in the control flow, each of the Send Mail tasks has a precedence constraint that also has an expression set on it. The expression is just checking to see if there were any errors. If not, we don’t need to send the emails.

image

The Send Mail – No Attachment task uses the string variable populated by the Script destination in the data flow as the Message Source for the email.

image

The Send Mail – attachment, on the other hand, uses a file as an attachment to the message. There is something to point out with this method. The package has a string variable defined that holds the path and filename of the error file. This variable is used in an expression on the ConnectionString property of the flat file connection manager, and in an expression on the FileAttachments property of this Send Mail task. This lets us set the path in one place, and avoiding changing it in multiple locations if it ever needs to be updated.

image

That’s all there is to it. I’ve uploaded the sample package to my SkyDrive, located here. To run it, you’ll need to update the SMTP server address in the SMTP connection manager, and the ErrorFile variable to point to a valid path on your machine.

This entry was posted in Uncategorized. Bookmark the permalink.

Comments are closed.