Importing Files Using SSIS

A topic that has come up a few times recently is the idea of loading a set of files into a database using SSIS. One of the data flow components in SSIS is the Import Column transform, which allows you to load files into a binary column. There is a great video by Brian Knight that shows how to use it, and I recommend viewing that  (12/01/2013 update: evidently, the video this post originally linked to is no longer available). If you are looking for a quick overview of the component, and a different approach for loading the list of filenames to import, then read on.

 

The Import Column transform works in the data flow, and imports one file for each row that is passed through it. It expects a column that contains the file name to import as an input. It outputs a column of type DT_TEXT, DT_NTEXT, or DT_IMAGE, that contains the file contents.

 

I’ve included a sample package with this post that uses the Import Column transform. It has a single data flow that uses a script component to get the list of files to import.

 

pic1[4]

 

The package has two connection managers, one of which points to a SQL Server database where the files will be stored. The other connection manager is a File connection manager, that is pointed to a folder. This is the folder that we want to import the files from.

 

p1

 

The script component was created as a Source. A single output column of type DT_WSTR was added to contain the filenames.

 

1

 

On the connection managers page, the File connection manager is specified so that it can be accessed from the script.

 

1[4]

 

The script uses the Directory class from the System.IO namespace. By calling the GetFiles method, the code can iterate through all of the files in the directory, and output one row for each file.

 

Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports System.IO

Public Class ScriptMain
    Inherits UserComponent

    Public Overrides Sub CreateNewOutputRows()
        Dim fileName As String

        For Each fileName In Directory.GetFiles(Me.Connections.ImportFilesDir.AcquireConnection(Nothing).ToString())
            Output0Buffer.AddRow()
            Output0Buffer.Filename = fileName
        Next

        Output0Buffer.SetEndOfRowset()
    End Sub

End Class

 

The next component is the Import Columns component. Configuring it can be a little difficult. I found the Books Online documentation a little unclear on what had to be done. On the Input Columns tab, the column that contains the filename (including path) to import needs to be selected.

 

1[1]

 

On the Input and Output Properties tab, a new column was added to hold the binary contents of the file. When adding this column, make a note of the LineageID value, as it needs to be used in the next step.

 

1[1]

 

After adding the output column, the input column (that contains the filename, not the file contents), needs to be selected. The LinageID from the previous step needs to be put into the FileDataColumnID property. This tells the component which column to populate with the file contents.

 

1[3]

 

The OLE DB Destination is fairly straightforward, as it just maps the columns from the data flow to the database.

 

Hopefully this helps if you are working with the Import Column transform. The samples are located on my Skydrive.

This entry was posted in Uncategorized and tagged , . Bookmark the permalink.

Comments are closed.