Script – Agile BI
http://agilebi.com
A community for sharing ideas about business intelligence development using agile methods.

Presentations at Atlanta’s BI Edition SQL Saturday
Mon, 11 Jan 2016 – http://agilebi.com/blog/2016/01/11/presentations-at-atlantas-bi-edition-sql-saturday/

I presented this weekend at SQL Saturday #477 in Atlanta. It was a great event, and very well organized. I appreciate all the attendees at my sessions – there were some great questions and comments. I promised that I’d publish my slides and sample code, so here it is.

Getting Started with SSIS Script Tasks and Components

This session was an introduction to the scripting objects in SSIS, and how they can be used to extend the built-in functionality. Download the files here.

Testing Data and Data-Centric Applications

This session was on testing data-centric applications, both during development and once they’re in production, where you can continue validating your data. Download the files here.

Thanks again to everyone who attended!

Advanced Scripting in SSIS
Mon, 17 Aug 2015 – http://agilebi.com/blog/2015/08/17/advanced-scripting-in-ssis/

Last week I presented a session on Advanced Scripting for SSIS for Pragmatic Works. Thanks to everyone who attended, and for all the great questions. I’ve had a few requests for the samples I used, so I wanted to make those available. You can download them from my OneDrive.

Naming Columns for the Script Component
Thu, 18 Aug 2011 – http://agilebi.com/blog/2011/08/18/naming-columns-for-script-component/

Do you use whitespace or special characters in your column names? Most people don’t, because of the additional headaches it creates: you have to delimit the column names, come up with work-arounds for tools that don’t support column names with special characters, and so on. Underscores, however, are used pretty extensively in place of spaces. If you are using Script Components in SSIS, though, you may encounter an all-new headache with special characters or even underscores in your column names.

When you use a script component in SSIS, it generates some .NET code for you automatically, based on the metadata in the pipeline connected to the script component. However, when this code is generated, SSIS strips out any whitespace or special characters from the names of inputs, outputs, and columns. It only retains the letters and numbers (alphanumeric characters) in these names.

Here are some examples of column name issues that I’ve run into with scripts. While these specific items are made up, they represent real-world scenarios I’ve encountered – there are some really horrible naming approaches out there:

Original Column Name    Script Column Name
Account                 Account
Account#                Account
Account Number          AccountNumber
Account_Number          AccountNumber
TI_TXN_ID               TITXNID
TI_TXNID                TITXNID


As you can see, once the non-alphanumeric characters have been stripped from these column names, they are no longer unique. That can pose a few problems in your script code. What’s worse, because this code is auto-generated by SSIS, you can’t fix it without changing the column names in the data flow, even though this is purely a scripting issue (and not even a .NET limitation – underscores are perfectly valid in .NET names). What’s even worse than that – you don’t get an error until the script’s binary code is recompiled.

So, if you are working with script components, make sure all your column names are unique even when all non-alphanumeric characters have been stripped from them. The same thing applies to your output names – they must be unique based only on the alphanumeric characters.
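If you want to check your column names up front, the stripping rule is easy to approximate. Here’s a hypothetical sketch – it mirrors the behavior described above, not SSIS’s actual code generation – that flags names that collide once the non-alphanumeric characters are removed:

[sourcecode language="csharp"]
using System;
using System.Linq;

class ColumnNameCheck
{
    // Approximates the sanitization SSIS applies to names in generated
    // script code: keep only letters and digits.
    static string Sanitize(string name)
    {
        return new string(name.Where(char.IsLetterOrDigit).ToArray());
    }

    static void Main()
    {
        string[] columnNames = { "Account", "Account#", "TI_TXN_ID", "TI_TXNID" };

        // Group the names by their sanitized form and report any collisions.
        var collisions = columnNames
            .GroupBy(n => Sanitize(n))
            .Where(g => g.Count() > 1);

        foreach (var group in collisions)
        {
            Console.WriteLine("{0}: {1}", group.Key, string.Join(", ", group.ToArray()));
        }
    }
}
[/sourcecode]

Run against the names in the table above, this reports the colliding groups ("Account" and "TITXNID"), so you can rename the columns before the script component generates its code.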

Using OLE DB Connections from Script Tasks
Wed, 17 Aug 2011 – http://agilebi.com/blog/2011/08/17/oledb-connections-in-script-tasks/

I write scripts on a pretty regular basis, and often need to access database connections from them. It’s pretty easy to do this if you are using an ADO.NET connection. However, if you are using OLE DB, you have to go through a couple of additional steps to convert the connection to an ADO.NET version. Matt Masson posted a great code sample for doing this conversion.

I use a slightly altered version of this code pretty regularly. It’s been modified to support both OLE DB and ADO.NET connections, so that I can switch connections without having to change the script code.

To use it, you need to add a reference in your script project to Microsoft.SqlServer.DTSRuntimeWrap. Then, add the following to the usings section at the top of the script:

[sourcecode language="csharp"]
using System.Data;        // for ConnectionState, if not already present
using System.Data.Common;
using Wrap = Microsoft.SqlServer.Dts.Runtime.Wrapper;
[/sourcecode]

For the code to get the connection, use the following:

[sourcecode language="csharp" padlinenumbers="true"]
ConnectionManager cm = Dts.Connections["MyConnection"];
DbConnection conn = null;

if (cm.CreationName == "OLEDB")
{
	// OLE DB connection managers don't return an ADO.NET connection
	// directly, so use the wrapper interface to get one for the same
	// underlying connection.
	Wrap.IDTSConnectionManagerDatabaseParameters100 cmParams =
		cm.InnerObject as Wrap.IDTSConnectionManagerDatabaseParameters100;
	conn = cmParams.GetConnectionForSchema() as DbConnection;
}
else
{
	// ADO.NET connection managers hand back a DbConnection directly.
	conn = cm.AcquireConnection(null) as DbConnection;
}

if (conn.State == ConnectionState.Closed)
{
	conn.Open();
}

// TODO: Add your code here

conn.Close();
Dts.TaskResult = (int)ScriptResults.Success;
[/sourcecode]

You can use the “conn” object to perform actions against the connection. Since it’s typed as the common DbConnection base class, you can use it with any database connection that you have an ADO.NET provider for (which includes OLE DB providers).
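As an example, here’s a minimal sketch of what could go in place of the TODO above. The table name is a placeholder – substitute a query that’s valid for your own connection:

[sourcecode language="csharp"]
// "MyTable" is a placeholder – substitute your own table or query.
using (DbCommand cmd = conn.CreateCommand())
{
	cmd.CommandText = "SELECT COUNT(*) FROM MyTable";
	object rowCount = cmd.ExecuteScalar();

	// Report the result through the SSIS logging infrastructure.
	bool fireAgain = false;
	Dts.Events.FireInformation(0, "RowCount",
		string.Format("MyTable contains {0} rows", rowCount),
		string.Empty, 0, ref fireAgain);
}
[/sourcecode]

Because DbCommand comes from the provider-agnostic ADO.NET base classes, the same code runs whether the connection manager was OLE DB or ADO.NET.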

Importing Files Using SSIS
Sat, 02 Feb 2008 – http://agilebi.com/blog/2008/02/02/importing-files-using-ssis/

A topic that has come up a few times recently is the idea of loading a set of files into a database using SSIS. One of the data flow components in SSIS is the Import Column transform, which allows you to load files into a binary column. There is a great video by Brian Knight that shows how to use it, and I recommend viewing that (12/01/2013 update: evidently, the video this post originally linked to is no longer available). If you are looking for a quick overview of the component, and a different approach for loading the list of filenames to import, then read on.


The Import Column transform works in the data flow, and imports one file for each row that passes through it. It expects an input column that contains the name of the file to import. It outputs a column of type DT_TEXT, DT_NTEXT, or DT_IMAGE that contains the file contents.
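Conceptually, what the transform does for each row is the equivalent of this sketch (an illustration of the behavior, not the transform’s actual implementation – the path is a placeholder):

[sourcecode language="csharp"]
// For each row: take the filename from the input column...
string fileName = @"C:\ImportFiles\Sample.txt"; // placeholder value

// ...read that file's contents...
byte[] fileContents = System.IO.File.ReadAllBytes(fileName);

// ...and populate the binary (DT_IMAGE) output column with those contents.
[/sourcecode]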


I’ve included a sample package with this post that uses the Import Column transform. It has a single data flow that uses a script component to get the list of files to import.


The package has two connection managers, one of which points to a SQL Server database where the files will be stored. The other is a File connection manager that points to a folder – the folder that we want to import the files from.


The script component was created as a Source. A single output column of type DT_WSTR was added to contain the filenames.


On the connection managers page, the File connection manager is specified so that it can be accessed from the script.


The script uses the Directory class from the System.IO namespace. By calling the GetFiles method, the code can iterate through all of the files in the directory, and output one row for each file.


Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper
Imports System.IO

Public Class ScriptMain
    Inherits UserComponent

    Public Overrides Sub CreateNewOutputRows()
        Dim fileName As String

        ' AcquireConnection on the File connection manager returns the folder
        ' path; output one row per file found in that folder.
        For Each fileName In Directory.GetFiles(Me.Connections.ImportFilesDir.AcquireConnection(Nothing).ToString())
            Output0Buffer.AddRow()
            Output0Buffer.Filename = fileName
        Next

        ' Signal that this source has produced all of its rows.
        Output0Buffer.SetEndOfRowset()
    End Sub

End Class


The next component is the Import Column transform. Configuring it can be a little difficult – I found the Books Online documentation a little unclear on what had to be done. On the Input Columns tab, the column that contains the filename (including the path) of the file to import needs to be selected.


On the Input and Output Properties tab, a new column was added to hold the binary contents of the file. When adding this column, make a note of the LineageID value, as it needs to be used in the next step.


After adding the output column, the input column (the one that contains the filename, not the file contents) needs to be selected. The LineageID from the previous step needs to be put into its FileDataColumnID property. This tells the component which output column to populate with the file contents.


The OLE DB Destination is fairly straightforward, as it just maps the columns from the data flow to the database.


Hopefully this helps if you are working with the Import Column transform. The samples are located on my SkyDrive.
