Agile BI

Null Date Bug Fixed in SSIS

Posted on December 21, 2007 by jwelch

A few months back I reported a minor but annoying bug in the way NULL dates are handled in SSIS. It’s documented here, but basically, if you use an Execute SQL Task to set a DateTime variable to NULL, the value is 11/30/1999. However, a NULL date in SSIS returns 12/30/1899.

To explain a bit more, the SSIS package variables can’t hold NULL in the same way that a database column can. The variable always has to have a value in it. You can set a variable to NULL, but this just sets a default value in the variable. To see an example of this, create a new variable of type DateTime, set the EvaluateAsExpression property to TRUE, and put this expression in it:

NULL(DT_DATE)

The resulting value will be 12/30/1899. It isn’t uncommon to translate NULL values to a known, but unlikely to be used, value. However, since the Execute SQL Task is setting the NULL value to 11/30/1999, and the NULL value in SSIS is 12/30/1899, you can’t easily compare the value of a DateTime variable to NULL(DT_DATE) to see if the variable is “NULL”.

Fortunately, with this fix, this should be a little easier. Unfortunately, I don’t know exactly when we will see it released.

Posted in Uncategorized | Comments Off

Fun with SSIS Configurations

Posted on December 4, 2007 by jwelch

As part of my standard approach (and the one used by Mariner), I use configurations in SSIS quite heavily. My preference is to use the SQL Server configuration type, since it keeps all the configurations in one place. XML configurations tend to result in a lot of extra files to manage. With SQL Server based configurations, though, you need to provide the initial connection string to the SQL Server. We typically use a system environment variable to control that initial connection to the configuration table (either directly, using a specific configuration to control the connection as Rafael has show in the past, or using the indirect approach, as highlighted by Jamie). We use a separate connection manager for the SQL Server configuration connection, usually named Configuration (I like self-explanatory names).

One of our current clients has their development and QA databases as separate SQL Server instances on the same server. There a number of reasons why I don’t care for that approach, but that’s another post. We also needed to use this same server as the host for both the development and QA versions of the SSIS packages. This posed a problem, as the both versions of the package would pick up the same environment variable, so we would only be able to run for one environment at a time.

My initial thought was to use the /CONN switch of DTEXEC utility to override the Configuration connection. Unfortunately, this didn’t work. As best as I can tell, based on some information here and my own testing, the order of events at package load time is:

Configurations are applied, based on what is set in the package at design time.
Any /ConfigFile, /Connection, and /Set values are applied.
Package validation is run.
The package is executed.

Basically, what this boils down to is that there is no way to override a SQL Server configuration from the command line. Even setting the Configuration connection manager from an XML configuration on the command line doesn’t work, because by the time it is applied, the design time configuration is has already been set. Setting the connection string doesn’t force the configurations to be reapplied.

There are a few workarounds for this. You could apply an XML configuration at runtime to override all the configured values in the package, but that defeats the purpose of using SQL Server configurations. We ended up with a different approach, that still lets us use the SQL Server configurations. By setting the environment variable up as a user variable, we can override it based on the user account that the packages are run under. This has a few caveats as well. For one, you need a separate account for each unique environment (not unnecessarily a bad thing). Another is that you can’t use the Integration Services job step type to run the package. Evidently, it doesn’t create a full user session, which is needed to access user environment variables. You can work around it by using the CmdExec job step type.

If you find yourself needing to support multiple environments on the same server, using configurations can be difficult. Hopefully, this information will be helpful for determining the best approach.

Posted in Uncategorized | Comments Off

The November CTP for SQL Server 2008 Is Out!

Posted on November 19, 2007 by jwelch

The download link is here: http://www.microsoft.com/downloads/details.aspx?FamilyId=3BF4C5CA-B905-4EBC-8901-1D4C1D1DA884&displaylang=en

Evidently there have been some issues getting it on the Connect site, so they are hosting it in Microsoft’s Download Center for now.

Posted in Uncategorized | Comments Off

BIDSHelper 1.2 Is Out

Posted on November 18, 2007 by jwelch

BIDSHelper 1.2 is now available at CodePlex. The new version has a number of SSIS specific features, like expression and configuration highlighting, a tool window that lists all the expressions in the package, and a Package Smart Diff tool. Darren (one of the other team members) posted about it on his blog. Keep watching, as we are targeting shorter release cycles with more incremental functionality. If you have suggestions for new features, please post them here.

Posted in Uncategorized | Comments Off

Changing Destination Tables With the OLE DB Destination

Posted on November 18, 2007 by jwelch

The OLE DB Destination supports using a variable to get the name of the target table. This allows you to switch the destination table dynamically, but there is a caveat. The columns in each table that you will use must match, or the OLE DB Destination will give a NEEDS_NEW_METADATA error.

What’s a potential way that you might use this capability? Well, one recent scenario posed on a forum was this: A set of rows should be loaded to Table A, unless Table A already had rows in it. If it did, the rows should be loaded to Table B. This is fairly straightforward to implement in SSIS, thanks to expressions and the OLE DB Destination working with a variable.

To set this up, I started with an Execute SQL Task. It was configure to retrieve the record count for Table A into a variable called User::RecordCount. A second variable, User::TableName, was created to hold the table name. The value for this variable is determined from the following expression:

This expression evaluates the RecordCount variable. If it is equal to 0, Table A is used, otherwise Table B is used.

The Execute SQL task is connected to a data flow, with an OLE DB Source and the OLE DB Destination. The OLE DB Destination is configured to use the variable for the table name:

That’s all there is to it. I’ve posted a sample package to my SkyDrive here:

TargetTwoTables.zip Compressed (zipped) Folder
5.6 KB TargetTwoTables.zip

A few notes about the sample: There are three tables involved, the source (Numbers) and two destinations (TableA and TableB). All three tables have a single Integer column (named ID in Numbers, and ColumnA in TableA and TableB). The scripts for these are not included in the sample.

Posted in Uncategorized | Comments Off

Running Password Protected Packages

Posted on November 16, 2007 by jwelch

A recent post on the forums was asking about running packages programmatically. The poster’s package was password protected, and when running the package, they were receiving a message that one of the connection managers was failing to connect.

The code being used was similar to this:

        Dim app As Microsoft.SqlServer.Dts.Runtime.Application

        Dim pkg As Microsoft.SqlServer.Dts.Runtime.Package
        app = New Microsoft.SqlServer.Dts.Runtime.Application()

        pkg = app.LoadPackage(“C:ProjectsTestPackage.dtsx”, Nothing)

        pkg.PackagePassword = “password”

        pkg.Execute()

This code compiles fine, but does have a small error. The password is being set on the Package object, after it has been loaded. In this case, the Package’s ProtectionLevel property was set to EncryptSensitiveWithPassword, so the error didn’t show up until the package was executed. The problem is that the password needs to be set on the Application object, before the package is loaded. This makes sense, since encrypted values are decrypted when the package is loaded. The following code works fine:

        Dim app As Microsoft.SqlServer.Dts.Runtime.Application

        Dim pkg As Microsoft.SqlServer.Dts.Runtime.Package
        app = New Microsoft.SqlServer.Dts.Runtime.Application()

        app.PackagePassword = “password”

        pkg = app.LoadPackage(“C:ProjectsTestPackage.dtsx”, Nothing)

        pkg.Execute()

This particular problem is a lot easier to see if the Package ProtectionLevel property is EncryptAllWithPassword, because you’ll get an error as soon as you load the package.

I’m not positive of this (the documentation is a little lacking), but I’m guessing you’d set the Password property on the Package object to encrypt a package that you had created programmatically. Evidently, setting it will not decrypt any existing encrypted values.

Posted in Uncategorized | Comments Off

SSIS Performance Webcast for the Charlotte SQL Server User Group

Posted on November 14, 2007 by jwelch

I’ll be doing a web cast for the Charlotte SQL Server User Group, our local PASS chapter, on SSIS Performance Tuning on Nov. 28, 2007 at 12:00 EST. From the abstract:

“This presentation will cover the various aspects of performance tuning in SQL Server Integration Services. We will start by covering some general best practices for performance. We will then drill into details of performance optimization for SSIS data flows, including optimizing execution trees, managing buffer size, and evaluating performance metrics to determine where the bottlenecks are. We’ll also cover the logging options and performance counters that can be used to monitor performance.”

If you are interested in tuning in, please sign up here. We’ll be giving away a few copies of Windows Server 2003 Enterprise Edition. To sign up as a member of the user group, go here and select the sign-up option.

Posted in Uncategorized | Comments Off

Dealing with Unicode and ASCII Text Files

Posted on November 8, 2007 by jwelch

A recent post on the MSDN forums brought up an interesting problem. The poster was attempting to process a folder of files, all of which had the same format, but some were encoded as ASCII (ANSI, DT_STR, or varchar, if you like the SQL Server data types) and some were encoded as Unicode (DT_WSTR, nvarchar).

The initial issue was determining which way the file was encoded. After a little research, I located a post entitled “Detecting File Encodings in .NET” by Heath Stewart. This showed how to check the encoding of a file in C#. I converted the code to VB.NET and set it up to work within a Script Task. These are the results:

    Public Sub Main()

        ”’Code based on sample originally posted here: http://www.devhood.com/tutorials/tutorial_details.aspx?tutorial_id=469

        Dim file As System.IO.FileStream = Nothing

        Dim isUnicode As Boolean = False
        Try

            file = New System.IO.FileStream(Dts.Connections(“TestFileAnsi”).AcquireConnection(Nothing).ToString(), _

                        System.IO.FileMode.Open, System.IO.FileAccess.Read, System.IO.FileShare.Read)

            If file.CanSeek Then

                Dim bom(4) As Byte ‘ = Byte ‘// Get the byte-order mark, if there is one

                file.Read(bom, 0, 4)

                If ((bom(0) = &HEF And bom(1) = &HBB And bom(2) = &HBF) Or _

                    (bom(0) = &HFF And bom(1) = &HFE) Or _

                    (bom(0) = &HFE And bom(1) = &HFF) Or _

                    (bom(0) = 0 And bom(1) = 0 And bom(2) = &HFE And bom(3) = &HFF)) Then

                    isUnicode = True

                Else

                    isUnicode = False

                End If

                ‘// Now reposition the file cursor back to the start of the file

                file.Seek(0, System.IO.SeekOrigin.Begin)

            Else

                ‘// The file cannot be randomly accessed, so you need to decide what to set the default to

                ‘// based on the data provided. If you’re expecting data from a lot of older applications,

                ‘// default your encoding to Encoding.ASCII. If you’re expecting data from a lot of newer

                ‘// applications, default your encoding to Encoding.Unicode. Also, since binary files are

                ‘// single byte-based, so you will want to use Encoding.ASCII, even though you’ll probably

                ‘// never need to use the encoding then since the Encoding classes are really meant to get

                ‘// strings from the byte array that is the file.

                isUnicode = False

            End If
            Dts.TaskResult = Dts.Results.Success
        Catch e As Exception

            Dts.TaskResult = Dts.Results.Failure

        Finally

            If Not file Is Nothing Then

                file.Close()

            End If
            Try

                Dim vars As Variables

                Dts.VariableDispenser.LockOneForWrite(“isUnicode”, vars)

                vars(“isUnicode”).Value = isUnicode

                vars.Unlock()

            Catch ex As Exception

                Dts.TaskResult = Dts.Results.Failure

            End Try

        End Try

    End Sub

Essentially, the script above is getting the file path from a flat file connection manager. It’s then checking the byte order mark (BOM), which is what tells us if the file is Unicode or not. The script assumes that the file is ASCII, if it can’t determine it from the BOM. The result of the check is written to a boolean variable defined in the package called isUnicode.

That part was relatively straightforward. Originally, I thought I could just use this variable to set the Unicode property on the flat file connection manager through an expression, and use the same dataflow, as long as I typed all the string columns as DT_WSTR (I really should have known better). Oddly enough, when I first tested this, I found that if I set the Unicode property to false on the connection manager, but set the column types to DT_WSTR, I could process ASCII files without a problem, but Unicode files didn’t send any rows through, even though there was no error. If I set the Unicode property to true, and then set it to false via an expression, the flat file source threw a validation error because the error output was using a DT_NTEXT type, and it needs to use DT_TEXT with ASCII files.

What I ended up with instead isn’t quite as simple, but it does work consistently. I created two connection managers, one configured for ASCII, and the other for Unicode. Both connection managers have an expression defined that sets their ConnectionString property based on the same variable, so they both point at the same file. Then I created two data flows, one for ASCII, one for Unicode. This eliminates any metadata errors. Finally, I set precedence constraints from the script task to each data flow, and made execution conditional based on the isUnicode variable.

The constraint to DF ANSI looks like this:

and the constraint to DF Unicode looks like:

The sample files for this (included a text file formatted as Unicode and ASCII) are on my Live Drive:

CheckingUnicode.zip Compressed (zipped) Folder
23.6 KB CheckingUnicode.zip

To switch the test file from ASCII to Unicode, change the file path in the “testFilePath” variable.

Posted in Uncategorized | Comments Off

Checking for the Existence of a File

Posted on November 1, 2007 by jwelch

The File System Task in SSIS doesn’t support checking to see if a file exists. You can work around this easily with a script task. Create a new script task and add a Imports statement referencing the System.IO namespace to the top of the script.

Imports System.IO

Then add the following to the Main method:

If File.Exists(Dts.Connections(“ConnMgrA”).AcquireConnection(Nothing).ToString()) Then

    Dts.TaskResult = Dts.Results.Success

Else

    Dts.TaskResult = Dts.Results.Failure

End If

This script checks the file referenced by the ConnMgrA connection manager. If it exists, the script task returns Success, meaning execution will follow the Success constraint from the Script Task. If the file does not exist, the task will fail, and the Error constraint will be used. You could also set a variable with the results, and use that in an expression on a precedence constraint.

If, instead of using a connection manager, you want to get the file name from a variable, you can replace the If statement with the following:

If File.Exists(ReadVariable(“FileNameVariable”).ToString()) Then

The variable locking is occurring in the ReadVariable method. To see the definition for it, please refer to Daniel Read’s blog post here. This is a good practice to follow when working with variables in Script Tasks.

There is a Connect posting here requesting that the File System Task be enhanced to support checking for a file’s existence. If you’d like to see this in a future version, please vote for issue.

Posted in Uncategorized | Comments Off

Scanning a Package For Specific Tasks

Posted on October 28, 2007 by jwelch

I recently had a need to scan through an SSIS package to find all instances of data flow tasks within the package. This is to support a future enhancement in BIDSHelper. However, I ended up making the code a little more general, so it can be used to get a list of all instances of any type of tasks within a package.

Before looking at the code, there is a concept that it is helpful to understand. SSIS packages have a nested structure. Everything is held within a container. Some containers, like the Sequence or ForEach Loop containers, can hold groups of other containers. The package itself is a container. All containers that can hold multiple containers implement the IDTSSequence interface. The TaskHost container, however, holds a single task, and does not implement the IDTSSequence interface. The TaskHost is what we have to find to identify individual instances of tasks.

The function below is designed to be called recursively. It uses the IDTSSequence interface to iterate through the Executables collection, which can hold either TaskHosts or containers that implement IDTSSequence themselves, which is where the recursion comes in. It also takes the type name as a string parameter. The code checks the type name string against the name of the type as returned by the GetType function. It’s simply checking to see if the typeName string exists within the full string returned, so it doesn’t have to be an complete match. This means you don’t have to specify the full namespace, like “Microsoft.SqlServer.Dts.Tasks.ExecuteSQLTask”. Instead, you can just specify “ExecuteSQL”.

        List<DtsContainer> FindExecutablesByType(IDTSSequence sequence, string typeName)

        {

            string typeNameUpper = typeName.ToUpper();

            List<DtsContainer> matchingExecutable = new List<DtsContainer>();

            foreach (Executable e in sequence.Executables)

            {

                if (e.GetType().ToString().ToUpper().Contains(typeNameUpper))

                {

                    matchingExecutable.Add((DtsContainer)e);

                }

                if (e is TaskHost)

                {

                    TaskHost taskHost = (TaskHost)e;
                    if ((typeNameUpper.Contains(“DATA FLOW”)

                        || typeNameUpper.Contains(“DATAFLOW”)

                        || typeNameUpper.Contains(“MAINPIPE”)

                        || typeNameUpper.Contains(“PIPELINE”))

                        && taskHost.InnerObject is IDTSPipeline90)

                    {

                        matchingExecutable.Add((DtsContainer)e);

                    }

                    else if (taskHost.InnerObject.GetType().ToString().ToUpper().Contains(typeNameUpper))

                    {

                        matchingExecutable.Add((DtsContainer)e);

                    }

                }

                if (e is IDTSSequence)

                {

                    matchingExecutable.AddRange(FindExecutablesByType((IDTSSequence)e, typeNameUpper));

                }
            }

            return matchingExecutable;

        }

The function returns a list of DtsContainer objects, which is a base class for all control flow objects. The DtsContainer gives access to the name and description for each object. There is some special handling for Data Flow objects. These are actually implemented as COM objects, so a call to GetType on them returns “ComObject” rather than the .NET class name. To work around this, I have hard-coded a few values for the data flow objects, and a specific check for the IDTSPipeline object. There may be a better way to accomplish this, but I haven’t located it yet. Another limitation of this code is that it does not check event handlers. Any tasks inside of event handlers will not be found. This could be added fairly easily, but wasn’t necessary for my current purposes, so it didn’t make the cut.

To call this function and iterate the entire package, you can use this code:

Microsoft.SqlServer.Dts.Runtime.Application ssisApp = new Microsoft.SqlServer.Dts.Runtime.Application();
Package pkg = ssisApp.LoadPackage(“C:\Temp\MyPackage.dtsx”, null);
List<DtsContainer> containers = FindExecutablesByType((IDTSSequence)pkg, “ExecuteSQL”);

foreach (DtsContainer exec in containers)

{

    //Do Something
}

I’ve put together a small GUI around this, posted on my SkyDrive.

SSISObjectSearch.zip

If you are interested in learning more about working with SSIS programmatically, please post in the comments. I’m considering doing a few more posts on this, but only if there is interest.

Posted in Uncategorized | Comments Off

Null Date Bug Fixed in SSIS

Fun with SSIS Configurations

The November CTP for SQL Server 2008 Is Out!

BIDSHelper 1.2 Is Out

Changing Destination Tables With the OLE DB Destination

Running Password Protected Packages

SSIS Performance Webcast for the Charlotte SQL Server User Group

Dealing with Unicode and ASCII Text Files

Checking for the Existence of a File

Scanning a Package For Specific Tasks

Recent Posts

Recent Comments

Archives

Categories

Meta