Welcome to Agile BI Community Sign in | Join | Help

A common question on the SSIS forums is how to continue a loop after an error occurs. For example, you might have a For Each Loop container that iterates through a set of files in a folder. If one file is invalid, you want to continue processing the remaining files. Fortunately, there is a simple pattern that allows this to be accomplished.

I created a package to illustrate this. Basically, the package implements the scenario above. It consists of a For Each Loop to iterate through all the files in a folder. There is a data flow task that processes each input file, and runs it through a row count (just for illustration purposes). The Script task is there to illustrate an error handling task could be attached to the data flow, but it’s empty in the example package. It would need to be replaced with the appropriate task to handle a failure, such as a File System task to move the file to another folder.

image

The key to handling the errors is the Sequence container, or more specifically, how the Sequence container is set up. The OnError event handler for the Sequence container has been enabled. You can do this by selecting the Sequence container, then clicking the Event Handlers tab. Select the OnError event in the drop down, then click the link in the designer to create an empty event handler. It’s not necessary to add anything to event handler.

image

Once it has been created, go to the Variables window, and select the option to show System variables. Select the Propagate variable, and set the value to FALSE. This prevents any errors occurring on tasks inside the Sequence container from being raised to the For Each loop.

image

This is all that really needs to be done, but if you run the package with only this set, you’ll notice that the Data Flow task and the Sequence container both report an error. To prevent the Sequence Container from reporting an error, set the MaximumErrorCount property of the container to 0.

image

Note that you do not have to use a sequence container to accomplish this. You could make the same settings on individual tasks inside the For Each loop. However, the Sequence container gives you a common place to make the settings, and you can have as many tasks inside it as you’d like.

The example package is posted on my Skydrive. The zip file contains a folder call Import. The package expects this folder to be located in C:\Temp. It contains 4 text files for the package to process. The 2nd file contains an error (one of the rows has fewer columns than it is supposed to), so that you can see the error behavior.

In a previous post, I detailed some of the challenges that I had encountered when trying to modify the connection string used for SQL Server configurations from the command line. In SSIS 2005, command line options were applied after the configurations were loaded (with the exception of parent package variables). Effectively, this meant that you could not change the location a configuration pointed to from the command line.

This has been changed in SSIS 2008. Configurations are now loaded twice, once when the package is originally loaded, and then again after any command line options are applied. The order of events is:

  1. The package file is loaded.
  2. The configurations specified in the package at design time are applied in the order specified in the package (except for Parent Package Variables).
  3. Command line values are applied.
  4. The configurations specified in the package at design time are reloaded in the order specified in the package (except for Parent Package Variables). The configurations will use any updated values from the command line when they are reloaded, so it is possible to change the connection string for a SQL Server configuration.
  5. Parent Package Variable Configurations are applied.
  6. The package is run.

This is a nice improvement, since you can now set up a package with SQL Server configurations, and point the package to the correct database are runtime by simply using the /CONN switch of DTEXEC. However, it still isn’t perfect . Since the design time configuration is applied twice, a value that you apply using /SET on the command line can be overridden by a preexisting configuration.

If you’d like to see this behavior, I’ve provided a simple package. The package has a single variable (“TestVar”) defined. It has an XML configuration enabled that sets the value of TestVar to “Dev”. There is a single Script Task that fires an information event that contains the value of the variable.

public void Main()
        {
            bool fireAgain = false;
            Variables vars = null;
            Dts.VariableDispenser.LockOneForRead("TestVar", ref vars);
            Dts.Events.FireInformation(0, "", "Value is: " + vars["TestVar"].Value, "", 0, ref fireAgain);
            vars.Unlock();
            Dts.TaskResult = (int)ScriptResults.Success;
        }

If I run this package using this command:

dtexec /file Configuration_Overwrite.dtsx /SET \Package.Variables[User::TestVar].Properties[Value];Prod /REP EWI

I’d expect to see the value “Prod” reported in the output. Instead, “Dev” is displayed. If you disable the package configurations and re-run it using the same command, it does report “Prod”.

Overall, I prefer the new behavior, but it does introduce a new issue to be aware of. In general, you need to be careful to not /SET a value that is also specified in a configuration, as it will be overwritten. I’ve filed a Connect suggestion for the addition of /SetBefore and /SetAfter switches to DTEXEC, so that you can explicitly define when you want the /SET applied. If you think it’s a good idea, please vote for it.

The sample package is on my SkyDrive. You may need to update the location of the XML configuration, but it has no other dependencies.

Thanks to Cho Yeung on the SSIS team for clarifying the load order when the package is executed.

This is really cool – Mariner (my employer) was awarded the Microsoft Performance Management Partner of the Year award this year, based on a solution that we created for the Charlotte-Mecklenburg Schools (CMS) system. When I was first told about the award, I asked, “It’s for the Southeast region, right?”. The answer: “No, think bigger.” “The US? Very cool!” “No, bigger than that.” Yes, it’s the Worldwide Performance Management Partner of the Year – selected from over 2000 entries.

So what is this award winning solution? It’s a performance management system that allows school systems to track how well they are progressing on their long term goals for improving the learning environments in schools. This was an interesting solution to work on – most of my experience has been in applying business intelligence to more traditional business solutions. While there are some similarities, the core focus is very different. Success for a school system isn’t judged on profits, it’s based on how well the children are educated. Since I have one child already in the CMS system, and another one going in soon, there was much more of a personal impact from seeing what they were doing, and their approach to meeting their goals.

It was also interesting from a technology standpoint. The technology for the solution involves most of Microsoft’s BI stack – SQL Server for the data repository, SSIS for the data integration, SSAS for analysis, SSRS for some reporting, and Sharepoint + PerformancePoint Server for the front-end display of the analytic information. It’s pretty impressive seeing how Microsoft’s BI story has really come together over the last couple of years, and actually getting the chance to put all these pieces in action with each other was great (and challenging in a few spots :) ). The team that did the development work did a great job with it, and it really shows in the end product.

Douglas Laudenschlager has posted some good information about the considerations you need to make if you are running SSIS on a 64-bit box. This is additional information to what is currently in Books Online, and it’s well worth looking at. Considering that a lot of developers are developing on x86 machines, and deploying to x64 servers, it’s even more valuable. I currently do all my development on a x64 machine, but since the development environment (BIDS) is 32-bit only, some of these items still have impact.

64-bit Considerations for SQL Server Integration Services - Douglas Laudenschlager - Dougbert.com

To finish up the topic that I started in my previous post, I wanted to post a more complex example of running total calculations. This is based off a request on the MSDN forums, for a running total calculation that would “slide” along as additional rows were processed. In the poster’s example, the running total needed to reflect the total for the last 10 weeks.

To show how to accomplish this, I’m going to update the same package that was used in the previous example.

image

The script component has the same RunningTotal column defined on the output:

image

The primary difference is in the script itself. Since I want to keep a list of the previous 10 records, I am using a Queue object, which is part of the System.Collections.Generic namespace. This object lets you easily remove the oldest item from the queue, so it works very well for this example. In the script, if the queue has more than 10 items, the oldest item is removed prior to adding a new one. The running total is produced by iterating through all the items in the collection, and storing the result in the RunningTotal output column.

Public Class ScriptMain
    Inherits UserComponent
    Dim runningTotalCollection As Queue(Of Integer) = New Queue(Of Integer)
 
 
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        If runningTotalCollection.Count >= 10 Then
            runningTotalCollection.Dequeue()
        End If
 
        runningTotalCollection.Enqueue(Row.OrderQty)
 
        Dim x As Integer
        Dim sumX As Integer
        For Each x In runningTotalCollection
            sumX += x
        Next
 
        Row.RunningTotal = sumX
    End Sub
 
End Class

This is another example of the flexibility of the Script component. It also helps highlight that by using the Script component, you can store the values from previous rows, and then reference them later. The example package is posted on my Skydrive.

Running totals are a common calculation for reporting, and are occasionally needed in ETL processes to populate aggregate tables. One way to accomplish this in SSIS, if you are retrieving data from a relational database, is to do the running total in the SELECT statement. There are a number of resources on the Internet that illustrate how to accomplish this. However, this may not work for you if you are retrieving information from a flat file. Also, the logic to implement a running total may not be straightforward in SQL, particularly if you need sliding windows for the totals (I'll explain this later).

Fortunately, there are other ways to accomplish this in SSIS. A very flexible method is to use a script component, as this will let you customize the logic however you prefer. As an example, I'll take a look at a couple of examples from AdventureWorks.

First, the more straightforward example. The AdventureWorks database has a Sales.SalesOrderDetail table that lists the itemized purchases for each order. Note - I'm only filtering the results for example purposes. The actual code in the package works with the whole table.

image

If I was to calculate a running total for the OrderQty for each SalesOrderID above, I'd expect to see:

SalesOrderID ProductID OrderQty RunningTotal
43685 765 3 3
43685 763 1 4
43685 754 1 5
43685 725 1 6
43686 758 3 3
43686 762 1 4
43686 770 1 5
43687 768 1 1
43687 765 2 3

Notice that the running total resets each time the SalesOrderID changes. To implement a script component that will calculate the running total, I've created a new script component and added an output column to it to hold the running total.

image

Inside the script component, I've added a variable to store the value of the control number (SalesOrderID), and another variable to capture the running total. When the control number changes, the running total is reset.

Public Class ScriptMain
    Inherits UserComponent
    Dim runningTotal As Integer
    Dim previousValue As Integer
 
 
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        If Not previousValue = Row.SalesOrderID Then
            runningTotal = 0
            previousValue = Row.SalesOrderID
        End If
 
        runningTotal += Row.OrderQty
 
        Row.RunningTotal = runningTotal
    End Sub
 
End Class

This is an easy method to calculate running totals in the data flow. One thing to note is that the data must be sorted by the control number (SalesOrderID in the example) in order for this to work properly.

This is in an example package on my SkyDrive here. I will follow up this post with another sample showing a more complex "running total" problem and how it can be solved in SSIS as well.

There's an interesting "behavior" that occurs with configurations at design time. The value of the configuration will be saved in your package as the new stored value. I'd seen this before, but never really thought about it, until a poster on the forums questioned why it was happening. To understand it, you need to know how the package is processed when it is loaded in the design environment.

If you have configurations enabled in the package, then the configuration is applied to the package when it is initially loaded. When a configuration is applied, it updates the target property of the configuration. For example, you might have a package that has the Description property set to "No Configuration". If you add a configuration that sets the Description property of your package to "Has Configuration", when the package is loaded, the value specified in the configuration ("Has Configuration") will overwrite the value currently stored in the Description property ("No Configuration").

If you then close the package without saving, the change to the property is not saved, and it will reflect the original value ("No Configuration"). If, however, you save the package (and remember, running it in the IDE saves it automatically), the new value ("Has Configuration") will be saved as the value in the package file.

Normally, this does not cause any issues. If you execute a package in production that has a configuration, it is not saved after execution, so the values of the configurations is not save in the package file. However, it can impact you in the development environment. This is one of the reasons that I recommend setting up configurations even in your development environment. If you do that, then you know that regardless of the value the package was saved with, it will have the value of your local configuration each time it is loaded in the IDE. This is also a good reason never to update a package directly on your production server. When you open it in the IDE on the production server, it will retrieve the production configurations. If you save or run it, it will be saved with the production configuration values. This can cause a number of headaches in troubleshooting problems down the road, so I recommend that you don't do it.

A question that comes up occasionally on the forums is how to determine if a file is being written to by another process. It can cause issues if a file is partially written when a source component starts extracting data from it.

There is a way to check this through a Script Task. The script below attempts to lock the file for writing. If the file is be written to be another process, the attempt to lock it will fail. If the file is available, the attempt will succeed.

Public Sub Main()
    Dim vars As Variables
    Dim file As System.IO.FileStream
 
    Try
        file = System.IO.File.Open(Dts.Connections("FlatFile").ConnectionString, System.IO.FileMode.Open, System.IO.FileAccess.Write)
    Catch ex As Exception
        Dts.VariableDispenser.LockOneForRead("FileLocked", vars)
        vars("FileLocked").Value = True
        vars.Unlock()
    Finally
        If Not file Is Nothing Then
            file.Close()
        End If
    End Try
 
    Dts.TaskResult = Dts.Results.Success
End Sub

The package needs to have a connection manager that points to the file you want to check (named "FlatFile" in the script above), and a variable ("FileLocked") that is set to True if the file is locked. The variable should be set to False by default. The variable can be used in a precedence constraint to control whether other tasks execute.

On May 15th I'll be presenting at the Triad SQL Server User Group, located in High Point, NC. The presentation will be focusing on new features in SQL Server 2008 for business intelligence and data warehousing. If you're in the area, please stop in and introduce yourself.

There are a number of new features in the 2008 Lookup Transform. Jamie Thomson posted a good overview of the features, and I recommend reading it prior to this article. One of the new features in SSIS 2008 is the ability to cache the data from the Lookup to a local file. This should help improve performance, as the lookup data would only need to be retrieved once, stored locally, and reused. Internally, the lookup cache file (.caw) uses a RAW file structure (.raw), which is very fast to read and write from disk.

One of the first things that I wanted to try with the new caching features was to dynamically update the cache file. That proved to be a bit of a challenge, as Jamie pointed out in his article. One of the main reasons for this is that, while the cache file is very similar to a RAW file, the cache file stores some additional metadata - hash values for each row, and some header information. This means that while you can use a RAW Source component to read the data from a cache file, you must use the Cache Transform to actually create a cache file.

Unfortunately, the Cache Transform does not support incremental updates to the cache file. Each time you use it, it recreates the file completely. However, with a little creativity, you can minimize the amount of data you have to process. I've created a sample package that illustrates one approach to this.

The control flow looks like this (the numbers refer to the steps below):

image

The major components of the process are:

  1. Retrieve the last modified date from the Lookup reference table, by using an Execute SQL Task.
  2. Use a Script Task to determine if the cache file already exists. This leverages the System.IO.File class, as discussed in Checking for the Existence of a File. Execution can proceed to either step 3, 4, or 7, based on precedence constraints with expressions.
  3. If the file does not exist, it is created through a data flow. This data flow is straightforward - it retrieves all the rows from the reference table and runs them through a Cache Transform.
      image
  4. If the cache file does exist and the reference table has been modified since the last run of the package, a File System task is used to rename it. This done so that the cache file can be recreated in the next step. It cannot be read from and written to at the same time, so there has to be two copies of it for a short time.
  5. Now the cache file is updated within a data flow task. The OLE DB Source retrieves only the modified rows from the reference table. The rows that haven't been changed are pulled in from the cache file, using the Raw File Source. The lookup is used to determine if an existing row from the cache file is also included in the set of modified rows from reference table. The lookup uses an expression to set the reference query, so that it contains the same set of rows as the OLE DB Source. If the row matches one from the OLE DB Source, the row is discarded. Only non-matching rows are passed through the lookup. Finally, the rows are combined using a Union All and written back out to a new cache file (with the same name as the original).
    image
  6. The copied version of the cache file is deleted, now that it is no longer needed.
  7. This data flow is used to test the resulting cache file. It is not necessary in a production package.

The Lookup table for this example looks like:

 

CREATE TABLE [dbo].[StatusLookup](
    [ValidValue] [varchar](50) NULL,
    [AlternateValue] [varchar](50) NULL,
    [LastUpdate] [datetime] NULL
) ON [PRIMARY]
GO

This package uses a number of variables:

image

and several expressions (some of which are on the variables above):

image

These are used to make the package a bit more dynamic. The LastRunDate variable controls when the package thinks it was last executed. In a production application, this would need to be set from a parent package or an Execute SQL Task. The LookupTable variable controls the name of the cache files.

I encourage you to download the sample and walk through it to get a in-depth understanding. I've really only covered the high points in this article. It's posted on my SkyDrive.

This was done against CTP6 of SQL Server 2008. The RTM release may necessitate some changes to the pattern and/or the code described here.

In SSIS 2005, one of the challenges with the Lookup transform was that the SQL used to populate the cache was fixed at design time. You could dynamically pass in parameters to the SQL, but only if you turned on the advanced caching options, which puts the lookup into row-by-row operation. With smaller tables (<100,000 rows) this wasn't too much of an issue, but with larger tables it could be a problem. A common scenario where this functionality would be nice is with Type 2 slowly changing dimensions (SCD). If you are extracting a single day of source records (doing a daily incremental load from the source system), then the lookup cache only needs to contain records that were in effect on the date of the extract. In SSIS 2005, there was no easy way to accomplish this. However, the functionality has been added in 2008.

To illustrate this, I've created a simple package with a single data flow that uses AdventureWorks and AdventureWorksDW. The OLE DB Source extracts Sales Order Detail records for a single day from AdventureWorks, based on the RunDate variable.

image

image

The Product Lookup is getting its rows from the DimProduct table in AdventureWorksDW. This is a Type 2 SCD table, with effective dates. The Lookup is set up normally. Initially, I had to specify a default query that returned the columns needed from DimProduct, so that the metadata for the data flow was set up properly.

To add an expression to the Lookup, I went back to the Control Flow view, and selected the Data Flow Task. Now, in the Properties window, you can see two properties associated with the Lookup:

image

SqlCommand is used to override the inner SQL statement used by the Lookup, while SqlCommandParam can be used to modify the entire statement. Normally, SqlCommand is enough for most purposes. I have an expression set on the property, that builds a SELECT statement dynamically, including a WHERE clause that filters by the RunDate variable.

image

The primary advantage to doing this is a reduction in the number of rows in the lookup cache. For the AdventureWorks sample, filtering the set of dimension rows down to just the ones in effect as of the Sales Order date reduces the cached rows from 606 to 339. Not really a significant amount in this example, but a similar percentage reduction for a million row dimension table would lead to significant savings.

I've uploaded the sample package to my SkyDrive.

I found something interesting in SSIS 2008 today, and thought I'd share it. I was working with David Darden on a package that included a script task. In 2008, the Visual Studio Tools for Applications (VSTA) environment is used to edit script tasks. It exposes a few more details about the script task, including the project name. By default, it creates a rather long project name, as you can see in the screen shot below. (Yes, that is C#. It's my preferred language, so expect to see more of it here now that it's available in 2008). The namespace is based on the project name, so you get the same long value for it.

image

While poking around in the script task's expressions editor, we found the ScriptProjectName property. This property does not show up in the Properties window (F4), only in the expressions editor. By setting this, you can update the project name in the script. For example, setting this property to "NewScriptProjectName", as shown here:

image

results in this showing up in the script project:

image 

There are a few caveats to this. One, if you do this after initially editing the script, the project name will be updated, but the namespace will remain the same. Two, you must save the package after setting the expression, in order for the expression to be applied. If you add the expression, then edit the script without saving, it will not use the new project name.

So, if you want to name your script project with a friendly name, do the following:

  1. Add the Script Task.
  2. Add an expression to set the ScriptProjectName property to your desired value.
  3. Save the package.
  4. Now you can use the Edit Script button to edit the script project with the friendly name.

I do want to point out that the name of the project has no real impact on functionality. The script will not run any differently if you do this. It's purely for aesthetic purposes. But if the long, random looking project names bother you, it's nice to know you can fix them. There is also a MSDN forum post here that details the steps to edit the namespace after the script has been edited for the first time. It's a little easier to set it this way, though, so that it is handled up front.

One of my colleagues, David Darden, has started blogging on www.agilebi.com. He's been working with the Microsoft BI stack for several years now, and has a lot of great knowledge to share. His first post is on using the PerformancePoint SDK to create scorecards. It's a good read, and if you are interested in extending the built-in PerformancePoint functionality, I highly recommend taking a look.

http://agilebi.com/cs/blogs/ddarden/archive/2008/04/26/creating-a-scorecard-transform-in-performancepoint-server-monitor.aspx

I use the Derived Column transform in the Data Flow a lot. In fact, I can't remember the last time I wrote a production package that didn't include at least one Derived Column transform in it. Over time, I've built up a lot of common expressions that I keep using, as well as some that I stored "just in case". I thought it would be good to post a few of them here, primarily because it makes it easier for me to find them, but hopefully others will find these useful as well.

References to columns ([StringColumn], for example) need to be replaced with the appropriate column name in the data flow. Values in italics need to be updated with the appropriate value (literal, column, or variable). Also, some of the date expressions may not work if your regional settings are not US date formats.

Description Expression
Count the occurrences of a specific character in a string LEN([StringColumn]) - LEN(REPLACE([StringColumn],"Character to count",""))
Cast a date stored in a string column as YYMMDD to DT_DBTIMESTAMP (DT_DBTIMESTAMP)(SUBSTRING([StringColumn],3,2) + "-" + SUBSTRING([StringColumn],5,2) + "-" + SUBSTRING([StringColumn],1,2))
Cast a date stored in a string column as YYYYMMDD to DT_DBTIMESTAMP (DT_DBTIMESTAMP)(SUBSTRING([StringColumn],1,4) + "-" + SUBSTRING([StringColumn],5,2) + "-" + SUBSTRING([StringColumn],7,2))
Cast a date stored in a string column as DDMMMYYYY to DT_DBTIMESTAMP (DT_DBTIMESTAMP)(SUBSTRING([StringColumn],3,3) + "-" + SUBSTRING([StringColumn],1,2) + "-" + SUBSTRING([StringColumn],6,4))
Pad a string to the specified length [StringColumn] + REPLICATE(" ", 10 - LEN([StringColumn]))
Check a date to see if it in a valid range and replace with NULL if it is not [DateColumn] < (DT_DBTIMESTAMP)"1753/01/01" ? NULL(DT_DBTIMESTAMP) : [DateColumn > (DT_DBTIMESTAMP)"9999/12/31" ? NULL(DT_DBTIMESTAMP) : [DateColumn]
Convert seconds to minutes and seconds (DT_WSTR,20)([NumberOfSeconds] / 60) + ":" + (DT_WSTR,20)([NumberOfSeconds] % 60)
Combine date and time strings to DT_DBTIMESTAMP (DT_DBTIMESTAMP)("2008-03-25" + " " + "10:20:00")
Return True if Date is a Weekend DATEPART("Weekday",[DateColumn]) == 7 ? (DT_BOOL)"True" : DATEPART("Weekday",[DateColumn]) == 1 ? (DT_BOOL)"True" : (DT_BOOL)"False"
Cast a string to a numeric with a range check and return 0 if out of range [StringColumn] > "9" ? 0 : [StringColumn] < "0" ? 0 : ((DT_NUMERIC,1,0)[StringColumn])

A question that has come up in a few different situations recently is how to send someone an alert if your package is taking too long to execute. For example, you might have a package that loads a large number rows from a source system to a destination on a nightly basis. There are a number of other packages that depend on this one to complete before they can begin running. The initial package normally takes 15 minutes to run, but on occasion, the source system can be under heavy load, so the package can run 45 minutes to an hour. In this scenario, you'd like to alert people that the load will be delayed.

If you are using an scheduling tool, you should first check to see whether it has this capability. Ideally, you want this monitoring process outside of the package itself. If it is internal to the package, then there is potential for the alerts not to be sent, if the DTEXEC process has really gotten hung up. Some scheduling tools have the capability to monitor the duration of the execution. However, SQL Agent does not have this, so you may need to implement this in the package itself. Another reason you may need to do it in the package is to monitor the duration of a specific set of tasks, but not the entire package.

Fortunately, this isn't too difficult to implement in SSIS. All it takes is a little scripting. I've set up a sample package that illustrates how to do this. The main package looks like this:

image

The Long Running Task script task simulates a long running task by running a specific number of seconds (the value of the TaskDelay variable below). The Mark Task Successfully Completed sets the TaskSuccessful variable to True, so that the package knows when the long running task completed. The Monitor Duration script task is the one that actually does the checking on execution time, based on the value in the TaskDuration variable. If the execution time has taken more seconds than the value of TaskDuration specifies, the script exits. There is a precedence constraint that checks that the value of the TaskSuccessful variable is still False, then the Send an Alert task is run. In the sample, the Send an Alert task is a script task, but in a real package it would likely be a Send Mail task.

image

As stated above, the real work happens in the script. Basically, it performs a check once a second to see if either:

  1. The TaskSuccessful variable is True, meaning that the long running task completed successfully, or
  2. The difference in seconds between the current time and the time when the script started running is greater than the value in the TaskDuration variable.

If either case is true, the script exits. If both cases are false, the script sleeps for a second, then checks again. Putting the thread to sleep helps make sure that CPU isn't tied up checking the conditions too often.

Public Sub Main()
    Dim vars As Variables
    Dim maxDuration As Integer
    Dim startTime As DateTime = DateTime.Now
    Dim continueWaiting As Boolean = True
    Dim taskCompleted As Boolean = False
 
    Dts.VariableDispenser.LockOneForRead("TaskDuration", vars)
    maxDuration = System.Convert.ToInt32(vars("TaskDuration").Value)
    vars.Unlock()
 
    Do While continueWaiting
        System.Threading.Thread.Sleep(1000)
 
 
        Dts.VariableDispenser.LockOneForRead("TaskSuccessful", vars)
        taskCompleted = System.Convert.ToBoolean(vars("TaskSuccessful").Value)
        vars.Unlock()
 
        If taskCompleted Or _
            (DateDiff(DateInterval.Second, startTime, DateTime.Now) > maxDuration) Then
            continueWaiting = False
        End If
    Loop
 
    Dts.TaskResult = Dts.Results.Success
End Sub

 

There are few things to be aware of with this approach. It's not foolproof by any means. Any process that ties up 100% of the CPU is likely going to make this fail. Also, this depends on parallel execution of the long running task and the monitoring task. If you set the MaxConcurrentExecutables property in your package to 1, this will not work. It needs to be run on a machine with enough power to start the task at roughly the same time. In my experience, this hasn't been a problem with the hardware that I work with, but it is a potential failure point.

The sample file is posted on my Skydrive, as usual.

More Posts Next page »