Calculating a Running Total In SSIS

Running totals are a common calculation for reporting, and are occasionally needed in ETL processes to populate aggregate tables. One way to accomplish this in SSIS, if you are retrieving data from a relational database, is to do the running total in the SELECT statement. There are a number of resources on the Internet that illustrate how to accomplish this. However, this may not work for you if you are retrieving information from a flat file. Also, the logic to implement a running total may not be straightforward in SQL, particularly if you need sliding windows for the totals (I’ll explain this later).

Fortunately, there are other ways to accomplish this in SSIS. A very flexible method is to use a script component, as this will let you customize the logic however you prefer. As an example, I’ll take a look at a couple of examples from AdventureWorks.

First, the more straightforward example. The AdventureWorks database has a Sales.SalesOrderDetail table that lists the itemized purchases for each order. Note – I’m only filtering the results for example purposes. The actual code in the package works with the whole table.

image

If I was to calculate a running total for the OrderQty for each SalesOrderID above, I’d expect to see:

SalesOrderID ProductID OrderQty RunningTotal
43685 765 3 3
43685 763 1 4
43685 754 1 5
43685 725 1 6
43686 758 3 3
43686 762 1 4
43686 770 1 5
43687 768 1 1
43687 765 2 3

Notice that the running total resets each time the SalesOrderID changes. To implement a script component that will calculate the running total, I’ve created a new script component and added an output column to it to hold the running total.

image

Inside the script component, I’ve added a variable to store the value of the control number (SalesOrderID), and another variable to capture the running total. When the control number changes, the running total is reset.

Public Class ScriptMain
    Inherits UserComponent
    Dim runningTotal As Integer
    Dim previousValue As Integer
 
 
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        If Not previousValue = Row.SalesOrderID Then
            runningTotal = 0
            previousValue = Row.SalesOrderID
        End If
 
        runningTotal += Row.OrderQty
 
        Row.RunningTotal = runningTotal
    End Sub
 
End Class

This is an easy method to calculate running totals in the data flow. One thing to note is that the data must be sorted by the control number (SalesOrderID in the example) in order for this to work properly.

This is in an example package on my SkyDrive here. I will follow up this post with another sample showing a more complex "running total" problem and how it can be solved in SSIS as well.

This entry was posted in Uncategorized. Bookmark the permalink.

Comments are closed.