Running totals are a common calculation for reporting, and are occasionally needed in ETL processes to populate aggregate tables. One way to accomplish this in SSIS, if you are retrieving data from a relational database, is to do the running total in the SELECT statement. There are a number of resources on the Internet that illustrate how to accomplish this. However, this may not work for you if you are retrieving information from a flat file. Also, the logic to implement a running total may not be straightforward in SQL, particularly if you need sliding windows for the totals (I’ll explain this later).
Fortunately, there are other ways to accomplish this in SSIS. A very flexible method is to use a script component, as this will let you customize the logic however you prefer. As an example, I’ll take a look at a couple of examples from AdventureWorks.
First, the more straightforward example. The AdventureWorks database has a Sales.SalesOrderDetail table that lists the itemized purchases for each order. Note – I’m only filtering the results for example purposes. The actual code in the package works with the whole table.
If I was to calculate a running total for the OrderQty for each SalesOrderID above, I’d expect to see:
SalesOrderID | ProductID | OrderQty | RunningTotal |
43685 | 765 | 3 | 3 |
43685 | 763 | 1 | 4 |
43685 | 754 | 1 | 5 |
43685 | 725 | 1 | 6 |
43686 | 758 | 3 | 3 |
43686 | 762 | 1 | 4 |
43686 | 770 | 1 | 5 |
43687 | 768 | 1 | 1 |
43687 | 765 | 2 | 3 |
Notice that the running total resets each time the SalesOrderID changes. To implement a script component that will calculate the running total, I’ve created a new script component and added an output column to it to hold the running total.
Inside the script component, I’ve added a variable to store the value of the control number (SalesOrderID), and another variable to capture the running total. When the control number changes, the running total is reset.
Public Class ScriptMain
Inherits UserComponent
Dim runningTotal As Integer
Dim previousValue As Integer
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
If Not previousValue = Row.SalesOrderID Then
runningTotal = 0
previousValue = Row.SalesOrderID
End If
runningTotal += Row.OrderQty
Row.RunningTotal = runningTotal
End Sub
End Class
This is an easy method to calculate running totals in the data flow. One thing to note is that the data must be sorted by the control number (SalesOrderID in the example) in order for this to work properly.
This is in an example package on my SkyDrive here. I will follow up this post with another sample showing a more complex "running total" problem and how it can be solved in SSIS as well.