Dynamically Pivoting Rows to Columns

Previously, I posted about dynamically un-pivoting columns to rows. In this post, we’ll look at the reverse scenario. Suppose that you have a list of sales orders, in this format:
























Sales Order Number Product
SO-001 Hammer
SO-001 Table Saw
SO-001 Cordless Screwdriver
SO-002 Table Saw
SO-002 Wrench
SO-003 Flashlight

If there is a (reasonable) maximum number of products, you can create the maximum number of columns needed, and use the Pivot transform. However, if the number of products per order vary significantly, you might want to use something a little more flexible. This can be accomplished by storing the values in a single column as XML, or as a delimited set of values. 


For this example, we’ll take the input described above and transform it to this:















Sales Order Number Product
SO-001 Hammer|Table Saw|Cordless Screwdriver
SO-002 Table Saw|Wrench
SO-003 Flashlight

The data flow for this is straightforward – a Flat File Source, a Script component, and a Multicast (so that a data viewer can be added to see the results of the script). The Script component is where the bulk of the work is done.


image 


The script transform is configured with a single asynchronous output, by setting Output 0’s SynchronousInputID property to “None”. Two columns are added to Output 0, one for the Sales Order Number, and one for the list of products.


image


The script itself is listed below. Essentially, the script checks each row to see if the sales order number has changed from the previous row. If it is the same, it appends the product to a list. If the sales order number is different, the currently stored values for sales order number and the product list are sent to the asynchronous output.


There is some special handling for the first and last rows. On the first row, we don’t want to output anything, as we need to check the next row first. When the input buffer is finished, we need send the last value for sales order number and product list, and then call the SetEndOfRowSet on the output buffer. To make sure we’ve processed all the rows, we need to check EndOfRowSet on the input buffer. This should be checked in the Input0_ProcessInput routine, as it will never be equal to true in the Input0_ProcessInputRow method.


 



   1: Public Class ScriptMain
   2:     Inherits UserComponent
   3:  
   4:     Private orderNum As String
   5:     Private productList As String
   6:     Private blnFirstRow As Boolean = True
   7:  
   8:     Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
   9:  
  10:         If Row.SalesOrderNumber = orderNum Then
  11:             ‘append this item to the list
  12:             productList = productList + “|” + Row.Product
  13:         Else
  14:             If blnFirstRow Then
  15:                 blnFirstRow = False
  16:             Else
  17:                 ‘output the current values
  18:                 Output0Buffer.AddRow()
  19:                 Output0Buffer.SalesOrderNumber = orderNum
  20:                 Output0Buffer.ProductList = productList
  21:             End If
  22:  
  23:             ‘reset the control variables
  24:             orderNum = Row.SalesOrderNumber
  25:             productList = Row.Product
  26:         End If
  27:     End Sub
  28:  
  29:     Public Overrides Sub Input0_ProcessInput(ByVal Buffer As Input0Buffer)
  30:         MyBase.Input0_ProcessInput(Buffer)
  31:         If Buffer.EndOfRowset Then
  32:             Output0Buffer.AddRow()
  33:             Output0Buffer.SalesOrderNumber = orderNum
  34:             Output0Buffer.ProductList = productList
  35:             Output0Buffer.SetEndOfRowset()
  36:         End If
  37:     End Sub
  38: End Class

This can be useful if you need to pivot a varied number of columns. The same techniques can also be used anywhere you want to process a set of rows and change the output based on previous row values. If you’d like to take a more in-depth look, the sample files are on my SkyDrive:



 


 

This entry was posted in Uncategorized. Bookmark the permalink.

Comments are closed.