Presentation at the Triad SQL Server User Group

On May 15th I’ll be presenting at the Triad SQL Server User Group, located in High Point, NC. The presentation will be focusing on new features in SQL Server 2008 for business intelligence and data warehousing. If you’re in the area, please stop in and introduce yourself.


SSIS 2008 – Incrementally Updating the Lookup Cache File

There are a number of new features in the 2008 Lookup Transform. Jamie Thomson posted a good overview of the features, and I recommend reading it prior to this article. One of the new features in SSIS 2008 is the ability to cache the data from the Lookup to a local file. This should help improve performance, as the lookup data would only need to be retrieved once, stored locally, and reused. Internally, the lookup cache file (.caw) uses a RAW file structure (.raw), which is very fast to read and write from disk.


One of the first things that I wanted to try with the new caching features was to dynamically update the cache file. That proved to be a bit of a challenge, as Jamie pointed out in his article. One of the main reasons for this is that, while the cache file is very similar to a RAW file, the cache file stores some additional metadata – hash values for each row, and some header information. This means that while you can use a RAW Source component to read the data from a cache file, you must use the Cache Transform to actually create a cache file.
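Conceptually, the cache pairs each row with a hash of its index (key) columns so probes are fast. A rough Python illustration of that idea — the hashing scheme and layout below are invented for illustration only; the real .caw format and its internal hash are undocumented SSIS internals:

```python
import hashlib

def build_cache(rows, key_columns):
    """Index rows by a hash of their key columns (illustrative only --
    the real .caw file uses its own internal hash and header format)."""
    cache = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        digest = hashlib.md5(repr(key).encode("utf-8")).hexdigest()
        cache[digest] = row
    return cache

def lookup(cache, key_columns, probe):
    """Probe the cache the way the Lookup transform probes its index."""
    key = tuple(probe[c] for c in key_columns)
    digest = hashlib.md5(repr(key).encode("utf-8")).hexdigest()
    return cache.get(digest)
```

This is only meant to show why a pre-hashed local file beats re-querying the reference table on every run.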


Unfortunately, the Cache Transform does not support incremental updates to the cache file. Each time you use it, it recreates the file completely. However, with a little creativity, you can minimize the amount of data you have to process. I’ve created a sample package that illustrates one approach to this.
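The approach boils down to a simple merge: keep the cached rows that were not modified, and take every modified row fresh from the source. A minimal Python sketch of that merge (the package itself does this with a Lookup and a Union All):

```python
def merge_cache(cached_rows, modified_rows, key):
    """Rebuild the lookup cache: modified rows win over their cached copies."""
    modified_keys = {row[key] for row in modified_rows}
    # Keep only cached rows that were NOT modified since the last run...
    unchanged = [row for row in cached_rows if row[key] not in modified_keys]
    # ...and union them with the freshly extracted modified rows.
    return unchanged + list(modified_rows)
```

The full file still gets rewritten (the Cache Transform requires that), but only the modified rows are re-read from the source system.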


The control flow looks like this (the numbers refer to the steps below):


image


The major components of the process are:



  1. Retrieve the last modified date from the Lookup reference table, by using an Execute SQL Task.

  2. Use a Script Task to determine if the cache file already exists. This leverages the System.IO.File class, as discussed in Checking for the Existence of a File. Execution can proceed to either step 3, 4, or 7, based on precedence constraints with expressions.

  3. If the file does not exist, it is created through a data flow. This data flow is straightforward – it retrieves all the rows from the reference table and runs them through a Cache Transform.
      image

  4. If the cache file does exist and the reference table has been modified since the last run of the package, a File System task is used to rename it. This is done so that the cache file can be recreated in the next step. It cannot be read from and written to at the same time, so there have to be two copies of it for a short time.

  5. Now the cache file is updated within a data flow task. The OLE DB Source retrieves only the modified rows from the reference table. The rows that haven’t been changed are pulled in from the cache file, using the Raw File Source. The Lookup is used to determine whether an existing row from the cache file is also included in the set of modified rows from the reference table. The lookup uses an expression to set the reference query, so that it contains the same set of rows as the OLE DB Source. If the row matches one from the OLE DB Source, the row is discarded; only non-matching rows are passed through the lookup. Finally, the rows are combined using a Union All and written back out to a new cache file (with the same name as the original).
    image

  6. The copied version of the cache file is deleted, now that it is no longer needed.

  7. This data flow is used to test the resulting cache file. It is not necessary in a production package.
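The branching in steps 2 through 4 can be sketched as a small decision function (a hypothetical helper, shown in Python for clarity; the package implements this with a Script Task plus precedence constraint expressions):

```python
import os

def next_step(cache_path, table_modified, last_run, exists=os.path.exists):
    """Decide which branch of the control flow to take.
    Returns the step number from the list above:
    3 (create the cache), 4 (rename, then update), or 7 (cache is current)."""
    if not exists(cache_path):
        return 3   # no cache file yet: build it from scratch
    if table_modified > last_run:
        return 4   # cache is stale: rename it, then rebuild incrementally
    return 7       # cache is current: go straight to the test data flow
```

The `exists` parameter is injected only so the sketch is testable; the real script task calls System.IO.File.Exists directly.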

The Lookup table for this example looks like:





CREATE TABLE [dbo].[StatusLookup](
    [ValidValue] [varchar](50) NULL,
    [AlternateValue] [varchar](50) NULL,
    [LastUpdate] [datetime] NULL
) ON [PRIMARY]
GO

This package uses a number of variables:


image


and several expressions (some of which are on the variables above):


image


These are used to make the package a bit more dynamic. The LastRunDate variable controls when the package thinks it was last executed. In a production application, this would need to be set from a parent package or an Execute SQL Task. The LookupTable variable controls the name of the cache files.
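For illustration, the expression on the reference query essentially splices the LastRunDate value into a WHERE clause. A sketch in Python of the string the expression builds, using the sample's StatusLookup columns (the function name is hypothetical; in the package this is an SSIS expression, not code):

```python
def incremental_query(table, last_run_date):
    """Build the SELECT used by both the OLE DB Source and the Lookup,
    so each sees exactly the same set of modified rows."""
    return ("SELECT ValidValue, AlternateValue, LastUpdate FROM " + table +
            " WHERE LastUpdate > '" + last_run_date + "'")
```

Because the same expression feeds both components, the discard logic in step 5 stays consistent even as LastRunDate changes.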


I encourage you to download the sample and walk through it to get an in-depth understanding. I’ve really only covered the high points in this article. It’s posted on my SkyDrive.


This was done against CTP6 of SQL Server 2008. The RTM release may necessitate some changes to the pattern and/or the code described here.


Setting the Lookup Query With an Expression

In SSIS 2005, one of the challenges with the Lookup transform was that the SQL used to populate the cache was fixed at design time. You could dynamically pass in parameters to the SQL, but only if you turned on the advanced caching options, which puts the lookup into row-by-row operation. With smaller tables (<100,000 rows) this wasn’t too much of an issue, but with larger tables it could be a problem. A common scenario where this functionality would be nice is with Type 2 slowly changing dimensions (SCD). If you are extracting a single day of source records (doing a daily incremental load from the source system), then the lookup cache only needs to contain records that were in effect on the date of the extract. In SSIS 2005, there was no easy way to accomplish this. However, the functionality has been added in 2008.


To illustrate this, I’ve created a simple package with a single data flow that uses AdventureWorks and AdventureWorksDW. The OLE DB Source extracts Sales Order Detail records for a single day from AdventureWorks, based on the RunDate variable.


image


image


The Product Lookup is getting its rows from the DimProduct table in AdventureWorksDW. This is a Type 2 SCD table, with effective dates. The Lookup is set up normally. Initially, I had to specify a default query that returned the columns needed from DimProduct, so that the metadata for the data flow was set up properly.


To add an expression to the Lookup, I went back to the Control Flow view, and selected the Data Flow Task. Now, in the Properties window, you can see two properties associated with the Lookup:


image


SqlCommand is used to override the inner SQL statement used by the Lookup, while SqlCommandParam can be used to modify the entire statement. Normally, SqlCommand is enough for most purposes. I have an expression set on the property that builds a SELECT statement dynamically, including a WHERE clause that filters by the RunDate variable.
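A sketch of what such an expression produces: a reference query filtered to the dimension rows in effect on the run date. This is Python for illustration only, and the column names are approximations of DimProduct, not the exact query in the sample:

```python
def scd_lookup_query(run_date):
    """Reference query for the Product Lookup: only Type 2 dimension rows
    whose effective-date range covers the extract date."""
    return ("SELECT ProductKey, ProductAlternateKey FROM dbo.DimProduct "
            "WHERE StartDate <= '" + run_date + "' "
            "AND (EndDate IS NULL OR EndDate > '" + run_date + "')")
```

The expression on SqlCommand concatenates the RunDate variable into the statement the same way.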


image


The primary advantage to doing this is a reduction in the number of rows in the lookup cache. For the AdventureWorks sample, filtering the set of dimension rows down to just the ones in effect as of the Sales Order date reduces the cached rows from 606 to 339. Not a significant amount in this example, but a similar percentage reduction on a million-row dimension table would lead to substantial savings.


I’ve uploaded the sample package to my SkyDrive.


Renaming the Script Project in SSIS 2008

I found something interesting in SSIS 2008 today, and thought I’d share it. I was working with David Darden on a package that included a script task. In 2008, the Visual Studio Tools for Applications (VSTA) environment is used to edit script tasks. It exposes a few more details about the script task, including the project name. By default, it creates a rather long project name, as you can see in the screen shot below. (Yes, that is C#. It’s my preferred language, so expect to see more of it here now that it’s available in 2008). The namespace is based on the project name, so you get the same long value for it.

image

While poking around in the script task’s expressions editor, we found the ScriptProjectName property. This property does not show up in the Properties window (F4), only in the expressions editor. By setting this, you can update the project name in the script. For example, setting this property to "NewScriptProjectName", as shown here:

image

results in this showing up in the script project:

image 

There are a few caveats to this. One, if you do this after initially editing the script, the project name will be updated, but the namespace will remain the same. Two, you must save the package after setting the expression, in order for the expression to be applied. If you add the expression, then edit the script without saving, it will not use the new project name.

So, if you want to name your script project with a friendly name, do the following:

  1. Add the Script Task.
  2. Add an expression to set the ScriptProjectName property to your desired value.
  3. Save the package.
  4. Now you can use the Edit Script button to edit the script project with the friendly name.

I do want to point out that the name of the project has no real impact on functionality. The script will not run any differently if you do this; it’s purely aesthetic. But if the long, random-looking project names bother you, it’s nice to know you can fix them. There is also an MSDN forum post here that details the steps to edit the namespace after the script has been edited for the first time. It’s a little easier to set it this way, though, so that it is handled up front.


MDX Filters in PPS Monitor

I had a question on how to implement MDX Filters in PerformancePoint Monitor a while back. Nick Barclay provided an answer here.

Some samples Nick provided were:

[Time].[Fiscal Year].Children

and

DESCENDANTS(
  [Geography].[Geography].[All Geographies]
 ,[Geography].[Geography].[Country]
 ,SELF_AND_BEFORE)

Depending on how you have implemented your cube, you can also do some other clever things.  On a recent project, I had a Date dimension with a hierarchy that would present the current month as ‘Current Month’, so MDX such as the following:

{[Dim Date].[Calendar].[Month Name].&[Current Month].Lag(6):[Dim Date].[Calendar].[Month Name].&[Current Month]}

could be used to provide the last 6 months contained in the cube as a drop down list:

Nov 2007
Dec 2007
Jan 2008
Feb 2008
Mar 2008
Current Month

Another useful possibility is to provide a filter based on a hierarchy, but an individual member (in this case the ‘Unknown’ member) should not be displayed in the pick list:

FILTER(
    DESCENDANTS(
         [Entity].[Entity]
        ,[Entity].[Entity]
        ,SELF_AND_BEFORE
    )
    ,NOT [Entity].[Entity].CURRENTMEMBER IS [Entity].[Entity].&[Unknown]
)
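The FILTER above amounts to set subtraction over the hierarchy's members; the same idea in Python terms (an illustrative helper, not anything from the MDX engine):

```python
def filter_members(members, excluded=("Unknown",)):
    """Drop specific members (e.g. the 'Unknown' member) from a pick list,
    as the MDX FILTER(...) over the hierarchy does."""
    return [m for m in members if m not in excluded]
```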


David Darden’s Blog

One of my colleagues, David Darden, has started blogging on www.agilebi.com. He’s been working with the Microsoft BI stack for several years now, and has a lot of great knowledge to share. His first post is on using the PerformancePoint SDK to create scorecards. It’s a good read, and if you are interested in extending the built-in PerformancePoint functionality, I highly recommend taking a look.

http://agilebi.com/cs/blogs/ddarden/archive/2008/04/26/creating-a-scorecard-transform-in-performancepoint-server-monitor.aspx


Creating a Scorecard Transform in PerformancePoint Server Monitor


Overview

I’ve spent a couple of days doing some prototyping around PerformancePoint Monitor… there have been a few things that I’ve wanted to accomplish, so when the PerformancePoint Monitoring SDK was released, it seemed like the way to do it.  However, there aren’t a lot of resources around Scorecard Transforms out there, so I thought I’d document some of the things I found.  This documentation won’t be exhaustive, but it will hopefully point everyone in the right direction.  Keep in mind this is just prototype code… a number of enhancements should be made before putting this into production.

I’ve created a sample project (and a simple dashboard) to do the following:

  1. Perform a Magic Number transformation – I’ve had clients want a message to be returned to the user instead of a value… such as having conditional logic in a calculated member that returns percentages between 0 and 1, then -1 if no data exists, -2 if the KPI is invalid in the chosen scenario, etc.
  2. Display text on the scorecard that is database driven – I have a request to provide database driven, text metadata for KPIs on each scorecard.

The dashboard contains two scorecards (‘Sample Scorecard’ that the transforms are applied to, and ‘Sample Scorecard 2’ which the transforms are not applied to), with two KPIs.  You will need to publish the objects in the .bswx file (included in the Visual Studio solution) in order to see the results of the transforms.

You can download the sample project here.

Where to Start

The first thing to do is to create a DLL containing your transform.  I started with a C# class library.  Remember that your DLL will have to be strongly named.  The basic steps are to run SN.exe to create a key file, then use the resulting .snk file in your project.

Once you’ve created your project, you’ll need to create a class that implements IGridViewTransform, with three methods:

  1. public GridViewTransformType GetTransformType()
  2. public string GetId()
  3. public void Execute(GridViewData viewData, PropertyBag parameters, IGlobalCache cache)

The first method is used to determine when in the lifecycle of a scorecard the transform is run.  Your options are PreRender, PreQuery, PostQuery, and PerUser.  Here’s a sample:

/// <summary>
/// Returns the type of transform that will be applied
/// </summary>
/// <returns>Grid view transform type</returns>
public GridViewTransformType GetTransformType()
{
    return GridViewTransformType.PreRender;
}

The second method simply returns the name of the transform:

/// <summary>
/// Returns the name of the transform
/// </summary>
/// <returns>Name of transform</returns>
public string GetId()
{
    return "TransformMagicNumberToText";
}

The Execute method is what we’re really interested in… this is run every time a scorecard is displayed.  OK, that’s not quite true… every time a scorecard is run for an individual with a particular set of parameters, the result gets cached for a while… keep that in mind when debugging.  An IISRESET will make sure the transform is run again.

/// <summary>
/// This method is called each time a Scorecard is rendered
/// </summary>
/// <param name="viewData">Scorecard view that the transform acts on</param>
/// <param name="parameters">Parameters passed to the method</param>
/// <param name="cache">Global cache of objects</param>
public void Execute(GridViewData viewData, PropertyBag parameters, IGlobalCache cache)
{
}

First, we’ll set up some variables that we’ll use later.  The idea is that there are some of these transforms that should only be run on certain scorecards, columns, KPIs, etc.  I’m prototyping these for a publicly accessible dashboard… so I want to keep performance in mind.  The final version will probably utilize caching in order to increase performance.  Right now, I just threw in some basic functionality to make the transform a little more ‘selective’.

// These will be read from a config file or a database in production
// Use to limit the columns in a Scorecard that this transform will run on
string columnNames = "|Magic Column|Another Column|";
// Use to limit what scorecards this transform will run on
string scorecardNames = "|Sample Scorecard|Another Scorecard|";
// Value to replace, and text to replace it with; this would probably really be a list
decimal magicValue = Convert.ToDecimal(-1);
string replacementText = "Invalid Data";

// Columns we're interested in that are on the scorecard
List<GridHeaderItem> columnsInScorecard = new List<GridHeaderItem>();

Next, we’re going to get the name of the scorecard, so we will only run the transform on scorecards we want.

// Get the Scorecard that is currently being transformed
Scorecard scorecard = cache.GetScorecard(viewData.ScorecardId);
// Get the name of the scorecard
string scorecardName = scorecard.Name.Text;

// Validate that this scorecard should have the transform applied
// If not, short-circuit
if (scorecardNames.IndexOf("|" + scorecardName + "|") < 0)
{
    return;
}

Next, we’ll run through the list of GridHeaderItems that exists on the Scorecard.  We’ll build a list of the GridHeaderItems whose display text matches the columns we want to be able to transform.

// Get the row and column headers
List<GridHeaderItem> rowHeaders = viewData.RootRowHeader.GetAllHeadersInTree();
List<GridHeaderItem> columnHeaders = viewData.RootColumnHeader.GetAllHeadersInTree();

// Iterate through the Column headers to get a list
// containing the columns we're interested in
foreach (GridHeaderItem ghi in columnHeaders)
{
    // See if the column exists in our list
    if (columnNames.IndexOf("|" + ghi.DisplayText + "|") >= 0)
    {
        // Add it to the list if so
        columnsInScorecard.Add(ghi);
    }
}

Now we’re going to look through all the columns and rows on the scorecard.  We’ll look at each display element, see if it matches our criteria, then replace it if it does.  We change a number to text by removing the initial display element and replacing it with a new one.

Now we have a basic transform that, for a particular scorecard and column, will replace specific numbers with text.

// Iterate through the list of columns we're interested in
foreach (GridHeaderItem columnHeader in columnsInScorecard)
{
    // Look at each row on the scorecard
    foreach (GridHeaderItem rowHeader in rowHeaders)
    {
        // Variable to hold the index that we want to remove
        // This may need to become a list, or another type of structure
        int indexToRemove = -1;

        for (int i = 0; i < viewData.Cells[rowHeader, columnHeader].DisplayElements.Count; ++i)
        {
            GridDisplayElement gde = viewData.Cells[rowHeader, columnHeader].DisplayElements[i];

            // Look for number in the display properties
            // In this case, look for the Value, not the formatted display
            if (gde.DisplayElementType == DisplayElementTypes.Number)
            {
                // Find our magic value that we use as a key
                // to know when to replace the value
                if (gde.Value == magicValue)
                {
                    indexToRemove = i;
                }
            }
        }

        if (indexToRemove > -1)
        {
            viewData.Cells[rowHeader, columnHeader].DisplayElements.RemoveAt(indexToRemove);

            GridDisplayElement newGde = new GridDisplayElement();
            newGde.DisplayElementType = DisplayElementTypes.Text;
            newGde.Text = replacementText;

            viewData.Cells[rowHeader, columnHeader].DisplayElements.Add(newGde);
        }
    }
}
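Stripped of the PerformancePoint object model, the cell-rewriting logic above reduces to the following (a Python sketch over plain dictionaries, not the actual API):

```python
def transform_magic_numbers(cells, magic_value=-1, replacement="Invalid Data"):
    """cells: dict of (row, column) -> list of display elements, where each
    element is a dict with 'type' and 'value'. A numeric element equal to
    the magic value is removed and replaced by a text element."""
    for key, elements in cells.items():
        index_to_remove = -1
        for i, el in enumerate(elements):
            # Look at the raw Value, not the formatted display
            if el["type"] == "Number" and el["value"] == magic_value:
                index_to_remove = i
        if index_to_remove > -1:
            del elements[index_to_remove]
            elements.append({"type": "Text", "value": replacement})
    return cells
```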


Deploying the Transform

Alright, so we’ve written and compiled our first transform.  Now, all we need to do is actually install it.  There isn’t any documentation (at the time of writing this blog) in the PerformancePoint Monitoring SDK on how to install a Scorecard Transform… but there is documentation on how to Install Report Viewer Extensions.  I took a guess that the process would be pretty much the same (albeit instead of using the <CustomReportViews> section, you use the <CustomViewTransforms> section in the config files), and it appears so.  You can just follow those steps with that minor change.

Here is an example of what your new <CustomViewTransforms> section will look like:

...
<CustomViewTransforms>
  <add key="ExpandNamedSets" value="Microsoft.PerformancePoint.Scorecards.GridViewTransforms.ExpandNamedSets, Microsoft.PerformancePoint.Scorecards.Server, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
  <add key="RowsColumnsFilterTransform" value="Microsoft.PerformancePoint.Scorecards.GridViewTransforms.RowsColumnsFilterTransform, Microsoft.PerformancePoint.Scorecards.Server, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
  <add key="AnnotationTransform" value="Microsoft.PerformancePoint.Scorecards.GridViewTransforms.AnnotationTransform, Microsoft.PerformancePoint.Scorecards.Server, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
  <add key="UpdateDisplayText" value="Microsoft.PerformancePoint.Scorecards.GridViewTransforms.UpdateDisplayText, Microsoft.PerformancePoint.Scorecards.Client, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
  <add key="ComputeRollups" value="Microsoft.PerformancePoint.Scorecards.GridViewTransforms.ComputeRollups, Microsoft.PerformancePoint.Scorecards.Client, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
  <add key="ComputeAggregations" value="Microsoft.PerformancePoint.Scorecards.GridViewTransforms.ComputeAggregations, Microsoft.PerformancePoint.Scorecards.Client, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35" />
  <!-- add key="ApplyDefaultFormatInfo" value="Microsoft.PerformancePoint.Scorecards.Client.ApplyDefaultFormatInfo, Microsoft.PerformancePoint.Scorecards.Client, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35"/-->
  <!-- New Scorecard Transforms -->
  <add key="TransformReplacePropertyText" value="ScorecardTransformPrototypeLibrary.TransformReplacePropertyText, ScorecardTransformPrototypeLibrary, Version=1.0.0.0, Culture=neutral, PublicKeyToken=dc084a0be77df9c1" />
  <add key="TransformMagicNumberToText" value="ScorecardTransformPrototypeLibrary.TransformMagicNumberToText, ScorecardTransformPrototypeLibrary, Version=1.0.0.0, Culture=neutral, PublicKeyToken=dc084a0be77df9c1" />
  <!-- End New Scorecard Transforms -->
</CustomViewTransforms>
...

After modifying your three web.config files and GACing your assembly, your scorecard should be ready to run!

Debugging your Assembly

OK, so your brand new assembly is running… and not exactly doing what you’re expecting it to.  Or perhaps you just want to see what’s going on in all those objects that you’re using in your code.  Unless you’re already developing on your SharePoint box, you’ve now got a few hoops to jump through in order to be able to debug.  The configuration I use is to develop locally, then deploy to a server… so I’ll run through how to get that working.

First, you need to setup remote debugging on your server.

Next, you’ll need to put your debug symbols into the GAC with your assembly.

Then, you’ll need to restart IIS so your new assembly gets loaded (otherwise, you’ll probably get some funky results).

I went ahead and created a batch script to do this (included in the Visual Studio solution… I created a folder on the SharePoint server for this to live in)… your filenames and paths will be different, but this is the basic concept:

@ECHO OFF
REM: This batch file will pull the assembly and debug symbols
REM: from the development machine, GAC the assembly, and
REM: put the debug symbols into the GAC folder.

REM: Copy the scorecard files to this server
COPY "\\DDARDEN\C$\Users\ddarden\Documents\Visual Studio 2005\Projects\ScorecardTransformPrototype\ScorecardTransformPrototypeLibrary\bin\Debug\Score*.*" .
ECHO Copied the scorecard files to the server

REM: Add the DLL to the GAC
"C:\Program Files\Microsoft Visual Studio 8\SDK\v2.0\Bin\gacutil.exe" /i ScorecardTransformPrototypeLibrary.dll
ECHO GAC'd the DLL

REM: Copy the debug symbols to the GAC directory to enable remote debugging
COPY ScorecardTransformPrototypeLibrary.pdb C:\WINDOWS\assembly\GAC_MSIL\ScorecardTransformPrototypeLibrary\1.0.0.0__dc084a0be77df9c1
ECHO Loaded the symbols

REM: Recycle IIS to reload the assembly
IISRESET
ECHO Reset IIS

Now you’re going to need to attach to the correct process on the remote machine.  You want the w3wp process… there may be a few, so you might have to experiment a little to find the right one (in my configuration, I want the one running under the network service – NOT the SharePoint domain service account).  When you attach to a process, look at the output to see if your assembly got loaded with symbols.  The process you’re looking for will have the other PerformancePoint Server Monitor assemblies loaded, such as:

'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.WebParts\3.0.0.0__31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.WebParts.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.Client\3.0.0.0__31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.Client.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.Server\3.0.0.0__31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.Server.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.DataSourceProviders.Standard\3.0.0.0__31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.DataSourceProviders.Standard.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.Client.resources\3.0.0.0_en_31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.Client.resources.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.WebControls\3.0.0.0__31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.WebControls.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.Script\3.0.0.0__31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.Script.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.Common\3.0.0.0__31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.Common.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.WebParts.resources\3.0.0.0_en_31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.WebParts.resources.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.Script.resources\3.0.0.0_en_31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.Script.resources.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.
'w3wp.exe' (Managed): Loaded 'C:\WINDOWS\assembly\GAC_MSIL\Microsoft.PerformancePoint.Scorecards.WebControls.resources\3.0.0.0_en_31bf3856ad364e35\Microsoft.PerformancePoint.Scorecards.WebControls.resources.dll', Skipped loading symbols. Module is optimized and the debugger option 'Just My Code' is enabled.

If you see them being loaded, but you don’t see yours, then you probably messed up registering your assembly.

Now, you should be able to set break points, and debug your assembly!

Conclusion

OK… those are the steps to create a new Scorecard Transform, deploy it to the server, and to attach to the process and debug it.  Happy Monitoring!


Common Expressions In the Data Flow

I use the Derived Column transform in the Data Flow a lot. In fact, I can’t remember the last time I wrote a production package that didn’t include at least one Derived Column transform in it. Over time, I’ve built up a lot of common expressions that I keep using, as well as some that I stored "just in case". I thought it would be good to post a few of them here, primarily because it makes it easier for me to find them, but hopefully others will find these useful as well.

References to columns ([StringColumn], for example) need to be replaced with the appropriate column name in the data flow. Values in italics need to be updated with the appropriate value (literal, column, or variable). Also, some of the date expressions may not work if your regional settings are not US date formats.

Count the occurrences of a specific character in a string:
LEN([StringColumn]) - LEN(REPLACE([StringColumn],"Character to count",""))

Cast a date stored in a string column as YYMMDD to DT_DBTIMESTAMP:
(DT_DBTIMESTAMP)(SUBSTRING([StringColumn],3,2) + "-" + SUBSTRING([StringColumn],5,2) + "-" + SUBSTRING([StringColumn],1,2))

Cast a date stored in a string column as YYYYMMDD to DT_DBTIMESTAMP:
(DT_DBTIMESTAMP)(SUBSTRING([StringColumn],1,4) + "-" + SUBSTRING([StringColumn],5,2) + "-" + SUBSTRING([StringColumn],7,2))

Cast a date stored in a string column as DDMMMYYYY to DT_DBTIMESTAMP:
(DT_DBTIMESTAMP)(SUBSTRING([StringColumn],3,3) + "-" + SUBSTRING([StringColumn],1,2) + "-" + SUBSTRING([StringColumn],6,4))

Pad a string to the specified length:
[StringColumn] + REPLICATE(" ", 10 - LEN([StringColumn]))

Check a date to see if it is in a valid range and replace with NULL if it is not:
[DateColumn] < (DT_DBTIMESTAMP)"1753/01/01" ? NULL(DT_DBTIMESTAMP) : [DateColumn] > (DT_DBTIMESTAMP)"9999/12/31" ? NULL(DT_DBTIMESTAMP) : [DateColumn]

Convert seconds to minutes and seconds:
(DT_WSTR,20)([NumberOfSeconds] / 60) + ":" + (DT_WSTR,20)([NumberOfSeconds] % 60)

Combine date and time strings to DT_DBTIMESTAMP:
(DT_DBTIMESTAMP)("2008-03-25" + " " + "10:20:00")

Return True if the date is a weekend:
DATEPART("Weekday",[DateColumn]) == 7 ? (DT_BOOL)"True" : DATEPART("Weekday",[DateColumn]) == 1 ? (DT_BOOL)"True" : (DT_BOOL)"False"

Cast a string to a numeric with a range check, returning 0 if out of range:
[StringColumn] > "9" ? 0 : [StringColumn] < "0" ? 0 : ((DT_NUMERIC,1,0)[StringColumn])
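A few of these expressions translated into Python, which makes the logic easy to verify outside the expression editor (helper names are mine, for illustration only):

```python
def count_char(s, ch):
    # LEN(s) - LEN(REPLACE(s, ch, ""))
    return len(s) - len(s.replace(ch, ""))

def yymmdd_to_mdy(s):
    # Rearranges a YYMMDD string into MM-DD-YY, the same reshuffle
    # the SSIS SUBSTRING expression does before the DT_DBTIMESTAMP cast
    return s[2:4] + "-" + s[4:6] + "-" + s[0:2]

def pad_string(s, length=10):
    # s + REPLICATE(" ", length - LEN(s))
    return s + " " * (length - len(s))
```

Note that, like the SSIS REPLICATE expression, `pad_string` assumes the input is no longer than the target length.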

Sending an Alert if a Package Runs Too Long

A question that has come up in a few different situations recently is how to send someone an alert if your package is taking too long to execute. For example, you might have a package that loads a large number of rows from a source system to a destination on a nightly basis. There are a number of other packages that depend on this one to complete before they can begin running. The initial package normally takes 15 minutes to run, but on occasion, the source system can be under heavy load, so the package can run 45 minutes to an hour. In this scenario, you’d like to alert people that the load will be delayed.

If you are using a scheduling tool, you should first check whether it has this capability. Ideally, you want this monitoring process outside of the package itself; if it is internal to the package, there is a chance the alert won't be sent, if the DTEXEC process has really gotten hung up. Some scheduling tools can monitor the duration of an execution. SQL Agent, however, does not have this capability, so you may need to implement the monitoring in the package itself. Another reason to do it in the package is to monitor the duration of a specific set of tasks, rather than the entire package.

Fortunately, this isn’t too difficult to implement in SSIS. All it takes is a little scripting. I’ve set up a sample package that illustrates how to do this. The main package looks like this:

image

The Long Running Task script task simulates a long running task by running for a specific number of seconds (the value of the TaskDelay variable below). The Mark Task Successfully Completed task sets the TaskSuccessful variable to True, so that the package knows when the long running task has completed. The Monitor Duration script task is the one that actually checks the execution time, based on the value in the TaskDuration variable. If the execution has taken more seconds than TaskDuration specifies, the script exits. A precedence constraint then checks whether the TaskSuccessful variable is still False; if so, the Send an Alert task runs. In the sample, Send an Alert is a script task, but in a real package it would likely be a Send Mail task.

image
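The precedence constraint between Monitor Duration and Send an Alert would use an expression along these lines (with the Evaluation operation set to Expression; the variable name matches the sample described above):

```
@TaskSuccessful == False
```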

As stated above, the real work happens in the script. Basically, it performs a check once a second to see if either:

  1. The TaskSuccessful variable is True, meaning that the long running task completed successfully, or
  2. The difference in seconds between the current time and the time when the script started running is greater than the value in the TaskDuration variable.

If either case is true, the script exits. If both cases are false, the script sleeps for a second, then checks again. Putting the thread to sleep helps make sure that CPU isn’t tied up checking the conditions too often.

Public Sub Main()
    Dim vars As Variables
    Dim maxDuration As Integer
    Dim startTime As DateTime = DateTime.Now
    Dim continueWaiting As Boolean = True
    Dim taskCompleted As Boolean = False
 
    ' Read the maximum allowed duration (in seconds) from the package variable.
    Dts.VariableDispenser.LockOneForRead("TaskDuration", vars)
    maxDuration = System.Convert.ToInt32(vars("TaskDuration").Value)
    vars.Unlock()
 
    Do While continueWaiting
        ' Sleep for a second so the loop doesn't tie up the CPU.
        System.Threading.Thread.Sleep(1000)
 
        ' Check whether the long running task has flagged itself as complete.
        Dts.VariableDispenser.LockOneForRead("TaskSuccessful", vars)
        taskCompleted = System.Convert.ToBoolean(vars("TaskSuccessful").Value)
        vars.Unlock()
 
        ' Exit the loop if the task finished or the time limit was exceeded.
        If taskCompleted Or _
            (DateDiff(DateInterval.Second, startTime, DateTime.Now) > maxDuration) Then
            continueWaiting = False
        End If
    Loop
 
    Dts.TaskResult = Dts.Results.Success
End Sub

 

There are a few things to be aware of with this approach. It’s not foolproof by any means. Any process that ties up 100% of the CPU is likely to make it fail. It also depends on the long running task and the monitoring task executing in parallel: if you set the MaxConcurrentExecutables property in your package to 1, this will not work, and the package needs to run on a machine with enough capacity to start both tasks at roughly the same time. In my experience, this hasn’t been a problem with the hardware that I work with, but it is a potential failure point.

The sample file is posted on my Skydrive, as usual.

Posted in Uncategorized | Comments Off on Sending an Alert if a Package Runs Too Long

RetainSameConnection and Expressions

I ran across a forum post today that highlighted a limitation of connection managers that I wasn’t aware of, so I thought I’d post it here. If you have RetainSameConnection set to TRUE, and you are also trying to update the connection string through an expression, be aware that the connection will not change to reflect the new connection string. Basically, the connection will retain its original value (as RetainSameConnection would indicate).


You might encounter this if you are looping through a set of databases on a server. If you wanted to perform an operation on each database that required a temp table, you would set RetainSameConnection to TRUE. Unfortunately, as soon as you do that, the connection will no longer be dynamic.
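The behavior is easy to picture: with RetainSameConnection, the connection manager hands back the connection it opened first, regardless of what the expression-driven connection string now evaluates to. Here is a rough sketch of the idea in plain Python — a toy model with hypothetical names, not the actual SSIS internals:

```python
class ConnectionManager:
    """Toy model of an SSIS connection manager's caching behavior."""

    def __init__(self, retain_same_connection=False):
        self.retain_same_connection = retain_same_connection
        self.connection_string = ""   # updated by an expression before each use
        self._cached = None

    def acquire_connection(self):
        if self.retain_same_connection:
            # The first connection is cached and reused; later changes to
            # connection_string are ignored.
            if self._cached is None:
                self._cached = "connection to " + self.connection_string
            return self._cached
        # Without RetainSameConnection, each acquire honors the current string.
        return "connection to " + self.connection_string


cm = ConnectionManager(retain_same_connection=True)
cm.connection_string = "Database=DB1"
first = cm.acquire_connection()
cm.connection_string = "Database=DB2"   # the expression updates the string...
second = cm.acquire_connection()        # ...but the original connection comes back
```

With retain_same_connection set to False, the second acquire would instead pick up the DB2 connection string, which is the dynamic behavior you lose.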


I’ve created a sample package to illustrate this. The package flow is straightforward. There is a ForEach Loop that iterates through the databases on the server. The Execute SQL Task retrieves the row count from the sys.tables system view, and stores it in a variable. The Script Task fires two information messages that show the current value of the connection string, and the row count from the variable.


image


These are the results if RetainSameConnection is set to FALSE (with the correct counts):


image


and these are the results if RetainSameConnection is set to TRUE (the counts are all the same, from the initial connection):


image


This isn’t a major issue, but it is helpful to be aware of it. The same behavior occurs under CTP6 for Katmai. It makes sense, given that the RetainSameConnection property is supposed to keep the same connection throughout the package, but it can cause some unintended consequences. I’ve posted the sample to my Skydrive here.

Posted in Uncategorized | Comments Off on RetainSameConnection and Expressions