Posts tagged ‘Automation Tools’

SQL Server Analysis Services ‘Project Helper’

A little while back I spent an afternoon prototyping some functionality for Analysis Services.  I’ve been working a bit on improving the collaborative development story around SSAS, and I was after Building a SQL Server Analysis Services .ASDatabase file from a Visual Studio SSAS Project. I did come up with some nifty functionality in the process that I thought I’d share.

SsasHelper / ProjectHelper

I added the SsasHelper sample to the Microsoft SQL Server Community Samples:  Analysis Services site on CodePlex

The SsasHelper sample currently only contains the ProjectHelper library, which contains some functions to help work with a Visual Studio SSAS project. I also built a command line interface to demonstrate how to use the library to do a few tasks. There are a bunch of things you can use the library for, but the sample should at least get you started. The sample has the following functionality:

  1. De-serialize a Visual Studio SSAS Project into an AMO Database, then re-serialize all the components (.cube, .dim, .dsv, etc.) back out to files. You could use this functionality to let you programmatically work with a AS DB.
  2. Build a .ASDatabase file based on the Visual Studio SSAS Project. You could use this functionality as part of a robust build process.
  3. "Clean" a Visual Studio SSAS Project for comparison/source code merges. This is basically the same functionality I wrote about here, just ported into this library.
  4. Re-order a SSAS Project File. This is basically the "Smart Diff" functionality from BIDS Helper… I included it because De-serializing/serializing a project re-orders the .partition files, making them a pain to validate. I used this function to sort the .partition files to make them easy to diff.

Serialize/De-Serializing Visual Studio SSAS Projects

This is something I’d be wanting to figure out for awhile.  Most of this is really straight forward, with just one or two little tricks to make it work.  We start by reading the SSAS Project file… we just load up the XML Document and parse each set of objects. It looks like this:

   1: // Load the SSAS Project File

   2: XmlReader reader = new XmlTextReader(ssasProjectFile);

   3: XmlDocument doc = new XmlDocument();

   4: doc.Load(reader);

   5:  

   6: // Load the Database

   7: nodeList = doc.SelectNodes("//Database/FullPath");

   8: fullPath = nodeList[0].InnerText;

   9: innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);

  10: Utils.Deserialize(innerReader, (MajorObject)database);

  11:  

  12: // Load all the Datasources

  13: nodeList = doc.SelectNodes("//DataSources/ProjectItem/FullPath");

  14: DataSource dataSource = null;

  15: foreach (XmlNode node in nodeList)

  16: {

  17:     fullPath = node.InnerText;

  18:     innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);

  19:     dataSource = new RelationalDataSource();

  20:     Utils.Deserialize(innerReader, (MajorObject)dataSource);

  21:     database.DataSources.Add(dataSource);

  22: }

Loading each object type looks pretty much the same… the SSAS team provided that handy Utils class that helps to serialize and de-serialize objects.  There is, of course, one little wrinkle… the .cube and .partition files.  They don’t play by the same rules as the other objects.  Luckily, they’re close enough.  A few little hacks get us there.

The De-Serialize for cubes looks like this:

   1: // Load all the Cubes

   2: nodeList = doc.SelectNodes("//Cubes/ProjectItem/FullPath");

   3: Cube cube = null;

   4: foreach (XmlNode node in nodeList)

   5: {

   6:     fullPath = node.InnerText;

   7:     innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);

   8:     cube = new Cube();

   9:     Utils.Deserialize(innerReader, (MajorObject)cube);

  10:     database.Cubes.Add(cube);

  11:  

  12:     // Process cube dependencies (i.e. partitions

  13:     // Little known fact:  The Serialize/Deserialize methods DO handle partitions... just not when 

  14:     // paired with anything else in the cube.  We have to do this part ourselves

  15:     dependencyNodeList = node.ParentNode.SelectNodes("Dependencies/ProjectItem/FullPath");

  16:     foreach (XmlNode dependencyNode in dependencyNodeList)

  17:     {

  18:         fullPath = dependencyNode.InnerText;

  19:         innerReader = ProjectHelper.FixPartitionsFileForDeserialize( ssasProjectDirectory + fullPath, cube);

  20:         Cube partitionsCube = new Cube();

  21:         Utils.Deserialize(innerReader, (MajorObject)partitionsCube);

  22:         MergePartitionCube(cube, partitionsCube);

  23:     }

  24: }

De-serializing the .cube files is easy… but I figured out you have to muck with the .partitions file a bit before it will de-serialize.  We basically just have to add a Name node to the XML for the de-serialize to work… we don’t use it for anything, but it is required.

   1: private static XmlReader FixPartitionsFileForDeserialize(string partitionFilename,Cube sourceCube)

   2: {

   3:     // Validate inputs

   4:     if (sourceCube == null)

   5:     {

   6:         throw new ArgumentException("Provide a Cube object that matches the partitions file");

   7:     }

   8:  

   9:     if (string.IsNullOrEmpty(partitionFilename))

  10:     {

  11:         throw new ArgumentException("Provide a partitions file");

  12:     }

  13:     // I am NOT validating the extention to provide some extra flexibility here

  14:  

  15:     XmlDocument document = new XmlDocument();

  16:     document.Load(partitionFilename);

  17:  

  18:     // Setup for XPath queries

  19:     XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);

  20:     string defaultNamespaceURI = "http://schemas.microsoft.com/analysisservices/2003/engine";

  21:      

  22:     // Get all the MeasureGroup IDs

  23:     XmlNodeList nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:ID", xmlnsManager);

  24:     XmlNode newNode = null;

  25:     

  26:     // Add a Name node underneath the ID node if one doesn't exist, using the MeasureGroup's real name

  27:     foreach (XmlNode node in nodeList)

  28:     {

  29:         // Verify the node doesn't exist

  30:         if (XmlHelper.NodeExists(node.ParentNode, "Name"))

  31:             continue;

  32:  

  33:         newNode = document.CreateNode(XmlNodeType.Element, "Name", defaultNamespaceURI);

  34:         // Lookup the MG name from the cube based on the ID in the file

  35:         newNode.InnerText = sourceCube.MeasureGroups.Find(node.InnerText).Name;

  36:         node.ParentNode.InsertAfter (newNode, node);

  37:     }

  38:  

  39:     // Return this as an XmlReader, so it can be manipulated

  40:     return new XmlTextReader(new StringReader(document.OuterXml));

  41: }

Next, the MergeParition method will just loop thru the “partitions only” cube we de-serialized, and add each partition and aggregation design to the “main” cube.  We have to make sure a create a copy of each object we add, because a Partition and AggregationDesign can only be in a single cube… using a Add on an existing object removes it from the initial collection.

   1: private static void MergePartitionCube(Cube baseCube, Cube partitionCube)

   2: {

   3:     MeasureGroup baseMG = null;

   4:  

   5:     foreach (MeasureGroup mg in partitionCube.MeasureGroups)

   6:     {

   7:         baseMG = baseCube.MeasureGroups.Find(mg.ID);

   8:  

   9:         // Heisenberg principle in action with these objects; use 'for' instead of 'foreach'

  10:         if (mg.Partitions.Count > 0)

  11:         {

  12:             for (int i = 0; i < mg.Partitions.Count; ++i)

  13:             {

  14:                 Partition partitionCopy = mg.Partitions[i].Clone();

  15:                 baseMG.Partitions.Add(partitionCopy);

  16:             }

  17:         }

  18:  

  19:         // Heisenberg principle in action with these objects; use 'for' instead of 'foreach'

  20:         if (mg.AggregationDesigns.Count > 0)

  21:         {

  22:             for (int i = 0; i < mg.AggregationDesigns.Count; ++i)

  23:             {

  24:                 AggregationDesign aggDesignCopy = mg.AggregationDesigns[i].Clone();

  25:                 baseMG.AggregationDesigns.Add(aggDesignCopy);

  26:             }

  27:         }

  28:     }

  29: }

We now have a AMO Database from our Visual Studio Project!  Serializing it back out is pretty similar… we just loop through the Database, and serialize everything out:

   1: XmlTextWriter writer = null;

   2:  

   3: // Iterate through all objects in the database and serialize them

   4: foreach (DataSource dataSource in database.DataSources)

   5: {

   6:     writer = new XmlTextWriter(targetDirectory + dataSource.Name + ".ds", Encoding.UTF8);

   7:     writer.Formatting = Formatting.Indented;

   8:     Utils.Serialize(writer, (MajorObject)dataSource, false);

   9:     writer.Close();

  10: }

Again, the one wrinkle is the .cube and .partitions file… the Utils methods don’t 100% work with those guys.  Luckily, they work about 90%, so we have a work around.

   1: // Special case:  The cube serialization won't work for partitions when Partion/AggregationDesign

   2: // objects are mixed in with other objects.  Serialize most of the cube, then split out

   3: // Partion/AggregationDesign objects into their own cube to serialize, then clean up

   4: // a few tags

   5: foreach (Cube cube in database.Cubes)

   6: {

   7:     writer = new XmlTextWriter(targetDirectory + cube.Name + ".cube", Encoding.UTF8);

   8:     writer.Formatting = Formatting.Indented;

   9:     Utils.Serialize(writer, (MajorObject)cube, false);

  10:     writer.Close();

  11:  

  12:     // Partitions and AggregationDesigns may be written to the Cube file, and we want

  13:     // to keep them all in the Partitions file; strip them from the cube file

  14:     FixSerializedCubeFile(targetDirectory + cube.Name + ".cube");

  15:  

  16:     Cube partitionCube = SplitPartitionCube(cube);

  17:     writer = new XmlTextWriter(targetDirectory + cube.Name + ".partitions", Encoding.UTF8);

  18:     writer.Formatting = Formatting.Indented;

  19:     Utils.Serialize(writer, (MajorObject)partitionCube, false);

  20:     writer.Close();

  21:  

  22:     // The partitions file gets serialized with a few extra nodes... remove them

  23:     FixSerializedPartitionsFile(targetDirectory + cube.Name + ".partitions");

  24: }

We first serialize the whole cube (for each cube, of course) and then manually fix up the file.  Some of the Partition and Measuregroup info gets serialized out to the .cube file at this point, and we want to keep all of that information in the .partition file where it belongs.

   1: private static void FixSerializedCubeFile(string cubeFilename)

   2: {

   3:     // Validate inputs

   4:     if (string.IsNullOrEmpty(cubeFilename))

   5:     {

   6:         throw new ArgumentException("Provide a cube file");

   7:     }

   8:     // I am NOT validating the extention to provide some extra flexibility here

   9:  

  10:     XmlDocument document = new XmlDocument();

  11:     document.Load(cubeFilename);

  12:  

  13:     XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);

  14:  

  15:     XmlNodeList nodeList = null;

  16:  

  17:     // Remove the MeasureGroup Names

  18:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:Partitions", xmlnsManager);

  19:     XmlHelper.RemoveNodes(nodeList);

  20:  

  21:     // Remove the StorageModes

  22:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:AggregationDesigns", xmlnsManager);

  23:     XmlHelper.RemoveNodes(nodeList);

  24:  

  25:     document.Save(cubeFilename);

  26: }

We then create a temporary cube with just the Partition and AggregationDesign objects in it, and serialize it.  Again, we have to clean up the file just a bit.

   1: private static void FixSerializedPartitionsFile(string partitionFilename)

   2: {

   3:     // Validate inputs

   4:     if (string.IsNullOrEmpty(partitionFilename))

   5:     {

   6:         throw new ArgumentException("Provide a partitions file");

   7:     }

   8:     // I am NOT validating the extention to provide some extra flexibility here

   9:  

  10:     XmlDocument document = new XmlDocument();

  11:     document.Load(partitionFilename);

  12:  

  13:     XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);

  14:  

  15:     XmlNodeList nodeList = null;

  16:     

  17:     // Remove the MeasureGroup Names

  18:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:Name", xmlnsManager);

  19:     XmlHelper.RemoveNodes(nodeList);

  20:  

  21:     // Remove the StorageModes

  22:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:StorageMode", xmlnsManager);

  23:     XmlHelper.RemoveNodes(nodeList);

  24:  

  25:     // Remove the ProcessingModes

  26:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:ProcessingMode", xmlnsManager);

  27:     XmlHelper.RemoveNodes(nodeList);

  28:  

  29:     document.Save(partitionFilename);

  30: }

…and there we go!  We can start with a Visual Studio SSAS Project, create the Database in memory, do whatever we want with it, then write it back out to files!

Build .ASDatabase File

I talk more about the what and why in <<this blog>>, but the actual mechanics of this functionality is really easy (once we’ve de-serialized the project, that is).  This could basically be condensed to a single line of code:

   1: Database database;

   2:  

   3: database = ProjectHelper.DeserializeProject(ssasProjectFile);

   4:  

   5: ProjectHelper.GenerateASDatabaseFile(database, targetFilename);

 

Clean Visual Studio SSAS Project

I won’t go into a lot of detail on this here… I posted a blog on SQL Server Analysis Services Projects with Multiple Developers a little while back.  I originally wrote this functionality in PowerShell, but just ported it into this library because I thought it would be useful.  This method just goes through each unlocked file and removes fields maintained by Visual Studio that aren’t required by the AS DB per se, but that make merging code from multiple developers substantially more difficult.  If you’re testing out the serialize/de-serialize functionality, you might also want to use this against your projects to make them easier to compare.

Sort SSAS Project File

One annoying side affect of serializing the .partition file is that all the elements get re-arranged.  Everything still works, of course, but if you’re like me you’ll want to verify that everything is working.  The good people over at BIDS Helper let me use a version of their “Smart Diff” XSLT that, which will sort a SSAS file.  I modified it slightly and customized it to help validate the .partition files.  I would be very careful using it for anything else… its only been tested for the intended scenario.

Known Issues

I’ve tested this with Visual Studio 2008 SP1, SQL Server Analysis Services SP1 CU3, and the Adventure Works 2008 AS DB with no issues.  Definitely make a backup before you try this out, as your mileage may vary.

  1. The sample I wrote just serializes objects into files based on their names.  It would be trivial to maintain a list based on the Visual Studio project file to make sure the objects are written out with their correct names.
  2. Partitions are reordered when De-Serialized/Serialized. Makes it a pain to validate, but I’ve seen no ill effects. Use SortSssasFile functionality to make a copy of the files for easier comparison.
  3. Some fields maintained for Visual Studio (State (Processed, Unprocessed), CreatedTimestamp, LastSchemaUpdate, LastProcessed, dwd:design-time-name, CurrentStorageMode) are lost when a project is De-Serialized/Serialized.

Next Steps

I’ve thought of some nifty things to do with this functionality, including…

  1. Generate a new Visual Studio project based on metadata.
  2. Maintain a project with external metadata… you can maintain a separate data dictionary with descriptions of each attribute/measure/cube, and just programmatically update your source when needed.
  3. Automate repetitive tasks and maintenance of the project (like adding a bunch of partitions to your measure groups, or replacing every occurrence of a certain word in the project.
  4. Build a more robust development and deployment process.
  5. …and many more things.

Cheers,

David

Building a SQL Server Analysis Services .ASDatabase file from a Visual Studio SSAS Project

There are several methods available for deploying Analysis Services databases once you’ve build your solution, including direct connections, generating XMLA to deploy, and using the Deployment Wizard with answer files.  Still, building and deploying AS Databases can sometimes be a challenge in enterprise development scenarios.  One common scenario is when you have multiple developers working on a single solution where all the files are under source control.  I wrote a blog about about SQL Server Analysis Services Projects with Multiple Developers recently, and this is one of the issues that you tend to run into.  I created a sample you can use to help with this problem.  I’m not going to go over how to deploy AS solutions (it’s documented plenty of other places)… this is just a tool to address this particular scenario.

I create the SsasHelper sample demonstrate the functionality, and posted it to the Microsoft SQL Server Community Samples:  Analysis Services site on CodePlex.  I posted about this functionality in my blog SQL Server Analsysis Services ‘Project Helper’ if you want details on the mechanics.

Background

So you’ve built your solution, tested it, and everything works great.  All of your developers are hard at work constantly improving your OLAP capabilities, and they’re storing all of their changes under source control.  Now you need to do routine deployments to multiple environments.  One of the problems you might have run into is that you have to use Visual Studio to actually do a deployment.  Even if you’re deploying via the Deployment Wizard (using answer files), you still need to generate the .ASDatabase by building the solution.  This problem becomes a little bigger if you want to use a Build Server, that doesn’t have Visual Studio, where you compile your solution, test, deploy it, etc.  Currently, the only option you really have for this scenario is to make sure every developer Builds the solution after all changes and checks in a new .ASDatabase file.  This is a bit of a pain, and tends to lead to (at least occasional) issues with deploying updated versions.

SSAS Helper

I created a class library to work with Analysis Services projects.  This library has a couple of neat pieces of functionality, including being able to de-serialize a SSAS project to an AMO database, write an AMO database back out to component files, clean a AS project (removing volatile, non-essential fields… I went over it in SQL Server Analysis Services Projects with Multiple Developers), and creating a .ASDatabase file based on a project.  The library is written in C# 3.5, and is designed/tested with SSAS 2008 (though it should work with 2005… test it though).

Using SSAS Helper

If you’re running into this problem, you probably already have a good build solution, so I won’t go over that here.  What you’ll want to do is create a new build task (using whatever framework you’re using) using this component.  As an example, you can create a custom MSBuild using this library that takes the SSAS Project as an input, and delivers a .ASDatabase file as an output.  The task would then be added to the build file, so you will no longer have a dependency on Visual Studio to compile all your .cube, .dim, .role, etc. files into something that can be deployed.  I’ll try and post a sample in the next few weeks, but it shouldn’t be a major task.

Caveats

I built this tool and tested on my current project, and tested with Adventure WOrks… I haven’t seen any problems with it, but your mileage may vary.  Make sure you test with your project.

Here are some known issues for this functionality:

  1. Partitions are reordered when De-Serialized/Serialized. Makes it a pain to validate, but I’ve seen no ill effects. Use the SortSssasFile functionality to make a copy of the files for easier comparison.
  2. Some fields maintained for Visual Studio (State (Processed, Unprocessed), CreatedTimestamp, LastSchemaUpdate, LastProcessed, dwd:design-time-name, CurrentStorageMode) are lost when a project is De-Serialized/Serialized.

When I use this tool on a project, I work on a copy, and then do a file diff to check the output the first time.  I used Beyond Compare (http://www.scootersoftware.com/) to compare the entire directory, and each file side by side, just to make sure there are no unintended side affects.  I would recommend you do the same to make sure… this works fine on the projects I’ve used it on, but you need to make sure there’s nothing special about yours so you don’t accidently destroy something.

How it Works

This project works by de-serializing all the files referenced by the .dwproj file into AMO objects then serializing the entire database.  There is more detail on de-serializing/serializing the objects in my post SQL Server Analsysis Services ‘Project Helper’ .  The code is fairly well commented (well… I think it is :) and should be fairly straight forward.

Next Steps

To use this, you’ll need create a build task in your framework of choice, and just plug it in to your solution.  If this proves difficult for people, I’ll try and provide a sample, but it should be fairly straight forward to do.  If it doesn’t work right out of the box, it should work with minimal modification.  Just make sure all of your project files (.dwproj, .cube, .dim, .role, .dsv, .ds, .dmm, etc.) are published and you have a way to push the .ASDatabase file to your target.

Conclusion

That’s about it… I hope this helps some folks out.  Let me know if you have any problems, and we’ll see what we can get working.

Cheers,

David

SQL Database Tuning Advisor Output Renamer

I’ve uploaded the ‘Database Tuning Advisor Output Renamer’ at http://DtaOutputRenamer.codeplex.com/.


OK… so Friday marked the first day I’ve ever gotten sunburned while coding.  I had a little bit of free time while at an outdoor event, and whipped up a a little utility to help apply standards to to DTA recommendations.


I use the SQL Database Tuning Advisor (DTA) a lot to generate basic recommendations for indexes and statistics based on a workload.  In my team, we store all index and statistics creation scripts in .SQL files, which are then run as part of our deployments.  We use a standard naming convention for each of the objects to enhance the maintainability.


Last week I ran the DTA against a workload I generated based on running reports on some new schema… not surprisingly, quite a few recommendations were generated.  It occurred to me my time could be better spent doing something besides renaming 50 database objects based on their definitions.  I decided to write a small application to help change the default names (such as ‘_dta_index_SsasProcessingRunArchive_c_7_1677965054__K10’) to something a little more user friendly (like ‘IX_dbo_SsasProcessingRunArchive_ObjectType_EventClass_SessionID_I_StartTime’).  You’ll want to modify the application to match your local coding standards, but it should be pretty straight forward to do. 


This application only handles a few cases, but does cover Clustered/Non-Clustered Indexes (with and without INCLUDE columns) and Statistics.  It should be easy to extend it if you need to.  This app is just something I whipped up in an hour or two, so it isn’t the most robust thing ever created.


I created the following regex (remove the line breaks… I just used those for presentation) to capture the index name, table name, and column/include lists for the indexes:



   1: CREATE\s(?<NonClustered>NON)?CLUSTERED\sINDEX\s\[(?<IndexName>.*?)\].*?ON\s\
   2: [(?<Schema>.*?)\]\.\[(?<Table>.*?)\].*?\((?<ColumnList>.*?)\)\s*?(?:INCLUDE\s
   3: \((?<IncludeList>.*?)\))??\s*?(?:WITH\s*?\(.*?\)\s*?ON\s\[.*?\])

I created the following regex (remove the line breaks… I just used those for presentation) to capture the statistics name, table name, and column list for the statistics:



   1: CREATE\sSTATISTICS\s\[(?<StatisticsName>.*?)\].*?ON\s\[(?<Schema>.*?)\]
   2: \.\[(?<Table>.*?)\].*?\((?<ColumnList>.*?)\

I then just update the object names with a new name created based column lists and such.  I also through in the functionality to strip ‘go’ statements from input.


Enjoy…

SQL Load Generator Tool

A few weeks ago I spoke at PASS on using the Resource Governor in SQL Server 2008.  For my demo, I created a little application that could fire off multiple queries against SQL Server to simulate different users/applications.  This tool… gasp… creates a load on SQL Server.  People seemed interested in the application, so I’ve decided to release it on CodePlex.  You can find the project here.


For anyone that saw that demo, this tool has gotten a a bit of a face lift… I sort of got bored over the holidays, and added a few features that I wished I had when I was using it to demo, as well as just playing around.  It’s nothing fancy… I just put it together one Saturday for my presentation, then gave into some feeping creaturism a couple of nights.


I originally created this to test settings for the SQL Server 2008 Resource Governor.   You can also use it as a (very) light weight load testing tool, but no promises on that.  I’m not planning on adding any features to it, or supporting it much, but I thought I’d put it out there.


Here’s what it looks like:


SqlLoadGenerator1


Here’s a summary of the features:



  • Runs multiple queries against SQL Server.  You can add as many as you like.

  • Each query can be either a SQL User or Domain User.

  • You can specify an Application name for the connection.

  • You can specify the new of concurrent threads to use for each query.

  • You can start all queries, stop all queries, remove all queries.

  • There is logging (you can toggle on and off… it isn’t precisely thread safe, and can cause crashes when there are lots of failures on multiple threads) for failed queries.

  • You can set all the defaults on a per user basis, and persist them.

  • Each query has a # of Runs and a # of Fails counter.  You can use the ‘Reset Counters’ feature to reset the total counts (not the per query counts).

You can save your settings via the ‘Options’ menu.  You can add default items to the different dropdowns, provide default query settings, change the log locations, etc.  You can also modify the stock connection string… though keep in mind some of the settings (particularly ‘pooling=false’ will affect the way the application works… namely, the connections to SQL Server won’t be closed). 


SqlLoadGenerator2 


That’s about it… enjoy!


David

Testing and Tuning an Internet Facing PerformancePoint M&A Dashboard

I’ve spent the past 15 months, off and on, working as the technical lead on a performance management project using PerformancePoint Server.  The project was to create a SharePoint Portal where users could view the performance of a school system.  The portal was initially internally facing, but the goal was to make it publicly accessible.  My last project with this particular client was to implement an Internet facing version of the portal. 


Since the school system in question has more than 120,000 students, scalability and performance was definitely an issue.  I spent a week at the Microsoft Technology Center in Atlanta, GA testing and tuning the solution.  We performed our testing in a lab with seven desktop machines (to test with) and three servers to simulate our production environment.  I was joined for the week by one of the PPS M&A Architects, who helped me evaluate the solution.  Everyone at the MTC was super helpful and friendly, and we got a lot of really good advice. 


I want to share some of the things I learned during the performance testing of the solution.


Recommended Reading


A few good things to read before getting started on a venture like this (among all the other white papers about scaling SQL Server, SharePoint, and the like).



Steps


To performance test and tune our SharePoint site, we needed to…



  1. Deploy the site in the test lab.

  2. Create load tests.

  3. Tune Windows Server, SharePoint, IIS, PerformancePoint Server, SQL Server, and Analysis Services as needed.

  4. Run and analyze the load tests.

  5. Modify the PPS design as required.

Just five little items on the list… how hard can that be, right?  Items #2 through #5 were done in iterations.


System Under Test


The application we created uses SharePoint Enterprise Server 2007 (MOSS), PerformancePoint Server Monitor & Analyze 2007 (SP1), and SQL Server 2005 (SP2 CU8).  Both SQL Server RDBMS and SSAS reside on the same server.  The RDBMS supports only SharePoint and PerformancePoint.  Analysis Services is dedicated to the Dashboards. 


Data Source Design


The data source is a SQL Server Analysis Services DB.  It uses a straightforward star schema design.  All of the KPI’s that we’re using in the PPS Dashboards are surfaced using calculated measures in the cube.  This is definitely a good way to go.  This design makes developing the dashboards very easy, it encapsulates the logic of the KPIs, and it helps with efficient caching inside of SSAS.  The cube was less than 1 GB in size for this phase of the project.  Some default aggregations were built, and usage based aggregations were applied against the cube.  Query performance was base lined at a few dozen milliseconds for the largest/most complex queries once the cache was warmed.


Dashboard Design


Our portal has seven dashboards.  Each dashboard has several pages. 



  1. The Overall view contains a scorecard with a Year Member Selection filter.  Each KPI on the scorecard conditionally displays an Analytic Grid and a Web Page report (pointing to a HTML file in SharePoint).

  2. The Compare view contains two scorecards, with up to 14 filters (primarily Member Selection… more on this later).  Seven filters apply to each scorecard; users can make different selections, and compare the two views.

  3. The Look Inside the Data view contains a single scorecard with up to 12 filters (and a few more columns than the Compare view, to show historical values).

The Analytic Charts were all fairly straightforward, with a default view of a calculated measure against time.  The users can drill into the charts. 


The filters were primarily Member Selection, with at most a few dozen members.  We have one MDX filter (for Schools), that returns about 400 members (this will come into play later).


The Dashboard was implemented using Application Security.


SharePoint


I enabled Anonymous Authentication for SharePoint.  The site basically supports all of our Dashboards, as well as a number of Help files and some content to help people use the site and the data.  I customized some of the master pages (to remove features such as SharePoint Help, Search, the breadcrumb trail, and stuff like that), but the site design was fairly trivial for this version of the site.  Excel Services and other MOSS features were not used.


Deploying to a Test Environment


Our friendly neighborhood Microsoft rep arranged for us to have some time in the Microsoft Technology Center in Atlanta, GA.  The provided an environment for us to test in, and they provided three servers (to mimic our production hardware) and seven desktop machines for us to run load tests on.  It took a bit ~1.5 days to install/configure the software, install service packs, and deploy our solution (include a SharePoint site collection, PPS M&A Dashboards, and SSAS DB).


Hardware Under Test


We tested our solution on the following hardware.  Our production hardware was similar.  A hardware network load balancing appliance was used for the web front ends.


Web Front End (1)



  • DL380 G5

  • Dual Core x 2 – 2.66 GHz – 64 bit

  • 8 GB RAM

  • Windows Server 2003 (SP2)

  • IIS 6.0

  • SharePoint Enterprise Server 2007 (SP1+Infrastructure update)

  • PerformancePoint Server Monitor & Analyze (SP1)

Web Front End (2)



  • DL 382 G5

  • Dual Core x 2 – 2.6 GHz – 64 bit

  • 4 GB RAM

  • Windows Server 2003 (SP2)

  • IIS 6.0

  • SharePoint Enterprise Server 2007 (SP1+Infrastructure update)

  • PerformancePoint Server Monitor & Analyze (SP1)

SQL Server (RDBMS and SSAS)



  • DL 585 G2

  • Dual Core x4 – 2.66 GHZ – 64 bit

  • 16 GB RAM

  • Windows Server 2003 (SP2)

  • SQL Server (SP2 CU8)

  • RAID10 disk array

Test Rig


We used Visual Studio 2008 (SP1) Team Systems for Testers (VSTT) to perform our testing.  We used one machine as a Controller and four machines as Agents.  We saved our test results to SQL Server on a separate machine.


Test Machines (x5)



  • Dual core 64 bit

  • 2 GB RAM

  • Vista

This configuration let us scale to 4,000+ users without severely taxing any of the machines.  Due to the network appliance being used for our load balancing, IP Spoofing was not required for our test machines.  We had some issues with using Vista for the agents… it was just a little bit trickier to configure than XP.  We were never able to get the counter collection working for those machines (annoying, but not really an issue since we had so much capacity and all the machines were in the same room).  We didn’t run into any major issues getting our test rig up and running, but it definitely took some time.


Creating the Load Tests


In order to do our tuning and performance testing, we first had to create everything needed to create a load test.  VSTT is a good tool for load testing, but be aware that there is a learning curve.  If you don’t have any experience performing web based load testing, then this could be a significant hurdle.  Luckily, in a previous life I was a Load Test Architect, and I had experience with the tool.


Test Design


I first identified several use cases that I thought would be common for users of our portal.  These consisted of actions such as viewing the top level summary Dashboards, using filters to change the view of the scorecards, clicking on analytic grids, and things like that.  I created a mixture of coded and graphical Web Tests for each small work flow (such as changing a filter four times on a certain Dashboard, or clicking through eight charts on a Dashboard).  I then used Web Tests that call other Web Tests to build up my use cases.  The use cases were then combined in different ratios in each of my load tests so I could simulate a variety of user populations.


Designing your tests correctly is crucial.  If you’re way off on what the users will be doing, all of your performance tuning could be for naught.  I would recommend using a pilot group, or at least some test subjects, to figure out how people will use your portal.


Web Test Implementation


Microsoft Consulting Services provided a test harness for PPS.  The test harness is a coded web test that can parse the XML snippets used by PPS, and perform actions such as changing filter values, activating charts, etc.  It’s a pretty cool tool, and was very helpful.  Unfortunately, it is proprietary, so I can’t share it… you’ll have to get engage MCS if you want to use it.  I did make some modifications to it to change the way it would use filters, to make it work when Web Tests called other Web Tests, etc.  If anyone is using this test harness (or if it does become publicly available), I can share the (minor) modifications that I made.


I constructed all of the visual web tests using Fiddler 2 (in conjunction with a home grown helper application to insert comments, and do a few other useful tasks).  Thanks to the MCS test harness, a few extra tools, and the way the web tests were constructed, the actual test generation was very straightforward.


Testing Considerations


One of the key elements to consider when performing a load test against PPS is that, due to the architecture, the process of loading a single page may involve dozens of calls.  For example, a dashboard that has 15 filters, two scorecards, and two charts would actually make 20 http requests… so if each request takes 1 second, it takes 20 seconds to load the page.  I was primarily interested in the time to last byte for each page, so I used comments while recording my tests to know where to place the transactions, so I could analyze those instead of individual calls.


I implemented reasonable timing and pacing for the individual steps in each of the tests, and made sure to gradually ramp up my users when applying the load.  Testing was slightly weighted towards the Dashboards that were the largest and would have the heaviest usage, though another use case was used to randomly hit every part of the site to bust the cache if possible.


System Tuning


We tuned a few things, or at least reviewed the settings, out of the gate.  During our testing we continuously monitored the servers (and test machines), identified bottlenecks, tweaked settings, then did it all again.


During our initial testing, we found SSAS to be the bottleneck for our solution.  I suspect that our web front ends (SharePoint and/or IIS) could be tweaked a little more, but it wasn’t really necessary in our scenario.


ASP.NET Tuning


We started tuning ASP.NET with some of the suggestions in the article Contention, poor performance, and deadlocks when you make Web service requests from ASP.NET applications.  In the end, I found the performance to be good using the autoConfig=”true” (the formulas listed in the article are automagically applied) with a few tweaks.  I increased the number of threads available significantly… that was a major bottleneck for awhile.  I also increased the minWorkThreads to 50 (otherwise SharePoint stalled after adding a few hundred users while spinning up more threads, even when ramping users up fairly slowly).  I also increased the minFreeThreads and minLocalRequestFreeThreads.  I increased the number of connections to the DB server.  Don’t just blindly apply any of these settings… your mileage will vary.  I went through a number of test iterations carefully monitoring each of the servers and testing these settings. 


I made the following change to the machine.config (on all web front ends):

  1: …
  2:   <system.web>
  3:     <processModel autoConfig=“true” maxWorkerThreads=“400″ maxIoThreads=“400″ minWorkerThreads=“50″ />
  4:     <httpRuntime minFreeThreads=“352″ minLocalRequestFreeThreads=“304″ />
  5:     …
  6:   </system.web>
  7: …
  8:   <system.net>
  9:    <connectionManagement>
 10:      <add address=“10.1.11.11″ maxconnection=“48″/>       ? Database server ip address
 11:   </connectionManagement>
 12:   </system.net>
 13: …
 14: </configuration> 

IIS Tuning


I adjusted the application pool (for both the Performance Management site as well as the Central Administration page). Pinging and Rapid Failure were turned off on the Health tab. I did NOT use web gardens in this configuration, even though I have seen it recommended.  I did not see a difference in performance between using them and not, and was advised by one of the local IIS gurus not to use them.


SQL Server 2005 Tuning


We updated SQL Server to Service Pack 2 – Cumulative Update 8 (build 3257). Before applying the CU, we ran into some significant performance issues.  SSAS did not like some of our very small dimensions, and performance was severely degraded.  The CU resolved the issues. 


Data files, transaction logs, and the Temp DB were spread out over multiple disks.  Our usage of the RDBMS was very light, so we needed minimal tuning here.


SQL Server Analysis Services 2005 Tuning


I updated the Query \ Max Threads element of Analysis Services to 20 to provide additional threads for querying. Adjusting this element needs to be done carefully, as it can improve the performance of the web front ends at the expense of maxing out SQL Server.  I actually ramped this number up quite a bit during testing (with good effect), but found 20 to be adequate after resolving some other issues.  I suggest monitoring the queued requests in IIS along with the queued and active threads in SSAS to determine a good balance here.  I initially found a lot of queuing in the Long Parsing Jobs and Short Parsing Jobs as well, due to some filters.


PerformancePoint Server M&A Tuning


OK, this is important.  Since we Internet facing, we’re supporting anonymous users… a decent number of them, in fact.  I updated the stored procedure [PPSMonitoring]. [dbo].[ParameterValuesCreate] to comment out a section that is not needed for an anonymous site, but that has an extremely negative impact on performance.  This is the piece that stores the current filter selections.  In addition to not being necessary, it was totally thrashing our transaction log. 


I made the following change to the [PPSMonitoring]. [dbo].[ParameterValuesCreate]  sproc:

  1: – init 
  2:     SET @TransactionIsOurs = 0
  3:     IF @@TRANCOUNT = 0
  4:     BEGIN – only if @@TRANCOUNT is 0, we do BEGIN TRAN
  5:         BEGIN TRANSACTION
  6:         SET @TransactionIsOurs = 1
  7:     END
  8: – DD – 20080821 – Commented out the following section to increase performance
  9: –    –If this parameter value already exists (for this login), update. Otherwise, insert.
 10: –    IF(EXISTS(SELECT * FROM [ParameterValues] WHERE Login = @Login AND [ParameterUniqueName] = @ParameterUniqueName))
 11: –    BEGIN
 12: –        SELECT 1
 13: –        UPDATE [ParameterValues]
 14: –        SET 
 15: –            [LastUpdated] = GETDATE(),
 16: –            [SerializedXml] = @SerializedXml
 17: –        WHERE [Login] = @Login AND [ParameterUniqueName] = @ParameterUniqueName
 18: –        IF @@ERROR <> 0 
 19: –        BEGIN
 20: –           RAISERROR (5580001, 16, 1, @tErrMsg, 7, N’ParameterValues’) WITH LOG
 21: –           SET @ReturnCode = -1
 22: –           GOTO exit_label
 23: –        END
 24: –    END
 25: –    ELSE
 26: –    BEGIN
 27: –        –Insert record
 28: –        INSERT INTO [ParameterValues]
 29: –            ([Login], [ParameterUniqueName],[LastUpdated],[SerializedXml])
 30: –        VALUES     
 31: –            (@Login, @ParameterUniqueName, GETDATE(), @SerializedXml)
 32: –        IF @@ERROR <> 0 
 33: –        BEGIN
 34: –           RAISERROR (5580002, 16, 1, @tErrMsg, 8, N’ParameterValues’) WITH LOG
 35: –           SET @ReturnCode = -1
 36: –           GOTO exit_label
 37: –        END
 38: –    END
 39:
 40:     IF @TransactionIsOurs = 1
 41:     BEGIN
 42:         COMMIT TRANSACTION
 43:     END
 44:     RETURN 0


SharePoint Tuning


There is a lot of tuning that can be done to SharePoint; tuning SharePoint was not a focus for us.  We did perform a few changes. 


Allowing the SharePoint Server worker process to consume large amounts of memory can decrease performance. For computers that have more than 4 GB of RAM, the ASP.NET cache size can be constrained with the privateBytesLimit attribute, set on the cache element of the Web.config file. By setting privateBytesLimit=”2576980378” (that is, 60% of 4 GB), you can avoid a scenario in which a server that has more than 4 GB of memory creates an oversized cache.


For a machine with 8 GB of RAM with a limit of 60% of the memory, we would add the following to the web.config. The private bytes limit number is calculated via the formula (<gigabytes of RAM available> * <percentage of RAM to use>* 1,024^3). 


I made the following change to my web.config file for the SharePoint site:

  1: …
  2: <system.web>
  3: …
  4: <caching>
  5:   <cache privateBytesLimit = “5153960755″ />
  6: </caching>
  7: …
  8: </system.web>
  9: …

Tweaking the Dashboard Design


Overall, I was expecting to be able to support about 500 concurrent users on each of the front end servers.  I expected the WFE’s to be the bottleneck.  With the original implementation of the Dashboard, I found we could support up to about 2,000 users… as long as they weren’t using our Comparison and Look Inside the Data pages. 


When testing those pages, SSAS was getting slammed.  I was able to support 1,000 users… but SSAS was at 80% utilization constantly, with occasional spikes and severe performance degradation.  Not a really good situation.


Through testing and profiling, I identified the culprit… it was the School filter.


The School filter is an MDX filter, and it looks like this:

  1: FILTER(
  2:   DESCENDANTS(
  3:     [Entity].[Entity]
  4:     ,[Entity].[Entity]
  5:     ,SELF_AND_BEFORE
  6:   )
  7:   ,[Entity].[Entity Type] = [Entity].[Entity Type].&[SCHOOL]
  8: )

No big deal, right?  Except… each time it is loaded (i.e., a lot) it fires off a few other MDX statements.  They look like this:

  1: SELECT
  2:   {   [Entity].[Entity].[All]
  3:     ,[Entity].[Entity].&[School A]
  4:     ,[Entity].[Entity].&[School B]
  5:     //…about 400 more
  6:   } DIMENSION PROPERTIES MEMBER_TYPE on 0
  7: FROM [MetricsDatamart]
  8: 

and this:
  1: SELECT
  2:   {
  3:    IIF([Entity].[Entity].[All].Parent IS NULL,[Entity].[Entity].[All],[Entity].[Entity].[All].Parent)
  4:   ,IIF([Entity].[Entity].&[School A].Parent IS NULL,[Entity].[Entity].&[School A],[Entity].[Entity].&[School A].Parent)
  5:     // … about 400 more
  6:  } DIMENSION PROPERTIES MEMBER_TYPE on 0
  7: FROM [MetricsDatamart]
  8: 

These are fairly long statements, and they are expensive to run.  SSAS was having to queue dozens of these statements just to parse them when under load.  They are fired for each page load, even though our data is static (i.e., no caching is occurring).  All those inline IIF’s did horrible, terrible things to SSAS.  SSAS was getting very backed up, which caused IIS to queue, which caused extremely long waits on the front end.


To fix the problem, I switched that filter to use a SQL Server RDBMS source.  The data was static, so no big deal there.  I did find that using more than 1 tabular source gave me an error in PPS… I wasn’t able to resolve that on our timeline, and I was able to fix the performance issue anyway.


Final Results


After that one change, the performance of the whole system skyrocketed.  With our heaviest test case, I was able to run 1,500 concurrent users. The WFE processor was at ~70% utilization, averaging 300 requests per second with no contention or queuing. The database server was at ~20 utilization, no issues. Responses were < 1 second, with an average response time of ~.3 seconds. User experience excellent. Even when the response times would increase, page load times were still reasonable (under 10 seconds for pages with more 15 objects).


Next Steps


We got everything pretty much humming for this phase of the project… but we’re building a much larger version for our client in the very near future.  Here are some thoughts on making our solution bigger, faster, better.


SharePoint


We can tune SharePoint quite a bit to increase performance.  This includes lightening some of the templates and base pages, optimizing images and web pages, and making sure all the user supplied content is fairly lean (i.e., not using a 300kb file produced by Word when a 20kb html file could be used).  We can use some of the caching features available in MOSS to improve the performance on the web servers.


PerformancePoint Filters


OK, I saw our filters as a bit of a bottleneck.  For a large installation, I think performance could be significantly improved by building a custom filter using the Monitor SDK.  Some of the key features would be enabling caching and perhaps using XMLA discover queries instead of some of the MDX.  I’d also make the ordering of the elements a little more configurable, and perhaps try and implement a way to create a ‘global filter’ (I find I use a single filter on many different Dashboards… and I’d like to reduce the maintenance/development time/improve consistency.


Servers


Scaling out the web servers and scaling up the SQL Servers looks like a fairly effective strategy here.  Coupled with some custom components and a well thought out design, this looks like an effective strategy with fairly predictable performance characteristics.  We could easily double our capacity right now by snagging two extra WFEs.  To scale by a factor of 10, we’d probably need an extra SQL Server and some minor changes to our strategy.


Key Takeaways


Our testing went well, and we were able to exceed the performance we were expecting.  Here are a few key takeaways from this exercise.



  1. Plan on hunting down some performance bottlenecks.  In a PPS deployment, they can be located on any tier, and it may take a little time to find and fix them.  We had to make adjustments to basically every technology in this instance.

  2. You’re going to want to tune your servers.  It took a little over a week for me to go through that exercise, but I had help, a lab, and experience with all the various components. You’re mileage may vary.  We identified and corrected a number of significant performance issues… everything from tweaking IIS and SQL Server to ensuring the correct Cumulative Update was applied.  Make only small changes for each test iteration, and analyze carefully.  And document the changes you make.

  3. The components you use, and your design, will have extreme ramifications on performance.  If your dashboards heavily use PAS views vs. Excel Services vs. Analytic Charts, your performance characteristics will change dramatically.  Carefully consider the ramifications of the different technologies, and look at where the work is being done in your configuration.

  4. Overall, scalability and performance was good, and everything was pretty much in line with what I saw in the Performance Tuning and Capacity Planning for PerformancePoint Monitoring Server.  Keep in mind that your implementation will probably be be somewhat different than the PPS Team’s test installation, so keep that in mind in your planning.

Happy PerformancePointing…


David