Posts tagged ‘C#’

SQL Server Analysis Services ‘Project Helper’

A little while back I spent an afternoon prototyping some functionality for Analysis Services.  I’ve been working a bit on improving the collaborative development story around SSAS, and I was after Building a SQL Server Analysis Services .ASDatabase file from a Visual Studio SSAS Project. I did come up with some nifty functionality in the process that I thought I’d share.

SsasHelper / ProjectHelper

I added the SsasHelper sample to the Microsoft SQL Server Community Samples:  Analysis Services site on CodePlex

The SsasHelper sample currently only contains the ProjectHelper library, which contains some functions to help work with a Visual Studio SSAS project. I also built a command line interface to demonstrate how to use the library to do a few tasks. There are a bunch of things you can use the library for, but the sample should at least get you started. The sample has the following functionality:

  1. De-serialize a Visual Studio SSAS Project into an AMO Database, then re-serialize all the components (.cube, .dim, .dsv, etc.) back out to files. You could use this functionality to let you programmatically work with a AS DB.
  2. Build a .ASDatabase file based on the Visual Studio SSAS Project. You could use this functionality as part of a robust build process.
  3. "Clean" a Visual Studio SSAS Project for comparison/source code merges. This is basically the same functionality I wrote about here, just ported into this library.
  4. Re-order a SSAS Project File. This is basically the "Smart Diff" functionality from BIDS Helper… I included it because De-serializing/serializing a project re-orders the .partition files, making them a pain to validate. I used this function to sort the .partition files to make them easy to diff.

Serialize/De-Serializing Visual Studio SSAS Projects

This is something I’d be wanting to figure out for awhile.  Most of this is really straight forward, with just one or two little tricks to make it work.  We start by reading the SSAS Project file… we just load up the XML Document and parse each set of objects. It looks like this:

   1: // Load the SSAS Project File

   2: XmlReader reader = new XmlTextReader(ssasProjectFile);

   3: XmlDocument doc = new XmlDocument();

   4: doc.Load(reader);

   5:  

   6: // Load the Database

   7: nodeList = doc.SelectNodes("//Database/FullPath");

   8: fullPath = nodeList[0].InnerText;

   9: innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);

  10: Utils.Deserialize(innerReader, (MajorObject)database);

  11:  

  12: // Load all the Datasources

  13: nodeList = doc.SelectNodes("//DataSources/ProjectItem/FullPath");

  14: DataSource dataSource = null;

  15: foreach (XmlNode node in nodeList)

  16: {

  17:     fullPath = node.InnerText;

  18:     innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);

  19:     dataSource = new RelationalDataSource();

  20:     Utils.Deserialize(innerReader, (MajorObject)dataSource);

  21:     database.DataSources.Add(dataSource);

  22: }

Loading each object type looks pretty much the same… the SSAS team provided that handy Utils class that helps to serialize and de-serialize objects.  There is, of course, one little wrinkle… the .cube and .partition files.  They don’t play by the same rules as the other objects.  Luckily, they’re close enough.  A few little hacks get us there.

The De-Serialize for cubes looks like this:

   1: // Load all the Cubes

   2: nodeList = doc.SelectNodes("//Cubes/ProjectItem/FullPath");

   3: Cube cube = null;

   4: foreach (XmlNode node in nodeList)

   5: {

   6:     fullPath = node.InnerText;

   7:     innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);

   8:     cube = new Cube();

   9:     Utils.Deserialize(innerReader, (MajorObject)cube);

  10:     database.Cubes.Add(cube);

  11:  

  12:     // Process cube dependencies (i.e. partitions

  13:     // Little known fact:  The Serialize/Deserialize methods DO handle partitions... just not when 

  14:     // paired with anything else in the cube.  We have to do this part ourselves

  15:     dependencyNodeList = node.ParentNode.SelectNodes("Dependencies/ProjectItem/FullPath");

  16:     foreach (XmlNode dependencyNode in dependencyNodeList)

  17:     {

  18:         fullPath = dependencyNode.InnerText;

  19:         innerReader = ProjectHelper.FixPartitionsFileForDeserialize( ssasProjectDirectory + fullPath, cube);

  20:         Cube partitionsCube = new Cube();

  21:         Utils.Deserialize(innerReader, (MajorObject)partitionsCube);

  22:         MergePartitionCube(cube, partitionsCube);

  23:     }

  24: }

De-serializing the .cube files is easy… but I figured out you have to muck with the .partitions file a bit before it will de-serialize.  We basically just have to add a Name node to the XML for the de-serialize to work… we don’t use it for anything, but it is required.

   1: private static XmlReader FixPartitionsFileForDeserialize(string partitionFilename,Cube sourceCube)

   2: {

   3:     // Validate inputs

   4:     if (sourceCube == null)

   5:     {

   6:         throw new ArgumentException("Provide a Cube object that matches the partitions file");

   7:     }

   8:  

   9:     if (string.IsNullOrEmpty(partitionFilename))

  10:     {

  11:         throw new ArgumentException("Provide a partitions file");

  12:     }

  13:     // I am NOT validating the extention to provide some extra flexibility here

  14:  

  15:     XmlDocument document = new XmlDocument();

  16:     document.Load(partitionFilename);

  17:  

  18:     // Setup for XPath queries

  19:     XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);

  20:     string defaultNamespaceURI = "http://schemas.microsoft.com/analysisservices/2003/engine";

  21:      

  22:     // Get all the MeasureGroup IDs

  23:     XmlNodeList nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:ID", xmlnsManager);

  24:     XmlNode newNode = null;

  25:     

  26:     // Add a Name node underneath the ID node if one doesn't exist, using the MeasureGroup's real name

  27:     foreach (XmlNode node in nodeList)

  28:     {

  29:         // Verify the node doesn't exist

  30:         if (XmlHelper.NodeExists(node.ParentNode, "Name"))

  31:             continue;

  32:  

  33:         newNode = document.CreateNode(XmlNodeType.Element, "Name", defaultNamespaceURI);

  34:         // Lookup the MG name from the cube based on the ID in the file

  35:         newNode.InnerText = sourceCube.MeasureGroups.Find(node.InnerText).Name;

  36:         node.ParentNode.InsertAfter (newNode, node);

  37:     }

  38:  

  39:     // Return this as an XmlReader, so it can be manipulated

  40:     return new XmlTextReader(new StringReader(document.OuterXml));

  41: }

Next, the MergeParition method will just loop thru the “partitions only” cube we de-serialized, and add each partition and aggregation design to the “main” cube.  We have to make sure a create a copy of each object we add, because a Partition and AggregationDesign can only be in a single cube… using a Add on an existing object removes it from the initial collection.

   1: private static void MergePartitionCube(Cube baseCube, Cube partitionCube)

   2: {

   3:     MeasureGroup baseMG = null;

   4:  

   5:     foreach (MeasureGroup mg in partitionCube.MeasureGroups)

   6:     {

   7:         baseMG = baseCube.MeasureGroups.Find(mg.ID);

   8:  

   9:         // Heisenberg principle in action with these objects; use 'for' instead of 'foreach'

  10:         if (mg.Partitions.Count > 0)

  11:         {

  12:             for (int i = 0; i < mg.Partitions.Count; ++i)

  13:             {

  14:                 Partition partitionCopy = mg.Partitions[i].Clone();

  15:                 baseMG.Partitions.Add(partitionCopy);

  16:             }

  17:         }

  18:  

  19:         // Heisenberg principle in action with these objects; use 'for' instead of 'foreach'

  20:         if (mg.AggregationDesigns.Count > 0)

  21:         {

  22:             for (int i = 0; i < mg.AggregationDesigns.Count; ++i)

  23:             {

  24:                 AggregationDesign aggDesignCopy = mg.AggregationDesigns[i].Clone();

  25:                 baseMG.AggregationDesigns.Add(aggDesignCopy);

  26:             }

  27:         }

  28:     }

  29: }

We now have a AMO Database from our Visual Studio Project!  Serializing it back out is pretty similar… we just loop through the Database, and serialize everything out:

   1: XmlTextWriter writer = null;

   2:  

   3: // Iterate through all objects in the database and serialize them

   4: foreach (DataSource dataSource in database.DataSources)

   5: {

   6:     writer = new XmlTextWriter(targetDirectory + dataSource.Name + ".ds", Encoding.UTF8);

   7:     writer.Formatting = Formatting.Indented;

   8:     Utils.Serialize(writer, (MajorObject)dataSource, false);

   9:     writer.Close();

  10: }

Again, the one wrinkle is the .cube and .partitions file… the Utils methods don’t 100% work with those guys.  Luckily, they work about 90%, so we have a work around.

   1: // Special case:  The cube serialization won't work for partitions when Partion/AggregationDesign

   2: // objects are mixed in with other objects.  Serialize most of the cube, then split out

   3: // Partion/AggregationDesign objects into their own cube to serialize, then clean up

   4: // a few tags

   5: foreach (Cube cube in database.Cubes)

   6: {

   7:     writer = new XmlTextWriter(targetDirectory + cube.Name + ".cube", Encoding.UTF8);

   8:     writer.Formatting = Formatting.Indented;

   9:     Utils.Serialize(writer, (MajorObject)cube, false);

  10:     writer.Close();

  11:  

  12:     // Partitions and AggregationDesigns may be written to the Cube file, and we want

  13:     // to keep them all in the Partitions file; strip them from the cube file

  14:     FixSerializedCubeFile(targetDirectory + cube.Name + ".cube");

  15:  

  16:     Cube partitionCube = SplitPartitionCube(cube);

  17:     writer = new XmlTextWriter(targetDirectory + cube.Name + ".partitions", Encoding.UTF8);

  18:     writer.Formatting = Formatting.Indented;

  19:     Utils.Serialize(writer, (MajorObject)partitionCube, false);

  20:     writer.Close();

  21:  

  22:     // The partitions file gets serialized with a few extra nodes... remove them

  23:     FixSerializedPartitionsFile(targetDirectory + cube.Name + ".partitions");

  24: }

We first serialize the whole cube (for each cube, of course) and then manually fix up the file.  Some of the Partition and Measuregroup info gets serialized out to the .cube file at this point, and we want to keep all of that information in the .partition file where it belongs.

   1: private static void FixSerializedCubeFile(string cubeFilename)

   2: {

   3:     // Validate inputs

   4:     if (string.IsNullOrEmpty(cubeFilename))

   5:     {

   6:         throw new ArgumentException("Provide a cube file");

   7:     }

   8:     // I am NOT validating the extention to provide some extra flexibility here

   9:  

  10:     XmlDocument document = new XmlDocument();

  11:     document.Load(cubeFilename);

  12:  

  13:     XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);

  14:  

  15:     XmlNodeList nodeList = null;

  16:  

  17:     // Remove the MeasureGroup Names

  18:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:Partitions", xmlnsManager);

  19:     XmlHelper.RemoveNodes(nodeList);

  20:  

  21:     // Remove the StorageModes

  22:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:AggregationDesigns", xmlnsManager);

  23:     XmlHelper.RemoveNodes(nodeList);

  24:  

  25:     document.Save(cubeFilename);

  26: }

We then create a temporary cube with just the Partition and AggregationDesign objects in it, and serialize it.  Again, we have to clean up the file just a bit.

   1: private static void FixSerializedPartitionsFile(string partitionFilename)

   2: {

   3:     // Validate inputs

   4:     if (string.IsNullOrEmpty(partitionFilename))

   5:     {

   6:         throw new ArgumentException("Provide a partitions file");

   7:     }

   8:     // I am NOT validating the extention to provide some extra flexibility here

   9:  

  10:     XmlDocument document = new XmlDocument();

  11:     document.Load(partitionFilename);

  12:  

  13:     XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);

  14:  

  15:     XmlNodeList nodeList = null;

  16:     

  17:     // Remove the MeasureGroup Names

  18:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:Name", xmlnsManager);

  19:     XmlHelper.RemoveNodes(nodeList);

  20:  

  21:     // Remove the StorageModes

  22:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:StorageMode", xmlnsManager);

  23:     XmlHelper.RemoveNodes(nodeList);

  24:  

  25:     // Remove the ProcessingModes

  26:     nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:ProcessingMode", xmlnsManager);

  27:     XmlHelper.RemoveNodes(nodeList);

  28:  

  29:     document.Save(partitionFilename);

  30: }

…and there we go!  We can start with a Visual Studio SSAS Project, create the Database in memory, do whatever we want with it, then write it back out to files!

Build .ASDatabase File

I talk more about the what and why in <<this blog>>, but the actual mechanics of this functionality is really easy (once we’ve de-serialized the project, that is).  This could basically be condensed to a single line of code:

   1: Database database;

   2:  

   3: database = ProjectHelper.DeserializeProject(ssasProjectFile);

   4:  

   5: ProjectHelper.GenerateASDatabaseFile(database, targetFilename);

 

Clean Visual Studio SSAS Project

I won’t go into a lot of detail on this here… I posted a blog on SQL Server Analysis Services Projects with Multiple Developers a little while back.  I originally wrote this functionality in PowerShell, but just ported it into this library because I thought it would be useful.  This method just goes through each unlocked file and removes fields maintained by Visual Studio that aren’t required by the AS DB per se, but that make merging code from multiple developers substantially more difficult.  If you’re testing out the serialize/de-serialize functionality, you might also want to use this against your projects to make them easier to compare.

Sort SSAS Project File

One annoying side affect of serializing the .partition file is that all the elements get re-arranged.  Everything still works, of course, but if you’re like me you’ll want to verify that everything is working.  The good people over at BIDS Helper let me use a version of their “Smart Diff” XSLT that, which will sort a SSAS file.  I modified it slightly and customized it to help validate the .partition files.  I would be very careful using it for anything else… its only been tested for the intended scenario.

Known Issues

I’ve tested this with Visual Studio 2008 SP1, SQL Server Analysis Services SP1 CU3, and the Adventure Works 2008 AS DB with no issues.  Definitely make a backup before you try this out, as your mileage may vary.

  1. The sample I wrote just serializes objects into files based on their names.  It would be trivial to maintain a list based on the Visual Studio project file to make sure the objects are written out with their correct names.
  2. Partitions are reordered when De-Serialized/Serialized. Makes it a pain to validate, but I’ve seen no ill effects. Use SortSssasFile functionality to make a copy of the files for easier comparison.
  3. Some fields maintained for Visual Studio (State (Processed, Unprocessed), CreatedTimestamp, LastSchemaUpdate, LastProcessed, dwd:design-time-name, CurrentStorageMode) are lost when a project is De-Serialized/Serialized.

Next Steps

I’ve thought of some nifty things to do with this functionality, including…

  1. Generate a new Visual Studio project based on metadata.
  2. Maintain a project with external metadata… you can maintain a separate data dictionary with descriptions of each attribute/measure/cube, and just programmatically update your source when needed.
  3. Automate repetitive tasks and maintenance of the project (like adding a bunch of partitions to your measure groups, or replacing every occurrence of a certain word in the project.
  4. Build a more robust development and deployment process.
  5. …and many more things.

Cheers,

David

Building a SQL Server Analysis Services .ASDatabase file from a Visual Studio SSAS Project

There are several methods available for deploying Analysis Services databases once you’ve build your solution, including direct connections, generating XMLA to deploy, and using the Deployment Wizard with answer files.  Still, building and deploying AS Databases can sometimes be a challenge in enterprise development scenarios.  One common scenario is when you have multiple developers working on a single solution where all the files are under source control.  I wrote a blog about about SQL Server Analysis Services Projects with Multiple Developers recently, and this is one of the issues that you tend to run into.  I created a sample you can use to help with this problem.  I’m not going to go over how to deploy AS solutions (it’s documented plenty of other places)… this is just a tool to address this particular scenario.

I create the SsasHelper sample demonstrate the functionality, and posted it to the Microsoft SQL Server Community Samples:  Analysis Services site on CodePlex.  I posted about this functionality in my blog SQL Server Analsysis Services ‘Project Helper’ if you want details on the mechanics.

Background

So you’ve built your solution, tested it, and everything works great.  All of your developers are hard at work constantly improving your OLAP capabilities, and they’re storing all of their changes under source control.  Now you need to do routine deployments to multiple environments.  One of the problems you might have run into is that you have to use Visual Studio to actually do a deployment.  Even if you’re deploying via the Deployment Wizard (using answer files), you still need to generate the .ASDatabase by building the solution.  This problem becomes a little bigger if you want to use a Build Server, that doesn’t have Visual Studio, where you compile your solution, test, deploy it, etc.  Currently, the only option you really have for this scenario is to make sure every developer Builds the solution after all changes and checks in a new .ASDatabase file.  This is a bit of a pain, and tends to lead to (at least occasional) issues with deploying updated versions.

SSAS Helper

I created a class library to work with Analysis Services projects.  This library has a couple of neat pieces of functionality, including being able to de-serialize a SSAS project to an AMO database, write an AMO database back out to component files, clean a AS project (removing volatile, non-essential fields… I went over it in SQL Server Analysis Services Projects with Multiple Developers), and creating a .ASDatabase file based on a project.  The library is written in C# 3.5, and is designed/tested with SSAS 2008 (though it should work with 2005… test it though).

Using SSAS Helper

If you’re running into this problem, you probably already have a good build solution, so I won’t go over that here.  What you’ll want to do is create a new build task (using whatever framework you’re using) using this component.  As an example, you can create a custom MSBuild using this library that takes the SSAS Project as an input, and delivers a .ASDatabase file as an output.  The task would then be added to the build file, so you will no longer have a dependency on Visual Studio to compile all your .cube, .dim, .role, etc. files into something that can be deployed.  I’ll try and post a sample in the next few weeks, but it shouldn’t be a major task.

Caveats

I built this tool and tested on my current project, and tested with Adventure WOrks… I haven’t seen any problems with it, but your mileage may vary.  Make sure you test with your project.

Here are some known issues for this functionality:

  1. Partitions are reordered when De-Serialized/Serialized. Makes it a pain to validate, but I’ve seen no ill effects. Use the SortSssasFile functionality to make a copy of the files for easier comparison.
  2. Some fields maintained for Visual Studio (State (Processed, Unprocessed), CreatedTimestamp, LastSchemaUpdate, LastProcessed, dwd:design-time-name, CurrentStorageMode) are lost when a project is De-Serialized/Serialized.

When I use this tool on a project, I work on a copy, and then do a file diff to check the output the first time.  I used Beyond Compare (http://www.scootersoftware.com/) to compare the entire directory, and each file side by side, just to make sure there are no unintended side affects.  I would recommend you do the same to make sure… this works fine on the projects I’ve used it on, but you need to make sure there’s nothing special about yours so you don’t accidently destroy something.

How it Works

This project works by de-serializing all the files referenced by the .dwproj file into AMO objects then serializing the entire database.  There is more detail on de-serializing/serializing the objects in my post SQL Server Analsysis Services ‘Project Helper’ .  The code is fairly well commented (well… I think it is :) and should be fairly straight forward.

Next Steps

To use this, you’ll need create a build task in your framework of choice, and just plug it in to your solution.  If this proves difficult for people, I’ll try and provide a sample, but it should be fairly straight forward to do.  If it doesn’t work right out of the box, it should work with minimal modification.  Just make sure all of your project files (.dwproj, .cube, .dim, .role, .dsv, .ds, .dmm, etc.) are published and you have a way to push the .ASDatabase file to your target.

Conclusion

That’s about it… I hope this helps some folks out.  Let me know if you have any problems, and we’ll see what we can get working.

Cheers,

David