Presenting At SSIG in Greenville, SC

I’ll be doing a presentation on Analysis Services at the SQL Server Innovators Guild in Greenville, SC on Tuesday, Dec. 1st. I’ll be delivering an introduction to SSAS, with lots of demos. If you are interested in attending, please register here. It’s a presentation that I’ve done a few times now, but because it’s mostly demo, something new and interesting always comes up.

 

Introduction to Analysis Services 2008

This session is intended to introduce database developers to Analysis Services 2008, with a focus on being able to quickly construct usable OLAP cubes. This presentation will be light on slides, and heavy on demonstrating how to perform the steps to create the cubes. During this session, we will cover the creation of a new cube from an existing database step by step. We will also highlight the reasons for using Analysis Services, and applicable scenarios for using it.


SSIS 101: Viewing Variable Values at Runtime

One of the common problems that beginners have with SSIS is debugging errors involving variables. One example of this occurs when a package uses a Foreach Loop container. These are often used to set a variable value differently for each iteration of a loop. If something fails during the loop, you might want to check the value of the variable in order to determine what went wrong.

Fortunately, this is pretty easy to accomplish in SSIS. You can see the value of any package variable in BIDS when you debug the package by following the steps below:

  1. First, set a breakpoint on the task where you’d like to check the current variable values. You can set a breakpoint by right-clicking the task and choosing Edit Breakpoints.
  2. Choose OnPreExecute to see values before the task executes, or OnPostExecute to see them after execution. Click OK after enabling the breakpoint.
  3. Run the package in debug mode (press F5) in Visual Studio. The package will run until the breakpoint is hit.
  4. Once execution stops at the breakpoint, open the Locals window (Ctrl+Alt+V, L, or Debug > Windows > Locals).
  5. Expand the Variables node in the Locals window. You can see the current values for all your variables, including system variables, in this window. You may have to scroll down to find your variables in the list.

This is a useful technique for troubleshooting packages that use variables, particularly if the variable values are changed during package execution.


Using MSBuild with SQL Server Analysis Services Projects

I’ve written several blogs and community samples on working with SSAS Projects directly using AMO (instead of SSAS Databases on an Analysis Services server).  I was travelling this weekend, and got a chance to create a sample MSBuild task that will generate a .ASDatabase file directly from a Visual Studio project, without requiring Visual Studio itself.  This means that multiple developers can work on a project, check files in via source control, and then schedule automated builds, build on a dedicated “clean” machine (without Visual Studio), or support any number of other scenarios.

I added the custom MSBuild task to the Analysis Services Community Samples project on CodePlex under the SsasHelper sample.

The Build task code itself is trivial.  I won’t go into a lot of detail on that, as creating and debugging custom tasks is well documented (you can debug the library by setting MSBuild.exe as the external program and the project file as the argument).  I used it as a wrapper for the previously developed ProjectHelper code that does all the heavy lifting.  All we do is inherit from Microsoft.Build.Utilities.Task and implement the Execute method.  All I do in the task is de-serialize the project (based on the Visual Studio project passed in), validate the project (based on the target version of SSAS), and write out the .ASDatabase file.  This could of course be modified to use another method to deploy the database, but I’ve been using the .ASDatabase method for a while with no issues.

Here’s the main code for the method:

try
{
    Database database = ProjectHelper.DeserializeProject(SsasProjectFile);

    // ... Verify our project doesn't have any errors ...
    ValidationResultCollection results;

    bool isValidated = ProjectHelper.ValidateDatabase(database, SsasServerEdition, out results);

    // If the database doesn't validate (i.e., a build error),
    // log the errors and return failure.
    foreach (ValidationResult result in results)
    {
        Log.LogError(result.Description);
    }

    if (!isValidated)
    {
        return false;
    }

    // Build the .ASDatabase file
    ProjectHelper.GenerateASDatabaseFile(database, SsasTargetFile);
}
catch (Exception ex)
{
    Log.LogErrorFromException(ex);
    return false;
}

return true;
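
For context, here’s a rough sketch of how that Execute method sits inside the task class. The class and parameter names (SsasProjectFile, SsasTargetFile, SsasServerEdition) are taken from the sample project file below; the exact attributes and property types in the actual SsasBuilder sample may differ slightly.

using Microsoft.Build.Framework;
using Microsoft.Build.Utilities;

namespace SsasBuilder
{
    public class SsasBuildASDatabaseFileTask : Task
    {
        // Parameters supplied from the MSBuild project file
        [Required]
        public string SsasProjectFile { get; set; }

        [Required]
        public string SsasTargetFile { get; set; }

        [Required]
        public string SsasServerEdition { get; set; }

        public override bool Execute()
        {
            // ... the try/catch block shown above goes here ...
            return true;
        }
    }
}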

So… how do we actually use this?  I included a sample project file in the SsasBuilder project.  The basic idea is that you have a project file (a file that ends in “.[optional prefix]proj”, such as “.proj”, “.csproj”, etc.).  You can call this via MSBuild.  Note that the standard SSAS project file DOES NOT work with MSBuild.  The schemas required for that project conflict with the MSBuild schema, so you’ll have to create another project file, or incorporate the build step somewhere else.  Here’s an example project file:

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
    <UsingTask TaskName="SsasBuilder.SsasBuildASDatabaseFileTask"
        AssemblyFile="C:\TFS\SsasHelper\SsasBuilder\bin\debug\SsasBuilder.dll"/>
    <Target Name="BuildASDatabaseFile">
        <SsasBuildASDatabaseFileTask SsasProjectFile = "C:\Test\enterprise_Gold\Adventure Works DW 2008.dwproj"
            SsasTargetFile = "C:\Test\SsasBuildTest\AdventureWorks.ASDatabase"
            SsasServerEdition = "Enterprise" />
    </Target>
</Project>

Here I’m using absolute paths, but you can use either properties or relative paths as required for your particular project.  You just use a UsingTask tag to point to the assembly containing the build task, then use the task in a target.  For this sample I’m taking the SSAS project file and target filename, along with the server edition, as parameters.  If there are no errors in the project file, the .ASDatabase file will be generated in the specified location.

Now, all you have to do is call “MSBuild.exe <Whateveryounamedyourproject>”, and you’ll get a .ASDatabase file out of it…

Cheers,

David


Validating SSAS Projects Programmatically

Earlier this week I got a feature request from someone who was looking into my SsasHelper sample on the Microsoft SQL Server Community Samples:  Analysis Services site on CodePlex.  She was interested in building a SQL Server Analysis Services .ASDatabase file from a Visual Studio SSAS project, but pointed out that I didn’t actually *validate* a project before creating the .ASDatabase file, so if someone had checked in a project with errors, the whole process could blow up (or we might deploy a database with some issues).  I looked into doing this, and it turns out it’s really easy to accomplish.  I updated the code in the SsasHelper sample on CodePlex to show how to do this.

The actual code is really simple:

bool doesBuild = false;
results = new ValidationResultCollection();

// We have to provide a ServerEdition for this method to work.  There are
// overloads that look like they will work without it, but they can't be used
// in this scenario.
// The ServerEdition might need to be changed for your situation.
// This can be modified to return warnings and messages as well.
doesBuild = database.Validate(results, ValidationOptions.None, ServerEdition.Developer);

return doesBuild;
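
That snippet lives inside a helper method on ProjectHelper; a minimal sketch of how it might be wrapped is below. The signature is inferred from the usage example that follows, and the hard-coded Developer edition is simply what this sample uses (there is also an overload that accepts the edition as a parameter).

// Requires a reference to Microsoft.AnalysisServices (AMO)
public static bool ValidateDatabase(Database database, out ValidationResultCollection results)
{
    results = new ValidationResultCollection();

    // Validate against a specific edition; change this to match your target server.
    return database.Validate(results, ValidationOptions.None, ServerEdition.Developer);
}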

You can use the method I created like so:

Database database;
bool isValid = false;

// Load a SSAS database object based on a BIDS project
database = ProjectHelper.DeserializeProject(ssasProjectFile);

// ... Verify our project doesn't have any errors ...
ValidationResultCollection results;
isValid = ProjectHelper.ValidateDatabase(database, out results);

foreach (ValidationResult result in results)
{
    Console.WriteLine(result.Description);
}

Console.WriteLine("Project is Error Free?  {0}", isValid);

Console.WriteLine("Project validated!");

This will take a Database object and validate it to see if there are any errors.  You can modify it to return Warnings and Messages as well by changing the Validation Options.

One thing to note is that the Server Edition is a required parameter.  There are a few overloads of the Validate method that don’t require this parameter, but what they try to do is walk up the object tree to get the Server object (associated with the Database) and retrieve the edition.  Since I’m de-serializing a project into a Database object, this property isn’t available (and can’t be set).  It is important to use this method with the correct Server Edition.  The validation process will throw errors if your project uses some (though I don’t think *all*) features that aren’t supported by the specified edition.  For example, if your project includes Translations (an Enterprise-only feature), validation will succeed if you use ServerEdition.Developer or ServerEdition.Enterprise, but will fail if you use ServerEdition.Standard.

Cheers,

David


PASS Summit 2009

I’m really looking forward to the PASS Summit next week, and getting a chance to visit with a lot of the people in the community that I interact with on a regular basis. It’s going to be a really busy week, as there are a lot of great sessions that I’m looking forward to attending, and a few things that I’m going to be delivering myself.

A quick summary of where I’ll be during the conference:

Outside of that, I’ll be around at other sessions, the evening events, and in the “Ask the Experts” area. Looking forward to seeing everyone there.


SQL Server MVP Deep Dives

There’s a new book available for pre-order – “SQL Server MVP Deep Dives”. This book is a little unusual in that 53 MVPs came together to contribute 59 chapters to the book. Some of the best SQL Server authors in the world contributed chapters to it. I’m certainly not one of that group, but somehow, I managed to get included, and it’s a great honor to be in such good company. The book covers a wide variety of SQL Server topics, including:

  • design
  • development
  • administration
  • tuning and optimization
  • and business intelligence (my personal favorite)

This book was a special project for the authors involved. 100% of the author royalties go to War Child International, which is a charity that works to help children affected by war across the world.

So, if you like the idea of learning some interesting things about SQL Server and helping children at the same time, get this book. If you are attending the PASS Summit, it will be available for purchase from the conference bookstore onsite. There will be a large number of the authors at the Summit (including me), so there will be plenty of opportunities to get your copy signed.


Passing an Object from a Parent Package To a Child

Occasionally, you may run into the need to pass values between packages. In most cases, you can use a Parent Package Variable configuration to pass the value of a variable in the calling (or parent) package to the called (or child) package. However, Parent Package Variable configurations can’t be set up on variables of type Object. Fortunately, you can work around this pretty easily, thanks to the fact that the calling package variables are inherently accessible in the called packages. 

I’ve set up a sample parent and child package to illustrate this. The parent package is straightforward. It uses an Execute SQL task to populate an object variable named TestVar with a recordset object, and then calls the child package.


The child package has a Foreach Loop Container to iterate over the recordset object. It has a Script task that is used to copy the parent package’s variable (TestVar) to a local variable named LocalVar. This is the variable that the Foreach Loop is configured to use. Why copy the value? If you don’t have a local variable to reference in the Foreach Loop, it won’t validate properly.


The script in Copy Variable is pretty simple. It relies on the fact that you can reference parent package variables inherently, as they are included in the collection of variables accessible in the local package. The script just copies the value from one variable to the other, so that the Foreach Loop will have something to do.

public void Main()
{
    Variables vars = null;
    Dts.VariableDispenser.LockForWrite("User::LocalVar");
    Dts.VariableDispenser.LockForRead("User::TestVar");
    Dts.VariableDispenser.GetVariables(ref vars);

    vars["User::LocalVar"].Value = vars["User::TestVar"].Value;
    vars.Unlock();

    Dts.TaskResult = (int)ScriptResults.Success;
}

Please note that for this to work, you cannot have a variable in the child package with the same name as the variable in the parent package. If you do, the local variable will hide the parent variable. Outside of that, this works really well for passing object values between packages. The same technique can also be used in reverse to send values back to the parent package, if you have that need.
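
For instance, sending a value back to the parent is just the same copy in reverse (a minimal variation of the script above, assuming the same variable names):

public void Main()
{
    Variables vars = null;

    // Lock the parent's variable for write and the local variable for read
    Dts.VariableDispenser.LockForWrite("User::TestVar");
    Dts.VariableDispenser.LockForRead("User::LocalVar");
    Dts.VariableDispenser.GetVariables(ref vars);

    // Copy the local value back up to the parent package's variable
    vars["User::TestVar"].Value = vars["User::LocalVar"].Value;
    vars.Unlock();

    Dts.TaskResult = (int)ScriptResults.Success;
}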

The sample has been uploaded to my Skydrive. Let me know if you have any questions.


Implementing PerformUpgrade in Custom Components

If you develop custom components for SSIS, you may have the need to update them as you add new functionality. If you are just upgrading the functionality, but not changing the metadata, then you can simply recompile and redeploy the component. An example of this type of update would be changing the component to do additional warning or informational logging. The code has to be updated, but the metadata (the properties of the component, the settings for the inputs and outputs) was not modified.

The other type of update involves changing the component’s metadata. Examples of this would be adding a new property to the component or adding new inputs or outputs. In this case, you could increment the assembly version of your component, but then you would have to remove the old one from any data flows, and then add the new one back in and reconnect it. Rather than forcing users of the component to go through that effort for every package that uses the component, you can implement the PerformUpgrade method on your component. The PerformUpgrade method will be called when the package is loaded and the current version of the component does not match the version stored in the package’s metadata. You can use this method to compare the current version of the component to the expected version, and adjust the metadata appropriately.

Setting the CurrentVersion

To use this, you have to tell SSIS what the current version of your component is. You do this by setting the CurrentVersion property in the DtsPipelineComponent attribute that can be set on the PipelineComponent class:

[DtsPipelineComponent(
    DisplayName = "Test Component",
    ComponentType = ComponentType.Transform,
    CurrentVersion = 1,
    NoEditor = true)]
public class TestComponent : PipelineComponent

The CurrentVersion property defaults to zero, so a value of 1 indicates that this component is now on its second version.

Performing the Upgrade

Next, you need to implement some code in the PerformUpgrade method. This consists of first getting the value of the CurrentVersion property, and at the end of the method, setting the version in the component’s metadata to the current version.

public override void PerformUpgrade(int pipelineVersion)
{
    // Obtain the current component version from the attribute.
    DtsPipelineComponentAttribute componentAttribute = 
      (DtsPipelineComponentAttribute)Attribute.GetCustomAttribute(this.GetType(), typeof(DtsPipelineComponentAttribute), false);
    int currentVersion = componentAttribute.CurrentVersion;

    if (ComponentMetaData.Version < currentVersion)
    {
        //Do the upgrade here
    }

    // Update the saved component version metadata to the current version.
    ComponentMetaData.Version = currentVersion;
}

The actual upgrade code can vary a good bit, from adding custom properties to adjusting the data types of outputs, or adding and deleting inputs or outputs. I won’t show the full logic for these things here, but it’s very similar to the code you’d use in ProvideComponentProperties.
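
As a rough illustration (my own example, not from Books Online), adding a hypothetical "BatchSize" custom property during an upgrade might look something like this, using the SSIS 2008 interfaces (in 2005 the equivalent interface is IDTSCustomProperty90):

if (ComponentMetaData.Version < currentVersion)
{
    // Hypothetical example: this version introduces a BatchSize custom property.
    IDTSCustomProperty100 batchSize = ComponentMetaData.CustomPropertyCollection.New();
    batchSize.Name = "BatchSize";
    batchSize.Description = "Number of rows to process per batch.";
    batchSize.Value = 1000;
}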

Handling Multiple Upgrades

The code above is based on the sample in Books Online, but there’s a slight issue. Determining which upgrades need to be applied can be more complicated than simply comparing the current version to the ComponentMetaData version. Imagine that you have already upgraded the component from version 0 to version 1 by adding a new property. Now, you discover a need to add another new property, which will result in version 2. What do you do about the property added in version 1? You don’t want to add it twice for components that have already been upgraded to version 1. But it’s also possible that not all packages have been upgraded from version 0 yet, so for those you need to add both properties. By altering the version check logic a little, you can accommodate upgrading from multiple versions pretty easily:

if (ComponentMetaData.Version < 1)
{
    //Perform upgrade for V1
}

if (ComponentMetaData.Version < 2)
{
    //Perform upgrade for V2
}

This change will ensure that the appropriate upgrade steps are taken for each version.

Some Other Thoughts

There are a few things to be aware of with PerformUpgrade. One, it’s called only when the package is loaded and the version stored in the package’s metadata is different from the version of the binary component. This can occur both at design time (when the package is opened in Visual Studio) and at runtime (when executing the package from DTEXEC, etc.).

Two, when you update the CurrentVersion property and then add the component to a new package, the version number in the package metadata will initially be set to 0. So the next time the package is opened, PerformUpgrade will run. Since ProvideComponentProperties will have already set the metadata appropriately for the new version of the component, PerformUpgrade can cause errors by attempting to add the same metadata again. This appears to be a bug in the behavior when adding the component to the data flow, and it occurs under both 2005 and 2008. The workaround is to code the PerformUpgrade method to check, before altering any metadata, that the change hasn’t already been made.
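
A sketch of that kind of defensive check, continuing the hypothetical BatchSize property from the earlier example:

// Only add the custom property if a previous run of PerformUpgrade
// (or ProvideComponentProperties) hasn't already created it.
bool propertyExists = false;
foreach (IDTSCustomProperty100 property in ComponentMetaData.CustomPropertyCollection)
{
    if (property.Name == "BatchSize")
    {
        propertyExists = true;
        break;
    }
}

if (!propertyExists)
{
    IDTSCustomProperty100 batchSize = ComponentMetaData.CustomPropertyCollection.New();
    batchSize.Name = "BatchSize";
    batchSize.Description = "Number of rows to process per batch.";
    batchSize.Value = 1000;
}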

Three, due to what looks like another bug, when the package is opened the second time after the component is initially added to the package, the version will be incremented at the end of PerformUpgrade (assuming you use the code above that updates the version). However, this change does not mark the package as dirty in the designer, so the updated version number will not be saved unless some other property in the package is modified, and then the package is saved. This isn’t a huge problem – though you do need to make sure that the code in PerformUpgrade can be run repeatedly to avoid issues.

That’s pretty much it. Hopefully this will be helpful if you are developing custom components for SSIS.


SQL Server Analysis Services ‘Project Helper’

A little while back I spent an afternoon prototyping some functionality for Analysis Services.  I’ve been working a bit on improving the collaborative development story around SSAS, and I was after Building a SQL Server Analysis Services .ASDatabase file from a Visual Studio SSAS Project. I did come up with some nifty functionality in the process that I thought I’d share.

SsasHelper / ProjectHelper

I added the SsasHelper sample to the Microsoft SQL Server Community Samples:  Analysis Services site on CodePlex.

The SsasHelper sample currently only contains the ProjectHelper library, which contains some functions to help work with a Visual Studio SSAS project. I also built a command line interface to demonstrate how to use the library to do a few tasks. There are a bunch of things you can use the library for, but the sample should at least get you started. The sample has the following functionality:

  1. De-serialize a Visual Studio SSAS Project into an AMO Database, then re-serialize all the components (.cube, .dim, .dsv, etc.) back out to files. You could use this functionality to let you programmatically work with a AS DB.
  2. Build a .ASDatabase file based on the Visual Studio SSAS Project. You could use this functionality as part of a robust build process.
  3. "Clean" a Visual Studio SSAS Project for comparison/source code merges. This is basically the same functionality I wrote about here, just ported into this library.
  4. Re-order a SSAS Project File. This is basically the "Smart Diff" functionality from BIDS Helper… I included it because De-serializing/serializing a project re-orders the .partition files, making them a pain to validate. I used this function to sort the .partition files to make them easy to diff.

Serialize/De-Serializing Visual Studio SSAS Projects

This is something I’d been wanting to figure out for a while.  Most of this is really straightforward, with just one or two little tricks to make it work.  We start by reading the SSAS project file… we just load up the XML document and parse each set of objects. It looks like this:

// Load the SSAS Project File
XmlReader reader = new XmlTextReader(ssasProjectFile);
XmlDocument doc = new XmlDocument();
doc.Load(reader);

// Load the Database
nodeList = doc.SelectNodes("//Database/FullPath");
fullPath = nodeList[0].InnerText;
innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);
Utils.Deserialize(innerReader, (MajorObject)database);

// Load all the DataSources
nodeList = doc.SelectNodes("//DataSources/ProjectItem/FullPath");
DataSource dataSource = null;
foreach (XmlNode node in nodeList)
{
    fullPath = node.InnerText;
    innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);
    dataSource = new RelationalDataSource();
    Utils.Deserialize(innerReader, (MajorObject)dataSource);
    database.DataSources.Add(dataSource);
}
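
Other simple object types follow the same pattern; for example, a sketch for dimensions (the node path is assumed by analogy with the DataSources block above, so check it against your .dwproj file):

// Load all the Dimensions
nodeList = doc.SelectNodes("//Dimensions/ProjectItem/FullPath");
Dimension dimension = null;
foreach (XmlNode node in nodeList)
{
    fullPath = node.InnerText;
    innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);
    dimension = new Dimension();
    Utils.Deserialize(innerReader, (MajorObject)dimension);
    database.Dimensions.Add(dimension);
}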

Loading each object type looks pretty much the same… the SSAS team provided that handy Utils class that helps to serialize and de-serialize objects.  There is, of course, one little wrinkle… the .cube and .partition files.  They don’t play by the same rules as the other objects.  Luckily, they’re close enough.  A few little hacks get us there.

The De-Serialize for cubes looks like this:

// Load all the Cubes
nodeList = doc.SelectNodes("//Cubes/ProjectItem/FullPath");
Cube cube = null;
foreach (XmlNode node in nodeList)
{
    fullPath = node.InnerText;
    innerReader = new XmlTextReader(ssasProjectDirectory + fullPath);
    cube = new Cube();
    Utils.Deserialize(innerReader, (MajorObject)cube);
    database.Cubes.Add(cube);

    // Process cube dependencies (i.e., partitions).
    // Little known fact:  The Serialize/Deserialize methods DO handle partitions... just not when
    // paired with anything else in the cube.  We have to do this part ourselves.
    dependencyNodeList = node.ParentNode.SelectNodes("Dependencies/ProjectItem/FullPath");
    foreach (XmlNode dependencyNode in dependencyNodeList)
    {
        fullPath = dependencyNode.InnerText;
        innerReader = ProjectHelper.FixPartitionsFileForDeserialize(ssasProjectDirectory + fullPath, cube);
        Cube partitionsCube = new Cube();
        Utils.Deserialize(innerReader, (MajorObject)partitionsCube);
        MergePartitionCube(cube, partitionsCube);
    }
}

De-serializing the .cube files is easy… but I figured out you have to muck with the .partitions file a bit before it will de-serialize.  We basically just have to add a Name node to the XML for the de-serialize to work… we don’t use it for anything, but it is required.

private static XmlReader FixPartitionsFileForDeserialize(string partitionFilename, Cube sourceCube)
{
    // Validate inputs
    if (sourceCube == null)
    {
        throw new ArgumentException("Provide a Cube object that matches the partitions file");
    }

    if (string.IsNullOrEmpty(partitionFilename))
    {
        throw new ArgumentException("Provide a partitions file");
    }
    // I am NOT validating the extension to provide some extra flexibility here

    XmlDocument document = new XmlDocument();
    document.Load(partitionFilename);

    // Setup for XPath queries
    XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);
    string defaultNamespaceURI = "http://schemas.microsoft.com/analysisservices/2003/engine";

    // Get all the MeasureGroup IDs
    XmlNodeList nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:ID", xmlnsManager);
    XmlNode newNode = null;

    // Add a Name node underneath the ID node if one doesn't exist, using the MeasureGroup's real name
    foreach (XmlNode node in nodeList)
    {
        // Verify the node doesn't exist
        if (XmlHelper.NodeExists(node.ParentNode, "Name"))
            continue;

        newNode = document.CreateNode(XmlNodeType.Element, "Name", defaultNamespaceURI);
        // Lookup the MG name from the cube based on the ID in the file
        newNode.InnerText = sourceCube.MeasureGroups.Find(node.InnerText).Name;
        node.ParentNode.InsertAfter(newNode, node);
    }

    // Return this as an XmlReader, so it can be manipulated
    return new XmlTextReader(new StringReader(document.OuterXml));
}

Next, the MergePartitionCube method will just loop through the “partitions only” cube we de-serialized, and add each partition and aggregation design to the “main” cube.  We have to make sure to create a copy of each object we add, because a Partition or AggregationDesign can only belong to a single cube… calling Add on an existing object removes it from its original collection.

private static void MergePartitionCube(Cube baseCube, Cube partitionCube)
{
    MeasureGroup baseMG = null;

    foreach (MeasureGroup mg in partitionCube.MeasureGroups)
    {
        baseMG = baseCube.MeasureGroups.Find(mg.ID);

        // Heisenberg principle in action with these objects; use 'for' instead of 'foreach'
        if (mg.Partitions.Count > 0)
        {
            for (int i = 0; i < mg.Partitions.Count; ++i)
            {
                Partition partitionCopy = mg.Partitions[i].Clone();
                baseMG.Partitions.Add(partitionCopy);
            }
        }

        // Heisenberg principle in action with these objects; use 'for' instead of 'foreach'
        if (mg.AggregationDesigns.Count > 0)
        {
            for (int i = 0; i < mg.AggregationDesigns.Count; ++i)
            {
                AggregationDesign aggDesignCopy = mg.AggregationDesigns[i].Clone();
                baseMG.AggregationDesigns.Add(aggDesignCopy);
            }
        }
    }
}

We now have an AMO Database from our Visual Studio project!  Serializing it back out is pretty similar… we just loop through the Database and serialize everything out:

XmlTextWriter writer = null;

// Iterate through all objects in the database and serialize them
foreach (DataSource dataSource in database.DataSources)
{
    writer = new XmlTextWriter(targetDirectory + dataSource.Name + ".ds", Encoding.UTF8);
    writer.Formatting = Formatting.Indented;
    Utils.Serialize(writer, (MajorObject)dataSource, false);
    writer.Close();
}
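
The loops for the other object types look the same, just with different extensions; for example, dimensions (a sketch, assuming the .dim extension noted earlier):

foreach (Dimension dimension in database.Dimensions)
{
    writer = new XmlTextWriter(targetDirectory + dimension.Name + ".dim", Encoding.UTF8);
    writer.Formatting = Formatting.Indented;
    Utils.Serialize(writer, (MajorObject)dimension, false);
    writer.Close();
}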

Again, the one wrinkle is the .cube and .partitions files… the Utils methods don’t work 100% with those.  Luckily, they work about 90%, so we have a workaround.

// Special case:  The cube serialization won't work for partitions when Partition/AggregationDesign
// objects are mixed in with other objects.  Serialize most of the cube, then split out
// Partition/AggregationDesign objects into their own cube to serialize, then clean up
// a few tags
foreach (Cube cube in database.Cubes)
{
    writer = new XmlTextWriter(targetDirectory + cube.Name + ".cube", Encoding.UTF8);
    writer.Formatting = Formatting.Indented;
    Utils.Serialize(writer, (MajorObject)cube, false);
    writer.Close();

    // Partitions and AggregationDesigns may be written to the Cube file, and we want
    // to keep them all in the Partitions file; strip them from the cube file
    FixSerializedCubeFile(targetDirectory + cube.Name + ".cube");

    Cube partitionCube = SplitPartitionCube(cube);
    writer = new XmlTextWriter(targetDirectory + cube.Name + ".partitions", Encoding.UTF8);
    writer.Formatting = Formatting.Indented;
    Utils.Serialize(writer, (MajorObject)partitionCube, false);
    writer.Close();

    // The partitions file gets serialized with a few extra nodes... remove them
    FixSerializedPartitionsFile(targetDirectory + cube.Name + ".partitions");
}

We first serialize the whole cube (for each cube, of course) and then manually fix up the file.  Some of the Partition and MeasureGroup info gets serialized out to the .cube file at this point, and we want to keep all of that information in the .partitions file where it belongs.

private static void FixSerializedCubeFile(string cubeFilename)
{
    // Validate inputs
    if (string.IsNullOrEmpty(cubeFilename))
    {
        throw new ArgumentException("Provide a cube file");
    }
    // I am NOT validating the extension to provide some extra flexibility here

    XmlDocument document = new XmlDocument();
    document.Load(cubeFilename);

    XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);

    XmlNodeList nodeList = null;

    // Remove the Partitions
    nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:Partitions", xmlnsManager);
    XmlHelper.RemoveNodes(nodeList);

    // Remove the AggregationDesigns
    nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:AggregationDesigns", xmlnsManager);
    XmlHelper.RemoveNodes(nodeList);

    document.Save(cubeFilename);
}

We then create a temporary cube with just the Partition and AggregationDesign objects in it, and serialize it.  Again, we have to clean up the file just a bit.

private static void FixSerializedPartitionsFile(string partitionFilename)
{
    // Validate inputs
    if (string.IsNullOrEmpty(partitionFilename))
    {
        throw new ArgumentException("Provide a partitions file");
    }
    // I am NOT validating the extension to provide some extra flexibility here

    XmlDocument document = new XmlDocument();
    document.Load(partitionFilename);

    XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);

    XmlNodeList nodeList = null;

    // Remove the MeasureGroup Names
    nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:Name", xmlnsManager);
    XmlHelper.RemoveNodes(nodeList);

    // Remove the StorageModes
    nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:StorageMode", xmlnsManager);
    XmlHelper.RemoveNodes(nodeList);

    // Remove the ProcessingModes
    nodeList = document.SelectNodes("/AS:Cube/AS:MeasureGroups/AS:MeasureGroup/AS:ProcessingMode", xmlnsManager);
    XmlHelper.RemoveNodes(nodeList);

    document.Save(partitionFilename);
}

…and there we go!  We can start with a Visual Studio SSAS Project, create the Database in memory, do whatever we want with it, then write it back out to files!

Build .ASDatabase File

I talk more about the what and why in <<this blog>>, but the actual mechanics of this functionality are really easy (once we’ve de-serialized the project, that is).  This could basically be condensed to a single line of code:

Database database;

database = ProjectHelper.DeserializeProject(ssasProjectFile);

ProjectHelper.GenerateASDatabaseFile(database, targetFilename);

Clean Visual Studio SSAS Project

I won’t go into a lot of detail on this here… I posted a blog on SQL Server Analysis Services Projects with Multiple Developers a little while back.  I originally wrote this functionality in PowerShell, but just ported it into this library because I thought it would be useful.  This method just goes through each unlocked file and removes fields maintained by Visual Studio that aren’t required by the AS DB per se, but that make merging code from multiple developers substantially more difficult.  If you’re testing out the serialize/de-serialize functionality, you might also want to use this against your projects to make them easier to compare.
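
To give a rough idea of what “cleaning” involves, here’s a sketch (not the library’s actual implementation) that strips a few of the volatile elements listed under Known Issues below, reusing the same namespace and XML helpers as the other methods; the dwd:design-time-name attribute would need similar handling.

private static void CleanSsasFileSketch(string filename)    // hypothetical helper, for illustration only
{
    XmlDocument document = new XmlDocument();
    document.Load(filename);

    XmlNamespaceManager xmlnsManager = LoadSsasNamespaces(document);

    // Remove fields that Visual Studio maintains but the AS DB doesn't require
    foreach (string element in new string[] { "State", "CreatedTimestamp", "LastSchemaUpdate", "LastProcessed" })
    {
        XmlNodeList nodeList = document.SelectNodes("//AS:" + element, xmlnsManager);
        XmlHelper.RemoveNodes(nodeList);
    }

    document.Save(filename);
}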

Sort SSAS Project File

One annoying side effect of serializing the .partitions file is that all the elements get re-arranged.  Everything still works, of course, but if you’re like me you’ll want to verify that everything is working.  The good people over at BIDS Helper let me use a version of their “Smart Diff” XSLT, which will sort an SSAS file.  I modified it slightly and customized it to help validate the .partitions files.  I would be very careful using it for anything else… it’s only been tested for the intended scenario.
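
Applying an XSLT like that from .NET is straightforward; a minimal sketch (the stylesheet and file names here are just placeholders):

using System.Xml.Xsl;

// Sort an SSAS file using the modified "Smart Diff" stylesheet so it can be diffed easily.
XslCompiledTransform transform = new XslCompiledTransform();
transform.Load("SortSsasFile.xslt");
transform.Transform("AdventureWorks.partitions", "AdventureWorks.sorted.partitions");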

Known Issues

I’ve tested this with Visual Studio 2008 SP1, SQL Server Analysis Services SP1 CU3, and the Adventure Works 2008 AS DB with no issues.  Definitely make a backup before you try this out, as your mileage may vary.

  1. The sample I wrote just serializes objects into files based on their names.  It would be trivial to maintain a list based on the Visual Studio project file to make sure the objects are written out with their correct names.
  2. Partitions are reordered when De-Serialized/Serialized. Makes it a pain to validate, but I’ve seen no ill effects. Use SortSssasFile functionality to make a copy of the files for easier comparison.
  3. Some fields maintained for Visual Studio (State (Processed, Unprocessed), CreatedTimestamp, LastSchemaUpdate, LastProcessed, dwd:design-time-name, CurrentStorageMode) are lost when a project is De-Serialized/Serialized.

Next Steps

I’ve thought of some nifty things to do with this functionality, including…

  1. Generate a new Visual Studio project based on metadata.
  2. Maintain a project with external metadata… you can maintain a separate data dictionary with descriptions of each attribute/measure/cube, and just programmatically update your source when needed.
  3. Automate repetitive tasks and maintenance of the project (like adding a bunch of partitions to your measure groups, or replacing every occurrence of a certain word in the project).
  4. Build a more robust development and deployment process.
  5. …and many more things.

Cheers,

David


SSIS and SQL Azure – Getting Started

Since SQL Azure is currently in a Community Technology Preview, the technology and the information provided below are subject to change. This post is based on the August 18th CTP.

Now that I’ve been working with SSIS against SQL Azure for a few days, I thought I’d post about my experiences. Overall, I’m pretty happy with it, considering that it is a pre-release product. I’ve had some good and some bad experiences, but with what I am seeing right now, and the direction it’s heading in, I think it has a good future.

Prior to the CTP, people wanting to get an early start with SQL Azure were advised to develop locally against SQL Express. Theoretically, you could then simply change your connection strings to point to SQL Azure, and away you go. In practice, that’s not exactly how it worked for me with SSIS (your mileage may vary – .NET apps are probably much easier to port).

Make sure you read through the documentation first – there’s a lot of good information there, and some of it is pretty important. The first thing to note is that SQL Azure currently does not support OLE DB. The normal recommendation for SSIS is to use the OLE DB Source or Destination to access SQL Server. However, if you want to port your packages to SQL Azure, you must use the ADO.NET Source and Destinations. This is fine for 2008, but if you are using SSIS 2005, there is no ADO.NET Destination, so you would have to implement your own through a script component.
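
For reference, the ADO.NET connection manager ends up pointing at SQL Azure with a connection string along these lines (the server, database, and credentials are placeholders; note the user@server form of the login):

using System.Data.SqlClient;

// Placeholder values - substitute your own server, database, and credentials.
string connectionString =
    "Server=tcp:yourserver.database.windows.net;" +
    "Database=yourdatabase;" +
    "User ID=youruser@yourserver;" +
    "Password=yourpassword;" +
    "Encrypt=True;";

using (SqlConnection connection = new SqlConnection(connectionString))
{
    connection.Open();
    // ... issue commands against SQL Azure ...
}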

The second thing to be aware of is that bulk insert operations are not currently supported (though it’s been said they will be available in a later CTP). Since the ADO.NET Destination doesn’t support bulk inserts anyway, this isn’t a huge issue. However, if you are writing your own destination (in a script component or custom component), you can’t currently use the ADO.NET SqlBulkCopy class.

So, with those two caveats out of the way, it should be pretty much like creating any data flow in SSIS – add a source, add a destination, and you are ready to go. However, I initially got an error when using the ADO.NET Source and Destination.


The error appears to come up because SQL Azure does not currently support the system catalog tables that ADO.NET queries to retrieve table information.  For the ADO.NET Source, since you can’t type the table name in, the simplest way to work around this is to use the SQL Command option and specify a SQL statement instead of using the Table or View option.


For the ADO.NET Destination, your only choice is to use the Table or View option, so you can just type the table name in. The table name must be provided in the following format: “schema”.”tablename”.

Once this is done, you can run the package, and watch your data move. Once or twice, I saw validation warnings that prevented the package from running, but these all went away the next time I ran it, so I’m guessing it was a momentary connectivity issue. I’m on the road right now, so I don’t have the most stable internet connection available.

I’ll be posting a follow up to this soon that talks about performance, and how you can tune your packages to move data in and out more quickly. I should also have a few performance test results to share.
