SSIS and SQL Azure – Getting Started

Since SQL Azure is currently in a Community Technology Preview, the technology and this information provided below is subject to change. This post is based on the August 18th CTP.

Now that I’ve been working with SSIS against Azure for a few days, I thought I’d post about my experiences. Overall, I’m pretty happy with it, considering that it is a pre-release product. I’ve had some good and and some bad experiences, but with what I am seeing right now, and the direction it’s heading in, I think it has a good future.

Prior to the CTP, people wanting to get an early start with SQL Azure were advised to developed locally against SQL Express. Theoretically, you could then simply change your connection strings to point to SQL Azure, and away you go. In practice, that’s not exactly how it worked for me with SSIS (your mileage may vary – .NET apps are probably much easier to port).

Make sure you read through the documentation first – there’s a lot of good information there, and some of it is pretty important. The first thing to note is that SQL Azure currently does not support OLE DB. The normal recommendation for SSIS is to use the OLE DB Source or Destination to access SQL Server. However, if you want to port your packages to SQL Azure, you must use the ADO.NET Source and Destinations. This is fine for 2008, but if you are using SSIS 2005, there is no ADO.NET Destination, so you would have to implement your own through a script component.

The second thing to be aware of is that bulk insert operations are not currently supported (though it’s been said they will be available in a later CTP). Since the ADO.NET Destination doesn’t support bulk inserts anyway, this isn’t a huge issue. However, if you are writing your own destination (in a script component or custom component), you can’t currently use the ADO.NET SqlBulkCopy class.

So, with those two caveats out of the way, it should be pretty much like creating any data flow in SSIS – add a source, add a destination, and you are ready to go. However, I got the following error when using the ADO.NET Source and Destination:

image

This error appears to come up because SQL Azure does not currently support the system catalog tables that ADO.NET calls to retrieve table information.  For the ADO.NET Source, since you can’t type the table name in, the simplest way to work around  this is to use the SQL Command option and specify a SQL Statement instead of the Table or View option.

image

For the ADO.NET Destination, your only choice is to use the Table or View option, so you can just type the table name in. The table name must be provided in the following format: “schema”.”tablename”.

Once this is done, you can run the package, and watch your data move. Once or twice, I saw validation warnings that prevented the package from running, but these all went away the next time I ran it, so I’m guessing it was a momentary connectivity issue. I’m on the road right now, so I don’t have the most stable internet connection available.

I’ll be posting a follow up to this soon that talks about performance, and how you can tune your packages to move data in and out more quickly. I should also have a few performance test results to share.

One Comment

  1. mmasson says:

    We’re looking into improving SSIS support for SQL Azure in SQL Server 2008 R2.

    The partitioning that SQL Azure does makes retrieving the available tables a bit harder than expected, but we should be able to drastically improve the performance of loading data using the ADO.NET Destination by taking advantage of the SqlBulkCopy support.

Leave a Reply