Posts tagged ‘SQL Azure’

Presenting at SQLSaturday #49 in Orlando, FL on October 16, 2010

I have a very busy schedule in October. As already mentioned, I’m presenting at SQLSaturday #48 in Columbia earlier in the month. I also have the privilege of presenting at the Orlando, FL SQLSaturday #49 on October 16th, 2010. Thanks to Andy Warren for squeezing me into a slot that opened up in the schedule.

I’ll be presenting a newly revised presentation on SQL Azure – Moving Data with SQL Azure and SSIS. The abstract is:

SQL Azure allows you to host your data in SQL Server in the cloud. That provides some big benefits in scalability and management. However, it leaves open the question, “How do you get your data into / out of the cloud?” At some point, you are going to need to move data to or from an on-premise store to SQL Azure. In this session, we’ll discuss the available options for this, including SSIS, the Sync Framework, and BCP. We’ll cover the pros and cons for each. We’ll drill into one of the options, SSIS, in detail, and review performance options and potential issues that you may encounter when doing this.

I’ve presented on SQL Azure in the past, focusing on how it could be leveraged for BI purposes. In this session, we’ll focus in on moving data in and out of SQL Azure, as that’s an area that’s changing pretty rapidly right now.

If you read the blog, please introduce yourself. I’ll be around all day, and I’m always happy to chat.

SSIS and SQL Azure – Getting Started

Since SQL Azure is currently in a Community Technology Preview, the technology and this information provided below is subject to change. This post is based on the August 18th CTP.

Now that I’ve been working with SSIS against Azure for a few days, I thought I’d post about my experiences. Overall, I’m pretty happy with it, considering that it is a pre-release product. I’ve had some good and and some bad experiences, but with what I am seeing right now, and the direction it’s heading in, I think it has a good future.

Prior to the CTP, people wanting to get an early start with SQL Azure were advised to developed locally against SQL Express. Theoretically, you could then simply change your connection strings to point to SQL Azure, and away you go. In practice, that’s not exactly how it worked for me with SSIS (your mileage may vary – .NET apps are probably much easier to port).

Make sure you read through the documentation first – there’s a lot of good information there, and some of it is pretty important. The first thing to note is that SQL Azure currently does not support OLE DB. The normal recommendation for SSIS is to use the OLE DB Source or Destination to access SQL Server. However, if you want to port your packages to SQL Azure, you must use the ADO.NET Source and Destinations. This is fine for 2008, but if you are using SSIS 2005, there is no ADO.NET Destination, so you would have to implement your own through a script component.

The second thing to be aware of is that bulk insert operations are not currently supported (though it’s been said they will be available in a later CTP). Since the ADO.NET Destination doesn’t support bulk inserts anyway, this isn’t a huge issue. However, if you are writing your own destination (in a script component or custom component), you can’t currently use the ADO.NET SqlBulkCopy class.

So, with those two caveats out of the way, it should be pretty much like creating any data flow in SSIS – add a source, add a destination, and you are ready to go. However, I got the following error when using the ADO.NET Source and Destination:

image

This error appears to come up because SQL Azure does not currently support the system catalog tables that ADO.NET calls to retrieve table information.  For the ADO.NET Source, since you can’t type the table name in, the simplest way to work around  this is to use the SQL Command option and specify a SQL Statement instead of the Table or View option.

image

For the ADO.NET Destination, your only choice is to use the Table or View option, so you can just type the table name in. The table name must be provided in the following format: “schema”.”tablename”.

Once this is done, you can run the package, and watch your data move. Once or twice, I saw validation warnings that prevented the package from running, but these all went away the next time I ran it, so I’m guessing it was a momentary connectivity issue. I’m on the road right now, so I don’t have the most stable internet connection available.

I’ll be posting a follow up to this soon that talks about performance, and how you can tune your packages to move data in and out more quickly. I should also have a few performance test results to share.

SQL Azure, CREATE TABLE, and USE

I’ve been playing around with the SQL Azure CTP for a little bit, and generally, it’s going well. However, as with any new technology, there are plenty of things to learn.  I’m planning a series of posts around SQL Azure to share what I’m learning about it. And yes, there will be some SSIS thrown in there, too – what good is a database in the cloud if you can’t get your data in and out?

One of the first things I did was create a new database and some tables (rather obvious, I suppose – you can’t really do much in SQL without that). Something that you will likely encounter immediately when creating tables is the difference between what you can do in SQL Server and what SQL Azure supports. Primarily, it’s related to physical options affecting the storage. As a comparison, here’s the script that SQL Server Management Studio generates if you right-click on a table and choose Script Table As…Create To.

USE [MyDatabase]

GO 

 

SET ANSI_NULLS ON

GO 

 

SET QUOTED_IDENTIFIER ON

GO 

 

SET ANSI_PADDING ON

GO 

 

CREATE TABLE [dbo].[MyTable](

    [MyKey] [int] IDENTITY(1,1) NOT NULL,

    [MyString] [varchar](30) NOT NULL,

    [UpdateID] [int] NOT NULL,

    [UpdateDate] [datetime] NOT NULL,

 CONSTRAINT [pkMyTable] PRIMARY KEY CLUSTERED (

    [MyKey] ASC)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY],

 CONSTRAINT [akMyTable] UNIQUE NONCLUSTERED (

    [MyString] ASC)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY])

 ON [PRIMARY] 

GO 

 

SET ANSI_PADDING OFF

GO 

 

ALTER TABLE [dbo].[MyTable] ADD

  CONSTRAINT [dfUpdateDate]  DEFAULT (getdate()) FOR [UpdateDate]

GO

Here’s the same CREATE TABLE script, but trimmed down to just the items SQL Azure supports:

SET QUOTED_IDENTIFIER ON

GO 

 

CREATE TABLE [dbo].[MyTable](

    [MyKey] [int] IDENTITY(1,1) NOT NULL,

    [MyString] [varchar](30) NOT NULL,

    [UpdateID] [int] NOT NULL,

    [UpdateDate] [datetime] NOT NULL,

 CONSTRAINT [pkMyTable] PRIMARY KEY CLUSTERED (

    [MyKey] ASC)WITH (STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF),

 CONSTRAINT [akMyTable] UNIQUE NONCLUSTERED (

    [MyString] ASC)WITH (STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF))

GO

 

ALTER TABLE [dbo].[MyTable] ADD

  CONSTRAINT [dfUpdateDate]  DEFAULT (getdate()) FOR [UpdateDate]

GO

This is all documented in the SQL Azure documentation on MSDN, under the Transaction SQL Reference. And, as expected, most of the options that aren’t supported are related to physical storage.

One item that does stand out a bit, though, is USE. The USE statement is supported, but only if it references the current database, as in USE MyDatabase when you are connected to MyDatabase. Executing USE MyOtherDatabase when you are connected to MyDatabase will result in an error. Instead, you have to disconnect from MyDatabase and connect to MyOtherDatabase. It does make some sense not to allow users to switch databases in a multi-tenant model (I can picture all sorts of interesting hacks being created if that were possible). I do wonder, though, why it was included at all, as it is fairly useless in it’s current form. Maybe a future enhancement?

Anyway, if you are interested in SQL Azure, what this space for more updates as I continue working with it.

Presenting at the SQL Server Best Practices Conference

I’m going to be presenting two sessions at the SQL Server Best Practices Conference on August 24-26 in Washington, DC. This conference has a different focus than many of the technical conferences that I’ve presented at in the past. Instead of technical deep dives, the focus is on providing attendees with information on the decision points they are likely to encounter in their projects, and guidance on how to choose the best course of action at those points. It promises to be a very interesting conference.

I’ll be presenting on two topics. The first, Unit Testing SSIS Packages, is a topic I’ve presented on before, but not with this focus. I’m a big fan of unit testing and the test driven development (TDD) model, but I’ve found that there are numerous challenges in applying this approach to SSIS. So my session will focus on the best practices you can use to make this successful.

My other topic, SQL Server BI in the Cloud, is a new one for me (and for pretty much everyone, considering that SQL Azure is still pre-CTP). However, I’ve been doing some work with it and other cloud based models for BI, and there are some clear key decision points that you need to consider when embarking on this type of project. I’m really looking forward to this one, as it’s a new area, and that always prompts good discussions.

If you are attending, please consider dropping by for my sessions. If you’re not registered for the conference, there’s still time, and there’s a great lineup of speakers.

 

BestPractices_banner640x96