SQL Server – Agile BI http://agilebi.com A community for sharing ideas about business intelligence development using agile methods. Sat, 14 Jan 2012 02:02:24 +0000 en-US hourly 1 https://wordpress.org/?v=6.2.2 When To Use NVARCHAR http://agilebi.com/blog/2011/09/09/when-to-use-nvarchar/ Thu, 08 Sep 2011 18:27:27 +0000 http://7.115 Continue reading ]]> The Short Answer

Only when you need to store multi-lingual (unicode) data. 

The Nitty Gritty

The fundamental unit of storage in a SQL Server database is an 8 KB page.  To paraphrase the MSDN, SQL Server reads and writes whole data pages and as a result, the maximum amount of data that can be contained within a single row on a page is 8,060 bytes (8 KB). 

The page size in turn limits the maximum size of VARCHAR, a variable-length non-Unicode character datatype, to 8,000 bytes.  Take note!  The operative word here is non-Unicode.  A non-Unicode character is stored as UTF-8 (8 bits) and requires one byte of storage per character, meaning that the VARCHAR datatype may contain at most 8,000 characters. 

In contrast, NVARCHAR is a variable-length Unicode datatype.  And unlike non-Unicode characters, Unicode characters are stored as UTF-16 (16 bits) and require two bytes of storage per character.   Because of the SQL Server page size, NVARCHAR datatypes have the same length restrictions as their VARCHAR cousins, 8,000 bytes.  This means that an NVARCHAR datatype may contain, at most, 4,000 characters.  The net result is NVARCHAR dataypes take up twice as much space as a VARCHAR datatype.  Said another way, NVARCHAR(4000) is the same size as VARCHAR(8000).

But the NVARCHAR datatype shouldn’t be dismissed entirely.  The English language can be encoded in UTF-8, making VARCHAR the datatype of choice.  However, other languages, such as Japanese, Hebrew, Arabic, etc., have an extended set of character codes that are only found in UTF-16.  In order to store data in these languages you must use the NVARCHAR datatype, but that should be the only time.

Warehouse Beware

The rule for when to use an NVARCHAR datatype applies to both OLTP systems and OLAP systems.  If your OLTP system is riddled with unnecessary NVARCHAR datatypes you may want to consider converting them to VARCHAR in the data warehouse to save space.  It’s tempting just to convert all NVARCHARS to VARCHARS but you always have to ask yourself, was the column defined as an NVARCHAR for a reason?  This is a question you’ll need to have answered by the business users.  Is your company planning to expand their business globally?  If so, it might make sense to keep the NVARCHARs, but only for columns where it really matters.

]]>
How to Generate Insert Scripts for Existing Data http://agilebi.com/blog/2011/04/04/how-to-generate-insert-scripts-for-existing-data/ Mon, 04 Apr 2011 03:35:46 +0000 http://7.77 Continue reading ]]> Let’s say you have a bunch of data stored in a table and you need to generate an insert script for each record. Maybe you need to save the scripts in source control or maybe you need the scripts to initialize the same table in another environment. What would you do?

Traditional Methods

Several ideas come to mind, all painfully tedious:

  • You could dynamically create the insert statements with a stored procedure

If you do a quick search you’ll find several stored procedures that will generate insert scripts. This isn’t a bad way to go in terms of effort but I would definitely scrutinize the procedure before using it. And as you’ll see in a bit, there’s an even better approach.

  • You could dynamically create the insert statements with a SQL query

Following this approach your SQL query would look something like this:

SELECT 'INSERT INTO dbo.MyTable (MyCol1, MyCol2) VALUES (' + MyCol1 + ',' + MyCol2 + ')' AS InsertStatement FROM dbo.MyTable

If your table is very wide then the script can be very complex to write and difficult to read and maintain.

  • You could dynamically create the insert statements with Excel

This is a two-step approach where first you select all data from the table and then copy it into Excel.  You then add columns before and after each value so you can create an expression for the insert statement.  For example, cell A1 would look something like this:

=INSERT INTO dbo.MyTable (MyCol1, MyCol2) VALUES (

Using the fill handle you drag the expression down for each row and repeat for each of the other columns (remember, each value you insert needs to be separated by a comma and if you have character data then it must also be enclosed in single quotes). Once you copy your script out of Excel you will notice that the values in each column are separated by a tab. If you are inserting character data you will need to either update your expression and surround it with the RTRIM() and LTRIM() functions, or manually remove the tab yourself. This can get very ugly very quickly.

  • You could manually type the insert statements

When all else fails you can always start to write the script yourself. Depending on how many rows of data you have, and the width of the table, this would require a tremendous amount of effort,

  • You could convince someone else to generate the scripts for you

A-la Tom Sawyer and the fence, your mileage may vary with this approach.

A Better Way

Luckily, there is a far easier approach where SQL Server does all the work.  Unfortunately, if you’re running SQL Server 2005 or earlier you’re still limited to the approaches mentioned above – this well-hidden gem is only available in SQL Server 2008 and later.

Starting in SQL Server Management Studio, navigate to the database where your data lives in the Object Explorer.  Right click on the database name, go to Tasks, and select Generate Scripts.

The SQL Server Scripts Wizard will appear, click Next to begin.

Select the database where your data lives.  In this example I am choosing AdventureWorksDW2008.

In the Choose Script Options there are a number properties that you can tweak.  This is where the magic happens!  In order to generate insert scripts you need to set the “Script Data” property located in the “Table/View Options” group to True (by default it’s set to False). 

Select the type of objects you  want scripted.  Since we want to generate insert scripts the object type table nust be selected.

Select the tables for which you want to generate insert scripts.  Here I’ve chosen DimAccount.

Select your output option. 

Review your settings.  Notice that “Script Data” is set to True.

Let the wizard spin…

And voila!  An insert statement for each record (along with the table definition) has been generated!

So Now What?

I *highly* recommend that you check out and familiarize yourself with all of the options in the Check Options screen as it contains numerous ways to customize your scripts that could end up saving you a lot of time!

]]>