h1

Slowly Changing Dimension Type 1 with Upsert Destination from Pragmatic Works’ Task Factory

March 19, 2012

Over my last 3 blogs I’ve shown you a variety of ways to handle Type 1 changes in a Slowly Changing Dimension.  All of those design patterns can be created with components that come standard with SSIS.  This time I’m going to show you a 3rd party tool from Pragmatic Works, that replaces every component that we’ve used before except the Source Component, with just one component.

image

SELECT
[UserAK]
,[ScreenName]
,[AccountCreateDate]
,[TimeZone]
,[Language]
,[GeographySK]
,CHECKSUM(
[ScreenName]
,[AccountCreateDate]
,[TimeZone]
,[Language]
,[GeographySK]
) as Hash_CD
FROM
DimUserUpdate

Now instead of adding the lookup and conditional split or the SCD component all we need to do is add the Upsert Destination and configure it and we are done.

image

The first thing we have to do is create a new connection manager.  The Upsert Destination requires an ADO.net Connection to the database instead of the standard OLEDB connection. Once you’ve set the connection and selected the Destination Table, you must identify the key column for the destination by placing a check in the box next to the correct column.  Once that is done, go to the Update Method Tab.

image

On this tab we can set several different ways to handle the update.  The first is the Bulk Update Method. This method is the fastest, but if you plan on doing any kind of row count logging you will not be able to track updated rows. The next option is the Column Compare Method.  This method will compare the values in the columns from the source with that of the columns in the destination and if a change exists it will update the column.  This method is slower but allows you to log row counts for both updates and inserts.  The next option is the Timestamp compare which will compare two timestamp values to see if a record has been updated.  Similarly the Last Update Compare uses a Last update column value.  For this package we are going to use the Column Compare method. Simply select the columns you want to use for the comparison using the check boxes down the left hand side.  Now you can click OK and be done, or move on to the Advanced Tab.  I’m going to set up a Row Count for updates so I’m going to the Advanced Tab next.

image

Simply map each row count variable to the correct drop down and then click OK.  I’m going to add a message box using a Script Task to the Control Flow just to show that the update row count works properly.

You can add this little snippet to get the row count to show in a message box:

Public Sub Main()

MsgBox(Dts.Variables(“UpdateRowCount”).Value)


Dts.TaskResult = ScriptResults.Success
End Sub

Now I’m just going to update the Update table so we actually get some updates going through.

UPDATE DimUserUpdate SET GeographySK = GeographySK + 1

 WHERE [UserAK] in ('98939115','97454425','97202071','9567382','87627850')
Now Execute the package
image

As you can see although 2,056 rows came through the source query, only 5 records were actually updated.  The Upsert Destination makes designing packages for handling SCD type 1 changes much faster and in my opinion easier.  If you get a chance I would suggest trying it out, as well as the multitude of other components that Task Factory has to offer.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: