New vs. New SCD2

Topics: Issues 2: Using the Component (Design-Time)
Nov 10, 2010 at 11:30 PM

I'm guessing this may be a simple and obvious question.

What is the difference between the New output and the New SCD2 output? 

Rob

 

Nov 18, 2010 at 2:26 AM

Hi Rob, 

Hopefully you are already familiar with the principles of Slowly Changing Dimensions; if not, I suggest you read up on that first.

The New output contains all the Source System input rows that did not match any row from the Existing Dimension input, i.e. these are new rows (based on the Business Key(s)) that did not previously exist in the dimension.

The New SCD2 output contains all the Source System input rows that *did* match a row from the Existing Dimension input on the business key, and where one or more columns identified as SCD2 column type had values different from the Existing Dimension row. These new rows should be inserted into the dimension, and there will be a set of corresponding rows in the Expired SCD2 output, which should go to an update SQL task to set things like the row expiry date and current row indicator.
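To make the split concrete, here is a rough Python sketch of the routing logic described above. It is only an illustration, not the component's actual code: the function name `split_outputs`, the `customer_id` business key and the `city` SCD2 column are all made up for the example, and it ignores SCD1 columns, deletes and unchanged rows.

```python
def split_outputs(source_rows, existing_rows, business_key, scd2_columns):
    """Route source rows into New / New SCD2 / Expired SCD2 lists (illustrative only)."""
    # Index the existing dimension rows by business key for lookup.
    existing_by_key = {row[business_key]: row for row in existing_rows}
    outputs = {"new": [], "new_scd2": [], "expired_scd2": []}
    for src in source_rows:
        current = existing_by_key.get(src[business_key])
        if current is None:
            # No match on the business key: a brand-new dimension member.
            outputs["new"].append(src)
        elif any(src[col] != current[col] for col in scd2_columns):
            # Matched, but an SCD2 column changed: emit the new version
            # and the old version that now needs to be expired.
            outputs["new_scd2"].append(src)
            outputs["expired_scd2"].append(current)
    return outputs


# Hypothetical sample data.
existing = [
    {"customer_id": 1, "city": "Leeds"},
    {"customer_id": 2, "city": "York"},
]
source = [
    {"customer_id": 1, "city": "Bristol"},  # SCD2 change
    {"customer_id": 2, "city": "York"},     # unchanged
    {"customer_id": 3, "city": "Bath"},     # new business key
]
result = split_outputs(source, existing, "customer_id", ["city"])
```

With this sample data, customer 3 lands in `new`, the Bristol row for customer 1 lands in `new_scd2`, and the old Leeds row for customer 1 lands in `expired_scd2`.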

Hope that helps!

cheers,

Nathan

Nov 18, 2010 at 1:42 PM

Nathan,

Yes. I'm familiar with the SCD concepts. After I started working more with the Kimball SCD, I guessed that New SCD2 was for those records that needed to be expired and then added as new records, but I wasn't sure. I meant to update my question.

Thank you for your response.

Rob

Nov 23, 2010 at 4:37 PM

The "New SCD2" output is for the "new version" of SCD2 rows. You don't have to make the rows on that output do "double duty": the "old" version of each row will be emitted through the "Expired SCD2" output. You (typically) only need to route rows from the New SCD2 output directly to a Destination component, and that's it.
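
As a rough sketch of what that routing amounts to on the database side, the Python below uses an in-memory SQLite table in place of a real dimension. Everything in it is assumed for illustration (the `dim_customer` table, its columns, and the sample rows); in an actual package the insert would be a Destination component and the update an SQL task, as described earlier in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical dimension table with one current row.
conn.execute(
    "CREATE TABLE dim_customer ("
    " surrogate_key INTEGER PRIMARY KEY,"
    " customer_id INTEGER, city TEXT,"
    " expiry_date TEXT, is_current INTEGER)"
)
conn.execute("INSERT INTO dim_customer VALUES (1, 1, 'Leeds', NULL, 1)")

# Hypothetical rows as they might arrive on the two outputs.
expired_scd2 = [{"surrogate_key": 1, "expiry_date": "2010-11-23"}]
new_scd2 = [{"customer_id": 1, "city": "Bristol"}]

# Expired SCD2 output: update the old version in place.
conn.executemany(
    "UPDATE dim_customer SET expiry_date = :expiry_date, is_current = 0"
    " WHERE surrogate_key = :surrogate_key",
    expired_scd2,
)

# New SCD2 output: a plain insert, with no extra handling of the old row.
conn.executemany(
    "INSERT INTO dim_customer (customer_id, city, expiry_date, is_current)"
    " VALUES (:customer_id, :city, NULL, 1)",
    new_scd2,
)

rows = conn.execute(
    "SELECT customer_id, city, expiry_date, is_current"
    " FROM dim_customer ORDER BY surrogate_key"
).fetchall()
```

The point of the sketch is that the insert for the New SCD2 rows never touches the old version; expiring it is handled entirely by the rows from the Expired SCD2 output.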