Update history for SCD Type 1 Changes

Topics: Issues 2: Using the Component (Design-Time), Possible Bugs/Missing Features?
Jan 24, 2012 at 1:56 PM

Hi

Before closing this discussion thinking - this dude does not know what he is talking about - SCD1 Changes do not have history - let me explain.

We have a dimension that has a mix of SCD type 1 and SCD type 2 changes. Lets use a Client dimension as an example. One of the attributes, example Geographical location has been identified as Type 2 and this will create new records and expire old when the Geographic location changes. Another attribute is Date of Birth which is type 1 - a clients date of birth does not change so why track history.

Over time Client with a business key of 'ABC' had 3 different Geographic locations and therefore has two expired records and one current one. Next we discover that the Date of Birth had always been incorrect and the new one is run in the ETL process. This correctly updates the current record with the correct date of birth.

But now we want to update the 2 expired records as well - and we want to do it via the Dimension Merge SCD wizard.

Currently our Update SCD 1 leg does an update statement like:

 Update DimClient Set DateOfBirth = ? WHERE ClientKey = ?

but for the history update this must be something like:

Update DimClient Set DateOfBirth = ? WHERE ClientCode = ? (i.e. business key in where clause will update all history and not just the individual row)

Anyone have any ideas of how to do this via the wizard. Can it be done.

Thanks

Trevor

Jan 24, 2012 at 4:37 PM

what discussion? this is a new discussion? a tad confusing Trevor.

The SSIS Dimension Merge SCD component doesn't create your down stream components or alter them when you make changes to the component, One of the advantages to using it compared to the standard  Slowly Changing Dimension component. So no wizard? Are you getting confused between the components?

or am I missing some major piece of the puzzle here?

Jan 25, 2012 at 5:08 AM
Edited Jan 25, 2012 at 5:10 AM

WeeeBob - The intro was an attempt on some light humour - sorry you missed it.

I am fully aware of how the DM SCD component works and that it does not create downstream components but rather different output paths to deal with the different types of SCD changes - so no I am not getting confused.

The major piece of the puzzle that you might be missing is that I am asking whether there might be some kind of workaround to deal with Type 1 changes that:

1) update the value in the current record for some columns (i.e we are not interested in fixing expired records) or

2) update the value in all expired rows for other columns (i.e. we want to fix expired records)

I was thinking of maybe adding a multicast component to split the Update SCD 1 output into 2 different streams and doing a update on the surrogate key for 1 above and an update using the business key(s) for type 2

Still thinking about this - comments welcome

Jan 25, 2012 at 10:03 AM

Ahh a little clearer now.

I tend to just go for option 2 for Type 1 changes, if your not doing this then its not a Kimball type 1 change? as you are now tracking change in time for this column. I'll assume you have a valid reason though (maybe late arriving dimension attributes?), and by the sounds of it a conditional split (rather than multicast) on the SCD 1 output would be your best bet.