Kimball Method SCD Component Advantages
There are quite a few advantages to using the Kimball Method SCD component over the SCD Wizard. Here’s a starter list:
- One component on the design surface that can be edited without adverse effects on the rest of the Data Flow. The SCD Wizard creates multiple components that are destroyed and rebuilt if the Wizard is run again. Any customizations that were made in any of
the components – including the “secondary” Derived Column and OLE DB Command transforms – are completely lost.
- Insane performance - measured to be 100x superior (see the
Performance Improvement by use of multiple threads, sort optimization, and implied outcome determination. Those architectural differences beat the SCD Wizard’s single-threaded uncached row-by-row lookups.
- Surrogate Key management inside the component – if desired. The SCD Wizard offers no assistance creating or maintaining surrogate keys.
- "Special" (unknown) member support, per Kimball Method best practices. If you have some “constant” members that should always appear in your dimension, you can manage them (and changes to them) using the Kimball SCD. Maintaining those rows is
completely external to the SCD Wizard, which could result in unintentional collisions with actual business information.
- Includes a "Row Change Reason" output column on all (except Unchanged) outputs, per Kimball Method best practices. Doing so gives you some meaningful way to understand why a row was sent to a particular output. This is in comparison to zero support
for debugging why a row was directed to a particular output in the SCD Wizard, which can get quite frustrating – especially when dealing with understanding mapping your RDBMS data types to SSIS data types.
- Supports simple and advanced styles of Row Auditing for inserts and updates, per Kimball Method best practices. The SCD Wizard offers no auditing support.
- Flexible column comparisons: case (in)sensitive and space (in)sensitive as desired, plus culture-sensitivity. The SCD Wizard is case-, space-, and culture-sensitive with no features available to alter that.
- Flexible SCD 2 "current row" handling - permits specification of the date "endpoints". Comparatively, the only choice with the SCD Wizard is to have the "expiry" date be NULL to signify the current record.
- Flexible SCD 2 date handling - permits specification of what date expired and new rows get marked with. The SCD Wizard leaves that up to a Derived Column component that will get destroyed when the Wizard is run again to adjust other properties.
- Options to use other data types for the Current Member and Inferred Member columns - not just Boolean types as in the SCD Wizard.
- Reads the existing dimension from the Data Flow, not a Connection Manager - allowing the package designer to cache the existing dimension table as they see fit, and use any repository to store the dimension table. The SCD Wizard requires an OLE DB Connection
Manager for the dimension table, and only supports some OLE DB providers.
- Supports "retiring" dimension members that have been deleted from the Source System by marking their "expiry date" with a real date. The SCD Wizard does not support this, because it uses "Change Stream" style SCD processing.
With the SCD Wizard, your dimension table will never be queryable for "how many active things are there?" at any specific point in time.
But it's not all rainbows and roses - you do have to
give some things up
in order to get the above.