Existing Row going to both New and Unmatched

Mar 10, 2010 at 7:10 AM
Edited Mar 10, 2010 at 7:12 AM

Hello, I'm a bit of a newbie with Data Warehouse work, and am trying to use the KimballSCD control.

I have constructed my system to be the same as the one in the video example, with the only difference being as follows:

I have a derived column on the input source system to include a Business Key field which is this Years Date (ie, 2010).

I have two derived columns on the dimension system to include two fields for the Effective and Expiry date fields (the way we do things here is the Effective Date and Expiry Date columns in the dimension table are actually integers of the format YYYYMMDD (we have them as Surrogate Keys to a Date Dimension table used for other purposes) - since this component only allows output to Date fields I add these as a place holder for the values generated by the system, then change them and set the surrogate key columns appropriately at a later stage.

These two inputs are then fed into the Kimball SCD component.

When I run my task with one row from the source system, it goes through the system, and gets 1 new record.  I then send this record to an OLE DB Destination and it writes to the database the single record fine.

When I run the task a second time, with the same data from the source system, it goes through the system, and flags the record as both a New Record and an Unchanged record - I currently have the Unchanged going to an output grid, but the New Record gets reinserted .  I can't work out why it's doing this (maybe I'm overlooking something).  Also puzzling is it shows "Deleted: 1" in both instances.

Here is the debug code in case that helps:

Initial Task:

Information: 0x40043006 at Data Flow Task, SSIS.Pipeline: Prepare for Execute phase is beginning.
Information: 0x40043007 at Data Flow Task, SSIS.Pipeline: Pre-Execute phase is beginning.
Information: 0x4004300C at Data Flow Task, SSIS.Pipeline: Execute phase is beginning.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): No work units left - constructing.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from ED.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from SS.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Business Key 'xxxxxx' read from Source_System input.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): No work units left - constructing.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from ED.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): constructed 1 work units from SS.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): Returning work unit 0 of 1.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeys): Key 'xxxxxx' from Source_System is DEFINITELY NOT in Existing_Dimension
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): Pushed 1 keys to matched queue, all DEFINITELY NOT matched.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeys): Key 'xxxxxx' not queued (duplicate)
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'ProcessKey (8)' (in ProcessKey): Processing key 'xxxxxx'
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'ProcessKey (8)' (in ProcessKey): Key 'xxxxxx' is a new record
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'ProcessKey (8)' (in ProcessKey): Sending 'xxxxxx' to New
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): No work units left - constructing.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from ED.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from SS.
Information: 0x402090DF at Data Flow Task, OLE DB Destination [1620]: The final commit for the data insertion in "component "OLE DB Destination" (1620)" has started.
Information: 0x402090E0 at Data Flow Task, OLE DB Destination [1620]: The final commit for the data insertion  in "component "OLE DB Destination" (1620)" has ended.
Information: 0x40043008 at Data Flow Task, SSIS.Pipeline: Post Execute phase is beginning.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Input rows processed: 0 Existing Dimension, 0 Special Members, 1 Source System
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Rows output: 0 Unchanged, 1 New, 1 Deleted, 0 SCD1, 0 SCD2 Expired, 0 SCD2 New, 0 Invalid Input
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: 1.3 seconds to complete.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Delays: 0.0s waiting for input (3.1%), 0.1s waiting for caching (3.8%), 0.0s waiting for row matching (1.9%), 0.0s waiting for row processing (0.0%), 0.0s waiting for output buffers (0.0%)
Information: 0x4004300B at Data Flow Task, SSIS.Pipeline: "component "OLE DB Destination" (1620)" wrote 1 rows.
Information: 0x40043009 at Data Flow Task, SSIS.Pipeline: Cleanup phase is beginning.
SSIS package "blah.dtsx" finished: Success.

Second Task:

Information: 0x40043006 at Data Flow Task, SSIS.Pipeline: Prepare for Execute phase is beginning.
Information: 0x40043007 at Data Flow Task, SSIS.Pipeline: Pre-Execute phase is beginning.
Information: 0x4004300C at Data Flow Task, SSIS.Pipeline: Execute phase is beginning.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): No work units left - constructing.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from ED.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from SS.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Business Key 'xxxxxx' read from Existing_Dimension input.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): No work units left - constructing.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from ED.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from SS.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Business Key 'xxxxxx' read from Source_System input.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): No work units left - constructing.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 1 work units from ED.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 2 work units from SS.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): Returning work unit 0 of 2.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): Returning work unit 0 of 1.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeys): Key 'xxxxxx' from Source_System MATCHED to key in Existing_Dimension
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeys): Key 'xxxxxx' from Existing_Dimension MATCHED to key in Source_System
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeys): Key 'xxxxxx' queued up
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeys): Key 'xxxxxx' not queued (duplicate)
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'ProcessKey (9)' (in ProcessKey): Processing key 'xxxxxx'
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'ProcessKey (9)' (in ProcessKey): Key 'xxxxxx' is a new record
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'ProcessKey (9)' (in ProcessKey): Sending 'xxxxxx' to New
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'ProcessKey (9)' (in ProcessKey): Sending 'xxxxxx' to Unchanged
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): No work units left - constructing.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from ED.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (6)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from SS.
Information: 0x402090DF at Data Flow Task, OLE DB Destination [1620]: The final commit for the data insertion in "component "OLE DB Destination" (1620)" has started.
Information: 0x402090E0 at Data Flow Task, OLE DB Destination [1620]: The final commit for the data insertion  in "component "OLE DB Destination" (1620)" has ended.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): No work units left - constructing.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from ED.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Thread 'MatchKeys (7)' (in MatchKeyThreadManager.GetWorkUnit): constructed 0 work units from SS.
Information: 0x40043008 at Data Flow Task, SSIS.Pipeline: Post Execute phase is beginning.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Input rows processed: 1 Existing Dimension, 0 Special Members, 1 Source System
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Rows output: 1 Unchanged, 1 New, 1 Deleted, 0 SCD1, 0 SCD2 Expired, 0 SCD2 New, 0 Invalid Input
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: 3.6 seconds to complete.
Information: 0x0 at Data Flow Task, Kimball Method Slowly Changing Dimension: Delays: 0.1s waiting for input (1.5%), 0.1s waiting for caching (2.8%), 0.4s waiting for row matching (11.8%), 0.0s waiting for row processing (0.0%), 0.0s waiting for output buffers (0.0%)
Information: 0x4004300B at Data Flow Task, SSIS.Pipeline: "component "OLE DB Destination" (1620)" wrote 1 rows.
Information: 0x40043009 at Data Flow Task, SSIS.Pipeline: Cleanup phase is beginning.
SSIS package "blah.dtsx" finished: Success.

Coordinator
Mar 14, 2010 at 9:06 PM

I think "ur doing it wrong" :)

You can't be deriving things like the effective date and expiry date - those are constructs of the SCD process itself and need to be generated and managed by the component.  You need to feed in your Existing Dimension contents - unadulterated - into the component.  Then you need to send your Source System data into the component, without any of the "housekeeping" columns - like the effective and expiry dates.  The component will figure out from there which rows need updating, which rows need new "versions", etc...

If you're starting out, I highly recommend The Data Warehouse Toolkit by Ralph Kimball and Margy Ross.  It's a spectacular book for beginners as well as experts - I used it as an intro to warehousing, and still go back to it as a reference.  At under $50, spending a couple hours just reading the first few chapters will save you tons of time and money understanding what the whole process is about.  If you're more of a hands-on person, get your company to send you to a Kimball University course - it's very well worth it.