Pass field unlinked?

Dec 9, 2009 at 12:52 PM


Probably this question was already asked, but I didn't find any references... please do point me to any previous discussions if applicable!

The problem/question is this:

I have a field (lets call it FieldX) that should be filled for any ('new' dimension record (new business key).
After that, the field should be updated anytime a type 1 or type 2 change was detected (but NOT if FieldX was the only change).

For example, I store the user that has last changed a contract in the source system. But the datawarehouse does of course not track ALL properties of contracts.
Therefore, if someone changed the contract, I only want this information in my dimension if that user's change was relevant for the datawarehouse.

Can I have FieldX (with the value form the 'source system' input) available in the outputs, even though no change checking should be done on that field? How would I do that?





Dec 17, 2009 at 8:47 PM

I don't think you can get exactly what you're looking for, but here are some options:

1. Use version 1.5's new Audit columns - specifically the "Audit - Last Changed" column.  Unfortunately, this doesn't allow date types - but it does allow string types (which can easily be converted to dates and back).  This column is intended to update each row with a new value (you specify the new value) any time a change (1 or 2) is made to the row.  It doesn't require a FieldX on the Source input (nor will it like one).  This should work if your FieldX isn't marked with a time/date that is of higher granularity than the load process.  That is - if you're loading your warehouse daily, the best it can do is update that with a "day", not a time of day.

2. Use a Derived Column on the "changed" outputs (SCD1 Updated, ...) to "add in" your FieldX after the processing is done.  Since FieldX doesn't influence how the SCD determines what's changed, you can do it after the fact.  Unless (like #1) your FieldX is of higher granularity than the load process frequency.

3. Multicast your Source input.  Send one to the KSCD, send another to the Right input of a Merge Join after the KSCD.  The Left input of the Merge Join would be a "changed" output (say, SCD1 Updated).  Join on the BK, adding your FieldX from the Right.  You will need a Sort before the KSCD output flow can be used in the Merge Join.

Hope that helps.