Updating a warehouse from a partially filled source

Jun 15, 2010 at 5:26 PM
I have been using this component for about a year now and it works great, but I am wondering how to do one thing. Some of my dimensions are a million+ rows; one of the source systems for the warehouse is another relational source that is not the OLTP system and it is populated by an incremental load process. It has a staging database that has only the data that is new and updated. I would like to make use of the stage tables instead of using the main reporting tables, but if I do that, most of the records in the warehouse will be marked as expired. Is there a way to prevent this from happening other than pointing the component at the full table? Records are rarely deleted from the source systems and when they are, the application area owner can follow up with the warehouse team to have it marked as expired. The reason I want to view fewer rows is because these large dimensions tend to bloat the amount of RAM that SSIS takes since I run dozens of dimension processing tasks in parallel, with a bunch being a million+ rows. Also, I would imagine that this process would be faster than pulling the entire dataset back.
Jun 16, 2010 at 5:02 PM

I believe that if you use v1.5, you can use a Conditional Split on the Expired output to filter out the records that show "Deleted" as a "Row Change Reason".  Let me know if that works for you.

Jun 18, 2010 at 12:35 PM

That did it, thanks Todd!