successive runs of the my SCD create duplicate rows

Topics: Issues 3: Executing the Component (Run-Time)
Nov 3, 2010 at 12:58 AM
Edited Nov 3, 2010 at 4:43 AM

Hi I have created an SSIS package based on the harness standard package where I have replaced the ipput and output data sources as table in my database.

The initial run from an empty existing table gives me 204 rows but sees the special members as invalid.

Thats one question, what could be wrong that causes the special members to be invalid.

A successive run adds another 204 rows, and each new execution adds another 204 rows.

What should I do to have successive rows see that the data does not need to be updated unless changed.

 I can send you some screen captures and dtsx files my email if it helps

Regards,

Alan

Nov 4, 2010 at 12:41 AM

I have removed the output of the unchanged derived field going to the union all.  Is this a valid tactic?

Coordinator
Nov 9, 2010 at 6:51 PM

I'm sorry - that sample package is not a "real life" implementation of how you update a data warehouse.  I've gotten myself into more trouble with that "sample" than it's worth, and I shouldn't have made it like it is.

REMOVE THE UNION ALL.

You need to handle your outputs individually.  For example, rows coming out of the "New" output should go into an OLE DB Destination (or other Destination) and get them inserted.  Rows coming out of the "SCD1 Updated" output should go into an OLE DB Command for UPDATE (or better, an OLE DB Destination that inserts into a temp table, followed by an UPDATE command in an Execute SQL Task).

To determine why the special members are getting rejected, put a data viewer on the Invalid Input Output.  There will be a "reason" shown there why the rows are getting rejected.

To determine why you're getting "new rows" all the time, I'd first rearchitect like I said earlier, then you can examine each output of the component with a data viewer to look at the Row Change Reason column.

Nov 9, 2010 at 8:47 PM

Thanks Todd,

Do you have a real life implementation of a SCD package?

Regards,

Alan

Alan Richmond

Business Intelligence Lead

PFD Food Services Pty Ltd
ABN: 29 006 972 381
6 Henderson Rd
Knoxfield VIC 3180
Phone: 03 9756 2093

Mobile: 0419 001123

Email: arichmond@pfdfoods.com.au
Web Site: www.pfdfoods.com.au
Fax: 03 9753 4205

From: toddmcdermid [mailto:notifications@codeplex.com]
Sent: Wednesday, 10 November 2010 5:52 AM
To: Alan Richmond
Subject: Re: successive runs of the my SCD create duplicate rows [kimballscd:233246]

From: toddmcdermid

I'm sorry - that sample package is not a "real life" implementation of how you update a data warehouse. I've gotten myself into more trouble with that "sample" than it's worth, and I shouldn't have made it like it is.

REMOVE THE UNION ALL.

You need to handle your outputs individually. For example, rows coming out of the "New" output should go into an OLE DB Destination (or other Destination) and get them inserted. Rows coming out of the "SCD1 Updated" output should go into an OLE DB Command for UPDATE (or better, an OLE DB Destination that inserts into a temp table, followed by an UPDATE command in an Execute SQL Task).

To determine why the special members are getting rejected, put a data viewer on the Invalid Input Output. There will be a "reason" shown there why the rows are getting rejected.

To determine why you're getting "new rows" all the time, I'd first rearchitect like I said earlier, then you can examine each output of the component with a data viewer to look at the Row Change Reason column.

Read the full discussion online.

To add a post to this discussion, reply to this email (kimballscd@discussions.codeplex.com)

To start a new discussion for this project, email kimballscd@discussions.codeplex.com

You are receiving this email because you subscribed to this discussion on CodePlex. You can unsubscribe on CodePlex.com.

Please note: Images and attachments will be removed from emails. Any posts to this discussion will also be available online at CodePlex.com

-------------------------------Safe Stamp-----------------------------------
Your Anti-virus Service scanned this email. It is safe from known viruses.
For more information regarding this service, please contact your service provider.
Coordinator
Nov 25, 2010 at 7:17 PM

I've got lots of them :) - that I can't share...

But I do have one that I can, and a video of it should be available shortly on bi.sqlpass.org.  Download this project: DWETLDemo.

That demo shows four ways to process a dimension table.  The dimension table in question represents "computer assets" and has one SCD2 change and one SCD1 change in the data, with appropriate persistence setup on the data flows.  Since it has three other comparable flows - one with the Wizard, one with regular components, and one with T-SQL MERGE, you ought to be able to compare the KSCD with them to understand how it works.  But please post back if anything is still unclear.

I do need to make a better video...