SCD 1.4 Hanging

Dec 3, 2009 at 9:13 PM

I have a very wide source (109 columns, about 1,400 rows) going into an equivalent dimension.

Everything is SCD1, ALLOW NULLS.

The transformation just pegs the CPU at 100% and hangs while processing the Kimball Method transform.

An equivalent SQL MERGE runs in a second or two.

Any ideas?
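
For context, the Type 1 logic being compared here (insert rows with new business keys, overwrite changed attributes in place) can be sketched in a few lines. This is a minimal illustration in Python, not the component's implementation and not the MERGE statement mentioned above; the function name `classify_scd1` and the way rows are represented are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def classify_scd1(
    source: List[Row], dimension: List[Row], business_key: str
) -> Tuple[List[Row], List[Row]]:
    """Split source rows into (new, changed) relative to the dimension.

    Hypothetical helper: rows are compared on every source column, and
    None compared with None counts as unchanged (columns allow NULLs).
    """
    # Index the existing dimension by business key for O(1) lookups.
    existing = {row[business_key]: row for row in dimension}
    new_rows: List[Row] = []
    changed_rows: List[Row] = []
    for row in source:
        current = existing.get(row[business_key])
        if current is None:
            # Business key not in the dimension yet: an insert.
            new_rows.append(row)
        elif any(current.get(col) != value for col, value in row.items()):
            # At least one attribute differs: a Type 1 overwrite.
            changed_rows.append(row)
    return new_rows, changed_rows
```

With 109 columns and roughly 1,400 rows this comparison is a small amount of work, which is consistent with the MERGE finishing in a second or two.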

Coordinator
Dec 4, 2009 at 12:15 AM

Not many.

My widest dimension has 78 columns, half type 2, half type 1 - with 1.3 million rows, so I don't think it's the size that's the issue.

Are there any messages in the Output window?  Warnings, errors?  Do you have any free RAM?

If you empty out the "source system" (put a Conditional Split before the component) as well as the "existing dimension" flow (meaning no records go in), does it finish?  If you zero out only one of the sources, does it work?

What data types do you have for your business key and surrogate key?  What types do you have in your other columns?

Dec 4, 2009 at 5:09 AM

There are a lot of messages about unused columns (I am only using the New and SCD1 Update outputs, but I don't know how to turn off the other types).

There are about 4 gigs of RAM available, but it really is the CPU pegging out.

The business key is VARCHAR(50) and the surrogate key is an integer (identity, not used in the transform). The other columns are a mix of varchar, int and datetime.

I will try some of those other things too.

Dec 16, 2009 at 10:10 PM
Edited Dec 16, 2009 at 10:30 PM

I have a similar issue with the CPU maxing out and hanging in the Kimball 1.4 transform. It only started happening when I added a single Type 2 column. It was running fine with all Type 1 columns. Also, it processes an inconsistent number of source rows before hanging. And, in one case, it completed the whole transform and load.

My surrogate key is int. My business key is a composite key of 2 columns: one varchar and one nvarchar. Any suggestions?

This is an edit to my original. I have simply restarted my BIDS session twice, and each time, immediately after the restart, the transform ran successfully. However, if I repeat the process several times in one BIDS session, I get back into a hanging situation with the Kimball transform.

Coordinator
Dec 17, 2009 at 9:49 PM

There is an issue with using v1.4 inside a loop - some of its internal state variables weren't getting reset. I can't recall which issue that was, or whether I fixed it in a later v1.4 release or just in v1.5. Hopefully v1.5 will address your issue. If not, post back.
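
The kind of bug described here, per-run state that survives from one execution to the next, can be illustrated generically. The sketch below is Python rather than the component's actual code, and the names (`KeyMatcher`, `_seen`, `pre_execute`, `run_in_loop`) are hypothetical; it only shows why a second pass through a loop can misbehave unless state is cleared at the start of each run.

```python
class KeyMatcher:
    """Toy stand-in for a component that caches keys while it runs."""

    # Class-level (static-like) state: shared by every run in the process.
    _seen: set = set()

    def pre_execute(self, reset: bool) -> None:
        # Clearing per-run state here is the fix; skipping it is the bug.
        if reset:
            KeyMatcher._seen.clear()

    def process(self, keys):
        """Return the keys that were not already in the shared cache."""
        fresh = [k for k in keys if k not in KeyMatcher._seen]
        KeyMatcher._seen.update(keys)
        return fresh


def run_in_loop(reset: bool, iterations: int = 2):
    """Run the same input several times, as a loop container would."""
    KeyMatcher._seen.clear()  # start each demonstration from a clean slate
    results = []
    for _ in range(iterations):
        matcher = KeyMatcher()
        matcher.pre_execute(reset)
        results.append(matcher.process(["A", "B"]))
    return results
```

With `reset=False` the second iteration sees keys left over from the first and reports nothing new; with `reset=True` every iteration behaves the same way.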

Dec 22, 2009 at 10:29 PM

OK, but I think I'll have to wait until v1.5 gets out of beta.

Dec 24, 2009 at 4:42 PM

Todd, hmmm... I just hate to ditch this component until 1.5 is out of beta. My problem only seems to happen for large dimensions (over 200k rows) and only when I run the package; that is, it runs successfully when I am in debug mode and simply run the dataflow. Let me know if you have any further thoughts on things I could try in 1.4. Thanks.

Dec 28, 2009 at 4:38 PM

I've been doing some more testing on v1.4. It seems that the package only hangs when I execute it as a child package that runs as an object within a Master Dimension package. Please let me know if that should make a difference or if there is some other property that I may need to set to get it to work consistently as a child package.

Coordinator
Dec 31, 2009 at 4:43 AM
Edited Dec 31, 2009 at 4:44 AM

Sorry for the delay.  I think I finally found the root of the problem in v1.5.  I'm currently in the process of reworking some package architecture of my ETL processes, and will be completely reloading my DW with four years of data.  Since I was able to (unreliably) repro the hang issue while testing the package architecture changes, a full reload should stress test it wonderfully to make sure.

I don't happen to have any processes that use the component in a child package... although I can't conceive of why that would make a difference.  I'll check again for static variables that aren't getting reset with every run.  I know that was giving me an issue in loops before, but I'd cleaned that up.  Perhaps I introduced some more sloppiness...

No ETA on when v1.5 will be out of beta, but I'm feeling as though the reload stress test will validate it enough for me to release it.

Jan 3, 2010 at 8:17 PM

Todd, thanks for the follow-up. I must admit that the Kimball component provides such function/performance that I can't really let it go. During some additional testing, I finally noticed some SSIS logging errors about a shortage of memory for a 10k row lookup transformation upstream in the same dataflow as the Kimball transform. I was not able to completely recreate this error, but immediately took a look at the server memory resources. It turns out we were only enabled for 6 GB RAM on the SSIS server when we were told we were getting 12 GB. I immediately requested an increase in memory to 12 GB. We have not had *any* hanging problems after the increase in memory. I find this strange in that the successful jobs are only peaking out the OS at 6 GB of memory consumed, so I thought that even an undersized 6 GB of memory with a matching swap file would have handled these SCD jobs.

Nevertheless, the Kimball transforms have been running daily since Dec 29 on the server with the 12 GB RAM. Hopefully, I'm not being too cavalier in deciding to move forward with the 1.4 Kimball transform and move these jobs into production at the end of January. In the meantime, I will look forward to the 1.5 release.

Coordinator
Jan 4, 2010 at 1:01 AM

Thanks for the support. I'm mad at myself for not having enough time to get 1.5 ready - but I plan to reload my own DW this week as a stress test. The one real problem with the KSCD and memory is that I'm not swapping to disk. SSIS doesn't have any facility to let me do this using any of their code - not that I think I'd be able to use anything they had - it would probably be too specific to their uses. It would be nice if I knew how to detect a near out-of-memory condition though...
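
One possible way to approximate a "near out-of-memory" check is to poll available physical memory between batches and compare it with a threshold. The sketch below is Python, not SSIS or .NET code; `near_out_of_memory`, the `probe` callable and the 10% default threshold are all assumptions for illustration, and a real probe would have to call an OS-specific API to get the numbers.

```python
from typing import Callable, Tuple


def near_out_of_memory(
    probe: Callable[[], Tuple[int, int]],
    min_free_fraction: float = 0.10,
) -> bool:
    """Return True when free memory drops below the given fraction.

    `probe` is a hypothetical callable returning (available, total)
    in the same unit; it is injected so the check itself stays
    platform-independent and easy to test.
    """
    available, total = probe()
    if total <= 0:
        raise ValueError("total memory must be positive")
    return available / total < min_free_fraction
```

A component could call a check like this every few thousand rows and fail with a clear message, or start spilling cached rows to disk, instead of hanging.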

Jan 5, 2010 at 9:50 PM

Oh yeah,  I will post "near out of memory" detection as an issue.  And, I think that I'll vote at least 10 times for it!