Using SCD Component in File Loop

Apr 20, 2009 at 8:14 PM
Has anyone got any experience of using this component in a File Loop to loop through some csv files in a folder.

My SCD component works fine with one file in the folder, but as soon as a second is added, it hangs.  I have no idea what to try next.  The two files I am trying to load will load individually so I dont understand why they wont go in one execution.  I was wondering if there is some requirement to release the component from memory somehow before re-running on the next file.

Any help much appreciated.
Coordinator
Apr 21, 2009 at 4:41 PM
That sounds interesting, David.  I must admit I've never tried that - and thinking about it, I'm curious as to what business purpose that serves for you.  Are you loading historical data one day at a time?

There isn't anything to my knowledge that hangs around and isn't cleaned up - but I'll add an issue for this so that I don't forget to try it out and test it.

In the meantime, a possible workaround may be to put your dimension processing Data Flow into a separate package, and have your looping container run an Execute Package Task to run it.  At the very worst, you can configured the Execute Package Task to run the child package in a separate process - that will for sure decouple any interactions between subsequent runs.  (I don't like that solution - even if it works!)
Apr 21, 2009 at 9:18 PM
I have separated the SCD into a separate task and this is now working.  Not sure about performance havent got to that bit yet, but in principle this works.

Thanks for the advice - your component is excellent by the way.
Coordinator
Apr 21, 2009 at 10:54 PM
Excellent.  I expect performance to suffer somewhat - probably 15 seconds per iteration.

Your issue is #2812 - keep an eye on that one.
May 1, 2009 at 3:56 PM
Edited May 1, 2009 at 3:57 PM
I recpectfully disagree with placing this into a separate Package and calling using an Execute Package Control Flow task within a For Each Container task.
I would argue that doing that in SSIS is like calling a UDF in TSQ for each and every row when placed in the SELECT clause of a query.

My advice on this pattern is to call the Data Flow itself in a For Each Container task within the Control Flow (which seems to be the way David has designed his solution).
I have used this pattern in almost every data warehouse load to enable user runtime selection of dates to load while running the data flow for each date provided.

You might want to ensure that the keep connection property for the connection managers is set so that each iteration will not waste time reinitializing the connections.
Also note that the speed of SSIS found within BIDS is not representative of that achived when run via DTEXEC at the command line or via SQL Agent.
I'd be interested in understanding more about your procesing details to understand what the loop is doing in relation to the SCD component.

Best of luck.
Coordinator
May 8, 2009 at 8:46 PM

I agree with you, I hate that workaround too.  I just have no idea what's making it not work in a loop at the moment.  Issue 2812 is up to deal with that.

Jun 18, 2009 at 6:16 PM

Hi  I am doing  it in loop ., It is working fine with me. I am loading 30 weeks   file in loop by changing the file name

Coordinator
Jun 19, 2009 at 3:58 PM

FYI - Issue 2812 is "solved" - there is a small change to the source code that will rectify it.  That fix is NOT applied in the current v1.4 release, but it will be in the v1.5 release.  (If I had time, I'd patch 1.4... but I don't, so I won't.)