Special Members Usage

Topics: Issues 2: Using the Component (Design-Time)
Jul 30, 2012 at 1:06 PM

The documentation says: "This flow is entirely optional, but provides a clean way of specifying and maintaining any possible "special" members in your data warehouse dimension table. In typical dimensions, there is an "unknown" member that you can use to link fact table rows to when the fact table row refers to a NULL or otherwise unknown dimension value. ".

And that's exactly what I'm trying to do, after upgrading to 1.6.

I want to use an SK of -1 to indicate 'Unknown', so I create a table with one record

SK = -1; BK = -1; Desc = 'Unknown'

and pre-load the Dimension table a single record with those values (and ValidFrom of 1753-01-01, validTo of 9999-01-01 and ValidNow of True).  

The SK column in the Dimension is an Identity with  Increment = 1 and Seed = 0.

The idea is that the single special member will always equal the dimension record with SK of -1, and therefore that will remain static forever.  BK lookup misses will be mapped to -1 to indicate unknowns.

But on initial load, I'm getting n+1 records going down the New path ( and 1 record going down Unchanged), instead of n records from the Source System path and the existing 1 record being ignored (as it's equal to the special member).

Clearly I am not understanding how Special Member is supposed to be used.  Would you be so kind as to explain how I should be using it?  (If it is documented elsewhere sorry that I missed it; I did search for it, though).

 

Cheers, Donna Kelly

Jul 30, 2012 at 3:18 PM
Edited Jul 30, 2012 at 3:21 PM

Hi Donna,

Sounds like you are using special members as intended.

Have you wired up the existing dimension properly? My hunch is its some sort of mapping issue. Check the 'column mapping' tab of the component and verify all mappings. Also check you are not removing the SK=-1 from the 'existing dimension' input. It has to be included. Also verify that one record is going into the 'Special Members' input.

Jul 30, 2012 at 4:34 PM

WeeeBob,

thanks for your comment.  I checked the column mapping tab.  

Special Members -> Existing Dimension has this: SK->SK; BK->BK; Description->Description

Source System -> Existing Dimension has this: BK -> BK; Description ->Description

The Existing Dimension input has 1 record.

The Special Members input has 1 record.

The Source System has 53 rows.

I sent all the different outputs to different trash destinations.

New . . . 54 rows

Updated SCD 1 . . . 0 rows

Expired SCD 2 with SCD 1 Changes . . . 0 rows

Deleted . . . 0 rows

Unchanged . . . 1 rows 

I changed the Description field in the Special Member to Unknowns and it was found in the New datastream (Row is New) with a ValidFrom of Today.

The Previous single row is EExisting Dimension was the 1 row in the Unchanged datastream.

What's puzzling me is why a row from Existing Dimension would end up in the New output.  Expired, or even Deleted, sure, but New????

Cheers, Donna

Jul 30, 2012 at 4:38 PM

BTW, I tried this with 

a) contents of Description field same and different

b) with 1.4 behaviour routing New SCD2 to New, and without, i.e. to separate output datastream (which remained 0 rows).

HTH, Donna 

Jul 30, 2012 at 4:46 PM

A followup question . . . 

Does anyone know exactly what matching criteria does the component use to determin the equivalency of a Special Member and a row in Existing Dimension?

Is it SK to SK?  BK to BK?  All fields to all fields?  Or what?

BTW, it would appear mandatory to include SK in the Special Members row (although I do recall seeing somewhere in this wiki that it was optional).

FWIW, Donna

Jul 30, 2012 at 9:10 PM

Latest update:  I tried with an empty Source System table and a) all fields identical b) SK changed c) BK changed d) Description changed.

In all cases, the single Existing Dimension record was passed through to the Unchanged datastream, and the Special Member was passed through to the New datastream, with Effective Date of today, Expired Date of 9999-01-01 and Effective Now of True.

This cannot be correct behavior, can it?  I must be doing something wildly wrong, yes?

I'd like to re-iterate my initial question:  how I should be using [Special Member]?

Time to try something new.  I've a notion that I can achieve the same effect without the Special Member input, by using the my special member one-record table in a Union All with the Source System.  I'll still need to pre-load the Existing Dimension with one record with a -1 SK, of course.  All I need to ensure is that I use a non-existent BK to get the necessary match.  We shall see . . .

Cheers, Donna

Jul 30, 2012 at 10:19 PM

<SCREAMS WITH FRUSTRATION!!!>

Well, I tried something new.  Set up a complete new test environment.  1 record on existing dimension, 1 record on my-special, and 2 records on source system. Union All the source system and my-special.  Fed the expected 3 records into Source System.  Fed the one record into Existing Dimension.

Got 1 record output in Unchanged and 3 (NOT 2!!!) as New.  Grrr!

The detail is below.  Note the very last Row is New.  It shouldn't be there, and I do not know how to stop the component from generating it.  I suppose I could kludge it by putting in a filter into the New datastream to exclude any row with SK < 0 . . . but that's yukky.  I'll give it until tomorrow, and then if I can't get it working, I'll just have to abandon the component in favour of T-SQL MERGE . . . unless anyone has any better ideas?

Not cheers, but <anguished, anxious, cheerless, cross, despondent, discomposed, discontent, disgruntled, displeased, disquieted, dissatisfied, dissident, exasperated, fed up, gloomy, irreconcilable, irritated, joyless, malcontent, miserable, non-satisfied, restless, sour, teed off, ticked off, uncomfortable, uneasy, ungratified, unhappy, and wretched/> Donna.

SOURCE QuoteStatusSK QuoteStatusID QuoteStatusDesc ValidFrom ValidTo ValidNow
-1 -1 Unknown 01/01/1753 01/01/9999 True

UNION ALL
QuoteStatusID QuoteStatus
2 Quote Completed
1 Quote Started
-1 Unknown


UNCHANGED
QuoteStatusSK QuoteStatusID QuoteStatusDesc ValidFrom ValidTo ValidNow
-1 -1 Unknown 01/01/1753 01/01/9999 True

NEW
Row Change Reason QuoteStatusSK QuoteStatusID QuoteStatusDesc ValidFrom ValidTo ValidNow
Row is New NULL 1 Quote Started 30/07/2012 31/12/9999 True
Row is New NULL 2 Quote Completed 30/07/2012 31/12/9999 True
Row is New NULL -1 Unknown 30/07/2012 31/12/9999 True
Jul 31, 2012 at 11:37 AM
Edited Jul 31, 2012 at 11:43 AM

Hey Donna,

I've just noticed something. Have you confirmed that the special member is present on the 'Existing Dimension' input? From reading above it sounds like you have the Special members input coming from table called x , but the existing dimension data coming from table y .  The purpose of the special members being input into the component is so that the component won't expire the special members or alter them.

So in effect what you are doing is saying that the special member doesn't exist in the existing dimension data, so the component says hey it should be there, I'll stick it on the 'New' output so it will get added to the dimension.

If you let the component add the special dimension member on the first run, does the problem go away on the next run???

The way I set up special members for a SCD 2 dimension is:

1. Create new dimension table in relational warehouse  (DimDimension).

2. Insert my 'unknown' special member (with a surrogate key of 0, StartDate 1975-10-10 End Date 9999-12-31) into DimDimension .  (I use 1975-10-10 as min date on all my date attributes)

On the SSIS...

3. Set the source of 'Existing Dimension' to be Select * from DimDimension Order By BusinessKey

4. Set the source of 'Special Members' to be Select * from DimDimension Where Surrogate Key=0 ORDER BY BusinessKey

5. Set the source of 'Source System' to be the select * from StagingArea.DimensionData ORDER BY BusinessKey   . This feed never has the unknown member in it...

Try give that a shot. 

 

As for the using the merge SQL instead, yeah that will work. But once you get the hang of the Dimension merge component I find it a lot quicker to dev for my dataloads , which are very low volume wise, dimensions with tens of thousands of rows with only hundreds of row changes a day. Fact tables of millions of rows but only loading tens of thousands a day.

hopefully this may help to turn those screams of frustration into screams of something nicer.

Jul 31, 2012 at 12:35 PM
WeeeBob wrote:

 Have you confirmed that the special member is present on the 'Existing Dimension' input? 

Yes, in an earlier post I said "The Existing Dimension input has 1 record.  The Special Members input has 1 record."

In fact, the way I've been setting up the Special Member is to use the Import/Export Wizard to duplicate the Existing Dimension.  However, your way is much more cunning (I'm referring to the Select * from DimDimension Where Surrogate Key=0 ORDER BY BusinessKey).  

[Why the Order by BK though, given there's only a single unknown?]

As I said, the reason for shoving the Unknown through the source system input (via a union all)  was to try another way of stopping the component from shoving out the New datastream.  Which didn't work! :-(

I shall try your cunning plan . . . although logically, it;s the same as using a duplicate Special Members table.   Also, I like your notion of testing "If you let the component add the special dimension member on the first run, does the problem go away on the next run???".  I think I shall give that a go, too.

A final thought . . . I've been using the Kimball SCD component for several years, and I can't understand why it's gone all growly on me in its latest incarnation.  I think it's been great, which is why I'm investing so much time and effort into trying to get it working properly.  BTW, if you know the answer to "exactly what matching criteria does the component use to determine the equivalency of a Special Member and a row in Existing Dimension?", I'd be pleased to hear it.

Thanks for your help, Donna 

 

 

Jul 31, 2012 at 1:35 PM

[Why the Order by BK though, given there's only a single unknown?]  It's a habit I've gotten into with the SCD component, sometimes I do have more than one special member. I also set the advanced 'issorted' property always. I've never tried running a source component without an Order BY, but with an IsSorted=True, I think it has the possibility to cause all kinds of madness if the order in the data set is not the same as what was specified in the advanced settings. It would be one of those irratating problems that would takes ages to resolve!

"exactly what matching criteria does the component use to determine the equivalency of a Special Member and a row in Existing Dimension?"  I've briefly wondered about this myself, But if using the technique I mention above, then all fields will always be exactly the same anyways! I do still wire up all the fields together between special and existing in the component. I presume you could just select the Surrogate key to match on and get away with it. I think what I'm saying here is on the column mapping tab of the SCD component the items you map together between 'existing dimension' and 'special members' is the criteria used to determine the equivalancy. 

I'm intrigued as to why you would keep your special members on a seperate table (if that is what you mean by using the import/.export wizard to duplicate)? Are you saying you never have the unknown member in the actual dimension? presumably these means you don't enforce referential integrity then, as it wouldn't work? As any fact that has its DimDimensionID foregin key set to zero would fail integrity check?  Or is it so you can update the unknown member by some method? For me an update to the Unknown member has to raised as a Change request from the business and I just write a SQL update statement to update DimDimension where Surrogate Key=0 for the fields required.

btw I use redgates SQL source control for control of my DDL and on a version checkin that creates a new Dimension table, I also create a migration script that creates the unknown member, so when I then add the foreign key onto an existing Fact table I set a default of 0 and non nullable , so all previously loaded facts point to the unknown member, and I can then if I so desire switch on referential integrity for the field.  Sometimes I've also created another special member for 'Before Data was available' and pointed previously loaded facts to this special member

 


 

 


Jul 31, 2012 at 9:08 PM
WeeeBob wrote:

 

I'm intrigued as to why you would keep your special members on a seperate table (if that is what you mean by using the import/.export wizard to duplicate)? Are you saying you never have the unknown member in the actual dimension? presumably these means you don't enforce referential integrity then, as it wouldn't work? As any fact that has its DimDimensionID foregin key set to zero would fail integrity check?  Or is it so you can update the unknown member by some method? For me an update to the Unknown member has to raised as a Change request from the business and I just write a SQL update statement to update DimDimension where Surrogate Key=0 for the fields required.

Well, yes, I never twigged that Special Members could be a Select from Existing Dimension. It really was as simple as that; I wasn't trying to do something clever.  Doh!

Referential integrity?  Sure I do.  And does the unknown member live in the Existing Dimension?  Sure it does.

But until this project, I've been using the Kimball SCD component, and had no problems with it.  This is the first time I've used the 1.6 Dimension Merge Component.

Change requests?  Source Control?  We don't use no stinkin' Change Requests! :-) 

(I've been in charge of all of the BI programmes for my clients, for quite some time now.)

Jul 31, 2012 at 9:43 PM
Edited Jul 31, 2012 at 11:20 PM

At this point, I'm stymied.  Here's the situation:

Two input tables, QSource and Dim_Q.  Trash destinations on every output (9 in all).  The Special Member OLE DB input component has this statement:

SELECT     QuoteStatusSK, QuoteStatusID, QuoteStatusDesc
FROM         rdw.Dim_Q
WHERE     (QuoteStatusSK = - 1)

Beginning Debug, with Data Viewers on just about everything.  QSource has 2 rows. There are 2 columns.  Here's the data:

QuoteStatusID QuoteStatus
1 Quote Started
2 Quote Completed

Click proceed on the data viewer, and Special Members has 1 row.  There are 3 columns.  Here's the data:

QuoteStatusSK QuoteStatusID QuoteStatusDesc
-1 -1 Unknown

Click proceed on the data viewer, and Dim_Q has 1 row.  There are 6 columns.  Here's the data:

QuoteStatusSK QuoteStatusID QuoteStatusDesc ValidFrom ValidTo ValidNow
-1 -1 Unknown 01/01/1753 01/01/9999 True

------------------- now the component is ready to run---------------------------

Click proceed on the data viewer, and we immediately get 4 fresh data viewer pop-ups from 4 of the 9 output datastreams.  Apart from Auditing and Statistics, we also get data viewers on the New and Unchanged output streams. 

Here's the data on the New output stream:

Row Change Reason QuoteStatusSK QuoteStatusID QuoteStatusDesc ValidFrom ValidTo ValidNow
Row is New NULL 2 Quote Completed 31/07/2012 31/12/9999 True
Row is New -1 -1 Unknown 31/07/2012 31/12/9999 True
Row is New NULL 1 Quote Started 31/07/2012 31/12/9999 True

Here's the data on the Unchanged output stream:

QuoteStatusSK QuoteStatusID QuoteStatusDesc ValidFrom ValidTo ValidNow
-1 -1 Unknown 01/01/1753 01/01/9999 True

The red one shouldn't be there of course, and the suggestion earlier of "If you let the component add the special dimension member on the first run, does the problem go away on the next run???" doesn't fly, I'm afraid, as trying to insert a -1 key when there's already a -1 row present fails PK constraints.

This is not the behavior I recall from version 1.5.  Speaking bluntly, this doesn't seem right to me, and I can't afford to waste any more time on it.   Maybe I've got the whole Special Members usage thing wrong . . . if so, I'd appreciate someone putting me right.

Anyway, I just removed the Special Members input completely, and the red row disappeared from the New datastream.  OK, but where did it go?  Surely, the -1 row in Existing Dimension should have gone to Expired SCD2?  It didn't.  There were just the 2 records in New, and

Clearly, I have no idea of how Special Members is supposed to work, and I cannot - under any of the scenarios I've tried - make it do what I need it to do.  And I have tried all of the scenarios I can envisage.  GRRRR!

There's only one other thing I can think of to try:  remove the 1.6 component and re-install 1.5, and see what happens.  I cannot think of anything left to try regarding 1.6.

 I will say thank you very much, WeeeBob, for your support and suggestions.  It has been kind of you.  I will do another post to let people know how I get on with 1.5.

Cheers, Donna