Audit value for new rows

Topics: Issues 2: Using the Component (Design-Time)
Aug 27, 2010 at 11:04 AM
Edited Aug 27, 2010 at 11:32 AM

In the past, I have stored an audit key, unique to the execution of the package, against both new and changed rows in a dimension. The Kimball SCD control allows me to allocate a user variable to changed rows, but I would also like to use the same value for new rows. The Auditing tab allows me to set a 'Default Value For New Rows', however if I type in a variable, e.g. User::AuditKey, it crashes the control, returning me to the SSIS design screen. Does anyone have a solution for this?

Further to the above, the 'Default Value For New Rows' doesn't allow -1 to be set. It allows it to be entered, but it doesn't save. If you click on Finish, then go back into the control properties, the default value resets to 0. Again, any suggestions?

Coordinator
Aug 28, 2010 at 11:53 PM

If you're using v1.5 or higher of the component, you should see more than one row available on the Auditing tab to specify variables.  The number of rows on the auditing tab depends on how many columns you've identified as Auditing columns on the Existing Dimension tab.  You can identify a column as an "audit add" column - it doesn't sound like you've done that.  Second, you don't have to type anything in to the "variable" column on the auditing tab - just pull down the combobox and you should see a list of variables.

I do see the problem with setting a default value of -1.  Not sure what's causing that at the moment, but I'm going to create a workitem for that.

Nov 1, 2010 at 7:57 PM
Edited Nov 1, 2010 at 9:46 PM

Hi guys, am just trying to get my head around the auditing implementation.

Initially I thought that the 4 Audit type fields should be datetime fields and they record datetimes for each corresponding action. Elsewhere on this site I saw an indication that a date would be appropriate for Default Value for new rows.

However I now don't think this is right.

Now I think I understand that there can be an audit table which has a single key. This Audit key can be assigned to dimension rows in one of the 4 audit field types and the component will assign the "Auditing Value" for each value accordingly. However the thing that it confusing me is why does the system allow there to be 4 different values? Surely as there is only ever one Auditing record produced - there only needs to be one key? (Here I am talking about the key in the Audit Table, not the <= 4 that are in the dimension table, I get the need for those)

Additionally what is the recommended method for setting the variable(s) - do you have a control flow task that works this out before the DataFlow task or some other trick?

And what is the purpose of the "Default Value for new rows" column? If it is has to be a variable, can't you just rely on the variable default value?

Sorry, there isn't any documentation that I can find for this and although I have read the DW Lifecycle toolkit book and understand the concepts, this level of detail has passed me (maybe this is detailed in the ETL books that I have yet to buy)

Nov 2, 2010 at 5:49 PM

Further to this - I have just noticed that the Invalid Input (output) Audit Columns (Row added and Last Changed) always seem to be null.

Is this expected behaviour?

I was hoping to be able to write the output to a "Invalid Input" table and have the Audit fields link to the appropriate execution run?

Coordinator
Nov 9, 2010 at 5:28 PM

Better documentation would be great, right?  Right. ;)

Yes, the Audit column is primarily intended to be given an integer value that really maps into an "audit dimension table" that has one row per "load".  However, I made it such that you could throw in dates or strings, or whatever.

Why the default value?  Why not "just use" the variable?  Because it doesn't make much sense to fill in the "SCD2 Changed" with the auditing "key" that you've defined for "adds".

Nov 9, 2010 at 6:12 PM

Thanks Todd, that has clarified things.

So the (normal) practice would be that before running the component -

  1. You create a new Audit dimension record
  2. Capture the id
  3. Load that id into a variable 
  4. Provide this variable value to all the 4 Audit columns (or as many as you have used)

If the Auditing tab just had a single "Audit " variable and default value I think I would have understood it intuitively. I have thought hard but I can't imagine a case where these 4 values would differ from each other, is there any common case this would happen?

I understand the default value now - I am not using SCD2 currently hence my confusion - however I had anticipated this would be handled by default values on the "New" output or default values on the table but can see that this provides simplification in some cases.

How about the behaviour I reported where the Audit * Columns are blank on the "Invalid Input" output . Is this expected or is it a bug?

Coordinator
Nov 25, 2010 at 6:19 PM
Edited Nov 25, 2010 at 6:21 PM

Your interpretation is correct - that's exactly what I do.

I have had specific requests for "more flexible" auditing - and that's why there are four different capabilities there.  You and I find a number - a surrogate key reference to an audit dimension - perfectly acceptable.  But if I recall correctly, a few others were wanting to put dates or strings or GUIDs in there, and/or were wanting a few rows in their audit dimension referencing the different actions.

I'll have to look at the Audit columns on the Invalid Input output.  I'm thinking they shouldn't be there at all - since it's not an "output" row... but it could be useful, couldn't it?