Options Tab

<previous tab (I/O Counters) | next tab (Debugging)>

Definitely the most complicated part of the configuration. It consists of several sections:
  • String comparison options
  • SCD2 date handling options
  • Surrogate key generation options
  • Warning suppression
  • Parallelism options

String Comparisons for Business Keys and SCD1/2 Columns

For all string columns (DT_STR and DT_WSTR), these options define how matches are determined. One culture identifier can be selected for character interpretation. Options for case- and space-sensitive comparisons for Business Keys, SCD1, and SCD2 fields are available.
The case-sensitivity option determines whether "Value" is considered equivalent to "value".
The space-sensitivity option determines whether "_Value_" is considered equivalent to "Value" (spaces shown as underscores).

Date Handling for SCD Type 2

These options instruct the component how to handle assignment of effective and expiry dates in the SCD2 housekeeping columns.
Since loading of the dimensions could occur in a historical load, the component has an option to accept a variable that contains "today's date". The value for "today's date" is used in the following options.
Depending on your business users' needs for the data warehouse, your SCD2 date assignment may vary. As such, when the component detects that it needs to "retire" a version of a row and create a "new" version, it needs instructions on how to mark up the effective and expiry dates on each row. The "record changes by marking" option holds those instructions, allowing you to specify if the expiry date of the old record is "two days ago", "yesterday", or "today".
To complement those options, the SCD component needs to know the extremes of the date range for the effective and expiry dates, so that its assignment of the effective date of the "first" version and expiry date of the "last" version fit the range of the datatype in the dimension table, as well as the comparison semantics your OLAP tools are capable of understanding. At the moment, there are four choices:
  1. The minimum or maximum value of SQL Server's DateTime datatype
  2. The minimum or maximum value of SQL Server's SmallDateTime datatype
  3. A null value
  4. An explicitly (manually) defined date

Surrogate Key Generation

If your Existing Dimension input identifies a Surrogate Key column on the Existing Dimension Column Usage Tab, this section will be available. If you do not identify a Surrogate Key column, the component won't generate keys for you, so the options are disabled.
For both the key seed and the increment, the option exists to specify the values manually, or specify them via package variables.
There is also a performance-boosting option to inform the component that the value you have chosen for the seed is guaranteed to be unused in the Existing Dimension. If all of the inputs are also sorted by the Business Key, this permits the component to begin processing and outputting rows (reducing memory pressure) before all rows arrive on the inputs. If you can not guarantee the seed is unused, the component will have to wait for all inputs to be completely received before it begins processing. The simplest method of using this performance booster is to have a script component or OLE DB Command component earlier in another preceding data flow (not this data flow!) collect the highest (or lowest, depending on your increment setting) value seen in the Existing Dimension.

Warning Suppression

Check these boxes to acknowledge the performance and surrogate key generation warnings you may be receiving, and suppress their display.

Number of Processing Threads

The component has the ability to spawn several worker threads to process input rows in parallel, improving throughput. The default setting of "automatic" instructs the component to use the main thread to receive rows from the inputs, and spawn a number of worker threads equivalent t two times the number of CPU cores to "process" the accumulated rows. If you manually specify a thread count, it applies to the processing worker threads only.

<previous tab (I/O Counters) | next tab (Debugging)>

Last edited Apr 12, 2009 at 7:10 PM by toddmcdermid, version 6

Comments

blakmk Jan 5, 2010 at 3:50 PM 
For the surrogat key generation, is there a plan to populate this from an idenity column?