Input
The Input sub-processor of matching processors is used to map attributes from input data streams to matching processors.
The Input sub-processor is a necessary part of matching, used to control the data that is used in the matching process.
Normally, all attributes from each input data stream are included in a matching process. However, you may want to vary the attributes used in matching, and include only those that you need to match on, to review possible matches, or to make output selections.
Note: For versions of OEDQ older than 7.0, it was also necessary to configure the selection of input attributes carefully, as all input attributes would be included in the Decision Key used to re-apply ('remember') manual match decisions. However, it is now possible to configure which of the input attributes to use in the Decision Key - see Advanced options for match processors.
For example, from a typical Customer table, the following attributes might be included in a matching process:
Purpose | Attributes
Needed for matching | First_name, Surname, Birth_date, Address_1, Postcode, Home_tel_number
Needed for the review of possibly matching records | Title, Address_2, Town, County, Customer_type
Needed to identify specific records for data updates | Customer_ID
Needed to make output decisions (for example, to choose the most recent record) | Last_modified_date, Has_active_account
A number of other attributes in the source data might be excluded from the matching process.
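The selection in this example can be pictured as a grouping of attributes by purpose. The following Python sketch is illustrative only: it is not OEDQ configuration or an OEDQ API, and the names `SELECTED_BY_PURPOSE` and `split_attributes`, as well as the extra source columns `Fax_number` and `Marketing_code`, are hypothetical.

```python
# Illustrative sketch only: models the attribute selection described above.
# It is not OEDQ configuration; all names here are hypothetical.

SELECTED_BY_PURPOSE = {
    "matching": ["First_name", "Surname", "Birth_date",
                 "Address_1", "Postcode", "Home_tel_number"],
    "review": ["Title", "Address_2", "Town", "County", "Customer_type"],
    "record_identification": ["Customer_ID"],
    "output_selection": ["Last_modified_date", "Has_active_account"],
}


def split_attributes(source_attributes):
    """Return (included, excluded) lists, preserving source order."""
    wanted = {name for names in SELECTED_BY_PURPOSE.values() for name in names}
    included = [a for a in source_attributes if a in wanted]
    excluded = [a for a in source_attributes if a not in wanted]
    return included, excluded


# Hypothetical source table: the example attributes plus two extra columns.
source = [name for names in SELECTED_BY_PURPOSE.values() for name in names]
source += ["Fax_number", "Marketing_code"]

included, excluded = split_attributes(source)
print(len(included), excluded)
```

Keeping the purpose next to each attribute makes it easier to see later why an attribute was brought into the matching process.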
To input data into matching, you first need to connect the data stream(s) to the match processor on the canvas. Note that the number and type of data streams accepted depend on the type of match processor, as follows:
Match processor type | Accepts input data streams
Group and Merge | A single working data stream
Deduplicate | A single working data stream
Enhance | A single working data stream, and any number of reference data streams
Link | Any number of working and reference data streams
Consolidate | Any number of working data streams
Advanced match | Any number of working and reference data streams
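The limits in the table above can be written down as data and checked before streams are connected. This is a minimal sketch, not an OEDQ API: `STREAM_RULES` and `accepts` are hypothetical names, and the Enhance entry is an assumption inferred from the Enhance example on this page (one working and two reference data streams).

```python
# Illustrative sketch only: encodes the table above as data.
# Not an OEDQ API; names are hypothetical. None means "any number".

STREAM_RULES = {
    # processor type: (max working streams, max reference streams)
    "Group and Merge": (1, 0),
    "Deduplicate": (1, 0),
    "Enhance": (1, None),        # assumption: inferred from the Enhance example
    "Link": (None, None),
    "Consolidate": (None, 0),
    "Advanced match": (None, None),
}


def accepts(processor_type, working, reference):
    """Check stream counts against the table; only upper limits are enforced."""
    max_working, max_reference = STREAM_RULES[processor_type]
    if max_working is not None and working > max_working:
        return False
    if max_reference is not None and reference > max_reference:
        return False
    return True


print(accepts("Deduplicate", 2, 0))   # two working streams exceed the limit
print(accepts("Link", 2, 3))
```

The sketch checks only the upper limits stated in the table; it does not model any minimum number of streams.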
Data streams are connected to match processors either directly from Readers, or from output filters of other processors.
The example below shows three data streams being connected to an Enhance processor - one working data stream, and two reference data streams:
Once the data streams are connected, you can use the Inputs dialog to select attributes, in the same way as for all processors. Note that a tab appears for each connected data stream:
Two additional options appear for each input data stream when configuring the inputs for a match processor (except Group and Merge):
Compare against self - this option allows you to change whether or not the match processor will look for matches within the data stream (rather than between data streams). This option is set to the most likely default depending on the type of match processor. Note that working data streams are always compared with each other, and reference data streams are never compared with each other.
Enabled - this option allows you to retain the configuration of an input data stream, but to switch its use in the match process on or off - for example, to run a match of some working data against some, but not all, configured reference data streams.
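The combined effect of these two options on which data streams are compared can be sketched as follows. This is an illustration under stated assumptions, not OEDQ behavior as implemented: `InputStream` and `compared_pairs` are hypothetical names, the stream names are invented, and the sketch takes the Compare against self flag at face value for whichever stream sets it.

```python
# Illustrative sketch only; not an OEDQ API. All names are hypothetical.
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class InputStream:
    name: str
    kind: str                          # "working" or "reference"
    compare_against_self: bool = False
    enabled: bool = True


def compared_pairs(streams):
    """List the (name, name) stream pairs in which matches would be sought."""
    active = [s for s in streams if s.enabled]   # disabled streams are skipped
    pairs = []
    # Within-stream comparison, controlled by "Compare against self".
    for s in active:
        if s.compare_against_self:
            pairs.append((s.name, s.name))
    # Between-stream comparison: skipped only when both are reference streams.
    for a, b in combinations(active, 2):
        if a.kind == "reference" and b.kind == "reference":
            continue
        pairs.append((a.name, b.name))
    return pairs


streams = [
    InputStream("Customers", "working", compare_against_self=True),
    InputStream("Watchlist_A", "reference"),
    InputStream("Watchlist_B", "reference", enabled=False),
]
print(compared_pairs(streams))
```

Here the disabled `Watchlist_B` stream keeps its configuration but takes no part in the comparison, which mirrors the purpose of the Enabled option.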
Oracle® Enterprise Data Quality Help version 9.0
Copyright © 2006, 2011 Oracle and/or its affiliates. All rights reserved.