Similarity task

From the task type screen, select Similarity.

Adding a Similarity task

Task Name

Enter a task name. Task names must be unique within a project.

Source Connection

Select the source from the Source Connection drop down. The connections list is based on the connections defined in Connections. Then select the schema from the Source Schema drop down. Only schemas that already exist in the selected data connection can be selected. Note: The Source Connection MUST be a SQL Server Connection.

Target Connection

Select the target from the Target Connection drop down. The Connections list is based on the SQL Server connections defined in Connections. Note: The Target Connection MUST be an MDS Connection and must reside on the same server as the source.
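
The connection rules in the two notes above can be checked before a task is saved. The following is a minimal sketch only: the Connection class, its fields, and validate_connections are hypothetical names used for illustration and are not part of Data Governor. It also assumes that "same server" can be tested by comparing server names without regard to case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Connection:
    """Hypothetical description of a connection (not a Data Governor type)."""
    name: str
    kind: str    # assumed labels, e.g. "SQL Server" or "MDS"
    server: str


def validate_connections(source: Connection, target: Connection) -> List[str]:
    """Return a list of rule violations; an empty list means the pair is valid."""
    errors = []
    if source.kind != "SQL Server":
        errors.append("Source Connection must be a SQL Server connection.")
    if target.kind != "MDS":
        errors.append("Target Connection must be an MDS connection.")
    # Assumption: server names are compared case-insensitively.
    if source.server.lower() != target.server.lower():
        errors.append("Target Connection must reside on the same server as the source.")
    return errors
```

For example, a SQL Server source and an MDS target on the same server produce no errors, while a target on a different server produces one.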

Tip: Target connections can only be of the following types: Files, Blob Storage, SharePoint, and SQL Databases.

Source Table/View

Select the source. The tables and views listed on the left will be based on the selected Source Connection and Source Schema.

MDS Model

The MDS Model in which the Master Cluster and related child records will be created. This Model must be pre-configured in accordance with the instructions in the Similarity Task Guide.

Select the MDS model you want to load the source data into.

Tip: The MDS model must already have been created.

MDS Entity

The MDS Entity in which the clustered source records will be created (refer to Similarity Task Guide).

Select the MDS entity the data will be loaded into.

Tip: Only entities that have already been created in the selected model will be available.

Similarity Step

This is the step within the Similarity process to be executed. The Similarity task consists of 3 steps, all of which are executed by default. To execute a specific step, select one of the following options from the drop down.

The available Similarity step options are:

  • Filter & Trial Match - initial match; this step runs the full filtering and matching process, creating a DGStg.SimilarityCluster table in the source database for review and refinement.
  • Re-Match - reruns the matching process, using the filters generated in Step 1. Use this step while refining the matching rules.
  • Import to MDS - exports the clustered members to MDS for verification and manual adjustment.
  • Filter, Match & Import to MDS - runs all three steps in sequence. This is the default.
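
To make the relationship between the options easier to see, the sketch below maps each drop-down option to the phases it is assumed to run. The Phase names, the STEP_OPTIONS table, and the phases_for function are illustrative assumptions, not Data Governor identifiers. In particular, the split into filter, match, and import phases is inferred from the option descriptions above.

```python
from enum import Enum
from typing import List


class Phase(Enum):
    """Assumed phases, inferred from the option descriptions."""
    FILTER = "filter"
    MATCH = "match"
    IMPORT = "import to MDS"


# Hypothetical mapping of each drop-down option to the phases it runs.
STEP_OPTIONS = {
    "Filter & Trial Match": [Phase.FILTER, Phase.MATCH],
    "Re-Match": [Phase.MATCH],  # reuses the filters from the first step
    "Import to MDS": [Phase.IMPORT],
    "Filter, Match & Import to MDS": [Phase.FILTER, Phase.MATCH, Phase.IMPORT],
}

# Assumption: the combined option is the default, since all steps run by default.
DEFAULT_OPTION = "Filter, Match & Import to MDS"


def phases_for(option: str = DEFAULT_OPTION) -> List[Phase]:
    """Return the phases run by the given Similarity Step option."""
    return STEP_OPTIONS[option]
```

Read this way, a typical refinement cycle is one Filter & Trial Match run, several Re-Match runs while the matching rules are adjusted, and then a single Import to MDS.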

Advanced Settings

Toggle to show or hide the advanced settings. Advanced settings are different for each task type.

Subject Areas:

Select a Subject Area for the task, if applicable. Refer to Using a Subject Area for more information.

Logging:

Task logging is optional.

Select a logging level. Logging options vary depending on the task type selected.

The available logging options for a Similarity task type are:

  • Standard - the default logging level as provided by Data Governor
  • Debug - the standard logging level with additional logging to assist with investigation of issues

Tip: If both job logging and task logging are on, the task logging level overrides the job logging level only when job logging is set to Standard. Otherwise, job logging takes precedence.
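
The precedence rule in the tip above can be written as a small function. This is a sketch under stated assumptions, not how Data Governor implements it: effective_logging_level is a hypothetical name, None stands for "logging is off", and the behaviour when only one of the two is on (that level is used) is an assumption that the tip does not cover.

```python
from typing import Optional


def effective_logging_level(job_level: Optional[str],
                            task_level: Optional[str]) -> Optional[str]:
    """Resolve the logging level applied to a task.

    None means that kind of logging is switched off (an assumption).
    """
    # Assumption: if only one kind of logging is on, its level applies.
    if job_level is None:
        return task_level
    if task_level is None:
        return job_level
    # Both are on: task logging overrides only a Standard job level.
    if job_level == "Standard":
        return task_level
    # Otherwise job logging takes precedence.
    return job_level
```

For example, a job at Standard with a task at Debug resolves to Debug, while a job at Debug with a task at Standard also resolves to Debug, because job logging takes precedence.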

Action Buttons

When you have entered all the necessary task details, click Save.

Information: Task saved!

An information box will appear to confirm that the task has been successfully saved. Click Close.

Tip: New tasks are added to the bottom of the list as Enabled.