Data Quality Rules

This section provides a reference for the applicable Data Quality (DQ) rules (Cleansing, Matching, Merging, and Remediation).

Cleansing

  • Names

    If the name parts are populated, use them to populate the full name. If the full name is populated, parse it to populate the name parts.

    Requirements

    • None

    Tags

    • ERR_NAME_BLANK
    • ERR_LAST_NAME_BLANK
  • Social Security Number

    Standardize SSNs to xxx-xx-xxxx. You can tag invalid or questionable values.

    Requirements

    • None

    Tags

    • ERR_SSN_NO_9_DIGIT
    • ERR_SSN_ZEROS_IN_GROUP
    • ERR_SSN_UNACCEPTED_NUMBER
    • ERR_SSN_USED_FOR_ADVERT
    • ERR_SSN_BLACKLISTED
    • ERR_SSN_ZEROS_ADDED
    • ERR_SSN_NOT_A_SSN
  • Email

    Validate email addresses.

    Requirements

    • None

    Tags

    • ERR_EMAIL_INVALID
    • ERR_EMAIL_TLD_MISSING
    • ERR_EMAIL_DOMAIN_ONLY
    • ERR_EMAIL_AT_SIGN_MISSING
    • INF_EMAIL_SUSPICIOUS
    • INF_EMAIL_CLEANSED
    • ERR_EMAIL_WEB_ADDRESS
  • Phone

    Validate phone numbers and standardize to (xxx) xxx-xxxx format.

    Requirements

    • None

    Tags

    • ERR_PHONE_NOT_A_NUMBER
    • ERR_PHONE_TOO_SHORT
    • ERR_PHONE_BLACKLISTED
    • ERR_PHONE_AREACODEINVALID
    • ERR_PHONE_CO_CODE_INVALID
  • Date of Birth

    Requirements

    • None

    Tags

    • ERR_DOB_BLACKLISTED
    • ERR_DOB_IN_FUTURE
  • Country

    Standardize to the ISO3 country code.

    Requirements

    • None

    Tags

    • ERR_UNRECOGNIZED
    • ERR_AMBIGUOUS
  • Address

    Cleanse, enhance, standardize, and geocode addresses.

    Requirements

    • Loqate for address cleansing and verification

    Tags

    • ERR_ADDRESS_INVALID
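
As an illustration of the standardize-and-tag pattern used by the cleansing rules above, here is a minimal Python sketch for SSNs and phone numbers. The function names and return shape are hypothetical, and the conditions that trigger each tag are assumptions inferred from the tag names; this reference does not define them.

```python
import re
from typing import List, Optional, Tuple

Result = Tuple[Optional[str], List[str]]  # (cleansed value or None, tags)


def cleanse_ssn(raw: Optional[str]) -> Result:
    """Standardize an SSN to xxx-xx-xxxx and tag questionable values."""
    digits = re.sub(r"\D", "", raw or "")
    if len(digits) != 9:
        # Assumed meaning of the tag: the value does not contain 9 digits.
        return None, ["ERR_SSN_NO_9_DIGIT"]
    area, group, serial = digits[:3], digits[3:5], digits[5:]
    tags: List[str] = []
    if area == "000" or group == "00" or serial == "0000":
        # Assumed meaning of the tag: one of the three groups is all zeros.
        tags.append("ERR_SSN_ZEROS_IN_GROUP")
    return f"{area}-{group}-{serial}", tags


def cleanse_phone(raw: Optional[str]) -> Result:
    """Standardize a 10-digit phone number to (xxx) xxx-xxxx."""
    digits = re.sub(r"\D", "", raw or "")
    if not digits:
        return None, ["ERR_PHONE_NOT_A_NUMBER"]
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading country code of 1
    if len(digits) < 10:
        return None, ["ERR_PHONE_TOO_SHORT"]
    if len(digits) > 10:
        return raw, []  # longer numbers are left untouched in this sketch
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}", []


print(cleanse_ssn("123456789"))
print(cleanse_phone("555.867.5309"))
```

Returning the tags alongside the cleansed value keeps the original input intact, so the tags can be used later without re-running the rule.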

Matching

Matching is performed based on the following attributes:

  • SSN
  • DOB
  • Full Name and Name Parts
  • Email
  • Phone Number
  • Address

Each attribute is assigned a weight based on how unique the attribute is. An attribute's weight may be reduced when the values do not match exactly or contain transpositions. Attributes that are unique to the subject may receive a negative weight when the values are completely or partially different.

A match is considered Strong when its total combined score is greater than or equal to 200, and Potential when the score is greater than or equal to 130 but less than 200.

For records considered a Potential match, a matching ticket is created so that a person can manually review the lower-quality match for accuracy.
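
The scoring thresholds can be sketched in Python as follows. Only the 200 and 130 thresholds come from this reference; the function name, the attribute keys, and the example weights are hypothetical, and the "No match" label for scores below 130 is an assumption.

```python
from typing import Dict

STRONG_THRESHOLD = 200     # total score >= 200 is a Strong match
POTENTIAL_THRESHOLD = 130  # total score >= 130 and < 200 is a Potential match


def classify_match(attribute_scores: Dict[str, int]) -> str:
    """Sum the per-attribute scores and classify the match quality."""
    total = sum(attribute_scores.values())
    if total >= STRONG_THRESHOLD:
        return "Strong"
    if total >= POTENTIAL_THRESHOLD:
        return "Potential"
    return "No match"  # assumed label; not defined in this reference


# Hypothetical weights: an exact SSN match, a transposed DOB with reduced
# weight, and a differing email with a negative weight.
scores = {"ssn": 120, "dob": 30, "name": 40, "email": -20}
print(classify_match(scores))  # total is 170, so this prints "Potential"
```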

Merging

Merging is performed differently depending on the subject. Mastered subjects are merged to create a representative view of the entity. For child subjects, the instances are sometimes merged to create a representative view of the entity, and at other times all records in the subject are preserved.

  • Customer. Instance records are merged to form a single, representative view of the Customer. The most recent, non-blank values are selected.
  • Email. For each email type, select the non-blank value with the fewest ERR_, WRN_, or INF_ tags associated.
  • Phone. For each phone type, select the non-blank value with the fewest ERR_, WRN_, or INF_ tags associated.
  • Account. All unique account records create golden accounts.
  • Account Team. All unique account team records create golden account teams.
  • Contact. All unique contacts create golden contacts.
  • Address. All unique addresses create golden addresses.
  • CustDemographics. The customer demographics record with the most complete data creates the golden customer demographics.
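
The two selection rules described for Customer and for Email and Phone can be sketched in Python as follows. The record layout (an "updated" timestamp, a "fields" dictionary, and "value"/"tags" keys) and the function names are hypothetical. Tie-breaking is not specified in this reference, so this sketch simply keeps the first candidate.

```python
from typing import Dict, List, Optional

TAG_PREFIXES = ("ERR_", "WRN_", "INF_")


def merge_customer(instances: List[dict]) -> Dict[str, str]:
    """For each field, keep the most recent non-blank value."""
    golden: Dict[str, str] = {}
    # Newest first, so the first non-blank value seen for a field wins.
    for rec in sorted(instances, key=lambda r: r["updated"], reverse=True):
        for field, value in rec["fields"].items():
            if field not in golden and value not in (None, ""):
                golden[field] = value
    return golden


def pick_least_tagged(candidates: List[dict]) -> Optional[str]:
    """Among non-blank values, keep the one with the fewest DQ tags."""
    non_blank = [c for c in candidates if c["value"]]
    if not non_blank:
        return None

    def tag_count(c: dict) -> int:
        return sum(1 for t in c["tags"] if t.startswith(TAG_PREFIXES))

    return min(non_blank, key=tag_count)["value"]


instances = [
    {"updated": "2023-01-01", "fields": {"first_name": "Ann", "last_name": "Lee"}},
    {"updated": "2024-06-01", "fields": {"first_name": "Anne", "last_name": ""}},
]
print(merge_customer(instances))
```

In the usage example, the newer record supplies the first name, while the last name falls back to the older record because the newer value is blank.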

Remediation

Remediation creates the following two types of tickets:

  • Cleansing. Cleansing tickets are created whenever a tag begins with ERR_. For a complete list of the tags that can be generated, see Cleansing.
  • Matching. Matching tickets are created when the match quality is considered only a Potential match. For more information on match quality, see Matching.
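
The two ticket rules can be sketched in Python as follows. The function name and ticket shape are hypothetical, and creating one cleansing ticket per ERR_ tag (rather than one per record) is an assumption; this reference does not say which applies.

```python
from typing import List, Tuple

Ticket = Tuple[str, str]  # (ticket type, reason)


def create_tickets(tags: List[str], match_quality: str) -> List[Ticket]:
    """Create remediation tickets from a record's tags and match quality."""
    tickets: List[Ticket] = []
    for tag in tags:
        # Only ERR_ tags raise cleansing tickets; WRN_ and INF_ tags do not.
        if tag.startswith("ERR_"):
            tickets.append(("Cleansing", tag))
    if match_quality == "Potential":
        tickets.append(("Matching", "Potential match requires manual review"))
    return tickets


print(create_tickets(["ERR_EMAIL_INVALID", "INF_EMAIL_CLEANSED"], "Potential"))
```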