Maintaining Persistent Tamr IDs

Tamr Cloud requires persistent primary keys in order to maintain persistent Tamr IDs.

You must preserve primary keys and maintain the source dataset name if you update the source dataset; otherwise the Tamr ID will change.

There are several situations in which the Tamr ID for clustered source records and the mastered entity can change:

  • The source record primary key changes.
  • The source dataset name changes.
  • The source record values change such that the record no longer matches its previously assigned cluster.
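Because a changed primary key causes a record to be treated as new, it can help to compare primary keys between the current and updated versions of a source dataset before loading the update. The following is a minimal sketch of such a check, not a Tamr Cloud feature or API: the function name `find_key_changes` and the sample key values are hypothetical, and it assumes you can export the primary key column of both versions as lists of strings.

```python
def find_key_changes(old_keys, new_keys):
    """Compare primary keys between two versions of a source dataset.

    Hypothetical helper: returns the keys that disappear and the keys
    that appear in the updated version.
    """
    old_set, new_set = set(old_keys), set(new_keys)
    return {
        "missing": sorted(old_set - new_set),  # keys dropped by the update
        "added": sorted(new_set - old_set),    # keys not present before
    }


# Sample (made-up) keys: the third record's key was regenerated upstream.
changes = find_key_changes(
    ["c-001", "c-002", "c-003"],
    ["c-001", "c-002", "c-9003"],
)
print(changes)
```

A key that appears under "missing" alongside an unfamiliar key under "added" may indicate a record whose primary key was regenerated upstream, which is worth confirming before the next mastering flow run.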

Consider the examples below:

Example 1: The primary keys of clustered source records have changed, resulting in a new Tamr ID.

In this example, four source records are clustered together into the mastered Victor Marks entity. The mastered entity and its clustered records have been assigned a persistent Tamr ID, as shown in the Tamr ID and Persistent ID columns in the image below.

The source dataset is then updated in Tamr Cloud, and the updated dataset includes different primary key values for the clustered source records, as shown in the Unique Key column in the image below. When the mastering flow is run, Tamr Cloud considers these to be new records because their primary key values have changed. As a result, the records are still clustered together, but the records and the mastered Victor Marks entity are assigned a new Tamr ID, as shown in the Tamr ID and Persistent ID columns in the image below. The original Tamr ID is retired.
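The behavior in Example 1 can be illustrated with a toy model. This sketch does not describe Tamr Cloud's actual implementation; it only assumes, for illustration, that an ID is remembered per combination of source dataset name and primary key. The class `ToyIdRegistry`, the dataset names, and the key values are all hypothetical.

```python
import itertools


class ToyIdRegistry:
    """Toy model only: remembers an ID per (dataset name, primary key)."""

    def __init__(self):
        self._ids = {}  # (dataset_name, primary_key) -> assigned ID
        self._counter = itertools.count(1)

    def assign(self, dataset_name, primary_keys):
        """Return one ID for a cluster of records from a single dataset."""
        refs = [(dataset_name, key) for key in primary_keys]
        known = [self._ids[ref] for ref in refs if ref in self._ids]
        # Reuse an ID already known for a member record; otherwise mint one.
        cluster_id = known[0] if known else f"toy-{next(self._counter)}"
        for ref in refs:
            self._ids[ref] = cluster_id
        return cluster_id


registry = ToyIdRegistry()
first = registry.assign("contacts", ["1", "2", "3", "4"])
same = registry.assign("contacts", ["1", "2", "3", "4"])           # keys preserved
changed = registry.assign("contacts", ["101", "102", "103", "104"])  # keys changed
renamed = registry.assign("contacts_v2", ["1", "2", "3", "4"])     # dataset renamed
print(first == same, first == changed, first == renamed)
```

In this model, rerunning with the same dataset name and keys reuses the ID, while changing either the keys or the dataset name yields a different ID, which mirrors the first two situations listed above.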

Example 2: The values in a clustered source record have changed, resulting in that source record being clustered into a different mastered entity.

In this example, three source records are clustered into the mastered John Adams entity.

The content of one of the source records for John Adams is then modified by the contact management system used to obtain the source records. During the next mastering flow run, Tamr Cloud determines that the modified source record no longer matches the other records in the original cluster and moves it to a different cluster (Jonathan Adams). As a result, the modified record receives a different Tamr ID as part of the Jonathan Adams cluster, and the records that remain in the original cluster retain the same Tamr ID, as shown in the Persistent ID columns in the images below.

Original cluster:

New cluster: