Limitations

Maximum Number of Keys in key_columns

Each table listed under tables: can specify at most 32 keys in key_columns:.

tables:
  - database: prod
    table: pageviews
    key_columns:
      - {column: td_client_id, key: td_client_id}
      # Up to 32 keys can be listed

  - database: brand2
    table: pageviews
    as: brand2_pageviews
    key_columns:
      - {column: td_client_id, key: td_client_id}
      - {column: td_global_id, key: td_global_id}
      - {column: email, key: email}
      # Up to 32 keys can be listed

Maximum Number of Tables in tables

The maximum number of tables that can be listed under tables: is 255.

Maximum Occurrences of a Key in tables

The maximum number of times a key can appear across all tables listed in tables: is 32. Exceeding this limit will result in an error.

keys:
  - name: td_client_id
    invalid_texts: ['']

  - name: td_global_id
    valid_regexp: "3rd_.*"
    invalid_texts: ['']

  - name: email
    valid_regexp: ".*@.*"
    invalid_texts: ['']

# Error: td_client_id, td_global_id, and email each appear 33 times, exceeding the limit of 32
tables:
  - database: db1
    table: pageviews
    key_columns:
      - {column: td_client_id, key: td_client_id}
      - {column: td_global_id, key: td_global_id}
      - {column: email, key: email}

  - database: db2
    table: pageviews
    key_columns:
      - {column: td_client_id, key: td_client_id}
      - {column: td_global_id, key: td_global_id}
      - {column: email, key: email}
  # ... (db3 through db32, each with the same key_columns)
  - database: db33
    table: pageviews
    key_columns:
      - {column: td_client_id, key: td_client_id}
      - {column: td_global_id, key: td_global_id}
      - {column: email, key: email}

Restrictions on do_not_merge_key and merge_by_keys

The key specified in do_not_merge_key: must be the first (highest-priority) element of merge_by_keys:. Otherwise, the request fails with the following error:

400 Bad Request: {"canonical_ids[0].do_not_merge_key": ["must be the first element of merge_by_keys"]}
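
As an illustration, the fragment below sketches a canonical_ids: entry that satisfies this rule. The placement under canonical_ids: follows the path shown in the error message. The name: value and the choice of keys are assumptions made for this example and are not taken from a real workflow.

```yaml
canonical_ids:
  - name: unified_id   # hypothetical name, for illustration only
    # email is the first element of merge_by_keys, so it can be
    # used as do_not_merge_key
    merge_by_keys: [email, td_client_id, td_global_id]
    do_not_merge_key: email

# By contrast, this ordering would be expected to trigger the
# 400 Bad Request error, because email is not the first element:
#   merge_by_keys: [td_client_id, email, td_global_id]
#   do_not_merge_key: email
```

If a different key needs to be protected from merging, move that key to the front of merge_by_keys: rather than changing only do_not_merge_key:.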