Defining Variables

This page covers best practices for defining variables in Treasure workflows, focusing on avoiding common pitfalls that can cause unexpected operator behavior.

Prerequisites: Understand basic variable concepts in Workflow Definition.

DON'T: Top-level variables

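Avoid defining variables directly under the top-level _export. Every variable defined there is passed as an option to ALL operators in the workflow, including operators that were never meant to receive it.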

# BAD: Affects all operators unexpectedly
_export:
  database: my_db      # Goes to ALL operators
  engine: presto       # Goes to ALL operators
  timeout: 30m         # Goes to ALL operators

+td_task:
  td>: query.sql       # Uses global database 'my_db'

+pg_task:
  pg>: other.sql       # ALSO uses global database! (unexpected)

The workflow definition is equivalent to:

+td_task:
  td>: query.sql
  database: my_db
  engine: presto 
  timeout: 30m

+pg_task:
  pg>: other.sql
  database: my_db
  engine: presto
  timeout: 30m
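
One way to repair the example above is to move each option into the namespace of the operator that should receive it. The sketch below assumes that database is also a valid option in the pg namespace and that timeout was meant as a custom setting rather than an operator option. The settings namespace and the pg_db name are hypothetical, and the other connection options that pg> needs are omitted for brevity.

# Sketch: same workflow with namespaced variables (names are illustrative)
_export:
  td:
    database: my_db      # Merged into td operators only
    engine: presto       # Merged into td operators only
  pg:
    database: pg_db      # Hypothetical database name, merged into pg> only
  settings:              # Hypothetical custom namespace
    timeout: 30m

+td_task:
  td>: query.sql

+pg_task:
  pg>: other.sql

With this layout, +pg_task no longer picks up my_db or engine.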

DON'T: Custom variables in reserved namespaces

Avoid adding your own variables to operator namespaces such as td, aws, or bq. Everything in these namespaces is merged into the options of the matching operators, so a custom variable that is ignored today might be interpreted as a real operator option in a future version. See the Appendix for the list of reserved namespaces.

# BAD: Custom variables in operator namespace
_export:
  td:                         # td namespace
    database: my_db           # OK - this is a real TD option
    engine: presto            # OK - this is a real TD option
    timeout: 10m              # BAD - not a real TD option, might conflict with future operator options
    user_name: treasure       # BAD - not a real TD option, might conflict with future operator options

+td_task:
  td>: query.sql

+td_task2:
  td_wait>: query.sql

The workflow definition is equivalent to:

+td_task:
  td>: query.sql
  database: my_db     
  engine: presto      
  timeout: 10m         
  user_name: treasure 

+td_task2:
  td_wait>: query.sql
  database: my_db
  engine: presto
  timeout: 10m
  user_name: treasure 

DO: Use separate namespaces

Create your own namespaces for custom variables and use operator namespaces only for real operator options.

# GOOD: Clear separation
_export:
  # Your custom variables/namespaces
  config:
    timeout: 10
    user_name: "treasure"

  # Only real TD operator options
  td:
    database: analytics_db
    engine: presto

+my_task:
  td>: query.sql
  # You can access custom variables with ${config.user_name}
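
Custom variables can then be referenced wherever the workflow expands ${...} expressions. The following is a minimal sketch, assuming the standard echo> operator is available and reusing the config namespace from the example above. The task name +show_config is hypothetical.

# Sketch: reading custom variables from your own namespace
_export:
  config:
    timeout: 10
    user_name: "treasure"

+show_config:
  echo>: "user=${config.user_name}, timeout=${config.timeout}"

The config namespace does not appear in the Appendix list of reserved namespaces, so neither value should be merged into operator options.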

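DO: Override at task level when needed

When a single task needs a different value, set the option directly on that task. The task-level value replaces the exported default for that task only.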

_export:
  td:
    database: default_db

+special_task:
  td>: special_query.sql
  database: special_db    # Override for this task only
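
Following the same expansion used in the earlier sections, a workflow that mixes a default with a task-level override can be read as shown here. This is a sketch, and +normal_task and normal_query.sql are hypothetical names added for contrast.

# Sketch: default plus override
_export:
  td:
    database: default_db

+normal_task:
  td>: normal_query.sql   # Hypothetical task, uses default_db

+special_task:
  td>: special_query.sql
  database: special_db    # Override for this task only

The effective options would be:

+normal_task:
  td>: normal_query.sql
  database: default_db

+special_task:
  td>: special_query.sql
  database: special_db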

Summary

  1. No top-level variables - use namespaces
  2. No custom variables in td/aws/bq namespaces - only real operator options
  3. Create your own namespaces for custom variables/configs
  4. Override at task level when needed

Appendix: Operator Reserved Namespaces

Some operators have reserved namespaces. Variables defined in these namespaces are automatically merged with operator options.

Treasure Data Operators

  • td>, td_run>, td_ddl>, td_load>, td_for_each>, td_wait>, td_wait_table>, td_table_export>, td_result_export> (namespace: td)

Network Operators

  • mail> (namespace: mail)
  • http> (namespace: http)

Database Operators

  • pg> (namespace: pg)

Amazon Web Services Operators

  • s3_wait>, s3_delete>, s3_copy>, s3_move> (namespaces: aws.s3, aws)
  • redshift> (namespace: redshift)
  • redshift_load> (namespaces: redshift_load, redshift)
  • redshift_unload> (namespaces: redshift_unload, redshift)

Google Cloud Platform Operators

  • bq>, bq_ddl>, bq_extract>, bq_load> (namespace: bq)
  • gcs_wait> (namespace: gcs)

Scripting Operators

  • py> (namespace: py)
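
As an illustration of a reserved namespace other than td, the sketch below sets a default sender for mail> tasks. It assumes that from, subject, and to are options of the mail> operator. The addresses, task name, and file name are placeholders.

# Sketch: default option in the mail namespace
_export:
  mail:
    from: "workflow@example.com"   # Placeholder address, merged into mail> tasks

+notify:
  mail>: body.txt                  # Placeholder body file
  subject: workflow finished
  to: [me@example.com]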