bq>: Running Google BigQuery queries

The bq> operator runs a query on Google BigQuery.

_export:
  bq:
    dataset: my_dataset

+step1:
  bq>: queries/step1.sql
+step2:
  bq>: queries/step2.sql
  destination_table: result_table
+step3:
  bq>: queries/step3.sql
  destination_table: other_project:other_dataset.other_table

Note: The bq> operator uses standard SQL by default, whereas the default in the BigQuery console is legacy SQL. To run legacy SQL queries, please set use_legacy_sql: true. For more information about standard SQL on BigQuery, see Migrating from legacy SQL.
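For example, a task that runs a legacy SQL query could look like this (the file and task names are illustrative):

+legacy_step:
  bq>: queries/legacy_step.sql
  use_legacy_sql: true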

Secrets

If you don't know how to set secrets, please refer to Managing Workflow Secrets.

  digdag secrets --project my_project --set gcp.credential=@my-svc-account-b4df00d.json

If gcp.credential is not set, digdag tries to use Application Default Credentials.

Options

  • bq> : query.sql

    Path to a query template file. This file can contain ${...} syntax to embed variables.

    Examples:

  bq>: queries/step1.sql
  • location : LOCATION

    The location where the query job should run. See locations for a list of available locations.

    Examples:

  location: asia-northeast1
  • dataset : NAME

    Specifies the default dataset to use in the query and in the destination_table parameter.

    Examples:

  dataset: my_dataset
  dataset: other_project:other_dataset
  • destination_table : NAME

    Specifies a table to store the query results in.

    Examples:

  destination_table: my_result_table
  destination_table: some_dataset.some_table
  destination_table: some_project:some_dataset.some_table

You can append a date in $YYYYMMDD form to the end of the table name to store data in a specific partition. See the Creating and Updating Date-Partitioned Tables document for details.

  destination_table: some_dataset.some_partitioned_table$20160101
  • create_disposition : CREATE_IF_NEEDED | CREATE_NEVER

    Specifies whether the destination table should be automatically created when executing the query.

    • CREATE_IF_NEEDED : (default) The destination table is created if it does not already exist.
    • CREATE_NEVER : The destination table must already exist, otherwise the query will fail.

    Examples:

  create_disposition: CREATE_IF_NEEDED
  create_disposition: CREATE_NEVER
  • write_disposition : WRITE_TRUNCATE | WRITE_APPEND | WRITE_EMPTY

    Specifies whether to permit writing of data to an already existing destination table.

    • WRITE_TRUNCATE : If the destination table already exists, any data in it will be overwritten.
    • WRITE_APPEND : If the destination table already exists, any data in it will be appended to.
    • WRITE_EMPTY : (default) The query fails if the destination table already exists and is not empty.

    Examples:

  write_disposition: WRITE_TRUNCATE
  write_disposition: WRITE_APPEND
  write_disposition: WRITE_EMPTY
  • priority : INTERACTIVE | BATCH

    Specifies the priority to use for this query. Default: INTERACTIVE.
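    Example (run the query with batch priority):

  priority: BATCH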

  • use_query_cache : BOOLEAN

    Whether to use BigQuery query result caching. Default: true.
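    Example (disable the query result cache):

  use_query_cache: false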

  • allow_large_results : BOOLEAN

    Whether to allow arbitrarily large result tables. Requires destination_table to be set and use_legacy_sql to be true.
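    Example (as noted above, this requires destination_table and use_legacy_sql: true):

  use_legacy_sql: true
  allow_large_results: true
  destination_table: some_dataset.some_table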

  • flatten_results : BOOLEAN

    Whether to flatten nested and repeated fields in the query results. Default: true. Requires use_legacy_sql to be true.
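    Example (requires use_legacy_sql: true):

  use_legacy_sql: true
  flatten_results: false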

  • use_legacy_sql : BOOLEAN

    Whether to use legacy BigQuery SQL. Default: false.

  • maximum_billing_tier : INTEGER

    Limit the billing tier for this query. Default: The project default.
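    Example (the tier value is illustrative):

  maximum_billing_tier: 2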

  • table_definitions : OBJECT

    Describes external data sources that are accessed in the query. For more information see BigQuery documentation.
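    The exact object layout is not shown in this document; as a rough, unverified sketch, assuming the keys mirror BigQuery's external data configuration fields (sourceFormat, sourceUris) and using illustrative table and bucket names:

  table_definitions:
    my_external_table:
      sourceFormat: CSV
      sourceUris:
        - gs://my-bucket/path/to/data.csv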

  • user_defined_function_resources : LIST

    Describes user-defined function resources used in the query. For more information see BigQuery documentation.
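    As a rough, unverified sketch, assuming each entry mirrors BigQuery's user-defined function resource fields (resourceUri) and using an illustrative bucket path:

  user_defined_function_resources:
    - resourceUri: gs://my-bucket/udfs/my_udf.js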

Output parameters

  • bq.last_job_id

    The id of the BigQuery job that executed this query.

    Note: The bq.last_jobid parameter is kept only for backward compatibility. Do not use it, because it will be removed in a near-future release.
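
For example, a later task can reference the job id through the ${...} syntax (the echo> task and file path are illustrative):

+query:
  bq>: queries/step1.sql
+show_job_id:
  echo>: "bq job id: ${bq.last_job_id}"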