bq>: Running Google BigQuery queries

The bq> operator runs a query on Google BigQuery.

_export:
  bq:
    dataset: my_dataset

+step1:
  bq>: queries/step1.sql
+step2:
  bq>: queries/step2.sql
  destination_table: result_table
+step3:
  bq>: queries/step3.sql
  destination_table: other_project:other_dataset.other_table

Note: The bq> operator uses standard SQL by default, whereas the default in the BigQuery console is legacy SQL. To run legacy SQL queries, please set use_legacy_sql: true. For more information about standard SQL on BigQuery, see Migrating from legacy SQL.
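For example, a task that runs a legacy SQL query could look like this (the file and task names are illustrative):

+legacy_step:
  bq>: queries/legacy_step.sql
  use_legacy_sql: true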

Secrets

If you don't know how to set secrets, please refer to Managing Workflow Secrets.

  digdag secrets --project my_project --set gcp.credential=@my-svc-account-b4df00d.json

If gcp.credential is not set, digdag tries to use Application Default Credentials.

Options

  • bq> : query.sql

    Path to a query template file. This file can contain ${...} syntax to embed variables.

    Examples:

  bq>: queries/step1.sql
  • location : LOCATION

    The location where the query job should run. See locations for a list of available locations.

    Examples:

  location: asia-northeast1
  • dataset : NAME

    Specifies the default dataset to use in the query and in the destination_table parameter.

    Examples:

  dataset: my_dataset
  dataset: other_project:other_dataset
  • destination_table : NAME

    Specifies a table to store the query results in.

    Examples:

  destination_table: my_result_table
  destination_table: some_dataset.some_table
  destination_table: some_project:some_dataset.some_table

You can append a date in $YYYYMMDD form to the end of the table name to store data in a specific partition. See the Creating and Updating Date-Partitioned Tables document for details.

  destination_table: some_dataset.some_partitioned_table$20160101
  • create_disposition : CREATE_IF_NEEDED | CREATE_NEVER

    Specifies whether the destination table should be automatically created when executing the query.

    • CREATE_IF_NEEDED : (default) The destination table is created if it does not already exist.
    • CREATE_NEVER : The destination table must already exist, otherwise the query will fail.

    Examples:

  create_disposition: CREATE_IF_NEEDED
  create_disposition: CREATE_NEVER
  • write_disposition : WRITE_TRUNCATE | WRITE_APPEND | WRITE_EMPTY

    Specifies whether to permit writing of data to an already existing destination table.

    • WRITE_TRUNCATE : If the destination table already exists, any data in it will be overwritten.
    • WRITE_APPEND : If the destination table already exists, any data in it will be appended to.
    • WRITE_EMPTY : (default) The query fails if the destination table already exists and is not empty.

    Examples:

  write_disposition: WRITE_TRUNCATE
  write_disposition: WRITE_APPEND
  write_disposition: WRITE_EMPTY
  • priority : INTERACTIVE | BATCH

    Specifies the priority to use for this query. Default: INTERACTIVE.
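    Example (run the query with batch priority):

  priority: BATCH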

  • use_query_cache : BOOLEAN

    Whether to use BigQuery query result caching. Default: true.
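    Example (disable the query result cache):

  use_query_cache: false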

  • allow_large_results : BOOLEAN

    Whether to allow arbitrarily large result tables. Requires destination_table to be set and use_legacy_sql to be true.
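    Example (as noted above, this requires destination_table and use_legacy_sql: true):

  use_legacy_sql: true
  allow_large_results: true
  destination_table: some_dataset.some_table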

  • flatten_results : BOOLEAN

    Whether to flatten nested and repeated fields in the query results. Default: true. Requires use_legacy_sql to be true.
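    Example (requires use_legacy_sql: true):

  use_legacy_sql: true
  flatten_results: false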

  • use_legacy_sql : BOOLEAN

    Whether to use legacy BigQuery SQL. Default: false.

  • maximum_billing_tier : INTEGER

    Limit the billing tier for this query. Default: The project default.
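    Example (the tier value is illustrative):

  maximum_billing_tier: 2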

  • table_definitions : OBJECT

    Describes external data sources that are accessed in the query. For more information see BigQuery documentation.
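    The exact object layout is not shown in this document; as a rough, unverified sketch, assuming the keys mirror BigQuery's external data configuration fields (sourceFormat, sourceUris) and using illustrative table and bucket names:

  table_definitions:
    my_external_table:
      sourceFormat: CSV
      sourceUris:
        - gs://my-bucket/path/to/data.csv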

  • user_defined_function_resources : LIST

    Describes user-defined function resources used in the query. For more information see BigQuery documentation.
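    As a rough, unverified sketch, assuming each entry mirrors BigQuery's user-defined function resource fields (resourceUri) and using an illustrative bucket path:

  user_defined_function_resources:
    - resourceUri: gs://my-bucket/udfs/my_udf.js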

Output parameters

  • bq.last_job_id

    The id of the BigQuery job that executed this query.

    Note: The bq.last_jobid parameter is kept only for backward compatibility. Do not use it, because it will be removed in a near-future release.
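
For example, a later task can reference the job id through the ${...} syntax (the echo> task and file path are illustrative):

+query:
  bq>: queries/step1.sql
+show_job_id:
  echo>: "bq job id: ${bq.last_job_id}"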