Troubleshooting

Submitted Workflows Troubleshooting

In this article, we create an error in a workflow to guide you through the process of troubleshooting a workflow that you’ve submitted to Treasure Data.

Introductory Tutorial

If you haven’t already, start by going through the TD Workflows Introductory Tutorial. This article uses the workflow project that you download in that tutorial.

Create an Error to Debug

Navigate to the nasdaq_analysis directory from the introductory tutorial.

Run the following command to overwrite queries/monthly_open.sql with a query that creates an error to debug:

$ cat > queries/monthly_open.sql <<EOF
SELECT TD_DATE_TRUNC('month', time),AVG(daily_avg_open) AS monthly_avg_open, 
AVG(daily_avg_close) AS month_avg_close
FROM daily_open
GROUP BY 1
EOF

Push the broken workflow to Treasure Data:

$ td wf push nasdaq_analysis
# Submitting workflow "nasdaq_analysis"...
# Done!

Start the workflow on Treasure Data:

$ td wf start nasdaq_analysis nasdaq_analysis --session now

Check the status of the session to confirm that it failed:

$ td wf session nasdaq_analysis nasdaq_analysis

The output should look similar to the following:

2016-05-11 16:40:24 +0900: Digdag v0.6.1
Session attempts:
  attempt id: 100
  uuid: ef704e1f-3eb5-4ba7-9be0-4ebfaeee4424
  project: nasdaq_analysis
  workflow: nasdaq_analysis
  session time: 2016-05-11 07:38:15 +0000
  retry attempt name:
  params: {"td":{"apikey":"..."},"last_session_time":"2016-05-11T00:00:00+00:00","next_session_time":"2016-05-12T00:00:00+00:00"}
  created at: 2016-05-11 16:38:17 +0900
  kill requested: false
  status: error
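
When a session has several attempts, it can help to reduce this listing to one line per attempt. The following is a minimal sketch, not part of the td toolset: attempt_statuses is a hypothetical helper that assumes the listing keeps the "attempt id:" and "status:" labels shown above.

```shell
# Hypothetical helper (not a td command): print "<attempt id> <status>"
# for each attempt in a listing read from standard input.
# Assumes the "attempt id:" and "status:" labels shown in the sample output.
attempt_statuses() {
  awk '
    $1 == "attempt" && $2 == "id:" { id = $3 }        # remember the current attempt id
    $1 == "status:"                { print id, $2 }   # emit it with its status
  '
}
```

To use it, pipe the output of the status command above into attempt_statuses. For the sample output, it prints the attempt ID 100 followed by the status error.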

Troubleshooting Workflow Errors

Determine What Tasks Failed

In the example above, the attempt ID is 100. List the tasks of that attempt to find the ones that failed:

$ td wf tasks <attempt_id>

The command should return output similar to the following:

2016-05-16 21:18:19 -0700: Digdag v0.7.1
   id: 1105
   name: +nasdaq_analysis
   state: group_error
   config: {"schedule":{"daily>":"07:00:00"},"_export":{"td":{"database":"workflow_temp"}}}
   parent: null
   upstreams: []
   export params: {"td":{"database":"workflow_temp"}}
   store params: {}
   state params: {}

   id: 1106
   name: +nasdaq_analysis+task1
   state: success
   config: {"td>":"queries/daily_open.sql","create_table":"daily_open"}
   parent: 1105
   upstreams: []
   export params: {}
   store params: {"td":{"last_job_id":"66338029"}}
   state params: {}

   id: 1107
   name: +nasdaq_analysis+task2
   state: error
   config: {"td>":"queries/monthly_open.sql","create_table":"monthly_open"}
   parent: 1105
   upstreams: [1106]
   export params: {}
   store params: {}
   state params: {}

The last task listed, +nasdaq_analysis+task2, shows state: error, which means this is the task that failed. The parent task +nasdaq_analysis shows state: group_error because one of its child tasks failed.
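
For workflows with many tasks, scanning the listing by eye becomes tedious. As a minimal sketch, the hypothetical helper failed_tasks below (not a td command) filters the listing down to the names of tasks whose state is exactly error. It assumes the "name:" and "state:" labels shown in the sample output, and it deliberately skips group_error so that only the tasks that failed themselves are printed.

```shell
# Hypothetical helper (not a td command): print the name of every task
# whose state is exactly "error". Reads a task listing on standard input.
# Assumes each task block lists "name:" before "state:", as in the sample.
failed_tasks() {
  awk '
    $1 == "name:"                   { name = $2 }    # remember the current task name
    $1 == "state:" && $2 == "error" { print name }   # group_error does not match
  '
}
```

For the attempt in this example, running td wf tasks 100 | failed_tasks would be expected to print +nasdaq_analysis+task2.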

Review Logs of the Failed Task

Use the following command to get the logs for a particular task:

$ td wf logs <attempt_id> <task_name>

For the failed task in this example, run:

$ td wf logs <attempt_id> +nasdaq_analysis+task2

Review the output to determine the cause of the errors.

You can also use the job id to review error logs in TD Console.

Fix the Query

Fix the query and rerun the workflow.

$ cat > queries/monthly_open.sql <<EOF
SELECT TD_DATE_TRUNC('month', time), AVG(daily_avg_open) AS
monthly_avg_open, AVG(daily_avg_close) AS month_avg_close
FROM daily_open
GROUP BY 1
EOF

Push the Fix to Treasure Data

$ td wf push nasdaq_analysis

Retry the Workflow Session

Rerun the workflow.

$ td wf retry <attempt_id> --name fix-typo --latest-revision --all

Run td wf attempts right away to see the new session attempt running. Run it again after a short wait, and you’ll likely see that the attempt succeeded.

The most recent attempt has the same session time as the previous attempt that failed. This is the benefit of using retry instead of start in this case. It matters most when you have a daily scheduled workflow and want to retry only the current day’s session with the same time-related parameters embedded in the workflow.

Alternatively, use --resume instead of --all to rerun only the failed task and all subsequent tasks.
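
As a minimal sketch of the --resume variant, the hypothetical wrapper retry_resume below issues the same retry command as above with --resume in place of --all. The wrapper name, the TD_CMD variable, and the default retry name resume-retry are assumptions made for illustration. Setting TD_CMD=echo prints the command instead of running it, which is a convenient way to check the arguments first.

```shell
# Hypothetical wrapper (not a td command): retry an attempt, rerunning only
# the failed task and the tasks after it.
#   $1 = attempt ID
#   $2 = optional retry name (defaults to the assumed name "resume-retry")
# TD_CMD defaults to td; set TD_CMD=echo for a dry run that only prints the call.
retry_resume() {
  attempt_id=$1
  retry_name=${2:-resume-retry}
  ${TD_CMD:-td} wf retry "$attempt_id" --name "$retry_name" --latest-revision --resume
}
```

For the attempt in this example, TD_CMD=echo retry_resume 100 fix-typo shows the arguments that would be passed, and retry_resume 100 fix-typo runs the retry.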

Troubleshooting Enabled or Disabled Policies

As the account owner, be aware of the actions that occur when you enable or disable policies.

Enabling the Policies Feature

When you enable the Policies feature, you must:

  1. Create policies that contain specified permissions
  2. Assign existing users to policies

Disabling the Policies Feature

When you disable the Policies feature, the Policies pane is no longer visible. To view the permissions for your users, you must:

  1. Go to Administration > Users.
  2. Select a user to view.
  3. Select Permissions to view and specify permissions.