TD Toolbelt Quickstart

The TD Toolbelt provides a command-line interface (CLI) to most of the functionality of the Treasure Data product. This includes the ability to:

  • Create databases and tables
  • Import or export data into and from tables
  • Set and modify the table schema
  • Issue queries
  • Monitor job status
  • View and download job results
  • Create scheduled queries, and much more.

Setup

Install

The TD Toolbelt is distributed in different ways to best suit the target environment:

  • For Windows and Mac OSX, it is distributed as a prepackaged installer called the Toolbelt.
  • For Linux (Ubuntu, RedHat, CentOS), it is distributed as part of the Treasure Agent (td-agent) package.
  • As a Ruby 'gem' module, it can be installed on any system using the following command: gem install td.

Use the Ruby option if you are already familiar with Ruby and already have it installed on your system.
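
As a rough illustration of the choice above, the following sketch maps the kernel name reported by uname -s to one of the three distribution options. The function name suggest_install and the wording of its messages are made up for this example; they are not part of the TD Toolbelt.

```shell
#!/bin/sh
# Map a kernel name (as printed by `uname -s`) to one of the install options
# listed above. suggest_install is a hypothetical helper for this sketch.
suggest_install() {
  case "$1" in
    Darwin)               echo "Mac OSX installer package" ;;
    Linux)                echo "td-agent / fluent-package install script" ;;
    MINGW*|MSYS*|CYGWIN*) echo "Windows installer" ;;
    *)                    echo "Ruby gem: gem install td" ;;
  esac
}

# Print the suggestion for the current machine.
suggest_install "$(uname -s)"
```

The Ruby gem remains an option on every platform, so the fallback branch points to it.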

Ruby Gem

If you are familiar with Ruby, you can use the CLI to install and maintain td as a gem. To install TD Toolbelt and its dependencies, simply run this command:

gem install td

If you are using a Ruby environment manager such as rbenv or rvm, different versions of the TD Toolbelt can be confined to each project, environment, or Ruby version in use, and you might need to install the td gem multiple times. Refer to the Ruby Gem section under Update below for more information.

Windows

This package contains all the dependencies TD Toolbelt needs to run on a Windows system, including a version of the Ruby environment, which is installed on the system as part of the process.

  1. Navigate to the TD Toolbelt download page: https://toolbelt.treasuredata.com/
  2. Select Windows.
  3. Select Download.

Mac OSX

This package contains all the dependencies TD Toolbelt needs to run on a Mac system, including a version of the Ruby environment, which is installed on the system as part of the process.

  1. Navigate to the TD Toolbelt download page: https://toolbelt.treasuredata.com/
  2. Select OSX.
  3. Select Download.

If you receive an error "package can't be opened" when you try to install the package, right-click the package and select Open to continue the installation.

Linux

Info

Prior to its rebranding in 2023, fluent-package was released as td-agent. Treasure Data supported several earlier versions of td-agent. However, newer versions of td-agent and fluent-package are now supported by the Fluentd community. fluent-package itself is therefore not supported or maintained by Treasure Data; use it at your own risk. Treasure Data continues to support the td command, but not the package that ships it.

The TD Toolbelt is distributed on Linux as part of the td-agent package and the fluent-package package.

This package contains all the dependencies needed to run td-agent and the td command on a Linux system, including the version of Ruby these tools are guaranteed to work with.

Ubuntu and Debian
# 22.04 Jammy
curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-jammy-fluent-package5.sh | sh

# 20.04 Focal
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-focal-td-agent4.sh | sh

# 18.04 Bionic
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-bionic-td-agent4.sh | sh

# Debian Bookworm
curl -fsSL https://toolbelt.treasuredata.com/sh/install-debian-bookworm-fluent-package5.sh | sh

# Debian Bullseye
curl -L https://toolbelt.treasuredata.com/sh/install-debian-bullseye-td-agent4.sh | sh

# Debian Buster
curl -L https://toolbelt.treasuredata.com/sh/install-debian-buster-td-agent4.sh | sh

Redhat and CentOS
# Redhat/CentOS Fluent Package 5
curl -fsSL https://toolbelt.treasuredata.com/sh/install-redhat-fluent-package5.sh | sh

Amazon Linux
# Amazon Linux 2023
curl -fsSL https://toolbelt.treasuredata.com/sh/install-amazon2023-fluent-package5.sh | sh

# Amazon Linux 2
curl -fsSL https://toolbelt.treasuredata.com/sh/install-amazon2-fluent-package5.sh | sh

Legacy support for EOL versions is also still available:

Ubuntu and Debian
# 16.04 Xenial
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent4.sh | sh

# 14.04 Trusty
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent3.sh | sh

# 12.04 Precise
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-precise-td-agent3.sh | sh

# Debian Stretch (64-bit only)
curl -L https://toolbelt.treasuredata.com/sh/install-debian-stretch-td-agent3.sh | sh

# Debian Jessie (64-bit only)
curl -L https://toolbelt.treasuredata.com/sh/install-debian-jessie-td-agent3.sh | sh

Redhat and CentOS
# Redhat/CentOS 6 td-agent3
curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh

Amazon Linux
# Amazon Linux 1 td-agent3
curl -L https://toolbelt.treasuredata.com/sh/install-amazon1-td-agent3.sh | sh
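
The commands above pipe each install script straight into sh. If you would rather read a script before it runs, one possible approach is to download it to a temporary file first. The sketch below assumes curl and mktemp are available; fetch_install_script is a hypothetical helper name, not something the Toolbelt provides.

```shell
#!/bin/sh
# Download an install script to a temporary file so it can be reviewed
# before it is executed, instead of piping it directly into sh.
# fetch_install_script is a hypothetical helper for this sketch.
fetch_install_script() {
  url="$1"
  dest="$(mktemp)" || return 1
  # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
  if curl -fsSL "$url" -o "$dest"; then
    echo "$dest"
  else
    rm -f "$dest"
    return 1
  fi
}

# Usage (not run automatically):
#   script="$(fetch_install_script https://toolbelt.treasuredata.com/sh/install-ubuntu-jammy-fluent-package5.sh)"
#   less "$script"   # review what the script will do
#   sh "$script"     # run it once you are satisfied
```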

Verify Install

Open your terminal, and make sure you have the td command available.

Windows and MacOS
td --version

Redhat/CentOS
rpm -q td-agent

Ubuntu
dpkg -l | grep td-agent
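
For use in a setup script, the same check can be wrapped in a small function. This is a minimal sketch: check_td is a made-up name, and it only relies on the standard command -v lookup plus the td --version command shown above.

```shell
#!/bin/sh
# Report whether the td command is on the PATH and, if so, print its version.
# check_td is a hypothetical helper for this sketch.
check_td() {
  if command -v td >/dev/null 2>&1; then
    td --version
  else
    echo "td not found on PATH" >&2
    return 1
  fi
}

check_td || echo "Install TD Toolbelt first, then re-run this check."
```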

Update

Mac OSX or Windows

If you downloaded the TD Toolbelt installer package (.pkg file) for Mac OSX or a Toolbelt installer executable for Windows (64-bit support only) from the TD Toolbelt website, the Toolbelt auto-updates itself.

Whenever a command is invoked from the CLI, the program checks whether a new version exists and will download and install the updated version in the background. TD Toolbelt checks for an updated version every hour. The user can trigger an update at any time with the following command:

td update

The auto-update feature is available as of v0.10.77. If you are running an earlier version (check the version with the td --version command), upgrade as soon as possible by installing a more recent package from Treasure Data Toolbelt.
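
To decide whether an installed version predates v0.10.77, a version-aware comparison is needed, because a plain string comparison would rank 0.10.100 below 0.10.77. The sketch below assumes a sort that supports the -V (version sort) option, as GNU coreutils does; version_ge and the sample value of installed are hypothetical.

```shell
#!/bin/sh
# Return success when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort); this is an assumption about the local sort.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical installed version; in practice, take it from `td --version`.
installed="0.10.76"

if version_ge "$installed" "0.10.77"; then
  echo "auto-update supported"
else
  echo "upgrade needed: install a newer package"
fi
```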

For the other installation methods described below, the td update command is only a placeholder.

Ruby Gem

If you installed TD Toolbelt as a gem, you must periodically check for a newer version. It is always recommended to update to the latest version because we strive to maintain 100% backward compatibility. To update using the gem command, run:

gem update td

If you are using a Ruby environment manager such as rbenv or rvm, then different versions of the TD Toolbelt might be confined within each project, environment, or Ruby version.

Also, if gem install was invoked with a customized GEM_HOME environment variable or bundler's bundle install was called with the --path option, the installed gems are local to the project (and typically installed within the project's folder structure itself), not globally installed and available everywhere on the system.

You can double-check the installed versions of the td gem by running:

gem list | grep -E '^td '
td (0.11.2, 0.10.99, 0.10.97)
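
When several versions are installed, a script may only care about the first one on the line, which is the newest in the sample output above. The following sketch extracts it with sed; first_td_version is a hypothetical helper, and it assumes the line has the same "td (x, y, z)" shape as the sample.

```shell
#!/bin/sh
# Print the first version listed on a `gem list` line such as
# "td (0.11.2, 0.10.99, 0.10.97)". Lines for other gems produce no output.
# first_td_version is a hypothetical helper for this sketch.
first_td_version() {
  sed -n 's/^td (\([^,)]*\).*/\1/p'
}

# Demonstration with the sample line from this section.
echo "td (0.11.2, 0.10.99, 0.10.97)" | first_td_version

# Typical use (requires RubyGems):
#   gem list | first_td_version
```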

Depending on the situation, you might have to replace the gem command with a wrapper, e.g. /usr/bin/ruby/toolbelt/bin/gem or /usr/lib/fluent/ruby/bin/fluent-gem.

Basic Use

The td command has multiple help sub-menus, which you can access with td help <command>.

For example:

td help db
td help table
td help job

Set API Key

The first thing to do is set your API key. Make sure to use a Master API Key.

td apikey:set YOUR_API_KEY 

To confirm your API Key is set correctly, you can run the following command.

td apikey
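
In provisioning scripts it can be convenient to take the key from an environment variable rather than hard-coding it in the script file. The sketch below shows one way to do that with the apikey:set command from this section. TD_MASTER_KEY and set_td_apikey are names invented for this example, not settings the Toolbelt defines.

```shell
#!/bin/sh
# Set the API key from an environment variable so the key does not have to
# be written into the script itself.
# TD_MASTER_KEY and set_td_apikey are hypothetical names for this sketch.
set_td_apikey() {
  if [ -z "${TD_MASTER_KEY:-}" ]; then
    echo "TD_MASTER_KEY is not set" >&2
    return 1
  fi
  td apikey:set "$TD_MASTER_KEY"
}

# Usage (not run automatically; requires td to be installed):
#   TD_MASTER_KEY=YOUR_API_KEY set_td_apikey && td apikey
```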

List databases

To list all databases on the account that you have access to, run the following command.

td db:list

Example response:
+------------------------------------+--------+
| Name                               | Count  |
+------------------------------------+--------+
| database_1                         | 10     |
| database_2                         | 0      |
| database_3                         | 1000   |
+------------------------------------+--------+
3 rows in set
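
If a script needs only the database names, the table output can be reduced to one name per line. This is a minimal sketch that assumes the table layout shown above, with rows starting with "|" and the name in the first column; db_names is a hypothetical helper.

```shell
#!/bin/sh
# Read `td db:list` table output on stdin and print one database name per line.
# Rows start with "|"; the first such row is the header and is skipped.
# db_names is a hypothetical helper for this sketch.
db_names() {
  awk -F'|' '/^[|]/ && seen++ { gsub(/^ +| +$/, "", $2); print $2 }'
}

# Demonstration with a shortened copy of the sample response.
db_names <<'EOF'
+------------+-------+
| Name       | Count |
+------------+-------+
| database_1 | 10    |
| database_2 | 0     |
+------------+-------+
2 rows in set
EOF

# Typical use (requires td):
#   td db:list | db_names
```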

List tables in database

Now that you have a list of databases, you can easily look up all the tables in a database by its name.

td table:list database_1

Example response:
+---------------+--------------+------+-------+--------+---------------------------+---------------------------+---------------------------------------------+
| Database      | Table        | Type | Count | Size   | Last import               | Last log timestamp        | Schema                                      |
+---------------+--------------+------+-------+--------+---------------------------+---------------------------+---------------------------------------------+
| database_1    | table_1      | log  | 1     | 0.0 GB | YYYY-MM-DD HH:MM:SS +0000 | YYYY-MM-DD HH:MM:SS +0000 | param1:string, param2:string                |
| database_1    | table_2      | log  | 11    | 0.0 GB | YYYY-MM-DD HH:MM:SS +0000 | YYYY-MM-DD HH:MM:SS +0000 | param1:string, param2:string, param3:string |
+---------------+--------------+------+-------+--------+---------------------------+---------------------------+---------------------------------------------+
2 rows in set
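
The Schema column packs every column of a table into one comma-separated cell. As an illustration, the sketch below prints the schema of a single table as one "name type" pair per line. It assumes the eight-column layout shown above, with the table name in the second column and the schema in the last; table_schema is a hypothetical helper.

```shell
#!/bin/sh
# Read `td table:list` table output on stdin and print the schema of the
# table named in $1 as one "column type" pair per line.
# table_schema is a hypothetical helper for this sketch.
table_schema() {
  awk -F'|' -v t="$1" '
    /^[|]/ {
      name = $3; gsub(/^ +| +$/, "", name)      # second column: table name
      if (name != t) next
      schema = $9; gsub(/^ +| +$/, "", schema)  # last column: schema
      n = split(schema, cols, /, */)
      for (i = 1; i <= n; i++) {
        split(cols[i], kv, ":")
        print kv[1], kv[2]
      }
    }'
}

# Typical use (requires td):
#   td table:list database_1 | table_schema table_2
```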

Run a Query

Next, let's take a look at all the data in the table table_1 in the database database_1 by issuing a SQL query.

td query -d database_1 -w "SELECT * from table_1"

Example response:
Job 1224749766 is queued.
Use 'td job:show 1224749766' to show the status.
  Hive history file=/mnt/hive/tmp/12345/hive_job_log_dff88b37-a800-4df6-a2a3-26e53e83e047_1845878157.txt
  Job is running in resource pool: hadoop2 with priority: default
  **
  ** WARNING: time index filtering is not set on database_name.table_name!
  ** This query could be very slow as a result.
  ** If you used 'unix_timestamp' please modify your query to use TD_SCHEDULED_TIME instead
  **   or rewrite the condition using TD_TIME_RANGE
  ** https://docs.treasuredata.com/display/public/PD/Hive+Performance+Tuning#HivePerformanceTuning-LeveragingTime-basedPartitioning
  **
  OK
  MapReduce time taken: 0.651 seconds
  Fetching results...
  Total CPU Time: 0
  Total Records: 11
  Time taken: 2.368 seconds
  Debug log = debug_1224749766_1638717494786.gz
Status      : success
Result      :
WARNING: the job result is being downloaded...  182 B /  182 B : ================= 100 ================= 
+--------+--------+--------+-----------------------------------------------------------------------------+------------+
| param1 | param2 | param3 | v                                                                           | time       |
+--------+--------+--------+-----------------------------------------------------------------------------+------------+
| value1 | value2 | null   | {"param1":"value1","param2":"value2","time":"1623230494"}                   | 1623230494 |
| value1 | value2 | null   | {"param1":"value1","param2":"value2","time":"1623229900"}                   | 1623229900 |
| value1 | value2 | null   | {"param1":"value1","param2":"value2","time":"1623229819"}                   | 1623229819 |
+--------+--------+--------+-----------------------------------------------------------------------------+------------+
3 rows in set

The result of the query will be displayed on the command line. You can also review the results by running td job:show JOB_ID, or by viewing it in the web console.
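
To pass the job ID on to td job:show from a script, it can be taken from the "Job ... is queued." line of the output. The sketch below does this with sed; job_id_from_output is a hypothetical helper, and it assumes the message has exactly the wording shown in the sample response.

```shell
#!/bin/sh
# Print the job ID from a line such as "Job 1224749766 is queued."
# job_id_from_output is a hypothetical helper for this sketch.
job_id_from_output() {
  sed -n 's/^Job \([0-9][0-9]*\) is queued\.$/\1/p'
}

# Demonstration with the sample line from the response above.
echo "Job 1224749766 is queued." | job_id_from_output

# Typical use (requires td). 2>&1 merges both output streams, because this
# sketch does not assume which stream the message is written to:
#   job_id="$(td query -d database_1 "SELECT * from table_1" 2>&1 | job_id_from_output)"
#   td job:show "$job_id"
```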

Manage Jobs

You can view all recent jobs by running the following command.

td job:list

Example response:
+------------+---------+---------------------------+-------------+-------------------+------------+----------+--------+------+---------------+------------------------------+----------+
| JobID      | Status  | Start                     | Elapsed     | CPUTime           | ResultSize | Priority | Result | Type | Database      | Query                        | Duration |
+------------+---------+---------------------------+-------------+-------------------+------------+----------+--------+------+---------------+------------------------------+----------+
| 0123456789 | success | YYYY-MM-DD HH:MM:SS +0000 |         12s |                   | 182 B      | NORMAL   |        | hive | database_name | select * from table_1 ...    | 00:00:12 |
+------------+---------+---------------------------+-------------+-------------------+------------+----------+--------+------+---------------+------------------------------+----------+
1 rows in set

Further, if you have a job that is taking longer than expected, you can easily terminate it by running the following command.

td job:kill JOB_ID
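
To find candidate jobs to terminate, the job:list output can be filtered by its Status column. The following sketch assumes the table layout shown above, with JobID and Status as the first two columns. job_ids_with_status is a hypothetical helper, and "running" is assumed here to be the status value reported for jobs still in progress.

```shell
#!/bin/sh
# Read `td job:list` table output on stdin and print the IDs of jobs whose
# Status column equals $1.
# job_ids_with_status is a hypothetical helper for this sketch.
job_ids_with_status() {
  awk -F'|' -v want="$1" '
    /^[|]/ {
      id = $2; status = $3
      gsub(/ /, "", id); gsub(/ /, "", status)
      if (status == want) print id
    }'
}

# Dry run (requires td): print the kill commands instead of executing them.
#   td job:list | job_ids_with_status running | while read -r id; do
#     echo td job:kill "$id"
#   done
```

Printing the commands first gives you a chance to review the job IDs before anything is terminated.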

Further Reading