TD Toolbelt Quickstart
The TD Toolbelt is a command-line interface (CLI) that gives you access to most of the functionality of the Treasure Data product. This includes the ability to:
- Create databases and tables
- Import or export data into and from tables
- Set and modify the table schema
- Issue queries
- Monitor job status
- View and download job results
- Create scheduled queries, and much more.
Setup
Install
The TD Toolbelt is distributed in different ways to best suit the target environment:
- For Windows and Mac OSX, it is distributed as a prepackaged installer called the Toolbelt.
- For Linux (Ubuntu, RedHat, CentOS), it is distributed as part of the Treasure Agent (td-agent) package.
- As a Ruby gem, it can be installed on any system using the following command: gem install td
Use the Ruby option if you are already familiar with Ruby and already have it installed on your system.
Ruby Gem
If you are familiar with Ruby, you can use the CLI to install and maintain td as a gem. To install TD Toolbelt and its dependencies, run this command:
gem install td
If you use rbenv or rvm, different versions of the TD Toolbelt can be confined within each project, environment, or Ruby version in use, and you might need to install the td gem multiple times. Refer to the Ruby Gem section under Update below for more information.
Windows
This package contains all the necessary dependencies to allow TD Toolbelt to run on a Windows system and therefore includes a version of the Ruby environment which will be installed on the system as part of the process.
- Navigate to the TD Toolbelt download page: https://toolbelt.treasuredata.com/
- Select Windows.
- Select Download.
Mac OSX
This package contains all the necessary dependencies to allow TD Toolbelt to run on a Mac system and therefore includes a version of the Ruby environment which will be installed on the system as part of the process.
- Navigate to the TD Toolbelt download page: https://toolbelt.treasuredata.com/
- Select OSX.
- Select Download.
If you receive an error "package can't be opened" when you try to install the package, right-click the package and select Open to continue the installation.
Linux
Info
Before its rebranding in 2023, fluent-package was released as td-agent. Treasure Data supported several earlier versions of td-agent. However, newer versions of td-agent and fluent-package are now supported by the Fluentd community, so fluent-package itself is not supported or maintained by Treasure Data. Use it at your own risk. Treasure Data continues to provide support for the td command, not for its package.
On Linux, the td command is distributed as part of the td-agent package and the fluent-package package.
This package contains all the necessary dependencies to allow td-agent and td to run from the CLI on a Linux system, including the version of Ruby these tools are guaranteed to work with.
# 22.04 Jammy
curl -fsSL https://toolbelt.treasuredata.com/sh/install-ubuntu-jammy-fluent-package5.sh | sh
# 20.04 Focal
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-focal-td-agent4.sh | sh
# 18.04 Bionic
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-bionic-td-agent4.sh | sh
# Debian Bookworm
curl -fsSL https://toolbelt.treasuredata.com/sh/install-debian-bookworm-fluent-package5.sh | sh
# Debian Bullseye
curl -L https://toolbelt.treasuredata.com/sh/install-debian-bullseye-td-agent4.sh | sh
# Debian Buster
curl -L https://toolbelt.treasuredata.com/sh/install-debian-buster-td-agent4.sh | sh
# Redhat/CentOS Fluent Package 5
curl -fsSL https://toolbelt.treasuredata.com/sh/install-redhat-fluent-package5.sh | sh
# Amazon Linux 2023
curl -fsSL https://toolbelt.treasuredata.com/sh/install-amazon2023-fluent-package5.sh | sh
# Amazon Linux 2
curl -fsSL https://toolbelt.treasuredata.com/sh/install-amazon2-fluent-package5.sh | sh
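The commands above pipe a remote script straight into sh. A more cautious variant, sketched below, downloads the script to a file so that it can be reviewed before it runs. The helper names toolbelt_script_url and fetch_install_script are hypothetical and not part of TD Toolbelt; the URL pattern is taken from the commands listed above.

```shell
#!/bin/sh
# Build the install-script URL for a given script name,
# e.g. "ubuntu-jammy-fluent-package5" (names come from the list above).
toolbelt_script_url() {
  echo "https://toolbelt.treasuredata.com/sh/install-$1.sh"
}

# Download the script to a local file instead of piping it into sh.
# (hypothetical helper; not part of TD Toolbelt)
fetch_install_script() {
  name="$1"
  out="install-$name.sh"
  curl -fsSL -o "$out" "$(toolbelt_script_url "$name")" || return 1
  echo "$out"
}

# Usage (not executed here):
#   script=$(fetch_install_script ubuntu-jammy-fluent-package5)
#   less "$script"   # review what will run
#   sh "$script"
```

Keeping the download and the execution as separate steps also leaves a copy of the exact script that was run on the machine.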
Legacy support for EOL versions is also still available:
# 16.04 Xenial
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-xenial-td-agent4.sh | sh
# 14.04 Trusty
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-trusty-td-agent3.sh | sh
# 12.04 Precise
curl -L https://toolbelt.treasuredata.com/sh/install-ubuntu-precise-td-agent3.sh | sh
# Debian Stretch (64-bit only)
curl -L https://toolbelt.treasuredata.com/sh/install-debian-stretch-td-agent3.sh | sh
# Debian Jessie (64-bit only)
curl -L https://toolbelt.treasuredata.com/sh/install-debian-jessie-td-agent3.sh | sh
# Redhat/CentOS 6 td-agent3
curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent3.sh | sh
# Amazon Linux 1 td-agent3
curl -L https://toolbelt.treasuredata.com/sh/install-amazon1-td-agent3.sh | sh
Verify Install
Open your terminal and make sure the td command is available:
td --version
On Linux, you can also confirm that the package is installed:
# RedHat/CentOS
rpm -q td-agent
# Debian/Ubuntu
dpkg -l | grep td-agent
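For setup scripts, the same check can be wrapped in a small guard. This is a minimal sketch: the function name td_status is hypothetical, and it relies only on the td --version command shown in this section.

```shell
#!/bin/sh
# Report whether the td command is reachable on PATH.
# Returns non-zero when td cannot be found.
td_status() {
  if command -v td >/dev/null 2>&1; then
    echo "td found: $(td --version)"
  else
    echo "td not found on PATH"
    return 1
  fi
}

# Run the check; keep going with a hint if td is missing.
td_status || echo "install TD Toolbelt before continuing"
```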
Update
Mac OSX or Windows
If you downloaded the TD Toolbelt installer package (.pkg file) for Mac OSX or a Toolbelt installer executable for Windows (64-bit support only) from the TD Toolbelt website, the Toolbelt auto-updates itself.
Whenever a command is invoked from the CLI, the program checks whether a new version exists and, if so, downloads and installs it in the background. TD Toolbelt checks for an updated version every hour. You can trigger an update at any time with the following command:
td update
If your installed version is out of date (you can check it with the td --version command), upgrade as soon as possible by installing a more recent package from Treasure Data Toolbelt.
The td update command is just a placeholder for the following other installation methods.
Ruby Gem
If you installed TD Toolbelt as a gem, you must periodically check for a newer version. It is always recommended to update to the latest version because we strive to maintain 100% backward compatibility. To update using the gem command, run:
gem update td
If you use rbenv or rvm, then different versions of the TD Toolbelt might be confined within each project, environment, or Ruby version.
Also, if gem install was invoked with a customized GEM_HOME environment variable, or bundler's bundle install was called with the --path option, the installed gems are local to the project (and typically installed within the project's folder structure itself), not globally installed and available everywhere in the system.
You can double-check the version of the td gem in use by running:
gem list | grep -E '^td '
td (0.11.2, 0.10.99, 0.10.97)
Depending on your installation, you might need to replace gem with a wrapper, e.g. /usr/bin/ruby/toolbelt/bin/gem or /usr/lib/fluent/ruby/bin/fluent-gem.
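When several versions are installed, gem list prints them on one line, as in the sample output above. The sketch below pulls out the first version listed so that a script can compare it. The helper first_td_version is hypothetical and assumes the output format shown in this section.

```shell
#!/bin/sh
# Print the first version listed for the td gem.
# Reads `gem list` output on stdin, e.g. "td (0.11.2, 0.10.99, 0.10.97)".
first_td_version() {
  sed -n 's/^td (\([^,)]*\).*/\1/p'
}

# Example with the sample line from this section:
echo 'td (0.11.2, 0.10.99, 0.10.97)' | first_td_version
# prints 0.11.2

# Real use: gem list | first_td_version
```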
Basic Use
The td command has multiple help sub-menus that you can access with td help <command>. For example:
td help db
td help table
td help job
Set API Key
The first thing to do is set your API key. Make sure to use a Master API Key.
td apikey:set YOUR_API_KEY
To confirm your API Key is set correctly, you can run the following command.
td apikey
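In a shared setup script, the key can be passed in from a variable rather than typed literally. A minimal sketch: set_td_apikey is a hypothetical wrapper, and MY_TD_KEY is an illustrative variable name, not something this page says TD Toolbelt reads on its own.

```shell
#!/bin/sh
# Refuse to call td with an empty key, then store the key with apikey:set.
set_td_apikey() {
  key="$1"
  if [ -z "$key" ]; then
    echo "no API key supplied" >&2
    return 1
  fi
  td apikey:set "$key"
}

# Usage (MY_TD_KEY is an illustrative variable name):
#   set_td_apikey "$MY_TD_KEY" && td apikey
```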
List databases
To list all databases on the account that you have access to, run the following command.
td db:list
+------------------------------------+--------+
| Name | Count |
+------------------------------------+--------+
| database_1 | 10 |
| database_2 | 0 |
| database_3 | 1000 |
+------------------------------------+--------+
3 rows in set
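The table printed by td db:list is meant for reading, but its rows are easy to split for scripting. The sketch below extracts only the database names. The helper db_names is hypothetical and assumes the table layout shown above.

```shell
#!/bin/sh
# Print one database name per line from `td db:list` table output on stdin.
db_names() {
  awk -F'|' '/^[|]/ {
    name = $2
    gsub(/^ +| +$/, "", name)       # trim padding around the cell
    if (name != "Name") print name  # skip the header row
  }'
}

# Usage: td db:list | db_names
```

Border lines start with "+" and the summary line has no leading "|", so only data rows reach the output.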
List tables in database
Now that you have a list of databases, you can easily look up all the tables in a database by name.
td table:list database_1
+---------------+--------------+------+-------+--------+---------------------------+---------------------------+---------------------------------------------+
| Database | Table | Type | Count | Size | Last import | Last log timestamp | Schema |
+---------------+--------------+------+-------+--------+---------------------------+---------------------------+---------------------------------------------+
| database_1 | table_1 | log | 1 | 0.0 GB | YYYY-MM-DD HH:MM:SS +0000 | YYYY-MM-DD HH:MM:SS +0000 | param1:string, param2:string |
| database_1 | table_2 | log | 11 | 0.0 GB | YYYY-MM-DD HH:MM:SS +0000 | YYYY-MM-DD HH:MM:SS +0000 | param1:string, param2:string, param3:string |
+---------------+--------------+------+-------+--------+---------------------------+---------------------------+---------------------------------------------+
2 rows in set
Run a Query
Next, let's take a look at all the data in the table table_1 in the database database_1. We will do this by issuing a SQL query command.
td query -d database_1 -w "SELECT * from table_1"
Job 1224749766 is queued.
Use 'td job:show 1224749766' to show the status.
Hive history file=/mnt/hive/tmp/12345/hive_job_log_dff88b37-a800-4df6-a2a3-26e53e83e047_1845878157.txt
Job is running in resource pool: hadoop2 with priority: default
**
** WARNING: time index filtering is not set on database_name.table_name!
** This query could be very slow as a result.
** If you used 'unix_timestamp' please modify your query to use TD_SCHEDULED_TIME instead
** or rewrite the condition using TD_TIME_RANGE
** https://docs.treasuredata.com/display/public/PD/Hive+Performance+Tuning#HivePerformanceTuning-LeveragingTime-basedPartitioning
**
OK
MapReduce time taken: 0.651 seconds
Fetching results...
Total CPU Time: 0
Total Records: 11
Time taken: 2.368 seconds
Debug log = debug_1224749766_1638717494786.gz
Status : success
Result :
WARNING: the job result is being downloaded... 182 B / 182 B : ================= 100 =================
+--------+--------+--------+-----------------------------------------------------------------------------+------------+
| param1 | param2 | param3 | v | time |
+--------+--------+--------+-----------------------------------------------------------------------------+------------+
| value1 | value2 | null | {"param1":"value1","param2":"value2","time":"1623230494"} | 1623230494 |
| value1 | value2 | null | {"param1":"value1","param2":"value2","time":"1623229900"} | 1623229900 |
| value1 | value2 | null | {"param1":"value1","param2":"value2","time":"1623229819"} | 1623229819 |
+--------+--------+--------+-----------------------------------------------------------------------------+------------+
3 rows in set
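The first line of the output above carries the job ID, which commands such as td job:show expect. A minimal sketch for capturing it in a script follows. The helper extract_job_id is hypothetical and assumes the "Job N is queued." line shown above.

```shell
#!/bin/sh
# Print the numeric job ID from td query output read on stdin.
extract_job_id() {
  sed -n 's/^Job \([0-9][0-9]*\) is queued\.$/\1/p'
}

# Example with the sample line from this section:
echo 'Job 1224749766 is queued.' | extract_job_id
# prints 1224749766

# Real use (2>&1 in case the line is written to stderr):
#   job_id=$(td query -d database_1 "SELECT * from table_1" 2>&1 | extract_job_id)
```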
You can check the status of a job at any time by running td job:show JOB_ID, or by viewing it in the web console.
Manage Jobs
You can view all recent jobs by running the following command.
td job:list
+------------+---------+---------------------------+-------------+-------------------+------------+----------+--------+------+---------------+------------------------------+----------+
| JobID | Status | Start | Elapsed | CPUTime | ResultSize | Priority | Result | Type | Database | Query | Duration |
+------------+---------+---------------------------+-------------+-------------------+------------+----------+--------+------+---------------+------------------------------+----------+
| 0123456789 | success | YYYY-MM-DD HH:MM:SS +0000 | 12s | | 182 B | NORMAL | | hive | database_name | select * from table_1 ... | 00:00:12 |
+------------+---------+---------------------------+-------------+-------------------+------------+----------+--------+------+---------------+------------------------------+----------+
1 rows in set
Further, if you have a job that is taking longer than expected, you can easily terminate it by running the following command.
td job:kill JOB_ID
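Before killing a job, a script may want to confirm that the job is still unfinished. The following is a sketch under assumptions: job_status and kill_if_unfinished are hypothetical helpers, the "Status : ..." line format is taken from the query output earlier in this document, and the status names queued and running are assumed rather than confirmed by this page.

```shell
#!/bin/sh
# Extract the status value from `td job:show` output read on stdin.
job_status() {
  sed -n 's/^Status *: *//p' | head -n 1
}

# Kill the job only if it has not finished yet.
kill_if_unfinished() {
  job_id="$1"
  status=$(td job:show "$job_id" | job_status)
  case "$status" in
    queued|running)   # assumed names for unfinished states
      td job:kill "$job_id"
      ;;
    *)
      echo "job $job_id not killed (status: $status)"
      ;;
  esac
}

# Usage: kill_if_unfinished JOB_ID
```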
Further Reading
- GitHub Source
- Proxy Access - Use the toolbelt with a proxy.
- API Key search order - Order in which TD Toolbelt searches for the API key and API server. Useful for when you have multiple keys across multiple projects on the same machine.
- API Reference
- TD Toolbelt website - Original source for downloading the toolbelt. Guaranteed to be up to date.