Installing Bulk Data Import
You can import data using Embulk, Treasure Data's open-source bulk data loader. Embulk helps transfer data between various databases, storage locations, file formats, and cloud services. For more information, see the Embulk documentation.
Prerequisites
- Basic knowledge of Treasure Data.
- Basic knowledge of Embulk.
- Embulk is a Java application. Make sure that Java is installed.
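The Java prerequisite can be checked from a terminal before installing. The following is a minimal sketch for Linux, Mac, or BSD shells; `JAVA_STATUS` is a hypothetical variable name used only for illustration.

```shell
# Check whether a Java runtime is available on the PATH.
# JAVA_STATUS is an illustrative variable, not something Embulk reads.
if command -v java >/dev/null 2>&1; then
  JAVA_STATUS="found"
  # java prints its version to stderr, so redirect it to stdout.
  java -version 2>&1 || JAVA_STATUS="missing"
else
  JAVA_STATUS="missing"
fi

if [ "$JAVA_STATUS" = "missing" ]; then
  echo "No working Java runtime found; install Java before installing Embulk." >&2
fi
```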
Installing Embulk from the Command Line
Platform | Execute the following commands |
---|---|
Linux, Mac, BSD | curl --create-dirs -o ~/.embulk/bin/embulk -L "https://dl.embulk.org/embulk-latest.jar" && chmod +x ~/.embulk/bin/embulk && echo 'export PATH="$HOME/.embulk/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc |
Windows (using PowerShell) | PowerShell -Command "& {Invoke-WebRequest https://dl.embulk.org/embulk-latest.jar -OutFile embulk.bat}" |
Installing the Embulk Treasure Data Plugin
You can use Embulk plugins to load data to or from various systems and file formats. View a list of Embulk plugins by category.
The following command installs the embulk-output-td plugin, which imports records into Treasure Data:
embulk gem install embulk-output-td
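Once the plugin is installed, an import is described in a configuration file such as the config.yml used later on this page. The sketch below writes an example file from the shell. It assumes CSV input files under ./data/; the path, column names, database, table, and API key are placeholders, not values from this guide.

```shell
# Write an example Embulk configuration that loads CSV files into Treasure Data.
# Every value below (path_prefix, columns, apikey, database, table) is a placeholder.
cat > config.yml <<'EOF'
in:
  type: file
  path_prefix: ./data/sample_
  parser:
    type: csv
    skip_header_lines: 1
    columns:
      - {name: time, type: timestamp, format: '%Y-%m-%d %H:%M:%S'}
      - {name: id, type: long}
      - {name: name, type: string}
out:
  type: td
  apikey: YOUR_TD_API_KEY
  endpoint: api.treasuredata.com
  database: my_database
  table: my_table
  time_column: time
  mode: append
EOF
```

With a file like this in place, `embulk preview config.yml` shows how the input would be parsed, and `embulk run config.yml` performs the import.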
Using a Proxy Server
If uploads fail, check whether your network requires a proxy. You can pass the proxy settings with the following command-line options:
Linux:
embulk -J-Dhttp.proxyHost=xxxx -J-Dhttp.proxyPort=xxxx -J-Dhttp.proxyUser=xxxx -J-Dhttp.proxyPassword=xxxx run config.yml
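On Linux, the proxy options can be kept in shell variables so they are not retyped for every run. This is a sketch only: the host and port are placeholder values, the variable names are hypothetical, and the command is printed rather than executed.

```shell
# Placeholder proxy settings; replace with the values for your network.
PROXY_HOST="proxy.example.com"
PROXY_PORT="8080"

# Compose the -J options that Embulk forwards to the JVM.
PROXY_OPTS="-J-Dhttps.proxyHost=${PROXY_HOST} -J-Dhttps.proxyPort=${PROXY_PORT}"

# Dry run: print the full command instead of executing it.
EMBULK_CMD="embulk ${PROXY_OPTS} run config.yml"
echo "${EMBULK_CMD}"
```

To run the import for real, execute the printed command. If the proxy requires authentication, the user and password options shown above can be appended to `PROXY_OPTS` in the same way.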
Windows:
embulk.bat "-J-Dhttps.proxyHost=xxxx" "-J-Dhttps.proxyPort=xxxx" "-J-Dhttp.proxyUser=xxxx" "-J-Dhttp.proxyPassword=xxxx" run config.yml
Alternatively, call Java directly:
"java" -Dhttps.proxyHost="host" -Dhttps.proxyPort="port" -jar embulk.bat run config.yml