Installing Bulk Data Import

You can import data using Embulk, Treasure Data’s open-source bulk data loader. Embulk helps transfer data between various databases, storage locations, file formats, and cloud services. For more information, see the Embulk documentation.

Prerequisites

  • Basic knowledge of Treasure Data.
  • Basic knowledge of Embulk.
  • Embulk is a Java application. Make sure that Java is installed.
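
One way to confirm the Java prerequisite before installing is sketched below. The helper name check_cmd is an assumption made for this example and is not part of Embulk or Treasure Data tooling.

```shell
#!/bin/sh
# check_cmd is a hypothetical helper for this sketch:
# it reports whether a command is available on PATH.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found"
    return 1
  fi
}

# Embulk needs a Java runtime; show the first line of its version banner if present.
# "java -version" writes to stderr, so stderr is redirected to stdout.
if check_cmd java; then
  java -version 2>&1 | head -n 1
fi
```

If the script reports that java is not found, install a Java runtime before continuing.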

Installing Embulk from the Command Line

Execute the following commands for your platform.

Linux, Mac, BSD:
  curl --create-dirs -o ~/.embulk/bin/embulk -L "https://dl.embulk.org/embulk-latest.jar"
  chmod +x ~/.embulk/bin/embulk
  echo 'export PATH="$HOME/.embulk/bin:$PATH"' >> ~/.bashrc
  source ~/.bashrc

Windows (using PowerShell):
  PowerShell -Command "& {Invoke-WebRequest https://dl.embulk.org/embulk-latest.jar -OutFile embulk.bat}"

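On Linux, Mac, or BSD, the installation can be checked with a short script such as the following. This is a minimal sketch: embulk_status is a hypothetical helper name, and the path assumes the install location used in the commands above.

```shell
#!/bin/sh
# embulk_status is a hypothetical helper for this sketch:
# it reports whether an executable file exists at the given path.
embulk_status() {
  if [ -x "$1" ]; then
    echo "installed: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Assumed location, taken from the curl command in the install steps.
EMBULK_BIN="$HOME/.embulk/bin/embulk"

# Print the Embulk version only when the executable is in place.
if embulk_status "$EMBULK_BIN"; then
  "$EMBULK_BIN" --version
fi
```
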
Installing the Embulk Treasure Data Plugin

You can use Embulk plugins to load data to or from various systems and file formats. View a list of Embulk plugins by category.

The following command installs the embulk-output-td plugin, which imports records to Treasure Data:

  embulk gem install embulk-output-td

Using a Proxy Server

If you cannot upload data, check whether your network requires a proxy. You can set the proxy with the following command-line options:

Linux:
  embulk -J-Dhttp.proxyHost=xxxx -J-Dhttp.proxyPort=xxxx -J-Dhttp.proxyUser=xxxx -J-Dhttp.proxyPassword=xxxx run config.yml
Windows:
  embulk.bat "-J-Dhttps.proxyHost=xxxx" "-J-Dhttps.proxyPort=xxxx" "-J-Dhttp.proxyUser=xxxx" "-J-Dhttp.proxyPassword=xxxx" run config.yml
Or:
  "java" -Dhttps.proxyHost="host" -Dhttps.proxyPort="port" -jar embulk.bat run config.yml
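
To avoid retyping the proxy options on Linux, they could be assembled from variables, as in the sketch below. The helper name proxy_opts and the values proxy.example.com and 8080 are placeholders assumed for illustration. The script only prints the resulting command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# proxy_opts is a hypothetical helper for this sketch:
# it builds the -J-Dhttp.proxyHost and -J-Dhttp.proxyPort options
# from a host ($1) and a port ($2).
proxy_opts() {
  echo "-J-Dhttp.proxyHost=$1 -J-Dhttp.proxyPort=$2"
}

# Placeholder values; replace them with your own proxy settings.
PROXY_HOST="proxy.example.com"
PROXY_PORT="8080"

# Print the full command for review instead of executing it.
echo "embulk $(proxy_opts "$PROXY_HOST" "$PROXY_PORT") run config.yml"
```

The user and password options from the Linux command above can be appended in the same way if the proxy requires authentication.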