pytd vs td-client-python vs pandas-td
Treasure Data offers three different Python clients. The following list summarizes each client’s characteristics.
td-client-python
- Basic REST API wrapper.
- Similar functionality to td-client-{ruby, java, node, go}.
- Capabilities are limited to what the Treasure Data REST API supports.
pytd
- Efficient connection to Presto based on presto-python-client.
- Multiple data ingestion methods and a variety of utility functions.
pandas-td (deprecated)
- Old tool optimized for pandas and Jupyter Notebook.
- pytd offers a compatible function set under pytd.pandas_td.
Choosing a Client
The client you choose depends on your specific use case. Here are some common guidelines:
- Use td-client-python if you want to execute basic CRUD operations from Python applications.
- Use pytd for (1) analytical purposes that rely on pandas and Jupyter Notebook, and (2) more efficient data access.
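As a rough illustration of these guidelines, the sketch below runs the same query through each client. It assumes both packages are installed and that the caller supplies a valid API key; the helper names, the endpoint, the sample_datasets database, and the query itself are placeholders chosen for this example rather than anything prescribed by either library.

```python
# Placeholder query against the sample_datasets database (an assumption for this sketch).
QUERY = "SELECT COUNT(1) AS cnt FROM www_access"


def count_with_tdclient(apikey, database="sample_datasets"):
    """Issue the query as a job through the REST API wrapper and collect the rows."""
    import tdclient  # imported lazily so this module loads without the package

    with tdclient.Client(apikey=apikey) as client:
        job = client.query(database, QUERY, type="presto")
        job.wait()  # block until the job finishes
        return list(job.result())


def count_with_pytd(apikey, database="sample_datasets"):
    """Run the query through pytd; the result is a dict with 'columns' and 'data'."""
    import pytd  # imported lazily for the same reason

    client = pytd.Client(
        apikey=apikey,
        endpoint="https://api.treasuredata.com/",  # placeholder endpoint
        database=database,
    )
    return client.query(QUERY)


# Example call (requires network access and a real key):
#   rows = count_with_tdclient("YOUR_API_KEY")
#   result = count_with_pytd("YOUR_API_KEY")
```

The td-client-python version goes through the job API (submit, wait, fetch), while the pytd version returns the result set directly, which tends to suit interactive analysis.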
Info
Type conversion in pytd differs in one known way from the original pandas_td.to_td function. Because pytd.writer.BulkImportWriter (the default writer in pytd) uses CSV as an intermediate file format before uploading a table, column types might change via pandas.read_csv. To preserve column types as much as possible, pass the fmt="msgpack" argument to the to_td function.
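The kind of type drift a CSV intermediate can cause is easy to reproduce with pandas alone. The following sketch is not pytd's actual upload path; it only writes a DataFrame to an in-memory CSV and reads it back to show how a string column can come back as integers. The sample data and both helper names are hypothetical, and the second helper assumes a connection created with pytd.pandas_td.connect and a table name in database.table form.

```python
import io

import pandas as pd


def csv_round_trip(df):
    """Write df to an in-memory CSV and read it back with pandas.read_csv."""
    buf = io.StringIO()
    df.to_csv(buf, index=False)
    buf.seek(0)
    return pd.read_csv(buf)


# Hypothetical data: ZIP codes stored as strings with leading zeros.
original = pd.DataFrame({"zip": ["00501", "02134"], "qty": [1, 2]})
restored = csv_round_trip(original)

print(original["zip"].tolist())  # ['00501', '02134']
print(restored["zip"].tolist())  # [501, 2134]


def upload_with_msgpack(df, table, con):
    """Upload df via pytd's pandas-td compatible API, requesting msgpack instead of CSV.

    Assumes `con` comes from pytd.pandas_td.connect(...) and `table` looks
    like 'database.table'; if_exists and index are illustrative choices.
    """
    import pytd.pandas_td as td  # imported lazily; requires pytd to be installed

    td.to_td(df, table, con, if_exists="replace", index=False, fmt="msgpack")
```

After the round trip, the zip column holds integers and the leading zeros are gone, which is the sort of change the fmt="msgpack" argument is meant to avoid.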