Accessing Presto 350 (Legacy)

Info

The current version of Presto on Treasure Data is Trino 423. For additional information about Trino 423, including release information and known limitations, see Accessing Trino 423.

TD Presto API Release Information for Presto 350 (Legacy)

All native Presto functions are supported in Treasure Data. The following sections provide notes about the features, changes, and limitations of Presto 350.

Performance Improvements

Here is a list of updates related to performance improvements:

  • Performance improvements in querying information_schema tables. #999, #1306, #1329, #1543, #2488
  • Pushdown support for some functions and operators in the PostgreSQL (Data Tank) connector. #3881, #4111, #4112, #5261
  • Improved query performance by reducing worker-to-worker communication overhead. #6126, #5905, #5949
  • Improved performance of ORDER BY ... LIMIT queries. #6072
  • Improved performance of queries with uncorrelated IN clauses. #5582
  • Improved performance of queries that use the decimal type. #4730, #4886, #5181
  • Improved performance and accuracy of approx_percentile(). #5158
  • Improved performance of certain cross join queries. #5276
  • Reduced latency for queries that perform a broadcast join of a large table. #5237
  • Improved performance of queries involving comparisons between DOUBLE or REAL values and integer values. #3533
  • Improved performance of queries involving row_number(). #3614
  • Improved performance of queries containing a LIKE predicate. #3618
  • Improved performance for queries that read fields from nested structures. #2672
  • Improved performance of queries involving constant scalar subqueries. #3432
  • Improved query performance by removing redundant data reshuffling. #2853
  • Improved performance of inequality joins involving BETWEEN. #2859
  • Improved join performance for dictionary-encoded data. #2862
  • Improved performance of queries containing redundant scalar subqueries. #2456
  • Improved performance of INSERT and CREATE TABLE ... AS queries containing redundant ORDER BY clauses. #2044
  • Improved performance when processing columns of map type. #2015
  • Reduced query memory usage by improving retained size estimation for VARCHAR and CHAR types. #4123
  • Improved performance of certain join queries by reducing the amount of data that needs to be scanned. #1673
  • Improved performance of queries containing complex predicates. #1515
  • Improved performance of certain window functions when using bounded window frames (e.g., ROWS BETWEEN ... PRECEDING AND ... FOLLOWING). #464
  • Improved performance of certain queries involving coercions and complex expressions in JOIN conditions. #1390

New UDFs

Many new UDFs are available. See the official Trino documentation for the usage of each function.

  • contains_sequence()
  • concat_ws()
  • murmur3()
  • from_unixtime_nanos()
  • T-Digest functions (see T-Digest functions — Trino 361 Documentation)
  • from_iso8601_timestamp_nanos()
  • human_readable_seconds()
  • bitwise_left_shift(), bitwise_right_shift(), and bitwise_right_shift_arithmetic()
  • luhn_check()
  • approx_most_frequent()
  • random(m, n)
  • starts_with()
  • regexp_count(), regexp_position()
  • strpos(string, substring, instance)
  • Geospatial functions: to_encoded_polyline(), from_encoded_polyline(), line_interpolate_point(), line_interpolate_points(), geometry_from_hadoop_shape(), ST_Length(SphericalGeography)
  • at_timezone(), with_timezone()
  • last_day_of_month()
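
For example, a few illustrative calls (a minimal sketch; the expected outputs follow the Trino documentation):

SELECT concat_ws(',', 'a', 'b', 'c')       -- 'a,b,c'
SELECT human_readable_seconds(96)          -- '1 minute, 36 seconds'
SELECT luhn_check('79927398713')           -- true
SELECT starts_with('presto350', 'presto')  -- true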

New SQL support

  • Add IF EXISTS and IF NOT EXISTS syntax to ALTER TABLE. #4651
  • Add support for INTERSECT ALL and EXCEPT ALL. #2152
  • Add support for the DISTINCT clause in aggregations within correlated subqueries. #5904
  • Add support for RANGE BETWEEN <value> PRECEDING AND <value> FOLLOWING window frames. #609
  • Add support for window frames based on GROUPS. #5713
  • Add support for extract() with TIMEZONE_HOUR and TIMEZONE_MINUTE for time with time zone values. #5668
  • Add support for correlated subqueries in recursive queries. #4877
  • Add support for the IN predicate with subqueries in outer join conditions. #4151
  • Add support for quantified comparisons (e.g., > ALL (...)) in aggregation queries. #4128
  • Add support for the variable-precision TIME type. #4381
  • Add support for the variable-precision TIME WITH TIME ZONE type. #4905
  • Add support for the variable-precision TIMESTAMP (without time zone) type. #3783
  • Add support for the variable-precision TIMESTAMP WITH TIME ZONE type. #3947
  • Allow inserting values of a larger type into a smaller type when the values fit. For example, BIGINT into SMALLINT, or VARCHAR(10) into VARCHAR(3). Values that don't fit cause an error at runtime. #2061
  • Allow using .* on expressions of type ROW in the SELECT clause to convert the fields of a row into multiple columns. #1017
  • Allow references to tables in the enclosing query when using .*. #1867
  • Add support for IGNORE NULLS in window functions. #1244
  • Add support for INNER and OUTER joins involving UNNEST. #1522
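
A few of these additions in action (a minimal sketch; the table and column names are illustrative):

-- IF NOT EXISTS when adding a column (#4651)
ALTER TABLE mytbl ADD COLUMN IF NOT EXISTS score BIGINT

-- INTERSECT ALL keeps duplicate rows common to both inputs (#2152)
SELECT x FROM t1 INTERSECT ALL SELECT x FROM t2

-- Window frame defined over GROUPS of peer rows (#5713)
SELECT k, v, sum(v) OVER (ORDER BY k GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS s
FROM t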

JDBC Improvements

Query parameters are supported in LIMIT, OFFSET, and FETCH FIRST clauses.

Connection conn = ...
PreparedStatement stmt = conn.prepareStatement(
  "SELECT * FROM sample_datasets.www_access OFFSET ? LIMIT ?");
stmt.setInt(1, 10); // OFFSET = 10
stmt.setInt(2, 20); // LIMIT = 20
ResultSet rs = stmt.executeQuery();
...

Time Partitioning Shuffle Magic Comment for INSERT CTAS

Treasure Data supports a TD-specific setting to customize the partition size of INSERT/CTAS queries for better query performance. The following is an example of the time_partitioning_range session property set as a magic comment.

-- set session time_partitioning_range = '12h'

This comment can also disable shuffling if you specify no-shuffle as the value; in that case, however, the partition size cannot be customized. Presto 350 introduces a new magic comment, time_partitioning_shuffle = 'true'|'false', so that you can control partition shuffling and the time partitioning range independently.
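
For example, the two settings can be combined (the values and table names are illustrative):

-- set session time_partitioning_shuffle = 'true'
-- set session time_partitioning_range = '6h'
INSERT INTO mytbl SELECT * FROM src_tbl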

Migration Guide for Presto 350 (Legacy)

While most existing Presto v317 queries are compatible with the new version v350, some queries require changing the syntax.

Run the New Presto Version

You can specify the Presto version that runs your queries by using the following query hints:

(i) Presto 350:

-- @TD engine_version: 350

(ii) Presto 317:

-- @TD engine_version: 317

You can request Treasure Data Support to change the default engine version to Presto 350.

Next Steps

We plan to start migrating the Presto default version in the near future. The target date will be announced soon, but we recommend that you verify your Presto queries and switch to the new version as soon as possible.

When we announce the schedule, we plan to notify the executing users of a list of potentially incompatible Presto queries identified through our simulation.

If you have any further inquiries, please reach out to support@treasure-data.com or your Customer Success representative.

Changed Presto Queries

SELECT DISTINCT, ORDER BY

When SELECT DISTINCT is used with an ORDER BY clause, the ORDER BY expressions must appear in the select list.

The following query works in the current version of Presto.

SELECT DISTINCT from_unixtime(time), COUNT(*), time
FROM sample_datasets.www_access
GROUP BY from_unixtime(time), time
ORDER BY from_unixtime(time)

This query fails on Presto 350 with an error message like For SELECT DISTINCT, ORDER BY expressions must appear in select list. To solve this issue on Presto 350, rewrite the query as follows.

SELECT DISTINCT from_unixtime(time), COUNT(*), time
FROM sample_datasets.www_access
GROUP BY from_unixtime(time), time
ORDER BY from_unixtime(sample_datasets.www_access.time)

lag() and lead() ORDER BY

lag() and lead() require ORDER BY in Presto 350. For example, the following query works on the current version, but not on Presto 350.

SELECT
  time,
  LAG(path, 1) OVER () AS lag_data,
  path,
  LEAD(path, 1) OVER () AS lead_data
FROM
  sample_datasets.www_access

To support this query in Presto 350, you must explicitly use the ORDER BY clause as follows.

SELECT
  time,
  LAG(path, 1) OVER (ORDER BY time) AS lag_data,
  path,
  LEAD(path, 1) OVER (ORDER BY time) AS lead_data
FROM
  sample_datasets.www_access

Window Frame

Presto 350 adds a new semantic check for lag() and lead(): a window frame cannot be specified. For example, the following query runs on the current version (though it doesn't produce the expected results), but it fails on Presto 350.

SELECT
  time,
  user,
  LAG(path, 1) OVER (ORDER BY time ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS lag_data,
  path,
  LEAD(path, 1) OVER (ORDER BY time ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS lead_data
FROM
  sample_datasets.www_access

To fix this query in Presto 350, remove ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING.

SELECT
  time,
  user,
  LAG(path, 1) OVER (ORDER BY time) AS lag_data,
  path,
  LEAD(path, 1) OVER (ORDER BY time) AS lead_data
FROM
  sample_datasets.www_access

TIME and TIMESTAMP Behavior Changes

In Presto 350, the TIME and TIMESTAMP behaviors follow the SQL standard, which introduces some incompatibilities with the current version.

Presto has the following data types to represent time and timestamp:

  • TIME
  • TIME WITH TIME ZONE
  • TIMESTAMP
  • TIMESTAMP WITH TIME ZONE

In the current version, TIME and TIMESTAMP behave similarly to the WITH TIME ZONE types, but in Presto 350, TIME and TIMESTAMP don't have an associated time zone. This might affect query results.

Examples

The result of the following query will be different depending on the Presto version:

SELECT CAST('2000-01-01 00:00:00 US/Eastern' AS TIMESTAMP)
--> 2000-01-01 05:00:00.000 on Presto 317
--> 2000-01-01 00:00:00.000 on Presto 350

When you convert WITH TIME ZONE types to TIME or TIMESTAMP, the time zone is taken into account in the current version, but not in Presto 350. If you have queries that contain such conversions, you might need to revise them for Presto 350. The query in the previous example can be rewritten as follows:

SELECT CAST(
  -- Convert to UTC before CAST as TIMESTAMP
  TIMESTAMP '2000-01-01 00:00:00 US/Eastern' AT TIME ZONE 'UTC'
AS TIMESTAMP)

or simply use TIMESTAMP WITH TIME ZONE type instead:

SELECT CAST('2000-01-01 00:00:00 US/Eastern' AS TIMESTAMP WITH TIME ZONE)

In addition, political time zones (e.g., America/Los_Angeles) are no longer allowed in TIME WITH TIME ZONE, to avoid issues around DST and possible future policy changes.

Therefore, the following query works on the current version, but not on Presto 350.

SELECT TIME '01:02:03.456 America/Los_Angeles'

Rewrite this query as follows:

SELECT TIME '01:02:03.456 -07:00'

Note that time zone names are case-sensitive in Presto 350. For example, Asia/tokyo is invalid and has to be Asia/Tokyo when using Presto 350.
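
For example:

-- Valid on Presto 350
SELECT TIMESTAMP '2000-01-01 00:00:00 Asia/Tokyo'

-- Fails on Presto 350 because the zone name is not capitalized correctly
SELECT TIMESTAMP '2000-01-01 00:00:00 Asia/tokyo'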

Query results can also be rendered differently. For example, SELECT CURRENT_TIME returns 09:29:52.540 UTC on the current version, but 09:29:52.540+00:00 on Presto 350.

Query Length Limitation

Presto 350 might generate longer bytecode internally, so long queries might hit the query length limitation. If you get error messages like the following, you need to shorten the query.

  • java.lang.IllegalArgumentException: bad parameter count 256
  • io.prestosql.spi.PrestoException: Query exceeded maximum columns

checksum()

The checksum() function implementation changed in Presto 350, so checksum() generates different results than previous Presto versions.
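
For example, a query like the following (the column choice is illustrative) returns a different varbinary value on Presto 350 than on Presto 317:

SELECT checksum(path) FROM sample_datasets.www_access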

approx_percentile()

approx_percentile() now uses T-digest as its internal data structure, which is more accurate and faster than the previous implementation. The function produces slightly different results.

However, if you specify the accuracy parameter, approx_percentile() doesn't use T-digest, because T-digest doesn't support the accuracy parameter. If you want to benefit from the T-digest-based approx_percentile(), consider dropping the accuracy parameter.

Info

approx_percentile() doesn't allow infinite values (for example, the result of dividing by zero). If such a value is given, the query fails with java.lang.IllegalArgumentException: value must be finite or java.lang.IllegalArgumentException: value is NaN. In this case, exclude infinite and NaN values before calling approx_percentile(), or use the old version of approx_percentile() by intentionally specifying the accuracy parameter as a workaround, as follows:

-- Before
SELECT approx_percentile(
  column1 / column2, -- can be an infinite value
  0.5                -- percentile
) FROM ...

-- Exclude invalid values
SELECT approx_percentile(
  column1 / column2, -- can be an infinite value
  0.5                -- percentile
) FILTER (WHERE is_finite(column1 / column2)) FROM ...

-- Use the old version
SELECT approx_percentile(
  column1 / column2, -- can be an infinite value
  1,                 -- weight
  0.5,               -- percentile
  0.01               -- accuracy
) FROM ...

double NaN to integer

Presto 350 doesn't allow casting the NaN value to INTEGER; in the current version, CAST converts it to 0. The following query works on the current version but fails on Presto 350 with the error message Cannot cast double NaN to integer.

SELECT CAST(nan() AS INTEGER)

You can restore the original behavior by using TRY_CAST and COALESCE as follows:

SELECT COALESCE(TRY_CAST(nan() AS INTEGER), 0)

information_schema.columns

The comment and extra_info columns have been removed from information_schema.columns. If you have queries that refer to these columns, you have to revise them.

Here is a query result of information_schema.columns on Presto 350.

presto> select * from information_schema.columns limit 10;
 table_catalog |    table_schema    |    table_name    |   column_name   | ordinal_position | column_default | is_nullable | data_type
---------------+--------------------+------------------+-----------------+------------------+----------------+-------------+-----------
 td-presto     | information_schema | tables           | table_catalog   |                1 | NULL           | YES         | varchar
 td-presto     | information_schema | tables           | table_schema    |                2 | NULL           | YES         | varchar
...

Length check for CHAR(x) and VARCHAR(x)

A length check for CHAR(x) and VARCHAR(x) was introduced in Presto 350; extra characters are simply truncated from the results, but you might see one of the following errors in some cases.

io.prestosql.spi.PrestoException: Could not serialize column 'col1' of type 'varchar(10)' at position 1:2

io.prestosql.spi.PrestoException: Character count exceeds length limit 10: base64:xxxxxxxxxxxx

You might see these errors if your query contains CAST(xxx AS VARCHAR(10)) AS col1 and the value of col1 is longer than 10 characters. In this case, revise the query to limit the value length, or cast without a length limit using CAST(xxx AS VARCHAR).
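
For example (a sketch using the sample dataset from the examples above):

-- Might fail on Presto 350 when path is longer than 10 characters
SELECT CAST(path AS VARCHAR(10)) AS col1 FROM sample_datasets.www_access

-- Safe: cast without a length limit
SELECT CAST(path AS VARCHAR) AS col1 FROM sample_datasets.www_access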

Trailing delimiter in ARRAY_JOIN() result

In the current version, ARRAY_JOIN() appends the delimiter if the last element is NULL; this is not the case in Presto 350. For example, the following query yields "A,B," on the current Presto version, but "A,B" on Presto 350. The trailing comma (,) disappears on Presto 350.

SELECT ARRAY_JOIN(ARRAY ['A', 'B', NULL], ',')

Invalid value error at TRY(CAST(VARCHAR AS TIMESTAMP))

The following query yields NULL on Presto 317: although the timestamp format is valid, 32 is not a valid day of the month.

SELECT TRY(CAST('2022-01-32 00:00:00' AS TIMESTAMP))

You might see an error like this on Presto 350.

Invalid value for DayOfMonth (valid values 1 - 28/31): 32

Use TRY_CAST instead of the combination of TRY and CAST.

SELECT TRY_CAST('2022-01-32 00:00:00' AS TIMESTAMP)

Known Limitations with Presto 350 (Legacy) on TD

The Presto query engine has some known limitations.

Column Name Escaping

When a column name matches the name of a reserved keyword, the column name needs to be quoted. In Hive, as in many SQL-based query languages, the quoting character is the backtick character (`). In Presto, quoting a column name is accomplished with the double-quote character (") instead. See the following comparison:

SELECT `join` FROM mytbl      # Hive
SELECT "join" FROM mytbl      # Presto

Quoting a column name in single quotes causes the query parser to interpret the content as a string literal, which might produce an unexpected result. For example:

SELECT "join" FROM mytbl LIMIT 10

result:
value1
value2
value3
value4
...
value10

SELECT 'join' FROM mytbl

result:
join
join
join
join
...
join

DELETE Statement

The Presto DELETE statement has several known limitations.

No DELETE Without a WHERE Clause

To reduce the occurrence of unintentional DELETE commands, DELETE statements require you to include a WHERE clause.

Error sample:

Does not support DELETE statement without WHERE condition

DELETE against Data Tank

DELETE statements cannot be issued against data stored in Data Tank. Connect directly to the PostgreSQL Data Tank to issue DELETE statements on data stored there.

Limit on the Number of Input Partitions for DELETE Queries

Treasure Data limits the number of input partitions to 500K for DELETE queries. DELETE queries exceeding this limit may fail with an error like the following:

Cannot delete too many input files more than 500000. Try reducing input files by using 'TD_TIME_RANGE 

Multiple DELETE jobs on the same table are NOT allowed when there is overlap on a partition file

Multiple DELETE jobs on the same table are allowed only when they do not overlap on a partition file. For example, if the table is partitioned by the hour, different time ranges can be deleted concurrently. However, some data files can span a large time range, so concurrent execution is not always guaranteed. More precisely, multiple jobs cannot remove the same partition file at the same time.

Also, if the internal partitioning optimization process (which deletes small files and creates large ones) merges the data that a running DELETE job involves, an error occurs: Metadata transaction conflicted with others.

In this case, retry the DELETE job manually.

Undoing Delete Statements

If you want to roll back an executed DELETE statement and recover deleted data, you must contact Treasure Data technical support and provide the job ID of the job that contained the executed Presto DELETE statement. In many cases, a DELETE statement can be undone. However, certain situations prevent the rollback of Presto DELETE statements.

Situations that prevent rollback include:

  • When new partitions have been deleted by another DELETE statement.
  • When original partitions have been discarded, which can happen after a retention period.
  • When a table is dropped and a new table with the same name is created.
  • When new partitions have been further modified or replaced, which can happen as a result of internal storage maintenance processes.

DELETE statement Timeout Error

Depending on the number of partitions that contain rows to be deleted, a DELETE operation can take several hours. If too many partitions are affected, the operation can time out and the job can fail. If the job fails and the output contains the following message:

org.plazmadb.metadb.MetadataSQLException: 
ERROR: canceling statement due to statement timeout

then rewrite the DELETE statement to affect fewer rows or fewer partitions. The most effective way to do this is to include a WHERE clause that limits the time range of the DELETE with TD_TIME_RANGE or TD_INTERVAL. For example, if you have a year's worth of game history data, instead of

DELETE FROM game_history WHERE player_id=1000;

try

DELETE FROM game_history
  WHERE player_id=1000
  AND TD_TIME_RANGE(time, '2018-01-01', '2018-02-01', 'PDT')

and then delete more time ranges until all rows are deleted. TD support can provide a workflow that simplifies automating this process.

Presto DELETE Statement Resource Consumption

DELETE queries issue jobs and consume resources. Because of the storage and indexing methods used in Treasure Data, a DELETE query can be resource intensive.

To prevent a DELETE query from performing a full table scan, you should use:

  • a time expression, like TD_TIME_RANGE or TD_INTERVAL, if the table uses default time-based partitioning
  • an equality predicate on all partition keys, if the table uses user-defined partitioning

For example, if your table is partitioned by time and you are trying to delete a set of users that were created in June 2018, include the time restriction even if it's not a critical parameter for your DELETE statement:

DELETE FROM logtable WHERE userid IN (1234, 1235, 1236) AND TD_TIME_RANGE(time, '2017-06-01', '2017-07-01')

Including the time restriction significantly improves the speed of your query, and reduces the resources this query consumes.

6-Hour Limit for Presto Queries

Presto queries that run for more than 6 hours are automatically canceled. Typically, a Presto query that would run for more than a day or so is a failing query.

For long-running queries, you can rewrite the Presto query as a Hive query.

JOIN Order

Presto does not support cost-based JOIN optimizations, meaning JOINs are not automatically reordered based on table size. Make sure that the smaller table is on the right-hand side of the JOIN and that it fits in memory. Otherwise, an out-of-memory exception will cause the query to fail.

SELECT
  ...
FROM
  large_table   -- the larger table comes first
JOIN
  small_table   -- the smaller table goes on the right and must fit in memory

INSERT OVERWRITE Statements are NOT Supported

Presto does not support INSERT OVERWRITE statements. Make sure that you delete the table's existing data before using INSERT INTO.
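
A minimal sketch of the equivalent pattern, run as two separate queries (the table names and time range are illustrative):

-- 1) Delete the existing data for the target range
DELETE FROM mytbl WHERE TD_TIME_RANGE(time, '2018-01-01', '2018-02-01')

-- 2) Then append the new data
INSERT INTO mytbl SELECT * FROM staging_tbl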

Error Message - Killed by the system because this query generates output for more than 100GB

This error is caused by a single query trying to generate more than 100GB of output.

As a workaround, use the result_output_redirect option or CREATE TABLE AS.
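
For example, result_output_redirect can be enabled with a magic comment at the top of the query (a sketch; the query itself is illustrative):

-- set session result_output_redirect = 'true'
SELECT * FROM sample_datasets.www_access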

Error Message - Output size too large. Exceeds maximum file size: 25GB

This message might appear if result_output_redirect is enabled by default or set in a magic comment.

Error Message - Accessing too many tables - Maximum 300 tables allowed

This error is caused by a single query trying to scan more than 300 tables.

As a workaround, tune the SQL query by reducing the number of scanned tables, or divide the single query into multiple jobs.

Error Message - Creating partitions exceeds the limit of 500000

This error is caused by a single query trying to commit more than 500,000 (500K) partitions.

As a workaround, use User-Defined Partitioning (UDP) or tune the UDP parameters:

  • User-Defined Partitioning
  • Tune parameters for UDP. For example:
    • Reduce bucket_count to 256 (default: 512).
    • Update max_time_range to adopt a broader time range.
    • Increase max_file_size to 512MB (default: 256MB).
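
As an illustration, UDP parameters such as bucket_count are set as table properties when the table is created (a sketch assuming the standard TD UDP syntax; the table and column names are illustrative):

CREATE TABLE mytbl
WITH (
  bucketed_on = array['user_id'],
  bucket_count = 256
)
AS SELECT * FROM src_tbl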

Error Message - Remote page is too large

This error is caused by hitting the limit of the intermediate buffers in Presto. The cause could be a row or column whose size is too large. The limit is set to 64MB in TD Presto.

As a workaround, reduce the size of a single row:

  • Reduce the number of columns to be scanned.
  • Reduce the size of a large column using SMART_DIGEST().
  • Use the Hive engine instead of Presto.
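
For example, a large string column can be shortened with the TD SMART_DIGEST() UDF before further processing (a sketch; the column choice is illustrative):

SELECT SMART_DIGEST(path) FROM sample_datasets.www_access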