Migrating Hive Queries into Presto
Presto follows the ANSI SQL syntax and semantics, while Hive uses a different SQL syntax dialect. Differences include:
- Quotation — In Presto, use double quotations for column and table names (e.g., "column1"), single quotations for strings (for example, ‘hello presto’).
- Implicit type conversion — In Presto, SELECT CAST(5 AS DOUBLE) / 2 gives 2.5, but SELECT 5 / 2 gives 2. In Hive, both return 2.5.
To migrate Hive queries into Presto:
- Review the SQL syntax requirements for Presto at Migrating from Hive .
- Determine which of your queries might need to have syntax updates.
- Make and test the changes to your queries. Or, create copies of the queries you want to migrate to Presto and make changes in the new queries.
- On the Query page of the TD Console for your query, make sure that the Type is set to Presto.
Supported Time Formats in TD TIME FORMAT UDF
This is a summary of the time formats supported by the
TD_TIME_FORMAT_UDF
udf.
Formats
Syntax | Date or Time Component | Presentation | Examples |
---|---|---|---|
G | Era designator | Text | AD |
yyyy | Year | Year | 1996 |
yy | Year | Year (2 digits) | 96 |
MMMM | Month in year | Month long name | July |
MMM | Month in year | Month short name | Jul |
MM, M | Month in year | Month number | 07 |
ww, w | Week in year | Number | 06 |
DDD, DD, D | Day in year | Number | 189 |
dd, d | Day in month | Number | 10 |
EEEE | Day in week | Text | Tuesday |
E, EEE | Day in week | Text (short form) | Tue |
a | Am/pm marker | Text | PM |
HH, H | Hour in day (0-23) | Number | 00 |
kk, k | Hour in day (1-24) | Number | 24 |
KK, K | Hour in AM/PM (0-11) | Number | 00 |
hh, h | Hour in AM/PM (1-12) | Number | 12 |
mm, m | Minute in hour | Number | 30 |
ss, s | Second in minute | Number | 55 |
SSS, SS, S | Millisecond | Number | 978 |
wyear | Week of the year | Number | TD_TIME_FORMAT(1582859260, 'wyear' TD_TIME_FORMAT(1582859260, 'wyear', 'UTC')) |
zzzz | Time zone | Zone long name | Pacific Standard Time, or GMT+01:00 |
z | Time zone | Zone short name | PST, or GMT+01:00 |
Z | Time zone | Zone offset | -0800 |
u | Day number of week (1-7) | Number | 1 (for Monday) |
Example
This example formats a UNIX timestamp into a date formatted string. You can use the sample_datasets database for example to try it out:
SQL
SELECT
time,
TD_TIME_FORMAT(time,
'yyyy-MM-dd HH:mm:ss z')
FROM
www_access LIMIT 1
;
Result
2012-01-01 00: 00: 00 UTC
SQL DDL Command Differences in Hive and Presto
Data Definition Language (DDL) is a concept in SQL that can be used to create and modify the structure of database objects or schema objects. You can use DDL statements to create, manage, and delete tables, indexes, users, and other schema objects.
As it pertains to data-processing, Presto is optimized for quickly processing smaller volumes of information while Hive is better for querying more expansive databases requiring large amounts of memory. Both Hive and Presto use DDL commands but in slightly different ways.
The following table illustrates DDL commands required for Hive and Presto.
SQL DDL Commands and Their Equivalents
SQL DDL Commands | Hive SQL DDL | Presto SQL DDL | Query Definition |
---|---|---|---|
ALTER | ALTER | ALTER | Modifies the structure of an existing table. |
CREATE | CREATE | CREATE | Creates a database or objects including tables, views, stored procedures, etc. |
DROP | DROP | DROP | Deletes an existing database or an object within a database. |
TRUNCATE | TRUNCATE | DELETE | Removes all the records from a table. |