What is the best-performing alternative to the DATEFROMPARTS SQL function in AWS Athena (Presto)?
The use case is:
I have the date parts (i.e. the day, month, and year) and need the date from these.
You would typically use date_parse() with the proper format specifiers. If your date is in ISO 8601 format, you can directly use from_iso8601_date() (or from_iso8601_timestamp()).
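As a rough sketch of the ISO route for the use case in the question: the three parts can be zero-padded, joined into a yyyy-MM-dd string, and passed to from_iso8601_date(). The column names y, m and d and the inline VALUES row are made up for illustration, and the sketch assumes a four-digit year.

```sql
-- Hypothetical columns y, m, d stand in for the year, month and day parts.
SELECT from_iso8601_date(
         cast(y AS varchar) || '-' ||
         lpad(cast(m AS varchar), 2, '0') || '-' ||   -- pad month to two digits
         lpad(cast(d AS varchar), 2, '0')             -- pad day to two digits
       ) AS built_date
FROM (VALUES (2018, 9, 12)) AS t (y, m, d);
```

Whether this is faster than date_parse() on a given data set is something to measure rather than assume.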
On the other hand, if you need to extract date parts, you can use extract(), like:
extract(hour from current_timestamp)
Note that Presto also offers a full range of short function names that correspond to the possible extraction parts: year(), quarter(), month(), ...
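To make the equivalence concrete, here is a small sketch that puts the extract() form and the short-name form side by side. The timestamp literal and column aliases are illustrative only.

```sql
-- Each pair of columns should return the same value.
SELECT extract(year FROM ts)  AS year_via_extract,
       year(ts)               AS year_via_function,
       extract(month FROM ts) AS month_via_extract,
       month(ts)              AS month_via_function,
       extract(hour FROM ts)  AS hour_via_extract,
       hour(ts)               AS hour_via_function
FROM (VALUES (TIMESTAMP '2018-09-12 15:49:07')) AS t (ts);
```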
Related
Does SQL have any standard date-time functions that work across databases? Such as:
extract year, month, day, hour, minute or second
format to specific formatter
parse from string
I believe the answers are yes, no, and no.
The extraction functions are extract(<whatever> from date). I don't think there is a standard for parsing and formatting. However, to_char() and to_date() are used across multiple databases.
I have data in S3 from which I created an Athena table. Some of the date entries in S3 are in JSON format, and Athena does not accept them as either date or timestamp when I run queries.
I am using AWS Athena, which uses Presto as its query engine.
Example json :
{"creationdate":"2018-09-12T15:49:07.269Z", "otherfield":"value1"}
{"creationdate":"2018-09-12T15:49:07Z", "otherfield":"value2"}
AWS Glue detects both fields as string. When I change them to timestamp and date respectively, queries on the timestamp field fail with a ValidationError.
I found the Presto date_parse function, but it does not work either, since some values have milliseconds while others do not.
date_parse(creationdate, '%Y-%m-%dT%H:%i:%s.%fZ')
date_parse(creationdate, '%Y-%m-%dT%H:%i:%sZ')
Both fail because the data mixes the two kinds of entries: some with the millisecond part (%f) and some without.
Is there a way to provide a parser or a regex so that I can convert these strings into dates during SQL query execution?
Instead of providing the timestamp format, you can use the from_iso8601_timestamp function.
This way, all timestamps get parsed.
select from_iso8601_timestamp(creationdate) from table1;
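As a quick check against the two sample values from the question, the sketch below runs from_iso8601_timestamp() on both and also casts the result to a date. The inline VALUES rows stand in for the real table.

```sql
-- Both rows should parse, with or without the millisecond part.
SELECT creationdate,
       from_iso8601_timestamp(creationdate)               AS parsed_ts,
       cast(from_iso8601_timestamp(creationdate) AS date) AS parsed_date
FROM (VALUES
        ('2018-09-12T15:49:07.269Z'),
        ('2018-09-12T15:49:07Z')
     ) AS t (creationdate);
```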
Do you just need date?
If so you could use date_parse(string, format).
date_parse(creationdate, '%Y-%m-%d')
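One caveat, as far as I understand date_parse(): the format is expected to cover the whole input string, so applying '%Y-%m-%d' to a full timestamp string may be rejected. A possible workaround, sketched here under that assumption, is to keep only the first ten characters before parsing. The VALUES rows again stand in for the real table.

```sql
-- substr(..., 1, 10) keeps the yyyy-MM-dd prefix; date() drops the time part.
SELECT date(date_parse(substr(creationdate, 1, 10), '%Y-%m-%d')) AS creation_day
FROM (VALUES
        ('2018-09-12T15:49:07.269Z'),
        ('2018-09-12T15:49:07Z')
     ) AS t (creationdate);
```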
Use this:
SELECT requestdatetime, remoteip, requester, key
FROM MYDB.TABLE
WHERE parse_datetime(requestdatetime,'dd/MMM/yyyy:HH:mm:ss Z')
BETWEEN parse_datetime('2020-10-14:00:00:00','yyyy-MM-dd:HH:mm:ss')
AND parse_datetime('2020-10-14:23:59:59','yyyy-MM-dd:HH:mm:ss');
I am using BigQuery to output a formatted timestamp value with the STRFTIME_UTC_USEC function. The documentation points me to the C++ strftime reference,
which specifies modifiers such as %b (for month) that are locale-specific.
Is there a way to use locale-specific month names with STRFTIME?
The only other alternative I see is to write my own UDF function and do a lookup using Map.
Even though the STRFTIME_UTC_USEC function is based on C++'s strftime, there is no provision to supply a locale.
We usually recommend using Standard SQL, which has the FORMAT_TIMESTAMP function, but it does not allow changing the locale either.
You probably don't have to write a complex UDF; a simple REPLACE or REGEXP_REPLACE can be enough. Or you can have an array with localized month names - ["Январь", "Февраль", "Март", "Апрель", ...] and get an element out of it based on the month, EXTRACT(MONTH FROM date)
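The array lookup could look roughly like this in BigQuery Standard SQL. The sample date and the alias month_name are illustrative, and the month names beyond the four given above are filled in here only to make the sketch complete.

```sql
-- OFFSET is zero-based, so subtract 1 from the month number (1..12).
SELECT
  d,
  ['Январь', 'Февраль', 'Март', 'Апрель', 'Май', 'Июнь',
   'Июль', 'Август', 'Сентябрь', 'Октябрь', 'Ноябрь', 'Декабрь']
    [OFFSET(EXTRACT(MONTH FROM d) - 1)] AS month_name
FROM UNNEST([DATE '2018-09-12']) AS d;
```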
What is the exact difference between date_part() and extract functions in netezza?
In Netezza date_part and extract are equivalent functions. They provide different syntax for improved SQL compatibility, but are otherwise the same.
You can see this in the documentation here.
Is there any difference between date_part('week', ...) and extract(week from ...), including from a performance point of view? I read that extract is standard and date_part is not. Backwards compatibility may justify the existence of date_part.
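For reference, the two spellings being compared look like this side by side. This is only a sketch of the syntax, using current_date as an arbitrary input; it says nothing about relative performance.

```sql
-- Both columns are expected to return the same week number.
SELECT date_part('week', current_date)  AS week_via_date_part,
       extract(week FROM current_date)  AS week_via_extract;
```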
I need a way to determine the number of days between two dates in SQL.
Answer must be in ANSI SQL.
ANSI SQL-92 defines DATE - DATE as returning an INTERVAL type. You are supposed to be able to extract scalars from INTERVALS using the same method as extracting them from DATEs using – appropriately enough – the EXTRACT function (4.5.3).
<extract expression> operates on a datetime or interval and returns an exact numeric value representing the value of one component of the datetime or interval.
However, this is very poorly implemented in most databases. You're probably stuck using something database-specific. DATEDIFF is pretty well implemented across different platforms.
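Note that DATEDIFF is not spelled the same way everywhere. As an illustration, here are the SQL Server and MySQL forms for the same pair of dates; the argument order and the unit argument differ between the two.

```sql
-- SQL Server: DATEDIFF(unit, start, end)
SELECT DATEDIFF(day, '2009-01-01', '2009-05-05');   -- 124

-- MySQL: DATEDIFF(end, start), always in days
SELECT DATEDIFF('2009-05-05', '2009-01-01');        -- 124
```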
Here's the "real" way of doing it.
SELECT EXTRACT(DAY FROM DATE '2009-01-01' - DATE '2009-05-05') FROM DUAL;
Good luck!
I can't remember using an RDBMS that didn't support DATE1-DATE2 and SQL 92 seems to agree.
I believe the SQL-92 standard supports subtracting two dates with the '-' operator.
SQL 92 supports the following syntax:
t.date_1 - t.date_2
The EXTRACT function is also ANSI, but it isn't supported on SQL Server. Example:
ABS(EXTRACT(DAY FROM t.date_1) - EXTRACT(DAY FROM t.date_2))
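Wrapping the calculation in an absolute value function ensures the result is positive even if the earlier date comes first. Note that EXTRACT(DAY FROM ...) returns only the day-of-month component, so this expression gives the number of days between two dates only when both fall in the same month and year.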
EXTRACT is supported on:
Oracle 9i+
MySQL
Postgres