What would be the Spark SQL equivalent of the DATEADD (seconds) function in SQL?

I want to add 10 seconds to a timestamp column in a dataframe using spark-sql. The date_add() function seems to be able to add days, but not seconds.

You can use selectExpr together with an INTERVAL expression to add seconds to the timestamp column.
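A minimal PySpark sketch of that approach. The column name ts, the alias ts_plus_10s and the helper add_seconds_expr are hypothetical names chosen for illustration; the only part taken from the answer is combining selectExpr with an INTERVAL literal. The demo function needs pyspark installed and is not called automatically.

```python
def add_seconds_expr(column: str, seconds: int, alias: str) -> str:
    """Build a Spark SQL expression that shifts a timestamp column by whole seconds."""
    # int() guards against accidentally interpolating a non-numeric value.
    return f"{column} + INTERVAL {int(seconds)} seconds AS {alias}"


def demo() -> None:
    """Run the expression against a one-row DataFrame (requires pyspark)."""
    from pyspark.sql import SparkSession  # imported lazily so the helper works without Spark

    spark = SparkSession.builder.master("local[1]").getOrCreate()
    # Hypothetical sample data: a single timestamp column named ts.
    df = spark.sql("SELECT CAST('2020-01-01 00:00:00' AS TIMESTAMP) AS ts")
    df.selectExpr("ts", add_seconds_expr("ts", 10, "ts_plus_10s")).show(truncate=False)
    spark.stop()


if __name__ == "__main__":
    demo()
```

The same expression string can be used inside spark.sql(...) against a registered temporary view if you prefer plain SQL over selectExpr.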

Related

Equivalent of Hive's date_format function in Impala?

Is there an equivalent of Hive's date_format function in Impala?
I need to change a date column to the first day of the month (e.g., '2020-09-29' to '2020-09-01'), so I had originally used: date_format(LOG_DATE,'yyyy-MM-01') as FIRST_DAY_MONTH
Thanks!
You can use to_timestamp().
For example, to_timestamp('20200901','yyyyMMdd') returns a timestamp.
A generic form would be: to_timestamp(concat(substr(data_col,1,7),'-01'),'yyyy-MM-dd')
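To sanity-check the result you expect from the substr/concat expression outside Impala, the same "keep yyyy-MM, pin the day to 01" logic can be sketched in Python. The function name first_day_of_month is hypothetical, and the sketch assumes the input is a 'yyyy-MM-dd' string as in the question.

```python
from datetime import date


def first_day_of_month(iso_date: str) -> str:
    """Return the first day of the month for a 'yyyy-MM-dd' date string."""
    parsed = date.fromisoformat(iso_date)  # rejects malformed input, unlike a raw substring
    return parsed.replace(day=1).isoformat()


print(first_day_of_month("2020-09-29"))  # 2020-09-01
```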

how to find data based on exact timestamp

I'm trying to fetch a row in BigQuery by the timestamp '2018-12-08 00:00:42.808 America/Los_Angeles'. This works with a BETWEEN clause, for example timestamp BETWEEN '....00:00:42.808...' AND '....00:00:42.809...'.
However, I can't find anything when I use timestamp = '....00:00:42.808...' instead. I'm not sure why this is, and I can't find much on Google about this particular case.
Timestamps in Google BigQuery have microsecond precision, so your query probably does not EXACTLY hit the timestamps in your table. You can use the TIMESTAMP_TRUNC function if you want to match at millisecond, second, or any other truncated level. With this function you can write a WHERE clause like this:
where TIMESTAMP_TRUNC(timestamp, millisecond, 'America/Los_Angeles')='2018-12-08 00:00:42.808 America/Los_Angeles'
This gives you the millisecond-level match you expect. You can find more information on TIMESTAMP_TRUNC and other BigQuery functions at https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators.
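Why the equality fails while the truncated comparison succeeds can be illustrated outside BigQuery. This is a Python sketch, not BigQuery code: the stored value with extra microseconds (808123) is a made-up example, and trunc_to_millisecond is a hypothetical helper that mimics truncating to MILLISECOND. The Los Angeles wall-clock time from the question is written as its UTC equivalent (08:00:42.808).

```python
from datetime import datetime, timezone


def trunc_to_millisecond(ts: datetime) -> datetime:
    """Drop the digits below millisecond precision, keeping everything else."""
    return ts.replace(microsecond=(ts.microsecond // 1000) * 1000)


# Hypothetical stored value: the row carries microseconds the literal does not.
stored = datetime(2018, 12, 8, 8, 0, 42, 808123, tzinfo=timezone.utc)
literal = datetime(2018, 12, 8, 8, 0, 42, 808000, tzinfo=timezone.utc)

print(stored == literal)                        # False
print(trunc_to_millisecond(stored) == literal)  # True
```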

Athena date_parse for date with optional millisecond field

I have data in S3 from which I created an Athena table. Some date entries in S3 are in JSON format, and Athena does not accept them as either date or timestamp when I run queries.
I'm using AWS Athena, which uses PrestoDB as its query engine.
Example JSON:
{"creationdate":"2018-09-12T15:49:07.269Z", "otherfield":"value1"}
{"creationdate":"2018-09-12T15:49:07Z", "otherfield":"value2"}
AWS Glue reads both fields as string, and when I change them to timestamp and date respectively, queries on the timestamp field fail with a ValidationError.
I found the PrestoDB date_parse function, but it isn't working either, since some values have milliseconds while others do not.
date_parse(creationdate, '%Y-%m-%dT%H:%i:%s.%fZ')
date_parse(creationdate, '%Y-%m-%dT%H:%i:%sZ')
Both fail because the data mixes the two forms: some entries have milliseconds (%f) and some do not.
Is there a way to provide a parser or regex so that I can convert these strings into dates during SQL query execution?
Instead of providing the timestamp format, you can use the from_iso8601_timestamp function.
This way, all timestamps get parsed.
select from_iso8601_timestamp(creationdate) from table1;
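To see what a tolerant parser has to do with the two shapes from the question, here is a small Python sketch that tries the millisecond format first and falls back to the one without. It is an illustration of the fallback idea only, not how from_iso8601_timestamp is implemented; parse_creationdate and FORMATS are hypothetical names, and the sketch assumes every value ends in a literal Z (UTC).

```python
from datetime import datetime, timezone

# The two shapes seen in the sample JSON: with and without fractional seconds.
FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_creationdate(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp whose fractional seconds are optional."""
    for fmt in FORMATS:
        try:
            # The trailing Z means UTC, so attach that zone explicitly.
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # try the next known shape
    raise ValueError(f"unrecognized timestamp: {value!r}")


for raw in ("2018-09-12T15:49:07.269Z", "2018-09-12T15:49:07Z"):
    print(parse_creationdate(raw).isoformat())
```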
Do you just need the date?
If so, you could use date_parse(string, format).
date_parse(creationdate, '%Y-%m-%d')
Use this:
SELECT requestdatetime, remoteip, requester, key
FROM MYDB.TABLE
WHERE parse_datetime(requestdatetime,'dd/MMM/yyyy:HH:mm:ss Z')
BETWEEN parse_datetime('2020-10-14:00:00:00','yyyy-MM-dd:HH:mm:ss')
AND parse_datetime('2020-10-14:23:59:59','yyyy-MM-dd:HH:mm:ss');

Event dates in social tables

I'm trying to figure out how to pass in the date to create an event. It says the value is in MS, and I'm not sure how to convert a date/time to MS without having a date to start from.
When we say milliseconds we mean a Unix timestamp in ms. See https://en.wikipedia.org/wiki/Unix_time for a definition. In JavaScript, that is the value returned by Date.prototype.getTime().
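As a sketch of the conversion, the following Python computes milliseconds since the Unix epoch (1970-01-01T00:00:00Z) for a timezone-aware datetime. The function name to_unix_ms and the sample date are hypothetical; integer division by a one-millisecond timedelta is used so no floating-point rounding is involved.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_unix_ms(dt: datetime) -> int:
    """Milliseconds elapsed since the Unix epoch for a timezone-aware datetime."""
    if dt.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # timedelta // timedelta yields an exact integer count of milliseconds.
    return (dt - EPOCH) // timedelta(milliseconds=1)


event_start = datetime(2020, 1, 1, 0, 0, 0, tzinfo=timezone.utc)  # made-up event time
print(to_unix_ms(event_start))  # 1577836800000
```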

Number of days between two dates - ANSI SQL

I need a way to determine the number of days between two dates in SQL.
Answer must be in ANSI SQL.
ANSI SQL-92 defines DATE - DATE as returning an INTERVAL type. You are supposed to be able to extract scalars from INTERVALs using the same method as extracting them from DATEs, using (appropriately enough) the EXTRACT function (4.5.3).
"<extract expression> operates on a datetime or interval and returns an exact numeric value representing the value of one component of the datetime or interval."
However, this is very poorly implemented in most databases. You're probably stuck using something database-specific. DATEDIFF is pretty well implemented across different platforms.
Here's the "real" way of doing it.
SELECT EXTRACT(DAY FROM DATE '2009-01-01' - DATE '2009-05-05') FROM DUAL;
Good luck!
I can't remember using an RDBMS that didn't support DATE1-DATE2, and SQL-92 seems to agree.
I believe the SQL-92 standard supports subtracting two dates with the '-' operator.
SQL 92 supports the following syntax:
t.date_1 - t.date_2
The EXTRACT function is also ANSI, but it isn't supported on SQL Server. Example:
ABS(EXTRACT(DAY FROM t.date_1) - EXTRACT(DAY FROM t.date_2))
Wrapping the calculation in an absolute value function ensures the value will come out as positive, even if a smaller date is the first date.
EXTRACT is supported on:
Oracle 9i+
MySQL
Postgres
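A quick way to check whatever SQL you end up with is to compute the expected answer outside the database. The Python sketch below uses the two dates from the SELECT example above; days_between is a hypothetical helper. It also shows that subtracting the extracted DAY components of two dates compares only the day-of-month, which differs from the number of elapsed days once the dates fall in different months.

```python
from datetime import date


def days_between(d1: date, d2: date) -> int:
    """Absolute number of calendar days between two dates."""
    return abs((d1 - d2).days)


start = date(2009, 1, 1)
end = date(2009, 5, 5)

print(days_between(start, end))  # 124
# Day-of-month difference only, analogous to subtracting two EXTRACT(DAY ...) values:
print(abs(start.day - end.day))  # 4
```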