Convert Teradata to BigQuery (GCP): DAY() TO SECOND - sql

I am trying to convert the DAY() TO SECOND function in Teradata to GCP SQL.
Can someone help me convert this?
AVERAGE(((run_end_dttm - run_start_dttm )DAY(4) TO SECOND )) AS elapsed_time,

As was mentioned, BigQuery did not support the INTERVAL data type at the time of writing, which is what Teradata's DAY(4) TO SECOND actually returns.
If your aim is to get the difference between two timestamps in a format compatible with Teradata's output, i.e. day hour:minute:second.millisecond, you might consider writing your own BigQuery UDF to perform this transformation.
Below is a BigQuery function prototype for this kind of conversion, leveraging some of the built-in Timestamp and Time functions:
CREATE TEMP FUNCTION time_conv(t1 TIMESTAMP, t2 TIMESTAMP) AS ((
  SELECT
    FORMAT('%d %d:%d:%d.%d',
      ABS(day),
      EXTRACT(HOUR FROM TIME(ts)),
      EXTRACT(MINUTE FROM TIME(ts)),
      EXTRACT(SECOND FROM TIME(ts)),
      EXTRACT(MILLISECOND FROM TIME(ts))) AS output
  FROM UNNEST([STRUCT(
    TIMESTAMP_DIFF(t1, t2, DAY) AS day,
    -- Take the difference in milliseconds so the fractional part is not lost
    TIMESTAMP_MILLIS(TIMESTAMP_DIFF(t1, t2, MILLISECOND)) AS ts)])
));

WITH `example` AS (
  SELECT
    TIMESTAMP("2021-10-19 21:45:21") AS t1,
    TIMESTAMP("2021-10-15 18:17:56") AS t2
)
SELECT time_conv(t1, t2)
FROM example;
You can also tweak the FORMAT() call in BigQuery to get the desired output.
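For readers who want to cross-check the UDF's output outside BigQuery, the same decomposition can be sketched in Python. This is a minimal sketch and not part of the original answer: the helper name format_interval is hypothetical, and unlike the FORMAT() call above it zero-pads the fields and keeps the sign of a negative difference. Because it works on a plain duration, it can also format an average of several run times, which is what the AVERAGE(...) in the question is after.

```python
from datetime import datetime, timedelta


def format_interval(delta: timedelta) -> str:
    """Format a duration as 'D HH:MM:SS.mmm' (hypothetical helper)."""
    sign = "-" if delta < timedelta(0) else ""
    # Work in whole milliseconds so the fractional part is preserved.
    total_ms = abs(delta) // timedelta(milliseconds=1)
    days, rem = divmod(total_ms, 86_400_000)
    hours, rem = divmod(rem, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{sign}{days} {hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


# Same sample timestamps as the BigQuery example above.
run_end = datetime(2021, 10, 19, 21, 45, 21)
run_start = datetime(2021, 10, 15, 18, 17, 56)
print(format_interval(run_end - run_start))  # 4 03:27:25.000

# Averaging durations first, then formatting (made-up sample durations).
durations = [timedelta(hours=1), timedelta(hours=2)]
average = sum(durations, timedelta(0)) / len(durations)
print(format_interval(average))  # 0 01:30:00.000
```

Averaging the raw durations and formatting only at the end avoids having to average already-formatted strings.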

Related

converting Athena timestamp to date

I am running a query against Athena, and it breaks. Specifically, I get an error for the below fragment:
avg(
DATE_DIFF(
'minute',
CAST(from_iso8601_timestamp("sessions_staging".session_start_at) AS TIMESTAMP),
CASE
WHEN CAST("sessions_staging__end_raw" AS TIMESTAMP) + INTERVAL '1' MINUTE > CAST("sessions_staging".next_session_start_at AS TIMESTAMP) THEN CAST("sessions_staging".next_session_start_at AS TIMESTAMP)
ELSE CAST("sessions_staging__end_raw" AS TIMESTAMP) + INTERVAL '30' MINUTE
END
)
) "sessions_staging__average_duration_minutes"
Athena complains with Value cannot be cast to timestamp: 2022-08-03T00:05:54.300Z.
I tried a bunch of tricks, like casting my date to a string and then casting it again to a time or timestamp type. A similar problem with the same cause is partly covered in "converting to timestamp with time zone failed on Athena".
The value itself seems fine: I am able to execute SELECT CAST(From_iso8601_timestamp('2022-08-03T00:05:54.300Z') AS timestamp). If I drop the CAST() and just pass "sessions_staging".session_start_at, the error mentions (varchar(6), varchar, timestamp) for function date_diff, so I know that session_start_at is treated as VARCHAR.
However, for the casting described as a solution in the linked discussion to work, it seems a SELECT needs to be used. Nothing I tried, including string manipulation, worked.
How can I rewrite my query/casts so that Athena can process my request?
I ended up with:
CAST(DATE_PARSE(my_varchar_date, '%Y-%m-%dT%H:%i:%s.%f%z') AS TIMESTAMP)
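To make it easier to see what that format string has to match, here is a rough Python analogue of the same parse. It is only a sketch under assumptions: the helper names parse_iso and minutes_between are hypothetical, the directives are Python's strptime directives rather than Athena's DATE_PARSE ones, and it relies on Python 3.7+ accepting a trailing Z for %z.

```python
from datetime import datetime, timezone

# Python strptime directives (not Athena's): date, 'T', time,
# fractional seconds, then the UTC offset ('Z' is accepted by %z).
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_iso(value: str) -> datetime:
    """Parse a string such as '2022-08-03T00:05:54.300Z' (hypothetical helper)."""
    return datetime.strptime(value, ISO_FORMAT)


def minutes_between(start: str, end: str) -> int:
    """Whole minutes from start to end, for end >= start."""
    return int((parse_iso(end) - parse_iso(start)).total_seconds() // 60)


print(parse_iso("2022-08-03T00:05:54.300Z"))
print(minutes_between("2022-08-03T00:05:54.300Z", "2022-08-03T00:35:54.300Z"))
```

The point is that the varchar column holds an ISO 8601 string with milliseconds and a zone suffix, so the parse pattern must account for both before any date arithmetic is done.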

Converting date format number to date and taking difference in SQL

I have a data set as below.
The values are dates in "YYYYMMDD" format. I want to convert the columns to dates and take the difference between them.
I used the code below:
SELECT to_date(statement_date_key::text, 'yyyymmdd') AS statement_date,
to_date(paid_date_key::text, 'yyyymmdd') AS paid_date,
statement_date - paid_date AS Diff_in_days
FROM Table
WHERE Diff_in_days >= 90
;
The idea is to convert both columns to dates, take the difference between them, and filter for cases where the difference is 90 days or more.
Later I was informed that the server runs HiveSQL, which does not support ":" or date time, and that temp tables cannot be created.
I'm currently stuck on how to proceed given these constraints.
Any help would be much appreciated.
Sample data for reference is provided in the linked dbfiddle.
Hive is a little convoluted in its use of dates. You can use unix_timestamp() and work from there:
SELECT datediff(to_date(from_unixtime(unix_timestamp(cast(statement_date_key as varchar(10)), 'yyyyMMdd'))),
to_date(from_unixtime(unix_timestamp(cast(paid_date_key as varchar(10)), 'yyyyMMdd')))
) as diff_in_days
FROM Table;
Note that you need to use a subquery if you want to use diff_in_days in a where clause.
Also, if you have date keys, then presumably you also have a calendar table, which should make this much simpler.
You can use the query below (SQL Server syntax):
select * from (
select convert(date, statement_date_key) AS statement_date,
convert(date, paid_date_key) AS paid_date,
datediff(D, convert(date, statement_date_key), convert(date, paid_date_key)) as Diff_in_days
from Table
) qry
where Diff_in_days >= 90
A simple way: unix_timestamp(string, pattern) converts a string in the given format to the number of seconds since the Unix epoch. Compute the difference in seconds, then divide by 86400 (60*60*24) to get the difference in days.
select * from
(
select t.*,
(unix_timestamp(string(paid_date_key), 'yyyyMMdd') -
unix_timestamp(string(statement_date_key), 'yyyyMMdd'))/86400 as Diff_in_days
from Table t
) t
where Diff_in_days>=90
You may want to add abs() if the difference can be negative.
One more method using regexp_replace:
select * from
(
select t.*,
datediff(date(regexp_replace(string(paid_date_key), '(\\d{4})(\\d{2})(\\d{2})','$1-$2-$3')),
date(regexp_replace(string(statement_date_key), '(\\d{4})(\\d{2})(\\d{2})','$1-$2-$3'))) as Diff_in_days
from Table t
) t
where Diff_in_days>=90
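The answers above all reduce to the same two steps: parse each YYYYMMDD key into a date, then subtract and filter. As a way to sanity-check expected results outside Hive, here is a minimal Python sketch of that logic. The helper names and the sample rows are made up for illustration, and the subtraction order (statement minus paid) follows the question's original query.

```python
from datetime import date, datetime


def key_to_date(key: int) -> date:
    """Turn an integer key such as 20210401 into a date (hypothetical helper)."""
    return datetime.strptime(str(key), "%Y%m%d").date()


def diff_in_days(statement_date_key: int, paid_date_key: int) -> int:
    """Statement date minus paid date, in whole days."""
    return (key_to_date(statement_date_key) - key_to_date(paid_date_key)).days


# Made-up (statement_date_key, paid_date_key) pairs.
rows = [(20210401, 20210101), (20210215, 20210201)]

# Equivalent of the outer "where Diff_in_days >= 90" filter.
late = [row for row in rows if diff_in_days(*row) >= 90]
print(late)  # [(20210401, 20210101)]
```

As in the SQL versions, the difference is computed first and filtered in a separate step, which mirrors why the Hive queries need a subquery.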

BigQuery Standard SQL: pass INTERVAL, or date_part as SQL UDF argument?

I am trying to build a simple TIMESTAMP_AGO SQL UDF. The function is a simple wrapper around CURRENT_TIMESTAMP and TIMESTAMP_SUB.
I want to call it, with signature:
SELECT TIMESTAMP_AGO(24, 'HOUR');
or, even:
SELECT TIMESTAMP_AGO(24 HOUR);
But BigQuery does not seem to like the date_part of INTERVAL as a variable, so it fails. I've tried a separation of arguments:
CREATE TEMP FUNCTION TIMESTAMP_AGO(_interval INT64, _date_part STRING) AS ((
SELECT TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL _interval _date_part)
));
and tried passing an INTERVAL as well:
CREATE TEMP FUNCTION TIMESTAMP_AGO(_interval INTERVAL) AS ((
SELECT TIMESTAMP_SUB(CURRENT_TIMESTAMP(), _interval)
));
Can INTERVALs be passed around like this?
Or, is it possible to pass a dynamic date_part?
Failing these, would it be possible to use an External UDF (JS)?
Below is for BigQuery Standard SQL
TIMESTAMP_SUB supports the following values for date_part:
MICROSECOND
MILLISECOND
SECOND
MINUTE
HOUR
So you simply need to check the _date_part that was passed and use the corresponding "version", as in the example below:
#standardSQL
CREATE TEMP FUNCTION TIMESTAMP_AGO(_interval INT64, _date_part STRING) AS (
CASE _date_part
WHEN 'MICROSECOND' THEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL _interval MICROSECOND)
WHEN 'MILLISECOND' THEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL _interval MILLISECOND)
WHEN 'SECOND' THEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL _interval SECOND)
WHEN 'MINUTE' THEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL _interval MINUTE)
WHEN 'HOUR' THEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL _interval HOUR)
END
);
Now the following call works:
SELECT TIMESTAMP_AGO(24, 'HOUR')
You can obviously add UPPER() to CASE _date_part if you expect case-insensitive input, etc.
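The dispatch idea behind that CASE expression (map a date-part name to a fixed unit, since the unit itself cannot be a variable) can be sketched in Python as follows. This is an illustration only, with assumptions stated up front: the timestamp_ago function and its optional now parameter are hypothetical, input is upper-cased as suggested above, and an unknown part raises an error here, whereas a CASE without an ELSE branch yields NULL.

```python
from datetime import datetime, timedelta, timezone

# Date parts handled by the CASE expression above, mapped to timedelta keywords.
_UNITS = {
    "MICROSECOND": "microseconds",
    "MILLISECOND": "milliseconds",
    "SECOND": "seconds",
    "MINUTE": "minutes",
    "HOUR": "hours",
}


def timestamp_ago(amount: int, date_part: str, now: datetime = None) -> datetime:
    """Return `now` minus `amount` units of `date_part` (hypothetical helper)."""
    unit = _UNITS.get(date_part.upper())  # case-insensitive lookup
    if unit is None:
        raise ValueError(f"unsupported date part: {date_part!r}")
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(**{unit: amount})


fixed_now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(timestamp_ago(24, "HOUR", now=fixed_now))  # 2024-01-01 12:00:00+00:00
```

Passing now explicitly is only there to make the example reproducible; the SQL function always uses CURRENT_TIMESTAMP().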

How can I extract just the hour of a timestamp using standardSQL

How can I extract just the hour of a timestamp using standardSQL.
I've tried everything and no function works. The problem is that I have to extract the time from a column, and this column is in the following format: 2018-07-09T02:40:23.652Z
If I just put in a literal date, it works, but if I use the column it gives the error below:
Syntax error: Expected ")" but got identifier "searchIntention" at [4:32]
The query is below:
#standardSQL
select TOTAL, dia, hora FROM
(SELECT cast(replace(replace(searchIntention.createdDate,'T',' '),'Z','')as
DateTime) AS DIA,
FORMAT_DATETIME("%k", DATETIME searchIntention.createdDate) as HORA,
count(searchintention.id) as Total
from `searchs.searchs2016626`
GROUP BY DIA)
Please, help me. :(
Below is for BigQuery Standard SQL
You can use EXTRACT(HOUR FROM yourTimeStampColumn)
for example:
SELECT EXTRACT(HOUR FROM CURRENT_TIMESTAMP())
or
SELECT EXTRACT(HOUR FROM TIMESTAMP '2018-07-09T02:40:23.652Z')
or
SELECT EXTRACT(HOUR FROM TIMESTAMP('2018-07-09T02:40:23.652Z'))
In BigQuery Standard SQL, you can use the EXTRACT timestamp function to return an INT64 value corresponding to the part of the timestamp that you want to retrieve.
The full list of available parts is in the documentation, but for your use case you can use the HOUR part directly to retrieve the INT64 hour value from a field of TIMESTAMP type.
#standardSQL
# Create a table
WITH table AS (
SELECT TIMESTAMP("2018-07-09T02:40:23.652Z") time
)
# Extract values from a Timestamp expression
SELECT
EXTRACT(DAY FROM time) as day,
EXTRACT(MONTH FROM time) as month,
EXTRACT(YEAR FROM time) as year,
EXTRACT(HOUR FROM time) AS hour,
EXTRACT(MINUTE FROM time) as minute,
EXTRACT(SECOND from time) as second
FROM
table
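To double-check what values the EXTRACT calls above should produce for the sample timestamp, the same parts can be read off in Python. This is just a cross-check sketch: the extract_parts helper is hypothetical, and it assumes Python 3.7+ so that %z accepts the trailing Z.

```python
from datetime import datetime


def extract_parts(value: str) -> dict:
    """Split an ISO 8601 UTC string into integer parts (hypothetical helper)."""
    ts = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")
    return {
        "day": ts.day,
        "month": ts.month,
        "year": ts.year,
        "hour": ts.hour,
        "minute": ts.minute,
        "second": ts.second,
    }


parts = extract_parts("2018-07-09T02:40:23.652Z")
print(parts["hour"])  # 2
```

Each part comes back as an integer, which matches the INT64 results described above.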

How to convert Date-time into Date using Netezza

I am doing a calculation, but the result is off because my date field includes the timestamp and I want to use only the date in the calculation. How can I ignore the time part and use just the date? Here is what I have:
SELECT EF.DSCH_TS,
CASE WHEN EXTRACT (DAY FROM EF.DSCH_TS - EF.ADMT_TS)>=0 THEN 'GroupA' END AS CAL
FROM MainTable EF;
Netezza has a built-in function for this:
SELECT DATE(STATUS_DATE) AS DATE,
COUNT(*) AS NUMBER_OF_
FROM X
GROUP BY DATE(STATUS_DATE)
ORDER BY DATE(STATUS_DATE) ASC
This returns just the date portion of the timestamp and is much more useful than casting it to a string with TO_CHAR(), because it works in GROUP BY, HAVING, and with other Netezza date functions (whereas the TO_CHAR method does not).
Also, the DATE_TRUNC() function will pull a specific value out of a timestamp ('Day', 'Month', 'Year', etc.), but not more than one of these without calling multiple functions and concatenating the results.
DATE() is the perfect and simple answer to this, and I am surprised to see so many misleading answers to this question on Stack. I see TO_DATE a lot, which is Oracle's function for this but will not work on Netezza.
With your query, assuming that you're interested in the days between midnight to midnight of the two timestamps, it would look something like this:
SELECT EF.DSCH_TS,
CASE
WHEN EXTRACT (DAY FROM (DATE(EF.DSCH_TS) - DATE(EF.ADMT_TS)))>=0 THEN 'GroupA'
END AS CAL
FROM MainTable EF;
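The difference between subtracting full timestamps and subtracting midnight-to-midnight dates is easy to see with a small Python sketch. The function name and the sample admit and discharge times below are made up for illustration; the sketch only demonstrates the arithmetic, not Netezza's behavior.

```python
from datetime import datetime


def days_between_dates(discharged: datetime, admitted: datetime) -> int:
    """Whole days between the calendar dates, ignoring the time of day."""
    return (discharged.date() - admitted.date()).days


# Admitted shortly before midnight, discharged shortly after (made-up values).
admitted = datetime(2023, 3, 1, 23, 50)
discharged = datetime(2023, 3, 2, 0, 10)

print((discharged - admitted).days)              # 0: only 20 minutes elapsed
print(days_between_dates(discharged, admitted))  # 1: the calendar date changed
```

Dropping the time part before subtracting is what makes a stay that crosses midnight count as one day.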
You may want to consider rewriting your case statement to return an interval. This will allow for a little more flexibility.
SELECT EF.DSCH_TS,
CASE
WHEN age(date(EF.DSCH_TS),date(EF.ADMT_TS))>= interval '6 days'
THEN 'GroupA' END AS CAL
FROM MainTable EF;
Use date_trunc() with the first argument of 'day'. I think this is what you want:
SELECT EF.DSCH_TS,
(case when date_trunc('day', EF.DSCH_TS) >= date_trunc('day', EF.ADMT_TS) THEN 'GroupA' END) AS CAL
FROM MainTable EF;