BigQuery: Compute Months Between Two Dates Which Are Strings - google-bigquery

I am working in BigQuery with a table. Two of the columns, Date1 and Date2, are supposed to be dates but they are strings in the 'YYYYMM' format.
I would like to compute the number of months between these two dates.
For example, if Date1 was '202106' and Date2 was '201901', the result would be 29.
My data table has approximately 500,000 rows.
Thank you.

Use the query below:
select date_diff(
cast(Date1 as date format 'YYYYMM'),
cast(Date2 as date format 'YYYYMM')
, month)
If applied to the sample data in your question (Date1 = '202106' and Date2 = '201901'), the output is 29.
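The arithmetic behind that result can be checked outside the database. The following is only a sketch in Python, not part of the answer above; the function name months_between is made up for illustration, and it assumes both inputs are well-formed 'YYYYMM' strings.

```python
def months_between(later: str, earlier: str) -> int:
    """Whole calendar months from `earlier` to `later`, both 'YYYYMM' strings.

    Hypothetical helper: each string is turned into a running month index
    (year * 12 + month) and the two indexes are subtracted.
    """
    later_index = int(later[:4]) * 12 + int(later[4:6])
    earlier_index = int(earlier[:4]) * 12 + int(earlier[4:6])
    return later_index - earlier_index


print(months_between("202106", "201901"))  # 29
```

Because both strings resolve to the first day of a month, counting month boundaries and subtracting month indexes give the same number here.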

Related

How to calculate exact hours between two datetime fields?

I need to calculate hours between datetime fields and I can achieve it by simply doing
select date1,date2,(date1-date2) from table; --This gives answer in DD:HH:MM:SS format
select date1,date2,(trunc(date1)-trunc(date2))*24 from table; --This doesn't take into account the time, it only gives hours between two dates.
Is there a way I can find the difference between date times that gives the output in Hours as a number?
The 'format' comment on your first query suggests your columns are timestamps, despite the dummy column names, as the result of subtracting two timestamps is an interval. Your second query is implicitly converting both timestamps to dates before subtracting them to get an answer as a number of days - which would be fractional if you weren't truncating them and thus losing the time portion.
You can extract the number of hours from the interval difference, and also 24 * the number of days if you expect it to exceed a day:
extract(day from (date1 - date2)) * 24 + extract(hour from (date1 - date2))
If you want to include fractional hours then you can extract and manipulate the minutes and seconds too.
You can also explicitly convert to dates, and truncate or floor after manipulation:
floor((cast(date1 as date) - cast(date2 as date)) * 24)
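To make the two variants concrete (whole hours built from days and hours, versus fractional hours), here is a rough Python sketch of the same arithmetic. It is an illustration only, not Oracle code; the sample timestamps and the function names are invented, and it assumes the first argument is the later one.

```python
from datetime import datetime


def whole_hours_between(later: datetime, earlier: datetime) -> int:
    """Days * 24 plus the hour component, mirroring the extract() expression.

    Assumes later >= earlier; minutes and seconds are dropped.
    """
    delta = later - earlier
    return delta.days * 24 + delta.seconds // 3600


def fractional_hours_between(later: datetime, earlier: datetime) -> float:
    """Same difference, keeping minutes and seconds as a fraction of an hour."""
    return (later - earlier).total_seconds() / 3600


# Made-up sample values: 2 days, 6 hours and 30 minutes apart.
date1 = datetime(2021, 9, 7, 18, 30)
date2 = datetime(2021, 9, 5, 12, 0)

print(whole_hours_between(date1, date2))       # 54
print(fractional_hours_between(date1, date2))  # 54.5
```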
In SQL Server, use the DATEDIFF function.
Example:
SELECT DATEDIFF(HOUR, '2021-09-05 12:00:00', GETDATE());
You can find it by taking the difference of the dates and multiplying by 24:
select date1
,date2
,(date1-date2)*24 as diff_in_hrs
from table

SQL difference between two datetime columns

I have a dataset with 2 columns of datetime datatype as shown here:
I want to take the difference between the two dates and I try it with this code:
Select
*,
original_due_date - due_date as difference
from
Table
However I'm not sure if the same would suffice as this is a datetime and not just date.
Any inputs would be much appreciated.
Desired output
The question was originally tagged Postgres, so this answers the original question.
Presumably, you are storing the values as timestamps. If you just want the results in days, then convert to dates and take the difference:
Select t.*,
(t.original_due_date::date - t.due_date::date) AS difference
from Table t;
If you want fractional days, then a pretty simple method is to extract the "epoch", which is measured in seconds, and use arithmetic:
Select t.*,
( extract(epoch from t.original_due_date) -
extract(epoch from t.due_date)
) / (24.0 * 60 * 60) AS decimal_days
from Table t;
Transform the timestamps to seconds (unix_timestamp), calculate the difference, and divide by (60*60*24) to get days:
select (unix_timestamp(original_due_date, 'MM-dd-yyyy HH:mm')-unix_timestamp(due_date, 'MM-dd-yyyy HH:mm'))/(60*60*24) as difference_days
from (select '07-01-2021 00:00' as due_date, '02-10-2020 00:00' as original_due_date) t
Result:
-507
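The same seconds-then-divide arithmetic can be reproduced in Python as a sanity check on that result. This is a sketch only: diff_days is a made-up name, and the format string is my Python counterpart of the 'MM-dd-yyyy HH:mm' pattern, using naive datetimes so no time zone or daylight-saving shift is involved.

```python
from datetime import datetime

# Python counterpart of the 'MM-dd-yyyy HH:mm' pattern used in the query.
FMT = "%m-%d-%Y %H:%M"


def diff_days(original_due_date: str, due_date: str) -> float:
    """Difference in days, computed as a difference in seconds / 86400."""
    first = datetime.strptime(original_due_date, FMT)
    second = datetime.strptime(due_date, FMT)
    return (first - second).total_seconds() / (60 * 60 * 24)


# Same sample values as in the query above.
print(diff_days("02-10-2020 00:00", "07-01-2021 00:00"))  # -507.0
```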

Converting date format number to date and taking difference in SQL

I have a data set as below,
The date keys are in "YYYYMMDD" format. I wanted to convert the columns to date format and take the difference between them.
I used the code below:
SELECT to_date(statement_date_key::text, 'yyyymmdd') AS statement_date,
to_date(paid_date_key::text, 'yyyymmdd') AS paid_date,
statement_date - paid_date AS Diff_in_days
FROM Table
WHERE Diff_in_days >= 90
;
The idea is to convert both columns to dates, take the difference between them, and filter cases where the difference in days is more than 90.
Later I was informed that the server runs HiveSQL, which does not support using ":" or date time, and that temp tables cannot be created.
I'm currently stuck on how to proceed given these constraints.
Help would be much appreciated.
Sample data for reference is provided in the link
dbfiddle
Hive is a little convoluted in its use of dates. You can use unix_timestamp() and work from there:
SELECT datediff(to_date(from_unixtime(unix_timestamp(cast(statement_date_key as varchar(10)), 'yyyyMMdd'))),
to_date(from_unixtime(unix_timestamp(cast(paid_date_key as varchar(10)), 'yyyyMMdd')))
) as diff_in_days
FROM Table;
Note that you need to use a subquery if you want to use diff_in_days in a where clause.
Also, if you have date keys, then presumably you also have a calendar table, which should make this much simpler.
You can use the query below (note that it uses SQL Server's convert and datediff syntax):
select * from (
select convert(date, statement_date_key) AS statement_date,
convert(date, paid_date_key) AS paid_date,
datediff(D, convert(date, statement_date_key), convert(date, paid_date_key)) as Diff_in_days
from Table
) qry
where Diff_in_days >= 90
Simple way: the function unix_timestamp(string, pattern) converts a string in the given format to the number of seconds since the Unix epoch. Calculate the difference in seconds, then divide by (60*60*24) to get the difference in days.
select * from
(
select t.*,
(unix_timestamp(string(paid_date_key), 'yyyyMMdd') -
unix_timestamp(string(statement_date_key), 'yyyyMMdd'))/86400 as Diff_in_days
from Table t
) t
where Diff_in_days>=90
You may want to add abs() if the difference can be negative.
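For checking a few rows by hand, the same idea (parse the YYYYMMDD key, subtract, filter at 90 days, optionally with abs()) can be sketched in Python. The function name and the sample keys below are made up for illustration; only the column names come from the question.

```python
from datetime import datetime


def days_between_keys(paid_date_key: int, statement_date_key: int) -> int:
    """Days from the statement date to the paid date, both YYYYMMDD integers."""
    paid = datetime.strptime(str(paid_date_key), "%Y%m%d").date()
    statement = datetime.strptime(str(statement_date_key), "%Y%m%d").date()
    return (paid - statement).days


# Made-up rows: (statement_date_key, paid_date_key).
rows = [(20210101, 20210415), (20210301, 20210320)]

# Keep rows at least 90 days apart in either direction, as abs() would do.
late = [row for row in rows if abs(days_between_keys(row[1], row[0])) >= 90]
print(late)  # [(20210101, 20210415)]
```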
One more method using regexp_replace:
select * from
(
select t.*,
datediff(date(regexp_replace(string(paid_date_key), '(\\d{4})(\\d{2})(\\d{2})','$1-$2-$3')),
date(regexp_replace(string(statement_date_key), '(\\d{4})(\\d{2})(\\d{2})','$1-$2-$3'))) as Diff_in_days
from Table t
) t
where Diff_in_days>=90

Oracle SQL: How to modify query in order to get only results within a certain timeframe?

I use this statement in Oracle SQL Developer
select to_char(time,'DD/MM/YY hh24'),count(column) as xyz from table
where to_char(time,'DD/MM/YY')>= '08/04/21'
and to_char(time,'DD/MM/YY')<= '09/04/21'
and column='xyz'
group by to_char(time,'DD/MM/YY hh24')
order by to_char(time,'DD/MM/YY hh24');
What I expect is a result/table in which the result is ordered by time in ascending order (starting with the earliest hour on 08/04/21 and ending with the latest on 09/04/21). I would expect only entries for days 08/04/21 and 09/04/21. Instead, I get a result where other dates are also included, like 09/02/21 or 08/12/20.
How can I modify my query?
You are converting your native date values to strings (with two-digit years!) and then comparing those strings. The string '08/12/20' is 'less than' the string '09/04/21'.
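The effect is easy to reproduce outside Oracle. The short Python sketch below (function names invented for illustration) compares the same 'DD/MM/YY' values first as plain strings and then as real dates, and shows why 08/12/20 and 09/02/21 slip through the string filter.

```python
from datetime import datetime

LOW, HIGH = "08/04/21", "09/04/21"


def in_range_as_text(value: str) -> bool:
    """Character-by-character comparison, which is what the original WHERE does."""
    return LOW <= value <= HIGH


def in_range_as_date(value: str) -> bool:
    """Parse 'DD/MM/YY' into dates first, then compare chronologically."""
    def parse(text):
        return datetime.strptime(text, "%d/%m/%y").date()
    return parse(LOW) <= parse(value) <= parse(HIGH)


for sample in ["08/12/20", "09/02/21", "08/04/21"]:
    print(sample, in_range_as_text(sample), in_range_as_date(sample))
# 08/12/20 True False
# 09/02/21 True False
# 08/04/21 True True
```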
Compare your dates with other dates, which is easier as literals:
select to_char(trunc(time, 'HH'), 'DD/MM/YY HH24'), count(column) as xyz
from table
where time >= date '2021-04-08'
and time < date '2021-04-10'
and column='xyz'
group by trunc(time, 'HH')
order by trunc(time, 'HH');
I've used trunc() to remove/zero the minute and seconds parts, which means you can then group and order by that value; and just convert to a string for display at the last moment.
I've also converted to_char(time,'DD/MM/YY')<= '09/04/21' to time < date '2021-04-10' rather than time < date '2021-04-09', as your version includes all data from the 9th; which may or may not be what you intended - you might have been trying to get a single day.
Assuming that time is of data type date, you don't want to do a to_char on it in your where clause or in your order by. As written, you're doing string comparisons rather than date comparisons, so you're getting rows where the to_char(time) string sorts alphabetically between the two values, not rows where the date is between the two dates. Compare against date literals or do explicit to_date calls on your string literals.
My wager is that you really want something like this
select trunc(time, 'HH24'),count(column) as xyz
from table
where time >= date '2021-04-08'
and time < date '2021-04-10'
and column='xyz'
group by trunc(time, 'HH24')
order by trunc(time, 'HH24');

YYYYMMDD to YYYYMM in oracle

I have a column with DATE datatype in a table.
I am trying to retrieve the column values in YYYYMM format. My select query looks like below
select *
from tablename
where date column = to_char(to_date('12/31/4000','MM/DD/YYYY'),'YYYYMM');
I am getting below exception.
ORA-01847: day of month must be between 1 and last day of month
Appreciate any input on this.
I think the simplest method is:
where to_char(datecolumn, 'YYYYMM') = '400012'
Or, if you prefer:
where to_char(datecolumn, 'YYYYMM') = to_char(to_date('12/31/4000', 'MM/DD/YYYY'), 'YYYYMM');
Syntax-wise, the right hand date (to the right of the equals) is OK. But you are doing a character comparison, not a date comparison.
This works for me in multiple databases:
select to_char (to_date('12/31/4000','MM/DD/YYYY'),'YYYYMM')
from dual;
Even though your column is named DATE_COLUMN, you are comparing based on characters in the query.
So, try this instead - this compares based on dates (NOT a character comparison) and truncates off the hour, minute, etc., so you are only comparing the DAY:
select * from DATE_TAB
where TRUNC(DATE1, 'DDD') = TRUNC(to_date('12/31/4000','MM/DD/YYYY'),'DDD');
NOTE: The DATE1 field above is a DATE field. If your DATE_COLUMN is not a DATE field, you must convert it to a DATE datatype first (using TO_DATE, etc.).
Assuming that "date_column" is actually a date, and that you have an index on date_column, you can do something like this to return the data quickly (without truncating dates in all rows to do a comparison):
with dat as (
select level as id, sysdate - (level*10) as date_column
from dual
connect by level <= 100
)
select id, date_column
from dat
where date_column between to_date('11/1/2013', 'MM/DD/YYYY') and last_day(to_date('11/2013 23:59:59', 'MM/YYYY HH24:MI:SS'))
Here I just dummy up some data with dates going back a few years. This example picks all rows that have a date in the month of November 2013.
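The underlying idea, comparing the untouched column against a precomputed month range instead of converting every row, can be sketched in Python as follows. Note that this sketch swaps in a half-open range (start inclusive, next month's start exclusive) rather than the BETWEEN ... 23:59:59 bound used above; the function name and the sample rows are made up.

```python
from datetime import datetime


def month_bounds(year: int, month: int) -> tuple:
    """Return (first instant of the month, first instant of the next month)."""
    start = datetime(year, month, 1)
    if month == 12:
        next_start = datetime(year + 1, 1, 1)
    else:
        next_start = datetime(year, month + 1, 1)
    return start, next_start


start, next_start = month_bounds(2013, 11)

# Made-up rows standing in for date_column values.
rows = [datetime(2013, 10, 31, 23, 59), datetime(2013, 11, 1), datetime(2013, 11, 30, 23, 59, 59)]
november = [d for d in rows if start <= d < next_start]
print(len(november))  # 2
```

A half-open range avoids having to spell out the last second of the month, at the cost of not being expressible with BETWEEN.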
If your date_column's data-type is DATE, then use
select *
from tablename
where TO_CHAR(date_column,'YYYYMM') = to_char (to_date('12/31/4000','MM/DD/YYYY'),'YYYYMM');
If your date_column's data-type is VARCHAR, then use:
select *
from tablename
where date_column = to_char (to_date('12/31/4000','MM/DD/YYYY'),'YYYYMM');
I somehow feel your error is because you have a space between date and column, as in "date column". If the field name in the table is "COLUMN", then just removing the word "DATE" from your original query would suffice, as:
select *
from tablename
where column = to_char(to_date('12/31/4000','MM/DD/YYYY'),'YYYYMM');
If your column (YYYYMMDD) is in number format, the simplest way to get YYYYMM would be
select floor(DATE/100)
from tablename;
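The integer trick works because dividing a YYYYMMDD number by 100 and discarding the remainder drops the two day digits. A tiny Python sketch of the same arithmetic (the function name is made up, and it assumes a well-formed eight-digit key):

```python
def yyyymm_from_yyyymmdd(key: int) -> int:
    """Drop the two trailing day digits from a YYYYMMDD integer."""
    return key // 100


print(yyyymm_from_yyyymmdd(40001231))  # 400012
```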