Subtract Days from a Date in Apache Hive - sql

How can I subtract a number of days from a date and get another date as the result? For example: 01/12/2016 - 10 = 21/11/2016

(date argument)
hive> select date_sub(date '2016-12-01',10);
OK
2016-11-21
or
(string argument)
hive> select date_sub('2016-12-01',10);
OK
2016-11-21
date_sub(date/timestamp/string startdate, tinyint/smallint/int days)
Subtracts a number of days to startdate: date_sub('2008-12-31', 1) =
'2008-12-30'. Prior to Hive 2.1.0 (HIVE-13248) the return type was a
String because no Date type existed when the method was created.
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF

There is a Hive UDF for subtracting days from a date: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions. You have two options. Either transform your date to the following format so that you can use the UDF directly:
yyyy-MM-dd
or convert your current date to a timestamp and apply the UDF, for example:
date_sub(from_unixtime(unix_timestamp('12/03/2010' , 'dd/MM/yyyy')), 10) -- subtracts 10 days
I hope it helps,
regards!
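As a cross-check outside Hive, the same parse-then-subtract logic can be sketched in Python. This is only an illustration, not Hive code: the helper name `subtract_days` is hypothetical, and `%d/%m/%Y` is assumed to be the Python counterpart of the `dd/MM/yyyy` pattern used above.

```python
from datetime import date, datetime, timedelta

def subtract_days(text: str, days: int, fmt: str = "%d/%m/%Y") -> date:
    """Parse `text` with `fmt`, then go back `days` days (similar to date_sub)."""
    parsed = datetime.strptime(text, fmt).date()
    return parsed - timedelta(days=days)

# The example from the question: 01/12/2016 - 10 days
print(subtract_days("01/12/2016", 10).isoformat())  # 2016-11-21
```

The printed value matches the date_sub result shown in the first answer.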

Related

Convert date to datetime format in SQL

I am trying to convert a date column (e.g. 2012-10-02) to the first day of the year with time (e.g. 2012-01-01T00:00:00) in SQL.
Is there a way to do so in the SELECT query?
For BigQuery, use the following:
select timestamp_trunc('2012-10-02', year)
with output
2012-01-01 00:00:00 UTC
Note - if your column is of date type, the output will be
2012-01-01T00:00:00
Finally, you can use datetime_trunc instead of timestamp_trunc to get the expected result: 2012-01-01T00:00:00
Look at the YEAR() function.
It would allow you to extract just the year, and then you can add the date and time you need.

AWS Athena - Format and filter datetime

I have a table which is fed two different date formats:
d/m/Y & m/d/Y. The date format wanted is d/m/Y
I am able to select the date column and do a check and format if the date is in the wrong format.
This is my current SQL query:
SELECT COALESCE(TRY(date_format(date_parse(tbl.date, '%d/%m/%Y'), '%d/%m/%Y')),
TRY(date_format(date_parse(tbl.date, '%m/%d/%Y'), '%d/%m/%Y'))) as date
FROM xxx
That fixes the mismatched dates. However, I also need to query a date range, e.g. the last 7 days.
If I add a WHERE clause, the query does not execute, as I have already queried the date earlier.
How can I format my dates AND filter based on a given range (last 7 days)?
In ANSI SQL -- implemented by Presto, which Athena is based on -- the WHERE clause cannot reference the SELECT projections, so you need a subquery:
SELECT *
FROM (
SELECT COALESCE(TRY(date_parse ....... AS date
FROM xxx
)
WHERE date > current_date - INTERVAL '7' DAY
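The two-step shape of that query (normalise first, filter afterwards) can be sketched in Python to check the logic on sample values. This is an illustration rather than Athena code: the helper names `parse_mixed` and `last_week`, the sample rows, and the fixed `today` value are all assumptions made for the example.

```python
from datetime import date, datetime, timedelta
from typing import List, Optional

def parse_mixed(text: str) -> Optional[date]:
    """Try d/m/Y first, then m/d/Y; return None if neither matches (like COALESCE(TRY(...)))."""
    for fmt in ("%d/%m/%Y", "%m/%d/%Y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None

def last_week(rows: List[str], today: date) -> List[str]:
    """Keep rows dated after `today - 7 days`, formatted as d/m/Y."""
    cutoff = today - timedelta(days=7)
    parsed = (parse_mixed(r) for r in rows)
    # Compare real date values, then format only for output.
    return [d.strftime("%d/%m/%Y") for d in parsed if d is not None and d > cutoff]

rows = ["25/07/2019", "07/24/2019", "01/07/2019", "not a date"]
print(last_week(rows, today=date(2019, 7, 26)))  # ['25/07/2019', '24/07/2019']
```

Note that a value such as 01/07/2019 is valid under both patterns, so whichever pattern is tried first wins; here it is read as 1 July and falls outside the range.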

The difference between two dates in Hiveql

I wish to find the difference between two dates in date format in HiveQL. I used the function below in SAS to return a date value by subtracting a number:
intnx('day', 20MAR2019 , -7)
It subtracts 7 days from the date and returns 13MAR2019
I wish to convert it to Hiveql language. Any tips would be appreciated!
You can use the date_sub function in Hive to subtract days from a given date.
hive> select current_date;
2019-07-25
hive> select date_sub(current_date,7);
2019-07-18
Passing your original string directly returns NULL, because it is not in yyyy-MM-dd format:
hive> select date_sub('13MAR2019',7);
OK
NULL
Since your date is in 'ddMMMyyyy' format, you can convert it to yyyy-MM-dd format first:
hive> select date_sub(from_unixtime(unix_timestamp('13MAR2019' ,'ddMMMyyyy'), 'yyyy-MM-dd'),7);
OK
2019-03-06
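For comparison with the SAS call in the question, here is a small Python sketch that parses a ddMMMyyyy string, shifts it by a number of days, and formats it back in the same style. The helper name `shift_sas_date` is hypothetical, and the `%b` directive is assumed to run under an English locale so that MAR is recognised as March.

```python
from datetime import datetime, timedelta

def shift_sas_date(text: str, days: int) -> str:
    """Shift a ddMMMyyyy date string by `days` (negative values go back in time)."""
    parsed = datetime.strptime(text, "%d%b%Y").date()
    shifted = parsed + timedelta(days=days)
    # strftime yields e.g. '13Mar2019'; upper-case it to match the SAS-style input.
    return shifted.strftime("%d%b%Y").upper()

print(shift_sas_date("20MAR2019", -7))  # 13MAR2019
```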

Convert set of Dates to End of Month Date

Using Oracle SQL, how can I transform a set of dates to the date for the end of that month? Example below:
Date Amount
18/05/18 10
24/05/18 40
30/05/18 60
Date Amount
31/05/18 110
Thanks
Simply apply last_day (if there's a time part you must apply trunc to remove it):
TRUNC(LAST_DAY(datecol))
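The expected output in the question also sums the amounts per month, which in SQL would presumably mean grouping by the same expression. The idea can be checked with a short Python sketch. The helper name `last_day` is hypothetical, and two-digit years such as 18 are assumed to mean 2018.

```python
import calendar
from collections import defaultdict
from datetime import date

def last_day(d: date) -> date:
    """Return the last calendar day of d's month (similar to Oracle's LAST_DAY)."""
    return d.replace(day=calendar.monthrange(d.year, d.month)[1])

rows = [(date(2018, 5, 18), 10), (date(2018, 5, 24), 40), (date(2018, 5, 30), 60)]

# Group the amounts by month-end date and sum them.
totals = defaultdict(int)
for day, amount in rows:
    totals[last_day(day)] += amount

print(dict(totals))  # {datetime.date(2018, 5, 31): 110}
```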

difference between two timestamps (in days) in oracle

SELECT MIN (snap_id) AS FIRST_SNAP,
MAX (snap_id) AS LAST_SNAP,
MIN (BEGIN_INTERVAL_TIME) AS FIRST_QUERY,
MAX (END_INTERVAL_TIME) AS LAST_QUERY,
max(end_interval_time) - min(begin_interval_time) as "TIME_ELAPSED"
FROM dba_hist_snapshot
ORDER BY snap_id;
2931 3103 5/28/2012 6:00:11.065 AM 6/4/2012 11:00:40.967 AM +07 05:00:29.902000
I would like the last column's output to be 7 (for the days). I have tried trunc and extract as some other posts mentioned, but can't seem to get the syntax right. Any ideas?
Judging from your comment, you're using timestamp columns, not date columns. Subtracting two timestamps yields an interval, and extract(hour from ...) would only return its hour field (05 in your output), so extract the day field to get the whole number of days:
extract(day from (max(end_interval_time) - min(begin_interval_time)))
Or you could cast the timestamp to a date:
trunc(cast(max(end_interval_time) as date) -
cast(min(begin_interval_time) as date))
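The arithmetic can be checked against the sample row from the question with a short Python sketch. The variable names are illustrative, and the two timestamps are copied from the output above.

```python
from datetime import datetime

first_query = datetime(2012, 5, 28, 6, 0, 11, 65000)   # 5/28/2012 6:00:11.065 AM
last_query = datetime(2012, 6, 4, 11, 0, 40, 967000)   # 6/4/2012 11:00:40.967 AM

elapsed = last_query - first_query
print(elapsed)                  # 7 days, 5:00:29.902000
print(elapsed.days)             # 7  (whole days)
print(elapsed.seconds // 3600)  # 5  (hour part of the remainder)
```

The whole-day part is 7, and the remaining hour part is 5, matching the +07 05:00:29.902000 interval in the output.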