Hive + Pass previous day date to like clause - hive

I am trying to fetch records from a Hive table based on the previous day's date.
Example: If table is as follows
CustomerVisit table
ID NAME VISIT_DATE
------------------
01 Harish 2018-02-31
03 Satish 2017-02-13
04 Shiva 2018-03-04
Now I need to fetch all records that have visit_date = 2018-03-04 (i.e. today's date - 1).
Expected Query something like:
select ID, Name from CustomerVisit where
visit_date like concat((select date_sub(current_date, 1)),'%')
I have tried the following:
select current_date; -- gives the current date
select date_sub(current_date, 1); -- gives the previous date
select concat(tab1.date1,'%') from
(select date_sub(current_date, 1) as date1) as tab1; -- gives the previous date with % appended, which I need to use in LIKE
But when I use the above as a sub-query, as below, it fails:
select tab2.id, (select concat(tab1.date1,'%') as pattern from
(select date_sub(current_date, 1) as date1) as tab1) from CustomerVisit as tab2 limit 1;
FAILED: ParseException line 1:0 cannot recognize input near 'seelct' 'current_date' ''
How do I write a query to get results for the previous date?

You don't need a LIKE clause. Just compare with equality (=):
select ID, Name from CustomerVisit where
visit_date = date_sub(current_date, 1);
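
The equality filter can be tried on the sample rows without a Hive cluster. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for Hive, and it passes "today" in as a parameter (an assumption made here so the result is repeatable) instead of calling current_date.

```python
import sqlite3
from datetime import date, timedelta


def visits_on_previous_day(rows, today):
    """Return (id, name) pairs whose visit_date equals the day before `today`."""
    # Same idea as date_sub(current_date, 1), rendered as 'YYYY-MM-DD' text.
    yesterday = (today - timedelta(days=1)).isoformat()
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE CustomerVisit (id TEXT, name TEXT, visit_date TEXT)")
    conn.executemany("INSERT INTO CustomerVisit VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT id, name FROM CustomerVisit WHERE visit_date = ?", (yesterday,)
    ).fetchall()
    conn.close()
    return result


# Sample rows copied from the question (stored as plain text).
sample_rows = [
    ("01", "Harish", "2018-02-31"),
    ("03", "Satish", "2017-02-13"),
    ("04", "Shiva", "2018-03-04"),
]
matches = visits_on_previous_day(sample_rows, date(2018, 3, 5))
print(matches)
```

With "today" taken as 2018-03-05, only the row dated 2018-03-04 matches, so no pattern matching is involved.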

Related

How to find the number of occurences within a date range?

Let's say I have hospital visits in the table TestData
I would like to know which patients have had a second hospital visit within 7 days of their first hospital visit.
How would I code this in SQL?
patient_id is stored as TEXT.
date_visit is also TEXT, in the format MM/DD/YYYY.
patient_id | date_visit
--------------------------
A123B29133 | 07/12/2011
A123B29133 | 07/14/2011
A123B29133 | 07/20/2011
A123B29134 | 12/05/2016
In the above table, patient A123B29133 fulfills the condition, as they were seen on 07/14/2011, which is less than 7 days after 07/12/2011.
You can use a subquery with exists:
with to_d(id, v_date) as (
select patient_id, substr(date_visit, 7, 4)||'-'||substr(date_visit, 1, 2)||'-'||substr(date_visit, 4, 2) from visits
)
select t2.id from (select t1.id, min(t1.v_date) d1 from to_d t1 group by t1.id) t2
where exists (select 1 from to_d t3 where t3.id = t2.id and t3.v_date != t2.d1 and t3.v_date <= date(t2.d1, '+7 days'))
id
A123B29133
Since your date column is not in YYYY-MM-DD, the default format expected by SQLite's date functions, the substr function is used to transform your dates into this format. JulianDay then converts the dates to numeric day values, which makes the 7-day comparison easy. The MIN window function identifies the first hospital visit date for each patient. The samples below show the query used to transform the data and its results, before the final query that filters on your requirement, i.e. < 7 days. With this window-function approach you can also retrieve the visit_date and the number of days since the first visit, if desired.
You may read more about SQLite date functions in the SQLite documentation.
Query #1
SELECT
patient_id,
visit_date,
JulianDay(visit_date) -
MIN(JulianDay(visit_date)) OVER (PARTITION BY patient_id)
as num_of_days_since_first_visit
FROM
(
SELECT
*,
(
substr(date_visit,7) || '-' ||
substr(date_visit,1,2) || '-' ||
substr(date_visit,4,2)
) as visit_date
FROM
visits
) v;
patient_id | visit_date | num_of_days_since_first_visit
-------------------------------------------------------
A123B29133 | 2011-07-12 | 0
A123B29133 | 2011-07-14 | 2
A123B29133 | 2011-07-20 | 8
A123B29134 | 2016-12-05 | 0
Query #2
The below is your desired query, which uses the previous query as a CTE and applies the filter for visits less than 7 days. num_of_days <> 0 is applied to remove entries where the first date is also the date of the record.
WITH num_of_days_since_first_visit AS (
SELECT
patient_id,
visit_date,
JulianDay(visit_date) - MIN(JulianDay(visit_date)) OVER (PARTITION BY patient_id) num_of_days
FROM
(
SELECT
*,
(
substr(date_visit,7) || '-' ||
substr(date_visit,1,2) || '-' ||
substr(date_visit,4,2)
) as visit_date
FROM
visits
) v
)
SELECT DISTINCT
patient_id
FROM
num_of_days_since_first_visit
WHERE
num_of_days <> 0 AND num_of_days < 7;
patient_id
A123B29133
Let me know if this works for you.
I would like to know which patients have had a second hospital visit within 7 days of their first hospital visit.
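You can use lag(). Assuming date_visit is stored as YYYY-MM-DD (the MM/DD/YYYY text from the question would need converting first), the following gets all rows whose previous visit by the same patient was at most 7 days earlier: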
select t.*
from (select t.*,
lag(date_visit) over (partition by patient_id order by date_visit) as prev_date_visit
from t
) t
where prev_date_visit >= date(date_visit, '-7 day');
If you just want the patient_ids, you can use select distinct patient_id.
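
To check the "first visit" approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. The CTE names iso and first_visit are made up for this example, the table is called visits as in the answers above, and "within 7 days" is read here as a difference of at most 7 days (one of the answers uses strictly less than 7).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient_id TEXT, date_visit TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("A123B29133", "07/12/2011"),
        ("A123B29133", "07/14/2011"),
        ("A123B29133", "07/20/2011"),
        ("A123B29134", "12/05/2016"),
    ],
)

QUERY = """
WITH iso AS (
    -- Rebuild MM/DD/YYYY text as YYYY-MM-DD so SQLite date functions accept it.
    SELECT patient_id,
           substr(date_visit, 7, 4) || '-' ||
           substr(date_visit, 1, 2) || '-' ||
           substr(date_visit, 4, 2) AS visit_date
    FROM visits
),
first_visit AS (
    -- Earliest visit per patient (ISO text sorts chronologically).
    SELECT patient_id, MIN(visit_date) AS first_date
    FROM iso
    GROUP BY patient_id
)
SELECT DISTINCT i.patient_id
FROM iso AS i
JOIN first_visit AS f ON f.patient_id = i.patient_id
WHERE i.visit_date > f.first_date
  AND julianday(i.visit_date) - julianday(f.first_date) <= 7
ORDER BY i.patient_id
"""

patients = [row[0] for row in conn.execute(QUERY)]
print(patients)
```

Note that the condition visit_date > first_date ignores a second visit on the same calendar day as the first one; whether that should count depends on your data.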

How do I create a new column showing difference between maximum date in table and date in row?

I need two columns: 1 showing 'date' and the other showing 'maximum date in table - date in row'.
I kept getting a zero in the 'datediff' column, and thought a nested select would work.
SELECT date, DATEDIFF(max_date, date) AS datediff
(SELECT MAX(date) AS max_date
FROM mytable)
FROM mytable
GROUP BY date
Currently getting this error from the above code : mismatched input '(' expecting {, ';'}(line 2, pos 2)
Correct format in the end would be:
date | datediff
--------------------------
2021-08-28 | 0
2021-07-26 | 33
2021-07-23 | 36
2021-08-11 | 17
If you want the date difference, you can use:
SELECT date, DATEDIFF(MAX(date) OVER (), date) AS datediff
FROM mytable
GROUP BY date
You can do this using the analytic function MAX() Over()
SELECT date, MAX(date) OVER() - date FROM mytable;
Tried this here on sqlfiddle
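
The nested-select idea from the question also works if the maximum is computed once in a derived table and joined to every row. This is only a sketch, run in SQLite through Python's sqlite3 module; the original error message suggests a different engine, so the julianday arithmetic here stands in for DATEDIFF, and the alias names t and m are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (date TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("2021-08-28",), ("2021-07-26",), ("2021-07-23",), ("2021-08-11",)],
)

rows = conn.execute(
    """
    SELECT t.date,
           -- julianday() returns day numbers, so the difference is in days.
           CAST(julianday(m.max_date) - julianday(t.date) AS INTEGER) AS datediff
    FROM mytable AS t
    CROSS JOIN (SELECT MAX(date) AS max_date FROM mytable) AS m
    ORDER BY t.date DESC
    """
).fetchall()

for day, diff in rows:
    print(day, diff)
```

Because the maximum comes from a one-row derived table, no GROUP BY is needed on the outer query.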

OR for two different columns

I am trying to select specific rows from an Oracle DB.
The table has the following structure:
Order | Date | Status
--------------------------
1 | 01.01.2018 | 10
2 | 01.01.2018 | 15
I would like to extract all rows where
Status <= 85 or
the order date is in this week
Unfortunately, column Status is declared as a text column.
How would you build a SQL to extract these specific rows?
Hmmm . . . I don't know what you mean by "this week". Perhaps:
where status <= '85' or
orderdate >= trunc(sysdate, 'IW')
This does a string comparison on the status, so '9' would not be matched, and uses the ISO definition of week for the current week.
If you want a numeric comparison for status, then:
where to_number(status) <= 85 or
orderdate >= trunc(sysdate, 'IW')
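
The difference between the two comparisons is easy to see outside the database. The sketch below uses plain Python strings; Python compares text character by character, which for digit strings gives the same ordering as the string comparison described above. The status values are made-up examples.

```python
# Hypothetical status values, stored as text like in the question.
statuses = ["10", "15", "9", "85", "100"]

# Text comparison: '9' sorts after '85' and '100' sorts before it.
matched_as_text = [s for s in statuses if s <= "85"]

# Numeric comparison: convert first, the same idea as to_number(status).
matched_as_number = [s for s in statuses if int(s) <= 85]

print(matched_as_text)
print(matched_as_number)
```

The text comparison lets '100' through and drops '9', which is why the numeric version is usually what you want for a status code.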
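Maybe try the CAST function. Assuming the date column is stored as text in DD.MM.YYYY format (called orderdate here, since DATE is a reserved word in Oracle):
select * from yourtable where cast(Status as INT) <= 85 OR
to_char(to_date(orderdate,'DD.MM.YYYY'),'YYYY-WW') = to_char(sysdate,'YYYY-WW')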

Year in DD/MON/YY converts to 1943 and 2043 in 2 different tables with the same data

I am trying the same SQL on 2 different tables in the same database.
SELECT date_of_birth_1 from Table1 where id = '1111';
The output is 31/DEC/43.
SELECT date_of_birth_2 from Table2 where id = '1111';
The output is 31/DEC/43 again.
But when I run
SELECT extract(year from date_of_birth_1) from Table1 where id = '1111';
The output is 1943.
And when I run
SELECT extract(year from date_of_birth_2) from Table2 where id = '1111';
The output is 2043.
I don't understand what is going on, could you please help me. I want both the tables to use the same reference year which is 1900.
Edit: This happens only for some dates.
select EXTRACT(year FROM TO_DATE('01/AUG/43')) from dual;
The output is 2043.
select EXTRACT(year FROM TO_DATE('04/MAR/53')) from dual;
The output is 1953.
By default, Oracle is only showing the last two digits of the year. In one table, the date would seem to be 1943-12-31 and in the other 2043-12-31.
You can see the full date using to_char():
select to_char(dob, 'YYYY-MM-DD') from t;
If you need to fix the data, you can do something like:
update t
set dob = add_months(dob, -12 * 100)
where dob > <whatever threshold you want here>
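
The results in the question's edit ('43' becoming 2043, '53' becoming 1953) are consistent with the session's date format using Oracle's RR year element. That is an assumption, since the question does not show the session settings. Under that assumption, the two-digit year is expanded relative to the current year, roughly as in this Python sketch (the function name rr_year is made up for illustration):

```python
def rr_year(two_digit_year, current_year):
    """Expand a two-digit year following the rule of Oracle's RR format element."""
    if not 0 <= two_digit_year <= 99:
        raise ValueError("two_digit_year must be between 0 and 99")
    century = current_year // 100 * 100
    current_in_low_half = current_year % 100 < 50   # current year ends in 00-49
    given_in_low_half = two_digit_year < 50         # given year is 00-49
    if given_in_low_half and not current_in_low_half:
        century += 100   # e.g. '05' entered in 1998 means 2005
    elif not given_in_low_half and current_in_low_half:
        century -= 100   # e.g. '53' entered in 2024 means 1953
    return century + two_digit_year


print(rr_year(43, 2024))
print(rr_year(53, 2024))
```

If a string such as '31/DEC/43' was loaded into one table through such a format, it would have been stored as 2043, which would explain the difference between the two tables. Loading with an explicit four-digit year avoids the ambiguity.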

Appending the result query in bigquery

I am writing a BigQuery query whose result for each day should include the data from all previous days.
So the result for today should be larger than yesterday's, because the data accumulates day by day.
So far I have only managed to get the data per day (the number of IDs declines instead of accumulating from the previous day).
What should I add to the query so that each day's result includes the data from the previous days in BigQuery?
code:
WITH
table1 AS (
SELECT
ID,
...
FROM t
WHERE DATE_SUB('2020-01-31', INTERVAL 31 DAY) and '2020-01-31'
),
table2 AS (
SELECT
ID,
COUNTIF((rating < 7) as bad,
COUNTIF((rating >= 7 AND SAFE_CAST(NPS_Rating as INT64) < 9) as intermediate,
COUNTIF((rating as good
FROM
t
WHERE DATE_SUB('2020-01-31', INTERVAL 31 DAY) and '2020-01-31'
)
SELECT
DATE_SUB('2020-01-31', INTERVAL 31 DAY) as date,
*
FROM table1
FULL OUTER JOIN table2 USING (ID)
If you have counts that you want to accumulate, then you want a cumulative sum. The query would look something like this:
select datecol, count(*), sum(count(*)) over (order by datecol)
from t
group by datecol
order by datecol;
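
To see what the cumulative sum produces, here is a small sketch that runs the same pattern in SQLite (3.25 or later, for window functions) through Python's sqlite3 module, not in BigQuery. The table contents are made-up sample data, and the daily counts are computed in a subquery first so that the window function only has to add them up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (datecol TEXT, id INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("2020-01-01", 1),
        ("2020-01-01", 2),
        ("2020-01-02", 3),
        ("2020-01-03", 4),
        ("2020-01-03", 5),
        ("2020-01-03", 6),
    ],
)

rows = conn.execute(
    """
    SELECT datecol,
           n AS daily_count,
           -- Running total of the daily counts up to and including this day.
           SUM(n) OVER (ORDER BY datecol) AS running_total
    FROM (SELECT datecol, COUNT(*) AS n FROM t GROUP BY datecol)
    ORDER BY datecol
    """
).fetchall()

for row in rows:
    print(row)
```

The daily counts go 2, 1, 3 while the running total only ever grows, which is the "appending" behaviour asked for.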