Using Previously Calculated Values in a CASE Statement - sql

I am currently working on an Oracle database to identify inactive relationships based on several conditions.
I have a CASE statement that calculates the 'Status' field from the last login date.
These are the conditions I have to check:
If the last login date is within 6 months, set the status to Active. This I have already done.
If the last login date is older than 6 months and the user has one or more other accounts whose status was calculated as Active by the previous condition, set the status of those older accounts to Active as well.
This one I'm unable to achieve.
I tried selecting the resulting table in a FROM clause once more, hoping I could populate the data that way, but I do not understand how to write that piece.
SELECT t1.UserName,
t1.UserID,
t1.LastLoginDate,
t1.status
FROM (SELECT UserName,
UserID,
LastLoginDate,
(CASE
WHEN LastLoginDate > ADD_MONTHS (DATE '2019-07-01', -6)
THEN
'Active'
END)
AS status
FROM User_Mas_Table) t1
Above query gives me the following results.
UserName UserID LastLoginDate STATUS
---------- ---------- --------------- ------
AAAAAA 1 7/23/2019 Active
AAAAAA 2 7/24/2019 Active
AAAAAA 3 11/7/2018
CCCCCC 4 7/24/2019 Active
BBBBBB 5 4/30/2019 Active
DDDDDD 6 5/24/2019 Active
EEEEEE 7 7/22/2019 Active
FFFFFF 8 3/14/2019 Active
GGGGGG 9 7/24/2019 Active
GGGGGG 10 5/14/2018
HHHHHH 11 4/30/2019 Active
I need the empty ones to be set to Active as well.

Use a window function to compare the most recent login date over all user accounts:
select um.*,
(case when max(lastlogindate) over (partition by username) > add_months(date '2019-07-01', -6)
then 'Active'
end) as status
from User_Mas_Table um;
If you only want "active" when the account has a lastlogindate, then the logic is:
select um.*,
(case when lastlogindate is not null and
max(lastlogindate) over (partition by username) > add_months(date '2019-07-01', -6)
then 'Active'
end) as status
from User_Mas_Table um;
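As a rough illustration of the window-function approach above, the sketch below replays it on an in-memory SQLite database from Python. SQLite stands in for Oracle here, so date('2019-07-01', '-6 months') replaces add_months and dates are stored as ISO strings. The ZZZZZZ row is a hypothetical inactive user added only to show the empty case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE User_Mas_Table (UserName TEXT, UserID INTEGER, LastLoginDate TEXT)"
)
conn.executemany(
    "INSERT INTO User_Mas_Table VALUES (?, ?, ?)",
    [
        ("AAAAAA", 1, "2019-07-23"),
        ("AAAAAA", 2, "2019-07-24"),
        ("AAAAAA", 3, "2018-11-07"),   # old login, but the user has active accounts
        ("GGGGGG", 9, "2019-07-24"),
        ("GGGGGG", 10, "2018-05-14"),  # old login, but the user has an active account
        ("ZZZZZZ", 12, "2018-01-15"),  # hypothetical user with no recent login at all
    ],
)

# The most recent login across all accounts of a user decides the status
# of every account belonging to that user.
query = """
SELECT UserID,
       CASE WHEN MAX(LastLoginDate) OVER (PARTITION BY UserName)
                 > date('2019-07-01', '-6 months')
            THEN 'Active'
       END AS status
FROM User_Mas_Table
ORDER BY UserID
"""
status_by_id = dict(conn.execute(query).fetchall())
print(status_by_id)
```

Accounts 3 and 10 come back as Active because another account of the same user logged in recently, while the hypothetical account 12 stays empty.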

Related

How to obtain information from 10 dates without using 10+ left joins

I have some information as shown in the simplified table below.
login_date | userid
-------------------------
2020-12-01 | 123
2020-12-01 | 456
2020-12-02 | 123
2020-12-02 | 456
2020-12-02 | 789
2020-12-03 | 123
2020-12-03 | 789
The dates found in login_date span from 2020-12-01 to 2020-12-12, and each userid appears at most once per day.
What I wish to obtain is twofold:
The number of users who first logged in on a certain date, excluding users who logged in on preceding day(s).
For users who first logged in on a certain date (e.g. 2020-12-01), how many of them logged in on subsequent days as well? (i.e. of the batch who first logged in on 2020-12-01, how many were found to log in on 2020-12-02, 2020-12-03.. and so on)
For the above table, an example of the desired result may be as follows:
| 2020-12-01 | 2020-12-02 | 2020-12-03 | ... (users' first login date)
----------------------------------------------------------------------------------------
| 2020-12-01 | 2 x x
users who continued | 2020-12-02 | 2 1 x
to log in on these | 2020-12-03 | 1 1 0
dates | ... |
Reasoning:
On the first day, two new users logged in, 123 and 456.
On the second day, the same old users, 123 and 456, logged in as well. In addition, a new user (logging in for the first time), 789, was added.
On the third day, only one of the original old users, 123 logged in. (count of 1). The new user (from the second day), 789, logged in as well. (count of 1)
My attempt
I actually managed to obtain a (rough) solution in two parts. For the first day, 2020-12-01, I simply filtered users who logged in on the first day and performed left joins for all the remaining dates:
select count(d1.userid) as d1_users, count(d2.userid) as d2_users, ... (repeated for all joined tables)
from table1 d1
left join (
select userid
from table1
where login_date = date('2020-12-02')
) d2
on d1.userid = d2.userid
... -- (10 more left joins, with each filtering by an incremented date value)
where d1.login_date = date('2020-12-01')
For the second day onwards, I did a bit of preprocessing to exclude users who had logged in on preceding day(s):
with d2_users as (
select a.userid
from table1 a
left join (
select userid
from table1
where login_date = date('2020-12-01')
) b
on a.userid = b.userid
where b.userid is null -- filtering out users who logged in on preceding day(s)
and a.login_date = date('2020-12-02')
)
select count(d2.userid) as d2_users, ... -- (repeated for all joined tables)
from d2_users d2
left join (
select userid
from table1
where login_date = date('2020-12-03')
) d3
on d2.userid = d3.userid
... -- (similar to the query for the 2020-12-01)
Writing and running this query took a lot of manual editing (deleting the unnecessary left joins and counts for later dates), and the full query for just two days ends up at 300+ lines of SQL. I am not sure whether there is a more efficient way to do this.
Any advice would be greatly appreciated! I would be happy to provide further clarification if needed, since optimizing the solution to this problem has been bugging me for some time.
I apologize for the poor formatting of the desired result, as I currently only have a representation of it in a spreadsheet and no idea of what it might look like as SQL output.
Edit:
I realized I may not have communicated the ideal outcomes properly. For each min_login_date identified, what I wish to obtain is the number of users who continue to log in from a preceding date. An example would be:
10 users log in on 2020-12-01. Hence, the count for 2020-12-01 = 10.
Of the 10 previous users, 8 users log in on 2020-12-02. Hence the count for 2020-12-02 = 8.
Of the 8 users (from the previous day), 6 users log in on 2020-12-03. Hence the count for 2020-12-03 = 6.
As such for each min_login_date, the user count for subsequent dates should be <= that of the user count for previous dates. Hope this helps! I apologize for any miscommunication.
You can use window functions to get the earliest date. And then aggregate:
select min_login_date, count(*) as num_on_day,
sum(case when login_date = '2020-12-01' then 1 else 0 end) as login_20201201,
sum(case when login_date = '2020-12-02' then 1 else 0 end) as login_20201202,
. . .
from (select t.*,
min(login_date) over (partition by userid) as min_login_date
from t
) t
group by min_login_date
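A minimal sketch of the same idea, run against the sample rows from the question in an in-memory SQLite database (SQLite and the table name logins are assumptions made for the demonstration). Instead of one hard-coded column per date, it groups by both the first login date and the login date, so the matrix comes back in long form and no per-date expressions have to be written by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (login_date TEXT, userid INTEGER)")
conn.executemany(
    "INSERT INTO logins VALUES (?, ?)",
    [
        ("2020-12-01", 123), ("2020-12-01", 456),
        ("2020-12-02", 123), ("2020-12-02", 456), ("2020-12-02", 789),
        ("2020-12-03", 123), ("2020-12-03", 789),
    ],
)

# Tag every login with the user's first login date, then count logins
# per (first login date, login date) pair.
query = """
SELECT min_login_date, login_date, COUNT(*) AS num_users
FROM (SELECT l.*,
             MIN(login_date) OVER (PARTITION BY userid) AS min_login_date
      FROM logins l) t
GROUP BY min_login_date, login_date
ORDER BY min_login_date, login_date
"""
cohort_counts = conn.execute(query).fetchall()
for row in cohort_counts:
    print(row)
```

Each output row is one cell of the desired matrix. Cells with a count of zero simply do not appear, so they would need to be filled in when pivoting the result in a spreadsheet or reporting tool.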
I think you need a tweak that combines an analytic function with aggregation, as follows:
select login_date,
Count(case when min_login_date = '2020-12-01' then 1 end) as login_20201201,
Count(case when min_login_date = '2020-12-02' then 1 end) as login_20201202,
......
from (select t.*,
min(login_date) over (partition by userid) as min_login_date,
lag(login_date) over (partition by userid order by login_date) as lag_login_date
from your_table t
Where t.login_date between '2020-12-01' and '2020-12-12'
) t
where (lag_login_date = login_date - interval '1 day' or lag_login_date is null)
group by login_date
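For the stricter reading in the question's edit (only users who have logged in on every day since their first login keep counting), one possible sketch is to compare the number of days since the first login with a per-user row number. This swaps the lag() check for a different technique, and it relies on the stated assumption of at most one row per user per day. The data below is hypothetical and the example uses SQLite via Python, so julianday() replaces interval arithmetic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logins (login_date TEXT, userid INTEGER)")
conn.executemany(
    "INSERT INTO logins VALUES (?, ?)",
    [
        # hypothetical user 1 skips 2020-12-03 and returns on 2020-12-04
        ("2020-12-01", 1), ("2020-12-02", 1), ("2020-12-04", 1),
        # hypothetical user 2 logs in three days in a row
        ("2020-12-01", 2), ("2020-12-02", 2), ("2020-12-03", 2),
    ],
)

query = """
SELECT min_login_date, login_date, COUNT(*) AS still_active
FROM (SELECT l.*,
             MIN(login_date) OVER (PARTITION BY userid) AS min_login_date,
             ROW_NUMBER() OVER (PARTITION BY userid ORDER BY login_date) AS rn
      FROM logins l) t
-- days since first login equals rn - 1 only while no day has been skipped
WHERE CAST(julianday(login_date) - julianday(min_login_date) AS INTEGER) = rn - 1
GROUP BY min_login_date, login_date
ORDER BY login_date
"""
streak_counts = conn.execute(query).fetchall()
print(streak_counts)
```

Once a user misses a day, the day difference stays ahead of the row number, so the later login of user 1 on 2020-12-04 is not counted and the counts per cohort can only stay level or shrink.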

how to fetch count data of 2 date fields in same month in SQL

I am trying to create a query against a table with 3 columns.
C_Time: contains task Creation date time
Done_Time: Contains Task completion date time
User ID: Unique id of user
I want the total count of tasks created in a particular month and the total count of tasks completed in that same month, grouped by user ID.
Output will be like:
UserID | CreatedCount | DoneCount
------------------------------------------
U12 | 12 | 12
-------------------------------------------
U13 | 7 | 5
Here user U12 created 12 tasks and completed 12 tasks in January 2020, while user U13 created 7 tasks in January 2020 and completed 5 in the same month.
You can use apply to unpivot the data and then aggregation:
select t.user_id, sum(is_create), sum(is_complete)
from t cross apply
(values (t.c_time, 1, 0), (t.done_time, 0, 1)
) v(t, is_create, is_complete)
where v.t >= '2020-01-01' and v.t < '2020-02-01'
group by t.user_id;
You can also do this with conditional aggregation:
select user_id,
sum(case when c_time >= '2020-01-01' and c_time < '2020-02-01' then 1 else 0 end),
sum(case when done_time >= '2020-01-01' and done_time < '2020-02-01' then 1 else 0 end)
from t
group by user_id;
This is probably a little faster for your particular example. However, the first version is more generalizable -- for instance, it allows you to summarize easily by both user and month.
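To make the "by both user and month" remark concrete, here is a small sketch of the unpivot approach on SQLite, run from Python. SQLite has no cross apply, so the unpivot is written with UNION ALL instead. The table name tasks and its three rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (user_id TEXT, c_time TEXT, done_time TEXT)")
conn.executemany(
    "INSERT INTO tasks VALUES (?, ?, ?)",
    [
        ("U12", "2020-01-05 09:00:00", "2020-01-06 10:00:00"),
        ("U13", "2020-01-10 09:00:00", "2020-01-20 12:00:00"),
        ("U13", "2020-01-25 09:00:00", "2020-02-02 08:00:00"),  # finished in February
    ],
)

# Unpivot each task into a "created" event and a "done" event,
# then aggregate the events per user and calendar month.
query = """
SELECT user_id,
       strftime('%Y-%m', ts) AS month,
       SUM(is_create) AS created_count,
       SUM(is_complete) AS done_count
FROM (SELECT user_id, c_time AS ts, 1 AS is_create, 0 AS is_complete FROM tasks
      UNION ALL
      SELECT user_id, done_time, 0, 1 FROM tasks) v
WHERE ts IS NOT NULL
GROUP BY user_id, month
ORDER BY user_id, month
"""
per_user_month = conn.execute(query).fetchall()
for row in per_user_month:
    print(row)
```

Because every event carries its own timestamp, a task created in January and completed in February is counted once under each month without any extra conditions.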

Counting and adding distinct values that occur in certain dates using PostgreSQL

Using some SQL in the tables of some database, I get a result like this:
id name date status
1 John 2018-05-03 PRESENT
2 Mary 2018-05-03 NOT PRESENT
3 Jane 2018-05-03 NOT PRESENT
2 Mary 2018-05-04 PRESENT
1 John 2018-05-04 PRESENT
1 John 2018-05-05 PRESENT
2 Mary 2018-05-05 NOT PRESENT
3 Jane 2018-05-04 PRESENT
3 Jane 2018-05-05 NOT PRESENT
1 John 2018-05-06 PRESENT
I wanna use further SQL to get a result like this one:
id name date present not present
1 John 2018-05 4 0
2 Mary 2018-05 1 2
3 Jane 2018-05 1 2
In other words, I wanna extract how many classes a student attended in a given month, based on the status he/she received everyday. How can I achieve that?
Use conditional aggregation :
select id, name, to_char(date,'YYYY-MM') as "Date",
sum(case when status = 'PRESENT' then 1 else 0 end ) as present,
sum(case when status = 'NOT PRESENT' then 1 else 0 end ) as not_present
from tab
group by id, name, "Date"
order by id
Keeping else 0 is important so that you get 0 rather than null when no rows match.
In Postgres, a column alias from the select list can be used in the group by list.
Because of the desired output, the date value needs to be truncated to the month with to_char(date,'YYYY-MM').
select id, name, to_char(date,'YYYY-MM') as date,
sum((case when status = 'PRESENT' then 1 end )) present,
sum((case when status = 'NOT PRESENT' then 1 end )) not_present
from your_result_table
group by id, name, to_char(date,'YYYY-MM')
Use conditional aggregation (using filter) and date_trunc():
select id, name, date_trunc('month', date),
count(*) filter (where status = 'PRESENT') as num_present,
count(*) filter (where status = 'NOT PRESENT') as num_notpresent
from t
group by id, name, date_trunc('month', date)
order by id, name, date_trunc('month', date)
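The effect of else 0 mentioned in the answers above can be checked with a short sketch on SQLite from Python. This is only an illustration: strftime('%Y-%m', ...) stands in for to_char, and the date column is renamed to att_date (a name chosen here, not taken from the question).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE attendance (id INTEGER, name TEXT, att_date TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?, ?)",
    [
        (1, "John", "2018-05-03", "PRESENT"),
        (2, "Mary", "2018-05-03", "NOT PRESENT"),
        (3, "Jane", "2018-05-03", "NOT PRESENT"),
        (2, "Mary", "2018-05-04", "PRESENT"),
        (1, "John", "2018-05-04", "PRESENT"),
        (1, "John", "2018-05-05", "PRESENT"),
        (2, "Mary", "2018-05-05", "NOT PRESENT"),
        (3, "Jane", "2018-05-04", "PRESENT"),
        (3, "Jane", "2018-05-05", "NOT PRESENT"),
        (1, "John", "2018-05-06", "PRESENT"),
    ],
)

# Same conditional sum twice: once with ELSE 0 and once without it.
query = """
SELECT id, name, strftime('%Y-%m', att_date) AS month,
       SUM(CASE WHEN status = 'NOT PRESENT' THEN 1 ELSE 0 END) AS with_else,
       SUM(CASE WHEN status = 'NOT PRESENT' THEN 1 END) AS without_else
FROM attendance
GROUP BY id, name, month
ORDER BY id
"""
absences = conn.execute(query).fetchall()
for row in absences:
    print(row)
```

John has no NOT PRESENT rows, so the version with else 0 reports 0 while the version without it reports null (None in Python). count(*) filter (where ...) also returns 0 in that situation.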

Expanding/changing my query to find more entries using (potentially) IFELSE

My question will use this dataset as an example. I have a query setup (I have changed variables to more generic variables for the sake of posting this on the internet so the query may not make perfect sense) that picks the most recent date for a given account. So the query returns values with a reason_type of 1 with the most recent date. This query has effective_date set to is not null.
account date effective_date value reason_type
123456 4/20/2017 5/1/2017 5 1
123456 1/20/2017 2/1/2017 10 1
987654 2/5/2018 3/1/2018 15 1
987654 12/31/2017 2/1/2018 20 1
456789 4/27/2018 5/1/2018 50 1
456789 1/24/2018 2/1/2018 60 1
456123 4/25/2017 null 15 2
789123 5/1/2017 null 16 2
666888 2/1/2018 null 31 2
333222 1/1/2018 null 20 2
What I am looking to do now is to apply that logic to reason_type 1 only if there is an entry for it, and otherwise have it default to reason_type 2.
I think I should be using an IFELSE, but I'm admittedly not knowledgeable about how I would go about that.
Here is the code that I currently have to return the reason_type 1s most recent entry.
I hope my question is clear.
SELECT account, date, effective_date, value, reason_type
from
(
SELECT account, date, effective_date, value, reason_type,
ROW_NUMBER() over (partition by account order by date desc) rn
from mytable
WHERE value is not null
AND effective_date is not null
)
WHERE rn =1
I think you might want something like this (do you really have a column named date by the way? That seems like a bad idea):
SELECT account, date, effective_date, value, reason_type
FROM (
SELECT account, date, effective_date, value, reason_type
, ROW_NUMBER() OVER ( PARTITION BY account ORDER BY date DESC ) AS rn
FROM mytable
WHERE value IS NOT NULL
) WHERE rn = 1
-- effective_date IS NULL or is on or before today's date
AND ( effective_date IS NULL OR effective_date < TRUNC(SYSDATE+1) );
Hope this helps.
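A small sketch of how that query behaves, using SQLite from Python rather than Oracle. The assumptions are: the date column is renamed to entry_date, TRUNC(SYSDATE+1) is replaced by date(:today, '+1 day') with a fixed, hypothetical "today" so the result is reproducible, and only part of the sample data is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (account INTEGER, entry_date TEXT, "
    "effective_date TEXT, value INTEGER, reason_type INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (123456, "2017-04-20", "2017-05-01", 5, 1),
        (123456, "2017-01-20", "2017-02-01", 10, 1),
        (987654, "2018-02-05", "2018-03-01", 15, 1),
        (987654, "2017-12-31", "2018-02-01", 20, 1),
        (456123, "2017-04-25", None, 15, 2),
        (666888, "2018-02-01", None, 31, 2),
    ],
)

# Rank rows per account by date (newest first), keep the newest row, and
# accept it when effective_date is missing or not later than "today".
query = """
SELECT account, entry_date, value
FROM (SELECT m.*,
             ROW_NUMBER() OVER (PARTITION BY account ORDER BY entry_date DESC) AS rn
      FROM mytable m
      WHERE value IS NOT NULL) t
WHERE rn = 1
  AND (effective_date IS NULL OR effective_date < date(:today, '+1 day'))
ORDER BY account
"""
latest = conn.execute(query, {"today": "2018-06-01"}).fetchall()
for row in latest:
    print(row)
```

Note that the effective_date test runs after the ranking, so an account whose newest row has a future effective_date drops out of the result instead of falling back to an older row.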

Case statement for HIVE platform

I have a table with the following columns:
ID
Scheduled Date
Status
Target Date
I need to extract the 'Status' corresponding to the minimum 'Scheduled Date' for each ID. If that is not available, I need to extract the status corresponding to the minimum 'Target Date' for that ID.
Sample data:
ID | Scheduled_Date | Status | Target_Date
1 12/11/2017 Completed 12/11/2017
1 12/12/2017 Completed 12/12/2017
2 12/13/2017 Completed 12/13/2017
3 12/14/2017 Pending 12/14/2017
3 12/15/2017 Pending 12/15/2017
4 Confirmed 12/18/2017
4 Confirmed 12/19/2017
5 12/14/2017 Completed 12/14/2017
5 12/15/2017 Pending 12/15/2017
Can you please correct the code that I am trying to write?
SELECT ID,
CASE WHEN ID IS NOT NULL THEN
CASE WHEN MIN(SCHEDULED_DATE) IS NOT NULL
THEN STATUS
ELSE
END
CASE WHEN MIN(TARGET_DATE) IS NOT NULL
THEN STATUS
ELSE ''
END
FROM FIRST_STATUS
Try this query.
SELECT id,
status
FROM yourtable t
WHERE COALESCE (Scheduled_Date,
Target_Date) IN
(SELECT MIN(COALESCE (Scheduled_Date,Target_Date))
FROM yourtable i
WHERE i.ID = t.id
GROUP BY i.ID);
Use row_number() analytic function:
select id,
status
from
(
select id,
status,
row_number() over(partition by id order by nvl(Scheduled_Date,Target_Date)) rn
from yourtable t
)s
where rn=1
;
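The row_number() approach can be tried out with the sample data on SQLite from Python. This is a sketch only: COALESCE stands in for nvl, dates are stored as ISO strings so that they sort correctly, and the table name first_status is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE first_status (id INTEGER, scheduled_date TEXT, "
    "status TEXT, target_date TEXT)"
)
conn.executemany(
    "INSERT INTO first_status VALUES (?, ?, ?, ?)",
    [
        (1, "2017-12-11", "Completed", "2017-12-11"),
        (1, "2017-12-12", "Completed", "2017-12-12"),
        (2, "2017-12-13", "Completed", "2017-12-13"),
        (3, "2017-12-14", "Pending", "2017-12-14"),
        (3, "2017-12-15", "Pending", "2017-12-15"),
        (4, None, "Confirmed", "2017-12-18"),
        (4, None, "Confirmed", "2017-12-19"),
        (5, "2017-12-14", "Completed", "2017-12-14"),
        (5, "2017-12-15", "Pending", "2017-12-15"),
    ],
)

# Order each ID's rows by scheduled date, falling back to target date
# when the scheduled date is missing, and keep the first row.
query = """
SELECT id, status
FROM (SELECT id, status,
             ROW_NUMBER() OVER (
                 PARTITION BY id
                 ORDER BY COALESCE(scheduled_date, target_date)
             ) AS rn
      FROM first_status) s
WHERE rn = 1
ORDER BY id
"""
first_status_by_id = dict(conn.execute(query).fetchall())
print(first_status_by_id)
```

The fallback is applied row by row, so if an ID had a mix of rows with and without a scheduled date, the two kinds of date would be compared against each other. The sample data does not contain such a case.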