SQL Query Inactive Users with last end date - sql

This is a follow-up to a question I asked about a year ago (old thread).
The answers I got then have worked fine, but I have now discovered that I need to tweak the query to get the latest end date for the inactive users.
So again, here's a quick example table of users. Some are active, some are inactive, and some have several periods of employment.
When someone is re-employed, a new row is added for that employment period.
The username will always be the same.
I want to find the users who are disabled and don't have an active employment. If a user has several periods of employment, I want the one with the latest end date: one row per username, with all the columns.
The database is SQL Server 2016.
Example table:
| username | name | active | Job title | enddate
+-----------+----------- +--------+-------------+----------
| 1111 | Jane Doe | 1 | CIO | 1/3/2022
| 1111 | Jane Doe | 0 | Janitor | 1/2/2018
| 1112 | Bob Doe | 1 | Coder | NULL
| 1113 | James Doe | 0 | Coder | 1/3/2018
| 1114 | Ray Doe | 1 | Manager | NULL
| 1114 | Ray Doe | 0 | Clerk | 2/2/2019
| 1115 | Emma Doe | 1 | Waiter | NULL
| 1116 | Sarah Doe | 0 | Greeter | 3/4/2016
| 1116 | Sarah Doe | 0 | Trainer | 4/5/2019
So for user 1116 I would ideally get one row with enddate 4/5/2019
The query I use from the answers in the old thread is this one:
;WITH NonActiveDisabledUsers AS
(
SELECT DISTINCT
U.username
FROM
UserEmployment AS U
WHERE
U.active = 0 AND
NOT EXISTS (SELECT 'no current active employment'
FROM UserEmployment AS C
WHERE U.username = C.username AND
C.active = 1 AND
(C.enddate IS NULL OR C.enddate >= CONVERT(DATE, GETDATE())))
)
SELECT
R.*
FROM
NonActiveDisabledUsers AS N
CROSS APPLY (
SELECT TOP 1 -- Just 1 record
U.*
FROM
UserEmployment AS U
WHERE
N.username = U.username AND
U.active = 0
ORDER BY
U.enddate DESC -- Determine which record should we display
) AS R
This gives me the right user and employment status, but not the latest end date, since it returns the first result for user 1116.

We can use conditional aggregation with a window aggregate to get the number of active rows for this user.
We then filter to only inactive, and row-number the result by enddate taking the first row per group:
SELECT
username,
name,
active,
[Job title],
enddate
FROM (
SELECT *, rn = ROW_NUMBER() OVER (PARTITION BY username ORDER BY enddate DESC)
FROM (
SELECT *,
CountOfActive = COUNT(CASE WHEN
Active = 1 AND
(enddate IS NULL OR enddate >= CONVERT(DATE, GETDATE())) THEN 1 END
) OVER (PARTITION BY username)
FROM UserEmployment
) AS t
WHERE CountOfActive = 0
) AS t
WHERE rn = 1;
Note that this row-numbering does not account for NULLs in enddate, which sort last under ORDER BY enddate DESC. To rank a row with a NULL end date first, you would need a conditional ordering:
ROW_NUMBER() OVER (PARTITION BY username ORDER BY CASE WHEN enddate IS NULL THEN 0 ELSE 1 END ASC, enddate DESC)
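The approach above can be tried end to end with a small sketch. It is not SQL Server: it assumes Python's built-in sqlite3 module (SQLite 3.25 or later, for window functions) as a stand-in, a hypothetical job_title column name in place of [Job title], a fixed :today parameter instead of CONVERT(DATE, GETDATE()), and the sample dates read as day/month/year.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server 2016.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE UserEmployment "
    "(username TEXT, name TEXT, active INTEGER, job_title TEXT, enddate TEXT)"
)
conn.executemany(
    "INSERT INTO UserEmployment VALUES (?, ?, ?, ?, ?)",
    [
        ("1111", "Jane Doe", 1, "CIO", "2022-03-01"),
        ("1111", "Jane Doe", 0, "Janitor", "2018-02-01"),
        ("1112", "Bob Doe", 1, "Coder", None),
        ("1113", "James Doe", 0, "Coder", "2018-03-01"),
        ("1114", "Ray Doe", 1, "Manager", None),
        ("1114", "Ray Doe", 0, "Clerk", "2019-02-02"),
        ("1115", "Emma Doe", 1, "Waiter", None),
        ("1116", "Sarah Doe", 0, "Greeter", "2016-04-03"),
        ("1116", "Sarah Doe", 0, "Trainer", "2019-05-04"),
    ],
)

QUERY = """
SELECT username, name, active, job_title, enddate
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (
               PARTITION BY username
               -- NULL end dates rank first, then the latest date
               ORDER BY CASE WHEN enddate IS NULL THEN 0 ELSE 1 END, enddate DESC
           ) AS rn
    FROM (
        SELECT ue.*,
               -- number of currently active rows for this username
               COUNT(CASE WHEN active = 1
                           AND (enddate IS NULL OR enddate >= :today)
                          THEN 1 END) OVER (PARTITION BY username) AS CountOfActive
        FROM UserEmployment AS ue
    ) AS t
    WHERE CountOfActive = 0
) AS ranked
WHERE rn = 1
ORDER BY username
"""

# ":today" is a fixed illustrative date, chosen so Jane Doe's CIO row is still active.
result = conn.execute(QUERY, {"today": "2021-06-01"}).fetchall()
print(result)
```

With this sample data only users 1113 and 1116 have no currently active row, and 1116 comes back with the Trainer row, which carries the later end date.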

Hmmm . . . I think you can just get the most recent record and check that it is not active:
select ue.*
from (select ue.*,
row_number() over (partition by username
order by active desc, enddate desc
) as seqnum
from UserEmployment ue
) ue
where seqnum = 1 and active = 0;

Here is a variant that will produce the desired result:
SELECT username, max(enddate) AS enddate
FROM UserEmployment as t1
WHERE
t1.active = 0 AND
NOT EXISTS (select username from UserEmployment as t2 WHERE active = 1 AND
(t2.enddate IS NULL OR t2.enddate >= CONVERT(DATE, GETDATE())) AND
t1.username = t2.username)
GROUP BY username

Related

Finding created on dates for duplicates in SQL

I have one table of contact records and I'm trying to get the count of duplicate records that were created on each date. I'm not looking to include the original instance in the count. I'm using SQL Server.
Here's an example table
| email | created_on |
| ------------- | ---------- |
| aaa#email.com | 08-16-22 |
| bbb#email.com | 08-16-22 |
| zzz#email.com | 08-16-22 |
| bbb#email.com | 07-12-22 |
| aaa#email.com | 07-12-22 |
| zzz#email.com | 06-08-22 |
| aaa#email.com | 06-08-22 |
| bbb#email.com | 04-21-22 |
And I'm expecting to return
| created_on | dupe_count |
| ---------- | ---------- |
| 08-16-22 | 3 |
| 07-12-22 | 2 |
| 06-08-22 | 0 |
| 04-21-22 | 0 |
I built a derived table that assigns a row number to each row per email, ordered by created date. The outer query then groups by date and ignores the row where each email was first created (row number 1). It works fine in this case.
Entire code:
Create table #Temp
(
email varchar(50),
dateCreated date
)
insert into #Temp
(email, dateCreated) values
('aaa#email.com', '08-16-22'),
('bbb#email.com', '08-16-22'),
('zzz#email.com', '08-16-22'),
('bbb#email.com', '07-12-22'),
('aaa#email.com', '07-12-22'),
('zzz#email.com', '06-08-22'),
('aaa#email.com', '06-08-22'),
('bbb#email.com', '04-21-22')
select datecreated, sum(case when r = 1 then 0 else 1 end) as duplicates
from
(
Select email, datecreated, ROW_NUMBER() over(partition by email
order by datecreated) as r from #Temp
) b
group by dateCreated
drop table #Temp
Output:
datecreated duplicates
2022-04-21 0
2022-06-08 0
2022-07-12 2
2022-08-16 3
You can calculate the difference between total count of emails for every day and the count of unique emails for the day:
select created_on,
count(email) - count(distinct email) as dupe_count
from cte
group by created_on
It seems I misunderstood your request, and you wanted to consider previous created_on dates too:
ct as (
select created_on,
(select case when (select count(*)
from cte t2
where t1.email = t2.email and t1.created_on > t2.created_on
) > 0 then email end) as c
from cte t1)
select created_on,
count(distinct c) as dupe_count
from ct
group by created_on
order by 1
It seems that in Oracle it is also possible to aggregate it in a single query:
select created_on,
count(distinct case when (select count(*)
from cte t2
where t1.email = t2.email and t1.created_on > t2.created_on
) > 0 then email end) as c
from cte t1
group by created_on
order by 1

Selecting the most recent entry on a timestamp cell

I have these two tables:
User:
=======================
id | Name | Email
=======================
1 | User-A| a#mail
2 | User-B| b#mail
=======================
Entry:
=================================================
id | agree | createdOn | userId
=================================================
1 | true | 2020-11-10 19:22:23 | 1
2 | false | 2020-11-10 22:22:23 | 1
3 | true | 2020-11-11 12:22:23 | 1
4 | true | 2020-11-04 22:22:23 | 2
5 | false | 2020-11-12 02:22:23 | 2
================================================
I need to get the following result:
=============================================================
Name | Email | agree | createdOn
=============================================================
User-A | a#mail | true | 2020-11-11 12:22:23
User-B | b#mail | false | 2020-11-12 02:22:23
=============================================================
The Postgres query I'm running is:
select distinct on (e."createdOn", u.id)
u.id , e.id ,u."Name" , u.email, e.agree, e."createdOn" from "user" u
inner join public.entry e on u."id" = e."userId"
order by "createdOn" desc
The problem is that it returns all the entries after doing the join, whereas I only want the most recent entry per user by the createdOn column.
You want the latest entry per user. For this, you need the user id in the distinct on clause, and no other column. This guarantees one row in the resultset per user.
Then, you need to put that column first in the order by clause, followed by createdOn desc. This breaks the ties and decides which row will be retained in each group:
select distinct on (u.id) u.id , e.id ,u."Name" , u.email, e.agree, e."createdOn"
from "user" u
inner join public.entry e on u."id" = e."userId"
order by u.id, "createdOn" desc
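To make the semantics concrete, here is a rough emulation in plain Python rather than Postgres itself. It only illustrates what DISTINCT ON (u.id) combined with ORDER BY u.id, "createdOn" DESC does to the sample rows; the tuple layout and variable names are assumptions made for the sketch.

```python
from itertools import groupby
from operator import itemgetter

# Sample rows from the question: (entry id, agree, createdOn, userId)
entries = [
    (1, True, "2020-11-10 19:22:23", 1),
    (2, False, "2020-11-10 22:22:23", 1),
    (3, True, "2020-11-11 12:22:23", 1),
    (4, True, "2020-11-04 22:22:23", 2),
    (5, False, "2020-11-12 02:22:23", 2),
]
users = {1: ("User-A", "a#mail"), 2: ("User-B", "b#mail")}

# ORDER BY u.id, "createdOn" DESC:
# sort by createdOn descending, then stable-sort by user id.
ordered = sorted(entries, key=itemgetter(2), reverse=True)
ordered.sort(key=itemgetter(3))

# DISTINCT ON (u.id): keep only the first row of each run of equal user ids.
latest = [next(group) for _, group in groupby(ordered, key=itemgetter(3))]

result = [
    (users[uid][0], users[uid][1], agree, created)
    for _, agree, created, uid in latest
]
print(result)
```

Because the rows are already ordered newest-first within each user, the first row kept for each user id is that user's most recent entry.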
You can also use row_number to select the latest row per user and then do the join. In Postgres the reserved table name and the mixed-case columns must be quoted:
SELECT *
FROM "user" a
LEFT JOIN (
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY "userId" ORDER BY "createdOn" DESC) AS rn
FROM entry
) k WHERE rn = 1
) b
ON a.id = b."userId"
Another option: group by user to get max(createdOn), then keep only the entries whose createdOn equals that maximum.

Select one row from non unique rows based on row value

I have a quiz table
id | user_id | quiz_id
--------------------------
1 | 34567 | 12334
2 | 34567 | 12334
3 | 34567 | 23455
Rows 1 and 2 show that a quiz can be assigned to the same user twice,
and a quiz transaction table
id | date | status
------------------------
1 | 2014 | assigned
2 | 2014 | assigned
3 | 2014 | assigned
------------------------
1 | 2014 | completed
id is a foreign key to the quiz table's id. The last row shows what happens when a user finishes the quiz: the transaction table records the status 'completed'.
Expected Result: I want a table with a structure like
id | user_id| quiz_id | date | status
------------------------------------------
1 | 34567 | 12334 | 2014 | completed
2 | 34567 | 12334 | 2014 | assigned
3 | 34567 | 23455 | 2014 | assigned
My query is
SELECT q.id, q.user_id, q.quiz_id, qt.date, qt.status FROM quiz q
LEFT JOIN
quiz_transaction qt ON
q.id = qt.id
but it gives me extra row (as the query will)
1 | 34567 | 12334 | 2014 | assigned
I cannot use
ON qt.status = 'completed'
because if it's completed it should return a completed row, and if not it should return an assigned row, but not both.
So in the result I cannot have
1 | 34567 | 12334 | 2014 | completed
1 | 34567 | 12334 | 2014 | assigned
How can I do it?
How about simply using the MAX() function with GROUP BY (SQL Fiddle):
SELECT q.id, q.user_id, q.quiz_id, qt.date, MAX(qt.status) AS Status
FROM quiz q
LEFT JOIN quiz_transaction qt ON q.id = qt.id
GROUP BY q.id, q.user_id, q.quiz_id, qt.date
EDIT: If you need to order a string a certain way, you could use a CASE expression to convert the string to a number, take the MAX value, and then convert it back (SQL Fiddle):
SELECT m.id, m.user_id, m.quiz_id, MAX(m.date),
CASE WHEN MAX(m.status) = 1 THEN 'assigned'
WHEN MAX(m.status) = 2 THEN 'doing'
WHEN MAX(m.status) = 3 THEN 'completed' END AS Status
FROM
(
SELECT q.id, q.user_id, q.quiz_id, qt.date,
CASE WHEN qt.status = 'assigned' THEN 1
WHEN qt.status = 'doing' THEN 2
WHEN qt.status = 'completed' THEN 3 END AS Status
FROM quiz q
LEFT JOIN quiz_transaction qt ON q.id = qt.id
) AS m
GROUP BY m.id, m.user_id, m.quiz_id;
Depending on your release, SQL Server supports Standard SQL's "Windowed Aggregate Functions". ROW_NUMBER will give you a single row:
SELECT
q.id
,q.user_id
,q.quiz_id
,qt.date
,qt.status
FROM quiz q
JOIN
(
SELECT
id
,date
,status
,ROW_NUMBER()
OVER (PARTITION BY id
ORDER BY Status DESC) as rn
FROM quiz_transaction
) as qt
ON q.id = qt.id
WHERE rn = 1
If you have more complex ordering rules, you need to use a CASE:
,ROW_NUMBER()
OVER (PARTITION BY id
ORDER BY CASE Status WHEN 'completed' THEN 1
WHEN 'doing' THEN 2
WHEN 'assigned' THEN 3
END) as rn
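As a quick check of the CASE-based ordering, the following sketch runs the same shape of query against the question's sample rows. It assumes Python's built-in sqlite3 module (SQLite 3.25 or later) as a stand-in for SQL Server, and it uses the quiz_id column name from the quiz table definition.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quiz (id INTEGER, user_id INTEGER, quiz_id INTEGER)")
conn.execute("CREATE TABLE quiz_transaction (id INTEGER, date INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO quiz VALUES (?, ?, ?)",
    [(1, 34567, 12334), (2, 34567, 12334), (3, 34567, 23455)],
)
conn.executemany(
    "INSERT INTO quiz_transaction VALUES (?, ?, ?)",
    [(1, 2014, "assigned"), (2, 2014, "assigned"),
     (3, 2014, "assigned"), (1, 2014, "completed")],
)

QUERY = """
SELECT q.id, q.user_id, q.quiz_id, qt.date, qt.status
FROM quiz AS q
JOIN (
    SELECT id, date, status,
           ROW_NUMBER() OVER (
               PARTITION BY id
               -- explicit priority: completed beats doing beats assigned
               ORDER BY CASE status WHEN 'completed' THEN 1
                                    WHEN 'doing' THEN 2
                                    WHEN 'assigned' THEN 3 END
           ) AS rn
    FROM quiz_transaction
) AS qt ON q.id = qt.id
WHERE qt.rn = 1
ORDER BY q.id
"""

result = conn.execute(QUERY).fetchall()
print(result)
```

The explicit priority list avoids relying on the alphabetical order of the status strings, which only happens to work for 'assigned' and 'completed'.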
Try this:
SELECT q.id, q.user_id, q.quiz_id, q1.date, qt.status FROM quiz q
LEFT JOIN
(Select id , convert(varchar,max(convert(varbinary,status ))) 'Status'
from quiz_transaction
group by id
) qt ON
q.id = qt.id
left join quiz_transaction q1 on q1.id = qt.id and q1.status=qt.status

Select distinct where date is max

This feels really stupid to ask, but I can't get this selection to work in SQL Server Compact (CE).
If I have two tables like this:
Statuses Users
id | status | thedate id | name
------------------------- -----------------------
0 | Single | 2014-01-01 0 | Lisa
0 | Engaged | 2014-01-02 1 | John
1 | Single | 2014-01-03
0 | Divorced | 2014-01-04
How can I now select the latest status for each person in Statuses?
The result should be:
Id | Name | Date | Status
--------------------------------
0 | Lisa | 2014-01-04 | Divorced
1 | John | 2014-01-03 | Single
That is, select the distinct ids where the date is the highest, and join the name. As a bonus, sort the list so the latest record is on top.
In SQL Server CE, you can do this using a join:
select u.id, u.name, s.thedate, s.status
from users u join
statuses s
on u.id = s.id join
(select id, max(thedate) as mtd
from statuses
group by id
) as maxs
on s.id = maxs.id and s.thedate = maxs.mtd;
The subquery calculates the maximum date and uses that as a filter for the statuses table.
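The join against the per-id maximum can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for SQL Server CE, and it adds an ORDER BY for the "latest on top" bonus, which is not part of the query above.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server CE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE statuses (id INTEGER, status TEXT, thedate TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(0, "Lisa"), (1, "John")])
conn.executemany(
    "INSERT INTO statuses VALUES (?, ?, ?)",
    [
        (0, "Single", "2014-01-01"),
        (0, "Engaged", "2014-01-02"),
        (1, "Single", "2014-01-03"),
        (0, "Divorced", "2014-01-04"),
    ],
)

QUERY = """
SELECT u.id, u.name, s.thedate, s.status
FROM users AS u
JOIN statuses AS s ON u.id = s.id
JOIN (
    -- latest date per person
    SELECT id, MAX(thedate) AS mtd
    FROM statuses
    GROUP BY id
) AS maxs ON s.id = maxs.id AND s.thedate = maxs.mtd
ORDER BY s.thedate DESC  -- added for the "latest record on top" bonus
"""

result = conn.execute(QUERY).fetchall()
print(result)
```

Matching on both id and date is what keeps one person's latest date from accidentally selecting another person's older row.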
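Use the following query. The subquery is correlated on id so that a date only matches when it is the latest one for that same person:
SELECT U.Id AS Id, U.Name AS Name, S.thedate AS Date, S.status AS Status
FROM Statuses S
INNER JOIN Users U on S.id = U.id
WHERE S.thedate IN (
SELECT MAX(S2.thedate)
FROM Statuses S2
WHERE S2.id = S.id);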

How can I get all records from one table, even if there are no corresponding records in the JOINed table?

I have a table of [Users], simplified for this example:
uID | uName | uSalesRep
----+-------+----------
1 | John | 1
2 | Bob | 1
3 | Fred | 1
4 | Stu | 1
And a table of sales [Activity]:
aID | aDate | aUserID | aText
----+------------+---------+---------------
1 | 2013-10-09 | 1 | John did stuff
2 | 2013-10-14 | 2 | Bob did stuff
3 | 2013-10-17 | 3 | Fred did stuff
I want to get a list of all sales reps, together with their activity for the week beginning 14th October 2013, and here's how I'm trying to do it:
SELECT uID, uName, aID, aDate, aText
FROM [Users]
LEFT JOIN [Activity] ON uID = aUserID
WHERE (aDate >= '2013-10-14' OR aDate IS NULL)
AND (aDate <= '2013-10-18' OR aDate IS NULL)
AND uSalesRep = 1
I used a LEFT JOIN in the hope of retrieving all reps, but I think this is being overridden by the aDate requirements. Including aDate IS NULL brings back reps with no activity at all, but reps who have activity only outside the specified range are still omitted.
How can I get all the reps at least once, regardless of any activity they have recorded?
Thanks for your time.
When the filter applies to the joined table, you need to put the filter in the join condition, not in the WHERE clause:
SELECT uID, uName, aID, aDate, aText
FROM [Users]
LEFT JOIN [Activity] ON uID = aUserID
AND (aDate >= '2013-10-14')
AND (aDate <= '2013-10-18')
WHERE uSalesRep = 1
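The difference between the two placements can be seen on the sample data with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for the original database and uses the column names from the table definitions (uName, aUserID).

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the original database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (uID INTEGER, uName TEXT, uSalesRep INTEGER)")
conn.execute(
    "CREATE TABLE Activity (aID INTEGER, aDate TEXT, aUserID INTEGER, aText TEXT)"
)
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [(1, "John", 1), (2, "Bob", 1), (3, "Fred", 1), (4, "Stu", 1)],
)
conn.executemany(
    "INSERT INTO Activity VALUES (?, ?, ?, ?)",
    [
        (1, "2013-10-09", 1, "John did stuff"),
        (2, "2013-10-14", 2, "Bob did stuff"),
        (3, "2013-10-17", 3, "Fred did stuff"),
    ],
)

# Date filter in the WHERE clause: John's out-of-range row is joined, then discarded.
where_filter = conn.execute("""
    SELECT uID, uName, aID, aDate, aText
    FROM Users
    LEFT JOIN Activity ON uID = aUserID
    WHERE (aDate >= '2013-10-14' OR aDate IS NULL)
      AND (aDate <= '2013-10-18' OR aDate IS NULL)
      AND uSalesRep = 1
    ORDER BY uID
""").fetchall()

# Date filter in the ON clause: out-of-range activity simply fails to match,
# so the LEFT JOIN still returns the rep with NULL activity columns.
on_filter = conn.execute("""
    SELECT uID, uName, aID, aDate, aText
    FROM Users
    LEFT JOIN Activity ON uID = aUserID
        AND aDate >= '2013-10-14'
        AND aDate <= '2013-10-18'
    WHERE uSalesRep = 1
    ORDER BY uID
""").fetchall()

print([row[0] for row in where_filter])
print(on_filter)
```

With the filter in WHERE, John disappears because his only activity falls outside the week. With the filter in ON, all four reps are returned, and John and Stu have NULL activity columns.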