Select rows that are not followed by a specific one - sql

I have a table with a list of candidates and a linked table with the history of candidate statuses:
CandidateId FirstName LastName
--------------------------------
1 User One
2 User Two
and
CandidateStatusId CandidateId Status Timestamp
--------------------------------------------------------
1 1 Assigned ...
2 1 Interviewed ...
3 1 Offer Accepted ...
1 2 Assigned ...
2 2 Interviewed ...
3 2 Offer Accepted ...
4 2 Hired ...
5 2 Bench ...
6 2 Hired ...
1 3 Assigned ...
2 3 Interviewed ...
3 3 Offer Accepted ...
4 3 Hired ...
5 3 Bench ...
I want to select candidates whose last status is 'Offer Accepted' and who were never 'Hired' before. In my example only the 1st user should be selected, because the second is already hired and the third was hired before (and is actually on the bench).
UPD: I prepared an SQL statement which should filter users, but I am not sure about its speed; the number of users may be quite big:
SELECT * FROM dbo.CandidatePositionStatus
WHERE CandidateId=34841
AND 'Hired' NOT IN (SELECT Status FROM dbo.CandidatePositionStatus WHERE CandidateId=34841)
But I do not know how to embed it in another select that provides the CandidateId.
UPD2: I prepared another query, but it only checks whether the candidate has the 'Offer Accepted' status and does not have the 'Hired' status; the speed of the query is still an open question.
SELECT DISTINCT CandidateId
FROM dbo.CandidatePositionStatus
WHERE
CandidateId IN (
SELECT CandidateId FROM dbo.CandidatePositionStatus WHERE PositionStatusForCandidateCode='Offer Accepted' AND FirstWorkingDay IS NOT NULL
)
AND CandidateId NOT IN (
SELECT CandidateId FROM dbo.CandidatePositionStatus WHERE PositionStatusForCandidateCode='Hired'
)

Please check whether the following query is enough. I have omitted the 'Hired' check, since the basic check is that the last status is 'Offer Accepted'.
select
CandidateId
From(
select
*,
MAX(CandidateStatusId) over(partition by CandidateId) MaxVal
From
CandidatePositionStatus
)x
where MaxVal=CandidateStatusId and [Status]='Offer Accepted'

Based on your requirement I wrote this.
Not tested.
select CandidateId
from dbo.CandidatePositionStatus
group by CandidateId
having sum(case when PositionStatusForCandidateCode = 'Offer Accepted' then 1 else 0 end) = 1
and sum(case when PositionStatusForCandidateCode = 'Hired' then 1 else 0 end) = 0;
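
Neither answer above checks both conditions from the question at once. The following is a minimal sketch of one way to combine them, not a tested production query: it assumes that CandidateStatusId grows with time within each candidate (as in the sample data), uses the column name Status from the sample table, and runs against an in-memory SQLite database through Python purely so it can be executed as-is. The function name and the sample rows are illustrative.

```python
import sqlite3

# Sample rows from the question: (CandidateStatusId, CandidateId, Status).
SAMPLE = [
    (1, 1, "Assigned"), (2, 1, "Interviewed"), (3, 1, "Offer Accepted"),
    (1, 2, "Assigned"), (2, 2, "Interviewed"), (3, 2, "Offer Accepted"),
    (4, 2, "Hired"), (5, 2, "Bench"), (6, 2, "Hired"),
    (1, 3, "Assigned"), (2, 3, "Interviewed"), (3, 3, "Offer Accepted"),
    (4, 3, "Hired"), (5, 3, "Bench"),
]

QUERY = """
    SELECT s.CandidateId
    FROM CandidatePositionStatus AS s
    WHERE s.Status = 'Offer Accepted'
      -- the row must be the candidate's latest status
      -- (assumes CandidateStatusId increases over time per candidate)
      AND s.CandidateStatusId = (
            SELECT MAX(m.CandidateStatusId)
            FROM CandidatePositionStatus AS m
            WHERE m.CandidateId = s.CandidateId)
      -- and the candidate must have no 'Hired' row at all
      AND NOT EXISTS (
            SELECT 1
            FROM CandidatePositionStatus AS h
            WHERE h.CandidateId = s.CandidateId
              AND h.Status = 'Hired')
    ORDER BY s.CandidateId
"""

def never_hired_with_offer(rows):
    """Return ids whose last status is 'Offer Accepted' and who were never 'Hired'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE CandidatePositionStatus ("
        "CandidateStatusId INTEGER, CandidateId INTEGER, Status TEXT)"
    )
    conn.executemany("INSERT INTO CandidatePositionStatus VALUES (?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result

print(never_hired_with_offer(SAMPLE))  # → [1]
```

On a large table, an index on (CandidateId, CandidateStatusId) would likely help both subqueries, but that is a guess to verify against the actual execution plan.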

Related

How to check the count of each values repeating in a row

I have two tables. Data in the first table is:
ID Username
1 Dan
2 Eli
3 Sean
4 John
Second Table Data:
user_id Status_id
1 2
1 3
4 1
3 2
2 3
1 1
3 3
3 3
3 3
... (and so on)
These are my two tables.
I want to find, for each user, how often each status_id occurs.
My expected result is:
username status_id(1) status_id(2) status_id(3)
Dan 1 1 1
Eli 0 0 1
Sean 0 1 2
John 1 0 0
My current code is:
SELECT b.username , COUNT(a.status_id)
FROM masterdb.auth_user b
left outer join masterdb.xmlform_joblist a
on a.user1_id = b.id
GROUP BY b.username, b.id, a.status_id
This gives me the separate counts, but in a single column, without indicating which status_id each count represents.
This is called a pivot, and it works in two steps:
1) extract the data for each specific field value using a CASE expression
2) aggregate the data per user, so that every field value lies on the same record for each user
SELECT Username,
SUM(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) AS status_id_1,
SUM(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
SUM(CASE WHEN status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
FROM t2
INNER JOIN t1
ON t2.user_id = t1.ID
GROUP BY Username
ORDER BY Username
Note: This solution assumes that there are 3 status_id values. If you need to generalize over the number of status ids, you would need a dynamic query. In any case, it's better to avoid dynamic queries if you can.
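
As a rough illustration of the pivot described above, here is a small runnable sketch using an in-memory SQLite database from Python. The table names t1 and t2 follow the answer; the rows are the ones listed in the question, trimmed so that they reproduce the expected result shown there (the original table is truncated). The ELSE 0 branch is used so that a missing status shows up as 0 rather than NULL. The function name is made up for this example.

```python
import sqlite3

def pivot_status_counts():
    """Count, per user, how many rows exist for status_id 1, 2 and 3."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE t1 (ID INTEGER, Username TEXT);
        CREATE TABLE t2 (user_id INTEGER, status_id INTEGER);
        INSERT INTO t1 VALUES (1, 'Dan'), (2, 'Eli'), (3, 'Sean'), (4, 'John');
        -- illustrative subset of the rows shown in the question
        INSERT INTO t2 VALUES (1, 2), (1, 3), (4, 1), (3, 2),
                              (2, 3), (1, 1), (3, 3), (3, 3);
    """)
    rows = conn.execute("""
        SELECT Username,
               SUM(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) AS status_id_1,
               SUM(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
               SUM(CASE WHEN status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
        FROM t2
        INNER JOIN t1 ON t2.user_id = t1.ID
        GROUP BY Username
        ORDER BY Username
    """).fetchall()
    conn.close()
    return rows

for row in pivot_status_counts():
    print(row)
# → ('Dan', 1, 1, 1)
# → ('Eli', 0, 0, 1)
# → ('John', 1, 0, 0)
# → ('Sean', 0, 1, 2)
```

Note that with the INNER JOIN a user without any rows in t2 would not appear at all; a LEFT JOIN starting from t1 would be needed to list such users with zeros.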

How to check the value of any row in a group after a previous one fulfils a condition?

I have a dataset grouped by test subjects that is filled according to the actions they perform. I need to find which customer does A and then, at some point, does B; but it doesn't necessarily have to be in the next action/row. And it can't be first does B and then A, it has to be specifically in that order. For example, I have this table:
Subject ActionID ActionOrder
1 A 1
1 C 2
1 D 3
1 B 4
1 C 5
2 D 1
2 A 2
2 C 3
2 B 4
3 B 1
3 D 2
3 A 3
4 A 1
Here subjects 1 and 2 are the ones that fulfil the order-of-actions condition, while 3 does not because it performs the actions in reverse order, and 4 only does action A.
How can I get only subjects 1 and 2 as results? Thank you very much
Use conditional aggregation:
SELECT Subject
FROM tablename
WHERE ActionID IN ('A', 'B')
GROUP BY Subject
HAVING MAX(CASE WHEN ActionID = 'A' THEN ActionOrder END) <
MIN(CASE WHEN ActionID = 'B' THEN ActionOrder END)
Consider the option below:
select Subject
from (
select Subject,
regexp_replace(string_agg(ActionID, '' order by ActionOrder), r'[^AB]', '') check
from `project.dataset.table`
group by Subject
)
where not starts_with(check, 'B')
and check like '%AB%'
The above assumes that a Subject can potentially perform the same action multiple times, which is why there are a few extra checks in the where clause. Otherwise it would be just check = 'AB'.
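
To see the conditional aggregation answer in action, here is a small sketch that runs it unchanged against the sample data from the question, using an in-memory SQLite database from Python. The table name actions and the function name are placeholders. Subject 4 drops out because the MIN over its (non-existent) B rows is NULL, so the comparison is not true.

```python
import sqlite3

# Sample rows from the question: (Subject, ActionID, ActionOrder).
ACTIONS = [
    (1, "A", 1), (1, "C", 2), (1, "D", 3), (1, "B", 4), (1, "C", 5),
    (2, "D", 1), (2, "A", 2), (2, "C", 3), (2, "B", 4),
    (3, "B", 1), (3, "D", 2), (3, "A", 3),
    (4, "A", 1),
]

def subjects_a_before_b(rows):
    """Return subjects whose last A comes before their first B."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE actions (Subject INTEGER, ActionID TEXT, ActionOrder INTEGER)"
    )
    conn.executemany("INSERT INTO actions VALUES (?, ?, ?)", rows)
    result = [row[0] for row in conn.execute("""
        SELECT Subject
        FROM actions
        WHERE ActionID IN ('A', 'B')
        GROUP BY Subject
        -- CASE without ELSE yields NULL, which MAX/MIN ignore
        HAVING MAX(CASE WHEN ActionID = 'A' THEN ActionOrder END) <
               MIN(CASE WHEN ActionID = 'B' THEN ActionOrder END)
        ORDER BY Subject
    """)]
    conn.close()
    return result

print(subjects_a_before_b(ACTIONS))  # → [1, 2]
```

As written, the condition requires every A to precede every B. If a subject may repeat actions and any A followed later by any B should count, comparing MIN of the A positions with MAX of the B positions would be the variant to try.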

Summing up only the values of previous rows with the same ID

As I am preparing my data for predicting no-shows at a hospital, I ran into the following problem: in the query below I tried to get the running number of shows/no-shows relative to the number of appointments (APPTS). INDICATION_NO_SHOW indicates whether a patient showed up at an appointment: 0 means show, and 1 means no-show.
with t1 as
(
select
PAT_ID
,APPT_TIME
,APPT_ID
,ROW_NUMBER () over(PARTITION BY PAT_ID order by pat_id,APPT_TIME) as [TOTAL_APPTS]
,INDICATION_NO_SHOW
from appointments
)
,
t2 as
(
select
t1.PAT_ID
,t1.APPT_TIME
,INDICATION_NO_SHOW
-- running sum without PARTITION BY, so it continues across patients
,sum(INDICATION_NO_SHOW) over(order by PAT_ID, APPT_TIME ) as TOTAL_NO_SHOWS
,TOTAL_APPTS
from t1
)
SELECT *
,(TOTAL_APPTS - TOTAL_NO_SHOWS) AS TOTAL_SHOWS
FROM T2
order by PAT_ID, APPT_TIME
This resulted into the following dataset:
PAT_ID APPT_TIME INDICATION_NO_SHOW TOTAL_SHOWS TOTAL_NO_SHOWS TOTAL_APPTS
1 1-1-2001 0 1 0 1
1 1-2-2001 0 2 0 2
1 1-3-2001 1 2 1 3
1 1-4-2001 0 3 1 4
2 1-1-2001 0 0 1 1
2 2-1-2001 0 1 1 2
2 2-2-2001 1 1 2 3
2 2-3-2001 0 2 2 4
As you can see, my query only worked for patient 1; for patient 2 it also counts the no-shows of patient 1. So it worked for one patient individually, but not over the whole dataset.
The TOTAL_APPTS column worked out, because it counts the number of appointments the patient had had at the moment of the given appointment. My question is: how do I get these shows and no-shows added up correctly per patient (as happened for patient 1)? I'm completely aware of why this query doesn't work, I'm just completely in the dark about how to fix it.
I think that you can just use window functions. You seem to be looking for window sums of shows and no shows per patient, so:
select
pat_id,
appt_time,
indication_no_show,
sum(1 - indication_no_show)
over(partition by pat_id order by appt_time) total_shows,
sum(indication_no_show)
over(partition by pat_id order by appt_time) total_no_shows
from appointments
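
A minimal way to try the per-patient window sums is sketched below, using an in-memory SQLite database from Python. Window functions need SQLite 3.25 or newer, which is an assumption about the local build. The appointment dates are illustrative ISO strings chosen so that text ordering matches time ordering; they are not the exact dates from the question. The function name is hypothetical.

```python
import sqlite3

# (pat_id, appt_time, indication_no_show); dates are illustrative ISO strings.
APPOINTMENTS = [
    (1, "2001-01-01", 0), (1, "2001-01-02", 0),
    (1, "2001-01-03", 1), (1, "2001-01-04", 0),
    (2, "2001-01-01", 0), (2, "2001-02-01", 0),
    (2, "2001-02-02", 1), (2, "2001-02-03", 0),
]

def running_totals(rows):
    """Return (pat_id, appt_time, total_shows, total_no_shows) per appointment."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE appointments ("
        "pat_id INTEGER, appt_time TEXT, indication_no_show INTEGER)"
    )
    conn.executemany("INSERT INTO appointments VALUES (?, ?, ?)", rows)
    result = conn.execute("""
        SELECT pat_id,
               appt_time,
               -- PARTITION BY restarts the running sum for every patient
               SUM(1 - indication_no_show)
                   OVER (PARTITION BY pat_id ORDER BY appt_time) AS total_shows,
               SUM(indication_no_show)
                   OVER (PARTITION BY pat_id ORDER BY appt_time) AS total_no_shows
        FROM appointments
        ORDER BY pat_id, appt_time
    """).fetchall()
    conn.close()
    return result

for row in running_totals(APPOINTMENTS):
    print(row)
```

With this data, the totals for patient 2 start again from zero instead of continuing from the totals of patient 1, which is the behaviour the question was after.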

Getting only one id from duplicate ids (all having different values) and updating the unique id with only one value (applying condition)

I can't seem to solve this. I need to apply 2 conditions here:
1) When the same id has both 'Bachelors' and 'Masters' as values, I need to have the id only once, showing bachelors.
2) When the same id has 'Bachelors', 'Masters' and 'PHD' as values, I need to have the id only once, showing bachelors.
id degree
1 bachelor
2 master
3 bachelor
1 master
2 bachelor
2 phd
I want result like this -
1 bachelor
2 master
3 bachelor
Presumably, you want something like this:
select id,
(case when sum(case when degree = 'Bachelors' then 1 else 0 end) > 0
then 'Bachelors'
else max(degree)
end) as degree
from t
group by id;

SQL: A count inside a case inside a case perhaps?

Good day all.
Below is an image relating to what I am attempting to achieve.
In one table there are two fields: one is an ID and one is a Type.
I figured a picture paints a thousand words, so see below.
I have tried a few things with CASE and other approaches, but none worked.
There are a couple of things to note: we cannot use temporary tables, inserts or deletes due to certain limitations.
Data Sample:
ID Type
3 bad
2 zeal
4 tro
3 pol
2 tro
2 lata
4 wrong
3 dead
2 wrong
3 dead
4 wrong
3 lata
2 bad
2 zeal
First of all you need a table containing the type groups:
type typegroup
bad 1
tro 1
zeal 1
dead 2
lata 2
wrong 2
pol 3
Then join, group by type group in order to get one result line per type group and count.
select
tg.typegroup,
count(case when id = 2 then 1 end) as id2,
count(case when id = 3 then 1 end) as id3,
count(case when id = 4 then 1 end) as id4
from typegroups tg
join mytable m on m.type = tg.type
group by tg.typegroup
order by tg.typegroup;
UPDATE: Of course you can create such a table on the fly.
...
from
(
select 'bad' as type, 1 as typegroup
union all
select 'tro' as type, 1 as typegroup
union all
...
) tg
join mytable m on m.type = tg.type
...
And you can move this to a WITH clause if you prefer.
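
For illustration, the WITH-clause variant could look roughly like the sketch below, run here against an in-memory SQLite database from Python with the data sample from the question. The names mytable and typegroups follow the answer; the function name is made up, and the type-to-group mapping is the one listed above.

```python
import sqlite3

# Data sample from the question: (ID, Type).
ROWS = [
    (3, "bad"), (2, "zeal"), (4, "tro"), (3, "pol"), (2, "tro"),
    (2, "lata"), (4, "wrong"), (3, "dead"), (2, "wrong"), (3, "dead"),
    (4, "wrong"), (3, "lata"), (2, "bad"), (2, "zeal"),
]

def counts_by_typegroup(rows):
    """Return (typegroup, id2, id3, id4) counts, one row per type group."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER, type TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    result = conn.execute("""
        WITH typegroups AS (
            -- the lookup table is built on the fly, no temporary table needed
            SELECT 'bad' AS type, 1 AS typegroup
            UNION ALL SELECT 'tro', 1
            UNION ALL SELECT 'zeal', 1
            UNION ALL SELECT 'dead', 2
            UNION ALL SELECT 'lata', 2
            UNION ALL SELECT 'wrong', 2
            UNION ALL SELECT 'pol', 3
        )
        SELECT tg.typegroup,
               COUNT(CASE WHEN m.id = 2 THEN 1 END) AS id2,
               COUNT(CASE WHEN m.id = 3 THEN 1 END) AS id3,
               COUNT(CASE WHEN m.id = 4 THEN 1 END) AS id4
        FROM typegroups tg
        JOIN mytable m ON m.type = tg.type
        GROUP BY tg.typegroup
        ORDER BY tg.typegroup
    """).fetchall()
    conn.close()
    return result

for row in counts_by_typegroup(ROWS):
    print(row)
# → (1, 4, 1, 1)
# → (2, 2, 3, 2)
# → (3, 0, 1, 0)
```

COUNT ignores the NULL produced by a CASE without an ELSE branch, so each column only counts the rows for its own id.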