SQL count occurrence of value by distinct id - sql

Given the following table A:
FlightID| Role
1 | Pilot
1 | Steward
1 | Steward
2 | Pilot
2 | Co-Pilot
How can I determine the number of stewards on each distinct flight? The output should be like this:
FlightID| Count
1 | 2
2 | 0
First I tried:
select FlightID, count(Role) from A group by FlightID
but this gives me the total number of roles per flight. Then I tried:
select FlightID, count(Role) from A where Role='Steward' group by FlightID
This is partially right, since it gives me the number of stewards per flight, but flights with 0 stewards are missing from the result. How can I also include flights with 0 stewards?

You can use conditional aggregation.
select FlightID
,sum(case when Role = 'Steward' then 1 else 0 end)
--or count(case when Role = 'Steward' then 1 end)
from A
group by FlightID
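As a quick check of the conditional aggregation above, here is a minimal sketch that runs it against the question's sample rows. It assumes an in-memory SQLite database driven from Python, and the alias `stewards` and the helper name `stewards_per_flight` are made up for illustration; the same SQL should carry over to other engines.

```python
import sqlite3


def stewards_per_flight(conn):
    """Count 'Steward' rows per flight, keeping flights that have none."""
    return conn.execute(
        """
        SELECT FlightID,
               SUM(CASE WHEN Role = 'Steward' THEN 1 ELSE 0 END) AS stewards
        FROM A
        GROUP BY FlightID
        ORDER BY FlightID
        """
    ).fetchall()


# Sample data copied from the question (in-memory database is an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (FlightID INTEGER, Role TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [(1, "Pilot"), (1, "Steward"), (1, "Steward"), (2, "Pilot"), (2, "Co-Pilot")],
)

flight_counts = stewards_per_flight(conn)
print(flight_counts)
```

Because no WHERE clause removes rows before grouping, flight 2 still forms a group and its sum is simply 0.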

Does your table really not have a primary key on it? If it does, this is a fairly easy LEFT JOIN:
SELECT a1.FlightId, COUNT(a2.FlightId)
FROM A a1 LEFT JOIN A a2 ON a1.id = a2.id
AND a2.Role = 'Steward'
GROUP BY a1.FlightId;
http://ideone.com/BySBSx
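The LEFT JOIN variant can be tried the same way. This sketch assumes the table has a surrogate primary key named `id`, which the question's table does not show, and uses an in-memory SQLite database purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "id" is an assumed surrogate key; the original table only lists FlightID and the role.
conn.execute(
    "CREATE TABLE A (id INTEGER PRIMARY KEY, FlightID INTEGER, Role TEXT)"
)
conn.executemany(
    "INSERT INTO A (FlightID, Role) VALUES (?, ?)",
    [(1, "Pilot"), (1, "Steward"), (1, "Steward"), (2, "Pilot"), (2, "Co-Pilot")],
)

# Each row joins to itself only when it is a steward row; COUNT skips the
# NULLs produced for all other rows, so flights without stewards report 0.
join_counts = conn.execute(
    """
    SELECT a1.FlightID, COUNT(a2.FlightID)
    FROM A a1
    LEFT JOIN A a2 ON a1.id = a2.id AND a2.Role = 'Steward'
    GROUP BY a1.FlightID
    ORDER BY a1.FlightID
    """
).fetchall()
print(join_counts)
```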

Related

I need a 3 table join

This is my parent table, ACC_DETIAL.
ACC_DETIAL example:
acc_id
1
2
3
Now I have 3 tables:
ORDER
EMAIL
REPORT
Each table contains 100 rows, and acc_id is a foreign key referencing ACC_DETIAL.
In the ORDER table I have the columns ACC_ID and QUANTITY. I want the count of ACC_ID and the sum of QUANTITY.
ORDER table example:
acc_id | quantity | date
1 | 2 | 2022/01/22
2 | 5 | 2022/01/23
1 | 10 | 2022/01/25
3 | 1 | 2022/01/25
In the EMAIL table I have a column named ACC_ID, and I want the count of ACC_ID.
EMAIL table example:
acc_id | mail | date
1 | 5 | 2022/01/22
2 | 10 | 2022/01/22
1 | 7 | 2022/01/23
1 | 7 | 2022/01/24
2 | 10 | 2022/01/25
In the REPORT table I have the columns ACC_ID and TYPE, and I want the count of ACC_ID and of TYPE. Note that the TYPE column has only two possible values:
positive
negative
I want the count of each, i.e. the count of positive and the count of negative in the TYPE column.
REPORT table example:
acc_id | type | date
1 | positive | 2022/01/22
2 | negative | 2022/01/22
1 | negative | 2022/01/23
2 | positive | 2022/01/26
2 | positive | 2022/01/27
I need to get all of this in a single query, either as a raw query or with SQLAlchemy. Is that possible, or do I need to write a separate query for each table?
Result, based on the above example:
acc_id | total_Order_acc_id | total_Order_quantity | total_Email_acc_id | total_Report_acc_id | total_postitive_report | total_negative_report
1 | 2 | 12 | 3 | 2 | 1 | 1
2 | 1 | 5 | 2 | 3 | 2 | 1
3 | 1 | 1 | Null | Null | Null | Null
You need to aggregate each table first and then join the results, as follows:
SELECT ADL.acc_id,
ORD.ord_cnt AS total_Order_acc_id,
ORD.tot_quantity AS total_Order_quantity,
EML.eml_cnt AS total_Email_acc_id,
RPT.rpt_cnt AS total_Report_acc_id,
RPT.pcnt AS total_postitive_report,
RPT.ncnt AS total_negative_report
FROM ACC_DETIAL ADL LEFT JOIN
(
SELECT acc_id,
SUM(quantity) AS tot_quantity,
COUNT(*) AS ord_cnt
FROM ORDERS
GROUP BY acc_id
) ORD
ON ADL.acc_id = ORD.acc_id
LEFT JOIN
(
SELECT acc_id, COUNT(*) AS eml_cnt
FROM EMAIL
GROUP BY acc_id
) EML
ON ADL.acc_id = EML.acc_id
LEFT JOIN
(
SELECT acc_id,
COUNT(*) AS rpt_cnt,
COUNT(*) FILTER (WHERE type='positive') AS pcnt,
COUNT(*) FILTER (WHERE type='negative') AS ncnt
FROM REPORT
GROUP BY acc_id
) RPT
ON ADL.acc_id = RPT.acc_id
See demo
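To see why aggregating before joining matters, the sketch below runs both approaches on the question's sample rows. It makes several assumptions for the sake of a runnable example: an in-memory SQLite database, a table called `orders` (ORDER is a reserved word), the date columns left out, and `SUM(CASE ...)` in place of `FILTER` so it also works on engines without that clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE acc_detial (acc_id INTEGER PRIMARY KEY);
    CREATE TABLE orders (acc_id INTEGER, quantity INTEGER);
    CREATE TABLE email  (acc_id INTEGER, mail INTEGER);
    CREATE TABLE report (acc_id INTEGER, type TEXT);
    INSERT INTO acc_detial VALUES (1), (2), (3);
    INSERT INTO orders VALUES (1, 2), (2, 5), (1, 10), (3, 1);
    INSERT INTO email  VALUES (1, 5), (2, 10), (1, 7), (1, 7), (2, 10);
    INSERT INTO report VALUES (1, 'positive'), (2, 'negative'), (1, 'negative'),
                              (2, 'positive'), (2, 'positive');
    """
)

# Aggregate each child table to one row per acc_id, then LEFT JOIN the results.
account_rows = conn.execute(
    """
    SELECT adl.acc_id, ord.ord_cnt, ord.tot_quantity,
           eml.eml_cnt, rpt.rpt_cnt, rpt.pcnt, rpt.ncnt
    FROM acc_detial adl
    LEFT JOIN (SELECT acc_id, COUNT(*) AS ord_cnt, SUM(quantity) AS tot_quantity
               FROM orders GROUP BY acc_id) ord ON adl.acc_id = ord.acc_id
    LEFT JOIN (SELECT acc_id, COUNT(*) AS eml_cnt
               FROM email GROUP BY acc_id) eml ON adl.acc_id = eml.acc_id
    LEFT JOIN (SELECT acc_id, COUNT(*) AS rpt_cnt,
                      SUM(CASE WHEN type = 'positive' THEN 1 ELSE 0 END) AS pcnt,
                      SUM(CASE WHEN type = 'negative' THEN 1 ELSE 0 END) AS ncnt
               FROM report GROUP BY acc_id) rpt ON adl.acc_id = rpt.acc_id
    ORDER BY adl.acc_id
    """
).fetchall()

# For contrast: joining the raw tables first multiplies the rows per account.
naive_rows = conn.execute(
    """
    SELECT o.acc_id, COUNT(*)
    FROM orders o
    JOIN email e  ON e.acc_id = o.acc_id
    JOIN report r ON r.acc_id = o.acc_id
    GROUP BY o.acc_id
    ORDER BY o.acc_id
    """
).fetchall()

print(account_rows)
print(naive_rows)
```

In the naive version account 1 yields 2 orders x 3 emails x 2 reports = 12 joined rows, so any count or sum taken over them is inflated, and account 3 disappears because it has no emails or reports.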
Sample :
Select
`order`.`acc_id`,
report_email_select.`type`,
report_email_select.report_count,
report_email_select.email_count,
SUM(`quantity`) as quantity_sum
FROM
`order`
Left JOIN(
Select
report_select.`acc_id`,
report_select.`type`,
report_select.report_count,
COUNT(*) as email_count
from
(
SELECT
report.`acc_id`,
report.`type`,
COUNT(*) as report_count
FROM
`report`
WHERE
1
GROUP BY
report.`acc_id`,
report.`type`
) AS report_select
INNER JOIN email ON email.acc_id = report_select.acc_id
GROUP BY
report_select.`acc_id`,
report_select.`type`
) AS report_email_select ON `order`.acc_id = report_email_select.acc_id
GROUP BY
`order`.`acc_id`,
report_email_select.`type`;

How to check the count of each values repeating in a row

I have two tables. Data in the first table is:
ID Username
1 Dan
2 Eli
3 Sean
4 John
Second Table Data:
user_id Status_id
1 2
1 3
4 1
3 2
2 3
1 1
3 3
3 3
3 3
... and so on.
These are my two tables.
I want to find how often each individual user appears with each status_id.
My expected result is:
username status_id(1) status_id(2) status_id(3)
Dan 1 1 1
Eli 0 0 1
Sean 0 1 2
John 1 0 0
My current code is:
SELECT b.username , COUNT(a.status_id)
FROM masterdb.auth_user b
left outer join masterdb.xmlform_joblist a
on a.user1_id = b.id
GROUP BY b.username, b.id, a.status_id
This gives me the separate counts, but without indicating which status_id each count represents.
This is called a pivot, and it works in two steps:
First, extract the data for each specific field using a CASE expression.
Second, aggregate the data per user, so that every field value lies on the same record for each user.
SELECT Username,
SUM(CASE WHEN status_id = 1 THEN 1 ELSE 0 END) AS status_id_1,
SUM(CASE WHEN status_id = 2 THEN 1 ELSE 0 END) AS status_id_2,
SUM(CASE WHEN status_id = 3 THEN 1 ELSE 0 END) AS status_id_3
FROM t2
INNER JOIN t1
ON t2.user_id = t1._ID
GROUP BY Username
ORDER BY Username
Check the demo here.
Note: This solution assumes that there are exactly 3 status_id values. To generalize over the number of status ids, you would need a dynamic query. In any case, it's better to avoid dynamic queries if you can.
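For the case where the set of status ids is not known in advance, one possible shape of such a dynamic query is sketched below. It is an illustration under assumptions, not the answer's own code: it uses an in-memory SQLite database, builds the column list in Python from the distinct `status_id` values, uses only the rows actually listed in the question, and uses a LEFT JOIN with ELSE 0 so that users without rows would show zeros.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t1 (ID INTEGER PRIMARY KEY, Username TEXT);
    CREATE TABLE t2 (user_id INTEGER, status_id INTEGER);
    INSERT INTO t1 VALUES (1, 'Dan'), (2, 'Eli'), (3, 'Sean'), (4, 'John');
    INSERT INTO t2 VALUES (1, 2), (1, 3), (4, 1), (3, 2), (2, 3),
                          (1, 1), (3, 3), (3, 3), (3, 3);
    """
)

# Discover the status ids at run time; int() guarantees that only numbers
# are ever interpolated into the SQL text.
status_ids = [
    int(row[0])
    for row in conn.execute("SELECT DISTINCT status_id FROM t2 ORDER BY status_id")
]

# One conditional-sum column per status id.
columns = ",\n       ".join(
    f"SUM(CASE WHEN t2.status_id = {sid} THEN 1 ELSE 0 END) AS status_id_{sid}"
    for sid in status_ids
)

pivot_sql = f"""
SELECT t1.Username,
       {columns}
FROM t1
LEFT JOIN t2 ON t2.user_id = t1.ID
GROUP BY t1.ID, t1.Username
ORDER BY t1.ID
"""

pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Sean shows 3 for status 3 here because the listed sample rows contain that status three times for user 3.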

How to group values of one column based on condition in t-sql?

My table has doctor ID, visit ID (one doctor can have several visits) and visit cause, it looks like this:
doctor_uid | visit_uid | visit_cause
11-11-11 22-22-11 1
22-22-22 44-44-22 2
11-11-11 23-23-23 1
I need to count the visits for each doctor whose visit_cause equals '1'. I already count all visits for each doctor without the visit_cause condition, and I must keep that in my query too:
SELECT
v.vra_uid,
COUNT(v.visit_uid) OVER (PARTITION BY v.vra_uid) AS number_of_visits
FROM visits v
It returns:
doctor_uid | number_of_visits
11-11-11 2
22-22-22 1
How can I count those visits, that have visit_cause = '1' for each doctor? so that it would look like:
doctor_uid | number_of_visits | visits_with_cause_equal_1
11-11-11 2 2
22-22-22 1 0
Thanks in advance!
This should work:
select v.vra_uid,
COUNT(v.visit_uid) as number_of_visits,
SUM(case when v.visit_cause = 1 then 1 else 0 end) as visits_with_cause_equal_1
from visits v
group by v.vra_uid
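A minimal sketch of that query on the question's three sample rows, assuming an in-memory SQLite database and a `visits` table whose doctor column is named `vra_uid` as in the posted queries:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (vra_uid TEXT, visit_uid TEXT, visit_cause INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        ("11-11-11", "22-22-11", 1),
        ("22-22-22", "44-44-22", 2),
        ("11-11-11", "23-23-23", 1),
    ],
)

# One row per doctor: the total visit count plus a conditional count.
visit_rows = conn.execute(
    """
    SELECT v.vra_uid,
           COUNT(v.visit_uid) AS number_of_visits,
           SUM(CASE WHEN v.visit_cause = 1 THEN 1 ELSE 0 END)
               AS visits_with_cause_equal_1
    FROM visits v
    GROUP BY v.vra_uid
    ORDER BY v.vra_uid
    """
).fetchall()
print(visit_rows)
```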
Have you tried CASE WHEN visit_cause = '1' THEN COUNT(*)?

logic in HAVING clause to get multiple values of a group by result

Imagine I have a table with data as below:
ROLE_ID | USER_ID | CODE
---------------------------------
14 | USER A | 001
15 | USER A | 002
11 | USER B | 004
13 | USER D | 005
13 | USER A | 001
15 | USER B | 009
15 | USER D | 005
12 | USER C | 004
15 | USER C | 008
13 | USER D | 007
15 | USER D | 007
I want to get the user IDs and codes that have only role_ids 13 and 15. Based on the data above, I would like to get back the following:
USER D | 005
USER D | 007
I have the query below, however, it only brings back one, not both.
SELECT a.user_id, a.code
FROM my_table a
WHERE a.ROLE_ID in (13,15,11,14)
group by a.USER_ID, a.code
having sum( case when a.role_id in (13,15) then 1 else 0 end) = 2
and sum( case when a.role_id in (11,14) then 1 else 0 end) = 0
ORDER BY USER_ID
The above query only brings
USER D | 005
rather than
USER D | 005
USER D | 007
Sometimes just listening to your own words in English translates into the easiest to read SQL:
SELECT DISTINCT a.user_id, a.code
FROM my_table a
WHERE a.user_id in
(SELECT b.user_id
FROM my_table b
WHERE b.ROLE_ID = 13)
AND a.user_id in
(SELECT b.user_id
FROM my_table b
WHERE b.ROLE_ID = 15)
AND a.user_id NOT IN
(SELECT b.user_id
FROM my_table b
WHERE b.ROLE_ID NOT IN (13,15))
I would write:
SELECT a.user_id, a.code
FROM my_table a
GROUP BY a.user_id, a.code
HAVING sum(case when a.role_id in (13, 15) then 1 else 3 end) = 2
:)
As proven by EthanB, your query is working exactly as you desire. There must be something in your project data that is not represented in your question's fabricated data.
I do endorse a pivot as you have executed in your question, but I would write it as a single SUM expression to reduce the number of iterations over the aggregate data. I certainly do not endorse multiple subqueries on each row of the table (1, 2, 3) ...regardless of whether the optimizer is converting the subqueries to multiple JOINs.
Your pivot conditions:
having sum( case when a.role_id in (13,15) then 1 else 0 end) = 2
and sum( case when a.role_id in (11,14) then 1 else 0 end) = 0
My recommendation:
As the aggregate data is being iterated, you can keep a tally (+1) of qualifying rows and jump to a disqualifying outcome (+3) after each evaluation. This way, there is only one pass over the aggregate instead of two.
SELECT USER_ID, CODE
FROM my_table
WHERE ROLE_ID IN (13,15,11,14)
GROUP BY USER_ID, CODE
HAVING SUM(CASE WHEN ROLE_ID IN (13,15) THEN 1
WHEN ROLE_ID IN (11,14) THEN 3 END) = 2
Another way of expressing what these HAVING clauses are doing is:
Require that the first CASE is satisfied twice and that the second CASE is never satisfied.
Demo Link
Alternatively, the above HAVING clause could be less elegantly written as:
HAVING SUM(CASE ROLE_ID
WHEN 13 THEN 1
WHEN 15 THEN 1
WHEN 11 THEN 3
WHEN 14 THEN 3
END) = 2
Disclaimer #1: I don't swim in the [oracle] tag pool, I've not investigated how to execute this with PIVOT.
Disclaimer #2: My above advice assumes that ROLE_IDs are unique in the grouped USER_ID+CODE aggregate data. Fringe cases: (a demo)
If a given group contains ROLE_ID = 13, ROLE_ID = 13, and ROLE_ID = 15, then the SUM will be at least 3 and the group will be disqualified.
If a given group contains only ROLE_ID = 15 and ROLE_ID = 15, then the SUM will be 2 and the group will be unintentionally qualified.
To combat scenarios like these, make three separate MAX conditions.
HAVING MAX(CASE WHEN ROLE_ID = 13 THEN 1 END) = 1
AND MAX(CASE WHEN ROLE_ID = 15 THEN 1 END) = 1
AND MAX(CASE WHEN ROLE_ID IN (11,14) THEN 1 END) IS NULL
Demo
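The difference between the single-SUM tally and the three MAX conditions can be sketched as follows. This is an illustration under assumptions: an in-memory SQLite database, the question's sample rows, and one extra made-up group (USER E, code 010) that holds role 15 twice to trigger the fringe case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (ROLE_ID INTEGER, USER_ID TEXT, CODE TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (14, "USER A", "001"), (15, "USER A", "002"), (11, "USER B", "004"),
        (13, "USER D", "005"), (13, "USER A", "001"), (15, "USER B", "009"),
        (15, "USER D", "005"), (12, "USER C", "004"), (15, "USER C", "008"),
        (13, "USER D", "007"), (15, "USER D", "007"),
        # Hypothetical duplicate rows, added only to show the fringe case.
        (15, "USER E", "010"), (15, "USER E", "010"),
    ],
)

# Single tally: +1 for a wanted role, +3 for an unwanted one, must total 2.
sum_rows = conn.execute(
    """
    SELECT USER_ID, CODE
    FROM my_table
    WHERE ROLE_ID IN (13, 15, 11, 14)
    GROUP BY USER_ID, CODE
    HAVING SUM(CASE WHEN ROLE_ID IN (13, 15) THEN 1
                    WHEN ROLE_ID IN (11, 14) THEN 3 END) = 2
    ORDER BY USER_ID, CODE
    """
).fetchall()

# Separate MAX checks: each wanted role present, no unwanted role present.
max_rows = conn.execute(
    """
    SELECT USER_ID, CODE
    FROM my_table
    WHERE ROLE_ID IN (13, 15, 11, 14)
    GROUP BY USER_ID, CODE
    HAVING MAX(CASE WHEN ROLE_ID = 13 THEN 1 END) = 1
       AND MAX(CASE WHEN ROLE_ID = 15 THEN 1 END) = 1
       AND MAX(CASE WHEN ROLE_ID IN (11, 14) THEN 1 END) IS NULL
    ORDER BY USER_ID, CODE
    """
).fetchall()

print(sum_rows)
print(max_rows)
```

The duplicated role 15 sums to 2 and slips through the tally, while the MAX version rejects it because role 13 never appears in that group.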
SELECT user_id, code FROM my_table
WHERE role_id = 13
INTERSECT
SELECT user_id, code FROM my_table
WHERE role_id = 15

Semi-complicated result set in sql query

I have a user table like this
FIRSTNAME | LASTNAME | ID |
--------------------------------
James | Hay | 1 |
Other | Person | 2 |
I also have an attendance table like this
EVENTID | USERID | ATTENDANCE | STATUS |
-----------------------------------------------
1 1 True 3
2 1 False 1
3 1 False 3
1 2 False 1
2 2 True 3
3 2 True 3
Basically, when a user is invited to an event, a row is added to the attendance table which has the event ID, their user ID, false attendance, and status 0.
Status is just an indicator of their response
0 = No Response
1 = Said No
2 = Said Yes
3 = Said yes and seats confirmed
My end result I want to get from querying these two tables is quite complicated and I can't figure out what I need to do.
I want to get a result like this
NAME | % of saying YES to an RSVP | % of attending after saying yes
-------------------------------------------------------------------------------
James Hay | 66 | 50
Other Person | 66 | 100
I'm sure you can work out how I got those numbers, but to explain: James Hay had the 3 (yes) status for 2 of the 3 events he was invited to, so the % of saying yes is 66. Of the 2 he said yes to, he attended only 1, so the % of attending after saying yes is 50.
Any push in the right direction would be much appreciated, since I can't get my head around this.
EDIT:
Also something quite important is that I want the results to include every user in the database even if they have 0 rows in the attendance table.
select
u.firstname, u.lastname
-- said yes, as percentage
,floor(100.0
* count(case when a.status in (2,3) then 1 end)
/ count(u.id)) yes
-- attended after saying yes, as percentage
,floor(100.0
* count(case when a.status in (2,3) and attendance='true' then 1 end)
/ nullif(count(case when a.status in (2,3) then 1 end),0)) attendance
--,count(u.id) rsvp -- total invites
from users u
left join attendance a on a.userid = u.id
group by u.firstname, u.lastname
Note: For the special case where the user has never received an invite, the statistics show as 0% and NULL.
Explanation of the terms:
count(case when a.status in (2,3) then 1 end)
represents how many times they said yes, used twice
count(u.id)
how many invites (recorded in attendance) they received. The special case is a user who has received none: the LEFT JOIN still returns one row for them, so this count is 1 (not important, because the numerator is 0)
count(case when a.status in (2,3) and attendance='true' then 1 end)
count of how many times they attended, AFTER having said yes
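The terms above can be checked against the question's sample data with the sketch below. It rests on a few assumptions: an in-memory SQLite database, attendance stored as the text 'true'/'false', a made-up third user ("New User") with no invites to exercise the special case, and `CAST(... AS INTEGER)` standing in for `floor`, which truncates the same way for these non-negative values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (firstname TEXT, lastname TEXT, id INTEGER PRIMARY KEY);
    CREATE TABLE attendance (eventid INTEGER, userid INTEGER,
                             attendance TEXT, status INTEGER);
    INSERT INTO users VALUES ('James', 'Hay', 1), ('Other', 'Person', 2),
                             ('New', 'User', 3);  -- hypothetical, never invited
    INSERT INTO attendance VALUES
        (1, 1, 'true', 3), (2, 1, 'false', 1), (3, 1, 'false', 3),
        (1, 2, 'false', 1), (2, 2, 'true', 3), (3, 2, 'true', 3);
    """
)

rsvp_rows = conn.execute(
    """
    SELECT u.firstname, u.lastname,
           -- said yes, as a percentage of invites
           CAST(100.0 * COUNT(CASE WHEN a.status IN (2, 3) THEN 1 END)
                / COUNT(u.id) AS INTEGER) AS yes,
           -- attended after saying yes; NULLIF avoids dividing by zero
           CAST(100.0 * COUNT(CASE WHEN a.status IN (2, 3)
                                    AND a.attendance = 'true' THEN 1 END)
                / NULLIF(COUNT(CASE WHEN a.status IN (2, 3) THEN 1 END), 0)
                AS INTEGER) AS attendance
    FROM users u
    LEFT JOIN attendance a ON a.userid = u.id
    GROUP BY u.id, u.firstname, u.lastname
    ORDER BY u.id
    """
).fetchall()

for row in rsvp_rows:
    print(row)
```

The user without invites comes out as 0 and NULL, matching the note about the special case.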
SELECT A.NAME,B.PERCENTAGE_YES_RSVP,B.PERCENTAGE_AFTER_YES
FROM
(
SELECT U.ID AS ID,U.FIRSTNAME+' '+U.LASTNAME AS NAME
FROM USERS U
) A,
(
SELECT A.USERID,A.PERCENTAGE_YES_RSVP,B.PERCENTAGE_AFTER_YES
FROM
(
SELECT B.USERID,ROUND((CAST(B.COUNT_YES_RSVP AS FLOAT)/B.TOTAL_COUNT)*100,0) AS PERCENTAGE_YES_RSVP
FROM
(SELECT A.USERID,
SUM(CASE WHEN A.STATUS=3 THEN 1 END)AS COUNT_YES_RSVP,
COUNT(*) AS TOTAL_COUNT
FROM ATTENDANCE A
GROUP BY A.USERID
) B
) A,
(
SELECT C.USERID,(CAST(C.COUNT_AFTER_YES_RSVP AS FLOAT)/C.TOTAL_COUNT)*100 AS PERCENTAGE_AFTER_YES
FROM
(SELECT A.USERID,
SUM(CASE WHEN A.STATUS=3 AND A.ATTENDANCE='TRUE' THEN 1 END)AS COUNT_AFTER_YES_RSVP,
SUM(CASE WHEN A.STATUS=3 THEN 1 END) AS TOTAL_COUNT
FROM ATTENDANCE A
GROUP BY A.USERID
) C
)B
WHERE A.USERID=B.USERID
) B
WHERE A.ID = B.USERID;
Edit:
Whoops, forgot to join on users :)
select firstname, lastname,
convert(decimal (5, 2), 1. * count(case when status in (2, 3) then 1 end) / count(*) * 100) SaidYes,
convert(decimal (5, 2), 1. * count(case when status in (2, 3) and attendance = 'True' then 1 end) / nullif(count(case when status in (2, 3) then 1 end), 0) * 100) ActuallyAttended
from attendance a
join users u on a.userid = u.id
group by firstname, lastname