MySQL COUNT with GROUP BY and zero entries - sql

I have 2 tables:
Table 1. options_ethnicity with the following entries:
ethnicity_id ethnicity_name
1 White
2 Hispanic
3 African/American
Table 2. inquiries with the following entries:
inquiry_id ethnicity_id
1 1
2 1
3 1
4 2
5 2
I want to generate a table that shows the number of inquiries by ethnicity. My query so far looks like this:
SELECT options_ethnicity.ethnicity_name, COUNT('inquiries.ethnicity_id') AS count
FROM (inquiries
LEFT JOIN options_ethnicity ON
options_ethnicity.ethnicity_id = inquiries.ethnicity_id)
GROUP BY options_ethnicity.ethnicity_id
The query gives the correct counts, but there is no row for African/American, which has 0 results.
White 3
Hispanic 2
If I replace the LEFT JOIN with a RIGHT JOIN, I get all 3 ethnicity names, but the count for African/American is wrong.
White 3
Hispanic 2
African/American 1
Any help would be appreciated.
Here's an update to this post with what seems to be a working query:
SELECT
options_ethnicity.ethnicity_name,
COALESCE(COUNT(inquiries.ethnicity_id), 0) AS count
FROM options_ethnicity LEFT JOIN inquiries ON inquiries.ethnicity_id = options_ethnicity.ethnicity_id
GROUP BY options_ethnicity.ethnicity_id
UNION ALL
SELECT
'NULL Placeholder' AS ethnicity_name,
COUNT(inquiries.inquiry_id) AS count
FROM inquiries
WHERE inquiries.ethnicity_id IS NULL

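Because you're using a LEFT JOIN, columns from the table on the right side of the join can be NULL. Select FROM the options table and LEFT JOIN the inquiries, so every ethnicity is kept. COUNT(column) skips NULLs and already returns 0 for an unmatched group, so the COALESCE below is a harmless safeguard rather than a requirement: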
SELECT oe.ethnicity_name,
COALESCE(COUNT(i.ethnicity_id), 0) AS count
FROM OPTIONS_ETHNICITY oe
LEFT JOIN INQUIRIES i ON i.ethnicity_id = oe.ethnicity_id
GROUP BY oe.ethnicity_id
This example uses COALESCE, an ANSI standard means of handling NULL values. It will return the first non-null value, but if none can be found it will return null. IFNULL is a valid alternative on MySQL, but it is not portable to other databases while COALESCE is.
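To see the zero row appear, here is a minimal sketch that runs the same query against the sample data from the question. It uses Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL (my assumption, purely so it can be run anywhere), and it drops the COALESCE to show that COUNT alone already yields 0.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options_ethnicity (ethnicity_id INTEGER PRIMARY KEY, ethnicity_name TEXT);
    CREATE TABLE inquiries (inquiry_id INTEGER PRIMARY KEY, ethnicity_id INTEGER);
    INSERT INTO options_ethnicity VALUES (1, 'White'), (2, 'Hispanic'), (3, 'African/American');
    INSERT INTO inquiries VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
""")

# Start from the lookup table so every ethnicity survives the join;
# COUNT(column) skips the NULLs produced for unmatched rows.
rows = conn.execute("""
    SELECT oe.ethnicity_name, COUNT(i.ethnicity_id) AS count
    FROM options_ethnicity oe
    LEFT JOIN inquiries i ON i.ethnicity_id = oe.ethnicity_id
    GROUP BY oe.ethnicity_id, oe.ethnicity_name
    ORDER BY oe.ethnicity_id
""").fetchall()

for name, count in rows:
    print(name, count)
```

With the five sample inquiries this prints White 3, Hispanic 2 and African/American 0.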
In the real database table, there are some entries in the inquiries table where the ethnicity_id is NULL, i.e. the ethnicity was not recorded. Any idea on how to get these null values to be counted so that they can be shown?
I think I understand the issue you're facing:
SELECT oe.ethnicity_name,
COUNT(i.inquiry_id) AS count
FROM (SELECT t.ethnicity_name,
t.ethnicity_id
FROM OPTIONS_ETHNICITY t
UNION ALL
SELECT 'NULL placeholder' AS ethnicity_name,
NULL AS ethnicity_id) oe
LEFT JOIN INQUIRIES i ON i.ethnicity_id <=> oe.ethnicity_id
GROUP BY oe.ethnicity_id
The join uses MySQL's NULL-safe equality operator <=>, because a plain = never matches NULL to NULL, and it counts inquiry_id, because COUNT(i.ethnicity_id) would skip the NULL rows. This will pick up all the NULL ethnicity_id instances and attribute them to the "NULL placeholder" group, i.e.:
ethnicity_name | COUNT
------------------------
White | 3
Hispanic | 2
NULL placeholder | ?
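A plain equality join never matches NULL to NULL, so the placeholder row only collects the unrecorded inquiries if the comparison is NULL-safe. Here is a rough sketch of that idea using Python's sqlite3 module; SQLite is my stand-in for MySQL, and its IS operator plays the role of MySQL's <=>. Inquiries 6 and 7 are hypothetical rows I added to have something with no ethnicity recorded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options_ethnicity (ethnicity_id INTEGER PRIMARY KEY, ethnicity_name TEXT);
    CREATE TABLE inquiries (inquiry_id INTEGER PRIMARY KEY, ethnicity_id INTEGER);
    INSERT INTO options_ethnicity VALUES (1, 'White'), (2, 'Hispanic'), (3, 'African/American');
    -- Inquiries 6 and 7 are hypothetical rows with no recorded ethnicity.
    INSERT INTO inquiries VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2), (6, NULL), (7, NULL);
""")

# "IS" is SQLite's NULL-safe comparison, so the placeholder's NULL id
# matches inquiries whose ethnicity_id is NULL. Counting inquiry_id
# (never NULL) makes those rows show up in the total.
rows = conn.execute("""
    SELECT oe.ethnicity_name, COUNT(i.inquiry_id) AS count
    FROM (SELECT ethnicity_id, ethnicity_name FROM options_ethnicity
          UNION ALL
          SELECT NULL, 'NULL placeholder') oe
    LEFT JOIN inquiries i ON i.ethnicity_id IS oe.ethnicity_id
    GROUP BY oe.ethnicity_id, oe.ethnicity_name
    ORDER BY oe.ethnicity_id IS NULL, oe.ethnicity_id
""").fetchall()

for name, count in rows:
    print(name, count)
```

The placeholder group ends up with a count of 2, while African/American still shows 0.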

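You counted a string literal instead of the column. COUNT('inquiries.ethnicity_id') counts the quoted string, which is never NULL, so every row is counted, including the unmatched African/American row. Count the column itself: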
SELECT options_ethnicity.ethnicity_name, COUNT(inquiries.ethnicity_id) AS count
FROM inquiries
RIGHT JOIN options_ethnicity ON options_ethnicity.ethnicity_id = inquiries.ethnicity_id
GROUP BY options_ethnicity.ethnicity_id
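The difference between counting the quoted string and counting the column can be checked side by side. This is a small sketch with Python's sqlite3 module (SQLite standing in for MySQL, which is my assumption); it uses a LEFT JOIN from the options table, which is equivalent to the RIGHT JOIN above, and the column aliases literal_count and column_count are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE options_ethnicity (ethnicity_id INTEGER PRIMARY KEY, ethnicity_name TEXT);
    CREATE TABLE inquiries (inquiry_id INTEGER PRIMARY KEY, ethnicity_id INTEGER);
    INSERT INTO options_ethnicity VALUES (1, 'White'), (2, 'Hispanic'), (3, 'African/American');
    INSERT INTO inquiries VALUES (1, 1), (2, 1), (3, 1), (4, 2), (5, 2);
""")

# literal_count counts a constant string (one per joined row, even the
# NULL-padded one); column_count counts only non-NULL ethnicity_id values.
rows = conn.execute("""
    SELECT oe.ethnicity_name,
           COUNT('inquiries.ethnicity_id') AS literal_count,
           COUNT(i.ethnicity_id) AS column_count
    FROM options_ethnicity oe
    LEFT JOIN inquiries i ON i.ethnicity_id = oe.ethnicity_id
    GROUP BY oe.ethnicity_id, oe.ethnicity_name
    ORDER BY oe.ethnicity_id
""").fetchall()

for row in rows:
    print(row)
```

For African/American the literal count is 1 (the wrong number from the question) and the column count is 0.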

Why don't you "reverse" your query?
SELECT
options_ethnicity.ethnicity_name,
COUNT(inquiries.ethnicity_id) AS count
FROM
options_ethnicity
Left Join inquiries On options_ethnicity.ethnicity_id = inquiries.ethnicity_id
GROUP BY
options_ethnicity.ethnicity_id
You still might need a Coalesce call, but to me, this query makes more sense for what you're trying to accomplish.

Related

SQL query that returns values based on lookups of id in another table - with nullable values

I found this question which is very similar, but mine goes a little further. I'm going to use his example but expand it a little more. Basically I'm adding another column for
I have 2 tables:
dbo.Events
ID StartingLocation EndingLocation
1 1 null
2 2 1
dbo.EventsLocation
ID LocationName
1 Room 1
2 Room 2
What I want to do is write a query that will give me a result that looks like this:
ID StartingLocation EndingLocation
1 Room 1 null
2 Room 2 Room 1
I know I need to do some type of (inner?) join. But I'm getting stuck on the fact that I need to insert the data into two columns, and the fact that a value in EndingLocation can be null.
What I've tried:
SELECT Events.id AS EventID, EventsLocation.LocationName AS StartLocation, EventsLocation.LocationName AS EndLocation
FROM Events
INNER JOIN EventsLocation
on Events.StartingLocation=EventsLocation.id
AND Events.EndingLocation=EventsLocation.id
but this gives me no results. If I chop off the AND condition, I get the following table that just repeats the StartingLocation twice.
EventID StartLocation EndLocation
1 Room 1 Room 1
2 Room 2 Room 2
Can anyone help me get on the right track?
You want two joins. Just in case one of the columns is NULL, I recommend LEFT JOINs:
SELECT e.id AS EventID, els.LocationName AS StartLocation, ele.LocationName AS EndLocation
FROM Events e LEFT JOIN
EventsLocation els
ON e.StartingLocation = els.id LEFT JOIN
EventsLocation ele
ON e.EndingLocation = ele.id;
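As a quick check of the two-join idea, here is a sketch that loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module (SQLite is my substitute for SQL Server here, and the dbo schema prefix is dropped). Each join gets its own alias of the lookup table, one for the start and one for the end.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EventsLocation (ID INTEGER PRIMARY KEY, LocationName TEXT);
    CREATE TABLE Events (ID INTEGER PRIMARY KEY, StartingLocation INTEGER, EndingLocation INTEGER);
    INSERT INTO EventsLocation VALUES (1, 'Room 1'), (2, 'Room 2');
    INSERT INTO Events VALUES (1, 1, NULL), (2, 2, 1);
""")

# Join the lookup table twice under different aliases; LEFT JOIN keeps
# event 1 even though its EndingLocation is NULL.
rows = conn.execute("""
    SELECT e.ID AS EventID,
           els.LocationName AS StartLocation,
           ele.LocationName AS EndLocation
    FROM Events e
    LEFT JOIN EventsLocation els ON e.StartingLocation = els.ID
    LEFT JOIN EventsLocation ele ON e.EndingLocation = ele.ID
    ORDER BY e.ID
""").fetchall()

for row in rows:
    print(row)
```

Event 1 comes back with None (NULL) as its end location instead of disappearing, which is what an INNER JOIN would have done.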

How to Limit Results Per Match on a Left Join - SQL Server

I have a table with student info [STU] and a table with parent info [PAR]. I want to return an email address for each student, but just one. So I run this query:
SELECT [STU].[ID], [PAR].[EM]
FROM (SELECT [STU].* FROM DB1.STU)
STU LEFT JOIN (SELECT [PAR].* FROM DB1.PAR) PAR ON [STU].[ID] = [PAR].[ID]
This gives me the below table:
Student ID ParentEmail
1 jim#email.com
1 sarah#email.com
2 paul#email.com
2 tim#email.com
3 bill#email.com
3 frank#email.com
3 joyce#email.com
4 greg#email.com
5 tony#email.com
5 sam#email.com
Each student has multiple parent emails, but I only want one. In other words, I want the output to look like this:
Student ID ParentEmail
1 jim#email.com
2 paul#email.com
3 frank#email.com
4 greg#email.com
5 sam#email.com
I've tried so many things. I've tried using GROUP BY and MIN/MAX and I've tried complex CASE statements, and I've tried COALESCE but I just can't seem to figure it out.
I think OUTER APPLY is the simplest method:
SELECT [STU].[ID], [PAR].[EM]
FROM DB1.STU OUTER APPLY
(SELECT TOP (1) [PAR].*
FROM DB1.PAR
WHERE [STU].[ID] = [PAR].[ID]
) PAR;
Normally, there would be an ORDER BY in the subquery, to give you control over which email you want -- the longest, shortest, oldest, or whatever. Without an ORDER BY it returns one arbitrary email per student, which is what you are asking for.
If you just want one column from the parent table, a simple approach is a correlated subquery:
select
s.id student_id,
(select max(p.em) from db1.par p where p.id = s.id) parent_email
from db1.stu s
This gives you the greatest parent email per student in sort order (use min for the smallest). A student with no parent row gets NULL.
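For anyone who wants to try the one-row-per-student idea without SQL Server, here is a sketch using Python's sqlite3 module. SQLite has no OUTER APPLY or TOP, so I swapped in a correlated subquery with ORDER BY ... LIMIT 1, which plays the same role. The tables are simplified to the two columns used in the question, the rows are a subset of the sample data, and student 6 is a hypothetical student with no parent on file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stu (id INTEGER PRIMARY KEY);
    CREATE TABLE par (id INTEGER, em TEXT);
    -- Student 6 is hypothetical: no parent rows at all.
    INSERT INTO stu VALUES (1), (3), (4), (6);
    INSERT INTO par VALUES
        (1, 'jim#email.com'), (1, 'sarah#email.com'),
        (3, 'bill#email.com'), (3, 'frank#email.com'), (3, 'joyce#email.com'),
        (4, 'greg#email.com');
""")

# The correlated subquery returns at most one row per student; the
# ORDER BY makes the choice deterministic (alphabetically first email).
rows = conn.execute("""
    SELECT s.id AS student_id,
           (SELECT p.em FROM par p
            WHERE p.id = s.id
            ORDER BY p.em
            LIMIT 1) AS parent_email
    FROM stu s
    ORDER BY s.id
""").fetchall()

for row in rows:
    print(row)
```

Changing the ORDER BY inside the subquery is how you would pick a different email per student.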

JOIN query, SQL Server is dropping some rows of my first table

I have two tables, customer_details and address_details. I want to display customer details with their corresponding address, so I was using a LEFT JOIN. But when I execute this query, SQL Server drops the rows where street_no in customer_details doesn't match street_no in address_details, and displays only the rows where street_no of customer_details = street_no of address_details. I need to display the complete customer_details table, and if street_no doesn't match it should display an empty string or anything. Am I doing anything wrong in my SQL join?
Table customer_details:
case_id customer_name mob_no street_no
-------------------------------------------------
1 John 242342343 4324234234234
1 Rohan 343233333 43332
1 Ankit 234234233 2342332423433
1 Suresh 234234324 2342342342342
1 Ranjeet 343424323 32233
1 Ramu 234234333 2342342342343
Table address_details:
s_no street_no address city case_id
------------------------------------------------------
1 4324234234234 Roni road Delhi 1
2 2342332423433 Natan street Lucknow 1
3 2342342342342 Koliko road Herdoi 1
SQL JOIN query:
select
a.*, b.address
from
customer_details a
left join
address_details b on a.street_no = b.street_no
where
b.case_id = 1
Now that it became clear that you used b.case_id=1, I will explain why it filters:
The LEFT JOIN itself returns some rows that contain all NULL values for table b in the result set, which is what you want and expect.
But by using WHERE b.case_id=1, the rows containing NULL values for table b are filtered out because none of them matches the condition (all those rows have b.case_id=NULL so they don't match).
It might work to instead use WHERE a.case_id=1, but we don't know if a.case_id and b.case_id are always the same value for matching rows (they might not be; and if they are always the same, then we just identified a potential redundancy).
There are two ways to fix this for sure.
(1) Move b.case_id = 1 into the left join condition:
left join address_details b on a.street_no = b.street_no and b.case_id = 1
(2) Keep b.case_id = 1 in the WHERE but also allow for NULLED-out b values:
left join address_details b on a.street_no = b.street_no
where b.case_id = 1
or b.street_no IS NULL
Personally I'd go for (1) because that is the most clear way to express that you want to filter b on two conditions, without affecting the rows of a that are being returned.
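The effect of putting the filter in WHERE versus in the join condition is easy to reproduce. Below is a sketch using Python's sqlite3 module (SQLite in place of SQL Server is my assumption) with three of the customers and two of the addresses from the question, trimmed to the columns that matter; where_rows and on_rows are names I made up for the two results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_details (customer_name TEXT, street_no TEXT);
    CREATE TABLE address_details (street_no TEXT, address TEXT, case_id INTEGER);
    INSERT INTO customer_details VALUES
        ('John', '4324234234234'), ('Rohan', '43332'), ('Ankit', '2342332423433');
    INSERT INTO address_details VALUES
        ('4324234234234', 'Roni road', 1), ('2342332423433', 'Natan street', 1);
""")

# Filter in WHERE: the NULL-padded row for Rohan fails b.case_id = 1
# and is discarded, so the LEFT JOIN behaves like an INNER JOIN.
where_rows = conn.execute("""
    SELECT a.customer_name, b.address
    FROM customer_details a
    LEFT JOIN address_details b ON a.street_no = b.street_no
    WHERE b.case_id = 1
    ORDER BY a.customer_name
""").fetchall()

# Filter in ON: it only restricts which address rows can match,
# so every customer is still returned.
on_rows = conn.execute("""
    SELECT a.customer_name, b.address
    FROM customer_details a
    LEFT JOIN address_details b ON a.street_no = b.street_no AND b.case_id = 1
    ORDER BY a.customer_name
""").fetchall()

print(where_rows)
print(on_rows)
```

Only the second result keeps Rohan, with None (NULL) for the address.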
I do think that Wilhelm Poggenpohl's answer is kind of right. You just need to change the last join condition from a.case_id=1 to b.case_id=1:
select a.* , b.address
from customer_details a
left join address_details b on a.street_no=b.street_no
and b.case_id=1
This query will show every row from customer_details, plus the corresponding address if street_no matches and the address meets the condition case_id=1.
This is because of the where clause. Try this:
select a.* , b.address
from customer_details a
left join address_details b on a.street_no=b.street_no
and a.case_id=1

How to have multiple tables with multiple joins

I have three tables that I need to join together and get a combination of results. I have tried using left/right joins but they don't give the desired results.
For example:
Table 1 - STAFF
id name
1 John
2 Fred
Table 2 - STAFFMOBILERIGHTS
id staffid mobilerightsid rights
--this table is empty--
Table 3 - MOBILERIGHTS
id rightname
1 Login
2 View
and what I need is this as the result...
id name id staffid mobilerightsid rights id rightname
1 John null null null null 1 login
1 John null null null null 2 View
2 Fred null null null null 1 login
2 Fred null null null null 2 View
I have tried the following :
SELECT *
FROM STAFFMOBILERIGHTS SMR
RIGHT JOIN STAFF STA
ON STA.STAFFID = SMR.STAFFID
RIGHT JOIN MOBILERIGHTS MRI
ON MRI.ID = SMR.MOBILERIGHTSID
But this only returns two rows as follows:
id name id staffid mobilerightsid rights id rightname
null null null null null null 1 login
null null null null null null 2 View
Can what I am trying to achieve be done and if so how?
Thanks
From your comment it's now clear you want a cross join (include all rows from staff and mobilerights). Something like this should do it:
SELECT
*
FROM Staff
CROSS JOIN MobileRights
LEFT OUTER JOIN StaffMobileRights ON StaffMobileRights.StaffId = Staff.Id
AND StaffMobileRights.MobileRightsId = MobileRights.Id
The FROM clause specifies that we will be including all rows from the Staff table, and all rows from the MobileRights table. The end result will therefore contain (staff * MobileRights) rows.
To bring in rows from StaffMobileRights we need a join to that table as well. We use a LEFT OUTER join to ensure that we always include the left side (rows in the staff table), but we aren't too bothered if no rows exist on the right side (StaffMobileRights table). If no row exists for the join then null values are returned.
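Here is a sketch of that staff-times-rights result, run through Python's sqlite3 module (SQLite as a stand-in database is my assumption). It writes the cross join explicitly and, as an extra assumption on my part, also matches on mobilerightsid so that a granted right would only show up on its own row once the link table has data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE mobilerights (id INTEGER PRIMARY KEY, rightname TEXT);
    CREATE TABLE staffmobilerights (
        id INTEGER PRIMARY KEY, staffid INTEGER, mobilerightsid INTEGER, rights TEXT);
    INSERT INTO staff VALUES (1, 'John'), (2, 'Fred');
    INSERT INTO mobilerights VALUES (1, 'Login'), (2, 'View');
    -- staffmobilerights is left empty, as in the question.
""")

# CROSS JOIN pairs every staff member with every right; the LEFT JOIN
# then attaches a link row if one exists, or NULLs if it does not.
rows = conn.execute("""
    SELECT s.name, smr.rights, m.rightname
    FROM staff s
    CROSS JOIN mobilerights m
    LEFT JOIN staffmobilerights smr
           ON smr.staffid = s.id AND smr.mobilerightsid = m.id
    ORDER BY s.id, m.id
""").fetchall()

for row in rows:
    print(row)
```

With the empty link table this gives four rows (2 staff x 2 rights), each with None in the rights column.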
What you are probably asking is to see null where there are no rights. Since results always come back in rectangular form, this is the only way to represent it with a simple join.
I changed PaulG's query a bit to always get everything from the STAFF table:
SELECT
*
FROM STAFF
LEFT OUTER JOIN StaffMobileRights ON StaffMobileRights.StaffId = Staff.Id
LEFT OUTER JOIN MobileRights ON MobileRights.Id = StaffMobileRights.MobileRightsId

Remove Duplicates from LEFT OUTER JOIN

My question is quite similar to Restricting a LEFT JOIN, with a variation.
Assuming I have a table SHOP and another table LOCATION. Location is a sort of child table of table SHOP, that has two columns of interest, one is a Division Key (calling it just KEY) and a "SHOP" number. This matches to the Number "NO" in table SHOP.
I tried this left outer join:
SELECT S.NO, L.KEY
FROM SHOP S
LEFT OUTER JOIN LOCATN L ON S.NO = L.SHOP
but I'm getting a lot of duplicates since there are many locations that belong to a single shop. I want to eliminate them and just get a list of "shop, key" entries without duplicates.
The data is correct but duplicates appear as follows:
SHOP KEY
1 XXX
1 XXX
2 YYY
3 ZZZ
3 ZZZ etc.
I would like the data to appear like this instead:
SHOP KEY
1 XXX
2 YYY
3 ZZZ etc.
SHOP table:
NO
1
2
3
LOCATION table:
LOCATION SHOP KEY
L-1 1 XXX
L-2 1 XXX
L-3 2 YYY
L-4 3 ZZZ
L-5 3 ZZZ
(ORACLE 10g Database)
You need to GROUP BY S.NO and L.KEY:
SELECT S.NO, L.KEY
FROM SHOP S
LEFT OUTER JOIN LOCATN L
ON S.NO = L.SHOP
GROUP BY S.NO, L.KEY
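To see the duplicates collapse, here is a sketch with Python's sqlite3 module (SQLite instead of Oracle is my assumption). I renamed the columns NO and KEY to shop_no and div_key, since both words are SQL keywords, and the rows follow the expected output in the question (shop 3 in division ZZZ). SELECT DISTINCT is shown next to GROUP BY because it gives the same result here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shop (shop_no INTEGER PRIMARY KEY);
    CREATE TABLE locatn (location TEXT, shop INTEGER, div_key TEXT);
    INSERT INTO shop VALUES (1), (2), (3);
    INSERT INTO locatn VALUES
        ('L-1', 1, 'XXX'), ('L-2', 1, 'XXX'), ('L-3', 2, 'YYY'),
        ('L-4', 3, 'ZZZ'), ('L-5', 3, 'ZZZ');
""")

base = """
    FROM shop s
    LEFT OUTER JOIN locatn l ON s.shop_no = l.shop
"""

# Plain join: one output row per location, hence the duplicates.
joined = conn.execute(
    "SELECT s.shop_no, l.div_key" + base + "ORDER BY s.shop_no").fetchall()

# GROUP BY on both selected columns keeps one row per (shop, key) pair.
grouped = conn.execute(
    "SELECT s.shop_no, l.div_key" + base
    + "GROUP BY s.shop_no, l.div_key ORDER BY s.shop_no").fetchall()

# DISTINCT is the shorter spelling of the same deduplication.
distinct = conn.execute(
    "SELECT DISTINCT s.shop_no, l.div_key" + base + "ORDER BY s.shop_no").fetchall()

print(len(joined), grouped, distinct)
```

The plain join returns five rows, while both deduplicated variants return one row per shop.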
EDIT: Following the update to your scenario
I think you should be able to do this with a simple subquery (though I haven't tested this against an Oracle database). Something like the following:
UPDATE shop s
SET divnkey = (SELECT DISTINCT L.KEY FROM LOCATN L WHERE S.NO = L.SHOP)
The above will raise an error in the event of a shop being associated with locations that are in multiple divisions.
If you just want to ignore this possibility and select an arbitrary one in that event you could use
UPDATE shop s
SET divnkey = (SELECT MAX(L.KEY) FROM LOCATN L WHERE S.NO = L.SHOP)
I had this problem too but I couldn't use GROUP BY to fix it because I was also returning TEXT type fields. (Same goes for using DISTINCT).
This code gave me duplicates:
select mx.*, case isnull(ty.ty_id,0) when 0 then 'N' else 'Y' end as inuse
from master_x mx
left outer join thing_y ty on mx.rpt_id = ty.rpt_id
I fixed it by rewriting it thusly:
select mx.*,
case when exists (select 1 from thing_y ty where mx.rpt_id = ty.rpt_id) then 'Y' else 'N' end as inuse
from master_x mx
As you can see I didn't care about the data within the 2nd table (thing_y), just whether there was greater than zero matches on the rpt_id within it. (FYI: rpt_id was also not the primary key on the 1st table, master_x).
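Here is a small sketch of why the EXISTS version cannot produce duplicates, using Python's sqlite3 module (SQLite as the database is my assumption, and the rows are made-up sample data). The two-argument isnull() call is replaced by an IS NULL test, since SQLite does not have that function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE master_x (rpt_id INTEGER, title TEXT);
    CREATE TABLE thing_y (ty_id INTEGER PRIMARY KEY, rpt_id INTEGER);
    -- Hypothetical data: report 10 is referenced twice, report 20 never.
    INSERT INTO master_x VALUES (10, 'A'), (20, 'B');
    INSERT INTO thing_y VALUES (1, 10), (2, 10);
""")

# The join emits one row per match, so report 10 appears twice.
joined = conn.execute("""
    SELECT mx.rpt_id,
           CASE WHEN ty.ty_id IS NULL THEN 'N' ELSE 'Y' END AS inuse
    FROM master_x mx
    LEFT OUTER JOIN thing_y ty ON mx.rpt_id = ty.rpt_id
    ORDER BY mx.rpt_id
""").fetchall()

# EXISTS only asks whether any match exists, so each master_x row
# appears exactly once regardless of how many matches there are.
exists_rows = conn.execute("""
    SELECT mx.rpt_id,
           CASE WHEN EXISTS (SELECT 1 FROM thing_y ty WHERE ty.rpt_id = mx.rpt_id)
                THEN 'Y' ELSE 'N' END AS inuse
    FROM master_x mx
    ORDER BY mx.rpt_id
""").fetchall()

print(joined)
print(exists_rows)
```

The join version returns three rows and the EXISTS version returns two, one per row of master_x.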