How to Include Zero in a COUNT() Aggregate? - sql

I have three tables that I join, then filter with WHERE and aggregate with GROUP BY and COUNT. Countries with zero matching rows are missing from the output, and I can't work out how to include them.
Here is the SQLfiddle
http://sqlfiddle.com/#!4/e330ec/7
CURRENT OUTPUT
(UKD) 3
(EUR) 2
(USA) 2
(CHE) 1
EXPECTED OUTPUT
(UKD) 3
(EUR) 2
(IND) 0
(LAO) 0
(USA) 2
(CHE) 1

You can use a RIGHT JOIN as suggested in another answer or you can reorder your joins and use a LEFT JOIN:
SELECT
C.COUNTRY_CODE,
COUNT(GAME_TYPE)
FROM
COUNTRY_TABLE C
LEFT JOIN PLAYER_TABLE P ON P.COUNTRY_ID = C.COUNTRY_ID
LEFT JOIN PLAYER_GAME_TYPE G ON P.PLAYER_ID = G.PLAYER_ID
WHERE
G.GAME_TYPE = 'GOLF'
OR G.GAME_TYPE IS NULL
GROUP BY
C.COUNTRY_CODE;
Note the inclusion of OR G.GAME_TYPE IS NULL in the WHERE clause -- if you only have G.GAME_TYPE = 'GOLF', then desired results will be filtered out after the joins.
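One caveat to the OR ... IS NULL approach can be checked with a small sketch. The sample rows below are made up (they are not the fiddle's data), and Python's built-in sqlite3 module stands in for the database used in the question. Under those assumptions, a country whose only players play something other than golf still disappears with the WHERE version, whereas putting the filter in the ON clause keeps it with a count of 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE COUNTRY_TABLE (COUNTRY_ID INTEGER, COUNTRY_CODE TEXT);
CREATE TABLE PLAYER_TABLE (PLAYER_ID INTEGER, COUNTRY_ID INTEGER);
CREATE TABLE PLAYER_GAME_TYPE (PLAYER_ID INTEGER, GAME_TYPE TEXT);
-- Hypothetical sample data: USA has a golfer, IND has only a tennis player,
-- LAO has no players at all.
INSERT INTO COUNTRY_TABLE VALUES (1, 'USA'), (2, 'IND'), (3, 'LAO');
INSERT INTO PLAYER_TABLE VALUES (10, 1), (20, 2);
INSERT INTO PLAYER_GAME_TYPE VALUES (10, 'GOLF'), (20, 'TENNIS');
""")

# Filter in WHERE: the IND row (GAME_TYPE = 'TENNIS') matches neither branch.
where_version = dict(conn.execute("""
SELECT C.COUNTRY_CODE, COUNT(G.GAME_TYPE)
FROM COUNTRY_TABLE C
LEFT JOIN PLAYER_TABLE P ON P.COUNTRY_ID = C.COUNTRY_ID
LEFT JOIN PLAYER_GAME_TYPE G ON P.PLAYER_ID = G.PLAYER_ID
WHERE G.GAME_TYPE = 'GOLF' OR G.GAME_TYPE IS NULL
GROUP BY C.COUNTRY_CODE
""").fetchall())

# Filter in ON: non-golf rows simply fail to match, so IND keeps a NULL row.
on_version = dict(conn.execute("""
SELECT C.COUNTRY_CODE, COUNT(G.GAME_TYPE)
FROM COUNTRY_TABLE C
LEFT JOIN PLAYER_TABLE P ON P.COUNTRY_ID = C.COUNTRY_ID
LEFT JOIN PLAYER_GAME_TYPE G ON P.PLAYER_ID = G.PLAYER_ID AND G.GAME_TYPE = 'GOLF'
GROUP BY C.COUNTRY_CODE
""").fetchall())

print(where_version)
print(on_version)
```

With this data the WHERE version returns only USA and LAO, while the ON version returns all three countries.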

Alternatively, you can keep your join order and apply these two changes:
convert the second LEFT JOIN to a RIGHT JOIN (the missing country codes live in COUNTRY_TABLE, which sits on the right side of that join)
turn the filter G.GAME_TYPE = 'GOLF' from the WHERE clause into a match condition by moving it into the ON clause
For example:
SELECT C.COUNTRY_CODE, COUNT(GAME_TYPE)
FROM PLAYER_TABLE P
LEFT JOIN PLAYER_GAME_TYPE G
ON P.PLAYER_ID = G.PLAYER_ID
RIGHT JOIN COUNTRY_TABLE C
ON P.COUNTRY_ID = C.COUNTRY_ID
AND G.GAME_TYPE = 'GOLF'
GROUP BY C.COUNTRY_CODE;

Simply changing the join order of the tables solves the problem:
SELECT C.COUNTRY_CODE, COUNT(GAME_TYPE)
FROM COUNTRY_TABLE C -- get all countries
LEFT JOIN PLAYER_TABLE P ON P.COUNTRY_ID = C.COUNTRY_ID -- join all players
LEFT JOIN PLAYER_GAME_TYPE G ON P.PLAYER_ID = G.PLAYER_ID AND G.GAME_TYPE = 'GOLF' -- join only GOLF games
GROUP BY C.COUNTRY_CODE;

This is your query:
SELECT C.COUNTRY_CODE, COUNT(GAME_TYPE)
FROM PLAYER_TABLE P
LEFT JOIN PLAYER_GAME_TYPE G ON P.PLAYER_ID = G.PLAYER_ID
LEFT JOIN COUNTRY_TABLE C ON P.COUNTRY_ID = C.COUNTRY_ID
WHERE G.GAME_TYPE = 'GOLF'
GROUP BY C.COUNTRY_CODE;
This query seems to try to select all players, even those who are not golfers. That doesn't work, however, because WHERE G.GAME_TYPE = 'GOLF' removes all outer-joined rows, so you effectively end up with an inner join (all players who play golf). Finally, you outer join the countries table, which would keep players that don't belong to any country. Is this intended? I don't think so.
What you want is countries, so select from countries. Then properly outer join players and types in order to count them.
SELECT c.country_code, COUNT(g.game_type) as golfers_in_country
FROM country_table c
LEFT JOIN player_table p ON p.country_id = c.country_id
LEFT JOIN player_game_type g ON g.player_id = p.player_id AND g.game_type = 'GOLF'
GROUP BY c.country_code
ORDER BY c.country_code;
You can use a CTE to make this more readable. It is longer, but it makes the intention crystal clear. Structuring your queries like this helps avoid mistakes.
WITH golfers AS
(
SELECT *
FROM player_table
WHERE player_id IN
(
SELECT player_id
FROM player_game_type
WHERE game_type = 'GOLF'
)
)
SELECT c.country_code, COUNT(g.player_id) as golfers_in_country
FROM country_table c
LEFT JOIN golfers g ON g.country_id = c.country_id
GROUP BY c.country_code
ORDER BY c.country_code;
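All of the queries above count a column from the outer-joined side rather than using COUNT(*). The following sketch illustrates why that matters. It uses Python's built-in sqlite3 module and invented sample rows (both are assumptions, not part of the original fiddle): COUNT(*) counts the NULL-extended row that the outer join produces for a country without golfers, while COUNT(column) skips NULLs and yields 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country_table (country_id INTEGER, country_code TEXT);
CREATE TABLE player_table (player_id INTEGER, country_id INTEGER);
CREATE TABLE player_game_type (player_id INTEGER, game_type TEXT);
-- Hypothetical sample data: two golfers in USA, nobody in LAO.
INSERT INTO country_table VALUES (1, 'USA'), (2, 'LAO');
INSERT INTO player_table VALUES (10, 1), (11, 1);
INSERT INTO player_game_type VALUES (10, 'GOLF'), (11, 'GOLF');
""")

rows = conn.execute("""
SELECT c.country_code,
       COUNT(*)           AS joined_rows,        -- counts the NULL-extended row too
       COUNT(g.player_id) AS golfers_in_country  -- ignores NULLs
FROM country_table c
LEFT JOIN player_table p ON p.country_id = c.country_id
LEFT JOIN player_game_type g ON g.player_id = p.player_id AND g.game_type = 'GOLF'
GROUP BY c.country_code
ORDER BY c.country_code
""").fetchall()

for row in rows:
    print(row)
```

Here LAO reports one joined row but zero golfers, which is the value the question asks for.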

Related

How can I write this SQL in a better way?

This is the query and I'm trying to write it in a better way.
Calculate the average number of languages in every country in a region.
CREATE TABLE region3 AS SELECT regions.name, count(country_languages.country_id) FROM regions
RIGHT OUTER JOIN countries on countries.region_id = regions.region_id
RIGHT OUTER JOIN country_languages on countries.country_id = country_languages.country_id
GROUP BY regions.name;
CREATE TABLE region2 AS SELECT regions.name, count(countries.country_id) FROM regions RIGHT OUTER JOIN countries on countries.region_id = regions.region_id GROUP BY regions.name;
SELECT region2.name, region2.count as total_countries, region3.count as langs from region2
LEFT OUTER JOIN region3 on region2.name = region3.name;
SELECT name, ROUND(langs::decimal/total_countries, 1) as avg_lang_count_per_country from regions_new ORDER BY avg_lang_count_per_country DESC;
This is how it should look.
I guess it is as simple as:
SELECT regions.name, AVG(country_language_count) AS average_country_language_count
FROM regions
JOIN (
SELECT countries.region_id, COUNT(*) AS country_language_count
FROM countries
JOIN country_languages on countries.country_id = country_languages.country_id
GROUP BY countries.country_id, countries.region_id
) AS subquery1 ON regions.region_id = subquery1.region_id
GROUP BY regions.name
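To see what this query returns, here is a minimal sketch that runs it with Python's built-in sqlite3 module. The table layouts are inferred from the column names in the question, and the sample rows (including the language column) are invented for illustration. One thing the sketch suggests: because both joins are inner joins, a country with no rows in country_languages is left out of the average rather than counted as 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE regions (region_id INTEGER, name TEXT);
CREATE TABLE countries (country_id INTEGER, region_id INTEGER);
CREATE TABLE country_languages (country_id INTEGER, language TEXT);
-- Hypothetical sample data. Country 4 (Europe) has no languages recorded.
INSERT INTO regions VALUES (1, 'Europe'), (2, 'Asia');
INSERT INTO countries VALUES (1, 1), (2, 1), (4, 1), (3, 2);
INSERT INTO country_languages VALUES
    (1, 'en'), (1, 'fr'), (1, 'de'),
    (2, 'it'),
    (3, 'hi'), (3, 'en'), (3, 'bn'), (3, 'ta');
""")

averages = dict(conn.execute("""
SELECT regions.name, AVG(country_language_count) AS average_country_language_count
FROM regions
JOIN (
    SELECT countries.region_id, COUNT(*) AS country_language_count
    FROM countries
    JOIN country_languages ON countries.country_id = country_languages.country_id
    GROUP BY countries.country_id, countries.region_id
) AS subquery1 ON regions.region_id = subquery1.region_id
GROUP BY regions.name
""").fetchall())

print(averages)
```

Europe averages over countries 1 and 2 only (3 and 1 languages). If countries without languages should count as 0, a LEFT JOIN with COUNT(country_languages.country_id) in the subquery would be one way to get that.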

Column '' in field list is ambiguous. (when JOIN ON)

Column 'company_id' in field list is ambiguous
It does not seem ambiguous to me, and I have no idea what I should fix:
SELECT company_id, companies.name
FROM company_contracts AS contracts
LEFT OUTER JOIN companies ON companies.id = contracts.company_id
LEFT OUTER JOIN order_logs AS logs ON companies.id = logs.company_id;
Because company_id appears in both company_contracts and order_logs, you need to qualify it with the table name or alias:
SELECT contracts.company_id,c1.name
FROM company_contracts as contracts
LEFT OUTER JOIN companies as c1 on c1.id = contracts.company_id
LEFT OUTER JOIN order_logs as logs on c1.id = logs.company_id;
You should qualify all column names in such a query. In addition, if you really want outer joins, the second join condition should refer to the first table, not the second:
SELECT cc.company_id, c.name
FROM company_contracts cc LEFT OUTER JOIN
companies c
ON c.id = cc.company_id LEFT OUTER JOIN
order_logs ol
ON cc.company_id = ol.company_id;
Or, more likely, you want to keep all companies and the query should look like:
SELECT c.id, c.name
FROM companies c LEFT OUTER JOIN
company_contracts cc
ON c.id = cc.company_id LEFT OUTER JOIN
order_logs ol
ON c.id = ol.company_id;
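The ambiguity can be reproduced with a short sketch. It uses Python's built-in sqlite3 module as a stand-in, so the exact error text differs from the message quoted in the question, and the table layouts and sample rows are assumptions based on the column names above. The unqualified company_id fails because two joined tables expose a column of that name, while the qualified version runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE companies (id INTEGER, name TEXT);
CREATE TABLE company_contracts (company_id INTEGER);
CREATE TABLE order_logs (company_id INTEGER);
-- Hypothetical sample data.
INSERT INTO companies VALUES (1, 'Acme');
INSERT INTO company_contracts VALUES (1);
INSERT INTO order_logs VALUES (1);
""")

joins = """
FROM company_contracts AS contracts
LEFT OUTER JOIN companies ON companies.id = contracts.company_id
LEFT OUTER JOIN order_logs AS logs ON companies.id = logs.company_id
"""

# Unqualified: company_id exists in both company_contracts and order_logs.
try:
    conn.execute("SELECT company_id, companies.name" + joins)
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Qualified: the alias says which table's company_id is meant.
qualified_rows = conn.execute(
    "SELECT contracts.company_id, companies.name" + joins
).fetchall()

print(error_message)
print(qualified_rows)
```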

SQL join subquery where condition

How can I effectively subquery a LEFT OUTER JOIN so that only rows that meet a specific condition in the join are included?
I'd like to count only the PPPD rows where converted_at IS NULL. However, when I add PPPD.converted_at IS NULL, the result is more limited than I'd like, because it only includes patient_profiles that have a row with NULL in converted_at. Instead, I'd like a count of all PPPD records where converted_at is NULL.
SELECT P.id, P.gender, P.dob,
count(distinct recommendations.id) AS recommendation_count,
count(distinct PPPD.id) AS community_submissions
FROM patient_profiles AS P
LEFT OUTER JOIN recommendations ON recommendations.patient_profile_id = P.id
LEFT OUTER JOIN patient_profile_potential_doctors AS PPPD ON PPPD.patient_profile_id = P.id
WHERE P.is_test = FALSE
GROUP BY P.id
You need to add the condition in the ON clause:
SELECT P.id, P.gender, P.dob,
count(distinct r.id) AS recommendation_count,
count(distinct PPPD.id) AS community_submissions
FROM patient_profiles P LEFT OUTER JOIN
recommendations r
ON r.patient_profile_id = P.id LEFT OUTER JOIN
patient_profile_potential_doctors PPPD
ON PPPD.patient_profile_id = P.id AND PPPD.converted_at IS NULL
WHERE P.is_test = FALSE
GROUP BY P.id;

Selecting single column multiple times based on different conditions

I have written a SQL query to retrieve required data and it looks like given below:
SELECT distinct p.person_id,p.birth_date,p.gender_code,
wm_concat(distinct r.race_code) as race_code,p.hispanic_latino_code,
c.clinically_diagnosed_code,
wm_concat(distinct c.characteristic_code) as chara_codes,
p.prev_adopted_code,p.age_adopted,
FIRST_VALUE(pe.removed_date) OVER (ORDER BY pe.removed_date),
count(pe.removed_date) as removal_count,
LAST_VALUE(pe.discharge_date) OVER (ORDER BY pe.discharge_date),
LAST_VALUE(pe.removed_date) OVER (ORDER BY pe.removed_date) as latest_removal_date,pe.created_date,
pe.removal_circumstance_code,wm_concat(distinct rr.removal_reason_code) as removal_reasons,
ps.placement_type_code,ps.icpc_placement_flag,pe.caretaker_structure_code
FROM PERSON p left outer join RACE r on p.person_id = r.person_id
left outer join CHARACTERISTIC c on c.person_id = p.person_id
left outer join PLACEMENT_EPISODE pe on p.person_id = pe.child_id
left outer join PLACEMENT_SETTING ps on p.person_id = ps.child_id
left outer join REMOVAL_REASON rr on pe.placement_episode_id = rr.placement_episode_id
GROUP BY p.person_id,p.birth_date,p.gender_code,p.hispanic_latino_code,
c.clinically_diagnosed_code,p.prev_adopted_code,p.age_adopted,pe.removed_date,
pe.discharge_date,pe.removed_date,pe.created_date,pe.removal_circumstance_code,
ps.placement_type_code,ps.icpc_placement_flag,pe.caretaker_structure_code
ORDER BY p.person_id
In the query above, I have already selected the birth date of the person. Now I also want to select birth_date in the SELECT clause for the persons matching the following conditions:
condition 1: p.person_id = pe.primary_caretaker_id
condition 2: p.person_id = pe.secondary_caretaker_id
Can someone tell me how to select these fields (birth_date under the two different conditions) in the existing query?
birth_date is already selected once for the individual person. Now I also want to retrieve birth_date for the primary_caretaker and the secondary_caretaker.
You will need to join to the PERSON table twice more, once for each caretaker:
SELECT distinct p.person_id,p.birth_date,p.gender_code,
wm_concat(distinct r.race_code) as race_code,p.hispanic_latino_code,
c.clinically_diagnosed_code,
wm_concat(distinct c.characteristic_code) as chara_codes,
p.prev_adopted_code,p.age_adopted,
FIRST_VALUE(pe.removed_date) OVER (ORDER BY pe.removed_date),
count(pe.removed_date) as removal_count,
LAST_VALUE(pe.discharge_date) OVER (ORDER BY pe.discharge_date),
LAST_VALUE(pe.removed_date) OVER (ORDER BY pe.removed_date) as latest_removal_date,
pe.created_date,
pe.removal_circumstance_code,wm_concat(distinct rr.removal_reason_code) as removal_reasons,
ps.placement_type_code,ps.icpc_placement_flag,pe.caretaker_structure_code,
primCare.birth_date as primary_carer_birth_date,
secCare.birth_date as secondary_carer_birth_date
FROM PERSON p left outer join RACE r on p.person_id = r.person_id
left outer join CHARACTERISTIC c on c.person_id = p.person_id
left outer join PLACEMENT_EPISODE pe on p.person_id = pe.child_id
left outer join PERSON primCare on primCare.person_id = pe.primary_caretaker_id
left outer join PERSON secCare on secCare.person_id = pe.secondary_caretaker_id
left outer join PLACEMENT_SETTING ps on p.person_id = ps.child_id
left outer join REMOVAL_REASON rr on pe.placement_episode_id = rr.placement_episode_id
GROUP BY p.person_id,p.birth_date,p.gender_code,p.hispanic_latino_code,
c.clinically_diagnosed_code,p.prev_adopted_code,p.age_adopted,pe.removed_date,
pe.discharge_date,pe.removed_date,pe.created_date,pe.removal_circumstance_code,
ps.placement_type_code,ps.icpc_placement_flag,pe.caretaker_structure_code, primCare.birth_date, secCare.birth_date
ORDER BY p.person_id
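The core of this technique, joining the same table several times under different aliases, is easier to see on a cut-down version. The sketch below uses Python's built-in sqlite3 module with a simplified two-table schema and invented rows (all assumptions; the real tables have many more columns). Note that PLACEMENT_EPISODE is joined before the two PERSON aliases, because their ON clauses refer to pe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PERSON (person_id INTEGER, birth_date TEXT);
CREATE TABLE PLACEMENT_EPISODE (
    child_id INTEGER,
    primary_caretaker_id INTEGER,
    secondary_caretaker_id INTEGER
);
-- Hypothetical sample data: child 1 has a primary caretaker (person 2)
-- and no secondary caretaker.
INSERT INTO PERSON VALUES (1, '2010-05-01'), (2, '1980-03-15');
INSERT INTO PLACEMENT_EPISODE VALUES (1, 2, NULL);
""")

rows = conn.execute("""
SELECT p.person_id,
       p.birth_date,
       primCare.birth_date AS primary_carer_birth_date,
       secCare.birth_date  AS secondary_carer_birth_date
FROM PERSON p
LEFT OUTER JOIN PLACEMENT_EPISODE pe ON p.person_id = pe.child_id
LEFT OUTER JOIN PERSON primCare ON primCare.person_id = pe.primary_caretaker_id
LEFT OUTER JOIN PERSON secCare ON secCare.person_id = pe.secondary_caretaker_id
ORDER BY p.person_id
""").fetchall()

for row in rows:
    print(row)
```

Each alias behaves like an independent copy of PERSON, so a missing caretaker simply shows up as NULL (None in Python) in that alias's column.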

What kind of SQL join would this be?

I need to go to two tables to get the appropriate info
exp_member_groups
-group_id
-group_title
exp_members
-member_id
-group_id
I have the appropriate member_id
So I need to check the members table, get the group_id, then go to the groups table and match up the group_id and get the group_title from that.
INNER JOIN:
SELECT exp_member_groups.group_title
FROM exp_members
INNER JOIN exp_member_groups ON exp_members.group_id = exp_member_groups.group_id
WHERE exp_members.member_id = #memberId
SELECT g.group_title
FROM exp_members m
JOIN exp_member_groups g ON m.group_id = g.group_id
WHERE m.member_id = #YourMemberId
If there is always a matching group, or you only want rows where there is one, then it would be an INNER JOIN:
SELECT g.group_title
FROM exp_members m
INNER JOIN
exp_member_groups g
ON m.group_id = g.group_id
WHERE m.member_id = #member_id
If you want rows even where group_id doesn't match, then it is a LEFT JOIN - replace INNER JOIN with LEFT JOIN in the above.
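The difference between the two join types shows up only for a member whose group_id has no matching group. A minimal sketch, assuming Python's built-in sqlite3 module and made-up rows; the helper group_titles is hypothetical, and the `?` placeholder takes the place of the member-id parameter in the queries above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exp_member_groups (group_id INTEGER, group_title TEXT);
CREATE TABLE exp_members (member_id INTEGER, group_id INTEGER);
-- Hypothetical sample data: member 200 points at a group that does not exist.
INSERT INTO exp_member_groups VALUES (1, 'Admins');
INSERT INTO exp_members VALUES (100, 1), (200, 99);
""")

def group_titles(join_kind, member_id):
    """Run the lookup with the given join keyword ('INNER JOIN' or 'LEFT JOIN')."""
    sql = f"""
        SELECT g.group_title
        FROM exp_members m
        {join_kind} exp_member_groups g ON m.group_id = g.group_id
        WHERE m.member_id = ?
    """
    return conn.execute(sql, (member_id,)).fetchall()

print(group_titles("INNER JOIN", 100))
print(group_titles("INNER JOIN", 200))
print(group_titles("LEFT JOIN", 200))
```

For member 200 the INNER JOIN returns no row at all, while the LEFT JOIN returns one row whose group_title is NULL.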