How to include zero results when querying one single table? - sql

I have a table called Apartments that has three columns: apartment_type, person, date. Each row records the apartment type selected by a certain person and the date. I need to count how many people picked each of the apartment types. Some apartment types were picked by nobody, so their count should be 0.
Here is my query:
SELECT apartment_type, COUNT(*) AS TOTAL
FROM Apartments
GROUP BY apartment_type
It works great, but it doesn't include apartment types with a value of 0. Please, help me to correct this query.

If some apartment_type has a population of 0, your table will not contain any record with that type, so you must join against another table in which all apartment types exist. Alternatively, use a union to create the zero-population entries.
Something like:
-- COUNT(person) ignores the NULL placeholder rows, so unused types count as 0
SELECT apartment_type, COUNT(person) AS TOTAL
FROM (SELECT apartment_type, person FROM Apartments UNION ALL SELECT DISTINCT apartment_type, NULL AS person FROM SomeTableWithFullListOfTypes) AS tmp
GROUP BY apartment_type

I generally agree with Nosyara's answer, but I don't agree with his sample query with the union all. I'm not sure it works, and it's certainly too complicated.
As stated already, if you don't have a table with all the possible apartment types, create one. Then you can write your query using a simple left join:
select t.apartment_type, count(a.apartment_type) as total
from apartment_types t
left join apartments a
on a.apartment_type = t.apartment_type
group by t.apartment_type
Note how count(*) was replaced by count(a.apartment_type). That change is necessary to have an accurate count in the case where you don't have apartments for a certain apartment type.
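To make the difference between count(*) and count(a.apartment_type) concrete, here is a small sketch that runs the left join against an in-memory SQLite database through Python's built-in sqlite3 module. The sample rows and type names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apartment_types (apartment_type TEXT PRIMARY KEY);
    CREATE TABLE apartments (apartment_type TEXT, person TEXT, date TEXT);
    INSERT INTO apartment_types VALUES ('studio'), ('loft'), ('penthouse');
    INSERT INTO apartments VALUES
        ('studio', 'ann', '2020-01-01'),
        ('studio', 'bob', '2020-01-02'),
        ('loft',   'cat', '2020-01-03');
""")

query = """
    SELECT t.apartment_type, COUNT({arg}) AS total
    FROM apartment_types t
    LEFT JOIN apartments a ON a.apartment_type = t.apartment_type
    GROUP BY t.apartment_type
    ORDER BY t.apartment_type
"""
# COUNT(*) also counts the NULL-extended row the left join produces
# for a type that nobody picked.
count_star = conn.execute(query.format(arg="*")).fetchall()
# COUNT(column) skips NULLs, so the unused type gets 0.
count_col = conn.execute(query.format(arg="a.apartment_type")).fetchall()

print(count_star)
print(count_col)
```

In this sample, 'penthouse' has no rows in apartments: COUNT(*) reports 1 for it, while COUNT(a.apartment_type) reports 0.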

SELECT apartment_type.apartment_type, COUNT(apartment.apartment_type) AS TOTAL
FROM apartment_type
LEFT JOIN apartment
ON apartment_type.apartment_type = apartment.apartment_type
GROUP BY apartment_type.apartment_type
Using a left join will give you everything from the left side of the join (so all your types) and anything from the right that matches.

Related

SQL Join query brings multiple results

I have 2 tables. One lists all the goals scored in the English Premier League and who scored it and the other, the squad numbers of each player in the league.
I want to do a join so that the table sums the total number of goals by player name, and then looks up the squad number of that player.
Table A [goal_scorer] (image not included)
Table B [squads] (image not included)
I have the SQL query below:
SELECT goal_scorer.*,sum(goal_scorer.number),squads.squad_number
FROM goal_scorer
Inner join squads on goal_scorer.name=squads.player
group by goal_scorer.name
The issue I have is that in the result, the sum of 'number' is too high and seems to include duplicate rows. For example, Aaron Lennon has scored 33 times, not 264 as shown below.

Maybe you want something like this?
SELECT goal_scorer.*, s.total, squads.squad_number
FROM goal_scorer
LEFT JOIN (
SELECT name, sum(number) as total
FROM goal_scorer
GROUP BY name
) s on s.name = goal_scorer.name
JOIN squads on goal_scorer.name=squads.player
There are other ways to do it, but here I'm using a sub-query to get the total by player. NB: Most modern SQL platforms support window functions, which can do this too.
Also, you probably don't need the LEFT on the sub-query join (since we know there will always be at least one row per name), but I put it in case your actual use case is more complicated.
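One possible cause of an inflated sum is join fan-out: if squads holds more than one row per player, every goal row is repeated once per matching squads row before SUM runs. The sketch below reproduces that under the assumption (not confirmed by the question) that squads has one row per player per season; the season column and the sample data are hypothetical. It uses Python's built-in sqlite3 module only to make the SQL runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE goal_scorer (name TEXT, number INTEGER);
    -- Hypothetical: the same player appears once per season in squads.
    CREATE TABLE squads (player TEXT, squad_number INTEGER, season TEXT);
    INSERT INTO goal_scorer VALUES ('Player A', 2), ('Player A', 1), ('Player B', 4);
    INSERT INTO squads VALUES
        ('Player A', 7, '2014'), ('Player A', 7, '2015'), ('Player B', 9, '2015');
""")

# Joining first and summing afterwards counts each goal row once per squads row.
inflated = conn.execute("""
    SELECT g.name, SUM(g.number)
    FROM goal_scorer g
    JOIN squads s ON s.player = g.name
    GROUP BY g.name
    ORDER BY g.name
""").fetchall()

# Summing in a derived table before the join keeps the per-player total intact.
correct = conn.execute("""
    SELECT DISTINCT t.name, t.total, s.squad_number
    FROM (SELECT name, SUM(number) AS total FROM goal_scorer GROUP BY name) t
    JOIN squads s ON s.player = t.name
    ORDER BY t.name
""").fetchall()

print(inflated)
print(correct)
```

Unlike the query above, this variant returns one row per player rather than one row per goal record; which shape is wanted depends on the report.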

Can you try this if you are using SQL Server?
select *
from squads
outer apply (
select sum(goal_scorer.number) as score
from goal_scorer where goal_scorer.name = squads.player
) x

Why use many columns in GROUP BY and HAVING clause in these examples

Given the schema here I'm trying to understand and solve the below 3 sql queries as I'm confused:
1- Present a table giving the names of the countries with ≥ 50% urbanization
rates, their urbanization rates, and their per capita GDP. Note that
urbanization rate is the percentage of population living in cities. Do not
count cities with NULL values for population.
SELECT country.name, round(sum(city.population)/country.population, 3) AS urban, round(gdp/country.population, 3) AS gdppc
FROM city
INNER JOIN country ON code = country
INNER JOIN economy ON code = economy.country
WHERE city.population IS NOT NULL
GROUP BY country.name, country.population, economy.gdp
HAVING round(sum(city.population)/country.population, 3) >= 0.5
ORDER BY urban DESC;
In the above query, why do I need to include country.population and economy.gdp in the GROUP BY? If I try using just country.name in the GROUP BY, I get an error saying I should include the others.
2- Show organizations that have as members all the European countries with over 50 million people?
SELECT name
FROM organization
INNER JOIN (SELECT organization
FROM country
INNER JOIN encompasses
ON code = encompasses.country
INNER JOIN ismember
ON code = ismember.country
WHERE population > 50000000 AND continent = 'Europe'
GROUP BY organization
HAVING count(ismember.country) = (SELECT count(*)
FROM country
INNER JOIN encompasses
ON code = country
WHERE population > 50000000 AND continent = 'Europe'))
AS innerQuery
ON abbreviation = innerQuery.organization;
Why do I need the HAVING part above?
3- Insert a new organization called “Tivoli” and a trigger that says if Germany joins “Tivoli” then so too must the UK and France. Insert Germany into the “Tivoli” organization. Confirm proper behavior.
I tried the below script but it's not working, any advice please?
do $$
begin
IF(NOT EXISTS ( SELECT 1 FROM organization WHERE organization."name" = 'Tivoli' AND organization.country = 'D' ))
BEGIN
INSERT INTO organization VALUES ('Tivoli','Tivoli organization',NULL,'F',NULL,NULL);
INSERT INTO organization VALUES ('Tivoli','Tivoli organization',NULL,'GB',NULL,NULL);
END;
end $$

1) You used country.population and economy.gdp in the SELECT list outside of any aggregate function (such as COUNT(), AVG() or SUM()), and you have a GROUP BY. Every column you select has to appear in the GROUP BY or inside an aggregate function.
2) Because you were asked to show organizations that have ALL of the countries with over 50 million people as members. With HAVING, you check that the organization's number of such member countries equals the total number of such countries.
3) organization."name" = 'Tivoli'
It's supposed to be:
organization.name

First of all, you should limit a post to one question, not three. But here are some pointers for all three:
In the above query, why do I need to include country.population and economy.gdp in the GROUP BY? If I try using just country.name in the GROUP BY, I get an error saying I should include the others.
This is a requirement. A group by country.name alone would work (in Postgres 9.1+) only if the other two fields are known to be functionally dependent on country.name. But probably country.name is not the primary key of the country table, so in theory it is possible to have two records in that table with the same name, but different population.
The rule is as follows:
When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.
This has been implemented since PostgreSQL 9.1.
Why do I need the HAVING part above?
Because a condition on an aggregate (count in this case) can only be evaluated after grouping, and thus cannot be expressed in the WHERE clause. In this case the HAVING clause makes sure that the organization has not just some of the large European countries as members, but all of them.
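The effect of the HAVING clause can be seen in a reduced sketch. The table big_country is a hypothetical stand-in for "European countries with over 50 million people", and the organization names and memberships are made up. The SQL is run through Python's built-in sqlite3 module only so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the filtered list of large European countries.
    CREATE TABLE big_country (code TEXT PRIMARY KEY);
    CREATE TABLE ismember (country TEXT, organization TEXT);
    INSERT INTO big_country VALUES ('D'), ('F'), ('GB');
    INSERT INTO ismember VALUES
        ('D', 'ORG1'), ('F', 'ORG1'), ('GB', 'ORG1'),
        ('D', 'ORG2'), ('F', 'ORG2');
""")

def organizations(use_having):
    # The HAVING condition compares each group's member count with the
    # total number of large countries.
    having = "HAVING COUNT(m.country) = (SELECT COUNT(*) FROM big_country)"
    sql = f"""
        SELECT m.organization
        FROM ismember m
        JOIN big_country b ON b.code = m.country
        GROUP BY m.organization
        {having if use_having else ''}
        ORDER BY m.organization
    """
    return [row[0] for row in conn.execute(sql)]

some_members = organizations(use_having=False)  # at least one large country
all_members = organizations(use_having=True)    # every large country

print(some_members, all_members)
```

Without HAVING, ORG2 is returned even though GB is missing from it; with HAVING, only ORG1 remains.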
I tried the below script but it's not working, any advice please?
Without a proper database schema, it is not possible to provide you with the correct SQL, but from the ER diagram it seems that the organization table does not have a country field. Instead, the ismember table connects organizations with countries. You would insert only one organization, but several ismember records (one per member country involved).
It is also better to name the fields in your INSERT statement, so it is clear which value corresponds to which field.
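As a rough sketch of that shape (one organization row, several ismember rows, named columns, and a trigger on ismember), here is a version that runs on SQLite through Python's built-in sqlite3 module. The two-table schema and its column names are simplified assumptions, not the real schema, and the trigger uses SQLite syntax; PostgreSQL would need a trigger function written in PL/pgSQL instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified schema (the real tables have more columns).
    CREATE TABLE organization (abbreviation TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE ismember (
        country TEXT,
        organization TEXT REFERENCES organization (abbreviation),
        PRIMARY KEY (country, organization)
    );

    -- When Germany ('D') joins Tivoli, add the UK and France as well.
    CREATE TRIGGER tivoli_germany_joins
    AFTER INSERT ON ismember
    WHEN NEW.organization = 'Tivoli' AND NEW.country = 'D'
    BEGIN
        INSERT OR IGNORE INTO ismember (country, organization) VALUES ('GB', 'Tivoli');
        INSERT OR IGNORE INTO ismember (country, organization) VALUES ('F', 'Tivoli');
    END;

    -- One organization row, with the columns named explicitly.
    INSERT INTO organization (abbreviation, name) VALUES ('Tivoli', 'Tivoli');
    -- Inserting Germany fires the trigger.
    INSERT INTO ismember (country, organization) VALUES ('D', 'Tivoli');
""")

members = [row[0] for row in conn.execute(
    "SELECT country FROM ismember WHERE organization = 'Tivoli' ORDER BY country"
)]
print(members)
```

Querying ismember afterwards is the "confirm proper behavior" step: all three countries should be listed for Tivoli.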

How to show count value as 0 on rows removed with WHERE (microsoft access)

I have two tables: one represents the survey with its location, and the other the people interviewed (there are many people for each survey). I'm trying to show the count of people over a certain age in each location; however, some locations don't have anyone over that age and therefore don't show up in the resulting table. I would like the count to show zero if no one is over that age.
I have:
SELECT a.location, Count([b.age])
FROM Survey AS a LEFT JOIN person AS b ON a.surveyid = b.surveyid
Where b.age >= 85
GROUP BY a.location;
I realize that the WHERE clause is what is eliminating the zero count results but I can't figure out the subquery I would need.

Use conditional aggregation instead. That means moving the boolean condition from the WHERE clause into the argument of the aggregation function:
SELECT s.location,
SUM(IIF(p.age >= 85, 1, 0))
FROM Survey AS s LEFT JOIN
person AS p
ON s.surveyid = p.surveyid
GROUP BY s.location;
Notice that I changed the table aliases to be abbreviations of the table names. This makes the query easier to follow.
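The difference between filtering in WHERE and counting conditionally can be checked with a small sketch. It runs on SQLite through Python's built-in sqlite3 module and uses the standard CASE expression in place of Access's IIF; the locations and ages are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE survey (surveyid INTEGER PRIMARY KEY, location TEXT);
    CREATE TABLE person (surveyid INTEGER, age INTEGER);
    INSERT INTO survey VALUES (1, 'North'), (2, 'South'), (3, 'West');
    INSERT INTO person VALUES (1, 90), (1, 40), (2, 30);
""")

# The WHERE clause runs before grouping and discards every row whose age is
# below 85 or NULL, so locations without a match vanish entirely.
where_filter = conn.execute("""
    SELECT s.location, COUNT(p.age)
    FROM survey s LEFT JOIN person p ON s.surveyid = p.surveyid
    WHERE p.age >= 85
    GROUP BY s.location ORDER BY s.location
""").fetchall()

# Conditional aggregation keeps every location and adds 1 only for matches.
conditional = conn.execute("""
    SELECT s.location, SUM(CASE WHEN p.age >= 85 THEN 1 ELSE 0 END)
    FROM survey s LEFT JOIN person p ON s.surveyid = p.surveyid
    GROUP BY s.location ORDER BY s.location
""").fetchall()

print(where_filter)
print(conditional)
```

Here 'South' has only younger people and 'West' has no people at all; both disappear with the WHERE filter and both show 0 with conditional aggregation.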

Querying records that meet multiple criteria

Hi, I'm trying to write a query and I'm struggling to figure out how to go about it.
I have a suppliers table and a supplier parts table. I want to write a query that lists only the suppliers that have all of the specified related parts in the supplier parts table. If a supplier doesn't have all of the specified parts, it should not be listed.
At the moment I have written a very basic query that lists the supplier if it has any related supplier part that meets the criteria.
SELECT id ,name
FROM
efacdb.dbo.suppliers INNER JOIN [efacdb].[dbo].[spmatrix] ON
id = spmsupp
WHERE spmpart
IN ('ALUM_5083', 'ALUM_6082')
I only want to show the supplier if they have both parts related. Does anyone know how I could do this?

Use a subquery that counts distinct occurrences:
select * from suppliers s
where 2 = (select count(distinct spmpart) from spmatrix
where id = spmsupp and spmpart in ('ALUM_5083', 'ALUM_6082'))

As a note, you can modify your query to get what you want, just by using an aggregation:
SELECT id, name
FROM efacdb.dbo.suppliers INNER JOIN
[efacdb].[dbo].[spmatrix]
ON id = spmsupp
WHERE spmpart IN ('ALUM_5083', 'ALUM_6082')
GROUP BY id, name
HAVING MIN(spmpart) <> MAX(spmpart);
If you know there are no duplicates, then having count(*) = 2 also solves the problem.
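The caveat about duplicates can be illustrated with a short sketch that compares three HAVING conditions on the same grouped query. It runs on SQLite through Python's built-in sqlite3 module; the supplier names and rows are invented, and supplier 3 deliberately lists the same part twice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE suppliers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE spmatrix (spmsupp INTEGER, spmpart TEXT);
    INSERT INTO suppliers VALUES (1, 'Has both'), (2, 'Only one'), (3, 'One part twice');
    INSERT INTO spmatrix VALUES
        (1, 'ALUM_5083'), (1, 'ALUM_6082'),
        (2, 'ALUM_5083'),
        (3, 'ALUM_6082'), (3, 'ALUM_6082');
""")

def suppliers_having(condition):
    # Same join and filter each time; only the HAVING condition changes.
    sql = f"""
        SELECT s.name
        FROM suppliers s JOIN spmatrix m ON s.id = m.spmsupp
        WHERE m.spmpart IN ('ALUM_5083', 'ALUM_6082')
        GROUP BY s.id, s.name
        HAVING {condition}
        ORDER BY s.id
    """
    return [row[0] for row in conn.execute(sql)]

by_count_star = suppliers_having("COUNT(*) = 2")
by_distinct = suppliers_having("COUNT(DISTINCT m.spmpart) = 2")
by_min_max = suppliers_having("MIN(m.spmpart) <> MAX(m.spmpart)")

print(by_count_star, by_distinct, by_min_max)
```

With the duplicate row present, COUNT(*) = 2 wrongly accepts the supplier that lists one part twice, while COUNT(DISTINCT ...) = 2 and MIN <> MAX both return only the supplier with both parts.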

Compare 2 columns in 2 tables with DISTINCT value

I am creating a report with visual business intelligence tooling.
I am trying to count how many users have been created under an org_id.
The report covers multiple org_ids, and I am having difficulty counting how many users have been created under each particular org_id.
TBL_USER
USER_ID
0001122
0001234
ABC9999
DEF4545
DEF7676
TBL_ORG
ORG_ID
000
ABC
DEF
EXPECTED OUTPUT
TBL_RESULT
USER_CREATED
000 - 2
ABC - 1
DEF - 2
As I understand it, I need a nested SELECT, but so far I have come up with nothing that works.
SELECT COUNT(TBL_USER.USER_ID) AS Expr1
FROM TBL_USER INNER JOIN TBL_ORG
WHERE TBL_USER.USER_ID LIKE 'TBL_ORG.ORG_ID%')
This is totally wrong, but I hope it gives a clue about what I am after.

It looks like the USER_ID value is the concatenation of your ORG_ID and something to make it unique. I'm assuming this is from a COTS product and nothing a human would have built.
Your desire is to find out how many entries there are by department. In SQL, when you read the word "by" in a requirement, that implies grouping. The action you want to take is to get a count, and the reserved word for that is COUNT. Unless you need something out of TBL_ORG, I see no need to join to it.
SELECT
LEFT(T.USER_ID, 3) AS USER_CREATED
, COUNT(1) AS GroupCount
FROM
TBL_USER AS T
GROUP BY
LEFT(T.USER_ID, 3)
Anything that isn't in an aggregate (COUNT, SUM, AVG, etc) must be in your GROUP BY.
SQLFiddle
I updated the fiddle to also show how you could link to TBL_ORG if you need an element from the row in that table.
-- Need to have the friendly name for an org
-- Now we need to do the join
SELECT
LEFT(T.USER_ID, 3) AS USER_CREATED
, O.SOMETHING_ELSE
, COUNT(1) AS GroupCount
FROM
TBL_USER AS T
-- inner join assumes there will always be a match
INNER JOIN
TBL_ORG AS O
-- Using a function on a column is a performance killer
ON O.ORG_ID = LEFT(T.USER_ID, 3)
GROUP BY
LEFT(T.USER_ID, 3)
, O.SOMETHING_ELSE;
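As a quick check of the grouping query, the sketch below loads the sample USER_ID values from the question into an in-memory SQLite database via Python's built-in sqlite3 module. SQLite has no LEFT function, so substr(USER_ID, 1, 3) stands in for LEFT(USER_ID, 3); that substitution is specific to this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TBL_USER (USER_ID TEXT);
    INSERT INTO TBL_USER VALUES
        ('0001122'), ('0001234'), ('ABC9999'), ('DEF4545'), ('DEF7676');
""")

# Group on the three-character prefix, which is assumed to be the ORG_ID.
rows = conn.execute("""
    SELECT substr(T.USER_ID, 1, 3) AS USER_CREATED, COUNT(*) AS GroupCount
    FROM TBL_USER AS T
    GROUP BY substr(T.USER_ID, 1, 3)
    ORDER BY USER_CREATED
""").fetchall()

for org_id, group_count in rows:
    print(f"{org_id} - {group_count}")
```

The printed lines match the expected output listed in the question (000 - 2, ABC - 1, DEF - 2).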