CASE statement leaving NULL - sql

I am trying to make it so there are no NULLs in the JobTitle column. When I do it without the CASE, I get two JobTitle columns (one for males and one for females) and some have NULL. I want just one column listing all the job titles, with the total number of males and females next to it in their own columns. (This is using the AdventureWorks db.)
USE AdventureWorks2019
GO
select count(hre.gender) AS NumberOfFemales, JobTitle
into #FemalesPerJobTitle
from HumanResources.employee as hre
group by JobTitle, Gender
having gender = 'F';
SELECT COUNT(HRE.Gender) AS NumberOfMales, JobTitle
INTO #MalesPerJobTitle
FROM HumanResources.Employee AS HRE
GROUP BY JobTitle, Gender
HAVING gender = 'M';
SELECT FPJ.NumberOfFemales AS Females
, MPJ.NumberOfMales AS Males
,
CASE
WHEN MPJ.JobTitle IS NULL THEN FPJ.JobTitle
END AS JobTitle
FROM #FemalesPerJobTitle AS FPJ
FULL OUTER JOIN #MalesPerJobTitle AS MPJ
ON FPJ.JobTitle = MPJ.JobTitle

SELECT
JobTitle,
SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) AS Males,
SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) AS Females
FROM HumanResources.Employee
GROUP BY JobTitle
You could probably put a COALESCE around the SUMs if it is returning any nulls and you want zeros instead
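A minimal sketch of the conditional-aggregation query above, run against SQLite through Python's sqlite3 module rather than SQL Server. The Employee table and its five rows are made up for illustration and only stand in for HumanResources.Employee.

```python
import sqlite3

# Hypothetical miniature stand-in for HumanResources.Employee.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (JobTitle TEXT, Gender TEXT)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?)",
    [
        ("Buyer", "F"),
        ("Buyer", "M"),
        ("Buyer", "M"),
        ("Janitor", "M"),
        ("Recruiter", "F"),
    ],
)

# One pass over the table: each SUM counts only the rows matching its CASE,
# so every job title appears once and a missing gender shows up as 0.
rows_by_title = conn.execute(
    """
    SELECT JobTitle,
           SUM(CASE WHEN Gender = 'M' THEN 1 ELSE 0 END) AS Males,
           SUM(CASE WHEN Gender = 'F' THEN 1 ELSE 0 END) AS Females
    FROM Employee
    GROUP BY JobTitle
    ORDER BY JobTitle
    """
).fetchall()

for row in rows_by_title:
    print(row)
```

Because each CASE has an ELSE 0, a title held only by men still gets a row with Females = 0 instead of NULL.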

It sounds like you want to produce a result set that lists all job titles, with the count of men and women holding each title.
If you have a table holding the complete list of job titles (which the database should since job title is/would seem to be a proper entity), I would use that as the source for the job title column.
But if you don't have that, I'd do something like the following and use a derived table to get the distinct set of job titles across both source tables:
select coalesce( f.NumberOfFemales , 0 ) Females ,
coalesce( m.NumberOfMales , 0 ) Males ,
jt.JobTitle JobTitle
from ( select JobTitle from #FemalesPerJobTitle
UNION select JobTitle from #MalesPerJobTitle
) jt
left join #FemalesPerJobTitle f on f.JobTitle = jt.JobTitle
left join #MalesPerJobTitle m on m.JobTitle = jt.JobTitle
Another way to go about it (and probably easier for others to understand) would be to do something like this:
select JobTitle = t.JobTitle,
NumberOfFemales = sum( t.NumberOfFemales ) ,
NumberOfMales = sum( t.NumberOfMales )
from ( select JobTitle,
NumberOfFemales = NumberOfFemales,
NumberOfMales = 0
from #FemalesPerJobTitle
UNION ALL
select JobTitle,
NumberOfFemales = 0 ,
NumberOfMales = NumberOfMales
from #MalesPerJobTitle
) t
group by t.JobTitle
Here, the derived table uses UNION ALL to not eliminate duplicates because (1) there shouldn't be any, (2) the query will be more efficient, and (3) the group by clause will take care of the roll-up.
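As a rough illustration of the UNION ALL roll-up, here is the same shape of query run in SQLite via Python's sqlite3. The two tables stand in for the #FemalesPerJobTitle and #MalesPerJobTitle temp tables, and their contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins for the two temp tables, with made-up counts.
conn.executescript(
    """
    CREATE TABLE FemalesPerJobTitle (JobTitle TEXT, NumberOfFemales INTEGER);
    CREATE TABLE MalesPerJobTitle (JobTitle TEXT, NumberOfMales INTEGER);
    INSERT INTO FemalesPerJobTitle VALUES ('Buyer', 1), ('Recruiter', 1);
    INSERT INTO MalesPerJobTitle VALUES ('Buyer', 2), ('Janitor', 1);
    """
)

# Each branch pads the "other" gender with 0; GROUP BY then merges the
# two partial rows for a title into one.
rolled_up = conn.execute(
    """
    SELECT t.JobTitle,
           SUM(t.NumberOfFemales) AS NumberOfFemales,
           SUM(t.NumberOfMales)   AS NumberOfMales
    FROM (
        SELECT JobTitle, NumberOfFemales, 0 AS NumberOfMales
        FROM FemalesPerJobTitle
        UNION ALL
        SELECT JobTitle, 0 AS NumberOfFemales, NumberOfMales
        FROM MalesPerJobTitle
    ) AS t
    GROUP BY t.JobTitle
    ORDER BY t.JobTitle
    """
).fetchall()

for row in rolled_up:
    print(row)
```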

Here you go. I used CTEs, which make more sense than temp tables here.
WITH Fcount AS
(
select count(hre.gender) AS NumberOfFemales, JobTitle
from HumanResources.employee as hre
WHERE gender = 'F'
group by JobTitle
), Mcount AS
(
SELECT COUNT(HRE.Gender) AS NumberOfMales, JobTitle
FROM HumanResources.Employee AS HRE
WHERE gender = 'M'
GROUP BY JobTitle
), titles AS
(
SELECT JobTitle FROM Fcount
UNION
SELECT JobTitle FROM Mcount
)
SELECT titles.JobTitle,
FPJ.NumberOfFemales AS Females,
MPJ.NumberOfMales AS Males
FROM titles
LEFT JOIN Fcount AS FPJ ON titles.JobTitle = FPJ.JobTitle
LEFT JOIN Mcount AS MPJ ON titles.JobTitle = MPJ.JobTitle

Related

SQL statement to select all dead people

I have tables:
City:
zip, name,...
People:
id, city_zip(refers to city.zip), born_time, dead_time
I need to select data about cities where ALL people from that city are dead: born_time NOT NULL AND dead_time < NOW()
because we do not assume that someone is dead if we do not have information.
You can use not exists:
select c.*
from city c
where not exists (
select 1
from people p1
where p1.city_zip = c.zip and (dead_time is null or dead_time > now())
)
This would also return cities that have no people at all. If that's something you want to avoid, then another option is aggregation:
select c.*
from city c
inner join people p on p.city_zip = c.zip
group by c.zip
having max(case when dead_time is null or dead_time > now() then 1 else 0 end) = 0
select c.* ... from city c ... group by c.zip is valid standard SQL (assuming that zip is the primary key of table city). However, not all databases support it, in which case you will need to enumerate the columns you want in both the select and group by clauses.
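To make the difference between the two queries concrete, here is a small sketch using SQLite through Python's sqlite3. The three cities and four people are invented, datetime('now') stands in for now(), and the columns are listed explicitly in GROUP BY so the query does not depend on primary-key grouping support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up data: one city where everyone is dead, one with a living
# person (dead_time unknown), and one with no people at all.
conn.executescript(
    """
    CREATE TABLE city (zip TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE people (id INTEGER PRIMARY KEY, city_zip TEXT, dead_time TEXT);
    INSERT INTO city VALUES
        ('111', 'Ghosttown'), ('222', 'Mixedville'), ('333', 'Emptyburg');
    INSERT INTO people VALUES
        (1, '111', '1990-01-01 00:00:00'),
        (2, '111', '2001-05-05 00:00:00'),
        (3, '222', '1985-03-03 00:00:00'),
        (4, '222', NULL);
    """
)

ALIVE = "(p.dead_time IS NULL OR p.dead_time > datetime('now'))"

# NOT EXISTS: no living person found, which is also true for empty cities.
via_not_exists = [r[0] for r in conn.execute(f"""
    SELECT c.name FROM city c
    WHERE NOT EXISTS (
        SELECT 1 FROM people p WHERE p.city_zip = c.zip AND {ALIVE}
    )
    ORDER BY c.name
""")]

# Aggregation: the inner join drops cities that have no people.
via_aggregation = [r[0] for r in conn.execute(f"""
    SELECT c.name FROM city c
    JOIN people p ON p.city_zip = c.zip
    GROUP BY c.zip, c.name
    HAVING MAX(CASE WHEN {ALIVE} THEN 1 ELSE 0 END) = 0
    ORDER BY c.name
""")]

print(via_not_exists)   # ['Emptyburg', 'Ghosttown']
print(via_aggregation)  # ['Ghosttown']
```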

How do I get counts for different values of the same column with a single totals row, using Postgres SQL?

So I have a list of children and I want to create a list of how many boys and girls there are in each school and a final total count of how many there are.
My query including logic
select sch.id as ID, sch.address as Address, count(p.sex for male) as boycount, count(p.sex for female) as girlcount
from student s
join school sch on sch.studentid = s.id
join person p on p.studentid = s.id
Obviously I know this query won't work, but I don't know where to go from here. I thought about a nested query, but I'm having difficulty getting it to work.
I found a similar question for postgres 9.4
Postgres nested SQL query to count field. However I have Postgres 9.3.
Final result would be like :
WARNING
Depending on the data type of the school ID, you may get an error with this union. Consider casting the school ID as a varchar if it is of type INT.
SELECT
sch.id as ID, /*Consider casting this as a varchar if it conflicts with
the 'Total' text being unioned in the next query*/
sch.address as Address,
SUM(CASE
WHEN p.sex = 'male'
THEN 1
ELSE 0
END) AS BoyCount,
SUM(CASE
WHEN p.sex = 'female'
THEN 1
ELSE 0
END) AS GirlCount
FROM
student s
JOIN school sch
ON sch.studentid = s.id
JOIN person p
ON p.studentid = s.id
GROUP BY
sch.id,
sch.address
UNION ALL
SELECT
'Total' as ID,
NULL as Address,
SUM(CASE
WHEN p.sex = 'male'
THEN 1
ELSE 0
END) AS BoyCount,
SUM(CASE
WHEN p.sex = 'female'
THEN 1
ELSE 0
END) AS GirlCount
FROM
person p
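Here is a hedged, runnable sketch of the "per-school rows plus a Total row" pattern, using SQLite via Python's sqlite3. The schema is simplified and hypothetical (a student row carries school_id and sex directly), since the real table layout isn't fully shown in the question. The id is cast to text so that it can share a column with the 'Total' label, which is the cast the warning above refers to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified schema and data, not the asker's real tables.
conn.executescript(
    """
    CREATE TABLE school (id INTEGER PRIMARY KEY, address TEXT);
    CREATE TABLE student (id INTEGER PRIMARY KEY, school_id INTEGER, sex TEXT);
    INSERT INTO school VALUES (1, '1 Oak St'), (2, '2 Elm St');
    INSERT INTO student VALUES
        (1, 1, 'male'), (2, 1, 'female'), (3, 1, 'female'), (4, 2, 'male');
    """
)

# First branch: one row per school. Second branch: a single totals row.
# ORDER BY 1 puts 'Total' last here only because digits sort before letters.
school_counts = conn.execute(
    """
    SELECT CAST(sch.id AS TEXT) AS id,
           sch.address,
           SUM(CASE WHEN s.sex = 'male'   THEN 1 ELSE 0 END) AS boycount,
           SUM(CASE WHEN s.sex = 'female' THEN 1 ELSE 0 END) AS girlcount
    FROM school sch
    JOIN student s ON s.school_id = sch.id
    GROUP BY sch.id, sch.address
    UNION ALL
    SELECT 'Total',
           NULL,
           SUM(CASE WHEN s.sex = 'male'   THEN 1 ELSE 0 END),
           SUM(CASE WHEN s.sex = 'female' THEN 1 ELSE 0 END)
    FROM student s
    ORDER BY 1
    """
).fetchall()

for row in school_counts:
    print(row)
```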

SQL Query - Unsure How to Fix Logical Error

Edit: Sorry! I am using Microsoft SQL Server.
For clarification, you can have a department named "x" with a list of jobs, a department named "y" with a different list of jobs, etc.
I also need to use >= ALL instead of TOP 1 or MAX because I need it to return more than one value if necessary (if job1 has 20 employees, job2 has 20 employees and they are both the biggest values, they should both return).
In my query I'm trying to find the most common jobTitle and the number of employees that work under this jobTitle, which is under the department 'Research and Development'. The query I've written consists of joins to be able to return the necessary data.
The problem I am having is with the WHERE statement. The HAVING COUNT(JobTitle) >= ALL is finding the biggest number of employees that work under a job, however the problem is that my WHERE statement is saying the Department must be 'Research and Development', but the job with the most amount of employees comes from a different department, and thus the output produces only the column names and nothing else.
I want to redo the query so that it returns the job with the largest amount of employees that comes from the Research and Development department.
I know this is probably pretty simple, I'm a noob :3 Thanks a lot for the help!
SELECT JobTitle, COUNT(JobTitle) AS JobTitleCount, Department
FROM HumanResources.Employee AS EMP JOIN
HumanResources.EmployeeDepartmentHistory AS HIST
ON EMP.BusinessEntityID = HIST.BusinessEntityID JOIN
HumanResources.Department AS DEPT
ON HIST.DepartmentID = DEPT.DepartmentID
WHERE Department = 'Research and Development'
GROUP BY JobTitle, Department
HAVING COUNT(JobTitle) >= ALL (
SELECT COUNT(JobTitle) FROM HumanResources.Employee
GROUP BY JobTitle
)
If you only want one row, then a typical method is:
SELECT JobTitle, COUNT(*) AS JobTitleCount
FROM HumanResources.Employee AS EMP JOIN
HumanResources.EmployeeDepartmentHistory AS HIST
ON EMP.BusinessEntityID = HIST.BusinessEntityID JOIN
HumanResources.Department AS DEPT
ON HIST.DepartmentID = DEPT.DepartmentID
WHERE Department = 'Research and Development'
GROUP BY JobTitle
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
Although FETCH FIRST 1 ROW ONLY is the ANSI standard, some databases spell it LIMIT or even SELECT TOP (1).
Note that I removed DEPARTMENT both from the SELECT and the GROUP BY. It seems redundant.
And, if I had to guess, your query is going to overstate results because of the history table. If this is the case, ask another question, with sample data and desired results.
EDIT:
In SQL Server, I would recommend using window functions. To get the one top job title:
SELECT JobTitle, JobTitleCount
FROM (SELECT JobTitle, COUNT(*) AS JobTitleCount,
ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) as seqnum
FROM HumanResources.Employee AS EMP JOIN
HumanResources.EmployeeDepartmentHistory AS HIST
ON EMP.BusinessEntityID = HIST.BusinessEntityID JOIN
HumanResources.Department AS DEPT
ON HIST.DepartmentID = DEPT.DepartmentID
WHERE Department = 'Research and Development'
GROUP BY JobTitle
) j
WHERE seqnum = 1;
To get all such titles, when there are duplicates, use RANK() or DENSE_RANK() instead of ROW_NUMBER().
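A small sketch of the ROW_NUMBER() versus RANK() difference, run in SQLite (3.25 or later, for window functions) via Python's sqlite3. The rd_employee table is a made-up stand-in for the joined, department-filtered rows, with two job titles tied for first place.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows already filtered to one department.
conn.execute("CREATE TABLE rd_employee (JobTitle TEXT)")
conn.executemany(
    "INSERT INTO rd_employee VALUES (?)",
    [("Engineer",), ("Engineer",), ("Scientist",), ("Scientist",), ("Manager",)],
)

RANKED = """
    WITH counts AS (
        SELECT JobTitle, COUNT(*) AS JobTitleCount
        FROM rd_employee
        GROUP BY JobTitle
    ),
    ranked AS (
        SELECT JobTitle, JobTitleCount,
               RANK()       OVER (ORDER BY JobTitleCount DESC) AS rnk,
               ROW_NUMBER() OVER (ORDER BY JobTitleCount DESC, JobTitle) AS seqnum
        FROM counts
    )
"""

# RANK() gives tied titles the same rank, so both come back.
top_with_ties = conn.execute(
    RANKED + "SELECT JobTitle, JobTitleCount FROM ranked WHERE rnk = 1 ORDER BY JobTitle"
).fetchall()

# ROW_NUMBER() numbers rows uniquely, so exactly one title comes back.
top_single = conn.execute(
    RANKED + "SELECT JobTitle, JobTitleCount FROM ranked WHERE seqnum = 1"
).fetchall()

print(top_with_ties)  # [('Engineer', 2), ('Scientist', 2)]
print(top_single)     # [('Engineer', 2)]
```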
with employee_counts as (
select
hist.DepartmentID, emp.JobTitle, count(*) as cnt,
case when dept.Department = 'Research and Development' then 1 else 0 end as is_rd
from HumanResources.Employee as emp
inner join HumanResources.EmployeeDepartmentHistory as hist
on hist.BusinessEntityID = emp.BusinessEntityID
inner join HumanResources.Department as dept
on dept.DepartmentID = hist.DepartmentID
group by
hist.DepartmentID, dept.Department, emp.JobTitle
)
select * from employee_counts
where is_rd = 1 and cnt = (
select max(cnt) from employee_counts
/* where is_rd = 1 */ -- ??
);

SQL count sum and joining table

I ran into a question from my SQL class but could not come up with a solution. My query needs to show ONLY the movies that have the same number of female and male actors.
I have three tables:
(table: field1, field2):
Casting: actor_number, movie_number
Actor_List: id, name, gender
Movie_List: id, movie_name
The sub-query uses a little CASE() trick to increment counts conditionally (i.e. countif). The sub-query factoring syntax means we only execute the query once.
with cte as (
select m.movie_name
, sum(case when a.gender = 'M' then 1 else 0 end) as male_tot
, sum(case when a.gender = 'F' then 1 else 0 end) as female_tot
from casting c
join movie_list m
on c.movie_number = m.id
join actor_list a
on c.actor_number = a.id
group by m.movie_name
)
select cte.*
from cte
where cte.male_tot = cte.female_tot ;
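For a quick check of the approach, here is the same query run in SQLite through Python's sqlite3 on a tiny invented data set: one movie with a balanced cast and one without. The table and column names follow the question; the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample rows for the three tables described in the question.
conn.executescript(
    """
    CREATE TABLE actor_list (id INTEGER PRIMARY KEY, name TEXT, gender TEXT);
    CREATE TABLE movie_list (id INTEGER PRIMARY KEY, movie_name TEXT);
    CREATE TABLE casting (actor_number INTEGER, movie_number INTEGER);
    INSERT INTO actor_list VALUES (1, 'Ann', 'F'), (2, 'Bob', 'M'), (3, 'Cal', 'M');
    INSERT INTO movie_list VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO casting VALUES (1, 1), (2, 1), (1, 2), (2, 2), (3, 2);
    """
)

# Count men and women per movie in one pass, then keep the balanced ones.
balanced_movies = conn.execute(
    """
    WITH cte AS (
        SELECT m.movie_name,
               SUM(CASE WHEN a.gender = 'M' THEN 1 ELSE 0 END) AS male_tot,
               SUM(CASE WHEN a.gender = 'F' THEN 1 ELSE 0 END) AS female_tot
        FROM casting c
        JOIN movie_list m ON c.movie_number = m.id
        JOIN actor_list a ON c.actor_number = a.id
        GROUP BY m.movie_name
    )
    SELECT cte.* FROM cte WHERE cte.male_tot = cte.female_tot
    """
).fetchall()

print(balanced_movies)  # [('Alpha', 1, 1)]
```

Note that grouping by movie_name merges movies that share a title; grouping by m.id as well would keep them apart.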

SELECT count(*) then output empty or not empty with yes or no

"Find if SALES department has its locations in exactly the same cities as TRANSPORT department" (assume that an empty result means YES and any nonempty result means NO).
My database has this table:
//DEPTLOC
DNAME CITY
----------------
SALES LONDON
TRANSPORT LONDON
TRANSPORT BOSTON
SCIENCE BOSTON
I'm not sure how to write the query, but here is my attempt:
SELECT COUNT(*) FROM DEPTLOC WHERE
(DNAME='SALES' AND DNAME='TRANSPORT') AND
What should I write after the AND operator to get output something like this?
//Display
YES . (<--- LONDON , got SALES and TRANSPORT in same cities)
You can use DECODE:
SELECT Decode(Count(*), 0, 'No',
'Yes')
FROM {rest of your query}
DECODE compares expr to each search value one by one. If expr is equal to a search, then Oracle Database returns the corresponding result. If no match is found, then Oracle returns default. If default is omitted, then Oracle returns null.
To pick up whether you have sales and transport you should use a self join.
By aliasing the DEPTLOC table as 'sales' and 'transport' and joining on the city name, you will get a list of the cities that have both sales and transport.
SELECT sales.CITY
FROM DEPTLOC sales
INNER JOIN DEPTLOC transport ON sales.CITY = transport.CITY
WHERE sales.DNAME = 'SALES'
AND transport.DNAME = 'TRANSPORT'
To get a list of yes and no you can use the above as a sub-query and use a case statement. I have added the distinct clause to eliminate any duplicate entries.
SELECT DISTINCT CITY,
CASE WHEN salesandtransport.CITY IS NULL THEN 'NO' ELSE 'YES' END as DISPLAY
FROM DEPTLOC
LEFT JOIN
(SELECT sales.CITY AS CITY
FROM DEPTLOC sales
INNER JOIN DEPTLOC transport ON sales.CITY = transport.CITY
WHERE sales.DNAME = 'SALES'
AND transport.DNAME = 'TRANSPORT') salesandtransport ON DEPTLOC.CITY = salesandtransport.CITY
There are many ways to approach this problem. The following uses a full outer join to compare the two sets of cities. The idea is that any non-matching city will produce a NULL value. This is detected by comparing count(*) to count(xx.City):
select (case when count(*) = count(dls.City) and count(*) = count(dlt.City)
then 'Yes'
else 'No'
end)
from DeptLoc dls full outer join
DeptLoc dlt
on dls.Dname = 'Sales' and dlt.Dname = 'Transport' and dls.City = dlt.City;
I think you should group by CITY and output Cities where count of DEPTS is not equal to count of DEPTS you need (2 in this case):
SELECT CASE WHEN
EXISTS(SELECT CITY FROM DEPTLOC
WHERE DNAME IN('SALES','TRANSPORT')
GROUP BY CITY
HAVING COUNT(*)<>2
)
THEN 'NO'
ELSE 'YES'
END
FROM DUAL;
SQL Fiddle demo
This simple query should work just fine for you:
SELECT CITY,'YES' FROM DEPTLOC
WHERE CITY IN(SELECT CITY FROM DEPTLOC WHERE DNAME='TRANSPORT')
AND DNAME ='SALES'
The output lists the cities that have both TRANSPORT and SALES, and looks like:
LONDON YES
I guess, this is what you are after:
SELECT CITY,COUNT(*),CASE WHEN COUNT(*)=2 THEN 'YES' ELSE 'NO' END AS STATUS
FROM DEPTLOC WHERE
(DNAME='SALES' OR DNAME='TRANSPORT')
AND CITY='LONDON'
GROUP BY CITY
Result:
CITY COUNT(*) STATUS
LONDON 2 YES
See result in SQL Fiddle.
EXPLANATION:
The STATUS field will be YES only if the query returns 2 rows for that city, i.e., 1 for Sales and 1 for Transport.
This question is a little bit ambiguous, but I now think that this is what is asked: Give 'YES' if for all cities where there is a SALES, there is also a TRANSPORT, and NO otherwise.
I believe this query should achieve this.
SELECT CASE WHEN Cnt = 0 THEN 'NO' ELSE 'YES' END AS Answer FROM
(SELECT COUNT(*) AS Cnt FROM DUAL
WHERE NOT EXISTS
(SELECT * FROM DeptLoc dl
WHERE dl.DName = 'SALES' AND NOT EXISTS
(SELECT * FROM DeptLoc
WHERE City = dl.City AND DName = 'TRANSPORT')));
Modified from SQL Fiddle by Raging Bull: http://www.sqlfiddle.com/#!4/ff7a8/1
Here's another alternative. A FULL OUTER JOIN is made between the rows having DNAME as SALES and TRANSPORT. Their differences are found if there are any rows with CITY as NULL in the other set, as follows:
SELECT DECODE(COUNT(*), 0, 'YES', 'NO') Display
FROM
(
SELECT s.DNAME as s_DNAME, s.CITY as s_CITY, t.DNAME as t_DNAME, t.CITY as t_CITY
FROM
(SELECT DNAME, CITY
FROM DEPTLOC
WHERE DNAME = 'SALES') s
FULL OUTER JOIN
(SELECT DNAME, CITY
FROM DEPTLOC
WHERE DNAME = 'TRANSPORT'
) t
ON s.CITY = t.CITY
WHERE s.CITY IS NULL OR t.CITY IS NULL
);
Here's the SQL Fiddle.
The SQL Fiddle also contains a query to list the cities with only SALES or TRANSPORT.
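For comparison, here is a sketch of the "exactly the same cities" check written as a symmetric difference with EXCEPT, which is a different technique from the FULL OUTER JOIN above. It runs in SQLite via Python's sqlite3 against the sample DEPTLOC rows from the question; the same_cities helper is a name made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DEPTLOC (DNAME TEXT, CITY TEXT)")
conn.executemany(
    "INSERT INTO DEPTLOC VALUES (?, ?)",
    [
        ("SALES", "LONDON"),
        ("TRANSPORT", "LONDON"),
        ("TRANSPORT", "BOSTON"),
        ("SCIENCE", "BOSTON"),
    ],
)


def same_cities(conn, dept_a, dept_b):
    """Return 'YES' if both departments are located in exactly the same cities.

    Hypothetical helper: collects the cities that only one of the two
    departments has; an empty result means the two sets are equal.
    """
    sql = """
        SELECT CASE WHEN COUNT(*) = 0 THEN 'YES' ELSE 'NO' END
        FROM (
            SELECT * FROM (
                SELECT CITY FROM DEPTLOC WHERE DNAME = :a
                EXCEPT
                SELECT CITY FROM DEPTLOC WHERE DNAME = :b
            )
            UNION ALL
            SELECT * FROM (
                SELECT CITY FROM DEPTLOC WHERE DNAME = :b
                EXCEPT
                SELECT CITY FROM DEPTLOC WHERE DNAME = :a
            )
        )
    """
    return conn.execute(sql, {"a": dept_a, "b": dept_b}).fetchone()[0]


# TRANSPORT is also in BOSTON, SALES is not, so the sets differ.
print(same_cities(conn, "SALES", "TRANSPORT"))  # NO
```

On Oracle the equivalent set operator is MINUS, and the outer query would select FROM the inline view in the same way.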