SQL - Group By Distinct Values

Is there a faster way to write the following query?
I'm using Oracle 10g.
Say I have the tables Manufacturer and Car, and I want to count the occurrences of particular values in the column Car.Name for each manufacturer. Here is how I'd do it:
SELECT manuf.Name, COUNT(car1.Name), COUNT(car2.Name), COUNT(car3.Name)
FROM Manufacturer manuf
LEFT JOIN (SELECT * FROM Car c where c.Name = 'Ferrari1') car1 ON manuf.PK = car1.ManufPK
LEFT JOIN (SELECT * FROM Car c where c.Name = 'Ferrari2') car2 ON manuf.PK = car2.ManufPK
LEFT JOIN (SELECT * FROM Car c where c.Name = 'Ferrari3') car3 ON manuf.PK = car3.ManufPK
GROUP BY manuf.Name
Wanted Results:
Manufacturer | Ferrari1 | Ferrari2 | Ferrari3
----------------------------------------------
Fiat | 1 | 0 | 5
Ford | 2 | 3 | 0
I tried this with a few LEFT JOINs and it worked fine, but when I added a lot of them (90+), it became extremely slow (more than 1 minute).
Is there a faster way to write this query?

If you are happy to see the cars counted down the page, try:
select m.Name manufacturer_name,
c.Name car_name,
count(c.Name) -- count(*) would report 1 for a manufacturer with no matching cars
from Manufacturer m
left join Car c
on m.PK = c.ManufPK and c.Name in ('Ferrari1','Ferrari2','Ferrari3')
group by m.Name, c.Name
If you need to see individual cars across the page, try:
select m.Name manufacturer_name,
sum(case c.Name when 'Ferrari1' then 1 else 0 end) Ferrari1_Count,
sum(case c.Name when 'Ferrari2' then 1 else 0 end) Ferrari2_Count,
sum(case c.Name when 'Ferrari3' then 1 else 0 end) Ferrari3_Count
from Manufacturer m
left join Car c
on m.PK = c.ManufPK and c.Name in ('Ferrari1','Ferrari2','Ferrari3')
group by m.Name
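A minimal sketch of the conditional-aggregation query above, run against an in-memory SQLite database instead of Oracle so it is self-contained. The sample rows are hypothetical (chosen to reproduce the wanted results, plus a made-up 'Lada' manufacturer with no cars), and generating the SUM(CASE ...) columns from a Python list is only one assumed way to avoid typing 90+ columns by hand.

```python
import sqlite3

# Hypothetical sample data; 'Lada' and the 'Other' car are made up
# to exercise the "no matching cars" and "unlisted name" cases.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Manufacturer (PK INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Car (ManufPK INTEGER, Name TEXT);
INSERT INTO Manufacturer VALUES (1, 'Fiat'), (2, 'Ford'), (3, 'Lada');
INSERT INTO Car VALUES
  (1, 'Ferrari1'),
  (1, 'Ferrari3'), (1, 'Ferrari3'), (1, 'Ferrari3'), (1, 'Ferrari3'), (1, 'Ferrari3'),
  (2, 'Ferrari1'), (2, 'Ferrari1'),
  (2, 'Ferrari2'), (2, 'Ferrari2'), (2, 'Ferrari2'),
  (2, 'Other');
""")

car_names = ["Ferrari1", "Ferrari2", "Ferrari3"]

# One SUM(CASE ...) column per car name; the names are bound as parameters.
count_columns = ",\n       ".join(
    "SUM(CASE c.Name WHEN ? THEN 1 ELSE 0 END)" for _ in car_names
)
placeholders = ", ".join("?" for _ in car_names)

query = f"""
SELECT m.Name,
       {count_columns}
FROM Manufacturer m
LEFT JOIN Car c
       ON m.PK = c.ManufPK AND c.Name IN ({placeholders})
GROUP BY m.Name
ORDER BY m.Name
"""

# Parameters are consumed in order: first the SELECT list, then the IN list.
rows = conn.execute(query, car_names + car_names).fetchall()
for row in rows:
    print(row)
```

The Car table is joined only once, so each car row is read a single time regardless of how many names are counted. A manufacturer with no matching cars still appears, with 0 in every column.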

SELECT manuf.Name, COUNT(DISTINCT c.Name)
FROM Manufacturer manuf
LEFT JOIN Car c ON manuf.PK = c.ManufPK
GROUP BY manuf.Name
OR depending on your needs
SELECT manuf.Name, c.Name, COUNT(c.Name) Cnt -- COUNT(*) would give 1 for manufacturers without cars
FROM Manufacturer manuf
LEFT JOIN Car c ON manuf.PK = c.ManufPK
GROUP BY manuf.Name, c.Name
PS: Your question is not very clear. Please provide the desired result set so the answer can be refined.

You can also try this:
SELECT manuf.Name
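, COALESCE(car1.cnt, 0) AS Ferrari1 -- COALESCE turns "no matching cars" into 0 instead of NULL
, COALESCE(car2.cnt, 0) AS Ferrari2
, COALESCE(car3.cnt, 0) AS Ferrari3
FROM
Manufacturer manuf -- Oracle does not accept AS before a table alias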
LEFT JOIN
( SELECT ManufPK, COUNT(*) AS cnt
FROM Car
WHERE Name = 'Ferrari1'
GROUP BY ManufPK
) car1
ON car1.ManufPK = manuf.PK
LEFT JOIN
( SELECT ManufPK, COUNT(*) AS cnt
FROM Car
WHERE Name = 'Ferrari2'
GROUP BY ManufPK
) car2
ON car2.ManufPK = manuf.PK
LEFT JOIN
( SELECT ManufPK, COUNT(*) AS cnt
FROM Car
WHERE Name = 'Ferrari3'
GROUP BY ManufPK
) car3
ON car3.ManufPK = manuf.PK
ORDER BY manuf.Name

Related

PostgreSQL filter results by subset of a joined table and group them

I have a table of people, another table of cars, and a third table to join them, since they have a many-to-many relationship. I want to select people who own a certain set of cars and group them by a region property on the person. For example, I would want to find all Americans who own both a Honda and a Nissan.
Example:
people table
id name region
1 Jon America
2 Jane Europe
3 Mike America
cars table
id make
1 Honda
2 Toyota
3 Nissan
people_cars table
person_id car_id
1 1
1 3
2 2
3 1
Desired result:
region own_honda_and_nissan
America 1
Europe 0
An idea for a SQL expression I have is:
SELECT
people.region,
CASE WHEN SUM(CASE WHEN cars.name IN ('Honda', 'Nissan') THEN 1 ELSE 0 END) = 2 THEN 1 ELSE 0 END AS own_honda_and_nissan
FROM people
JOIN people_cars ON people_cars.person_id = people.id
JOIN cars ON people_cars.car_id = cars.id
GROUP BY people.region
HAVING
SUM(CASE WHEN cars.name IN ('Honda', 'Nissan') THEN 1 ELSE 0 END) = 2
ORDER BY own_honda_and_nissan DESC
This works if you group by people.id but once they get grouped by region it no longer works.
Use two levels of aggregation:
SELECT p.region, COUNT(*) as own_honda_and_nissan
FROM (SELECT p.id, p.region
FROM people p JOIN
people_cars pc
ON pc.person_id = p.id JOIN
cars c
ON pc.car_id = c.id
WHERE c.name IN ('Honda', 'Nissan')
GROUP BY p.id, p.region
HAVING COUNT(DISTINCT c.name) = 2
) p
GROUP BY p.region
ORDER BY own_honda_and_nissan DESC
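A minimal sketch of the two-level aggregation, run on the example rows from the question in an in-memory SQLite database. Two things here are assumptions rather than part of the answer: the column is called make (as in the example table, whereas the queries in the thread use name), and the inner query uses LEFT JOINs with a 0/1 flag so that regions without any matching person are still returned with 0, as in the desired result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE cars (id INTEGER PRIMARY KEY, make TEXT);
CREATE TABLE people_cars (person_id INTEGER, car_id INTEGER);
INSERT INTO people VALUES (1, 'Jon', 'America'), (2, 'Jane', 'Europe'), (3, 'Mike', 'America');
INSERT INTO cars VALUES (1, 'Honda'), (2, 'Toyota'), (3, 'Nissan');
INSERT INTO people_cars VALUES (1, 1), (1, 3), (2, 2), (3, 1);
""")

query = """
SELECT region, SUM(owns_both) AS own_honda_and_nissan
FROM (
    -- Level 1: one row per person, flagged 1 if they own both makes.
    SELECT p.id, p.region,
           CASE WHEN COUNT(DISTINCT c.make) = 2 THEN 1 ELSE 0 END AS owns_both
    FROM people p
    LEFT JOIN people_cars pc ON pc.person_id = p.id
    LEFT JOIN cars c ON c.id = pc.car_id AND c.make IN ('Honda', 'Nissan')
    GROUP BY p.id, p.region
) per_person
-- Level 2: add up the flags per region.
GROUP BY region
ORDER BY own_honda_and_nissan DESC, region
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Jon owns both makes, Mike owns only a Honda, and Jane owns neither, so America is counted once and Europe is kept with a count of 0.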

How to get the count of a particular category for each year?

I am working on a problem that asks: "For each year, count the number of movies in that year that had only female actors."
Table schema is as follows:
-------------------- ----------------------- ----------------------
| Movie | | Person | | Cast |
-------------------- ------------------------ ----------------------
| MovieID | year | | PersonID | Gender | | MovieID | PersonID |
-------------------- ------------------------ ----------------------
Running the following query:
SELECT M.YEAR, COUNT(M.MID) NUMBER_OF_FEMALE_ONLY_MOVIES FROM MOVIE M
WHERE M.MID IN (SELECT X.MID FROM (SELECT AX.MID, COUNT(AX.PID) TOTAL_CAST
FROM M_CAST AX GROUP BY AX.MID) X
WHERE
X.TOTAL_CAST = (SELECT COUNT(A.PID) FROM M_CAST A, PERSON B WHERE A.MID =
X.MID AND
TRIM(B.PID) = TRIM(A.PID) AND B.GENDER = 'Female')) GROUP BY M.YEAR
My results are :
---------------------------------------
| year | NUMBER_OF_FEMALE_ONLY_MOVIES |
---------------------------------------
| 1999 | 1 |
| 2005 | 1 |
| 2009 | 1 |
| 2012 | 1 |
| 2018 | 1 |
----------------------------------------
But I need to return 0 as count for the years which do not have any such movies.
Eg.
2013 0
WITH
PERSON_CAST_MERGE AS
(
SELECT P.PID,C.MID,GENDER
FROM PERSON P
INNER JOIN M_CAST C ON C.PID = P.PID
),
MALE_COUNT AS
(
SELECT F.MID FROM PERSON_CAST_MERGE F
WHERE TRIM(F.GENDER) NOT LIKE '%FEMALE%' -- single quotes: double quotes denote identifiers in standard SQL
),
FEMALE_COUNT AS
(
SELECT F.MID FROM PERSON_CAST_MERGE F
WHERE TRIM(F.GENDER) LIKE '%FEMALE%'
),
ONLY_FEMALE AS
(
SELECT DISTINCT F.MID FROM FEMALE_COUNT F -- DISTINCT: otherwise a movie is counted once per female cast member
WHERE F.MID NOT IN (SELECT M.MID FROM MALE_COUNT M)
),
TEST AS
(
SELECT M.YEAR,COUNT(M.MID) AS NO_OF_MOVIES
FROM ONLY_FEMALE F
INNER JOIN MOVIE M ON M.MID = F.MID
GROUP BY M.YEAR
)
SELECT M.YEAR,
CASE
WHEN M.YEAR IN (SELECT F.YEAR FROM TEST F) THEN
(SELECT F.NO_OF_MOVIES FROM TEST F WHERE F.YEAR = M.YEAR)
ELSE
0
END
AS NO_OF_MOVIES
FROM MOVIE M
GROUP BY M.YEAR
I'd suggest exploring the data within the CTE to get a better understanding.
First CTE (all_cast): Return the entire movie cast
Second CTE (male_present): Return the movie IDs from all_cast that have at least one male actor.
Result: Return movies from all_cast where movie id is not present in male_present
WITH all_cast AS (
SELECT SUBSTR(m."year",-4) as 'Year', m.title, trim(m.MID) as MID, p.Name, trim(p.Gender) as Gender
FROM Movie m
JOIN M_Cast mc
ON m.MID = mc.MID
JOIN Person p
ON trim(mc.PID) = p.PID
),
male_present AS (
SELECT year, mid, name
FROM all_cast
WHERE Gender = 'Male'
)
SELECT year, COUNT(DISTINCT mid) as 'All Female Cast'
FROM all_cast a
WHERE NOT EXISTS (SELECT * FROM male_present WHERE a.mid = mid)
GROUP BY year
A GROUP BY with a subquery is enough if you only need the movies whose cast includes a person with gender female (note that this does not exclude movies that also have male actors):
SELECT YEAR, COUNT(*) FROM
MOVIE
Where MovieId IN (SELECT MOVIEId
from CAST WHERE PERSONID IN
(Select PersonId from Person Where
Gender ='FEMALE'))
Group by Year
Try this. DISTINCT on MovieID is required because a single movie may have several female cast members; DISTINCT gives the actual count of movies.
SELECT
M.Year,
COUNT(DISTINCT M.MovieID)
FROM Movie M
INNER JOIN Cast C ON M.MovieID = C.MovieID
INNER JOIN Person P ON C.PersonID = P.PersonID
WHERE P.Gender = 'Female'
GROUP BY M.Year;
I think the problem can be solved by joining all the tables and filtering for female actors in the WHERE clause. In this case, joining the tables may also perform better than using subqueries.
Please try the following code:
Select year, count(*)
from movie
join Cast on movie.movieid = cast.movieid
join person on person.personid = cast.personid
where person.gender = 'Female'
group by year
Please let me know if that works fine for you.
By joining your query to the Movie table with a left outer join, you can get the desired results. It should also take much less time than the answer posted by @Lucky.
WITH FEMALE_ONLY AS
(SELECT M.YEAR,
COUNT(M.MID) COUNT_ALL_FEMALE
FROM MOVIE M
WHERE M.MID IN
(SELECT Q.MID
FROM
(SELECT MC.MID,
COUNT(MC.PID) total
FROM M_CAST MC
GROUP BY MC.MID) Q
WHERE Q.total =
(SELECT COUNT(A.PID)
FROM M_CAST A,
PERSON B
WHERE A.MID = Q.MID
AND TRIM(B.PID) = TRIM(A.PID)
AND B.Gender = 'Female'))
GROUP BY M.YEAR)
SELECT DISTINCT M.year,
coalesce(FO.COUNT_ALL_FEMALE, 0) FEMALE_ONLY_MOVIES
FROM Movie M
LEFT OUTER JOIN FEMALE_ONLY FO ON M.year = FO.year
ORDER BY M.year;
You can do it like this:
select z.year, count(*)
from Movie z
where not exists (select *
from Person x, M_Cast xy
where x.PID = xy.PID and xy.MID = z.MID and x.gender!='Female')
group by z.year;
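A minimal sketch of how the NOT EXISTS test can be combined with a per-movie 0/1 flag so that years without any female-only movie still show up with a count of 0, which is what the question asks for. It runs on an in-memory SQLite database with hypothetical rows; the table and column names (Movie, Person, M_Cast, MID, PID) follow the queries in the thread, and requiring at least one cast member is an added assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Movie (MID TEXT PRIMARY KEY, year INTEGER);
CREATE TABLE Person (PID TEXT PRIMARY KEY, Gender TEXT);
CREATE TABLE M_Cast (MID TEXT, PID TEXT);
-- Hypothetical data: m1 is all-female, m2 is mixed, m3 is all-male.
INSERT INTO Movie VALUES ('m1', 2012), ('m2', 2012), ('m3', 2013);
INSERT INTO Person VALUES ('p1', 'Female'), ('p2', 'Male'), ('p3', 'Female');
INSERT INTO M_Cast VALUES ('m1', 'p1'), ('m1', 'p3'), ('m2', 'p1'), ('m2', 'p2'), ('m3', 'p2');
""")

query = """
SELECT year, SUM(is_female_only) AS female_only_movies
FROM (
    -- One row per movie: 1 if it has a cast and nobody in it is non-female.
    SELECT m.MID, m.year,
           CASE WHEN EXISTS (SELECT 1 FROM M_Cast mc WHERE mc.MID = m.MID)
                 AND NOT EXISTS (SELECT 1
                                 FROM M_Cast mc
                                 JOIN Person p ON p.PID = mc.PID
                                 WHERE mc.MID = m.MID AND p.Gender <> 'Female')
                THEN 1 ELSE 0 END AS is_female_only
    FROM Movie m
) per_movie
GROUP BY year
ORDER BY year
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because every movie contributes a row (with 0 or 1) before grouping, no year is filtered out, and no outer join back to the Movie table is needed.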

One column as parameter both having and where (excluding)

I have these two queries:
First:
select p.prodid, p.name, max(b.ldate) as lastsale
from prod p, buy b
where p.id = b.idprod and b.id<>0 and b.wskus=0 and b.bufor=0
group by p.prodid, p.name
HAVING sum(b.curNo)=0
order by p.name asc
Second
select p.prodid, p.name, min(b.buydate) as oldest_buy
from prod p, buy b
where p.id = b.idprod and b.id<>0 and b.wskus=0 and b.bufor=0 and b.curNo>0
group by p.prodid, p.name
order by p.name asc
How can I join them so that the result has these columns:
| p.prodid | p.name | lastsale | oldest_buy |
| 1 | ex1 | 1.1.18 | NULL |
| 2 | ex2 | NULL | 1.1.18 |
Since HAVING sum(b.curNo)=0 in the first query and the WHERE condition b.curNo>0 in the second query exclude each other, I don't know how to make this work.
Without your input data it's hard to tell, but it's possible this will work for you...
SELECT
p.prodid,
p.name,
MIN(CASE WHEN b.curNo > 0 THEN b.buydate END) AS oldest_buy, -- MIN(buydate) WHERE curno>0
CASE WHEN SUM(b.curNo) = 0 THEN MAX(b.ldate) END AS lastsale -- MAX(ldate) HAVING SUM(curNo) = 0
FROM
prod p
INNER JOIN -- Don't use "," use "JOIN"s, the standard for about 25 years...
buy b
ON p.id = b.idprod
WHERE
b.id <> 0
AND b.wskus = 0
AND b.bufor = 0
GROUP BY
p.prodid,
p.name
ORDER BY
p.name ASC
It's possible that moving the b.curNo > 0 or the SUM(b.curNo) = 0 into the CASE expressions will give extra rows, depending on your data. It's impossible to tell without more details or example data.
The values in the two calculations will be okay, but I can't speak for the number of rows.
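A minimal sketch of the conditional-aggregation query on made-up data, run with an in-memory SQLite database. The rows, the ISO date strings, and the internal ids are hypothetical; they are chosen so that product ex1 has only curNo = 0 rows and product ex2 has a mix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prod (id INTEGER PRIMARY KEY, prodid INTEGER, name TEXT);
CREATE TABLE buy (id INTEGER PRIMARY KEY, idprod INTEGER, wskus INTEGER,
                  bufor INTEGER, curNo INTEGER, ldate TEXT, buydate TEXT);
INSERT INTO prod VALUES (10, 1, 'ex1'), (20, 2, 'ex2');
INSERT INTO buy VALUES
  (1, 10, 0, 0, 0, '2017-12-01', '2017-11-01'),
  (2, 10, 0, 0, 0, '2018-01-01', '2017-12-15'),
  (3, 20, 0, 0, 5, '2018-02-01', '2018-01-01'),
  (4, 20, 0, 0, 0, '2018-03-01', '2017-10-01');
""")

query = """
SELECT p.prodid, p.name,
       -- Group-level condition (was HAVING): only report when the total is 0.
       CASE WHEN SUM(b.curNo) = 0 THEN MAX(b.ldate) END AS lastsale,
       -- Row-level condition (was WHERE): only rows with curNo > 0 feed MIN.
       MIN(CASE WHEN b.curNo > 0 THEN b.buydate END) AS oldest_buy
FROM prod p
JOIN buy b ON p.id = b.idprod
WHERE b.id <> 0 AND b.wskus = 0 AND b.bufor = 0
GROUP BY p.prodid, p.name
ORDER BY p.name
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data, ex1 gets a lastsale and no oldest_buy, and ex2 gets the opposite; the curNo = 0 row of ex2 is ignored by the MIN even though its buydate is earlier.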
To be more explicit about it you could do...
SELECT
p.prodid,
p.name,
CASE WHEN MAX(b.curNo) > 0 THEN MIN(CASE WHEN b.curNo > 0 THEN b.buydate END) END AS oldest_buy,
CASE WHEN SUM(b.curNo) = 0 THEN MAX(b.ldate) END AS lastsale
FROM
prod p
INNER JOIN -- Don't use "," use "JOIN"s, the standard for about 25 years...
buy b
ON p.id = b.idprod
WHERE
b.id <> 0
AND b.wskus = 0
AND b.bufor = 0
GROUP BY
p.prodid,
p.name
HAVING
SUM(b.curNo) = 0
OR MAX(b.curNo) > 0
ORDER BY
p.name ASC
Another possibility (Again because you didn't give example data) is to aggregate then join.
This is based on the notion that you mean p.curNo rather than b.curNo...
SELECT
p.prodid,
p.name,
CASE WHEN p.curNo > 0 THEN b.oldest_buy END AS oldest_buy,
CASE WHEN p.curNo = 0 THEN b.last_sale END AS lastsale
FROM
prod p
INNER JOIN -- Don't use "," use "JOIN"s, the standard for about 25 years...
(
SELECT
idprod,
MIN(buydate) AS oldest_buy,
MAX(ldate) AS last_sale
FROM
buy
WHERE
id <> 0
AND wskus = 0
AND bufor = 0
GROUP BY
idprod -- one row per product, so the join to prod stays one-to-one
)
b
ON p.id = b.idprod
ORDER BY
p.name ASC
Put the first query in a subquery before union all. Try this:
select
t.prodid, t.name, t.lastsale, null as oldest_buy
from (select p.prodid, p.name,
max(b.ldate) as lastsale
from prod p, buy b
where p.id = b.idprod and b.id<>0 and
b.wskus=0 and b.bufor=0
group by p.prodid, p.name
HAVING sum(b.curNo)=0 ) t
union all
( select p.prodid, p.name,
null as lastsale, min(b.buydate) as oldest_buy
from prod p, buy b
where p.id = b.idprod and b.id<>0 and b.wskus=0
and b.bufor=0 and b.curNo>0
group by p.prodid, p.name )
order by 2 asc

Select all categories with COUNT of sub-categories

I need to select all categories with count of its sub-categories.
Assume here are my tables:
categories
id | title
----------
1 | colors
2 | animals
3 | plants
sub_categories
id | category_id | title | confirmed
------------------------------------
1 1 red 1
2 1 blue 1
3 1 pink 1
4 2 cat 1
5 2 tiger 0
6 2 lion 0
What I want is :
id | title | count
------------------
1 colors 3
2 animals 1
3 plants 0
What I have tried so far:
SELECT c.id, c.title, count(s.category_id) as count from categories c
LEFT JOIN sub_categories s on c.id = s.category_id
WHERE c.confirmed = 't' AND s.confirmed='t'
GROUP BY c.id, c.title
ORDER BY count DESC
The only problem is that this query does not show categories with 0 sub-categories.
You can also check it on SqlFiddle.
Any help would be greatly appreciated.
The reason you don't get rows with zero counts is that the WHERE clause requires s.confirmed to be 't', which eliminates the rows that the outer join filled with NULLs.
Move the s.confirmed check into the join condition to fix this problem:
SELECT c.id, c.title, count(s.category_id) as count from categories c
LEFT JOIN sub_categories s on c.id = s.category_id AND s.confirmed='t'
WHERE c.confirmed = 't'
GROUP BY c.id, c.title
ORDER BY count DESC
Adding Sql Fiddle: http://sqlfiddle.com/#!17/83add/13
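A minimal sketch that contrasts the two placements of the filter on the example rows from the question, using an in-memory SQLite database. As assumptions, confirmed is stored as 0/1 (as in the example table, rather than 't') and the categories table has no confirmed column, so that condition is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE sub_categories (id INTEGER PRIMARY KEY, category_id INTEGER,
                             title TEXT, confirmed INTEGER);
INSERT INTO categories VALUES (1, 'colors'), (2, 'animals'), (3, 'plants');
INSERT INTO sub_categories VALUES
  (1, 1, 'red', 1), (2, 1, 'blue', 1), (3, 1, 'pink', 1),
  (4, 2, 'cat', 1), (5, 2, 'tiger', 0), (6, 2, 'lion', 0);
""")

# Filter in WHERE: applied after the join, so NULL-extended rows are dropped.
filter_in_where = """
SELECT c.id, c.title, COUNT(s.category_id) AS cnt
FROM categories c
LEFT JOIN sub_categories s ON c.id = s.category_id
WHERE s.confirmed = 1
GROUP BY c.id, c.title
ORDER BY cnt DESC, c.id
"""

# Filter in ON: applied while matching, so unmatched categories survive.
filter_in_on = """
SELECT c.id, c.title, COUNT(s.category_id) AS cnt
FROM categories c
LEFT JOIN sub_categories s ON c.id = s.category_id AND s.confirmed = 1
GROUP BY c.id, c.title
ORDER BY cnt DESC, c.id
"""

where_rows = conn.execute(filter_in_where).fetchall()
on_rows = conn.execute(filter_in_on).fetchall()
print(where_rows)
print(on_rows)
```

Only the second query returns the 'plants' row, because COUNT(s.category_id) counts the NULL from the unmatched outer-join row as 0 instead of the row being removed.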
I think you can try this too (it makes evident which column(s) you are really grouping by):
SELECT c.id, c.title, COALESCE(s.RC, 0) AS RC -- COALESCE shows 0 for categories without confirmed sub-categories
from categories c
LEFT JOIN (SELECT category_id, COUNT(*) AS RC
FROM sub_categories
WHERE confirmed= 't'
GROUP BY category_id) s on c.id = s.category_id
WHERE c.confirmed = 't'
ORDER BY RC DESC

Using 'AND' in a many-to-many relationship

I have a Users table and a Groups table. Users can be in multiple groups via a 'UserInGroup' table and Groups can have a 'GroupTypeId'.
[User]
--------------
Id | Name
1 | Bob
2 | James
[UserInGroup]
-----------------
UserId | GroupId
1 1
1 2
[Group]
Id | Name | TypeId
------------------------
1 | Directors | 1
2 | IT | 1
3 | London | 2
I want to create a query to return for example users that are in both 'Directors' AND 'London' (rather than 'Directors' OR 'London'). However, I only want to AND groups of a different 'Type', I want to OR groups of the same type. I could do with having a separate table per group type but I can't as they are created dynamically.
Ideally I want to be able to query users who are in 'Directors' OR 'IT' AND 'London'.
What is the most efficient way of doing this?
This problem is commonly known as Relational Division.
SELECT a.Name
FROM [user] a
INNER JOIN UserInGroup b
ON a.ID = b.UserID
INNER JOIN [Group] c
ON b.groupID = c.Id
WHERE c.Name IN ('Directors','London')
GROUP BY a.Name
HAVING COUNT(*) = 2
SQLFiddle Demo
SQL of Relational Division
But if a UNIQUE constraint is not enforced on the group for every user, the DISTINCT keyword is needed so that duplicate memberships are counted only once:
SELECT a.Name
FROM [user] a
INNER JOIN UserInGroup b
ON a.ID = b.UserID
INNER JOIN [Group] c
ON b.groupID = c.Id
WHERE c.Name IN ('Directors','London')
GROUP BY a.Name
HAVING COUNT(DISTINCT c.Name) = 2
OUTPUT from both queries
╔══════╗
║ NAME ║
╠══════╣
║ Bob ║
╚══════╝
I arrived at the following solution (with help from J W and this article):
SELECT
u.Name UserName
FROM [User] u
INNER JOIN [UserInGroup] uig
ON uig.UserId = u.Id
INNER JOIN [Group] g
ON g.Id = uig.GroupId
WHERE
g.Id IN (1,2,3) -- these are the passed in groupids
GROUP BY
u.Name
having count(distinct g.TypeId)
= (select count(distinct g1.TypeId)
from [group] g1 where g1.Id IN (1,2,3))
This allows me to group the relational division by a discriminator field. An alternative would be this:
SELECT a.Name
FROM [User] a
INNER JOIN
(
SELECT b.UserID
FROM UserInGroup b
INNER JOIN [Group] c
ON b.groupID = c.Id
WHERE c.Name IN ('Directors','IT')
GROUP BY b.UserID
HAVING COUNT(DISTINCT c.Name) >= 1
) b ON a.ID = b.UserID
INNER JOIN
(
SELECT DISTINCT b.UserID
FROM UserInGroup b
INNER JOIN [Group] c
ON b.groupID = c.Id
WHERE c.Name = 'London'
) c ON a.ID = c.UserID
With an extra join for each GroupTypeId. Execution plans look similar, so I went with the first option.
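A minimal sketch of the first option (comparing COUNT(DISTINCT TypeId) per user with the number of distinct types among the requested groups), run on an in-memory SQLite database. The helper name users_matching is hypothetical, and so are James's memberships in IT and London, which the question does not list; Bob's memberships are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE [User] (Id INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE [Group] (Id INTEGER PRIMARY KEY, Name TEXT, TypeId INTEGER);
CREATE TABLE UserInGroup (UserId INTEGER, GroupId INTEGER);
INSERT INTO [User] VALUES (1, 'Bob'), (2, 'James');
INSERT INTO [Group] VALUES (1, 'Directors', 1), (2, 'IT', 1), (3, 'London', 2);
-- Bob: Directors + IT (from the question); James: IT + London (assumed).
INSERT INTO UserInGroup VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")


def users_matching(conn, group_ids):
    """Users in at least one requested group of EVERY requested group type.

    Groups of the same type are OR-ed; different types are AND-ed.
    """
    marks = ",".join("?" * len(group_ids))
    query = f"""
    SELECT u.Name
    FROM [User] u
    JOIN UserInGroup uig ON uig.UserId = u.Id
    JOIN [Group] g ON g.Id = uig.GroupId
    WHERE g.Id IN ({marks})
    GROUP BY u.Id, u.Name
    HAVING COUNT(DISTINCT g.TypeId) =
           (SELECT COUNT(DISTINCT g1.TypeId) FROM [Group] g1 WHERE g1.Id IN ({marks}))
    ORDER BY u.Name
    """
    # The id list is bound twice: once for the outer IN, once for the subquery.
    return [row[0] for row in conn.execute(query, list(group_ids) * 2)]


both_types = users_matching(conn, [1, 2, 3])   # (Directors OR IT) AND London
one_type = users_matching(conn, [1, 2])        # Directors OR IT
print(both_types, one_type)
```

With this data, only James satisfies "(Directors OR IT) AND London", while both users satisfy "Directors OR IT". No extra join per group type is needed, since the type count is derived from the requested ids.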