How to group results by count of relationships - sql

Given two tables, Profiles and Memberships, where a profile has many memberships, how do I query profiles based on the number of memberships?
For example, I want to get the number of profiles with 2 memberships. I can get the number of memberships for each profile with:
SELECT "memberships"."profile_id", COUNT("profiles"."id") AS "membership_count"
FROM "profiles"
INNER JOIN "memberships" on "profiles"."id" = "memberships"."profile_id"
GROUP BY "memberships"."profile_id"
That returns results like
profile_id | membership_count
___________|_________________
1          | 2
2          | 5
3          | 2
...
But how do I group and sum the counts to get the query to return results like:
n | profiles_with_n_memberships
__|____________________________
1 | 36
2 | 28
3 | 29
...
Or even just a query for a single value of n that would return
profiles_with_2_memberships
___________________________
28

I don't have your sample data, so I recreated the scenario with a single table: Demo
You could LEFT JOIN the counts with generate_series() to get zeroes for any value of n that no profile has. If you don't want the zeroes, just use the second query.
Query1
WITH c
AS (
SELECT profile_id
,count(*) ct
FROM Table1
GROUP BY profile_id
)
,m
AS (
SELECT MAX(ct) AS max_ct
FROM c
)
SELECT n
,COUNT(c.profile_id)
FROM m
CROSS JOIN generate_series(1, m.max_ct) AS i(n)
LEFT JOIN c ON c.ct = i.n
GROUP BY n
ORDER BY n;
Query2
WITH c
AS (
SELECT profile_id
,count(*) ct
FROM Table1
GROUP BY profile_id
)
SELECT ct
,COUNT(*)
FROM c
GROUP BY ct
ORDER BY ct;
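The question also asks for a single value of n. A minimal sketch of that case, assuming PostgreSQL and a hypothetical temporary memberships table with made-up rows (not the asker's data): filter the per-profile counts with HAVING, then count the profiles that remain.

```sql
-- Hypothetical sample data: profiles 1 and 3 have two memberships each.
CREATE TEMP TABLE memberships (
    id         serial PRIMARY KEY,
    profile_id int NOT NULL
);

INSERT INTO memberships (profile_id)
VALUES (1), (1), (2), (2), (2), (2), (2), (3), (3), (4);

-- Inner query: keep only profiles with exactly 2 memberships.
-- Outer query: count how many such profiles there are.
SELECT count(*) AS profiles_with_2_memberships
FROM (
    SELECT profile_id
    FROM memberships
    GROUP BY profile_id
    HAVING count(*) = 2
) AS t;
-- With the sample rows above this returns 2.
```

Changing the constant in the HAVING clause gives the figure for any other n.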

Related

How to sum up max values from another table with some filtering

I have 3 tables
User Table
id | Name
1 | Mike
2 | Sam
Score Table
id | UserId | CourseId | Score
1 | 1 | 1 | 5
2 | 1 | 1 | 10
3 | 1 | 2 | 5
Course Table
id | Name
1 | Course 1
2 | Course 2
What I'm trying to return is rows for each user to display user id and user name along with the sum of the maximum score per course for that user
In the example tables the output I'd like to see is
Result
User_Id | User_Name | Total_Score
1 | Mike | 15
2 | Sam | 0
The SQL I've tried so far is:
select TOP(3) u.Id as User_Id, u.UserName as User_Name, SUM(maxScores) as Total_Score
from Users as u,
(select MAX(s.Score) as maxScores
from Scores as s
inner join Courses as c
on s.CourseId = c.Id
group by s.UserId, c.Id
) x
group by u.Id, u.UserName
I want to use a HAVING clause to link the Users to Scores after the GROUP BY in the subquery, but I get an exception saying:
The multi-part identifier "u.Id" could not be bound
It works if I hard-code a user id in the HAVING clause I want to add, but it needs to be dynamic and I'm stuck on how to do this.
What would be the correct way to structure the query?
You were close; you just needed to return s.UserId from the subquery and correctly join the subquery to your Users table. (I've joined in the reverse order to yours because to me it's more logical to start with the base data and then join on more details as required.) Take note of the scope of aliases: aliases defined inside your subquery are not available in your outer query.
select u.Id as [User_Id], u.UserName as [User_Name]
, sum(maxScore) as Total_Score
from (
select s.UserId, max(s.Score) as maxScore
from Scores as s
inner join Courses as c on s.CourseId = c.Id
group by s.UserId, c.Id
) as x
inner join Users as u on u.Id = x.UserId
group by u.Id, u.UserName;
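The expected output in the question lists Sam with a total of 0, but an inner join only returns users who have at least one score. A minimal sketch of a LEFT JOIN variant, assuming SQL Server syntax (the question uses TOP), the sample rows from the question, and a UserName column as in the queries above. The Courses table is left out here on the assumption that every Scores.CourseId refers to an existing course.

```sql
-- Sample data copied from the question's tables.
CREATE TABLE Users  (Id int PRIMARY KEY, UserName varchar(50));
CREATE TABLE Scores (Id int PRIMARY KEY, UserId int, CourseId int, Score int);

INSERT INTO Users  VALUES (1, 'Mike'), (2, 'Sam');
INSERT INTO Scores VALUES (1, 1, 1, 5), (2, 1, 1, 10), (3, 1, 2, 5);

-- Start from Users so that users without scores are kept,
-- then turn their NULL sum into 0 with COALESCE.
select u.Id as [User_Id], u.UserName as [User_Name]
    , coalesce(sum(x.maxScore), 0) as Total_Score
from Users as u
left join (
    -- best score per user and course
    select s.UserId, s.CourseId, max(s.Score) as maxScore
    from Scores as s
    group by s.UserId, s.CourseId
) as x on x.UserId = u.Id
group by u.Id, u.UserName
order by u.Id;
-- With the sample rows: (1, Mike, 15) and (2, Sam, 0).
```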

SQL | List all tuples (a, b, c) if there exists another tuple with equal (b, c)

I have three tables where the bold attribute(s) is the primary key
Restaurants(restaurant_ID, name, ...)
restaurant_ID, name, ...
1, Macdonalds
2, Hubert
3, Dorsia
... ...
Identifier(restaurant_ID, food_ID)
restaurant_ID, food_ID, ...
1, 1
1, 4
2, 1
2, 7
... ...
Food(food_ID, name, ...)
food_ID food_name
1 Chips
2 Burgers
3 Salmon
... ...
Using Postgres, I want to list all restaurants (restaurant_id and name, 1 row per restaurant) that share the exact same set of foods with at least one other restaurant.
For example, let's say
Restaurant with ID "1" has only associated food_id's 1 and 4 as shown in Identifier
Restaurant with ID "3" has only associated food_id's 4 and 1 as shown in Identifier
Restaurant with ID "7" has only associated food_id's 6 as shown in Identifier
Restaurant with ID "9" has only associated food_id's 6 as shown in Identifier
Then output
Restaurant_id name
1 name1
3 name3
7 ...
9 ...
Any help would be greatly appreciated!
Thank you
Use the aggregate function string_agg() to get the full list of foods for each restaurant:
with cte as (
select restaurant_ID,
string_agg(food_ID::varchar(10),',' order by food_ID) foods
from identifier
group by restaurant_ID
)
select r.*
from Restaurants r inner join cte c
on c.restaurant_ID = r.restaurant_ID
where exists (select 1 from cte where restaurant_ID <> c.restaurant_ID and foods = c.foods)
But I would prefer to group restaurants based on matching foods:
with cte as (
select restaurant_ID,
string_agg(food_ID::varchar(10),',' order by food_ID) foods
from identifier
group by restaurant_ID
)
select string_agg(r.name, ',') restaurants
from Restaurants r inner join cte c
on c.restaurant_ID = r.restaurant_ID
group by foods
having count(*) > 1
See the demo.
Here is a way to get the unique set of restaurants having exactly the same food items. This uses the array_agg() and array_to_string() functions.
With cte as
(select T.restaurant_id, array_to_string(array_agg(food_id), ',') as food_list
from
(select *
from Identifier t1
order by restaurant_id, food_id) T
group by T.restaurant_id)
select
concat(r1.name,',',r2.name) as resturant_names,
t1.restaurant_id as restaurant_id1,
r1.name as restaurant_1,
t2.restaurant_id as restaurant_id2,
r2.name as restaurant_2,
t1.food_list as common_food_ids
from cte t1
join cte t2
on t1.restaurant_id < t2.restaurant_id
and t1.food_list = t2.food_list
left join Restaurants r1
on t1.restaurant_id = r1.restaurant_id
left join Restaurants r2
on t2.restaurant_id = r2.restaurant_id;
EDIT : Here is a dB fiddle - https://dbfiddle.uk/?rdbms=postgres_12&fiddle=e2de05edfbe036cc0d81c64d60f0b599 . Also, just for reference, solution to the same problem in Oracle using listagg function - https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=12785c3d5abbca97be5d44dd45a6da4a
Update: The query below addresses the updated output format of the question.
With cte as
(select T.restaurant_id, array_to_string(array_agg(food_id), ',') as food_list
from
(select *
from Identifier t1
order by restaurant_id, food_id) T
group by T.restaurant_id)
-- <> pairs each restaurant with a different restaurant that has the same food list;
-- distinct keeps one row per restaurant even when it matches several others
select distinct
t1.restaurant_id as restaurant_id,
r1.name as restaurant
from cte t1
join cte t2
on t1.restaurant_id <> t2.restaurant_id
and t1.food_list = t2.food_list
left join Restaurants r1
on t1.restaurant_id = r1.restaurant_id;
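As a variation on the array approach, the arrays can be compared directly, with the ordering placed inside the aggregate instead of in a subquery. This is a minimal sketch assuming PostgreSQL; the rows follow the example in the question, while the restaurant names and the foods of restaurant 2 are placeholders.

```sql
CREATE TEMP TABLE restaurants (restaurant_id int PRIMARY KEY, name text);
CREATE TEMP TABLE identifier (
    restaurant_id int,
    food_id       int,
    PRIMARY KEY (restaurant_id, food_id)
);

-- Placeholder names; food sets for 1, 3, 7 and 9 follow the question's example.
INSERT INTO restaurants VALUES
    (1, 'name1'), (2, 'name2'), (3, 'name3'), (7, 'name7'), (9, 'name9');
INSERT INTO identifier VALUES
    (1, 1), (1, 4), (2, 1), (2, 7), (3, 4), (3, 1), (7, 6), (9, 6);

WITH cte AS (
    -- one sorted array of food ids per restaurant
    SELECT restaurant_id,
           array_agg(food_id ORDER BY food_id) AS foods
    FROM identifier
    GROUP BY restaurant_id
)
SELECT r.restaurant_id, r.name
FROM restaurants r
JOIN cte c ON c.restaurant_id = r.restaurant_id
WHERE EXISTS (
    -- some other restaurant with an identical sorted array
    SELECT 1
    FROM cte o
    WHERE o.restaurant_id <> c.restaurant_id
      AND o.foods = c.foods
)
ORDER BY r.restaurant_id;
-- With the sample rows this lists restaurants 1, 3, 7 and 9.
```

Sorting inside array_agg() makes the element order explicit, so two restaurants with the same foods always produce equal arrays.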
As I understand your question, you want all restaurants that have the same list of foods as restaurant 1.
If so, that's a relational division problem. Here is an approach using joins and aggregation:
select r.name
from identifier i1
inner join identifier i2 on i2.food_id = i1.food_id
inner join restaurants r on r.restaurant_id = i2.restaurant_id
where i1.restaurant_id = 1
group by r.restaurant_id
having count(*) = (select count(*) from identifier i3 where i3.restaurant_id = 1)

Grouping the data and showing 1 row per group in postgres

I have two tables which look like this :-
Component Table
Revision Table
I want to get the name,model_id,rev_id from this table such that the result set has the data like shown below :-
name model_id rev_id created_at
ABC 1234 2 23456
ABC 5678 2 10001
XYZ 4567
Here the data is grouped by name,model_id and only 1 data for each group is shown which has the highest value of created_at.
I am using the below query but it is giving me incorrect result.
SELECT cm.name,cm.model_id,r.created_at from dummy.component cm
left join dummy.revision r on cm.model_id=r.model_id
group by cm.name,cm.model_id,r.created_at
ORDER BY cm.name asc,
r.created_at DESC;
Result :-
Anyone's help will be highly appreciated.
Use MAX and a subquery:
select T1.name,T1.model_id,r.rev_id,T1.created_at from
(
select cm.name,
cm.model_id,
MAX(r.created_at) As created_at from dummy.component cm
left join dummy.revision r on cm.model_id=r.model_id
group by cm.name,cm.model_id
) T1
left join dummy.revision r
on T1.model_id = r.model_id and T1.created_at = r.created_at
http://www.sqlfiddle.com/#!17/68cb5/4
name model_id rev_id created_at
ABC 1234 2 23456
ABC 5678 2 10001
xyz 4567
In your SELECT you're missing rev_id
Try this:
SELECT
cm.name,
cm.model_id,
MAX(r.rev_id) AS rev_id,
MAX(r.created_at) As created_at
from dummy.component cm
left join dummy.revision r on cm.model_id=r.model_id
group by 1,2
ORDER BY cm.name asc,
MAX(r.created_at) DESC;
What you were missing is a way to say that you only want the latest record from the joined table. The join brings in every matching row from table r. If you group by the two columns from component and then select the max of rev_id and created_at from r, you get a single row per group. Note that this assumes the highest rev_id is also the one with the highest created_at.
I would use distinct on:
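select distinct on (cm.name, cm.model_id) cm.name, cm.model_id, r.rev_id, r.created_at
from dummy.component cm left join
dummy.revision r
on cm.model_id = r.model_id
-- newest revision first within each (name, model_id) group
order by cm.name, cm.model_id, r.created_at desc nulls last;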
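The same "latest row per group" idea can also be sketched with row_number(), which is available in most databases that support window functions. This assumes PostgreSQL, omits the dummy schema, and uses hypothetical rows chosen to match the desired output in the question (the real table contents are not shown there).

```sql
CREATE TEMP TABLE component (name text, model_id int);
CREATE TEMP TABLE revision  (model_id int, rev_id int, created_at int);

-- Hypothetical rows; only the newest revision per model appears in the question.
INSERT INTO component VALUES ('ABC', 1234), ('ABC', 5678), ('XYZ', 4567);
INSERT INTO revision  VALUES
    (1234, 1, 20000), (1234, 2, 23456),
    (5678, 1,  9000), (5678, 2, 10001);

SELECT name, model_id, rev_id, created_at
FROM (
    SELECT cm.name, cm.model_id, r.rev_id, r.created_at,
           -- number the revisions of each component, newest first
           row_number() OVER (
               PARTITION BY cm.name, cm.model_id
               ORDER BY r.created_at DESC NULLS LAST
           ) AS rn
    FROM component cm
    LEFT JOIN revision r ON r.model_id = cm.model_id
) t
WHERE rn = 1
ORDER BY name, model_id;
-- With the sample rows: (ABC, 1234, 2, 23456), (ABC, 5678, 2, 10001), (XYZ, 4567, NULL, NULL).
```

Because rev_id and created_at come from the same numbered row, they always belong to the same revision.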

How to perform max on an inner join with 2 different counts on columns?

How to find the user with the most referrals that have at least three blue shoes using PostgreSQL?
table 1 - users
name (matches shoes.owner_name)
referred_by (foreign keyed to users.name)
table 2 - shoes
owner_name (matches users.name)
shoe_name
shoe_color
What I have so far is separate queries returning parts of what I want above:
(SELECT count(*) as shoe_count
FROM shoes
GROUP BY owner_name
WHERE shoe_color = “blue”
AND shoe_count>3) most_shoes
INNER JOIN
(SELECT count(*) as referral_count
FROM users
GROUP BY referred_by
) most_referrals
ORDER BY referral_count DESC
LIMIT 1
Two subqueries seem like the way to go. They would look like:
SELECT s.owner_name, s.shoe_count, r.referral_count
FROM (SELECT owner_name, count(*) as shoe_count
FROM shoes
WHERE shoe_color = 'blue'
GROUP BY owner_name
HAVING count(*) >= 3 -- HAVING cannot reference the shoe_count alias in PostgreSQL
) s JOIN
(SELECT referred_by, count(*) as referral_count
FROM users
GROUP BY referred_by
) r
ON s.owner_name = r.referred_by
ORDER BY r.referral_count DESC
LIMIT 1 ;

I need a SQL query for comparing column values against rows in the same table

I have a table called BB_BOATBKG which holds passengers travel details with columns Z_ID, BK_KEY and PAXSUM where:
Z_ID = BookingNumber* LegNumber
BK_KEY = BookingNumber
PAXSUM = Total number passengers travelled in each leg for a particular booking
For Example:
Z_ID BK_KEY PAXSUM
001234*01 001234 2
001234*02 001234 3
001287*01 001287 5
001287*02 001287 5
002323*01 002323 7
002323*02 002323 6
I would like to get a list of all Booking Numbers BK_KEY from BB_BOATBKG where the total number of passengers PAXSUM is different in each leg for the same booking
Example, For Booking number A, A*Leg01 might have 2 Passengers, A* Leg02 might have 3 passengers
Depending on your RDBMS, there might be several options available. A solution that should work for most is:
SELECT A.Z_ID, A.BK_KEY, A.PAXSUM
FROM BB_BOATBKG A
JOIN (
SELECT BK_KEY
FROM BB_BOATBKG
GROUP BY BK_KEY
HAVING COUNT( DISTINCT PAXSUM ) > 1
) B
ON A.BK_KEY = B.BK_KEY
If your DBMS supports OLAP functions, have a look at RANK() OVER (...)
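One way the window-function route could look is sketched below. It uses MIN() and MAX() as window aggregates rather than RANK(), which is a substitution on my part, and assumes a DBMS with window function support plus the sample rows from the question.

```sql
CREATE TABLE BB_BOATBKG (
    Z_ID   varchar(20) PRIMARY KEY,
    BK_KEY varchar(10),
    PAXSUM int
);

-- Sample rows from the question.
INSERT INTO BB_BOATBKG VALUES
    ('001234*01', '001234', 2), ('001234*02', '001234', 3),
    ('001287*01', '001287', 5), ('001287*02', '001287', 5),
    ('002323*01', '002323', 7), ('002323*02', '002323', 6);

SELECT Z_ID, BK_KEY, PAXSUM
FROM (
    SELECT Z_ID, BK_KEY, PAXSUM,
           -- smallest and largest passenger count within each booking
           MIN(PAXSUM) OVER (PARTITION BY BK_KEY) AS min_pax,
           MAX(PAXSUM) OVER (PARTITION BY BK_KEY) AS max_pax
    FROM BB_BOATBKG
) t
WHERE min_pax <> max_pax   -- the legs of this booking disagree
ORDER BY Z_ID;
-- With the sample rows this returns the legs of bookings 001234 and 002323.
```

Unlike the join version, this reads the table once, though whether that is faster depends on the DBMS and indexes.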
It's a little counterintuitive, but you could join the table to itself on {BK_KEY, PAXSUM} and pull out only the records whose joined result is null.
I think this does it:
SELECT
a.BK_KEY
FROM
BB_BOATBKG a
LEFT OUTER JOIN BB_BOATBKG b ON a.BK_KEY = b.BK_KEY AND a.PAXSUM = b.PAXSUM
WHERE
b.Z_ID IS NULL
GROUP BY
a.BK_KEY
Edit: I think I missed anything beyond the trivial case. I think you can do it with some really nasty subselecting though, a la:
SELECT
b.BK_KEY
FROM
(
SELECT
a.BK_KEY,
Count = COUNT(*)
FROM
(
SELECT
a.BK_KEY,
a.PAXSUM
FROM
BB_BOATBKG a
GROUP BY
a.BK_KEY,
a.PAXSUM
HAVING
COUNT(*) = 1
) a
GROUP BY
a.BK_KEY
) b
INNER JOIN
(
SELECT
c.BK_KEY,
Count = COUNT(*)
FROM
BB_BOATBKG c
GROUP BY
c.BK_KEY
) c ON b.BK_KEY = c.BK_KEY AND b.Count = c.Count