Efficient way to count members of same group - sql

Let's say that I have a dataset of people who are members of groups:
Group ID | Person ID
1 1
2 1
2 2
3 1
3 3
For each person, I want to count the number of distinct people who are in at least one of the same groups (including themselves):
Person ID | Distinct Co-Members
1 3
2 2
3 2
Is there a more efficient way to do this count other than joining the above dataset on itself with a key of the Group ID?

I think you need a self-join and group by:
select t1.personid, count(distinct t2.personid)
from t t1 left join
t t2
on t1.groupid = t2.groupid
group by t1.personid;
Here is a db<>fiddle.
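To check the self-join against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name t and the column names groupid and personid follow the answer's query; running it in SQLite is an assumption, since the question does not name a database. An inner join is used here because every row matches itself on groupid, so it should give the same result as the left join.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (groupid INTEGER, personid INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (2, 1), (2, 2), (3, 1), (3, 3)],
)

# Pair every membership row with all rows of the same group,
# then count the distinct people reached from each person.
co_members = conn.execute(
    """
    SELECT t1.personid, COUNT(DISTINCT t2.personid)
    FROM t t1
    JOIN t t2 ON t1.groupid = t2.groupid
    GROUP BY t1.personid
    ORDER BY t1.personid
    """
).fetchall()

print(co_members)  # [(1, 3), (2, 2), (3, 2)]
```

The printed pairs match the expected output in the question, with each person counted among their own co-members.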

Using nunique
df.merge(df,on='GroupID').groupby('PersonID_x')['PersonID_y'].nunique().reset_index()
Out[170]:
PersonID_x PersonID_y
0 1 3
1 2 2
2 3 2

Related

SQL left join with 2 or more count group

My table
ID catone cattwo
100 2 1
100 3 1
200 1 2
Expected result (count, not sum):
ID totalcat1 totalcat2
100 2 2
200 1 1
My query
select COUNT(*) as totalcat1, catone
from Table1
group by cat1
left join
select COUNT(*) as totalcat2, cattwo
from Table1
group by cattwo
I am trying to get both count columns, for catone and cattwo, in one result.
I'm not sure how to correct it. Thank you.
A simple group-by should do it
select ID, COUNT(catone) as totalcat1, COUNT(cattwo) as totalcat2
from Table1
group by ID;
Note that this simply counts the number of values that are not NULL. If your original data was this...
ID catone cattwo
100 2 1
100 3 1
100 4 NULL
... then the result would be
ID totalcat1 totalcat2
100 3 2
If you want to count the distinct values - so totalcat2 would be 1 (as only 1 value exists in that column, although it's there twice) you could use
select ID, COUNT(DISTINCT catone) as totalcat1, COUNT(DISTINCT cattwo) as totalcat2
from Table1
group by ID;
which would return totalcat1 = 3 and totalcat2 = 1.
Here's a db<>fiddle with the two options.
Here's a second db<>fiddle on request of OP with ID 200.
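The difference between the two options can be reproduced with a small sketch using Python's built-in sqlite3 module. The data combines the answer's three rows for ID 100 (including the NULL) with the question's row for ID 200; using SQLite is an assumption, as the thread does not name a database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID INTEGER, catone INTEGER, cattwo INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [(100, 2, 1), (100, 3, 1), (100, 4, None), (200, 1, 2)],
)

# COUNT(col) counts the non-NULL values in each group.
non_null_counts = conn.execute(
    """
    SELECT ID, COUNT(catone), COUNT(cattwo)
    FROM Table1
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

# COUNT(DISTINCT col) counts each non-NULL value only once per group.
distinct_counts = conn.execute(
    """
    SELECT ID, COUNT(DISTINCT catone), COUNT(DISTINCT cattwo)
    FROM Table1
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

print(non_null_counts)  # [(100, 3, 2), (200, 1, 1)]
print(distinct_counts)  # [(100, 3, 1), (200, 1, 1)]
```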

How to compare each group of one table with a column in another table?

I have two tables:
TypeTable
TypeId PersonClassificationId
----------------------
1 1
1 2
1 3
2 1
2 2
PersonClassificationTable
PersonClassificationId Capacity
----------------------
1 2
2 2
3 2
I need to select every TypeId that is missing at least one of the PersonClassificationId values listed in PersonClassificationTable.
So, if PersonClassificationTable has 1, 2, 3, then TypeId = 2 should be selected, because this record does not exist in TypeTable:
TypeId PersonClassificationId
----------------------
2 3
How can I do that?
It is undesirable to use cursors : )
I think that you can do what you want by generating all possible combinations of types and classifications, and then filter on those that do not exist in the mapping table:
select t.TypeId, p.PersonClassificationId
from (select distinct TypeId from TypeTable) t
cross join PersonClassificationTable p
where not exists (
select 1
from TypeTable t1
where t1.TypeId = t.TypeId and t1.PersonClassificationId = p.PersonClassificationId
)
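As a quick check of the cross join plus NOT EXISTS approach, here is a minimal sketch with Python's built-in sqlite3 module, loaded with the two sample tables from the question. Using SQLite is an assumption; the thread does not say which database is in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TypeTable (TypeId INTEGER, PersonClassificationId INTEGER)")
conn.execute(
    "CREATE TABLE PersonClassificationTable "
    "(PersonClassificationId INTEGER, Capacity INTEGER)"
)
conn.executemany(
    "INSERT INTO TypeTable VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)],
)
conn.executemany(
    "INSERT INTO PersonClassificationTable VALUES (?, ?)",
    [(1, 2), (2, 2), (3, 2)],
)

# Build every (type, classification) combination, then keep only
# the combinations that have no matching row in TypeTable.
missing = conn.execute(
    """
    SELECT t.TypeId, p.PersonClassificationId
    FROM (SELECT DISTINCT TypeId FROM TypeTable) t
    CROSS JOIN PersonClassificationTable p
    WHERE NOT EXISTS (
        SELECT 1
        FROM TypeTable t1
        WHERE t1.TypeId = t.TypeId
          AND t1.PersonClassificationId = p.PersonClassificationId
    )
    """
).fetchall()

print(missing)  # [(2, 3)]
```

If only the TypeId values are needed, selecting distinct t.TypeId from the same query would drop the classification column.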

Need to find the count of user who belongs to different depts

I have a table with dept, user, and so on. I need to count the users that belong to each combination of departments.
Let's say I have a table like this:
dept user
1 33
1 33
1 45
2 11
2 12
3 33
3 15
First I have to find the unique dept and user combinations, something like this:
select distinct dept,user from x;
Which gives me a result like this:
Dept user
1 33
1 45
2 11
2 12
3 33
3 15
This removes the duplicate combinations.
Here is what I need to do. My output should look like this:
dep_1_1 dep_1_2 dep_1_3 dep_2_2 dep_2_1 dep_2_3 Dep_3_1 Dep_3_2 Dep_3_3
2 0 1 2 0 0 1 0 2
So, basically, I need to find the count of common users for every combination of departments.
Thanks for the help
You can get a row for each department combination using a self-join of your Distinct Select:
with cte as
(
select distinct dept,user from x
)
select t1.dept, t2.dept, count(*)
from cte as t1 join cte as t2
on t1.user = t2.user -- same user
and t1.dept < t2.dept -- different department
group by t1.dept, t2.dept
order by t1.dept, t2.dept
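To see what the self-join returns for the sample data, here is a minimal sketch with Python's built-in sqlite3 module. The table name x and the columns dept and user come from the question; using SQLite is an assumption (user is a reserved word in some other databases and may need quoting there).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (dept INTEGER, user INTEGER)")
conn.executemany(
    "INSERT INTO x VALUES (?, ?)",
    [(1, 33), (1, 33), (1, 45), (2, 11), (2, 12), (3, 33), (3, 15)],
)

# De-duplicate (dept, user) pairs first, then match each user's
# departments against each other and count users per department pair.
pair_counts = conn.execute(
    """
    WITH cte AS (
        SELECT DISTINCT dept, user FROM x
    )
    SELECT t1.dept, t2.dept, COUNT(*)
    FROM cte AS t1
    JOIN cte AS t2
      ON t1.user = t2.user     -- same user
     AND t1.dept < t2.dept     -- different department, each pair once
    GROUP BY t1.dept, t2.dept
    ORDER BY t1.dept, t2.dept
    """
).fetchall()

print(pair_counts)  # [(1, 3, 1)]
```

Only user 33 appears in two departments, so the single row is the pair (1, 3). Pairs with no common users, such as (1, 2), produce no row at all rather than a zero, and the result is one row per pair rather than the single wide row shown in the question.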

SQL: SELECT value for all rows based on a value in one of the rows and a condition

I have a list of total store visits for a customer for a month. The customer has a home store but can visit other stores. Like the table below:
MemberId | HomeStoreId | VisitedStoreId | Month | Visits
1 5 5 1 5
1 5 3 1 2
1 5 2 1 1
1 5 4 1 7
I want my select statement to give the number of visits to the home store against each store for that member for that month. Like the below:
MemberId | HomeStoreId | VisitedStoreId | Month | Visits | HomeStoreVisits
1 5 5 1 5 5
1 5 3 1 2 5
1 5 2 1 1 5
1 5 4 1 7 5
I've looked at a SUM with CASE statements inside and OVER with PARTITION but I can't seem to work it out.
Thanks
I would use window functions:
select t.*,
sum(case when homestoreid = visitedstoreid then visits end) over
(partition by memberid, month) as homestorevisits
from t;
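Alternatively, join each row to the member's home-store row:
SELECT t1.MemberId, t1.HomeStoreId, t1.VisitedStoreId, t1.Month, t1.Visits, h.HomeStoreVisits
FROM Table t1 LEFT OUTER JOIN
(SELECT MemberId, Month, Visits AS HomeStoreVisits
FROM Table WHERE HomeStoreId = VisitedStoreId
) h ON h.MemberId = t1.MemberId AND h.Month = t1.Month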
You can achieve this using a simple subquery.
SELECT MemberId, HomeStoreID, VisitedStoreID, Month, Visits,
(SELECT Visits FROM table t2
WHERE t2.MemberId = t1.MemberId
AND t2.HomeStoreId = t1.HomeStoreId
AND t2.Month = t1.Month
AND t2.VisitedStoreId = t2.HomeStoreId) AS HomeStoreVisits
FROM table t1
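The correlated subquery can be tried on the question's sample rows with Python's built-in sqlite3 module. The table name visits is a hypothetical stand-in, because the question does not give the real table name and TABLE itself is a reserved word; using SQLite is likewise an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "visits" is a hypothetical table name chosen for this sketch.
conn.execute(
    "CREATE TABLE visits (MemberId INTEGER, HomeStoreId INTEGER, "
    "VisitedStoreId INTEGER, Month INTEGER, Visits INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?, ?)",
    [(1, 5, 5, 1, 5), (1, 5, 3, 1, 2), (1, 5, 2, 1, 1), (1, 5, 4, 1, 7)],
)

# For each row, look up the same member's row for the same month
# where the visited store is the home store.
rows = conn.execute(
    """
    SELECT MemberId, HomeStoreId, VisitedStoreId, Month, Visits,
           (SELECT Visits
            FROM visits t2
            WHERE t2.MemberId = t1.MemberId
              AND t2.HomeStoreId = t1.HomeStoreId
              AND t2.Month = t1.Month
              AND t2.VisitedStoreId = t2.HomeStoreId) AS HomeStoreVisits
    FROM visits t1
    ORDER BY VisitedStoreId
    """
).fetchall()

for row in rows:
    print(row)  # the last column is 5 on every row
```

This sketch relies on there being at most one home-store row per member and month, as in the sample data.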

How to apply a single query that sum column for individual values

I have 2 tables named user and statistics
user table has 3 columns: id, name and category
statistics table has 3 columns: id, idUser (relational), cal
something like this:
user
Id name category
1 name1 1
2 name2 2
3 name3 3
statistics
Id idUser cal
1 1 1
2 1 1
3 1 1
4 2 1
5 2 1
How can I write a query that sums the cal column for each category of users and gives me something like this:
category totalcal
1 3
2 2
3 0
You want to do a left join to keep all the categories. The rest is just aggregation:
select u.category, coalesce(sum(s.cal), 0) as totalcal
from user u left join
statistics s
on u.id = s.idUser
group by u.category;
Use LEFT JOIN to get 0 sum for the category=3:
SELECT
user.category
,SUM(statistics.cal) AS totalcal
FROM
user
LEFT JOIN statistics ON statistics.idUser = user.Id
GROUP BY
user.category
Here SUM would return NULL for category=3. To get 0 instead of NULL you can use COALESCE(SUM(statistics.cal), 0).
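Both variants can be compared on the sample tables with Python's built-in sqlite3 module. Using SQLite is an assumption, since the question does not name a database. The table name user is quoted in this sketch because USER is a reserved word in some databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "user" (id INTEGER, name TEXT, category INTEGER)')
conn.execute("CREATE TABLE statistics (id INTEGER, idUser INTEGER, cal INTEGER)")
conn.executemany(
    'INSERT INTO "user" VALUES (?, ?, ?)',
    [(1, "name1", 1), (2, "name2", 2), (3, "name3", 3)],
)
conn.executemany(
    "INSERT INTO statistics VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 2, 1), (5, 2, 1)],
)

# Plain SUM: a category with no statistics rows yields NULL (None in Python).
raw_totals = conn.execute(
    """
    SELECT u.category, SUM(s.cal)
    FROM "user" u
    LEFT JOIN statistics s ON s.idUser = u.id
    GROUP BY u.category
    ORDER BY u.category
    """
).fetchall()

# COALESCE replaces that NULL with 0.
totals = conn.execute(
    """
    SELECT u.category, COALESCE(SUM(s.cal), 0) AS totalcal
    FROM "user" u
    LEFT JOIN statistics s ON s.idUser = u.id
    GROUP BY u.category
    ORDER BY u.category
    """
).fetchall()

print(raw_totals)  # [(1, 3), (2, 2), (3, None)]
print(totals)      # [(1, 3), (2, 2), (3, 0)]
```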