Selecting multiple counts when tables do not directly correlate - SQL

users table:
user_id (distinct for each user)
source_id (users may have the same source)
rule tables:
white_rules
black_rules
general_rules
These tables all look the same, and have:
victim_id (correlates to user_id from the users table).
rule_id (correlates to a different table, which is not important here)
What I need is to extract the total number of rules per type (white, black, general) per source_id.
example:
source_id: 5 ---> total 70 white rules, total 32 black rules, total 21 general rules
source_id: 7 ---> total 2 white rules, total 0 black rules, total 4 general rules
and so on... for all distinct sources that are listed on users table.
what i tried is:
SELECT source_id,
count(w.victim_id) as total_white,
count(b.victim_id) as total_black,
count(g.victim_id) as total_general
from users
LEFT JOIN white_rules as w ON (user_id=w.victim_id)
LEFT JOIN black_rules as b ON (user_id=b.victim_id)
LEFT JOIN general_rules as g ON (user_id=g.victim_id)
where deleted='f' and source_id is not null
group by source_id;
but the result table I get has wrong (higher) numbers than I expect,
so I must be doing something wrong :)
I would appreciate any hint in the right direction.

You need to do your counts in subqueries, or use COUNT(DISTINCT ...), because your multiple one-to-many relationships multiply rows in the join. I don't know your data, but imagine this scenario:
Users:
User_ID | Source_ID
--------+--------------
1 | 1
White_Rules
Victim_ID | Rule_ID
----------+-------------
1 | 1
1 | 2
Black_Rules
Victim_ID | Rule_ID
----------+-------------
1 | 3
1 | 4
If you run
SELECT Users.User_ID,
Users.Source_ID,
White_Rules.Rule_ID AS WhiteRuleID,
Black_Rules.Rule_ID AS BlackRuleID
FROM Users
LEFT JOIN White_Rules
ON White_Rules.Victim_ID = Users.User_ID
LEFT JOIN Black_Rules
ON Black_Rules.Victim_ID = Users.User_ID
You will get all combinations of White_Rules.Rule_ID and Black_Rules.Rule_ID:
User_ID | Source_ID | WhiteRuleID | BlackRuleID
--------+-----------+-------------+-------------
1 | 1 | 1 | 3
1 | 1 | 2 | 3
1 | 1 | 1 | 4
1 | 1 | 2 | 4
So counting the results will return 4 white rules and 4 black rules, even though there are only 2 of each.
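The row multiplication is easy to reproduce. The following is a minimal sketch using Python's built-in sqlite3 module and the sample rows above; the in-memory database and the lowercase table names are assumptions made for the demo, not part of the original schema.

```python
import sqlite3

# In-memory database holding the sample rows from the scenario above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, source_id INTEGER);
    CREATE TABLE white_rules (victim_id INTEGER, rule_id INTEGER);
    CREATE TABLE black_rules (victim_id INTEGER, rule_id INTEGER);
    INSERT INTO users VALUES (1, 1);
    INSERT INTO white_rules VALUES (1, 1), (1, 2);
    INSERT INTO black_rules VALUES (1, 3), (1, 4);
""")

# Two sibling one-to-many joins: 2 white rows x 2 black rows = 4 joined rows.
naive_counts = conn.execute("""
    SELECT COUNT(w.rule_id),          COUNT(b.rule_id),
           COUNT(DISTINCT w.rule_id), COUNT(DISTINCT b.rule_id)
    FROM users u
    LEFT JOIN white_rules w ON w.victim_id = u.user_id
    LEFT JOIN black_rules b ON b.victim_id = u.user_id
""").fetchone()

print(naive_counts)  # plain counts are inflated, distinct counts are not
```

Note that COUNT(DISTINCT rule_id) only repairs the result when rule_id identifies a row uniquely within the group being counted. If the same rule_id can be attached to several users of one source, the distinct count would undercount, which is one reason to prefer pre-aggregated subqueries.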
You should get the required results if you change your query to this:
SELECT Users.Source_ID,
SUM(COALESCE(w.TotalWhite, 0)) AS TotalWhite,
SUM(COALESCE(b.TotalBlack, 0)) AS TotalBlack,
SUM(COALESCE(g.TotalGeneral, 0)) AS TotalGeneral
FROM Users
LEFT JOIN
( SELECT Victim_ID, COUNT(*) AS TotalWhite
FROM White_Rules
GROUP BY Victim_ID
) w
ON w.Victim_ID = Users.User_ID
LEFT JOIN
( SELECT Victim_ID, COUNT(*) AS TotalBlack
FROM Black_Rules
GROUP BY Victim_ID
) b
ON b.Victim_ID = Users.User_ID
LEFT JOIN
( SELECT Victim_ID, COUNT(*) AS TotalGeneral
FROM General_Rules
GROUP BY Victim_ID
) g
ON g.Victim_ID = Users.User_ID
WHERE Deleted = 'f'
AND Users.Source_ID IS NOT NULL
GROUP BY Users.Source_ID
An alternative would be:
SELECT Users.Source_ID,
COUNT(Rules.TotalWhite) AS TotalWhite,
COUNT(Rules.TotalBlack) AS TotalBlack,
COUNT(Rules.TotalGeneral) AS TotalGeneral
FROM Users
LEFT JOIN
( SELECT Victim_ID, 1 AS TotalWhite, NULL AS TotalBlack, NULL AS TotalGeneral
FROM White_Rules
UNION ALL
SELECT Victim_ID, NULL AS TotalWhite, 1 AS TotalBlack, NULL AS TotalGeneral
FROM Black_Rules
UNION ALL
SELECT Victim_ID, NULL AS TotalWhite, NULL AS TotalBlack, 1 AS TotalGeneral
FROM General_Rules
) Rules
ON Rules.Victim_ID = Users.User_ID
WHERE Deleted = 'f'
AND Users.Source_ID IS NOT NULL
GROUP BY Users.Source_ID
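As a rough check that the two approaches agree, here is a sketch that runs both against a small SQLite database. The sample rows, the lowercase names, and the column aliases (n, is_white, and so on) are invented for illustration; only the query shapes come from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, source_id INTEGER, deleted TEXT);
    CREATE TABLE white_rules (victim_id INTEGER, rule_id INTEGER);
    CREATE TABLE black_rules (victim_id INTEGER, rule_id INTEGER);
    CREATE TABLE general_rules (victim_id INTEGER, rule_id INTEGER);
    -- Hypothetical data: users 1 and 2 belong to source 5, user 3 to source 7.
    INSERT INTO users VALUES (1, 5, 'f'), (2, 5, 'f'), (3, 7, 'f');
    INSERT INTO white_rules VALUES (1, 10), (1, 11), (2, 10), (3, 12);
    INSERT INTO black_rules VALUES (1, 20), (1, 21);
    INSERT INTO general_rules VALUES (3, 30), (3, 31), (3, 32);
""")

# Approach 1: aggregate each rule table per victim first, then join and sum.
pre_aggregated = conn.execute("""
    SELECT u.source_id,
           SUM(COALESCE(w.n, 0)), SUM(COALESCE(b.n, 0)), SUM(COALESCE(g.n, 0))
    FROM users u
    LEFT JOIN (SELECT victim_id, COUNT(*) AS n FROM white_rules
               GROUP BY victim_id) w ON w.victim_id = u.user_id
    LEFT JOIN (SELECT victim_id, COUNT(*) AS n FROM black_rules
               GROUP BY victim_id) b ON b.victim_id = u.user_id
    LEFT JOIN (SELECT victim_id, COUNT(*) AS n FROM general_rules
               GROUP BY victim_id) g ON g.victim_id = u.user_id
    WHERE u.deleted = 'f' AND u.source_id IS NOT NULL
    GROUP BY u.source_id
    ORDER BY u.source_id
""").fetchall()

# Approach 2: stack the rule tables with UNION ALL and count the marker columns.
union_all = conn.execute("""
    SELECT u.source_id,
           COUNT(r.is_white), COUNT(r.is_black), COUNT(r.is_general)
    FROM users u
    LEFT JOIN (
        SELECT victim_id, 1 AS is_white, NULL AS is_black, NULL AS is_general
        FROM white_rules
        UNION ALL
        SELECT victim_id, NULL, 1, NULL FROM black_rules
        UNION ALL
        SELECT victim_id, NULL, NULL, 1 FROM general_rules
    ) r ON r.victim_id = u.user_id
    WHERE u.deleted = 'f' AND u.source_id IS NOT NULL
    GROUP BY u.source_id
    ORDER BY u.source_id
""").fetchall()

print(pre_aggregated)
print(union_all)
```

Both queries avoid the multiplication because no row of one rule table is ever paired with a row of another: the first collapses each table to one row per victim before joining, and the second stacks the tables vertically instead of joining them side by side.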

Related

Efficiently getting multiple counts of foreign key rows in PostgreSQL

I have a database that consists of users who can perform various actions, which I keep track of in multiple tables. I'm creating a point system, so I need to count how many of each type of action the user did. For example, if I had:
users posts comments shares
id | username id | user_id id | user_id id | user_id
------------- -------------- -------------- --------------
1 | abc 1 | 1 1 | 1 1 | 2
2 | xyz 2 | 1 2 | 2 2 | 2
I would want to return:
user_details
id | username | post_count | comment_count | share_count
---------------------------------------------------------
1 | abc | 2 | 1 | 0
2 | xyz | 0 | 1 | 2
This is slightly different from this question about foreign key counts since I want to return the individual counts per table.
What I've tried so far (example code):
SELECT
users.id,
users.username,
COUNT( DISTINCT posts.id ) as post_count,
COUNT( DISTINCT comments.id ) as comment_count,
COUNT( DISTINCT shares.id ) as share_count
FROM users
LEFT JOIN posts ON posts.user_id = users.id
LEFT JOIN comments ON comments.user_id = users.id
LEFT JOIN shares ON shares.user_id = users.id
GROUP BY users.id
While this works, I had to use DISTINCT in all of my counts because the LEFT JOINS were causing high numbers of duplicate rows. I feel like there must be a better way to do this since (please correct me if I'm wrong) on each LEFT JOIN, the DISTINCT is having to filter out an exponentially growing number of duplicated rows.
Thank you so much for any help you could give me with this!
You can join derived tables that already do the aggregation.
SELECT u.id,
u.username,
coalesce(pc.c, 0) AS post_count,
coalesce(cc.c, 0) AS comment_count,
coalesce(sc.c, 0) AS share_count
FROM users AS u
LEFT JOIN (SELECT p.user_id,
count(*) AS c
FROM posts AS p
GROUP BY p.user_id) AS pc
ON pc.user_id = u.id
LEFT JOIN (SELECT c.user_id,
count(*) AS c
FROM comments AS c
GROUP BY c.user_id) AS cc
ON cc.user_id = u.id
LEFT JOIN (SELECT s.user_id,
count(*) AS c
FROM shares AS s
GROUP BY s.user_id) AS sc
ON sc.user_id = u.id;
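To see that the pre-aggregated joins return the same numbers as the COUNT(DISTINCT ...) version, the sketch below loads the sample rows from the question into SQLite via Python's sqlite3 module. SQLite stands in for PostgreSQL here purely for convenience; that substitution is an assumption of the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER);
    CREATE TABLE shares (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1, 'abc'), (2, 'xyz');
    INSERT INTO posts VALUES (1, 1), (2, 1);
    INSERT INTO comments VALUES (1, 1), (2, 2);
    INSERT INTO shares VALUES (1, 2), (2, 2);
""")

# The question's approach: join everything, then de-duplicate while counting.
distinct_counts = conn.execute("""
    SELECT u.id, u.username,
           COUNT(DISTINCT p.id), COUNT(DISTINCT c.id), COUNT(DISTINCT s.id)
    FROM users u
    LEFT JOIN posts p ON p.user_id = u.id
    LEFT JOIN comments c ON c.user_id = u.id
    LEFT JOIN shares s ON s.user_id = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# The answer's approach: each derived table yields at most one row per user,
# so the joins never multiply rows and no DISTINCT is needed.
derived_counts = conn.execute("""
    SELECT u.id, u.username,
           COALESCE(pc.c, 0), COALESCE(cc.c, 0), COALESCE(sc.c, 0)
    FROM users u
    LEFT JOIN (SELECT user_id, COUNT(*) AS c FROM posts GROUP BY user_id) pc
           ON pc.user_id = u.id
    LEFT JOIN (SELECT user_id, COUNT(*) AS c FROM comments GROUP BY user_id) cc
           ON cc.user_id = u.id
    LEFT JOIN (SELECT user_id, COUNT(*) AS c FROM shares GROUP BY user_id) sc
           ON sc.user_id = u.id
    ORDER BY u.id
""").fetchall()

print(derived_counts)
```

The intermediate result of the first query grows with the product of each user's posts, comments, and shares, while the second only ever handles one row per user per derived table.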

How to find record ids with missing property ids in another table

The PostgreSQL database has two tables: user_properties and properties. The properties table contains a list of all possible properties with ids (a dictionary). The user_properties table contains what properties a user has, referencing property id from the properties table.
The properties table:
----------------------
prop_id | prop_name
----------------------
1 | Email
----------------------
2 | Phone number
----------------------
3 | Something else 1
----------------------
4 | Something else 2
----------------------
The user_properties table:
--------------------------------
user_id | prop_id | prop_value
--------------------------------
100 | 1 | asd#zxc.com
--------------------------------
100 | 2 | 1234567
--------------------------------
100 | 2 | 2345678
--------------------------------
101 | 3 | *******
--------------------------------
101 | 3 | +++++++
--------------------------------
I need to know which properties are missing for every user_id.
The expected result should look like:
-----------------------
user_id | missing_prop_id
-----------------------
100 | 3
-----------------------
100 | 4
-----------------------
101 | 1
-----------------------
101 | 2
-----------------------
101 | 4
-----------------------
You can use EXCEPT:
with properties(prop_id,prop_name) as
(
values(1, 'Email'),(2, 'Phone number'),
(3, 'Something else 1'),(4, 'Something else 2')
), user_properties( user_id, prop_id, prop_value) as
(
values(100,1,'asd#zxc.com'),(100,2,'1234567'),(100,2,'2345678'),
(101,3,'*******'),(101,3,'+++++++')
), t2 as
(
select u.user_id, p.prop_id as missing_prop_id
from user_properties u
cross join properties p
group by u.user_id, p.prop_id
except
select u.user_id,
p.prop_id
from user_properties u
right join properties p
on u.prop_id = p.prop_id
group by u.user_id, p.prop_id
)
select * from t2 order by user_id, missing_prop_id;
user_id missing_prop_id
100 3
100 4
101 1
101 2
101 4
You can simply solve this by join...
SELECT DISTINCT t3.user_id, t3.prop_id FROM
(SELECT DISTINCT user_id, t2.prop_id FROM user_properties t1, properties t2) t3
LEFT JOIN user_properties t4 ON t3.user_id = t4.user_id and t3.prop_id = t4.prop_id WHERE t4.prop_id is null
http://sqlfiddle.com/#!17/0f4e3/2/0
You can use a cross join to generate all the rows and left join to filter out the ones that don't exist:
select u.user_id, p.prop_id
from (select distinct user_id from user_properties
) u cross join
properties p left join
user_properties up
on up.user_id = u.user_id and
up.prop_id = p.prop_id
where up.user_id is null;
Presumably, you have a users table so the subquery for u is not needed:
select u.user_id, p.prop_id
from users u cross join
properties p left join
user_properties up
on up.user_id = u.user_id and
up.prop_id = p.prop_id
where up.user_id is null;
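The cross join plus left join pattern (an anti-join) can be checked against the question's sample data. This is a sketch using Python's sqlite3 module as a stand-in for PostgreSQL, which is an assumption of the demo; it also runs an EXCEPT formulation to confirm that both give the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE properties (prop_id INTEGER, prop_name TEXT);
    CREATE TABLE user_properties (user_id INTEGER, prop_id INTEGER, prop_value TEXT);
    INSERT INTO properties VALUES
        (1, 'Email'), (2, 'Phone number'),
        (3, 'Something else 1'), (4, 'Something else 2');
    INSERT INTO user_properties VALUES
        (100, 1, 'asd#zxc.com'), (100, 2, '1234567'), (100, 2, '2345678'),
        (101, 3, '*******'), (101, 3, '+++++++');
""")

# Anti-join: build every (user, property) pair, keep pairs with no match.
anti_join = conn.execute("""
    SELECT u.user_id, p.prop_id
    FROM (SELECT DISTINCT user_id FROM user_properties) u
    CROSS JOIN properties p
    LEFT JOIN user_properties up
           ON up.user_id = u.user_id AND up.prop_id = p.prop_id
    WHERE up.user_id IS NULL
    ORDER BY u.user_id, p.prop_id
""").fetchall()

# Set difference: all pairs minus the pairs that exist.
# EXCEPT removes duplicates, so no DISTINCT is required.
set_difference = conn.execute("""
    SELECT up.user_id, p.prop_id
    FROM user_properties up CROSS JOIN properties p
    EXCEPT
    SELECT user_id, prop_id FROM user_properties
    ORDER BY 1, 2
""").fetchall()

print(anti_join)
```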
Thanks everyone for the help. I've come up with the following query myself:
select mp.user_id, mp.prop_id missing_prop_id from
(select distinct up.user_id, p.prop_id from user_properties up cross join properties p) mp
except
select distinct user_id, prop_id from user_properties
http://sqlfiddle.com/#!17/0f4e3/3/0

GROUP BY with SUM without removing empty (null) values

TABLES:
Players
player_no | transaction_id
----------------------------
1 | 11
2 | 22
3 | (null)
1 | 33
Transactions
id | value |
-----------------------
11 | 5
22 | 10
33 | 2
My goal is to fetch all data, maintaining all the players, even with null values in following query:
SELECT p.player_no, COUNT(p.player_no), SUM(t.value) FROM Players p
INNER JOIN Transactions t ON p.transaction_id = t.id
GROUP BY p.player_no
nevertheless results omit null value, example:
player_no | count | sum
------------------------
1 | 2 | 7
2 | 1 | 10
What I would like is for the player with the empty value to be included as well:
player_no | count | sum
------------------------
1 | 2 | 7
2 | 1 | 10
3 | 0 | 0
What do I miss here?
Actually I use QueryDSL for that, but translated example into pure SQL since it behaves in the same manner.
Use LEFT JOIN and the COALESCE function. Count t.id rather than p.player_no so that players without a transaction get 0:
SELECT p.player_no, COUNT(t.id), coalesce(SUM(t.value),0)
FROM Players p
LEFT JOIN Transactions t ON p.transaction_id = t.id
GROUP BY p.player_no
Change your JOIN to a LEFT JOIN, then add IFNULL(value, 0) in your SUM()
left join keeps all the rows in the left table
SELECT p.player_no
, COUNT(*) as count
, SUM(isnull(t.value,0))
FROM Players p
LEFT JOIN Transactions t
ON p.transaction_id = t.id
GROUP BY p.player_no
You might be looking for count(t.value) rather than count(*)
I'm just offering this so you have a correct answer:
SELECT p.player_no, COUNT(t.id) as [count], COALESCE(SUM(t.value), 0) as [sum]
FROM Players p LEFT JOIN
Transactions t
ON p.transaction_id = t.id
GROUP BY p.player_no;
You need to pay attention to the aggregation functions as well as the JOIN.
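The difference between the aggregate choices shows up on the question's own data. The sketch below uses Python's sqlite3 module with the sample rows from the question; the lowercase table names are the only change from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (player_no INTEGER, transaction_id INTEGER);
    CREATE TABLE transactions (id INTEGER, value INTEGER);
    INSERT INTO players VALUES (1, 11), (2, 22), (3, NULL), (1, 33);
    INSERT INTO transactions VALUES (11, 5), (22, 10), (33, 2);
""")

# COUNT(t.id) skips the NULLs produced by the LEFT JOIN, so player 3 gets 0.
# SUM over only NULLs is NULL, hence the COALESCE.
correct = conn.execute("""
    SELECT p.player_no, COUNT(t.id), COALESCE(SUM(t.value), 0)
    FROM players p
    LEFT JOIN transactions t ON p.transaction_id = t.id
    GROUP BY p.player_no
    ORDER BY p.player_no
""").fetchall()

# COUNT(*) counts joined rows, including the NULL-extended one for player 3.
count_star = conn.execute("""
    SELECT p.player_no, COUNT(*)
    FROM players p
    LEFT JOIN transactions t ON p.transaction_id = t.id
    GROUP BY p.player_no
    ORDER BY p.player_no
""").fetchall()

print(correct)
print(count_star)
```

COUNT(*) and COUNT(p.player_no) both report 1 for player 3, because the left-hand row still exists after the join; only a count over a column from the right-hand table reflects the missing transaction.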
Please Try This:
SELECT P.player_no,
COUNT(*) as count,
SUM(isnull(T.value,0))
FROM Players P
LEFT JOIN Transactions T
ON P.transaction_id = T.id
GROUP BY P.player_no
Hope this helps.

Count / sum values in subquery and order by it

I have tables like below:
user
id | status
1 | 0
gallery
id | status | create_by_user_id
1 | 0 | 1
2 | 0 | 1
3 | 0 | 1
media
id | status
1 | 0
2 | 0
3 | 0
gallery_media
fk gallery.id fk media.id
id | gallery_id | media_id | sequence
1 | 1 | 1 | 1
2 | 2 | 2 | 1
3 | 2 | 3 | 2
monitor_traffic
endpoint_code: 1 = gallery, 2 = media
id | anonymous_id | user_id | endpoint_code | endpoint_id
1 | 1 | | 1 | 2 (gallery.id 2)
2 | 2 | | 1 | 2 (gallery.id 2)
3 | | 1 | 2 | 3 (media.id 3, included in gallery.id 2)
This means gallery.id 2 accounts for 3 rows.
gallery_information
fk gallery.id
id | gallery_id
gallery includes media.
monitor_traffic.endpoint_code: 1 .. gallery; 2 .. media
If 1 then monitor_traffic.endpoint_id references gallery.id
monitor_traffic.user_id, monitor_traffic.anonymous_id integer or null
Objective
I want to output gallery rows sorted by traffic: count each gallery's rows in monitor_traffic, count the rows in monitor_traffic for that gallery's related media, and finally sum the two counts.
The query I provide only counts media in monitor_traffic without summing them, and it does not count galleries in monitor_traffic.
How can I do this?
This is part of a function that takes options as input and builds the query as output. I hope to find a solution (maybe with a subquery) that does not require changing other parts of the query.
Query:
SELECT
g.*,
row_to_json(gi.*) as gallery_information
FROM gallery g
LEFT JOIN gallery_information gi ON gi.gallery_id = g.id
LEFT JOIN "user" u ON u.id = g.create_by_user_id
-- start
LEFT JOIN gallery_media gm ON gm.gallery_id = g.id
LEFT JOIN (
SELECT
endpoint_id,
COUNT(*) as mt_count
FROM monitor_traffic
WHERE endpoint_code = 2
GROUP BY endpoint_id
) mt ON mt.endpoint_id = gm.media_id
-- end
ORDER BY mt.mt_count desc NULLS LAST;
I suggest a CTE to count both types in one aggregation and join to it two times in the FROM clause:
WITH mt AS ( -- count once for both media and gallery
SELECT endpoint_code, endpoint_id, count(*) AS ct
FROM monitor_traffic
GROUP BY 1, 2
)
SELECT g.*, row_to_json(gi.*) AS gallery_information
FROM gallery g
LEFT JOIN mt ON mt.endpoint_id = g.id -- 1st join to mt
AND mt.endpoint_code = 1 -- gallery
LEFT JOIN (
SELECT gm.gallery_id, sum(ct) AS ct
FROM gallery_media gm
JOIN mt ON mt.endpoint_id = gm.media_id -- 2nd join to mt
AND mt.endpoint_code = 2 -- media
GROUP BY 1
) mmt ON mmt.gallery_id = g.id
LEFT JOIN gallery_information gi ON gi.gallery_id = g.id
ORDER BY mt.ct DESC NULLS LAST -- count of galleries
, mmt.ct DESC NULLS LAST; -- count of "gallery related media"
Or, to order by the sum of both counts:
...
ORDER BY COALESCE(mt.ct, 0) + COALESCE(mmt.ct, 0) DESC;
Aggregate first, then join. That prevents complications with "proxy-cross joins" that multiply rows:
Two SQL LEFT JOINS produce incorrect result
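A simplified sketch of the aggregate-first approach on the question's sample rows, using Python's sqlite3 module in place of PostgreSQL (an assumption of the demo). The gallery_information join and row_to_json are left out because SQLite has no row_to_json; the output column names gallery_hits and media_hits, and the g.id tie-breaker in ORDER BY, are illustrative additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gallery (id INTEGER, status INTEGER, create_by_user_id INTEGER);
    CREATE TABLE gallery_media (id INTEGER, gallery_id INTEGER,
                                media_id INTEGER, sequence INTEGER);
    CREATE TABLE monitor_traffic (id INTEGER, anonymous_id INTEGER, user_id INTEGER,
                                  endpoint_code INTEGER, endpoint_id INTEGER);
    INSERT INTO gallery VALUES (1, 0, 1), (2, 0, 1), (3, 0, 1);
    INSERT INTO gallery_media VALUES (1, 1, 1, 1), (2, 2, 2, 1), (3, 2, 3, 2);
    -- endpoint_code 1 = gallery, 2 = media
    INSERT INTO monitor_traffic VALUES
        (1, 1, NULL, 1, 2), (2, 2, NULL, 1, 2), (3, NULL, 1, 2, 3);
""")

rows = conn.execute("""
    WITH mt AS (  -- count once per (endpoint_code, endpoint_id)
        SELECT endpoint_code, endpoint_id, COUNT(*) AS ct
        FROM monitor_traffic
        GROUP BY endpoint_code, endpoint_id
    )
    SELECT g.id,
           COALESCE(mt.ct, 0)  AS gallery_hits,
           COALESCE(mmt.ct, 0) AS media_hits
    FROM gallery g
    LEFT JOIN mt ON mt.endpoint_id = g.id AND mt.endpoint_code = 1
    LEFT JOIN (  -- media hits rolled up to one row per gallery
        SELECT gm.gallery_id, SUM(mt.ct) AS ct
        FROM gallery_media gm
        JOIN mt ON mt.endpoint_id = gm.media_id AND mt.endpoint_code = 2
        GROUP BY gm.gallery_id
    ) mmt ON mmt.gallery_id = g.id
    ORDER BY COALESCE(mt.ct, 0) + COALESCE(mmt.ct, 0) DESC, g.id
""").fetchall()

print(rows)
```

Gallery 2 comes first with two direct hits plus one hit on its media, matching the 3 rows noted in the question. Each side of the join has at most one row per gallery, so neither count is inflated.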
The LEFT JOIN to "user" seems to be dead freight. Remove it:
LEFT JOIN "user" u ON u.id = g.create_by_user_id
Don't use reserved words like "user" as identifier, even if that's allowed as long as you double-quote. Very error-prone.

Linking three tables SQL

I have three tables: towns, patientsHome, and patientsRecords.
towns
Id
1
2
3
patientsHome
Id | serial_number
1 | 11
2 | 12
2 | 13
patientsRecords
status | serial_number
stable | 11
expire | 12
expire | 13
I want to count stable and expire patients from patients records against each Id from towns.
output should be like
Result
Id| stableRecords |expiredRecords
1| 1 | 0
2| 0 | 2
3| 0 | 0
Try like this :
select t.id,
       case when tt.StableRecords is null then 0 else tt.StableRecords end as StableRecords,
       case when tt.expiredRecords is null then 0 else tt.expiredRecords end as expiredRecords
from towns t
left join
     (select ph.id,
             count(case when pr.status='stable' then 1 end) as StableRecords,
             count(case when pr.status='expire' then 1 end) as expiredRecords
      from patientsRecords pr
      inner join patientsHome ph on ph.serial_number=pr.serial_number
      group by ph.id) as tt
  on t.id=tt.id
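The conditional-count idea can be tried on the question's sample rows. The sketch below is a flattened variant of the query above, not the answer's exact form: it left-joins the three tables in a chain and counts with CASE directly, which works here because each join follows the previous one rather than fanning out from towns in parallel. It uses Python's sqlite3 module, an assumption made so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE towns (id INTEGER);
    CREATE TABLE patientsHome (id INTEGER, serial_number INTEGER);
    CREATE TABLE patientsRecords (status TEXT, serial_number INTEGER);
    INSERT INTO towns VALUES (1), (2), (3);
    INSERT INTO patientsHome VALUES (1, 11), (2, 12), (2, 13);
    INSERT INTO patientsRecords VALUES ('stable', 11), ('expire', 12), ('expire', 13);
""")

# CASE without ELSE yields NULL, and COUNT ignores NULLs, so each COUNT
# only tallies rows with the matching status. Towns with no patients
# survive the LEFT JOINs and get 0 in both columns.
town_counts = conn.execute("""
    SELECT t.id,
           COUNT(CASE WHEN pr.status = 'stable' THEN 1 END) AS stableRecords,
           COUNT(CASE WHEN pr.status = 'expire' THEN 1 END) AS expiredRecords
    FROM towns t
    LEFT JOIN patientsHome ph ON ph.id = t.id
    LEFT JOIN patientsRecords pr ON pr.serial_number = ph.serial_number
    GROUP BY t.id
    ORDER BY t.id
""").fetchall()

print(town_counts)
```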
Assuming patientsHome.ID is in fact a foreign key to towns.ID, you can join the 3 tables, filter as appropriate, group by Town, and count the rows:
SELECT t.Id, COUNT(*) as patientCount
FROM towns t
INNER JOIN patientsHome ph
on t.Id = ph.Id
INNER JOIN patientsRecords pr
on ph.serial_number = pr.serial_number
WHERE pr.status in ('stable', 'expire')
GROUP BY t.Id;
If you also want to classify the status per town:
SELECT t.Id, pr.status, COUNT(*) as patientCount
... FROM, WHERE same as above
GROUP BY t.Id, pr.status;
Try this:
SELECT t.id, pr.status,
COUNT(*) AS countByStatus
FROM patientsRecords pr
INNER JOIN patientsHome ph
ON ph.serial_number = pr.serial_number
INNER JOIN towns t
ON t.id=ph.id
WHERE pr.status IN ('stable', 'expire')
GROUP BY t.id, pr.status;
See the sqlfiddle: http://sqlfiddle.com/#!2/028545/4