SQL join on table twice not bringing expected results


I have two tables (extraneous columns removed to exemplify the issue):
-People-
PID | CarID1 | CarID2
----------------------
1 | 1 | 3
2 | 5 | NULL
3 | 1 | NULL
4 | NULL | 1
-Cars-
CarID
-----
1
3
5
I'm creating a view based on the CarID so using:
SELECT
c.CarID,
COUNT(p.PID) AS pCount
FROM
Cars c
LEFT JOIN People p ON p.CarID1 = c.CarID OR p.CarID2 = c.CarID
Group By c.CarID
Brings back the expected results:
CarID | pCount
--------------
1 | 3
3 | 1
5 | 1
The issue is that on a table with 1000+ car IDs and 25,000 people, this can take a long time (taking out the OR clause means it takes milliseconds).
So I was trying to do it another way like this:
SELECT
c.CarID,
COUNT(p1.PID) AS pCount1,
COUNT(p2.PID) AS pCount2
FROM
Cars c
LEFT JOIN People p1 ON p1.CarID1 = c.CarID
LEFT JOIN People p2 ON p2.CarID2 = c.CarID
Group By c.CarID
It's many times quicker, but because CarID 1 exists in both CarID1 and CarID2 I'm getting this:
CarID | pCount1 | pCount2
-------------------------
1 | 2 | 2
3 | 0 | 1
5 | 1 | 0
When I would expect this:
CarID | pCount1 | pCount2
-------------------------
1 | 2 | 1
3 | 0 | 1
5 | 1 | 0
Then I could just sum pCount1 and pCount2.
Is there any way I can achieve the results of the first query using the 2nd method? I'm presuming the GROUP BY clause has something to do with it, but not sure how to omit it.

How about unpivoting the columns and then joining:
SELECT v.CarID, COUNT(p.PID) AS pCount
FROM People p CROSS APPLY
(VALUES (p.CarID1), (p.CarID2)) v(CarID) JOIN
Cars c
ON v.CarID = c.CarId
WHERE v.CarID IS NOT NULL
GROUP BY v.CarID;
If you want to keep cars even with no people, then you can express this as a LEFT JOIN:
SELECT c.CarID, COUNT(p.PID) AS pCount
FROM Cars c LEFT JOIN
(People p CROSS APPLY
(VALUES (p.CarID1), (p.CarID2)) v(CarID)
)
ON v.CarID = c.CarId
GROUP BY c.CarID;
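On engines without CROSS APPLY, the same unpivot idea can be sketched with UNION ALL. The example below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, loads the sample rows from the question, and the helper names make_sample_db and count_people_per_car are made up for this sketch.

```python
import sqlite3


def make_sample_db():
    """Build an in-memory database holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Cars (CarID INTEGER PRIMARY KEY);
        CREATE TABLE People (PID INTEGER PRIMARY KEY, CarID1 INTEGER, CarID2 INTEGER);
        INSERT INTO Cars VALUES (1), (3), (5);
        INSERT INTO People VALUES (1, 1, 3), (2, 5, NULL), (3, 1, NULL), (4, NULL, 1);
    """)
    return conn


def count_people_per_car(conn):
    """Unpivot CarID1/CarID2 into a single CarID column, then join and count."""
    sql = """
        SELECT c.CarID, COUNT(v.PID) AS pCount
        FROM Cars c
        LEFT JOIN (
            SELECT PID, CarID1 AS CarID FROM People
            UNION ALL
            SELECT PID, CarID2 AS CarID FROM People
        ) v ON v.CarID = c.CarID   -- NULL car ids never match, so they drop out
        GROUP BY c.CarID
        ORDER BY c.CarID
    """
    return conn.execute(sql).fetchall()


print(count_people_per_car(make_sample_db()))
```

One difference from the OR join: a person whose CarID1 and CarID2 point at the same car is counted twice by the unpivot. If that can happen in your data, COUNT(DISTINCT v.PID) would count that person once.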

Is p.CarID1 a primary key (or otherwise indexed)?
If so, that would explain why a join on CarID1 is fast but a join on CarID2 is slow.
Try creating an index on CarID2 and see if that solves your performance issues.
The index would turn a full table scan into an index lookup, which is a lot faster.
CREATE NONCLUSTERED INDEX CarId2Index
ON People (CarID2);
If that solves it, you can keep your query as it is.
Alternatively, you can post the query's execution plan so we can see what is slowing it down.
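To check whether an index is actually picked up, you can inspect the query plan. The sketch below is an assumption-laden illustration, not SQL Server: it uses Python's built-in sqlite3 module, where the statement is plain CREATE INDEX (SQLite has no NONCLUSTERED keyword) and the plan is read with EXPLAIN QUERY PLAN. The helper name query_plan is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (PID INTEGER PRIMARY KEY, CarID1 INTEGER, CarID2 INTEGER);
    CREATE INDEX CarId2Index ON People (CarID2);
""")


def query_plan(conn, sql):
    """Return the human-readable 'detail' column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# A lookup on CarID2 should mention the index by name instead of a full scan.
plan = query_plan(conn, "SELECT PID FROM People WHERE CarID2 = 1")
print(plan)
```

In SQL Server the equivalent check is to look at the actual execution plan for an index seek on the new index.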

Try using SUM with a condition, like below.
SELECT
c.CarID,
SUM(IIF(p1.PID IS NULL, 0, 1)) AS pCount1,
SUM(IIF(p2.PID IS NULL, 0, 1)) AS pCount2
FROM
Cars c
LEFT JOIN People p1 ON p1.CarID1 = c.CarID
LEFT JOIN People p2 ON p2.CarID2 = c.CarID
Group By c.CarID
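Note that a conditional SUM over the joined rows counts exactly the same rows as COUNT(p1.PID), so it is still affected when the two LEFT JOINs multiply each other's matches. One possible workaround, assuming PID is unique per person, is COUNT(DISTINCT ...). The sketch below compares both on the question's sample data, using Python's built-in sqlite3 module as a stand-in for SQL Server (CASE replaces IIF for portability).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cars (CarID INTEGER PRIMARY KEY);
    CREATE TABLE People (PID INTEGER PRIMARY KEY, CarID1 INTEGER, CarID2 INTEGER);
    INSERT INTO Cars VALUES (1), (3), (5);
    INSERT INTO People VALUES (1, 1, 3), (2, 5, NULL), (3, 1, NULL), (4, NULL, 1);
""")

# Both queries share the same double LEFT JOIN; only the aggregates differ.
JOINS = """
    FROM Cars c
    LEFT JOIN People p1 ON p1.CarID1 = c.CarID
    LEFT JOIN People p2 ON p2.CarID2 = c.CarID
    GROUP BY c.CarID
    ORDER BY c.CarID
"""

# Conditional SUM: counts every joined row combination, duplicates included.
conditional_sum = conn.execute("""
    SELECT c.CarID,
           SUM(CASE WHEN p1.PID IS NULL THEN 0 ELSE 1 END) AS pCount1,
           SUM(CASE WHEN p2.PID IS NULL THEN 0 ELSE 1 END) AS pCount2
""" + JOINS).fetchall()

# COUNT(DISTINCT ...): each person is counted once per column.
distinct_counts = conn.execute("""
    SELECT c.CarID,
           COUNT(DISTINCT p1.PID) AS pCount1,
           COUNT(DISTINCT p2.PID) AS pCount2
""" + JOINS).fetchall()

print(conditional_sum)
print(distinct_counts)
```

COUNT(DISTINCT ...) still builds the multiplied intermediate rows, so on large tables aggregating before joining may be the cheaper option.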

Try with COALESCE function:
SELECT
c.CarID,
COUNT(p.PID) AS pCount
FROM
Cars c
LEFT JOIN People p ON COALESCE(p.CarID1, p.CarID2) = c.CarID
Group By c.CarID
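Be aware that COALESCE(p.CarID1, p.CarID2) only returns the first non-NULL column, so a person with both columns filled in is matched through CarID1 alone. An alternative that stays close to the "two joins" idea from the question is to aggregate each column separately and then join the two small results. The sketch below shows both on the sample data; it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the derived table aliases a1 and a2 are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cars (CarID INTEGER PRIMARY KEY);
    CREATE TABLE People (PID INTEGER PRIMARY KEY, CarID1 INTEGER, CarID2 INTEGER);
    INSERT INTO Cars VALUES (1), (3), (5);
    INSERT INTO People VALUES (1, 1, 3), (2, 5, NULL), (3, 1, NULL), (4, NULL, 1);
""")

# COALESCE join: person 1 (CarID1 = 1, CarID2 = 3) is only matched to car 1.
coalesce_join = conn.execute("""
    SELECT c.CarID, COUNT(p.PID) AS pCount
    FROM Cars c
    LEFT JOIN People p ON COALESCE(p.CarID1, p.CarID2) = c.CarID
    GROUP BY c.CarID
    ORDER BY c.CarID
""").fetchall()

# Aggregate first, then join: one row per car from each derived table,
# so the two joins cannot multiply each other's rows.
aggregate_then_join = conn.execute("""
    SELECT c.CarID,
           COALESCE(a1.n, 0) AS pCount1,
           COALESCE(a2.n, 0) AS pCount2,
           COALESCE(a1.n, 0) + COALESCE(a2.n, 0) AS pCount
    FROM Cars c
    LEFT JOIN (SELECT CarID1 AS CarID, COUNT(*) AS n
               FROM People WHERE CarID1 IS NOT NULL
               GROUP BY CarID1) a1 ON a1.CarID = c.CarID
    LEFT JOIN (SELECT CarID2 AS CarID, COUNT(*) AS n
               FROM People WHERE CarID2 IS NOT NULL
               GROUP BY CarID2) a2 ON a2.CarID = c.CarID
    ORDER BY c.CarID
""").fetchall()

print(coalesce_join)
print(aggregate_then_join)
```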

Related

GROUP BY with SUM without removing empty (null) values

TABLES:
Players
player_no | transaction_id
----------------------------
1 | 11
2 | 22
3 | (null)
1 | 33
Transactions
id | value
-----------------------
11 | 5
22 | 10
33 | 2
My goal is to fetch all the data, keeping every player even when transaction_id is null, with the following query:
SELECT p.player_no, COUNT(p.player_no), SUM(t.value) FROM Players p
INNER JOIN Transactions t ON p.transaction_id = t.id
GROUP BY p.player_no
Nevertheless, the results omit the player with the null value, for example:
player_no | count | sum
------------------------
1 | 2 | 7
2 | 1 | 10
What I would like is for the player with the empty value to be included:
player_no | count | sum
------------------------
1 | 2 | 7
2 | 1 | 10
3 | 0 | 0
What am I missing here?
Actually, I use QueryDSL for this, but I translated the example into plain SQL since it behaves the same way.

Use LEFT JOIN and the COALESCE function:
SELECT p.player_no, COUNT(p.player_no), coalesce(SUM(t.value),0)
FROM Players p
LEFT JOIN Transactions t ON p.transaction_id = t.id
GROUP BY p.player_no

Change your JOIN to a LEFT JOIN, then wrap the value in IFNULL(value, 0) inside your SUM().

LEFT JOIN keeps all the rows in the left table:
SELECT p.player_no
, COUNT(*) as count
, SUM(isnull(t.value,0))
FROM Players p
LEFT JOIN Transactions t
ON p.transaction_id = t.id
GROUP BY p.player_no
You might be looking for count(t.value) rather than count(*).

I'm just offering this so you have a correct answer:
SELECT p.player_no, COUNT(t.id) as [count], COALESCE(SUM(t.value), 0) as [sum]
FROM Players p LEFT JOIN
Transactions t
ON p.transaction_id = t.id
GROUP BY p.player_no;
You need to pay attention to the aggregation functions as well as the JOIN.
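To see why the choice of aggregate matters, here is a small sketch that runs the LEFT JOIN on the question's sample rows. It uses Python's built-in sqlite3 module purely for illustration; the column aliases tx_count and total are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Players (player_no INTEGER, transaction_id INTEGER);
    CREATE TABLE Transactions (id INTEGER PRIMARY KEY, value INTEGER);
    INSERT INTO Players VALUES (1, 11), (2, 22), (3, NULL), (1, 33);
    INSERT INTO Transactions VALUES (11, 5), (22, 10), (33, 2);
""")

# COUNT(t.id) ignores the NULLs produced by the LEFT JOIN, so a player
# without transactions gets 0; COUNT(*) would count that row as 1.
rows = conn.execute("""
    SELECT p.player_no,
           COUNT(t.id)               AS tx_count,
           COALESCE(SUM(t.value), 0) AS total
    FROM Players p
    LEFT JOIN Transactions t ON p.transaction_id = t.id
    GROUP BY p.player_no
    ORDER BY p.player_no
""").fetchall()

print(rows)
```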

Please try this:
SELECT P.player_no,
COUNT(*) as count,
SUM(isnull(T.value,0))
FROM Players P
LEFT JOIN Transactions T
ON P.transaction_id = T.id
GROUP BY P.player_no
Hope this helps.

Postgres SQL: getting group count

I have the following tables:
>> tbl_category
id | category
-------------
0 | A
1 | B
...|...
>>tbl_product
id | category_id | product
---------------------------
0 | 0 | P1
1 | 1 | P2
...|... | ...
I can use the following query to count the number of products in a category.
select category, count(tbl_product.product) from tbl_product
join tbl_category on tbl_product.category_id = tbl_category.id
group by category
However, some categories have no products at all. How do I get these to show up in the query result as well?

Use a left join:
select c.category, count(p.product)
from tbl_category c left join
tbl_product p
on p.category_id = c.id
group by c.category;
The table where you want to keep all the rows goes first (tbl_category).
Note the use of table aliases to make the query easier to write and to read.
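As a quick illustration of the left join keeping empty categories, the sketch below runs the query with Python's built-in sqlite3 module standing in for Postgres. Category C (id 2) is a hypothetical extra row added only so that there is a category without products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_category (id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE tbl_product (id INTEGER PRIMARY KEY, category_id INTEGER, product TEXT);
    INSERT INTO tbl_category VALUES (0, 'A'), (1, 'B'), (2, 'C');  -- 'C' is hypothetical
    INSERT INTO tbl_product VALUES (0, 0, 'P1'), (1, 1, 'P2');
""")

# count(p.product) skips the NULLs generated for categories without products.
category_counts = conn.execute("""
    SELECT c.category, COUNT(p.product) AS product_count
    FROM tbl_category c
    LEFT JOIN tbl_product p ON p.category_id = c.id
    GROUP BY c.category
    ORDER BY c.category
""").fetchall()

print(category_counts)
```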

Count / sum values in subquery and order by it

I have tables like below:
user
id | status
1 | 0
gallery
id | status | create_by_user_id
1 | 0 | 1
2 | 0 | 1
3 | 0 | 1
media
id | status
1 | 0
2 | 0
3 | 0
gallery_media
fk gallery.id fk media.id
id | gallery_id | media_id | sequence
1 | 1 | 1 | 1
2 | 2 | 2 | 1
3 | 2 | 3 | 2
monitor_traffic
(endpoint_code: 1 = gallery, 2 = media)
id | anonymous_id | user_id | endpoint_code | endpoint_id
1 | 1 | | 1 | 2 (gallery.id 2)
2 | 2 | | 1 | 2 (gallery.id 2)
3 | | 1 | 2 | 3 (media.id 3, included in gallery.id 2)
This means gallery.id 2 accounts for 3 rows.
gallery_information
fk gallery.id
id | gallery_id
gallery includes media.
monitor_traffic.endpoint_code: 1 .. gallery; 2 .. media
If 1 then monitor_traffic.endpoint_id references gallery.id
monitor_traffic.user_id, monitor_traffic.anonymous_id integer or null
Objective
I want to output gallery rows sorted by a total count: first count each gallery's own rows in monitor_traffic, then count the monitor_traffic rows of the media related to that gallery, and finally sum the two.
The query I provide only counts media in monitor_traffic; it does not sum those counts per gallery and does not count the gallery's own rows in monitor_traffic.
How can I do this?
This is part of a function that takes options as input and builds the query as output. I hope to find a solution (maybe with a subquery) that does not require changing other parts of the query.
Query:
SELECT
g.*,
row_to_json(gi.*) as gallery_information
FROM gallery g
LEFT JOIN gallery_information gi ON gi.gallery_id = g.id
LEFT JOIN "user" u ON u.id = g.create_by_user_id
-- start
LEFT JOIN gallery_media gm ON gm.gallery_id = g.id
LEFT JOIN (
SELECT
endpoint_id,
COUNT(*) as mt_count
FROM monitor_traffic
WHERE endpoint_code = 2
GROUP BY endpoint_id
) mt ON mt.endpoint_id = gm.media_id
-- end
ORDER BY mt.mt_count desc NULLS LAST;

I suggest a CTE to count both types in one aggregation and join to it two times in the FROM clause:
WITH mt AS ( -- count once for both media and gallery
SELECT endpoint_code, endpoint_id, count(*) AS ct
FROM monitor_traffic
GROUP BY 1, 2
)
SELECT g.*, row_to_json(gi.*) AS gallery_information
FROM gallery g
LEFT JOIN mt ON mt.endpoint_id = g.id -- 1st join to mt
AND mt.endpoint_code = 1 -- gallery
LEFT JOIN (
SELECT gm.gallery_id, sum(ct) AS ct
FROM gallery_media gm
JOIN mt ON mt.endpoint_id = gm.media_id -- 2nd join to mt
AND mt.endpoint_code = 2 -- media
GROUP BY 1
) mmt ON mmt.gallery_id = g.id
LEFT JOIN gallery_information gi ON gi.gallery_id = g.id
ORDER BY mt.ct DESC NULLS LAST -- count of galleries
, mmt.ct DESC NULLS LAST; -- count of "gallery related media"
Or, to order by the sum of both counts:
...
ORDER BY COALESCE(mt.ct, 0) + COALESCE(mmt.ct, 0) DESC;
Aggregate first, then join. That prevents complications with "proxy cross joins" that multiply rows. See the related question "Two SQL LEFT JOINS produce incorrect result".
The LEFT JOIN to "user" seems to be dead freight. Remove it:
LEFT JOIN "user" u ON u.id = g.create_by_user_id
Don't use reserved words like "user" as identifiers, even if that's allowed as long as you double-quote them. It's very error-prone.
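The "aggregate first, then join" approach can be tried out on the question's sample rows. The sketch below is a simplified illustration using Python's built-in sqlite3 module instead of Postgres: row_to_json and gallery_information are left out, the tables are trimmed to the columns needed, and the aliases gmt, mmt, gallery_hits, media_hits and total_hits are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gallery (id INTEGER PRIMARY KEY);
    CREATE TABLE gallery_media (gallery_id INTEGER, media_id INTEGER);
    CREATE TABLE monitor_traffic (endpoint_code INTEGER, endpoint_id INTEGER);
    INSERT INTO gallery VALUES (1), (2), (3);
    INSERT INTO gallery_media VALUES (1, 1), (2, 2), (2, 3);
    -- endpoint_code 1 = gallery, 2 = media
    INSERT INTO monitor_traffic VALUES (1, 2), (1, 2), (2, 3);
""")

totals = conn.execute("""
    WITH mt AS (  -- count once for both galleries and media
        SELECT endpoint_code, endpoint_id, COUNT(*) AS ct
        FROM monitor_traffic
        GROUP BY endpoint_code, endpoint_id
    )
    SELECT g.id,
           COALESCE(gmt.ct, 0) AS gallery_hits,
           COALESCE(mmt.ct, 0) AS media_hits,
           COALESCE(gmt.ct, 0) + COALESCE(mmt.ct, 0) AS total_hits
    FROM gallery g
    LEFT JOIN mt gmt ON gmt.endpoint_id = g.id AND gmt.endpoint_code = 1
    LEFT JOIN (  -- media counts rolled up to one row per gallery
        SELECT gm.gallery_id, SUM(mt.ct) AS ct
        FROM gallery_media gm
        JOIN mt ON mt.endpoint_id = gm.media_id AND mt.endpoint_code = 2
        GROUP BY gm.gallery_id
    ) mmt ON mmt.gallery_id = g.id
    ORDER BY total_hits DESC, g.id
""").fetchall()

print(totals)
```

Because both joined row sets hold at most one row per gallery, neither join can inflate the other's count.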

SQL Server - OR clause in join confusion

I have a table structure like the one below:
Package
PACK_ID | DESCR | BRAND_ID
1 | Shoes | 20
2 | Cloths| NULL
ITEMS
ITEM_ID | PACK_ID | BRAND_ID
100 | 1 | 10
101 | 1 | NULL
102 | 1 | 10
BRANDS
NAME | BRAND_ID
A | 10
B | 20
I want to write a query that lists how many items there are in a package, grouped by brand. If the brand is not defined on the item, it should be taken from the package.
Note: BRAND_ID is nullable in both Package and ITEMS.
My query is this:
SELECT count (*) as count,p.descr as descr,b.name FROM [items] item
inner join [package] p on item.pack_id= p.pack_id
inner join [brands] b on b.brand_id = item.brand_id or b.brand_id = p.brand_id
where p.pack_id = 1
group by b.name,p.descr
and my result is
COUNT | descr | NAME
2 | Shoes | A
3 | Shoes | B
whereas I expect the result to be something like this
COUNT | descr | NAME
2 | Shoes | A
1 | Shoes | B
Could you please suggest what is wrong with my code? Thanks in advance.

Try using ISNULL on your join condition:
SELECT count (*) as count,p.pack_id as pack_id,b.name FROM [items] item
inner join [package] p on item.pack_id= p.pack_id
inner join [brands] b on b.brand_id = ISNULL(item.brand_id, p.brand_id)
where p.pack_id = 1
group by b.name,p.pack_id
Your OR was causing it to join to multiple rows. This version uses the item's brand by default and falls back to the package's brand.

I would tend to approach this by getting the brand for both the item and the package. Then decide which one to use in the select:
SELECT count(*) as count, p.descr as descr, coalesce(bi.name, bp.name) as name
FROM [items] item inner join
[package] p
on item.pack_id= p.pack_id left join
[brands] bi
on bi.brand_id = item.brand_id left join
brands bp
on bp.brand_id = p.brand_id
where p.pack_id = 1
group by coalesce(bi.name, bp.name), p.descr;
One key advantage to this approach is performance. Databases tend to do a poor job when joins are on expressions or OR conditions.
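The two-join approach with COALESCE in the SELECT can be checked against the question's sample rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server; the alias cnt is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE package (pack_id INTEGER PRIMARY KEY, descr TEXT, brand_id INTEGER);
    CREATE TABLE items (item_id INTEGER PRIMARY KEY, pack_id INTEGER, brand_id INTEGER);
    CREATE TABLE brands (name TEXT, brand_id INTEGER PRIMARY KEY);
    INSERT INTO package VALUES (1, 'Shoes', 20), (2, 'Cloths', NULL);
    INSERT INTO items VALUES (100, 1, 10), (101, 1, NULL), (102, 1, 10);
    INSERT INTO brands VALUES ('A', 10), ('B', 20);
""")

# Look up the brand via the item and via the package separately, then
# prefer the item's brand; each item row stays a single row.
brand_counts = conn.execute("""
    SELECT COUNT(*) AS cnt, p.descr, COALESCE(bi.name, bp.name) AS name
    FROM items item
    INNER JOIN package p ON item.pack_id = p.pack_id
    LEFT JOIN brands bi ON bi.brand_id = item.brand_id
    LEFT JOIN brands bp ON bp.brand_id = p.brand_id
    WHERE p.pack_id = 1
    GROUP BY COALESCE(bi.name, bp.name), p.descr
    ORDER BY name
""").fetchall()

print(brand_counts)
```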

Stored procedure containing inner join with count not working

Let's say I've got a database table that looks a bit like this, containing information about some assignments.
Id | ProfessionId | Title | Deadline | DateCreated | ClosingDate
1 | 5 | Something | 01-12-2012 | 05-11-2012 | 12-11-2012
2 | 6 | Something | 01-12-2012 | 05-11-2012 | 12-11-2012
3 | 7 | Something | 01-12-2012 | 05-11-2012 | 12-11-2012
4 | 7 | Something | 01-12-2012 | 05-11-2012 | 12-11-2012
I want to generate an overview for each profession (assignments belong to a certain profession) and count the number of assignments in each profession. The overview coming from the database should look like this:
Id | Name | FriendlyUrl | Ordinal | NumberOfAssignments
5 | Profession 1 | profession-1 | 1 | 1
6 | Profession 2 | profession-2 | 1 | 1
7 | Profession 3 | profession-3 | 1 | 2
8 | Profession 4 | profession-4 | 1 | 0
I've currently got a stored procedure returning the overview above, except that the number of assignments isn't correct. Assignments with a closing date in the past (we then assume the assignment is closed) shouldn't be included in the total number of assignments.
The current stored procedure is like this:
BEGIN
SELECT p.Id,
p.Naam,
p.FriendlyUrl,
p.Ordinal,
COUNT(a.ProfessionId) AS NumberOfAssignments
FROM ME_Profession AS p
LEFT OUTER JOIN ME_Assignment AS a ON a.ProfessionId = p.Id
INNER JOIN ME_Client AS c ON a.ClientId = c.Id
INNER JOIN aspnet_Membership AS m ON m.UserId = c.UserId
WHERE m.IsApproved = 1
GROUP BY p.Id, p.Naam, p.FriendlyUrl, p.Ordinal
END
I've already come up with a modified procedure like the one below, but it doesn't work. It feels like I'm either overthinking it or missing something obvious. What could be going wrong?
SELECT p.Id, p.Naam, p.FriendlyUrl, p.Ordinal, pc.NumberOfAssignments
FROM ME_Profession AS p
INNER JOIN ME_Assignment AS a ON a.ProfessionId = p.Id
INNER JOIN ME_Client AS c ON a.ClientId = c.Id
INNER JOIN aspnet_Membership AS m ON m.UserId = c.UserId
INNER JOIN (SELECT a2.ProfessionId, COUNT(*) AS NumberOfAssignments FROM ME_Assignment AS a2 GROUP BY a2.ProfessionId WHERE a2.Closingdate > GETDATE()) pc ON p.ProfessionId = pc.ProfessionId
WHERE m.IsApproved = 1 AND a.Closingdate > GETDATE()
GROUP BY p.Id, p.Naam, p.FriendlyUrl, p.Ordinal
UPDATE 1: Added where condition for date

I don't think that you need to join against the table ME_Assignment again; try this:
SELECT p.Id, p.Naam, p.FriendlyUrl, p.Ordinal,
COUNT(CASE WHEN a.ClosingDate > GETDATE() OR a.ClosingDate IS NULL THEN 1 END) AS NumberOfAssignments
FROM ME_Profession AS p
INNER JOIN ME_Assignment AS a
ON a.ProfessionId = p.Id
INNER JOIN ME_Client AS c
ON a.ClientId = c.Id
INNER JOIN aspnet_Membership AS m
ON m.UserId = c.UserId
WHERE (m.IsApproved = 1)
GROUP BY p.Id, p.Naam, p.FriendlyUrl, p.Ordinal

How about ... LEFT OUTER JOIN (SELECT * FROM ME_Assignment WHERE ClosingDate > GETDATE()) as a ...

I don't see criteria in the current stored procedure that would satisfy this statement:
"Assignments with a closingdate in the past (we then assume the assignment is closed) shouldn't be taken into the total number of assignment."
It may be that you simply need to add the criteria you have in your work in progress to the existing proc:
WHERE ClosingDate > GETDATE()
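If professions without open assignments should still appear with a count of 0, one option is to put the date condition in the ON clause of a LEFT JOIN rather than in WHERE. The sketch below illustrates that under several assumptions: it uses Python's built-in sqlite3 module instead of SQL Server, passes "today" as a parameter in place of GETDATE(), stores dates as ISO strings, and leaves out the ME_Client and aspnet_Membership joins. The function name open_assignment_counts is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ME_Profession (Id INTEGER PRIMARY KEY, Naam TEXT);
    CREATE TABLE ME_Assignment (Id INTEGER PRIMARY KEY, ProfessionId INTEGER,
                                ClosingDate TEXT);
    INSERT INTO ME_Profession VALUES (5, 'Profession 1'), (6, 'Profession 2'),
                                     (7, 'Profession 3'), (8, 'Profession 4');
    INSERT INTO ME_Assignment VALUES (1, 5, '2012-11-12'), (2, 6, '2012-11-12'),
                                     (3, 7, '2012-11-12'), (4, 7, '2012-11-12');
""")


def open_assignment_counts(conn, today):
    """Count assignments per profession that close after `today` (ISO date)."""
    sql = """
        SELECT p.Id, COUNT(a.Id) AS NumberOfAssignments
        FROM ME_Profession AS p
        LEFT JOIN ME_Assignment AS a
               ON a.ProfessionId = p.Id
              AND a.ClosingDate > ?      -- filter in ON keeps unmatched professions
        GROUP BY p.Id
        ORDER BY p.Id
    """
    return conn.execute(sql, (today,)).fetchall()


print(open_assignment_counts(conn, "2012-11-10"))  # before the closing date
print(open_assignment_counts(conn, "2012-11-20"))  # after the closing date
```

Putting the same condition in WHERE would remove the NULL-extended rows again, which turns the LEFT JOIN back into an inner join.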
