GROUP BY with SUM without removing empty (null) values - sql

TABLES:
Players
player_no | transaction_id
----------------------------
1 | 11
2 | 22
3 | (null)
1 | 33
Transactions
id | value |
-----------------------
11 | 5
22 | 10
33 | 2
My goal is to fetch all the data and keep every player, even those with null values, using the following query:
SELECT p.player_no, COUNT(p.player_no), SUM(t.value) FROM Players p
INNER JOIN Transactions t ON p.transaction_id = t.id
GROUP BY p.player_no
Nevertheless, the results omit the player with the null value. For example:
player_no | count | sum
------------------------
1 | 2 | 7
2 | 1 | 10
What I would like is for the result to include the player with the empty value:
player_no | count | sum
------------------------
1 | 2 | 7
2 | 1 | 10
3 | 0 | 0
What am I missing here?
I actually use QueryDSL for this, but I translated the example into plain SQL since it behaves the same way.

Use a LEFT JOIN and the COALESCE function:
SELECT p.player_no, COUNT(p.player_no), coalesce(SUM(t.value),0)
FROM Players p
LEFT JOIN Transactions t ON p.transaction_id = t.id
GROUP BY p.player_no

Change your JOIN to a LEFT JOIN, then add IFNULL(value, 0) in your SUM()

A LEFT JOIN keeps all the rows of the left table:
SELECT p.player_no
, COUNT(*) as count
, SUM(isnull(t.value,0))
FROM Players p
LEFT JOIN Transactions t
ON p.transaction_id = t.id
GROUP BY p.player_no
You might be looking for count(t.value) rather than count(*)

I'm just offering this so you have a correct answer:
SELECT p.player_no, COUNT(t.id) as [count], COALESCE(SUM(t.value), 0) as [sum]
FROM Players p LEFT JOIN
Transactions t
ON p.transaction_id = t.id
GROUP BY p.player_no;
You need to pay attention to the aggregation functions as well as the JOIN.
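The difference between the aggregates discussed above can be checked with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in engine (an assumption, since the question does not name a database) and the sample rows from the question. The column aliases row_count, tx_count and total are made up for illustration.

```python
import sqlite3

# In-memory SQLite database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Players (player_no INTEGER, transaction_id INTEGER);
    CREATE TABLE Transactions (id INTEGER PRIMARY KEY, value INTEGER);
    INSERT INTO Players VALUES (1, 11), (2, 22), (3, NULL), (1, 33);
    INSERT INTO Transactions VALUES (11, 5), (22, 10), (33, 2);
""")

rows = conn.execute("""
    SELECT p.player_no,
           COUNT(*)                  AS row_count,  -- counts the joined row even when t is all NULL
           COUNT(t.id)               AS tx_count,   -- counts only rows with a matching transaction
           COALESCE(SUM(t.value), 0) AS total       -- SUM over no values is NULL, so default to 0
    FROM Players p
    LEFT JOIN Transactions t ON p.transaction_id = t.id
    GROUP BY p.player_no
    ORDER BY p.player_no
""").fetchall()

for row in rows:
    print(row)
```

For player 3, COUNT(*) reports 1 because the LEFT JOIN still yields one row, while COUNT(t.id) reports the 0 the question asks for.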

Please Try This:
SELECT P.player_no,
COUNT(*) as count,
SUM(isnull(T.value,0))
FROM Players P
LEFT JOIN Transactions T
ON P.transaction_id = T.id
GROUP BY P.player_no
Hope this helps.

Related

SQL join on table twice not bringing expected results

I have two tables (extraneous columns removed to exemplify the issue):
-People-
PID | CarID1 | CarID2
----------------------
1 | 1 | 3
2 | 5 | NULL
3 | 1 | NULL
4 | NULL | 1
-Cars-
CarID
-----
1
3
5
I'm creating a view based on the CarID so using:
SELECT
c.CarID,
COUNT(p.PID) AS pCount
FROM
Cars c
LEFT JOIN People p ON p.CarID1 = c.CarID OR p.CarID2 = c.CarID
Group By c.CarID
Brings back the expected results:
CarID | pCount
--------------
1 | 3
3 | 1
5 | 1
The issue is that on a table with 1000+ car IDs and 25,000 people, this can take a long time (taking out the OR clause means it takes milliseconds).
So I was trying to do it another way like this:
SELECT
c.CarID,
COUNT(p1.PID) AS pCount1,
COUNT(p2.PID) AS pCount2
FROM
Cars c
LEFT JOIN People p1 ON p1.CarID1 = c.CarID
LEFT JOIN People p2 ON p2.CarID2 = c.CarID
Group By c.CarID
It's many times quicker, but because CarID 1 exists in both CarID1 and CarID2 I'm getting this:
CarID | pCount1 | pCount2
-------------------------
1 | 3 | 3
3 | 0 | 1
5 | 1 | 0
When I would expect this:
CarID | pCount1 | pCount2
-------------------------
1 | 2 | 1
3 | 0 | 1
5 | 1 | 0
And I could just sum the pCount1 and pCount2
Is there any way I can achieve the results of the first query using the 2nd method? I'm presuming the GROUP BY clause has something to do with it, but not sure how to omit it.
How about unpivoting the columns and then joining:
SELECT v.CarID, COUNT(p.PID) AS pCount
FROM People p CROSS APPLY
(VALUES (p.CarID1), (p.CarID2)) v(CarID) JOIN
Cars c
ON v.CarID = c.CarId
WHERE v.CarID IS NOT NULL
GROUP BY v.CarID;
If you want to keep cars even with no people, then you can express this as a LEFT JOIN:
SELECT c.CarID, COUNT(p.PID) AS pCount
FROM Cars c LEFT JOIN
(People p CROSS APPLY
(VALUES (p.CarID1), (p.CarID2)) v(CarID)
)
ON v.CarID = c.CarId
GROUP BY c.CarID;
Is p.CarID1 a primary key?
If so, that would explain why a join on CarID1 is fast but a join on CarID2 is slow.
Try creating an index on CarID2 and see if that solves your performance issues.
The index would turn the full table scan into an index lookup, which is a lot faster.
CREATE NONCLUSTERED INDEX CarId2Index
ON People (CarID2);
If that solves it you can keep your query as it is.
Alternatively you can send us the query explain plan so we can see what is slowing it down.
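The scan-versus-lookup claim can be illustrated with a small sketch. It uses SQLite through Python's sqlite3 module rather than SQL Server (an assumption made so the example is self-contained), so the plan text differs from a SQL Server execution plan and NONCLUSTERED is dropped because SQLite does not have that keyword. The helper name plan is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE People (PID INTEGER PRIMARY KEY, CarID1 INTEGER, CarID2 INTEGER)"
)
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?)",
    [(1, 1, 3), (2, 5, None), (3, 1, None), (4, None, 1)],
)

def plan(sql):
    """Return the planner's description of how it would run the statement."""
    # The last column of each EXPLAIN QUERY PLAN row holds the readable detail.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT PID FROM People WHERE CarID2 = 1"

plan_before = plan(query)  # no index on CarID2 yet, so the table is scanned
conn.execute("CREATE INDEX CarId2Index ON People (CarID2)")
plan_after = plan(query)   # the planner can now search via CarId2Index

print(plan_before)
print(plan_after)
```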
Try using SUM with a condition, as below:
SELECT
c.CarID,
SUM(IIF(p1.PID IS NULL, 0, 1)) AS pCount1,
SUM(IIF(p2.PID IS NULL, 0, 1)) AS pCount2
FROM
Cars c
LEFT JOIN People p1 ON p1.CarID1 = c.CarID
LEFT JOIN People p2 ON p2.CarID2 = c.CarID
Group By c.CarID
Try with COALESCE function:
SELECT
c.CarID,
COUNT(p.PID) AS pCount
FROM
Cars c
LEFT JOIN People p ON COALESCE(p.CarID1, p.CarID2) = c.CarID
Group By c.CarID
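One more way to keep the two fast LEFT JOINs from the question is to count distinct people per join, so the rows multiplied by the other join are not counted twice. The sketch below is an illustration, not one of the answers above. It runs on SQLite through Python's sqlite3 module (an assumption, since the thread is about SQL Server) with the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (PID INTEGER PRIMARY KEY, CarID1 INTEGER, CarID2 INTEGER);
    CREATE TABLE Cars (CarID INTEGER PRIMARY KEY);
    INSERT INTO People VALUES (1, 1, 3), (2, 5, NULL), (3, 1, NULL), (4, NULL, 1);
    INSERT INTO Cars VALUES (1), (3), (5);
""")

rows = conn.execute("""
    SELECT c.CarID,
           COUNT(DISTINCT p1.PID) AS pCount1,  -- people whose CarID1 is this car
           COUNT(DISTINCT p2.PID) AS pCount2   -- people whose CarID2 is this car
    FROM Cars c
    LEFT JOIN People p1 ON p1.CarID1 = c.CarID
    LEFT JOIN People p2 ON p2.CarID2 = c.CarID
    GROUP BY c.CarID
    ORDER BY c.CarID
""").fetchall()

for row in rows:
    print(row)
```

Note that a person who has the same car in both CarID1 and CarID2 would be counted once in each column, so summing the two columns would count that person twice, unlike the original OR query.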

After joining two queries (each having different columns) with UNION I'm getting only one column

I have joined two queries with the UNION keyword (Access 2016). It looks like this:
SELECT ITEM.IName, Sum(STOCK_IN.StockIn) AS SumOfIN
FROM ITEM INNER JOIN STOCK_IN ON ITEM.IName = STOCK_IN.IName
GROUP BY ITEM.IName
UNION SELECT ITEM.IName, Sum(STOCK_OUT.StockOut) AS SumOfOut
FROM ITEM INNER JOIN STOCK_OUT ON ITEM.IName = STOCK_OUT.IName
GROUP BY ITEM.IName
I get the following result:
IName | SumOfIN
----------------
Abis Nig | 3
Abrotanum | 1
Acid Acet | 2
Aconite Nap | 2
Aconite Nap | 3
Antim Crud | 3
Antim Tart | 1
But I want the following result:
IName | SumOfIN | SumOfOut
----------------
Abis Nig | 3 | 0
Abrotanum | 1 | 0
Acid Acet | 2 | 0
Aconite Nap | 2 | 3
Antim Crud | 0 | 3
Antim Tart | 0 | 1
Can anyone tell me what changes I should make here?
You need to add dummy values for the third column where they don't exist in the table you are UNIONing. In addition, you need an overall SELECT/GROUP BY since you can have values for both StockIn and StockOut:
SELECT IName, SUM(SumOfIN), Sum(SumOfOut)
FROM (SELECT ITEM.IName, Sum(STOCK_IN.StockIn) AS SumOfIN, 0 AS SumOfOut
FROM ITEM INNER JOIN STOCK_IN ON ITEM.IName = STOCK_IN.IName
GROUP BY ITEM.IName
UNION ALL
SELECT ITEM.IName, 0, Sum(STOCK_OUT.StockOut)
FROM ITEM INNER JOIN STOCK_OUT ON ITEM.IName = STOCK_OUT.IName
GROUP BY ITEM.IName) s
GROUP BY IName
Note that the column names in the result are all taken from the first query in the UNION, so we must name SumOfOut in that query.
You can do this query without UNION at all:
select i.iname, si.sumofin, so.sumofout
from (item as i left join
(select si.iname, sum(si.stockin) as sumofin
from stock_in as si
group by si.iname
) as si
on si.iname = i.iname
) left join
(select so.iname, sum(so.stockout) as sumofout
from stock_out as so
group by so.iname
) as so
on so.iname = i.iname;
This will include items that have no stock in or stock out. That might be a good thing, or a bad thing. If a bad thing, then add:
where si.sumofin > 0 or so.sumofout > 0
If you are going to use union all, then you can dispense with the join to items entirely:
SELECT IName, SUM(SumOfIN), Sum(SumOfOut)
FROM (SELECT si.IName, Sum(si.StockIn) AS SumOfIN, 0 AS SumOfOut
FROM STOCK_IN as si
GROUP BY si.INAME
UNION ALL
SELECT so.IName, 0, Sum(so.StockOut)
FROM STOCK_OUT as so
GROUP BY so.IName
) s
GROUP BY IName;
The JOIN would only be necessary if you had stock items that are not in the items table. That would be a sign of bad data modeling.
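The UNION ALL approach without the ITEM join can be tried out with the sketch below. It runs on SQLite through Python's sqlite3 module instead of Access (an assumption made so the example is runnable anywhere). The individual stock rows are invented so that the per-item sums match the result tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE STOCK_IN (IName TEXT, StockIn INTEGER);
    CREATE TABLE STOCK_OUT (IName TEXT, StockOut INTEGER);
    -- Hypothetical rows chosen to reproduce the sums shown in the question.
    INSERT INTO STOCK_IN VALUES
        ('Abis Nig', 3), ('Abrotanum', 1), ('Acid Acet', 2), ('Aconite Nap', 2);
    INSERT INTO STOCK_OUT VALUES
        ('Aconite Nap', 3), ('Antim Crud', 3), ('Antim Tart', 1);
""")

rows = conn.execute("""
    SELECT IName, SUM(SumOfIN) AS SumOfIN, SUM(SumOfOut) AS SumOfOut
    FROM (SELECT IName, SUM(StockIn) AS SumOfIN, 0 AS SumOfOut  -- placeholder for the missing column
          FROM STOCK_IN
          GROUP BY IName
          UNION ALL
          SELECT IName, 0, SUM(StockOut)                         -- placeholder in the other position
          FROM STOCK_OUT
          GROUP BY IName) s
    GROUP BY IName   -- merges the in-row and out-row of items that appear in both tables
    ORDER BY IName
""").fetchall()

for row in rows:
    print(row)
```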

Count / sum values in subquery and order by it

I have tables like below:
user
id | status
1 | 0
gallery
id | status | create_by_user_id
1 | 0 | 1
2 | 0 | 1
3 | 0 | 1
media
id | status
1 | 0
2 | 0
3 | 0
gallery_media
fk gallery.id fk media.id
id | gallery_id | media_id | sequence
1 | 1 | 1 | 1
2 | 2 | 2 | 1
3 | 2 | 3 | 2
monitor_traffic
endpoint_code: 1 = gallery, 2 = media
id | anonymous_id | user_id | endpoint_code | endpoint_id
1 | 1 | | 1 | 2 (gallery.id 2)
2 | 2 | | 1 | 2 (gallery.id 2)
3 | | 1 | 2 | 3 (media.id 3, which is included in gallery.id 2)
This means gallery.id 2 accounts for 3 rows in total.
gallery_information
fk gallery.id
id | gallery_id
gallery includes media.
monitor_traffic.endpoint_code: 1 .. gallery; 2 .. media
If 1 then monitor_traffic.endpoint_id references gallery.id
monitor_traffic.user_id, monitor_traffic.anonymous_id integer or null
Objective
I want to output gallery rows sorted by traffic: count each gallery's rows in monitor_traffic, then count the rows for the gallery's related media in monitor_traffic, and finally sum the two counts.
The query I provide only counts media in monitor_traffic without summing the counts, and it does not count galleries in monitor_traffic at all.
How can I do this?
This is part of a function that builds the query from input options. I hope to find a solution (maybe with a subquery) that does not require changing other parts of the query.
Query:
SELECT
g.*,
row_to_json(gi.*) as gallery_information
FROM gallery g
LEFT JOIN gallery_information gi ON gi.gallery_id = g.id
LEFT JOIN "user" u ON u.id = g.create_by_user_id
-- start
LEFT JOIN gallery_media gm ON gm.gallery_id = g.id
LEFT JOIN (
SELECT
endpoint_id,
COUNT(*) as mt_count
FROM monitor_traffic
WHERE endpoint_code = 2
GROUP BY endpoint_id
) mt ON mt.endpoint_id = gm.media_id
-- end
ORDER BY mt.mt_count desc NULLS LAST;
I suggest a CTE to count both types in one aggregation and join to it two times in the FROM clause:
WITH mt AS ( -- count once for both media and gallery
SELECT endpoint_code, endpoint_id, count(*) AS ct
FROM monitor_traffic
GROUP BY 1, 2
)
SELECT g.*, row_to_json(gi.*) AS gallery_information
FROM gallery g
LEFT JOIN mt ON mt.endpoint_id = g.id -- 1st join to mt
AND mt.endpoint_code = 1 -- gallery
LEFT JOIN (
SELECT gm.gallery_id, sum(ct) AS ct
FROM gallery_media gm
JOIN mt ON mt.endpoint_id = gm.media_id -- 2nd join to mt
AND mt.endpoint_code = 2 -- media
GROUP BY 1
) mmt ON mmt.gallery_id = g.id
LEFT JOIN gallery_information gi ON gi.gallery_id = g.id
ORDER BY mt.ct DESC NULLS LAST -- count of galleries
, mmt.ct DESC NULLS LAST; -- count of "gallery related media"
Or, to order by the sum of both counts:
...
ORDER BY COALESCE(mt.ct, 0) + COALESCE(mmt.ct, 0) DESC;
Aggregate first, then join. That prevents complications with "proxy-cross joins" that multiply rows:
Two SQL LEFT JOINS produce incorrect result
The LEFT JOIN to "user" seems to be dead freight. Remove it:
LEFT JOIN "user" u ON u.id = g.create_by_user_id
Don't use reserved words like "user" as identifiers, even if that is allowed as long as you double-quote them. It is very error-prone.
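The "aggregate first, then join" idea can be tried on the sample rows with the sketch below. It runs on SQLite through Python's sqlite3 module rather than PostgreSQL (an assumption made so the example is self-contained), so row_to_json and the gallery_information join are left out and the tables are reduced to the columns the counts need. The alias total is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE gallery (id INTEGER PRIMARY KEY);
    CREATE TABLE gallery_media (gallery_id INTEGER, media_id INTEGER);
    CREATE TABLE monitor_traffic (endpoint_code INTEGER, endpoint_id INTEGER);
    INSERT INTO gallery VALUES (1), (2), (3);
    INSERT INTO gallery_media VALUES (1, 1), (2, 2), (2, 3);
    -- Two hits on gallery 2 (code 1) and one hit on media 3 (code 2).
    INSERT INTO monitor_traffic VALUES (1, 2), (1, 2), (2, 3);
""")

rows = conn.execute("""
    WITH mt AS (              -- count once per endpoint, before any join
        SELECT endpoint_code, endpoint_id, COUNT(*) AS ct
        FROM monitor_traffic
        GROUP BY endpoint_code, endpoint_id
    )
    SELECT g.id, COALESCE(mt.ct, 0) + COALESCE(mmt.ct, 0) AS total
    FROM gallery g
    LEFT JOIN mt ON mt.endpoint_id = g.id AND mt.endpoint_code = 1   -- gallery hits
    LEFT JOIN (
        SELECT gm.gallery_id, SUM(mt.ct) AS ct                       -- media hits per gallery
        FROM gallery_media gm
        JOIN mt ON mt.endpoint_id = gm.media_id AND mt.endpoint_code = 2
        GROUP BY gm.gallery_id
    ) mmt ON mmt.gallery_id = g.id
    ORDER BY total DESC, g.id
""").fetchall()

for row in rows:
    print(row)
```

Because each side is reduced to at most one row per gallery before the joins, neither join can multiply the rows of the other.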

SQL left join two tables independently

If I have these tables:
Thing
id | name
---+---------
1 | thing 1
2 | thing 2
3 | thing 3
Photos
id | thing_id | src
---+----------+---------
1 | 1 | thing-i1.jpg
2 | 1 | thing-i2.jpg
3 | 2 | thing2.jpg
Ratings
id | thing_id | rating
---+----------+---------
1 | 1 | 6
2 | 2 | 3
3 | 2 | 4
How can I join them to produce
id | name | rating | photo
---+---------+--------+--------
1 | thing 1 | 6 | NULL
1 | thing 1 | NULL | thing-i1.jpg
1 | thing 1 | NULL | thing-i2.jpg
2 | thing 2 | 3 | NULL
2 | thing 2 | 4 | NULL
2 | thing 2 | NULL | thing2.jpg
3 | thing 3 | NULL | NULL
I.e., left join on each table simultaneously, rather than left joining on one and then the next?
This is the closest I can get:
SELECT Thing.*, Rating.rating, Photo.src
From Thing
Left Join Photo on Thing.id = Photo.thing_id
Left Join Rating on Thing.id = Rating.thing_id
You can get the results you want with a union, which seems the most obvious approach, since each row returns a field from either rating or photo.
Your additional case (a thing with neither) is handled by using left joins instead of inner joins. That produces a duplicate row with NULL, NULL for rating and photo. You could filter it out by moving everything into a subquery and using SELECT DISTINCT in the main query, but the simpler solution is to replace UNION ALL with UNION, which also removes duplicates. It is easier and more readable.
select
t.id,
t.name,
r.rating,
null as photo
from
Thing t
left join Rating r on r.thing_id = t.id
union
select
t.id,
t.name,
null,
p.src
from
Thing t
left join Photo p on p.thing_id = t.id
order by
id,
photo,
rating
Here's what I came up with:
SELECT
Thing.*,
rp.src,
rp.rating
FROM
Thing
LEFT JOIN (
(
SELECT
Photo.src,
Photo.thing_id AS ptid,
Rating.rating,
Rating.thing_id AS rtid
FROM
Photo
LEFT JOIN Rating
ON 1 = 0
)
UNION
(
SELECT
Photo.src,
Photo.thing_id AS ptid,
Rating.rating,
Rating.thing_id AS rtid
FROM
Rating
LEFT JOIN Photo
ON 1 = 0
)
) AS rp
ON Thing.id IN (rp.rtid, rp.ptid)
MySQL has no support for full outer joins so you have to hack around it using a UNION:
Here's the fiddle: http://sqlfiddle.com/#!2/d3d2f/13
SELECT *
FROM (
SELECT Thing.*,
Rating.rating,
NULL AS photo
FROM Thing
LEFT JOIN Rating ON Thing.id = Rating.thing_id
UNION ALL
SELECT Thing.*,
NULL,
Photo.src
FROM Thing
LEFT JOIN Photo ON Thing.id = Photo.thing_id
) s
ORDER BY id, photo, rating
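The UNION approach can be checked against the sample data with the sketch below. It runs on SQLite through Python's sqlite3 module (an assumption; the answers above target MySQL), uses the singular table names from the queries, and adds explicit column aliases in both branches so the ORDER BY names resolve unambiguously.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Thing (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Photo (id INTEGER PRIMARY KEY, thing_id INTEGER, src TEXT);
    CREATE TABLE Rating (id INTEGER PRIMARY KEY, thing_id INTEGER, rating INTEGER);
    INSERT INTO Thing VALUES (1, 'thing 1'), (2, 'thing 2'), (3, 'thing 3');
    INSERT INTO Photo VALUES (1, 1, 'thing-i1.jpg'), (2, 1, 'thing-i2.jpg'), (3, 2, 'thing2.jpg');
    INSERT INTO Rating VALUES (1, 1, 6), (2, 2, 3), (3, 2, 4);
""")

rows = conn.execute("""
    SELECT t.id AS id, t.name AS name, r.rating AS rating, NULL AS photo
    FROM Thing t
    LEFT JOIN Rating r ON r.thing_id = t.id
    UNION              -- UNION (not UNION ALL) collapses the duplicate NULL, NULL row for thing 3
    SELECT t.id AS id, t.name AS name, NULL AS rating, p.src AS photo
    FROM Thing t
    LEFT JOIN Photo p ON p.thing_id = t.id
    ORDER BY id, photo, rating   -- SQLite sorts NULLs first in ascending order
""").fetchall()

for row in rows:
    print(row)
```

On this data the output matches the table requested in the question. A thing that has ratings but no photos (or the reverse) would still get one extra NULL, NULL row from the branch with no match.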

Stored procedure containing inner join with count not working

Let's say I've got a database table that looks a bit like this, containing information about some assignments.
Id | ProfessionId | Title | Deadline | DateCreated | ClosingDate
1 | 5 | Something | 01-12-2012 | 05-11-2012 | 12-11-2012
2 | 6 | Something | 01-12-2012 | 05-11-2012 | 12-11-2012
3 | 7 | Something | 01-12-2012 | 05-11-2012 | 12-11-2012
4 | 7 | Something | 01-12-2012 | 05-11-2012 | 12-11-2012
I want to generate an overview for each profession (assignments belong to a certain profession) and count the number of assignments in each profession. The overview coming from the database should look like this:
Id | Name | FriendlyUrl | Ordinal | NumberOfAssignments
5 | Profession 1 | profession-1 | 1 | 1
6 | Profession 2 | profession-2 | 1 | 1
7 | Profession 3 | profession-3 | 1 | 2
8 | Profession 4 | profession-4 | 1 | 0
I've currently got a stored procedure returning the overview above, except that the number of assignments isn't correct. Assignments with a closing date in the past (we then assume the assignment is closed) shouldn't be counted in the total number of assignments.
The current stored procedure is like this:
BEGIN
SELECT p.Id,
p.Naam,
p.FriendlyUrl,
p.Ordinal,
COUNT(a.ProfessionId) AS NumberOfAssignments
FROM ME_Profession AS p
LEFT OUTER JOIN ME_Assignment AS a ON a.ProfessionId = p.Id
INNER JOIN ME_Client AS c ON a.ClientId = c.Id
INNER JOIN aspnet_Membership AS m ON m.UserId = c.UserId
WHERE m.IsApproved = 1
GROUP BY p.Id, p.Naam, p.FriendlyUrl, p.Ordinal
END
I've already come up with a modified procedure like the one below, but it doesn't work. It feels like I'm either overthinking it or missing something obvious. What could be going wrong?
SELECT p.Id, p.Naam, p.FriendlyUrl, p.Ordinal, pc.NumberOfAssignments
FROM ME_Profession AS p
INNER JOIN ME_Assignment AS a ON a.ProfessionId = p.Id
INNER JOIN ME_Client AS c ON a.ClientId = c.Id
INNER JOIN aspnet_Membership AS m ON m.UserId = c.UserId
INNER JOIN (SELECT a2.ProfessionId, COUNT(*) AS NumberOfAssignments FROM ME_Assignment AS a2 GROUP BY a2.ProfessionId WHERE a2.Closingdate > GETDATE()) pc ON p.ProfessionId = pc.ProfessionId
WHERE m.IsApproved = 1 AND a.Closingdate > GETDATE()
GROUP BY p.Id, p.Naam, p.FriendlyUrl, p.Ordinal
UPDATE 1: Added where condition for date
I don't think you need to join against the table ME_Assignment again. Try this:
SELECT p.Id, p.Naam, p.FriendlyUrl, p.Ordinal,
COUNT(CASE WHEN a.ClosingDate > GETDATE() OR a.ClosingDate IS NULL THEN 1 END) AS NumberOfAssignments
FROM ME_Profession AS p
INNER JOIN ME_Assignment AS a
ON a.ProfessionId = p.Id
INNER JOIN ME_Client AS c
ON a.ClientId = c.Id
INNER JOIN aspnet_Membership AS m
ON m.UserId = c.UserId
WHERE (m.IsApproved = 1)
GROUP BY p.Id, p.Naam, p.FriendlyUrl, p.Ordinal
How about ... LEFT OUTER JOIN (SELECT * FROM ME_Assignment WHERE ClosingDate > GETDATE()) as a ...
I don't see criteria in the current stored procedure that would satisfy this statement:
Assignments with a closingdate in the past (we then assume the
assignment is closed) shouldn't be taken into the total number of
assignments.
It may be you simply need to add the criteria you have in your work in progress to the existing proc:
WHERE ClosingDate > GETDATE()
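If professions with no open assignments should still appear with a count of 0, as in the desired overview, one option is to put the date condition in the ON clause of a LEFT JOIN instead of the WHERE clause. The sketch below illustrates that idea only. It runs on SQLite through Python's sqlite3 module (an assumption; the procedure is SQL Server), leaves out the ME_Client and aspnet_Membership joins, passes a fixed date in place of GETDATE(), and uses invented closing dates stored as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ME_Profession (Id INTEGER PRIMARY KEY, Naam TEXT);
    CREATE TABLE ME_Assignment (Id INTEGER PRIMARY KEY, ProfessionId INTEGER, ClosingDate TEXT);
    INSERT INTO ME_Profession VALUES
        (5, 'Profession 1'), (6, 'Profession 2'), (7, 'Profession 3'), (8, 'Profession 4');
    -- Hypothetical closing dates; assignment 4 is already closed on the chosen date.
    INSERT INTO ME_Assignment VALUES
        (1, 5, '2012-11-12'), (2, 6, '2012-11-12'), (3, 7, '2012-11-12'), (4, 7, '2012-10-01');
""")

today = "2012-11-05"  # stand-in for GETDATE(); ISO date strings compare chronologically

rows = conn.execute("""
    SELECT p.Id, p.Naam, COUNT(a.Id) AS NumberOfAssignments
    FROM ME_Profession p
    LEFT JOIN ME_Assignment a
           ON a.ProfessionId = p.Id
          AND a.ClosingDate > ?      -- filter in ON so unmatched professions survive
    GROUP BY p.Id, p.Naam
    ORDER BY p.Id
""", (today,)).fetchall()

for row in rows:
    print(row)
```

Putting the same condition in WHERE would discard the NULL-extended rows and drop profession 8 from the result.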