Sql Count Where Groupby SubQuery


Hey I have this query,
SELECT item_type.id, item_type.item_type,
(SELECT COUNT(*) FROM item WHERE item.sale_transaction_id IS NULL) as stock_qty,
(SELECT COUNT(*) FROM item WHERE item.sale_transaction_id IS NOT NULL) as sold_qty
FROM item
JOIN item_type ON item.item_type_id = item_type.id
GROUP BY item.item_type_id
This gives me a result:
| id | item_type | stock_qty | sold_qty|
----------------------------------------
| 1 | Book | 12 | 12 |
| 2 | Pencil | 12 | 12 |
| ........... # etc
But this does not work as intended, I need to do it like this to make it work:
SELECT item_type.id, item_type.item_type,
COUNT(item.purchase_transaction_id) - COUNT(item.sale_transaction_id) as stock_qty,
COUNT(item.sale_transaction_id) as sold_qty
FROM item
JOIN item_type ON item.item_type_id = item_type.id
GROUP BY item.item_type_id
and the result is what I want and this is the correct/expected output:
| id | item_type | stock_qty | sold_qty|
----------------------------------------
| 1 | Book | 1 | 0 |
| 2 | Pencil | 0 | 5 |
| ........... # etc
In my Table structure, each item that has sale_transaction_id is marked as sold.
My question is: why does the first query not work as intended, and how can I make it behave like the second one? Is it actually possible to use subqueries for this type of query?

SELECT item_type.id, item_type.item_type,
SUM(case when item.sale_transaction_id IS NULL then 1 else 0 end) as stock_qty,
SUM(case when item.sale_transaction_id IS NOT NULL then 1 else 0 end) as sold_qty
FROM item
JOIN item_type ON item.item_type_id = item_type.id
GROUP BY item_type.id, item_type.item_type
Is this what you need?

You need to add correlation to the subqueries:
SELECT item_type.id, item_type.item_type,
(SELECT COUNT(item.purchase_transaction_id) - COUNT(item.sale_transaction_id)
FROM item
WHERE item.item_type_id = i.item_type_id) as stock_qty,
(SELECT COUNT(item.sale_transaction_id)
FROM item
WHERE item.item_type_id = i.item_type_id ) as sold_qty
FROM item AS i
JOIN item_type ON i.item_type_id = item_type.id
GROUP BY i.item_type_id
The subqueries are now correlated: they are executed for each item_type_id of the outer query and return results for this exact value each time.
But this seems like an overkill, since you can get the same result applying aggregation in the outer query, just like you do in the second query of your question.
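To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made up for illustration and the schema is a trimmed-down assumption of the one in the question. It runs the same count once as an uncorrelated subquery and once as a correlated one.

```python
import sqlite3

# Hypothetical miniature of the question's schema; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_type (id INTEGER PRIMARY KEY, item_type TEXT);
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        item_type_id INTEGER,
        sale_transaction_id INTEGER   -- NULL means still in stock
    );
    INSERT INTO item_type VALUES (1, 'Book'), (2, 'Pencil');
    INSERT INTO item (item_type_id, sale_transaction_id) VALUES
        (1, NULL), (2, 10), (2, 11);
""")

# Uncorrelated: the subquery ignores the outer row and counts the whole table.
uncorrelated = conn.execute("""
    SELECT t.id,
           (SELECT COUNT(*) FROM item
            WHERE sale_transaction_id IS NOT NULL) AS sold_qty
    FROM item_type AS t
    ORDER BY t.id
""").fetchall()

# Correlated: the subquery is restricted to the outer row's item type.
correlated = conn.execute("""
    SELECT t.id,
           (SELECT COUNT(*) FROM item AS i
            WHERE i.item_type_id = t.id
              AND i.sale_transaction_id IS NOT NULL) AS sold_qty
    FROM item_type AS t
    ORDER BY t.id
""").fetchall()

print(uncorrelated)  # [(1, 2), (2, 2)]
print(correlated)    # [(1, 0), (2, 2)]
```

The uncorrelated version repeats the table-wide total on every row, which is the symptom described in the question.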

Start from the "item_type" table instead of the "item" table and use a LEFT JOIN; otherwise a type that has no items will never appear in the query result.
SELECT
item_type.id,
item_type.item_type,
SUM(CASE WHEN item.id IS NOT NULL AND item.sale_transaction_id IS NULL THEN 1 ELSE 0 END) AS stock_qty,
SUM(CASE WHEN item.id IS NOT NULL AND item.sale_transaction_id IS NOT NULL THEN 1 ELSE 0 END) AS sold_qty
FROM
item_type
LEFT JOIN
item
ON
item.item_type_id = item_type.id
GROUP BY
item_type.id, item_type.item_type
Avoid using subselects where a join will do. A correlated subselect may be executed once for each row of the outer query, which can slow the query down considerably. You can run EXPLAIN on both queries (subselect and join version) to see the difference.
It would be helpful if you posted some example data from the initial tables.
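As a rough illustration of the LEFT JOIN point, the sketch below uses Python's sqlite3 module with made-up rows. The 'Eraser' type is a hypothetical type that has no items at all, and the schema is a simplified assumption of the one in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item_type (id INTEGER PRIMARY KEY, item_type TEXT);
    CREATE TABLE item (
        id INTEGER PRIMARY KEY,
        item_type_id INTEGER,
        sale_transaction_id INTEGER
    );
    -- 'Eraser' is a hypothetical type with no items.
    INSERT INTO item_type VALUES (1, 'Book'), (2, 'Pencil'), (3, 'Eraser');
    INSERT INTO item (item_type_id, sale_transaction_id) VALUES
        (1, NULL), (2, 10), (2, 11);
""")

# Driving the query from item_type keeps types that have no matching items.
rows = conn.execute("""
    SELECT t.id, t.item_type,
           SUM(CASE WHEN i.id IS NOT NULL AND i.sale_transaction_id IS NULL
                    THEN 1 ELSE 0 END) AS stock_qty,
           COUNT(i.sale_transaction_id) AS sold_qty
    FROM item_type AS t
    LEFT JOIN item AS i ON i.item_type_id = t.id
    GROUP BY t.id, t.item_type
    ORDER BY t.id
""").fetchall()

print(rows)  # [(1, 'Book', 1, 0), (2, 'Pencil', 0, 2), (3, 'Eraser', 0, 0)]
```

With an inner join the 'Eraser' row would be missing entirely instead of showing zero counts.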

Related

How to right join these series so that this query will return results where count = 0 in Postgresql SQL?

I have the following query that runs on a Postgresql database:
SELECT NULL AS fromdate,
l.eventlevel,
SUM(CASE
WHEN e.id IS NULL THEN 0
ELSE 1
END) AS COUNT
FROM event e
RIGHT JOIN
(SELECT generate_series(0, 3) AS eventlevel) l ON e.event_level = l.eventlevel
WHERE e.project_id = :projectId
GROUP BY l.eventlevel
ORDER BY l.eventlevel DESC
With the (trimmed) event table:
TABLE public.event
id uuid NOT NULL,
event_level integer NOT NULL
This is a variant for a bucketed query but with all data, hence the NULL fromdate.
I'm trying to get counts from the table event and counted per event_level. But I also want the number 0 to return when there aren't any events for that particular event_level. But the current right join is not doing that job. What am I doing wrong?
I also tried adding OR e.project_id IS null thinking it might be filtering out the 0 counts. Or would this work with a CROSS JOIN and if so how?
Current result:
+----------+------------+-------+
| fromdate | eventlevel | count |
+----------+------------+-------+
| null | 3 | 1 |
+----------+------------+-------+
Desired result:
+----------+------------+-------+
| fromdate | eventlevel | count |
+----------+------------+-------+
| null | 3 | 1 |
| null | 2 | 0 |
| null | 1 | 0 |
| null | 0 | 0 |
+----------+------------+-------+

I recommend avoiding RIGHT JOINs and using LEFT JOINs. They are just simpler for following the logic -- keep everything in the first table and matching rows in the subsequent ones.
Your issue is the placement of the filter -- it filters out the outer joined rows. So that needs to go into the ON clause. I would recommend:
SELECT NULL AS fromdate, gs.eventlevel,
COUNT(e.id) as count
FROM generate_series(0, 3) gs(eventlevel) LEFT JOIN
event e
ON e.event_level = gs.eventlevel AND e.project_id = :projectId
GROUP BY gs.eventlevel
ORDER BY gs.eventlevel DESC;
Note the other simplifications:
No subquery is needed for generate_series.
You can use COUNT() instead of your case logic.
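The effect of the filter placement can be reproduced with a small sketch in Python's sqlite3 module. This is an approximation under stated assumptions: the default SQLite build used by Python may not ship generate_series, so a VALUES list named levels stands in for it, and the two event rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (
        id INTEGER PRIMARY KEY,
        event_level INTEGER NOT NULL,
        project_id INTEGER NOT NULL
    );
    INSERT INTO event (event_level, project_id) VALUES (3, 1), (2, 2);
""")

# Stand-in for generate_series(0, 3).
levels = "WITH levels(eventlevel) AS (VALUES (0), (1), (2), (3)) "

# Filter in WHERE: rows without a matching event have project_id NULL
# and are discarded after the join.
filter_in_where = conn.execute(levels + """
    SELECT l.eventlevel, COUNT(e.id)
    FROM levels AS l
    LEFT JOIN event AS e ON e.event_level = l.eventlevel
    WHERE e.project_id = ?
    GROUP BY l.eventlevel
    ORDER BY l.eventlevel DESC
""", (1,)).fetchall()

# Filter in ON: every level survives; unmatched levels count as 0.
filter_in_on = conn.execute(levels + """
    SELECT l.eventlevel, COUNT(e.id)
    FROM levels AS l
    LEFT JOIN event AS e
           ON e.event_level = l.eventlevel AND e.project_id = ?
    GROUP BY l.eventlevel
    ORDER BY l.eventlevel DESC
""", (1,)).fetchall()

print(filter_in_where)  # [(3, 1)]
print(filter_in_on)     # [(3, 1), (2, 0), (1, 0), (0, 0)]
```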

You have to move the e.project_id condition from the WHERE clause to the ON clause to get true RIGHT JOIN result:
...
END) AS COUNT
FROM event e
RIGHT JOIN
(SELECT generate_series(0, 3) AS eventlevel) l ON e.event_level = l.eventlevel
AND e.project_id = :projectId
...

How to COUNT different values without adding to GROUP BY

I have a data set that contains a name for every "job" record, and whether the job passed or failed. I want to show the Name, number of jobs, how many passed, and how many failed in one row.
I am grouping the name and using COUNT on the name to count the total number of jobs, which works fine, but I can't show how many passed and how many failed without adding them to the GROUP BY clause causing the data to separate again.
SELECT I.Name, Count(I.Name) As NumberOfJobs,
CASE WHEN WI.resultTypeID = 1 THEN COUNT(WI.resultTypeID) END AS [Passed],
CASE WHEN WI.resultTypeID = 2 THEN COUNT(WI.resultTypeID) END AS [Failed]
FROM DB.DBO.People AS I
INNER JOIN DB2.dbo.Jobs AS WI ON I.JOBID = WI.JOBID
GROUP BY I.Name, wi.resultTypeID
+-----------+-----------+--------+--------+
| Name | NumofJobs | Passed | Failed |
+-----------+-----------+--------+--------+
| Dale Test | 2 | 2 | NULL |
| Dale Test | 2 | NULL | 2 |
+-----------+-----------+--------+--------+
This is what happens when I add ResultTypeID to the GROUP BY, but I want this:
+-----------+-----------+--------+--------+
| Name | NumofJobs | Passed | Failed |
+-----------+-----------+--------+--------+
| Dale Test | 4 | 2 | 2 |
+-----------+-----------+--------+--------+
Is there any way to do this?

You want conditional aggregation. The case expression is an argument to the aggregation function:
SELECT I.Name, Count(*) As NumberOfJobs,
SUM(CASE WHEN WI.resultTypeID = 1 THEN 1 ELSE 0 END) AS [Passed],
SUM(CASE WHEN WI.resultTypeID = 2 THEN 1 ELSE 0 END) AS [Failed]
FROM DB.DBO.People I INNER JOIN
DB2.dbo.Jobs WI
ON I.JOBID = WI.JOBID
GROUP BY I.Name;
I am guessing that wi.resultTypeID is not NULL, so I replaced the COUNT() with SUM() because I prefer SUM() in this case.

You don't need to group your query by wi.resultTypeID.
Simply remove wi.resultTypeID from the GROUP BY clause and put it inside an aggregate function:
SELECT I.Name, Count(I.Name) As NumberOfJobs,
SUM(CASE WHEN WI.resultTypeID = 1 THEN 1 ELSE 0 END) AS [Passed],
SUM(CASE WHEN WI.resultTypeID = 2 THEN 1 ELSE 0 END) AS [Failed]
FROM DB.DBO.People AS I
INNER JOIN DB2.dbo.Jobs AS WI ON I.JOBID = WI.JOBID
GROUP BY I.Name
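A minimal sketch of conditional aggregation, run through Python's sqlite3 module. The single jobs table and its rows are assumptions that flatten the People/Jobs join from the question into one table for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical flattened table: 1 = passed, 2 = failed.
    CREATE TABLE jobs (name TEXT, result_type_id INTEGER);
    INSERT INTO jobs VALUES
        ('Dale Test', 1), ('Dale Test', 1),
        ('Dale Test', 2), ('Dale Test', 2);
""")

# The CASE expression sits inside the aggregate, so only name is grouped.
rows = conn.execute("""
    SELECT name,
           COUNT(*) AS number_of_jobs,
           SUM(CASE WHEN result_type_id = 1 THEN 1 ELSE 0 END) AS passed,
           SUM(CASE WHEN result_type_id = 2 THEN 1 ELSE 0 END) AS failed
    FROM jobs
    GROUP BY name
""").fetchall()

print(rows)  # [('Dale Test', 4, 2, 2)]
```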

After joining two queries (each having different columns) with UNION I'm getting only one column

I have joined two queries with the UNION keyword (Access 2016). It looks like this:
SELECT ITEM.IName, Sum(STOCK_IN.StockIn) AS SumOfIN
FROM ITEM INNER JOIN STOCK_IN ON ITEM.IName = STOCK_IN.IName
GROUP BY ITEM.IName
UNION SELECT ITEM.IName, Sum(STOCK_OUT.StockOut) AS SumOfOut
FROM ITEM INNER JOIN STOCK_OUT ON ITEM.IName = STOCK_OUT.IName
GROUP BY ITEM.IName
I get the following result:
IName | SumOfIN
----------------
Abis Nig | 3
Abrotanum | 1
Acid Acet | 2
Aconite Nap | 2
Aconite Nap | 3
Antim Crud | 3
Antim Tart | 1
But I want the following result:
IName | SumOfIN | SumOfOut
----------------
Abis Nig | 3 | 0
Abrotanum | 1 | 0
Acid Acet | 2 | 0
Aconite Nap | 2 | 3
Antim Crud | 0 | 3
Antim Tart | 0 | 1
Can anyone tell me what changes should I make here?

You need to add dummy values for the third column where they don't exist in the table you are UNIONing. In addition, you need an overall SELECT/GROUP BY since you can have values for both StockIn and StockOut:
SELECT IName, SUM(SumOfIN), Sum(SumOfOut)
FROM (SELECT ITEM.IName, Sum(STOCK_IN.StockIn) AS SumOfIN, 0 AS SumOfOut
FROM ITEM INNER JOIN STOCK_IN ON ITEM.IName = STOCK_IN.IName
GROUP BY ITEM.IName
UNION ALL
SELECT ITEM.IName, 0, Sum(STOCK_OUT.StockOut)
FROM ITEM INNER JOIN STOCK_OUT ON ITEM.IName = STOCK_OUT.IName
GROUP BY ITEM.IName) s
GROUP BY IName
Note that column names in the result table are all taken from the first table in the UNION, so we must name SumOfOut in that query.

You can do this query without UNION at all:
select i.iname, si.sumofin, so.sumofout
from (item as i left join
(select si.iname, sum(si.stockin) as sumofin
from stock_in as si
group by si.iname
) as si
on si.iname = i.iname
) left join
(select so.iname, sum(so.stockout) as sumofout
from stock_out as so
group by so.iname
) as so
on so.iname = i.iname;
This will include items that have no stock in or stock out. That might be a good thing, or a bad thing. If a bad thing, then add:
where si.sumofin > 0 or so.sumofout > 0
If you are going to use union all, then you can dispense with the join to items entirely:
SELECT IName, SUM(SumOfIN), Sum(SumOfOut)
FROM (SELECT si.IName, Sum(si.StockIn) AS SumOfIN, 0 AS SumOfOut
FROM STOCK_IN as si
GROUP BY si.INAME
UNION ALL
SELECT so.IName, 0, Sum(so.StockOut)
FROM STOCK_OUT AS so
GROUP BY so.IName
) s
GROUP BY IName;
The JOIN would only be necessary if you had stock items that are not in the items table. That would be a sign of bad data modeling.
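The UNION ALL pattern with placeholder columns can be tried out with a short sketch in Python's sqlite3 module. SQLite is used here instead of Access, so this is an illustration of the idea, not a tested Access query. The lower-case table and column names and the three sample items are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock_in (iname TEXT, stockin INTEGER);
    CREATE TABLE stock_out (iname TEXT, stockout INTEGER);
    INSERT INTO stock_in VALUES ('Abis Nig', 3), ('Aconite Nap', 2);
    INSERT INTO stock_out VALUES ('Aconite Nap', 3), ('Antim Crud', 3);
""")

# Each branch pads the column it does not have with 0, so both branches
# have the same three columns; the outer query then sums per item.
rows = conn.execute("""
    SELECT iname, SUM(sum_in) AS sum_of_in, SUM(sum_out) AS sum_of_out
    FROM (
        SELECT iname, stockin AS sum_in, 0 AS sum_out FROM stock_in
        UNION ALL
        SELECT iname, 0, stockout FROM stock_out
    ) AS s
    GROUP BY iname
    ORDER BY iname
""").fetchall()

print(rows)
# [('Abis Nig', 3, 0), ('Aconite Nap', 2, 3), ('Antim Crud', 0, 3)]
```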

Sum Group By Column

I have a column (PL.UNITS) that I need to total at the bottom of the query results. Is it possible to sum PL.UNITS when it is already summed?
Please see query below.
SELECT ID.DUEDATE AS [DUE DATE], CD.RENEWALDATE, CD.RENEWALSTATUS, CD.CONTRACTNUMBER, L.LOCNAME, L.LOCADDRESS1, L.LOCADDRESS2, L.LOCADDRESS3, L.LOCADDRESS4, L.POSTCODE, SUM(PL.UNITS) AS UNITS from CLIENTDETAILS CD
INNER JOIN LOCATIONS L ON CD.CLIENTNUMBER = L.CLIENTNUMBER
INNER JOIN ITEMDETAILS ID ON L.LOCNUMBER = ID.LOCNUMBER
INNER JOIN PLANT PL ON ID.CODE = PL.CODE
WHERE L.OWNER = 210 and L.STATUSLIVE = 1 and ID.DUEDATE > '01/01/2017'
GROUP BY ID.DUEDATE, CD.RENEWALDATE, CD.RENEWALSTATUS, CD.CONTRACTNUMBER, L.LOCNAME, L.LOCADDRESS1, L.LOCADDRESS2, L.LOCADDRESS3, L.LOCADDRESS4, L.POSTCODE

It's probably best to do this sort of thing in front end development. Nevertheless, here is an example (quick and dirty, but shows the idea) for sql-server:
SELECT COALESCE(a.id, 'total') AS id
, SUM(a.thing) AS thing_summed
FROM (
SELECT '1' id
, 1 thing
UNION
SELECT '2'
, 2 thing
UNION
SELECT '1'
, 3 thing
) AS a
GROUP BY ROLLUP(a.id)
Result:
+-------+--------------+
| id | thing_summed |
+-------+--------------+
| 1 | 4 |
| 2 | 2 |
| total | 6 |
+-------+--------------+
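For engines without ROLLUP, a similar total row can be appended by hand. The sketch below uses Python's sqlite3 module (SQLite has no GROUP BY ROLLUP), so it swaps in a UNION ALL of the grouped rows and one grand-total row. The things table and the is_total sort helper are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE things (id TEXT, thing INTEGER);
    INSERT INTO things VALUES ('1', 1), ('2', 2), ('1', 3);
""")

# Grouped rows first, then a single grand-total row; is_total only
# exists to keep the total at the bottom.
rows = conn.execute("""
    SELECT id, thing_summed
    FROM (
        SELECT id, SUM(thing) AS thing_summed, 0 AS is_total
        FROM things
        GROUP BY id
        UNION ALL
        SELECT 'total', SUM(thing), 1
        FROM things
    ) AS s
    ORDER BY is_total, id
""").fetchall()

print(rows)  # [('1', 4), ('2', 2), ('total', 6)]
```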

Faster SQL query with CASE in JOIN instead of CASE in SELECT statement of query?

I have a view of CommunityMembers where each has a primary key for ID. Some also have old ID's from another system and some have a spouse ID. All ID's are unique.
e.g.:
ID | Name | OldID | SpouseID | SpouseName
1 | John.Smith | o71 | s99 | Jenna.Smith
2 | Jane.Doe | o72 | |
3 | Jessie.Jones | |
I also have a view of ActivityDates where each Community member can have multiple activity dates. There are activity dates for old ID's and for Spouse ID's. (Unfortunately I can't clean the data up by converting old to new ID's)
e.g.:
ID | ActivityDate | ActiviyType | ActivityGroup
1 | 2017-12-31 | 1 | 1
1 | 2017-12-31 | 3 | 2
1 | 2017-12-31 | 7 | 1
2 | 2017-12-31 | 1 | 1
3 | 2017-12-31 | 1 | 1
o72 | 2010-12-31 | 1 | 2
o72 | 2010-12-31 | 3 | 1
s99 | 2017-12-31 | 1 | 1
s99 | 2017-12-31 | 2 | 1
I can select the data in the way I need it using the following method having multiple case selects running 3 times to check the 3 possible ID's though it is very slow because it is running a select query multiple times per record:
SELECT
C.ID,
C.Name,
C.OldID,
C.SpouseID,
C.SpouseName,
CASE
WHEN C.ID IN (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActiviyType = 1 AND ActiviyGroup = 1)
AND NOT EXISTS (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActiviyType > 1 AND ActiviyGroup > 1)
OR C.OldID IN (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActiviyType = 1 AND ActiviyGroup = 1)
AND NOT EXISTS (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActiviyType > 1 AND ActiviyGroup > 1)
OR C.SpouseID IN (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActiviyType = 1 AND ActiviyGroup = 1)
AND NOT EXISTS (SELECT ID FROM ActivityDates WHERE ActivityDate > '2016-12-31' AND ActiviyType > 1 AND ActiviyGroup > 1)
THEN 'Yes'
ELSE ''
END AS Result -- i.e. HasTheCommunityMemberOrTheirSpouseOnlyEverAttendedActivityTypeAndGroup1After2016?
So I would expect the following results, which I get, it is just slow:
ID | Name | OldID | SpouseID | SpouseName | Result
1 | John.Smith | o71 | s99 | Jenna.Smith |
2 | Jane.Doe | o72 | | | Yes
3 | Jessie.Jones | | | | Yes
I appreciate that there are better ways to do this which I'm happy to hear suggestions on though I have limited flexibility in changing this system so that aside all I am asking is how can I make this faster? Ideally I want to use a join to the table and use conditions off that though I can't work it out. e.g.
SELECT
C.ID, C.Name,
C.OldID, C.SpouseID, C.SpouseName,
R.Result
FROM
CommunityMembers C
JOIN
CASE WHEN Date ... Type ... Group ... ELSE ... IN ... Not Exist ... THEN ... ActivityDates R
or
SELECT
C.ID, C.Name,
C.OldID, C.SpouseID, C.SpouseName,
CASE
WHEN R.Date ... R.Type ... R.Group ... ELSE ... THEN 'Yes' END AS Result
FROM
CommunityMembers C
JOIN
ActivityDates R
I suspect I need to make multiple joins though I don't know how to write it.
Thank you

An index is created like this:
CREATE INDEX index_name
ON table_name (column1, column2, ...);
see this link for more details

You want information from table ActivityDates per ID. So group by ID and filter the desired IDs in HAVING:
SELECT ID
FROM ActivityDates
WHERE ActivityDate > '2016-12-31'
GROUP BY ID
HAVING COUNT(CASE WHEN ActiviyType = 1 AND ActiviyGroup = 1 THEN 1 END) > 0
AND COUNT(CASE WHEN ActiviyType > 1 AND ActiviyGroup > 1 THEN 1 END) = 0
You can use this with an EXISTS clause:
select
c.*,
case when exists
(
SELECT a.ID
FROM ActivityDates a
WHERE a.ActivityDate > '2016-12-31'
AND a.ID in (c.id, c.oldid, c.spouseid)
GROUP BY a.ID
HAVING COUNT(CASE WHEN ActiviyType = 1 AND ActiviyGroup = 1 THEN 1 END) > 0
AND COUNT(CASE WHEN ActiviyType > 1 AND ActiviyGroup > 1 THEN 1 END) = 0
) then 'Yes' else '' end as result
from CommunityMembers c;
Appropriate indexes to speed this up may be
create index idx1 on ActivityDates (ID, ActivityDate, ActivityType, ActivityGroup);
create index idx2 on ActivityDates (ActivityDate, ID, ActivityType, ActivityGroup);
Find out whether one of them gets used and drop the other (or both if neither gets used).
It is possible that a non-correlated subquery (which means we must access it multiple times) performs better. Whether this even leads to a different execution plan depends on the optimizer:
with good_ids as
(
select id
from activitydates
where activitydate > '2016-12-31'
group by id
having count(case when activiytype = 1 and activiygroup = 1 then 1 end) > 0
and count(case when activiytype > 1 and activiygroup > 1 then 1 end) = 0
)
select
c.*,
case when id in (select id from good_ids)
or oldid in (select id from good_ids)
or spouseid in (select id from good_ids)
then 'Yes' else ''
end as result
from CommunityMembers c;
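The grouped HAVING filter can be checked against the sample rows from the question with a small sketch in Python's sqlite3 module. Two assumptions: the column names are spelled out in full (ActivityType, ActivityGroup), and "has attended type 1 / group 1" is read as "at least one matching row", hence the > 0 comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ActivityDates (
        ID TEXT, ActivityDate TEXT, ActivityType INTEGER, ActivityGroup INTEGER
    );
    -- Sample rows copied from the question.
    INSERT INTO ActivityDates VALUES
        ('1',   '2017-12-31', 1, 1),
        ('1',   '2017-12-31', 3, 2),
        ('1',   '2017-12-31', 7, 1),
        ('2',   '2017-12-31', 1, 1),
        ('3',   '2017-12-31', 1, 1),
        ('o72', '2010-12-31', 1, 2),
        ('o72', '2010-12-31', 3, 1),
        ('s99', '2017-12-31', 1, 1),
        ('s99', '2017-12-31', 2, 1);
""")

# One row per ID: keep IDs with at least one type 1 / group 1 activity
# and no activity where both type and group are greater than 1.
good_ids = [row[0] for row in conn.execute("""
    SELECT ID
    FROM ActivityDates
    WHERE ActivityDate > '2016-12-31'
    GROUP BY ID
    HAVING COUNT(CASE WHEN ActivityType = 1 AND ActivityGroup = 1 THEN 1 END) > 0
       AND COUNT(CASE WHEN ActivityType > 1 AND ActivityGroup > 1 THEN 1 END) = 0
    ORDER BY ID
""")]

print(good_ids)  # ['2', '3', 's99']
```

Note that 's99' passes because its (2, 1) row does not satisfy both "> 1" conditions at once. If the rule is meant to exclude any activity outside type 1 and group 1, the second condition would need OR instead of AND, which would drop 's99' and agree with the expected output in the question.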

You should explain the expected output. It is difficult to work out the correct business rule from an incorrect query.
That way you will get the best query here. Please explain again why IDs 2 and 3 are 'Yes', and then I will rewrite my query.
The second big mistake you are about to make is creating an index before you understand the business rule and have written a correct query.
Try this:
-- sample data from the question, held in table variables
declare @t table(ID varchar(20),Name varchar(40),OldID varchar(20), SpouseID varchar(20)
, SpouseName varchar(40))
insert into @t VALUES
('1','John.Smith','o71' ,'s99','Jenna.Smith')
,('2','Jane.Doe' ,'o72',null,null)
,('3','Jessie.Jones',null,null,null)
--select * from @t
declare @ActivityDates table(ID varchar(20), ActivityDate date
, ActiviyType int, ActivityGroup int)
insert into @ActivityDates VALUES
('1','2017-12-31',1, 1)
,('1','2017-12-31',3, 2)
,('1','2017-12-31',7, 1)
,('2','2017-12-31',1, 1)
,('3','2017-12-31',1, 1)
,('o72','2010-12-31',1, 2)
,('o72','2010-12-31',3, 1)
,('s99','2017-12-31',1, 1)
,('s99','2017-12-31',2, 1)
-- 'Yes' when the member has a type 1 / group 1 activity after 2016
-- and no other kind of activity in that period
SELECT t.*
,case when tbl.id is not null then 'Yes' else null end Remarks
from @t t
left JOIN
(select * from @ActivityDates AD
WHERE(( ActivityDate > '2016-12-31' AND ActiviyType = 1 AND ActivityGroup = 1
AND NOT EXISTS (SELECT ID FROM @ActivityDates ad1 WHERE (ad.id=ad1.id) AND
ActivityDate > '2016-12-31' AND (ActiviyType > 1 or ActivityGroup > 1))
)
))tbl
on t.ID=tbl.ID

Here is another pattern for utilising 'optional joins' that may or may not perform better. It's not quite the same as your output - I'm not sure what you're after there.
SELECT A.*,
COALESCE(C1.Name, C2.Name, C3.Name) As Name
FROM ActivityDates A
LEFT OUTER JOIN CommunityMembers As C1
ON C1.ID = A.ID
LEFT OUTER JOIN CommunityMembers As C2
ON C2.OldID = CAST(A.ID AS VARCHAR(12))
LEFT OUTER JOIN CommunityMembers As C3
ON C3.SpouseID = CAST(A.ID AS VARCHAR(12))
There are cases where this will 'double count' but if you are certain that the entire collection of id's is unique you should be fine. If you only want to know if an activity record exists you can definitely speed this up by using exists but again I don't follow your logic.
