Joined SQL query MAX aggregate with condition

Joined SQL query MAX aggregate with condition - sql

I am having trouble with a SQL query. Here is a representation of my schema on SQL Fiddle:
http://sqlfiddle.com/#!15/14c8e/1
The issue is that I want to return rows of data from the Invitations table and join them with a sum of both the 'sent' event_type and 'viewed' event_type from the associated events, as well as the latest created_at date.
I can get all the data and counts working, but am having issue with the last_sent_on. Is there a way I can use a condition in a MAX aggregate function?
e.g.
MAX(
SELECT events.created_at
WHERE event_type='sent'
)
If not, how would I write the proper subselect?
I am currently using Postgresql.
Thank you.

You can use a case statement inside of max just as you've done with sum. The query below will select the maximum created_at for event_type='sent'
SELECT
i.id,
i.name,
i.email,
max(case when e.event_type='sent' then e.created_at end) AS last_sent_on,
sum(case when e.event_type='sent' then 1 else 0 end) AS sent_count,
sum(case when e.event_type='viewed' then 1 else 0 end) AS view_count
FROM
invitations i
LEFT OUTER JOIN
events e
ON e.eventable_id = i.id
WHERE e.eventable_type='Invitation'
GROUP BY i.id, i.name, i.email
SQLFiddle

Try using a subquery to build the max value for sent.
SELECT
i.id,
i.name,
i.email,
sent.last_sent,
sum(case when e.event_type='sent' then 1 else 0 end) AS sent_count,
sum(case when e.event_type='viewed' then 1 else 0 end) AS view_count
FROM
invitations i
LEFT OUTER JOIN
events e
ON e.eventable_id = i.id
LEFT JOIN ( SELECT eventable_id uid, MAX(created_at) AS last_sent
FROM events
WHERE event_type = 'sent'
GROUP BY eventable_id ) AS sent
ON sent.uid = i.id
WHERE e.eventable_type='Invitation'
GROUP BY i.id, i.name, i.email, sent.last_sent

Related

SQL Query not finding all rows in SUM and COUNT

select coalesce(ratings.positive,0) as positive,coalesce(ratings.negative,0) as negative,articles.id,x.username,commentnumb,
articles.category,
articles."createdAt",
articles.id,
articles.title,
articles."updatedAt"
FROM articles
LEFT JOIN (SELECT id AS userId,username,about FROM users) x ON articles.user_id = x.userId
LEFT JOIN (SELECT id,
article_id,
sum(case when rating = '1' then 1 else 0 end) as positive,
sum(case when rating = '0' then 1 else 0 end) as negative
from article_ratings
GROUP by id
) as ratings ON ratings.article_id = articles.id
LEFT JOIN (SELECT article_id,id,
count(article_id) as commentNumb
from article_comments
GROUP by id
) as comments ON comments.article_id = articles.id
WHERE articles."createdAt" <= :date
group by ratings.positive,ratings.negative,articles.id,x.username,commentnumb
order by articles."createdAt" desc
LIMIT 10
The code is working, however I have many more comments and many more ratings than what is counted in both SUM and COUNT functions.
How do I fix this query?
This is using postgres.
I've done some experimentation and it seems that the third join for comments is the one causing issues.

In the derived tables, you should ideally be grouping using article_id. But, you are grouping based on id. Due to this, you are getting more than the necessary rows in the derived tables. I have modified the query to suit your needs.
SELECT COALESCE(ratings.positive,0) AS positive,COALESCE(ratings.negative,0) AS negative,articles.id,x.username,commentnumb,
articles.category,
articles."createdAt",
articles.id,
articles.title,
articles."updatedAt"
FROM articles
LEFT OUTER JOIN (SELECT id AS userId,username,about FROM users) x ON articles.user_id = x.userId
LEFT OUTER JOIN (SELECT article_id,
SUM(case when rating = '1' then 1 else 0 end) as positive,
SUM(case when rating = '0' then 1 else 0 end) as negative
FROM article_ratings
GROUP by article_id
) AS ratings ON ratings.article_id = articles.id
LEFT OUTER JOIN (SELECT article_id,
count(article_id) as commentNumb
FROM article_comments
GROUP by article_id
) AS comments ON comments.article_id = articles.id
WHERE articles."createdAt" <= :date
ORDER BY articles."createdAt" desc
LIMIT 10;

sql join not taking all records from another table

I have a query like this
WITH CTE AS
(
SELECT
U.Name, U.Adluserid AS 'Empid',
MIN(CASE WHEN IOType = 0 THEN Edatetime END) AS 'IN',
MAX(CASE WHEN IOType = 1 THEN Edatetime END) AS 'out',
(CASE
WHEN MAX(E.Status) = 1 THEN 'AL'
WHEN MAX(E.Status) = 2 THEN 'SL'
ELSE 'L'
END) AS leave_status
FROM
Mx_ACSEventTrn
RIGHT JOIN
Mx_UserMst U ON Mx_ACSEventTrn.UsrRefcode = U.UserID
LEFT JOIN
Tbl_Zeo_Empstatus E ON Mx_ACSEventTrn.UsrRefcode = E.Emp_Id
WHERE
CAST(Edatetime AS DATE) BETWEEN '2019-11-03' AND '2019-11-03'
GROUP BY
U.Name, U.Adluserid
)
SELECT
[Name], [Empid], [IN], [OUT],
(CASE
WHEN CAST([IN] AS TIME) IS NULL THEN CAST(leave_status AS NVARCHAR(50))
WHEN CAST([IN] AS TIME) < CAST('08:15' AS TIME) THEN 'P'
ELSE 'L'
END) AS status
FROM
CTE
In my employee master table Mx_UserMst I have 67 employees. But here it is showing only a few employees those who are punched. I want to show all employees from employee master

I believe that the problem is his WHERE clause:
where cast(Edatetime as date) between '2019-11-03' and '2019-11-03'
Why not cast(Edatetime as date) = '2019-11-03'?
I'm not sure in which table the column Edatetime belongs (you should qualify all the columns with the correct table name/alias).
You must move the condition to an ON clause:
WITH CTE AS
(
select U.Name,U.Adluserid as 'Empid',
min(case when IOType=0 then Edatetime end) as 'IN',
max(case when IOType=1 then Edatetime end) as 'out',
case max(E.Status) when 1 then 'AL' when 2 then 'SL' else 'L' end as leave_status
from Mx_UserMst U
left join Mx_ACSEventTrn on Mx_ACSEventTrn.UsrRefcode=U.UserID and (cast(Edatetime as date) between '2019-11-03' and '2019-11-03')
left join Tbl_Zeo_Empstatus E on Mx_ACSEventTrn.UsrRefcode=E.Emp_Id
group by U.Name,U.Adluserid
)
SELECT [Name], [Empid],[IN],[OUT],
case
when cast([IN] as time) is null then cast(leave_status as nvarchar(50))
when cast([IN] as time) < cast('08:15' as time) then 'P'
else 'L'
end as status
FROM CTE
If Edatetime belongs to Tbl_Zeo_Empstatus move the condition to the next join's ON clause.
I also changed the RIGHT to a LEFT join so to make the statement more readable.

If you want to keep everything in a particular table, then that should be the first table in the FROM clause. Subsequent joins should be LEFT JOINs and conditions on subsequent tables should be in the ON clause rather than the WHERE clause.
I would also advise you to use table aliases and to only use single quotes for string and date constants -- NOT column aliases.
The following assumes that IOType and Edatetime are in the table Mx_ACSEventTrn. I should not have to guess. You should qualify all column names in the query.
WITH CTE AS (
SELECT U.Name, U.Adluserid AS Empid,
MIN(CASE WHEN AE.IOType = 0 THEN AE.Edatetime END) AS in_dt,
MAX(CASE WHEN AE.IOType = 1 THEN AE.Edatetime END) AS out_dt,
(CASE WHEN MAX(ES.Status) = 1 THEN 'AL'
WHEN MAX(ES.Status) = 2 THEN 'SL'
ELSE 'L'
END) AS leave_status
FROM Mx_UserMst U LEFT JOIN
Mx_ACSEventTrn AE
ON AE.UsrRefcode = U.UserID AND
CAST(AE.Edatetime AS DATE) BETWEEN '2019-11-03' AND '2019-11-03' LEFT JOIN
Tbl_Zeo_Empstatus ES
ON AE.UsrRefcode = ES.Emp_Id AND
GROUP BY U.Name, U.Adluserid
)
SELECT Name, Empid, IN_DT, OUT_DT,
(CASE WHEN IN_DT IS NULL THEN leave_status
WHEN CAST(IN_DT AS TIME) < CAST('08:15' AS TIME) THEN 'P'
ELSE 'L'
END) AS status
FROM CTE;
Some more points:
Don't name aliases things like IN that are already key words. That is why I gave it the name IN_DT.
There is no reason to cast to a TIME to compare to NULL.
I don't see a reason to cast to NVARCHAR(50) in the outer CASE expression.

Aggregate case when inside non aggregate query

I have a pretty massive query that in its simplest form looks like this:
select r.rep_id, u.user_id, u.signup_date, pi.application_date, pi.management_date, aum
from table1 r
left join table2 u on r.user_id=u.user_id
left join table3 pi on u.user_id=pi.user_id
I need to add one more condition that gives me count of users with non null application date per rep (like: rep 1 has 3 users with filled application dates), and assign it into categories (since 3 users, rep is a certain status category). This looks something like this:
case when sum(case when application_date is not null then 1 else 0 end) >=10 then 'status1'
when sum(case when application_date is not null then 1 else 0 end) >=5 then 'status2'
when sum(case when application_date is not null then 1 else 0 end) >=1 then 'status3'
else 'no_status' end as category
However, if I was to simply add it to the select statement, all reps will becomes of status1 because the sum() is done over all advisors with application dates filled:
select r.rep_id, u.user_id, u.signup_date, pi.application_date, pi.management_date, aum,
(
select case when sum(case when application_date is not null then 1 else 0 end) >=10 then 'status1'
when sum(case when application_date is not null then 1 else 0 end) >=5 then 'status2'
when sum(case when application_date is not null then 1 else 0 end) >=1 then 'status3'
else 'no_status' end as category
from table3
) as category
from table1 r
left join table2 u on r.user_id=u.user_id
left join table3 pi on u.user_id=pi.user_id
Can you assist with having the addition to my query to be across reps and not overall? Much appreciated!

Based on your description, I think you need a window function:
select r.rep_id, u.user_id, u.signup_date, pi.application_date, pi.management_date, aum,
count(pi.application_date) over (partition by r.rep_id) as newcol
from table1 r left join
table2 u
on r.user_id = u.user_id left join
table3 pi
on u.user_id = pi.user_id;
You can use the count() in a case to get ranges, if that is what you prefer.

optimize Table Spool in SQL Server Execution plan

I have the following sql query and trying to optimize it using execution plan. In execution plan it says Estimated subtree cost is 36.89. There are several table spools(Eager Spool). can anyone help me to optimize this query. Thanks in advance.
SELECT
COUNT(DISTINCT bp.P_ID) AS total,
COUNT(DISTINCT CASE WHEN bc.Description != 'S' THEN bp.P_ID END) AS m_count,
COUNT(DISTINCT CASE WHEN bc.Description = 'S' THEN bp.P_ID END) AS s_count,
COUNT(DISTINCT CASE WHEN bc.Description IS NULL THEN bp.P_ID END) AS n_count
FROM
progress_tbl AS progress
INNER JOIN Person_tbl AS bp ON bp.P_ID = progress.person_id
LEFT OUTER JOIN Status_tbl AS bm ON bm.MS_ID = bp.MembershipStatusID
LEFT OUTER JOIN Membership_tbl AS m ON m.M_ID = bp.CurrentMembershipID
LEFT OUTER JOIN Category_tbl AS bc ON bc.MC_ID = m.MembershipCategoryID
WHERE
logged_when BETWEEN '2017-01-01' AND '2017-01-31'

Here's a technique you can use.
WITH T AS
(
SELECT DISTINCT CASE
WHEN bc.Description != 'S' THEN 'M'
WHEN bc.Description = 'S' THEN 'S'
WHEN bc.Description IS NULL THEN 'N'
END AS type,
bp.P_ID
FROM progress_tbl AS progress
INNER JOIN Person_tbl AS bp
ON bp.P_ID = progress.person_id
LEFT OUTER JOIN Status_tbl AS bm
ON bm.MS_ID = bp.MembershipStatusID
LEFT OUTER JOIN Membership_tbl AS m
ON m.M_ID = bp.CurrentMembershipID
LEFT OUTER JOIN Category_tbl AS bc
ON bc.MC_ID = m.MembershipCategoryID
WHERE logged_when BETWEEN '2017-01-01' AND '2017-01-31'
)
SELECT COUNT(DISTINCT P_ID) AS total,
COUNT(CASE WHEN type= 'M' THEN P_ID END) AS m_count,
COUNT(CASE WHEN type= 'S' THEN P_ID END) AS s_count,
COUNT(CASE WHEN type= 'N' THEN P_ID END) AS n_count
FROM T
I will demonstrate it on a simpler example.
Suppose your existing query is
SELECT
COUNT(DISTINCT number) AS total,
COUNT(DISTINCT CASE WHEN name != 'S' THEN number END) AS m_count,
COUNT(DISTINCT CASE WHEN name = 'S' THEN number END) AS s_count,
COUNT(DISTINCT CASE WHEN name IS NULL THEN number END) AS n_count
FROM master..spt_values;
You can rewrite it as follows
WITH T AS
(
SELECT DISTINCT CASE
WHEN name != 'S'
THEN 'M'
WHEN name = 'S'
THEN 'S'
ELSE 'N'
END AS type,
number
FROM master..spt_values
)
SELECT COUNT(DISTINCT number) AS total,
COUNT(CASE WHEN type= 'M' THEN number END) AS m_count,
COUNT(CASE WHEN type= 'S' THEN number END) AS s_count,
COUNT(CASE WHEN type= 'N' THEN number END) AS n_count
FROM T
Note the rewrite is costed as considerably cheaper and the plan is much simpler.

As already pointed out, there seems to be some typo/copy paste issues with your query. This makes it rather difficult for us to figure out what's going on.
The table-spools probably are what's going on in the CASE WHEN b.description etc... constructions. MSSQL first creates a (memory) table with all the resulting values and then that one gets sorted and streamed through the COUNT(DISTINCT ...) operator. I don't think there is much you can do about that as the work needs to be done somewhere.
Anyway, some remarks and wild guesses:
I'm guessing that logged_when is in the progress_tbl table?
If so, do you really need to LEFT OUTER JOIN all the other tables? From what I can tell they aren't being used?
You're trying to count the number of P_IDs that match the criteria and you want to split up that number between those that have b.Description either 'S', something else, or NULL.
for this you could calculate the total as the sum of the m_count, s_count and n_count. This would save you 1 COUNT() operation, not sure it helps a lot in the bigger picture but all bits help I guess.
Something like this:
;WITH counts AS (
SELECT
COUNT(DISTINCT CASE WHEN b.Description != 'S' THEN b_p.P_ID END) AS m_count,
COUNT(DISTINCT CASE WHEN b.Description = 'S' THEN b_p.P_ID END) AS s_count,
COUNT(DISTINCT CASE WHEN b.Description IS NULL THEN b_p.P_ID END) AS n_count
FROM
progress_tbl AS progress
INNER JOIN Person_tbl AS bp ON bp.P_ID = progress.person_id
LEFT OUTER JOIN Status_tbl AS bm ON bm.MS_ID = bp.MembershipStatusID -- really needed?
LEFT OUTER JOIN Membership_tbl AS m ON m.M_ID = bp.CurrentMembershipID -- really needed?
LEFT OUTER JOIN Category_tbl AS bc ON bc.MC_ID = m.MembershipCategoryID -- really needed?
WHERE
logged_when BETWEEN '2017-01-01' AND '2017-01-31' -- what table does logged_when column come from????
)
SELECT total = m_count + s_count + n_count,
*
FROM counts
UPDATE
BEWARE: Using the answer/example code of Martin Smith I came to realize that total isn't necessarily the sum of the other fields. It could be a given P_ID shows up with different description which then might fall into different categories. Depending on your data it might thus be that my answer is plain wrong.

How to convert two rows value in a single row?

I have below sql server query data
Looking for solution.
SQL Query:
SELECT
p.ProjectName,
i.ItemName,
inv.TransactionDirection,
SUM(inv.TransactionQty) AS TransactionQuantity
FROM INVTransaction inv
JOIN BDProject p ON p.ProjectID=inv.ProjectID
JOIN MDItem i ON i.ItemID=inv.ItemID
GROUP BY p.ProjectName,
i.ItemName,
inv.TransactionDirection

I think you just want conditional aggregation:
SELECT p.ProjectName, i.ItemName,
SUM(CASE WHEN inv.TransactionDirection = 'IN' THEN inv.TransactionQty ELSE 0 END) as IN_Quantity,
SUM(CASE WHEN inv.TransactionDirection = 'OUT' THEN inv.TransactionQty ELSE 0 END) as OUT_Quantity,
SUM(CASE WHEN inv.TransactionDirection = 'IN' THEN inv.TransactionQty
WHEN inv.TransactionDirection = 'OUT' THEN -inv.TransactionQty
ELSE 0
END) as Balance
FROM INVTransaction inv JOIN
BDProject p
ON p.ProjectID = inv.ProjectID JOIN
MDItem i ON i.ItemID = inv.ItemID
GROUP BY p.ProjectName, i.ItemName

You can use Pivot for this. Below is working query.
SELECT PROJECTNAME,ITEMNAME,[IN],[OUT] ,ISNULL([IN],0)-ISNULL([OUT],0) AS BALANCE FROM
(SELECT PROJECTNAME,ITEMNAME,TRANSACTIONDIRECTION,TRANSACTIONQUANTITY FROM TRANSACTIONS
)A
PIVOT (SUM(TRANSACTIONQUANTITY) FOR TRANSACTIONDIRECTION IN ([IN],[OUT])) AS PVT
Replace TRANSACTIONS with your table name

As #Gordon Linoff almost provided solution but for balance column you can replace
SUM(inv.TransactionQty) as Balance
With
SUM(CASE WHEN inv.TransactionDirection = 'IN'
THEN inv.TransactionQty
ELSE -1*inv.TransactionQty END) as Balance

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas