Aggregate case when inside non aggregate query - sql

I have a pretty massive query that in its simplest form looks like this:
select r.rep_id, u.user_id, u.signup_date, pi.application_date, pi.management_date, aum
from table1 r
left join table2 u on r.user_id=u.user_id
left join table3 pi on u.user_id=pi.user_id
I need to add one more column that gives me the count of users with a non-null application date per rep (for example, rep 1 has 3 users with filled application dates) and assigns the rep to a category based on that count (with 3 users, the rep falls into a certain status category). It looks something like this:
case when sum(case when application_date is not null then 1 else 0 end) >=10 then 'status1'
when sum(case when application_date is not null then 1 else 0 end) >=5 then 'status2'
when sum(case when application_date is not null then 1 else 0 end) >=1 then 'status3'
else 'no_status' end as category
However, if I simply add it to the select statement, all reps become status1, because the sum() is computed over all advisors with application dates filled rather than per rep:
select r.rep_id, u.user_id, u.signup_date, pi.application_date, pi.management_date, aum,
(
select case when sum(case when application_date is not null then 1 else 0 end) >=10 then 'status1'
when sum(case when application_date is not null then 1 else 0 end) >=5 then 'status2'
when sum(case when application_date is not null then 1 else 0 end) >=1 then 'status3'
else 'no_status' end as category
from table3
) as category
from table1 r
left join table2 u on r.user_id=u.user_id
left join table3 pi on u.user_id=pi.user_id
How can I make this addition to my query work per rep rather than overall? Much appreciated!

Based on your description, I think you need a window function:
select r.rep_id, u.user_id, u.signup_date, pi.application_date, pi.management_date, aum,
count(pi.application_date) over (partition by r.rep_id) as newcol
from table1 r left join
table2 u
on r.user_id = u.user_id left join
table3 pi
on u.user_id = pi.user_id;
You can use the count() in a case to get ranges, if that is what you prefer.
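As a rough illustration of wrapping the windowed count() in a case, here is a minimal sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or newer). The data is made up, table2 is left out to keep the example short, and the alias app_count is an assumption, not something from the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (rep_id INTEGER, user_id INTEGER);
    CREATE TABLE table3 (user_id INTEGER, application_date TEXT);
""")

# Hypothetical data: rep 1 has 7 users (6 with an application date),
# rep 2 has 2 users (1 with a date), rep 3 has 1 user with no table3 row.
rep_users = [(1, u) for u in range(101, 108)] + [(2, 201), (2, 202), (3, 301)]
applications = [(u, "2020-01-01") for u in range(101, 107)]
applications += [(201, "2020-02-01"), (202, None)]
conn.executemany("INSERT INTO table1 VALUES (?, ?)", rep_users)
conn.executemany("INSERT INTO table3 VALUES (?, ?)", applications)

query = """
SELECT rep_id, user_id, application_date,
       CASE WHEN app_count >= 10 THEN 'status1'
            WHEN app_count >= 5  THEN 'status2'
            WHEN app_count >= 1  THEN 'status3'
            ELSE 'no_status' END AS category
FROM (
    SELECT r.rep_id, r.user_id, pi.application_date,
           -- COUNT(column) skips NULLs, so this counts filled dates per rep
           COUNT(pi.application_date) OVER (PARTITION BY r.rep_id) AS app_count
    FROM table1 r
    LEFT JOIN table3 pi ON r.user_id = pi.user_id
) AS t
ORDER BY rep_id, user_id
"""
rows = conn.execute(query).fetchall()

# Every row of a rep carries the same category, so a dict collapses them.
categories = {rep_id: category for rep_id, _, _, category in rows}
print(categories)
```

The detail rows are all kept (one per user), and the category is repeated on each row of the same rep, which is what distinguishes the window approach from a GROUP BY.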

Related

using Query one results to run Query two

SELECT distinct
Pd.Cpd as ' accountnumber'
FROM [RQL_ALK_PMT].[Cts_opps] pd
INNER JOIN [RQL_ALK_PMT].[Cts_opps].dpo.cnms_id metg on metg.cnms_id=me.cnms_id
This code returns the following:
Result
accountnumber
1332132
3213123
5641202
6412221
1233242
What I would like to do is take the results of the code above and feed them into the query below. The common denominator is the account number, because the second query runs against a different table:
SELECT
pm.AcctNumb as 'accountnumber'
, SUM(CASE WHEN pm.cusid IN ('cr') THEN 1 ELSE 0 END) AS CA
, SUM(CASE WHEN pm.cusid IN ('gb') THEN 1 ELSE 0 END) AS GB
, SUM(CASE WHEN pm.cusid IN ('tev','offev','Lastev') THEN 1 ELSE 0 END) AS chr
, SUM(CASE WHEN pm.cusid IN ('pm','pr','che') THEN 1 ELSE 0 END) AS Act
, SUM(CASE WHEN pm.cusid IN ('supev','tev') THEN 1 ELSE 0 END) AS Fulfillment
FROM ops.medadata pm WITH (NOLOCK)
INNER JOIN mw.pim_acct Ma with (nolock) ON ma.AcctNumb= pm.AcctNumb
Where pm.AcctNumb in ()
GROUP BY
pm.AcctNumb
I have tried the code below, but it doesn't seem to work:
With counta as (
SELECT distinct
Pd.Cpd as ' accountnumber'
FROM [RQL_ALK_PMT].[Cts_opps] pd
INNER JOIN [RQL_ALK_PMT].[Cts_opps].dpo.cnms_id metg on metg.cnms_id=me.cnms_id
)
SELECT
pm.AcctNumb as 'accountnumber'
, SUM(CASE WHEN pm.cusid IN ('cr') THEN 1 ELSE 0 END) AS CA
, SUM(CASE WHEN pm.cusid IN ('gb') THEN 1 ELSE 0 END) AS GB
, SUM(CASE WHEN pm.cusid IN ('tev','offev','Lastev') THEN 1 ELSE 0 END) AS chr
, SUM(CASE WHEN pm.cusid IN ('pm','pr','che') THEN 1 ELSE 0 END) AS Act
, SUM(CASE WHEN pm.cusid IN ('supev','tev') THEN 1 ELSE 0 END) AS Fulfillment
FROM ops.medadata pm WITH (NOLOCK)
INNER JOIN mw.pim_acct Ma with (nolock) ON ma.AcctNumb= pm.AcctNumb
left join counta on Pm.accountnumber = counta.accountnumber
Where pm.AcctNumb in (counta.accountnumber)
GROUP BY
pm.AcctNumb
**I'm having an issue joining the two tables together.**
IN is not equivalent to a join, but you are treating it that way.
Instead, think of it as "IN this LIST", where the list can be supplied by you or by a subquery, e.g.
a list written out in the query itself
select * from atable
where acol IN ('a','b','c') -- i.e. the list is hardcoded
or, using a subquery
SELECT ...etc.
WHERE pm.AcctNumb IN (
SELECT Pd.Cpd
FROM [RQL_ALK_PMT].[Cts_opps] pd
INNER JOIN [RQL_ALK_PMT].[Cts_opps].dpo.cnms_id metg on metg.cnms_id=me.cnms_id
)
or, if using a CTE
With counta as (
SELECT Pd.Cpd
FROM [RQL_ALK_PMT].[Cts_opps] pd
INNER JOIN [RQL_ALK_PMT].[Cts_opps].dpo.cnms_id metg on metg.cnms_id=me.cnms_id
)
SELECT ... etc.
WHERE pm.AcctNumb IN (
SELECT Cpd
FROM counta
)
Note: it usually is not more efficient to use select distinct when forming a subquery to be used with an IN list.
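To make the difference between IN and a join concrete, here is a small sketch using Python's built-in sqlite3 module. The tables opps and metadata and their contents are hypothetical stand-ins for the tables in the question; one account number is deliberately duplicated in the list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE opps (cpd INTEGER);
    CREATE TABLE metadata (acct_numb INTEGER, cusid TEXT);
    -- account 3213123 appears twice in the list on purpose
    INSERT INTO opps VALUES (1332132), (3213123), (3213123);
    INSERT INTO metadata VALUES
        (1332132, 'cr'), (1332132, 'gb'), (1332132, 'cr'),
        (3213123, 'gb'), (9999999, 'cr');
""")

select_part = """
WITH counta AS (SELECT cpd FROM opps)
SELECT m.acct_numb,
       SUM(CASE WHEN m.cusid IN ('cr') THEN 1 ELSE 0 END) AS ca,
       SUM(CASE WHEN m.cusid IN ('gb') THEN 1 ELSE 0 END) AS gb
FROM metadata m
"""
tail = " GROUP BY m.acct_numb ORDER BY m.acct_numb"

# IN only tests membership: duplicates in the list do not multiply rows.
in_rows = conn.execute(
    select_part + "WHERE m.acct_numb IN (SELECT cpd FROM counta)" + tail
).fetchall()

# A join matches row by row: the duplicated account is counted twice.
join_rows = conn.execute(
    select_part + "JOIN counta c ON c.cpd = m.acct_numb" + tail
).fetchall()

print(in_rows)
print(join_rows)
```

With this data the IN version reports one 'gb' row for account 3213123, while the join version reports two, which is also why DISTINCT is not needed inside the IN subquery.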

Join 2 tables even with null values and sum each row

See the pictures below for what I want to do with my SQL statement. I have used a left join and ISNULL to get all the results, except that I don't get the total, which should sum the numbers for each customer. All table b values are integers.
Select a.CustomerId, a.FName, a.LName, b.mtg1, b.mtg2, b.mtg3, b.mtg4 From Customer a Left Join Hours b On a.CustomerID = b.CustomerID group by a.Lname
The GROUP BY will make it a bit difficult to capture all the necessary data. Also, adding up different columns requires COALESCE(column, 0) so that zero is used when a column is null; otherwise your total will come back as NULL.
One possible solution is:
SELECT a.CustId, a.FName, a.LName, SUM(b.mtg1), SUM(b.mtg2), SUM(b.mtg3), SUM(b.mtg4), (COALESCE(SUM(b.mtg1), 0) + COALESCE(SUM(b.mtg2), 0) + COALESCE(SUM(b.mtg3), 0) + COALESCE(SUM(b.mtg4), 0)) AS total
FROM table_a a
LEFT JOIN table_b b ON(b.CustID = a.CustId)
GROUP BY a.CustID, a.FName, a.LName
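The reason COALESCE matters can be seen in a tiny sketch with Python's built-in sqlite3 module. The tables customer and hours and their rows are hypothetical; only two mtg columns are used to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER, lname TEXT);
    CREATE TABLE hours (cust_id INTEGER, mtg1 INTEGER, mtg2 INTEGER);
    INSERT INTO customer VALUES (1, 'Adams'), (2, 'Baker');
    -- customer 1 never has mtg2 filled; customer 2 has no hours rows at all
    INSERT INTO hours VALUES (1, 3, NULL), (1, 2, NULL);
""")

rows = conn.execute("""
    SELECT c.cust_id,
           -- NULL + anything is NULL, so one empty column wipes out the total
           SUM(h.mtg1) + SUM(h.mtg2) AS naive_total,
           COALESCE(SUM(h.mtg1), 0) + COALESCE(SUM(h.mtg2), 0) AS total
    FROM customer c
    LEFT JOIN hours h ON h.cust_id = c.cust_id
    GROUP BY c.cust_id
    ORDER BY c.cust_id
""").fetchall()
print(rows)
```

SUM over only NULLs (or over no rows) yields NULL, so the unguarded total is NULL for both customers, while the COALESCE version gives 5 and 0.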
with cte as
(
select lname+', '+fname as Name,
mtg1,mtg2,mtg3,mtg4,
isnull(((case when mtg1 is null then 0 else mtg1 end)+
(case when mtg2 is null then 0 else mtg2 end)+
(case when mtg3 is null then 0 else mtg3 end)+
(case when mtg4 is null then 0 else mtg4 end)),0) as total
from a left join b
on a.custid = b.custid
)
select name,
isnull(mtg1,0) as mtg1,
isnull(mtg2,0) as mtg2,
isnull(mtg3,0) as mtg3,
isnull(mtg4,0) as mtg4,
total
from cte
order by 1

Group by with case when

I have Projects and Issues in a one-to-many relationship.
I want the pending issues and completed issues for each project.
This is what I have done:
SELECT
a.id ,
a.Name,
SUM(CASE WHEN b.StatusId = 3 THEN 1 ELSE NULL END) AS CompletedIssues,
SUM(CASE WHEN b.StatusId != 3 THEN 1 ELSE NULL END) AS PendingIssues
FROM
Projects a
JOIN Issues b
ON a.ID = b.ProjectId
GROUP BY
a.name,
b.StatusId,
a.ID
But it's not giving the proper output; see the snapshot below.
There are two separate rows for completed and pending issues, and sometimes more than 2 rows depending on the issue status ID (see BT5).
Is case when wrong for this scenario?
What is the proper way to achieve this?
Fix your group by:
select p.id, p.Name,
sum(case when i.StatusId = 3 then 1 else null end) as CompletedIssues,
sum(case when i.StatusId <> 3 then 1 else null end) as PendingIssues
from Projects p join
Issues i
on p.ID = i.ProjectId
group by p.name, p.id;
Note: You may not want else NULL. Normally, you want counts to be zero rather than NULL:
select p.id, p.Name,
sum(case when i.StatusId = 3 then 1 else 0 end) as CompletedIssues,
sum(case when i.StatusId <> 3 then 1 else 0 end) as PendingIssues
from Projects p join
Issues i
on p.ID = i.ProjectId
group by p.name, p.id;
Also, I changed the table aliases to something more meaningful. Don't use meaningless letters such as a and b. Use table abbreviations.
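To see why removing StatusId from the group by fixes the output, here is a minimal sketch with Python's built-in sqlite3 module. The project names and issue rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Projects (ID INTEGER, Name TEXT);
    CREATE TABLE Issues (ProjectId INTEGER, StatusId INTEGER);
    INSERT INTO Projects VALUES (1, 'BT1'), (2, 'BT5');
    -- BT5 has issues in three different statuses
    INSERT INTO Issues VALUES (1, 3), (1, 1), (2, 3), (2, 1), (2, 2), (2, 3);
""")

select_part = """
SELECT p.ID, p.Name,
       SUM(CASE WHEN i.StatusId = 3 THEN 1 ELSE 0 END) AS CompletedIssues,
       SUM(CASE WHEN i.StatusId <> 3 THEN 1 ELSE 0 END) AS PendingIssues
FROM Projects p
JOIN Issues i ON p.ID = i.ProjectId
"""

# Grouping by StatusId yields one row per (project, status) pair.
split_rows = conn.execute(
    select_part + "GROUP BY p.Name, i.StatusId, p.ID"
).fetchall()

# Grouping by the project columns only yields one row per project.
fixed_rows = conn.execute(
    select_part + "GROUP BY p.Name, p.ID ORDER BY p.ID"
).fetchall()

print(len(split_rows), fixed_rows)
```

The case expressions were fine all along; the extra grouping column was what split each project into several rows.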

optimize Table Spool in SQL Server Execution plan

I have the following SQL query and am trying to optimize it using the execution plan. The execution plan says the estimated subtree cost is 36.89, and there are several table spools (Eager Spool). Can anyone help me optimize this query? Thanks in advance.
SELECT
COUNT(DISTINCT bp.P_ID) AS total,
COUNT(DISTINCT CASE WHEN bc.Description != 'S' THEN bp.P_ID END) AS m_count,
COUNT(DISTINCT CASE WHEN bc.Description = 'S' THEN bp.P_ID END) AS s_count,
COUNT(DISTINCT CASE WHEN bc.Description IS NULL THEN bp.P_ID END) AS n_count
FROM
progress_tbl AS progress
INNER JOIN Person_tbl AS bp ON bp.P_ID = progress.person_id
LEFT OUTER JOIN Status_tbl AS bm ON bm.MS_ID = bp.MembershipStatusID
LEFT OUTER JOIN Membership_tbl AS m ON m.M_ID = bp.CurrentMembershipID
LEFT OUTER JOIN Category_tbl AS bc ON bc.MC_ID = m.MembershipCategoryID
WHERE
logged_when BETWEEN '2017-01-01' AND '2017-01-31'
Here's a technique you can use.
WITH T AS
(
SELECT DISTINCT CASE
WHEN bc.Description != 'S' THEN 'M'
WHEN bc.Description = 'S' THEN 'S'
WHEN bc.Description IS NULL THEN 'N'
END AS type,
bp.P_ID
FROM progress_tbl AS progress
INNER JOIN Person_tbl AS bp
ON bp.P_ID = progress.person_id
LEFT OUTER JOIN Status_tbl AS bm
ON bm.MS_ID = bp.MembershipStatusID
LEFT OUTER JOIN Membership_tbl AS m
ON m.M_ID = bp.CurrentMembershipID
LEFT OUTER JOIN Category_tbl AS bc
ON bc.MC_ID = m.MembershipCategoryID
WHERE logged_when BETWEEN '2017-01-01' AND '2017-01-31'
)
SELECT COUNT(DISTINCT P_ID) AS total,
COUNT(CASE WHEN type= 'M' THEN P_ID END) AS m_count,
COUNT(CASE WHEN type= 'S' THEN P_ID END) AS s_count,
COUNT(CASE WHEN type= 'N' THEN P_ID END) AS n_count
FROM T
I will demonstrate it on a simpler example.
Suppose your existing query is
SELECT
COUNT(DISTINCT number) AS total,
COUNT(DISTINCT CASE WHEN name != 'S' THEN number END) AS m_count,
COUNT(DISTINCT CASE WHEN name = 'S' THEN number END) AS s_count,
COUNT(DISTINCT CASE WHEN name IS NULL THEN number END) AS n_count
FROM master..spt_values;
You can rewrite it as follows
WITH T AS
(
SELECT DISTINCT CASE
WHEN name != 'S'
THEN 'M'
WHEN name = 'S'
THEN 'S'
ELSE 'N'
END AS type,
number
FROM master..spt_values
)
SELECT COUNT(DISTINCT number) AS total,
COUNT(CASE WHEN type= 'M' THEN number END) AS m_count,
COUNT(CASE WHEN type= 'S' THEN number END) AS s_count,
COUNT(CASE WHEN type= 'N' THEN number END) AS n_count
FROM T
Note the rewrite is costed as considerably cheaper and the plan is much simpler.
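As a sanity check that the rewrite returns the same numbers, here is a small sketch using Python's built-in sqlite3 module. The table vals and its rows are hypothetical stand-ins for master..spt_values, and the sketch only compares results; it says nothing about SQL Server plan costs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vals (name TEXT, number INTEGER);
    INSERT INTO vals VALUES
        ('A', 1), ('A', 1), ('B', 1), ('S', 1),
        ('S', 2), ('S', 2), (NULL, 3);
""")

original = conn.execute("""
    SELECT COUNT(DISTINCT number),
           COUNT(DISTINCT CASE WHEN name != 'S' THEN number END),
           COUNT(DISTINCT CASE WHEN name = 'S' THEN number END),
           COUNT(DISTINCT CASE WHEN name IS NULL THEN number END)
    FROM vals
""").fetchone()

rewritten = conn.execute("""
    WITH T AS (
        -- one DISTINCT pass produces unique (type, number) pairs
        SELECT DISTINCT CASE WHEN name != 'S' THEN 'M'
                             WHEN name = 'S' THEN 'S'
                             ELSE 'N' END AS type,
                        number
        FROM vals
    )
    SELECT COUNT(DISTINCT number),
           COUNT(CASE WHEN type = 'M' THEN number END),
           COUNT(CASE WHEN type = 'S' THEN number END),
           COUNT(CASE WHEN type = 'N' THEN number END)
    FROM T
""").fetchone()

print(original, rewritten)
```

Because T already holds each (type, number) pair once, the per-type counts no longer need DISTINCT; only the overall total does, since the same number can appear under several types.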
As already pointed out, there seem to be some typo/copy-paste issues with your query. This makes it rather difficult for us to figure out what's going on.
The table spools are probably caused by the CASE WHEN bc.Description ... constructions. MSSQL first creates an (in-memory) table with all the resulting values, which then gets sorted and streamed through the COUNT(DISTINCT ...) operator. I don't think there is much you can do about that, as the work needs to be done somewhere.
Anyway, some remarks and wild guesses:
I'm guessing that logged_when is in the progress_tbl table?
If so, do you really need to LEFT OUTER JOIN all the other tables? From what I can tell, they aren't all being used.
You're trying to count the number of P_IDs that match the criteria, and you want to split that number between those whose bc.Description is 'S', something else, or NULL.
For this you could calculate the total as the sum of m_count, s_count and n_count. This would save you one COUNT() operation; I'm not sure it helps a lot in the bigger picture, but every bit helps, I guess.
Something like this:
;WITH counts AS (
SELECT
COUNT(DISTINCT CASE WHEN bc.Description != 'S' THEN bp.P_ID END) AS m_count,
COUNT(DISTINCT CASE WHEN bc.Description = 'S' THEN bp.P_ID END) AS s_count,
COUNT(DISTINCT CASE WHEN bc.Description IS NULL THEN bp.P_ID END) AS n_count
FROM
progress_tbl AS progress
INNER JOIN Person_tbl AS bp ON bp.P_ID = progress.person_id
LEFT OUTER JOIN Status_tbl AS bm ON bm.MS_ID = bp.MembershipStatusID -- really needed?
LEFT OUTER JOIN Membership_tbl AS m ON m.M_ID = bp.CurrentMembershipID -- really needed?
LEFT OUTER JOIN Category_tbl AS bc ON bc.MC_ID = m.MembershipCategoryID -- really needed?
WHERE
logged_when BETWEEN '2017-01-01' AND '2017-01-31' -- what table does logged_when column come from????
)
SELECT total = m_count + s_count + n_count,
*
FROM counts
UPDATE
BEWARE: Using the answer/example code of Martin Smith, I came to realize that total isn't necessarily the sum of the other fields. A given P_ID could show up with different descriptions, which might then fall into different categories. Depending on your data, my answer might thus be plain wrong.
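The caveat about total can be reproduced with a few rows. This is a minimal sketch using Python's built-in sqlite3 module; the flattened table person_desc and its data are hypothetical and stand in for the joined result of the real query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_desc (p_id INTEGER, description TEXT);
    -- person 1 shows up under two different descriptions
    INSERT INTO person_desc VALUES (1, 'S'), (1, 'A'), (2, NULL);
""")

total, m_count, s_count, n_count = conn.execute("""
    SELECT COUNT(DISTINCT p_id),
           COUNT(DISTINCT CASE WHEN description != 'S' THEN p_id END),
           COUNT(DISTINCT CASE WHEN description = 'S' THEN p_id END),
           COUNT(DISTINCT CASE WHEN description IS NULL THEN p_id END)
    FROM person_desc
""").fetchone()

# Person 1 is counted in both m_count and s_count, but only once in total.
print(total, m_count + s_count + n_count)
```

Here there are 2 distinct people, yet the three category counts add up to 3, so computing total as their sum only works when each P_ID lands in exactly one category.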

Joined SQL query MAX aggregate with condition

I am having trouble with a SQL query. Here is a representation of my schema on SQL Fiddle:
http://sqlfiddle.com/#!15/14c8e/1
The issue is that I want to return rows of data from the Invitations table and join them with a sum of both the 'sent' event_type and 'viewed' event_type from the associated events, as well as the latest created_at date.
I can get all the data and counts working, but I am having an issue with last_sent_on. Is there a way I can use a condition in a MAX aggregate function?
e.g.
MAX(
SELECT events.created_at
WHERE event_type='sent'
)
If not, how would I write the proper subselect?
I am currently using Postgresql.
Thank you.
You can use a case expression inside max, just as you've done with sum. The query below selects the maximum created_at for event_type='sent':
SELECT
i.id,
i.name,
i.email,
max(case when e.event_type='sent' then e.created_at end) AS last_sent_on,
sum(case when e.event_type='sent' then 1 else 0 end) AS sent_count,
sum(case when e.event_type='viewed' then 1 else 0 end) AS view_count
FROM
invitations i
LEFT OUTER JOIN
events e
ON e.eventable_id = i.id
WHERE e.eventable_type='Invitation'
GROUP BY i.id, i.name, i.email
SQLFiddle
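A stripped-down version of the conditional max can be tried with Python's built-in sqlite3 module. The rows below are invented, the eventable_type filter is left out for brevity, and dates are stored as ISO text so that MAX compares them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invitations (id INTEGER, name TEXT);
    CREATE TABLE events (eventable_id INTEGER, event_type TEXT, created_at TEXT);
    INSERT INTO invitations VALUES (1, 'first'), (2, 'second');
    INSERT INTO events VALUES
        (1, 'sent',   '2020-01-01'),
        (1, 'sent',   '2020-03-01'),
        (1, 'viewed', '2020-04-01'),
        (2, 'viewed', '2020-02-01');
""")

rows = conn.execute("""
    SELECT i.id,
           -- CASE without ELSE yields NULL for other event types, and MAX ignores NULLs
           MAX(CASE WHEN e.event_type = 'sent' THEN e.created_at END) AS last_sent_on,
           SUM(CASE WHEN e.event_type = 'sent' THEN 1 ELSE 0 END) AS sent_count,
           SUM(CASE WHEN e.event_type = 'viewed' THEN 1 ELSE 0 END) AS view_count
    FROM invitations i
    LEFT OUTER JOIN events e ON e.eventable_id = i.id
    GROUP BY i.id
    ORDER BY i.id
""").fetchall()
print(rows)
```

The later 'viewed' event on 2020-04-01 does not leak into last_sent_on, and an invitation that was never sent gets NULL there.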
Try using a subquery to build the max value for sent.
SELECT
i.id,
i.name,
i.email,
sent.last_sent,
sum(case when e.event_type='sent' then 1 else 0 end) AS sent_count,
sum(case when e.event_type='viewed' then 1 else 0 end) AS view_count
FROM
invitations i
LEFT OUTER JOIN
events e
ON e.eventable_id = i.id
LEFT JOIN ( SELECT eventable_id uid, MAX(created_at) AS last_sent
FROM events
WHERE event_type = 'sent'
GROUP BY eventable_id ) AS sent
ON sent.uid = i.id
WHERE e.eventable_type='Invitation'
GROUP BY i.id, i.name, i.email, sent.last_sent
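One thing worth checking in both queries above: a WHERE condition on a column of the left-joined events table rejects the NULL rows that the outer join produces, so invitations without any events drop out of the result. Whether that matters depends on the data. The sketch below, using Python's built-in sqlite3 module with made-up rows, compares the WHERE placement with moving the condition into the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invitations (id INTEGER);
    CREATE TABLE events (eventable_id INTEGER, eventable_type TEXT, event_type TEXT);
    INSERT INTO invitations VALUES (1), (2);
    -- invitation 2 has no events at all
    INSERT INTO events VALUES (1, 'Invitation', 'sent');
""")

where_rows = conn.execute("""
    SELECT i.id, SUM(CASE WHEN e.event_type = 'sent' THEN 1 ELSE 0 END)
    FROM invitations i
    LEFT OUTER JOIN events e ON e.eventable_id = i.id
    WHERE e.eventable_type = 'Invitation'  -- NULL for unmatched rows, so they are filtered out
    GROUP BY i.id ORDER BY i.id
""").fetchall()

on_rows = conn.execute("""
    SELECT i.id, SUM(CASE WHEN e.event_type = 'sent' THEN 1 ELSE 0 END)
    FROM invitations i
    LEFT OUTER JOIN events e
        ON e.eventable_id = i.id AND e.eventable_type = 'Invitation'
    GROUP BY i.id ORDER BY i.id
""").fetchall()

print(where_rows, on_rows)
```

With the condition in the ON clause, invitation 2 is kept with a count of zero; with the condition in WHERE, it disappears.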