How to replace SELECTS in an Aggregate Function SUM() - sql

How do I work around a SELECT statement inside a SUM function?
I am currently migrating my Sybase database to MS SQL Server.
One of my views has several SUMs in its main SELECT statement, and the CASE expression inside one of them uses a subselect:
SELECT
SUM(CASE WHEN e.s = 'E'
AND EXISTS
( SELECT
1
FROM
system.E
JOIN system.EF
ON EF.EID = E.ID
WHERE
E.CID = C.ID
AND EF.T='smth')
AND A.AC= 'smthelse'
AND ET.EC not in( 'lol','lul','lel')
THEN
B.A
ELSE
0.0
END) AS smth
FROM ...
I expect it to SUM B.A when the subselect returns at least one row,
but instead I get this error message:
Cannot perform an aggregate function on an expression containing an aggregate or a subquery.
I think it doesn't allow me to use a subselect inside the SUM function, but I'm not sure how to fix it.

You can replace the subquery with a lateral JOIN by using the OUTER APPLY operator:
SELECT . . .
SUM(CASE WHEN e.s = 'E' AND
eef.ID IS NOT NULL AND
A.AC = 'smthelse' AND
ET.EC NOT IN ( 'lol', 'lul', 'lel')
THEN B.A ELSE 0.0
END) AS smth
FROM ... OUTER APPLY
(SELECT TOP (1) E.*
FROM system.E JOIN
system.EF
ON EF.EID = E.ID
WHERE E.CID = C.ID AND EF.T = 'smth'
) EEF
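Another option, not taken from the answer above, is to evaluate the EXISTS test once per row in a derived table, so that the outer SUM only sees a plain column. The sketch below uses Python's built-in sqlite3 purely as a stand-in engine; the tables c, e, ef and their columns are hypothetical simplifications of the ones in the question. SQLite does not raise SQL Server's error in the first place, so this only illustrates the shape of the rewrite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE c  (id INTEGER PRIMARY KEY, amount REAL);   -- hypothetical outer rows
CREATE TABLE e  (id INTEGER PRIMARY KEY, cid INTEGER);
CREATE TABLE ef (eid INTEGER, t TEXT);
INSERT INTO c  VALUES (1, 10.0), (2, 20.0), (3, 5.0);
INSERT INTO e  VALUES (100, 1), (101, 1), (102, 2), (103, 3);
INSERT INTO ef VALUES (100, 'smth'), (101, 'smth'), (102, 'other'), (103, 'smth');
""")

# The inner query turns EXISTS into a 0/1 column; the outer SUM then
# aggregates an ordinary column instead of a subquery.
total = conn.execute("""
SELECT SUM(CASE WHEN flagged.has_match = 1 THEN flagged.amount ELSE 0.0 END)
FROM (
    SELECT c.amount,
           CASE WHEN EXISTS (SELECT 1
                             FROM e JOIN ef ON ef.eid = e.id
                             WHERE e.cid = c.id AND ef.t = 'smth')
                THEN 1 ELSE 0 END AS has_match
    FROM c
) AS flagged
""").fetchone()[0]

print(total)
```

Row 1 has two matching e rows but is still counted once, because EXISTS never multiplies rows.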

Related

How do you properly query the result of a complex join statement in SQL?

New to advanced SQL!
I'm trying to write a query that returns the COUNT(*) and SUM of the resulting columns from this query:
DECLARE @Id INT = 1000;
SELECT
*,
CASE
WHEN Id1 >= 6 THEN 1
ELSE 0
END AS Tier1,
CASE
WHEN Id1 >= 4 THEN 1
ELSE 0
END AS Tier2,
CASE
WHEN Id1 >= 2 THEN 1
ELSE 0
END AS Tier3
FROM (
SELECT
Org.OrgID,
App.AppID,
App.FirstName,
App.LastName,
MAX(AppSubmitU_Level.Id1) AS Id1
FROM Org
INNER JOIN AppEmployment
ON AppEmployment.OrgID = Org.OrgID
INNER JOIN App
ON App.AppID = AppEmployment.AppID
INNER JOIN AppSubmit
ON App.AppID = AppSubmit.AppID
INNER JOIN AppSubmitU_Level
ON AppSubmit.LevelID = AppSubmitU_Level.Id1
INNER JOIN AppEmpU_VerifyStatus
ON AppEmpU_VerifyStatus.VerifyStatusID = AppEmployment.VerifyStatusID
WHERE AppSubmitU_Level.SubmitTypeID = 1 -- Career
AND AppEmpU_VerifyStatus.StatusIsVerified = 1
AND AppSubmit.[ExpireDate] IS NOT NULL
AND AppSubmit.[ExpireDate] > GETDATE()
AND Org.OrgID = @Id
GROUP BY
Org.OrgID,
App.AppID,
App.FirstName,
App.LastName
) employees
I've tried to do so by moving the variable declaration outside the original query and adding a SELECT with COUNT(*) and three SUMs at the top, like so:
DECLARE @OrgID INT = 1000;
SELECT COUNT(*), SUM(employees.Tier1), SUM(employees.Tier2), SUM(employees.Tier3)
FROM
(SELECT *,
...
) AS employees
);
When I run the query, however, I'm getting the errors:
The multi-part identifier employees.Tier1 could not be bound
The same errors appear for the other identifiers in my SUM statements.
I'm assuming this has to do with the fact that the Tier1, Tier2, and Tier3 columns are being returned by the inner join query in my FROM(), and aren't values set by the existing tables that I'm querying. But I can't figure out how to rewrite it to initialize properly.
Thanks in advance for the help!
This is a scope problem: employees is defined in the subquery only, it is not available in the outer scope. You basically want to alias the outer query:
DECLARE @OrgID INT = 1000;
SELECT COUNT(*), SUM(employees.Tier1) TotalTier1, SUM(employees.Tier2) TotalTier2, SUM(employees.Tier3) TotalTier3
FROM (
SELECT *,
...
) AS employees
) AS employees;
--^ here
Note that I added column aliases to the outer query, which is a good practice in SQL.
It might be easier to understand what is going on if you use another alias for the outer query:
SELECT COUNT(*), SUM(e.Tier1), SUM(e.Tier2), SUM(e.Tier3)
FROM (
SELECT *,
...
) AS employees
) AS e;
Note that you don't actually need to qualify the column names in the outer query, since the column names are unambiguous anyway.
And finally: you don't actually need the intermediate subquery. You could compute the tiers directly in the aggregating query:
SELECT
SUM(CASE WHEN Id1 >= 6 THEN 1 ELSE 0 END) AS TotalTier1,
SUM(CASE WHEN Id1 >= 4 THEN 1 ELSE 0 END) AS TotalTier2,
SUM(CASE WHEN Id1 >= 2 THEN 1 ELSE 0 END) AS TotalTier3
FROM (
SELECT
Org.OrgID,
App.AppID,
App.FirstName,
App.LastName,
MAX(AppSubmitU_Level.Id1) AS Id1
FROM Org
INNER JOIN AppEmployment
ON AppEmployment.OrgID = Org.OrgID
INNER JOIN App
ON App.AppID = AppEmployment.AppID
INNER JOIN AppSubmit
ON App.AppID = AppSubmit.AppID
INNER JOIN AppSubmitU_Level
ON AppSubmit.LevelID = AppSubmitU_Level.Id1
INNER JOIN AppEmpU_VerifyStatus
ON AppEmpU_VerifyStatus.VerifyStatusID = AppEmployment.VerifyStatusID
WHERE AppSubmitU_Level.SubmitTypeID = 1 -- Career
AND AppEmpU_VerifyStatus.StatusIsVerified = 1
AND AppSubmit.[ExpireDate] IS NOT NULL
AND AppSubmit.[ExpireDate] > GETDATE()
AND Org.OrgID = @Id
GROUP BY
Org.OrgID,
App.AppID,
App.FirstName,
App.LastName
) employees
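To see the aliasing and the conditional aggregation work end to end, here is a small sketch using Python's built-in sqlite3 as a stand-in for SQL Server. The submissions table and its columns are hypothetical and collapse the question's joins into a single table; COUNT(*) is included because the question asked for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE submissions (app_id INTEGER, level INTEGER);  -- hypothetical stand-in
INSERT INTO submissions VALUES (1, 2), (1, 6), (2, 4), (3, 1), (3, 3);
""")

# The derived table produces one row per applicant; its alias (e) is what
# the outer SELECT can reference.
row = conn.execute("""
SELECT COUNT(*)                                    AS employees,
       SUM(CASE WHEN e.id1 >= 6 THEN 1 ELSE 0 END) AS total_tier1,
       SUM(CASE WHEN e.id1 >= 4 THEN 1 ELSE 0 END) AS total_tier2,
       SUM(CASE WHEN e.id1 >= 2 THEN 1 ELSE 0 END) AS total_tier3
FROM (
    SELECT app_id, MAX(level) AS id1
    FROM submissions
    GROUP BY app_id
) AS e
""").fetchone()

print(row)
```

The three applicants have maximum levels 6, 4 and 3, so the tiers count 1, 2 and 3 applicants respectively.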

SQL query having CASE WHEN EXISTS statement

I am trying to create a SQL query with a CASE WHEN EXISTS clause in SQL Server. I assume I am doing something wrong, because when I run SELECT * FROM [Christmas_Sale] it takes forever for SQL Server to return anything.
CREATE VIEW [Christmas_Sale]
AS
SELECT
C.*,
CASE
WHEN EXISTS (SELECT S.Sale_Date
FROM [Christmas_Sale] s
WHERE C.ID = S.ID)
THEN 0
ELSE 1
END AS ChristmasSale
FROM
[Customer_Detail] C ;
I'm trying to write a sub select which I need to return a 1 if Sale_Date= 1 and 0 for anything else.
The syntax of your query looks OK. But since you stated:
I'm trying to write a sub select which I need to return a 1 if Sale_Date= 1 and 0 for anything else.
... Then you could rephrase your query by adding one more condition in the WHERE clause of the subquery:
CREATE VIEW [Christmas_Sale] AS
SELECT
C.*,
CASE WHEN EXISTS (
SELECT 1
FROM [Christmas_Sale] s
WHERE C.ID = S.ID and S.Sale_Date = 1
) THEN 1 ELSE 0 END AS ChristmasSale
FROM [Customer_Detail] C ;
If a record exists in [Christmas_Sale] with the corresponding ID and Sale_Date = 1, then ChristmasSale will have value 1, else it will display 0.
This query looks correct:
CREATE VIEW [Christmas_Sale] AS
SELECT C.*,
(CASE WHEN EXISTS (SELECT 1
FROM [Christmas_Sale] s
WHERE C.ID = S.ID
)
THEN 0 ELSE 1
END) AS ChristmasSale
FROM [Customer_Detail] C ;
If performance is an issue, you want an index on Christmas_Sale(ID).
Note that the SELECT S.Sale_Date in the subquery is meaningless, because EXISTS checks for rows not columns. Hence, I replaced it with the simpler 1.
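A minimal sketch of the EXISTS flag, run through Python's built-in sqlite3 rather than SQL Server. The sale table is a hypothetical stand-in for the table the view reads from, and the flag follows the asker's stated goal (1 when a row with Sale_Date = 1 exists, 0 otherwise).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_detail (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sale (id INTEGER, sale_date INTEGER);   -- hypothetical stand-in table
CREATE INDEX ix_sale_id ON sale (id);                -- supports the correlated lookup
INSERT INTO customer_detail VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO sale VALUES (1, 1), (1, 1), (2, 0);
""")

# EXISTS only asks whether a matching row is there, so SELECT 1 is enough
# and duplicate sale rows do not duplicate customers.
flags = conn.execute("""
SELECT c.id,
       CASE WHEN EXISTS (SELECT 1
                         FROM sale s
                         WHERE s.id = c.id AND s.sale_date = 1)
            THEN 1 ELSE 0 END AS christmas_sale
FROM customer_detail c
ORDER BY c.id
""").fetchall()

print(flags)
```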

Removing Self Join

I have the below query
SELECT h.*
FROM table1 h
LEFT JOIN table1 e
ON e.fundno = h.fundno
AND e.trantype = 'D'
AND e.modifiedon > h.modifiedon
WHERE e.fundno IS NULL
AND h.trantype != 'D'
Is there a way to avoid the self join? I know it can be rewritten using NOT EXISTS, but I am trying to avoid hitting the table twice.
If the trantype were the same we could use ROW_NUMBER to do this, but since the trantype is different I couldn't find a way to do it.
You seem to want non-D rows where there is no "D" row modified at a later time. You could use window functions:
select h.*
from (select h.*,
max(case when h.trantype = 'D' then h.modifiedon end) over (partition by h.fundno) as last_d_modifiedon
from table1 h
) h
where (h.last_d_modifiedon is null or h.last_d_modifiedon <= h.modifiedon) and
h.trantype <> 'D';
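One way to check that the window-function form returns the same rows as the self join is to run both on a small sample. The sketch below does that with Python's built-in sqlite3 (window functions need SQLite 3.25 or later). The id column, the integer modifiedon values and the sample rows are assumptions made for the test, and the comparison uses <= so that ties behave like the self join's strict >.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, fundno INTEGER,
                     trantype TEXT, modifiedon INTEGER);
INSERT INTO table1 VALUES
  (1, 10, 'A', 1), (2, 10, 'D', 2), (3, 10, 'A', 3),
  (4, 20, 'A', 1),
  (5, 30, 'A', 5), (6, 30, 'D', 9);
""")

# Original anti-join: keep non-D rows with no later D row in the same fund.
self_join = conn.execute("""
SELECT h.id
FROM table1 h
LEFT JOIN table1 e
       ON e.fundno = h.fundno AND e.trantype = 'D' AND e.modifiedon > h.modifiedon
WHERE e.fundno IS NULL AND h.trantype != 'D'
ORDER BY h.id
""").fetchall()

# Single pass: carry the latest D timestamp of each fund on every row.
windowed = conn.execute("""
SELECT id
FROM (
    SELECT t.*,
           MAX(CASE WHEN trantype = 'D' THEN modifiedon END)
               OVER (PARTITION BY fundno) AS last_d_modifiedon
    FROM table1 t
) h
WHERE (last_d_modifiedon IS NULL OR last_d_modifiedon <= modifiedon)
  AND trantype <> 'D'
ORDER BY id
""").fetchall()

print(self_join, windowed)
```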

SQL Subquery Having COUNT(var) turns 0 to NULLs

I have written a SQL query with a subquery to include counts. When the count is 0, and I try to filter out the 0, it turns the 0's to NULLs and keeps the rows, and vice versa. The result is that I can't filter out the 0's, which was the purpose of including the counts.
SELECT distinct
a
,b
,
(SELECT
count(id)
FROM seq_stud
WHERE scs.SequenceID = seq_stud.SequenceID
and seq_stud.EndDate is null
HAVING count(id) <> 0
) As t1
FROM sp
INNER JOIN p on sp.ProgramID = p.ProgramID
...etc.
Does anyone know why this is happening and how I can filter out the 0 counts?
You don't filter in the SELECT clause. If you don't want rows that have no match in seq_stud, then use WHERE:
WHERE EXISTS (SELECT 1
FROM seq_stud ss
WHERE scs.SequenceID = ss.SequenceID and ss.EndDate is null
)
I would remove the HAVING clause altogether and filter in a WHERE clause instead. Otherwise the subquery returns NULL, as you found. Because a column alias cannot be referenced in the WHERE clause of the same SELECT, wrap the query in a derived table and filter outside it:
SELECT a, b, t1
FROM (SELECT distinct a, b,
(SELECT count(id)
FROM seq_stud
WHERE scs.SequenceID = seq_stud.SequenceID
and seq_stud.EndDate is null
) As t1
FROM sp
INNER JOIN p on sp.ProgramID = p.ProgramID
) x
WHERE t1 > 0
I just figured this out. The Select subquery should be included as a WHERE statement
Using having count() in exists clause
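Both suggestions can be tried on toy data. This is a sketch using Python's built-in sqlite3 as a stand-in engine; the scs and seq_stud tables below are hypothetical, reduced versions of the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scs (sequence_id INTEGER PRIMARY KEY);   -- hypothetical outer table
CREATE TABLE seq_stud (id INTEGER PRIMARY KEY, sequence_id INTEGER, end_date TEXT);
INSERT INTO scs VALUES (1), (2), (3);
INSERT INTO seq_stud VALUES (1, 1, NULL), (2, 1, NULL), (3, 2, '2020-01-01');
""")

# Option 1: compute the count in a derived table, then filter on it outside.
counts = conn.execute("""
SELECT sequence_id, t1
FROM (
    SELECT scs.sequence_id,
           (SELECT COUNT(ss.id)
            FROM seq_stud ss
            WHERE ss.sequence_id = scs.sequence_id
              AND ss.end_date IS NULL) AS t1
    FROM scs
) x
WHERE t1 > 0
ORDER BY sequence_id
""").fetchall()

# Option 2: if only the filtering matters, EXISTS keeps the same rows.
exists_ids = conn.execute("""
SELECT scs.sequence_id
FROM scs
WHERE EXISTS (SELECT 1
              FROM seq_stud ss
              WHERE ss.sequence_id = scs.sequence_id
                AND ss.end_date IS NULL)
ORDER BY scs.sequence_id
""").fetchall()

print(counts, exists_ids)
```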

SQL Server: Logical equivalent of ALL query

I have the following query (simplified):
SELECT
Id
FROM
dbo.Entity
WHERE
1 = ALL (
SELECT
CASE
WHEN {Condition} THEN 1
ELSE 0
END
FROM
dbo.Related
INNER JOIN dbo.Entity AS TargetEntity ON
TargetEntity.Id = Related.TargetId
WHERE
Related.SourceId = Entity.Id
)
where {Condition} is a complex dynamic condition on TargetEntity.
In simple terms, this query should return entities for which all related entities match the required condition.
Unfortunately, that does not work quite well, since by SQL standard 1 = ALL evaluates to TRUE when ALL is applied to an empty set. I know I can add AND EXISTS, but that will require me to repeat the whole subquery, which, I am certain, will cause problems for performance.
How should I rewrite the query to achieve the result I need (SQL Server 2008)?
Thanks in advance.
Note: practically speaking, the whole query is highly dynamic, so the perfect solution would be to rewrite only 1 = ALL ( ... ), since changing top-level select can cause problems when additional conditions are added to top-level where.
Couldn't you use MIN to achieve this?
E.g.:
SELECT
Id
FROM
dbo.Entity
WHERE
1 = (
SELECT
MIN(CASE
WHEN {Condition} THEN 1
ELSE 0
END)
FROM
dbo.Related
INNER JOIN dbo.Entity AS TargetEntity ON
TargetEntity.Id = Related.TargetId
WHERE
Related.SourceId = Entity.Id
)
The MIN should return NULL if there are no related rows, 1 if they're all 1, and 0 if any of them is 0; comparing to 1 is only true for 1.
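The behaviour on an empty set can be checked with a small sketch. It uses Python's built-in sqlite3 as a stand-in for SQL Server, and the ok column is a hypothetical placeholder for {Condition}.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, ok INTEGER);   -- ok stands in for {Condition}
CREATE TABLE related (source_id INTEGER, target_id INTEGER);
INSERT INTO entity VALUES (1, 1), (2, 1), (3, 0), (4, 1);
INSERT INTO related VALUES (1, 2), (1, 4), (2, 3), (2, 4);
-- entities 3 and 4 have no related rows at all
""")

# MIN over zero rows is NULL, and 1 = NULL is not true,
# so entities without related rows are filtered out.
matching = conn.execute("""
SELECT e.id
FROM entity e
WHERE 1 = (SELECT MIN(CASE WHEN te.ok = 1 THEN 1 ELSE 0 END)
           FROM related r
           JOIN entity te ON te.id = r.target_id
           WHERE r.source_id = e.id)
ORDER BY e.id
""").fetchall()

print(matching)
```

Entity 2 drops out because one of its targets fails the condition; entities 3 and 4 drop out because the subquery yields NULL.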
It can be translated to pick Entities where no related entities with unmatched condition exist.
This can be accomplished by:
SELECT
Id
FROM
dbo.Entity
WHERE
NOT EXISTS (
-- as soon as one related element does not match the condition, skip this entity
SELECT TOP 1 1
FROM
dbo.Related
INNER JOIN dbo.Entity AS TargetEntity ON
TargetEntity.Id = Related.TargetId
WHERE
Related.SourceId = Entity.Id AND
CASE
WHEN {Condition} THEN 1
ELSE 0
END = 0
)
EDIT: depending on the condition, you can write something like:
WHERE Related.SourceId = Entity.Id AND NOT ({Condition}), if it doesn't add too much complexity to the query.
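One thing worth checking with this form is what happens to entities that have no related rows. In the sketch below (Python's built-in sqlite3 as a stand-in engine, with a hypothetical ok column in place of {Condition}), NOT EXISTS is also true for an empty set, so those entities are still returned, just as with 1 = ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, ok INTEGER);   -- ok stands in for {Condition}
CREATE TABLE related (source_id INTEGER, target_id INTEGER);
INSERT INTO entity VALUES (1, 1), (2, 1), (3, 0), (4, 1);
INSERT INTO related VALUES (1, 2), (1, 4), (2, 3), (2, 4);
-- entities 3 and 4 have no related rows at all
""")

# Keep an entity unless some related target fails the condition.
no_failures = conn.execute("""
SELECT e.id
FROM entity e
WHERE NOT EXISTS (SELECT 1
                  FROM related r
                  JOIN entity te ON te.id = r.target_id
                  WHERE r.source_id = e.id
                    AND NOT (te.ok = 1))
ORDER BY e.id
""").fetchall()

print(no_failures)
```

If the empty case has to be excluded, this form would still need a second test that at least one related row exists.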
Instead of using all, change your query to compare the result of the subquery directly:
select Id
from dbo.Entity
where 1 = (
select
case
when ... then 1
else 0
end
from ...
where ...
)
Probably this will work: WHERE NOT 0 = ANY(...)
If I read the query correctly, it can be simplified to something like:
SELECT e.Id
FROM dbo.Entity e
INNER JOIN dbo.Related r ON r.SourceId = e.Id
INNER JOIN dbo.Entity te ON te.Id = r.TargetId
WHERE <extra where stuff>
GROUP BY e.Id
HAVING SUM(CASE WHEN {Condition} THEN 1 ELSE 0 END) = COUNT(*)
This says the Condition must be true for all rows. It filters the "empty" set case away with the INNER JOINs.
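A quick sketch of the GROUP BY / HAVING form, again on Python's built-in sqlite3 with a hypothetical ok column standing in for {Condition}. The inner joins drop entities without related rows before the HAVING test is applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, ok INTEGER);   -- ok stands in for {Condition}
CREATE TABLE related (source_id INTEGER, target_id INTEGER);
INSERT INTO entity VALUES (1, 1), (2, 1), (3, 0), (4, 1);
INSERT INTO related VALUES (1, 2), (1, 4), (2, 3), (2, 4);
-- entities 3 and 4 have no related rows at all
""")

# An entity qualifies when the number of passing targets equals
# the total number of targets in its group.
grouped = conn.execute("""
SELECT e.id
FROM entity e
JOIN related r ON r.source_id = e.id
JOIN entity te ON te.id = r.target_id
GROUP BY e.id
HAVING SUM(CASE WHEN te.ok = 1 THEN 1 ELSE 0 END) = COUNT(*)
ORDER BY e.id
""").fetchall()

print(grouped)
```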