How to output a new column in a SELECT query based on a condition?

Currently, I have this SQL query:
SELECT AVG(ttbe.MarkGiven) FROM tblTestsTakenByEmployee ttbe
INNER JOIN tblCoursesTakenByEmployee ctbe ON ttbe.EmployeeId = ctbe.EmployeeId
LEFT JOIN tblCourse c ON ctbe.CourseId = c.CourseId
WHERE ctbe.HasCompletedCourse = 'Y'
GROUP BY ctbe.CourseId, c.CourseName, EXTRACT(YEAR FROM ctbe.DateOfCourseCompletion), ctbe.EmployeeId
At the moment, this returns the average mark of each employee on a course completed in a given year, calculated across all of the tests the course contains (a course can have multiple tests).
I want to output an additional column to the SELECT query which specifies whether that employee has passed the course, based on a threshold. For example, if AVG(ttbe.MarkGiven) >= 40 then it would return 'Y' in the new column, otherwise it would return 'N'. What's the simplest and most efficient way of achieving this?

You could use a CASE expression like this:
SELECT AVG(ttbe.MarkGiven),
CASE WHEN AVG(ttbe.MarkGiven) >= 40 THEN 'Y' ELSE 'N' END as exam_passed
FROM tblTestsTakenByEmployee ttbe
INNER JOIN tblCoursesTakenByEmployee ctbe ON ttbe.EmployeeId = ctbe.EmployeeId
LEFT JOIN tblCourse c ON ctbe.CourseId = c.CourseId
WHERE ctbe.HasCompletedCourse = 'Y'
GROUP BY ctbe.CourseId, c.CourseName, EXTRACT(YEAR FROM ctbe.DateOfCourseCompletion),
ctbe.EmployeeId
https://www.oracletutorial.com/oracle-basics/oracle-case/#:~:text=Oracle%20CASE%20expression%20allows%20you,that%20accepts%20a%20valid%20expression.
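A minimal sketch of the CASE-over-aggregate pattern, run against an in-memory SQLite database from Python so it can be checked without the original schema. The table `test_marks`, its columns, and the helper `pass_flags` are hypothetical stand-ins for the tables in the question; only the threshold of 40 comes from the question.

```python
import sqlite3

def pass_flags(marks, threshold=40):
    """Return (employee_id, avg_mark, passed) rows using CASE over AVG().

    `marks` is a list of (employee_id, mark) tuples. Table and column
    names are hypothetical; they are not the asker's real schema.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE test_marks (employee_id INTEGER, mark REAL)")
    con.executemany("INSERT INTO test_marks VALUES (?, ?)", marks)
    rows = con.execute(
        """
        SELECT employee_id,
               AVG(mark) AS avg_mark,
               CASE WHEN AVG(mark) >= ? THEN 'Y' ELSE 'N' END AS passed
        FROM test_marks
        GROUP BY employee_id
        ORDER BY employee_id
        """,
        (threshold,),
    ).fetchall()
    con.close()
    return rows

# Employee 1 averages 45 (pass), employee 2 averages 35 (fail).
print(pass_flags([(1, 30), (1, 60), (2, 20), (2, 50)]))
```

The aggregate can appear inside the CASE because the CASE is evaluated once per group, after the GROUP BY.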

You could use your current query as a CTE, then select the result from the CTE and compare it to your threshold.
The advantage of the CTE is that you do not have to repeat the AVG(ttbe.MarkGiven) expression:
WITH _CTE as
(
SELECT AVG(ttbe.MarkGiven) as Col1
FROM tblTestsTakenByEmployee ttbe
INNER JOIN tblCoursesTakenByEmployee ctbe
ON ttbe.EmployeeId = ctbe.EmployeeId
LEFT JOIN tblCourse c
ON ctbe.CourseId = c.CourseId
WHERE ctbe.HasCompletedCourse = 'Y'
GROUP BY ctbe.CourseId, c.CourseName, EXTRACT(YEAR FROM ctbe.DateOfCourseCompletion), ctbe.EmployeeId
)
SELECT Col1
,CASE
WHEN Col1 >= 40 THEN 'Y'
ELSE 'N'
END AS Col1_Threshold
FROM _CTE
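The CTE form can be sketched the same way. This is an illustration under assumptions, not the asker's schema: `test_marks`, `averages`, and `pass_flags_cte` are hypothetical names, and SQLite stands in for the real database. The point it shows is that the outer query refers to the aggregate by its alias and must select FROM the CTE.

```python
import sqlite3

def pass_flags_cte(marks, threshold=40):
    """Compute the average in a CTE, then flag it in the outer query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE test_marks (employee_id INTEGER, mark REAL)")
    con.executemany("INSERT INTO test_marks VALUES (?, ?)", marks)
    rows = con.execute(
        """
        WITH averages AS (
            SELECT employee_id, AVG(mark) AS avg_mark
            FROM test_marks
            GROUP BY employee_id
        )
        SELECT employee_id,
               avg_mark,
               CASE WHEN avg_mark >= ? THEN 'Y' ELSE 'N' END AS passed
        FROM averages
        ORDER BY employee_id
        """,
        (threshold,),
    ).fetchall()
    con.close()
    return rows
```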

Related

Combine two queries to get the data in two columns

SELECT
tblEmployeeMaster.TeamName, SUM(tblData.Quantity) AS 'TotalQuantity'
FROM
tblData
INNER JOIN
tblEmployeeMaster ON tblData.EntryByHQCode = tblEmployeeMaster.E_HQCode
INNER JOIN
tblPhotos ON tblEmployeeMaster.TeamNo = tblPhotos.TeamNo
WHERE
IsPSR = 'Y'
GROUP BY
tblPhotos.TeamSort, tblPhotos.TeamNo, tblPhotos.Data,
tblEmployeeMaster.TeamName
ORDER BY
tblPhotos.TeamSort DESC, TotalQuantity DESC
This returns the total quantity for each team.
I also have this statement:
select TeamName, count(TeamName) AS 'Head Count'
from dbo.tblEmployeeMaster
where IsPSR = 'Y'
group by teamname
Which returns the head count for each team.
I would like to combine these two queries into one, so that both values appear side by side for each team.
I tried UNION / UNION ALL, but without success.
Any help would be much appreciated.

You can simply join the head-count query as a sub-query, as follows:
SELECT tblEmployeeMaster.TeamName, SUM(tblData.Quantity) AS 'TotalQuantity',
MAX(HEAD_COUNT) AS HEAD_COUNT, -- USE THIS VALUE FROM SUB-QUERY
CASE WHEN MAX(HEAD_COUNT) <> 0
THEN SUM(tblData.Quantity)/MAX(HEAD_COUNT)
END AS PER_MAN_CONTRIBUTION -- column asked in comment
FROM tblData INNER JOIN
tblEmployeeMaster ON tblData.EntryByHQCode = tblEmployeeMaster.E_HQCode INNER JOIN
tblPhotos ON tblEmployeeMaster.TeamNo = tblPhotos.TeamNo
-- FOLLOWING SUB-QUERY CAN BE USED
LEFT JOIN (select TeamName, count(TeamName) AS HEAD_COUNT
from dbo.tblEmployeeMaster
where IsPSR = 'Y' group by teamname) AS HC
ON HC.TeamName = tblEmployeeMaster.TeamName
where IsPSR = 'Y'
GROUP BY tblPhotos.TeamSort, tblPhotos.TeamNo, tblPhotos.Data,tblEmployeeMaster.TeamName
order by tblPhotos.TeamSort desc, TotalQuantity desc
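A reduced sketch of the same idea, assuming a simplified two-table schema: `employee`, `sales`, and `team_totals` are hypothetical names, and SQLite is used only so the example is self-contained. The head count is aggregated once per team in a derived table and then joined in, so it is not inflated by the number of data rows.

```python
import sqlite3

def team_totals(employees, sales):
    """employees: (hq_code, team_name) rows; sales: (hq_code, quantity) rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employee (hq_code INTEGER, team_name TEXT)")
    con.execute("CREATE TABLE sales (hq_code INTEGER, quantity INTEGER)")
    con.executemany("INSERT INTO employee VALUES (?, ?)", employees)
    con.executemany("INSERT INTO sales VALUES (?, ?)", sales)
    rows = con.execute(
        """
        SELECT e.team_name,
               SUM(s.quantity)    AS total_quantity,
               MAX(hc.head_count) AS head_count  -- same value on every row of a team
        FROM sales s
        JOIN employee e ON s.hq_code = e.hq_code
        LEFT JOIN (SELECT team_name, COUNT(*) AS head_count
                   FROM employee
                   GROUP BY team_name) hc
               ON hc.team_name = e.team_name
        GROUP BY e.team_name
        ORDER BY e.team_name
        """
    ).fetchall()
    con.close()
    return rows
```

A plain UNION cannot produce this shape because it stacks rows rather than adding columns; a join on the team name is what puts both aggregates on one row.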

optimize Table Spool in SQL Server Execution plan

I have the following SQL query and am trying to optimize it using the execution plan. The execution plan says the estimated subtree cost is 36.89, and there are several table spools (Eager Spool). Can anyone help me optimize this query? Thanks in advance.
SELECT
COUNT(DISTINCT bp.P_ID) AS total,
COUNT(DISTINCT CASE WHEN bc.Description != 'S' THEN bp.P_ID END) AS m_count,
COUNT(DISTINCT CASE WHEN bc.Description = 'S' THEN bp.P_ID END) AS s_count,
COUNT(DISTINCT CASE WHEN bc.Description IS NULL THEN bp.P_ID END) AS n_count
FROM
progress_tbl AS progress
INNER JOIN Person_tbl AS bp ON bp.P_ID = progress.person_id
LEFT OUTER JOIN Status_tbl AS bm ON bm.MS_ID = bp.MembershipStatusID
LEFT OUTER JOIN Membership_tbl AS m ON m.M_ID = bp.CurrentMembershipID
LEFT OUTER JOIN Category_tbl AS bc ON bc.MC_ID = m.MembershipCategoryID
WHERE
logged_when BETWEEN '2017-01-01' AND '2017-01-31'

Here's a technique you can use.
WITH T AS
(
SELECT DISTINCT CASE
WHEN bc.Description != 'S' THEN 'M'
WHEN bc.Description = 'S' THEN 'S'
WHEN bc.Description IS NULL THEN 'N'
END AS type,
bp.P_ID
FROM progress_tbl AS progress
INNER JOIN Person_tbl AS bp
ON bp.P_ID = progress.person_id
LEFT OUTER JOIN Status_tbl AS bm
ON bm.MS_ID = bp.MembershipStatusID
LEFT OUTER JOIN Membership_tbl AS m
ON m.M_ID = bp.CurrentMembershipID
LEFT OUTER JOIN Category_tbl AS bc
ON bc.MC_ID = m.MembershipCategoryID
WHERE logged_when BETWEEN '2017-01-01' AND '2017-01-31'
)
SELECT COUNT(DISTINCT P_ID) AS total,
COUNT(CASE WHEN type= 'M' THEN P_ID END) AS m_count,
COUNT(CASE WHEN type= 'S' THEN P_ID END) AS s_count,
COUNT(CASE WHEN type= 'N' THEN P_ID END) AS n_count
FROM T
I will demonstrate it on a simpler example.
Suppose your existing query is
SELECT
COUNT(DISTINCT number) AS total,
COUNT(DISTINCT CASE WHEN name != 'S' THEN number END) AS m_count,
COUNT(DISTINCT CASE WHEN name = 'S' THEN number END) AS s_count,
COUNT(DISTINCT CASE WHEN name IS NULL THEN number END) AS n_count
FROM master..spt_values;
You can rewrite it as follows
WITH T AS
(
SELECT DISTINCT CASE
WHEN name != 'S'
THEN 'M'
WHEN name = 'S'
THEN 'S'
ELSE 'N'
END AS type,
number
FROM master..spt_values
)
SELECT COUNT(DISTINCT number) AS total,
COUNT(CASE WHEN type= 'M' THEN number END) AS m_count,
COUNT(CASE WHEN type= 'S' THEN number END) AS s_count,
COUNT(CASE WHEN type= 'N' THEN number END) AS n_count
FROM T
Note the rewrite is costed as considerably cheaper and the plan is much simpler.
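To check that the rewrite returns the same numbers as the original, here is a small sketch on hypothetical data. The table `memberships`, the column `kind`, and the helper `run_both` are assumptions made for illustration, and SQLite is used instead of SQL Server, so it says nothing about plan cost; it only compares results.

```python
import sqlite3

ORIGINAL = """
SELECT COUNT(DISTINCT p_id),
       COUNT(DISTINCT CASE WHEN description != 'S' THEN p_id END),
       COUNT(DISTINCT CASE WHEN description = 'S' THEN p_id END),
       COUNT(DISTINCT CASE WHEN description IS NULL THEN p_id END)
FROM memberships
"""

REWRITE = """
WITH t AS (
    -- One row per (category, p_id) pair, so later counts need no DISTINCT.
    SELECT DISTINCT
           CASE WHEN description != 'S' THEN 'M'
                WHEN description = 'S' THEN 'S'
                ELSE 'N'
           END AS kind,
           p_id
    FROM memberships
)
SELECT COUNT(DISTINCT p_id),
       COUNT(CASE WHEN kind = 'M' THEN p_id END),
       COUNT(CASE WHEN kind = 'S' THEN p_id END),
       COUNT(CASE WHEN kind = 'N' THEN p_id END)
FROM t
"""

def run_both(rows):
    """rows: (p_id, description) tuples. Returns (original, rewrite) results."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE memberships (p_id INTEGER, description TEXT)")
    con.executemany("INSERT INTO memberships VALUES (?, ?)", rows)
    original = con.execute(ORIGINAL).fetchone()
    rewrite = con.execute(REWRITE).fetchone()
    con.close()
    return original, rewrite
```

Only the total still needs DISTINCT in the rewrite, because the same id can appear under more than one category.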

As already pointed out, there seems to be some typo/copy paste issues with your query. This makes it rather difficult for us to figure out what's going on.
The table spools are probably caused by the CASE WHEN bc.Description ... constructions. MSSQL first creates a (memory) table with all the resulting values, which then gets sorted and streamed through the COUNT(DISTINCT ...) operator. I don't think there is much you can do about that, as the work needs to be done somewhere.
Anyway, some remarks and wild guesses:
I'm guessing that logged_when is in the progress_tbl table?
If so, do you really need to LEFT OUTER JOIN all the other tables? From what I can tell they aren't being used?
You're trying to count the number of P_IDs that match the criteria, and you want to split that number between those whose bc.Description is 'S', something else, or NULL.
For this you could calculate the total as the sum of m_count, s_count and n_count. This would save you one COUNT() operation; I'm not sure it helps a lot in the bigger picture, but every bit helps.
Something like this:
;WITH counts AS (
SELECT
COUNT(DISTINCT CASE WHEN bc.Description != 'S' THEN bp.P_ID END) AS m_count,
COUNT(DISTINCT CASE WHEN bc.Description = 'S' THEN bp.P_ID END) AS s_count,
COUNT(DISTINCT CASE WHEN bc.Description IS NULL THEN bp.P_ID END) AS n_count
FROM
progress_tbl AS progress
INNER JOIN Person_tbl AS bp ON bp.P_ID = progress.person_id
LEFT OUTER JOIN Status_tbl AS bm ON bm.MS_ID = bp.MembershipStatusID -- really needed?
LEFT OUTER JOIN Membership_tbl AS m ON m.M_ID = bp.CurrentMembershipID -- really needed?
LEFT OUTER JOIN Category_tbl AS bc ON bc.MC_ID = m.MembershipCategoryID -- really needed?
WHERE
logged_when BETWEEN '2017-01-01' AND '2017-01-31' -- what table does logged_when column come from????
)
SELECT total = m_count + s_count + n_count,
*
FROM counts
UPDATE
BEWARE: Using the answer/example code of Martin Smith, I came to realize that total isn't necessarily the sum of the other fields. A given P_ID could show up with different descriptions, which might then fall into different categories. Depending on your data, my answer might thus be plain wrong.
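The caveat is easy to reproduce on a tiny hypothetical data set. In this sketch (table `memberships` and helper `category_counts` are made-up names, run on SQLite), person 1 appears with both an 'S' and a non-'S' description, so the three category counts add up to more than the distinct total.

```python
import sqlite3

def category_counts(rows):
    """rows: (p_id, description) tuples. Returns (total, m_count, s_count, n_count)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE memberships (p_id INTEGER, description TEXT)")
    con.executemany("INSERT INTO memberships VALUES (?, ?)", rows)
    result = con.execute(
        """
        SELECT COUNT(DISTINCT p_id),
               COUNT(DISTINCT CASE WHEN description != 'S' THEN p_id END),
               COUNT(DISTINCT CASE WHEN description = 'S' THEN p_id END),
               COUNT(DISTINCT CASE WHEN description IS NULL THEN p_id END)
        FROM memberships
        """
    ).fetchone()
    con.close()
    return result

total, m_count, s_count, n_count = category_counts([(1, 'S'), (1, 'X'), (2, None)])
# Person 1 is counted in both m_count and s_count, so the parts exceed the total.
print(total, m_count + s_count + n_count)
```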

Group by count zero rows not displaying

I want to count the number of rows for every componistId. When I run the following SQL statement it works fine:
SELECT C.componistId, COUNT(*)
FROM Componist C LEFT JOIN Stuk S ON S.componistId = C.componistId
GROUP BY C.componistId
Now I want only the rows where stukNrOrigineel is null
SELECT C.componistId, COUNT(*)
FROM Componist C LEFT JOIN Stuk S ON S.componistId = C.componistId
WHERE S.stuknrOrigineel IS NULL
GROUP BY C.componistId
But when I do this, all the rows with a result of 0 disappear. Only the rows with at least 1 row are displayed. How can I make this work?

You need to include the condition in the on clause:
SELECT C.componistId, COUNT(S.componistId)
FROM Componist C LEFT JOIN
Stuk S
ON S.componistId = C.componistId AND
S.stuknrOrigineel IS NULL
GROUP BY C.componistId;
Note: I changed the COUNT() to count from the second table. This is normally what you want when combining LEFT JOIN with COUNT().
On some databases, I think the above might not quite work as expected (the question is whether the condition on S is evaluated before or after the LEFT JOIN). This should always work:
SELECT C.componistId, COUNT(s.componistId)
FROM Componist C LEFT JOIN
(SELECT S.* FROM Stuk S WHERE S.stuknrOrigineel IS NULL
) s
ON S.componistId = C.componistId
GROUP BY C.componistId;
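Another generic solution is to move the condition into the aggregation function. The extra check on S.componistId keeps composers with no rows in Stuk at 0, because the LEFT JOIN fills their S columns with NULL:
SELECT C.componistId,
SUM(CASE WHEN S.componistId IS NOT NULL AND S.stuknrOrigineel IS NULL THEN 1 ELSE 0 END)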
FROM Componist C LEFT JOIN
Stuk S
ON S.componistId = C.componistId
GROUP BY C.componistId;
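The difference between filtering in WHERE and filtering in ON can be seen side by side in a short sketch. The tables `composer` and `piece` and the helper `null_piece_counts` are hypothetical English stand-ins for Componist and Stuk, run on SQLite for convenience.

```python
import sqlite3

def null_piece_counts(composers, pieces):
    """composers: list of ids; pieces: (composer_id, original_nr) tuples.

    Returns (where_rows, on_rows): the same count with the IS NULL test
    placed in the WHERE clause versus in the ON clause.
    """
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE composer (id INTEGER)")
    con.execute("CREATE TABLE piece (composer_id INTEGER, original_nr INTEGER)")
    con.executemany("INSERT INTO composer VALUES (?)", [(c,) for c in composers])
    con.executemany("INSERT INTO piece VALUES (?, ?)", pieces)
    where_rows = con.execute(
        """
        SELECT c.id, COUNT(*)
        FROM composer c
        LEFT JOIN piece p ON p.composer_id = c.id
        WHERE p.original_nr IS NULL      -- filters rows after the join
        GROUP BY c.id
        ORDER BY c.id
        """
    ).fetchall()
    on_rows = con.execute(
        """
        SELECT c.id, COUNT(p.composer_id)  -- counts matched pieces only
        FROM composer c
        LEFT JOIN piece p
               ON p.composer_id = c.id
              AND p.original_nr IS NULL    -- restricts which pieces match
        GROUP BY c.id
        ORDER BY c.id
        """
    ).fetchall()
    con.close()
    return where_rows, on_rows
```

With composers 1, 2, 3, where composer 1 has two pieces without an original and one with, composer 2 has one piece with an original, and composer 3 has no pieces, the WHERE form drops composer 2 and reports 1 for composer 3, while the ON form keeps every composer and reports 2, 0, 0.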

can we have CASE expression/case result as a join table name in oracle

I have 3 tables say Employee, Permanent_Emp and Contract_Emp
SELECT E.EMP_NO,
E.NAME,
JET.EMP_TYPE,
JET.DATE_JOINED
FROM Employee E
LEFT OUTER JOIN
/* Here Join Table Name(JET) it can be Permanent_Emp or Contract_Emp
which i want as a result of my case expression. */
ON (some condition here) ORDER BY E.EMP_NO DESC
case expression:
CASE
WHEN (E.EMP_TYPE_CODE >10 )
THEN
Permanent_Emp JET
ELSE
Contract_Emp JET
END
Note: table and column names are just for an example to understand requirement.
How can I get the join table name from a CASE expression?

Something like this (although without a description of your tables, the exact join conditions, or any sample data, it's hard to give a more precise answer):
SELECT E.EMP_NO,
E.NAME,
COALESCE( P.EMP_TYPE, C.EMP_TYPE ) AS EMP_TYPE,
COALESCE( P.DATE_JOINED, C.DATE_JOINED ) AS DATE_JOINED
FROM Employee E
LEFT OUTER JOIN
Permanent_Emp P
ON ( E.EMP_TYPE_CODE > 10 AND E.EMP_NO = P.EMP_NO )
LEFT OUTER JOIN
Contract_Emp C
ON ( E.EMP_TYPE_CODE <= 10 AND E.EMP_NO = C.EMP_NO )
ORDER BY
E.EMP_NO DESC

Use your CASE in the SELECT list and join both tables, like this:
SELECT CASE WHEN <condition 1> THEN a.column
WHEN <condition 2> THEN b.column
END
FROM table1 c
JOIN table2 a
ON <join condition>
JOIN table3 b
ON <join condition>
You can't use CASE to pick the table being joined. It's better to join both tables and use a CASE expression in the SELECT, with conditions as per your requirement.

There is no way to conditionally add tables to a query in static SQL. If the relevant columns in Permanent_Emp and Contract_Emp are roughly equivalent, you could use a union in a sub-query.
SELECT *
FROM employee e
JOIN
(SELECT employee_id, relevant_column, 'P' AS source_indicator
FROM permanent_emp
UNION ALL
SELECT employee_id, relevant_column, 'C' AS source_indicator
FROM contract_emp) se
ON e.employee_id = se.employee_id
AND ( (e.emp_type_code > 10 AND source_indicator = 'P')
OR (e.emp_type_code <= 10 AND source_indicator = 'C'))

Using Alan's query as a starting point you can still use a case statement, just move it to the join condition:
SELECT *
FROM employee e
JOIN (
SELECT employee_id
, relevant_column
, 'P' AS source_indicator
FROM permanent_emp
UNION ALL
SELECT employee_id
, relevant_column
, 'C' AS source_indicator
FROM contract_emp
) se
ON se.employee_id = e.employee_id
and se.source_indicator = case when e.emp_type_code > 10
then 'P'
else 'C'
end
The only difference between this query and Allan's is the use of a CASE expression instead of an OR condition.
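A runnable sketch of the UNION ALL plus CASE-in-ON pattern, on SQLite with made-up data. The column `date_joined`, the sample rows, and the helper `joined_rows` are assumptions for illustration. Each employee is deliberately present in both detail tables, to show that the CASE in the join condition picks exactly one source per employee.

```python
import sqlite3

def joined_rows(employees, permanent, contract):
    """employees: (emp_no, emp_type_code); permanent/contract: (emp_no, date_joined)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE employee (emp_no INTEGER, emp_type_code INTEGER)")
    con.execute("CREATE TABLE permanent_emp (emp_no INTEGER, date_joined TEXT)")
    con.execute("CREATE TABLE contract_emp (emp_no INTEGER, date_joined TEXT)")
    con.executemany("INSERT INTO employee VALUES (?, ?)", employees)
    con.executemany("INSERT INTO permanent_emp VALUES (?, ?)", permanent)
    con.executemany("INSERT INTO contract_emp VALUES (?, ?)", contract)
    rows = con.execute(
        """
        SELECT e.emp_no, se.source_indicator, se.date_joined
        FROM employee e
        JOIN (SELECT emp_no, date_joined, 'P' AS source_indicator
              FROM permanent_emp
              UNION ALL
              SELECT emp_no, date_joined, 'C' AS source_indicator
              FROM contract_emp) se
          ON se.emp_no = e.emp_no
         AND se.source_indicator = CASE WHEN e.emp_type_code > 10
                                        THEN 'P'
                                        ELSE 'C'
                                   END
        ORDER BY e.emp_no
        """
    ).fetchall()
    con.close()
    return rows
```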

Inner join that ignore singlets

I have to do a self join on a table. I am trying to return a list of several columns to see how many of each type of drug test were performed on the same day (MM/DD/YYYY), for days on which at least two tests were done and at least one of them had a result code of 'UN'.
I am joining other tables to get the information as below. The problem is I do not quite understand how to exclude someone who has a single result row in which they did have a 'UN' result on a day but did not have any other tests that day.
Query Results (Columns)
County, DrugTestID, ID, Name, CollectionDate, DrugTestType, Results, Count(DrugTestType)
I have several rows for ID 12345, which are correct. But ID 12346 has a single row with a count of 1: they had a result of 'UN' on this day but did not have any other tests that day. I want to exclude this.
I tried the following query
select
c.desc as 'County',
dt.pid as 'PID',
dt.id as 'DrugTestID',
p.id as 'ID',
bio.FullName as 'Participant',
CONVERT(varchar, dt.CollectionDate, 101) as 'CollectionDate',
dtt.desc as 'Drug Test Type',
dt.result as Result,
COUNT(dt.dru_drug_test_type) as 'Count Of Test Type'
from
dbo.Test as dt with (nolock)
join dbo.History as h on dt.pid = h.id
join dbo.Participant as p on h.pid = p.id
join BioData as bio on bio.id = p.id
join County as c with (nolock) on p.CountyCode = c.code
join DrugTestType as dtt with (nolock) on dt.DrugTestType = dtt.code
inner join
(
select distinct
dt2.pid,
CONVERT(varchar, dt2.CollectionDate, 101) as 'CollectionDate'
from
dbo.DrugTest as dt2 with (nolock)
join dbo.History as h2 on dt2.pid = h2.id
join dbo.Participant as p2 on h2.pid = p2.id
where
dt2.result = 'UN'
and dt2.CollectionDate between '11-01-2011' and '10-31-2012'
and p2.DrugCourtType = 'AD'
) as derived
on dt.pid = derived.pid
and convert(varchar, dt.CollectionDate, 101) = convert(varchar, derived.CollectionDate, 101)
group by
c.desc, dt.pid, p.id, dt.id, bio.fullname, dt.CollectionDate, dtt.desc, dt.result
order by
c.desc ASC, Participant ASC, dt.CollectionDate ASC

This is a little complicated because your query has a separate row for each test. You need to use window/analytic functions to get the information you want. These let you calculate aggregate functions while putting the values on each row.
The following query starts with your query. It then calculates the number of UN results on each date for each participant and the total number of tests. It applies the appropriate filter to get what you want:
with base as (<your query here>)
select b.*
from (select b.*,
sum(isUN) over (partition by Participant, CollectionDate) as NumUNs,
count(*) over (partition by Participant, CollectionDate) as NumTests
from (select b.*,
(case when result = 'UN' then 1 else 0 end) as IsUN
from base b
) b
) b
where NumUNs <> 1 or NumTests <> 1
Without the with clause or window functions, you can create a particularly ugly query to do the same thing:
select b.*
from (<your query>) b join
(select Participant, CollectionDate, count(*) as NumTests,
sum(case when result = 'UN' then 1 else 0 end) as NumUNs
from (<your query>) b
group by Participant, CollectionDate
) bsum
on b.Participant = bsum.Participant and
b.CollectionDate = bsum.CollectionDate
where NumUNs <> 1 or NumTests <> 1
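Here is a small sketch of the window-function variant on a simplified, hypothetical table (`drug_test` with `participant`, `test_day`, `result`; the helper name `days_with_un_and_company` is also made up). It runs on SQLite, which needs version 3.25 or later for window functions. It filters directly on the stated requirement (at least one 'UN' and at least two tests on the day) rather than reproducing the answer's exact predicate.

```python
import sqlite3

def days_with_un_and_company(rows):
    """rows: (participant, test_day, result) tuples.

    Keeps every test row from a participant-day that has at least one
    'UN' result and at least two tests in total.
    """
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE drug_test (participant TEXT, test_day TEXT, result TEXT)"
    )
    con.executemany("INSERT INTO drug_test VALUES (?, ?, ?)", rows)
    kept = con.execute(
        """
        SELECT participant, test_day, result
        FROM (SELECT t.*,
                     SUM(CASE WHEN result = 'UN' THEN 1 ELSE 0 END)
                         OVER (PARTITION BY participant, test_day) AS num_uns,
                     COUNT(*)
                         OVER (PARTITION BY participant, test_day) AS num_tests
              FROM drug_test t) counted
        WHERE num_uns >= 1 AND num_tests >= 2
        ORDER BY participant, test_day, result
        """
    ).fetchall()
    con.close()
    return kept
```

A participant with a lone 'UN' row is dropped because num_tests is 1, and a day with several tests but no 'UN' is dropped because num_uns is 0.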

If I understand the problem, the basic pattern for this sort of query is simply to include negating or exclusionary conditions in your join, i.e., a self-join where column A matches but columns B and C do not:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
and t1.PkId != t2.PkId
and t1.category != t2.category
)
Put the conditions in the WHERE clause if it benchmarks better:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
And it's often easiest to start with the self-join, treating it as a "base table" on which to join all related information:
select
[columns]
from
(select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
) bt
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
This can allow you to focus on getting that self-join right, without interference from other tables.
