Inner join that ignores singlets - sql

I have to do a self join on a table. I am trying to return several columns showing how many of each type of drug test were performed on the same day (MM/DD/YYYY), limited to days on which at least two tests were done and at least one of them had a result code of 'UN'.
I am joining other tables to get the information, as shown below. The problem is that I do not understand how to exclude someone who has a single result row: they had a 'UN' result on a day but did not have any other tests that day.
Query Results (Columns)
County, DrugTestID, ID, Name, CollectionDate, DrugTestType, Results, Count(DrugTestType)
I have several rows for ID 12345, which are correct. But ID 12346 has a single row with a count of 1: they had a result of 'UN' on that day but did not have any other tests that day. I want to exclude this row.
I tried the following query:
select
c.desc as 'County',
dt.pid as 'PID',
dt.id as 'DrugTestID',
p.id as 'ID',
bio.FullName as 'Participant',
CONVERT(varchar, dt.CollectionDate, 101) as 'CollectionDate',
dtt.desc as 'Drug Test Type',
dt.result as Result,
COUNT(dt.dru_drug_test_type) as 'Count Of Test Type'
from
dbo.Test as dt with (nolock)
join dbo.History as h on dt.pid = h.id
join dbo.Participant as p on h.pid = p.id
join BioData as bio on bio.id = p.id
join County as c with (nolock) on p.CountyCode = c.code
join DrugTestType as dtt with (nolock) on dt.DrugTestType = dtt.code
inner join
(
select distinct
dt2.pid,
CONVERT(varchar, dt2.CollectionDate, 101) as 'CollectionDate'
from
dbo.DrugTest as dt2 with (nolock)
join dbo.History as h2 on dt2.pid = h2.id
join dbo.Participant as p2 on h2.pid = p2.id
where
dt2.result = 'UN'
and dt2.CollectionDate between '11-01-2011' and '10-31-2012'
and p2.DrugCourtType = 'AD'
) as derived
on dt.pid = derived.pid
and convert(varchar, dt.CollectionDate, 101) = convert(varchar, derived.CollectionDate, 101)
group by
c.desc, dt.pid, p.id, dt.id, bio.fullname, dt.CollectionDate, dtt.desc, dt.result
order by
c.desc ASC, Participant ASC, dt.CollectionDate ASC

This is a little complicated because your query has a separate row for each test. You need to use window/analytic functions to get the information you want. These let you calculate aggregate functions while keeping the values on each row.
The following query starts with your query. It then calculates, for each participant on each date, the number of UN results and the total number of tests. It applies the appropriate filter to get what you want:
with base as (<your query here>) -- omit the ORDER BY inside the CTE; SQL Server rejects it there without TOP
select b.*
from (select b.*,
sum(IsUN) over (partition by Participant, CollectionDate) as NumUNs,
count(*) over (partition by Participant, CollectionDate) as NumTests
from (select base.*,
(case when Result = 'UN' then 1 else 0 end) as IsUN
from base
) b
) b
where NumUNs <> 1 or NumTests <> 1
Without the with clause or window functions, you can create a particularly ugly query to do the same thing:
select b.*
from (<your query>) b join
(select Participant, CollectionDate, count(*) as NumTests,
sum(case when result = 'UN' then 1 else 0 end) as NumUNs
from (<your query>) b
group by Participant, CollectionDate
) bsum
on b.Participant = bsum.Participant and
b.CollectionDate = bsum.CollectionDate
where NumUNs <> 1 or NumTests <> 1
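The window-function idea above can be tried on a toy table. The following is a minimal sketch, not the asker's schema: the `base` table, its three columns, and the sample rows are all assumptions made for illustration, and it runs on SQLite (3.25 or newer for window functions) through Python rather than on SQL Server. The filter states the requirement directly (at least one UN and at least two tests per participant per day) instead of using the `<> 1` form above.

```python
import sqlite3

# Hypothetical stand-in for the result of the base query: one row per test.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base (Participant TEXT, CollectionDate TEXT, Result TEXT);
    INSERT INTO base VALUES
        ('12345', '2012-01-05', 'UN'),
        ('12345', '2012-01-05', 'NEG'),
        ('12346', '2012-01-05', 'UN');
""")

# Attach per-participant, per-day totals to every row, then filter on them.
rows = conn.execute("""
    SELECT Participant, CollectionDate, Result
    FROM (
        SELECT b.*,
               SUM(CASE WHEN Result = 'UN' THEN 1 ELSE 0 END)
                   OVER (PARTITION BY Participant, CollectionDate) AS NumUNs,
               COUNT(*)
                   OVER (PARTITION BY Participant, CollectionDate) AS NumTests
        FROM base AS b
    ) AS t
    WHERE NumUNs >= 1 AND NumTests >= 2
    ORDER BY Participant, Result
""").fetchall()

print(rows)
```

With this sample data, both rows for 12345 are kept and the lone 'UN' row for 12346 is dropped.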

If I understand the problem, the basic pattern for this sort of query is simply to include negating or exclusionary conditions in your join, i.e., self-join where column A matches but columns B and C do not:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
and t1.PkId != t2.PkId
and t1.category != t2.category
)
Put the conditions in the WHERE clause if it benchmarks better:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
And it's often easiest to start with the self-join, treating it as a "base table" on which to join all related information:
select
[columns]
from
(select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
) bt
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
This can allow you to focus on getting that self-join right, without interference from other tables.
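As a rough illustration of the self-join pattern, here is a small sketch using SQLite through Python. The `tests` table, its columns, and the rows are invented for the example. It pairs each 'UN' test with any other test for the same participant on the same day, so a participant with only one test that day produces no pair.

```python
import sqlite3

# Hypothetical table: one row per drug test.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tests (id INTEGER PRIMARY KEY, pid TEXT, day TEXT, result TEXT);
    INSERT INTO tests VALUES
        (1, '12345', '2012-01-05', 'UN'),
        (2, '12345', '2012-01-05', 'NEG'),
        (3, '12346', '2012-01-05', 'UN');
""")

# Self-join: same participant and day, but a different primary key.
pairs = conn.execute("""
    SELECT t1.id, t2.id
    FROM tests AS t1
    JOIN tests AS t2
      ON t1.pid = t2.pid
     AND t1.day = t2.day
     AND t1.id <> t2.id
    WHERE t1.result = 'UN'
    ORDER BY t1.id, t2.id
""").fetchall()

print(pairs)
```

Test 3 is the only test for participant 12346 that day, so the `t1.id <> t2.id` condition leaves it without a partner and it does not appear in the result.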

Related

How to output a new column in a SELECT query based on a condition?

Currently, I have this SQL query:
SELECT AVG(ttbe.MarkGiven) FROM tblTestsTakenByEmployee ttbe
INNER JOIN tblCoursesTakenByEmployee ctbe ON ttbe.EmployeeId = ctbe.EmployeeId
LEFT JOIN tblCourse c ON ctbe.CourseId = c.CourseId
WHERE ctbe.HasCompletedCourse = 'Y'
GROUP BY ctbe.CourseId, c.CourseName, EXTRACT(YEAR FROM ctbe.DateOfCourseCompletion), ctbe.EmployeeId
At the moment, this returns the average mark of a single employee on a course which was completed on a certain year, calculated across each of the tests it contains (a course can have multiple tests).
I want to output an additional column to the SELECT query which specifies whether that employee has passed the course, based on a threshold. For example, if AVG(ttbe.MarkGiven) >= 40 then it would return 'Y' in the new column, otherwise it would return 'N'. What's the simplest and most efficient way of achieving this?
You could use a CASE expression like this:
SELECT AVG(ttbe.MarkGiven),
CASE WHEN AVG(ttbe.MarkGiven) >= 40 THEN 'Y' ELSE 'N' END as exam_passed
FROM tblTestsTakenByEmployee ttbe
INNER JOIN tblCoursesTakenByEmployee ctbe ON ttbe.EmployeeId = ctbe.EmployeeId
LEFT JOIN tblCourse c ON ctbe.CourseId = c.CourseId
WHERE ctbe.HasCompletedCourse = 'Y'
GROUP BY ctbe.CourseId, c.CourseName, EXTRACT(YEAR FROM ctbe.DateOfCourseCompletion),
ctbe.EmployeeId
https://www.oracletutorial.com/oracle-basics/oracle-case/#:~:text=Oracle%20CASE%20expression%20allows%20you,that%20accepts%20a%20valid%20expression.
You could use your current query as a CTE and then compare the result from the CTE to your threshold.
The advantage of the CTE is that you do not have to write out AVG(ttbe.MarkGiven) a second time.
WITH _CTE as
(
SELECT AVG(ttbe.MarkGiven) as Col1
FROM tblTestsTakenByEmployee ttbe
INNER JOIN tblCoursesTakenByEmployee ctbe
ON ttbe.EmployeeId = ctbe.EmployeeId
LEFT JOIN tblCourse c
ON ctbe.CourseId = c.CourseId
WHERE ctbe.HasCompletedCourse = 'Y'
GROUP BY ctbe.CourseId, c.CourseName, EXTRACT(YEAR FROM ctbe.DateOfCourseCompletion), ctbe.EmployeeId
)
Select Col1
,CASE
WHEN Col1 >= 40 THEN 'Y'
ELSE 'N'
END AS Col1_Threshold
FROM _CTE
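The CTE-plus-CASE approach can be checked on a reduced example. This is only a sketch: the single `marks` table and its rows are assumptions that stand in for the three joined tables in the question, the threshold of 40 is taken from the question's example, and it runs on SQLite through Python.

```python
import sqlite3

# Hypothetical, flattened stand-in for the joined tables in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE marks (EmployeeId INTEGER, CourseId INTEGER, MarkGiven REAL);
    INSERT INTO marks VALUES (1, 10, 30), (1, 10, 60), (2, 10, 20), (2, 10, 50);
""")

# Compute the average once in the CTE, then derive the pass flag from it.
rows = conn.execute("""
    WITH avg_marks AS (
        SELECT EmployeeId, CourseId, AVG(MarkGiven) AS AvgMark
        FROM marks
        GROUP BY EmployeeId, CourseId
    )
    SELECT EmployeeId, AvgMark,
           CASE WHEN AvgMark >= 40 THEN 'Y' ELSE 'N' END AS Passed
    FROM avg_marks
    ORDER BY EmployeeId
""").fetchall()

print(rows)
```

Employee 1 averages 45 and gets 'Y'; employee 2 averages 35 and gets 'N'.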

Print result from 2 queries in different columns of one table

I am trying to output result from 2 queries in one table but no luck. Tried with UNION and with this template
SELECT x.a, y.b FROM (SELECT * from a) as x, (SELECT * FROM b) as y
with no luck. My queries are:
Select
Sum('.MAIN_DB_PREFIX.'facturedet.qty) As qty_sum,
'.MAIN_DB_PREFIX.'user.firstname,
'.MAIN_DB_PREFIX.'user.lastname,
'.MAIN_DB_PREFIX.'user.rowid,
'.MAIN_DB_PREFIX.'facture.datef,
'.MAIN_DB_PREFIX.'user.rowid
From
'.MAIN_DB_PREFIX.'facturedet
Inner Join
'.MAIN_DB_PREFIX.'facture On '.MAIN_DB_PREFIX.'facturedet.fk_facture = '.MAIN_DB_PREFIX.'facture.rowid Inner Join
'.MAIN_DB_PREFIX.'societe_commerciaux On '.MAIN_DB_PREFIX.'societe_commerciaux.fk_soc = '.MAIN_DB_PREFIX.'facture.fk_soc Inner Join
'.MAIN_DB_PREFIX.'user On '.MAIN_DB_PREFIX.'societe_commerciaux.fk_user = '.MAIN_DB_PREFIX.'user.rowid
Where
".MAIN_DB_PREFIX."facture.datef Between Date_Format(Now(), '%Y-%m-01') And CurDate()
GROUP BY
'.MAIN_DB_PREFIX.'user.rowid
ORDER BY
qty_sum DESC
and
Select
Sum('.MAIN_DB_PREFIX.'facture.total) As total_sum,
'.MAIN_DB_PREFIX.'user.firstname,
'.MAIN_DB_PREFIX.'user.lastname,
'.MAIN_DB_PREFIX.'user.rowid,
'.MAIN_DB_PREFIX.'facture.datef,
'.MAIN_DB_PREFIX.'user.rowid
From
'.MAIN_DB_PREFIX.'facture Inner Join
'.MAIN_DB_PREFIX.'societe_commerciaux On '.MAIN_DB_PREFIX.'societe_commerciaux.fk_soc = '.MAIN_DB_PREFIX.'facture.fk_soc Inner Join
'.MAIN_DB_PREFIX.'user On '.MAIN_DB_PREFIX.'societe_commerciaux.fk_user = '.MAIN_DB_PREFIX.'user.rowid
Where
".MAIN_DB_PREFIX."facture.datef Between Date_Format(Now(), '%Y-%m-01') And CurDate()
GROUP BY
'.MAIN_DB_PREFIX.'user.rowid
ORDER BY
total_sum DESC
With both UNION or combined query as mentioned above, I get blank result and no errors.
Also, I cannot wrap the select in brackets and use an alias like (query) a UNION (query2) b as this also brings blank output.
I can output first name + last name in column one of a table, sum_qty on second, some calculation is made based on the qty and it goes in the third column. In the fourth column, I need to output total_sum
I dug through about 20 threads here and tried different solutions, but with no result.
Full code: https://pastebin.com/bie0LXH9
This should achieve the desired result (i.e. the result set that you need to output your table) using the UNION that you were attempting. A more efficient query could possibly be written with a Common Table Expression, which may or may not be available to you, depending on your RDBMS and its version.
SELECT
firstname,
lastname,
rowid,
SUM(IFNULL(qty_sum, 0)) AS qty_sum,
SUM(IFNULL(total_sum, 0)) AS total_sum
FROM
(
Select
Sum('.MAIN_DB_PREFIX.'facturedet.qty) As qty_sum,
0 AS total_sum,
'.MAIN_DB_PREFIX.'user.firstname,
'.MAIN_DB_PREFIX.'user.lastname,
'.MAIN_DB_PREFIX.'user.rowid
From
'.MAIN_DB_PREFIX.'facturedet
Inner Join
'.MAIN_DB_PREFIX.'facture On '.MAIN_DB_PREFIX.'facturedet.fk_facture = '.MAIN_DB_PREFIX.'facture.rowid Inner Join
'.MAIN_DB_PREFIX.'societe_commerciaux On '.MAIN_DB_PREFIX.'societe_commerciaux.fk_soc = '.MAIN_DB_PREFIX.'facture.fk_soc Inner Join
'.MAIN_DB_PREFIX.'user On '.MAIN_DB_PREFIX.'societe_commerciaux.fk_user = '.MAIN_DB_PREFIX.'user.rowid
Where
".MAIN_DB_PREFIX."facture.datef Between Date_Format(Now(), '%Y-%m-01') And CurDate()
GROUP BY
'.MAIN_DB_PREFIX.'user.rowid
UNION ALL
Select
0 AS qty_sum,
Sum('.MAIN_DB_PREFIX.'facture.total) As total_sum,
'.MAIN_DB_PREFIX.'user.firstname,
'.MAIN_DB_PREFIX.'user.lastname,
'.MAIN_DB_PREFIX.'user.rowid
From
'.MAIN_DB_PREFIX.'facture Inner Join
'.MAIN_DB_PREFIX.'societe_commerciaux On '.MAIN_DB_PREFIX.'societe_commerciaux.fk_soc = '.MAIN_DB_PREFIX.'facture.fk_soc Inner Join
'.MAIN_DB_PREFIX.'user On '.MAIN_DB_PREFIX.'societe_commerciaux.fk_user = '.MAIN_DB_PREFIX.'user.rowid
Where
".MAIN_DB_PREFIX."facture.datef Between Date_Format(Now(), '%Y-%m-01') And CurDate()
GROUP BY
'.MAIN_DB_PREFIX.'user.rowid
) AS tbl_all
GROUP BY rowid
ORDER BY
total_sum DESC
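The technique in the query above is to give each branch of the UNION ALL a zero placeholder for the column it does not compute, then sum both columns per user in the outer query. A minimal sketch on SQLite through Python follows; the `qty` and `totals` tables and their rows are hypothetical simplifications of the invoice tables in the question.

```python
import sqlite3

# Hypothetical, simplified tables: quantities and invoice totals per user.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE qty (user_id INTEGER, qty INTEGER);
    CREATE TABLE totals (user_id INTEGER, total REAL);
    INSERT INTO qty VALUES (1, 2), (1, 3), (2, 4);
    INSERT INTO totals VALUES (1, 100.0), (2, 50.0), (2, 25.0);
""")

# Each branch fills the other branch's column with 0; the outer SUM merges them.
rows = conn.execute("""
    SELECT user_id, SUM(qty_sum) AS qty_sum, SUM(total_sum) AS total_sum
    FROM (
        SELECT user_id, SUM(qty) AS qty_sum, 0 AS total_sum
        FROM qty
        GROUP BY user_id
        UNION ALL
        SELECT user_id, 0 AS qty_sum, SUM(total) AS total_sum
        FROM totals
        GROUP BY user_id
    ) AS tbl_all
    GROUP BY user_id
    ORDER BY total_sum DESC
""").fetchall()

print(rows)
```

Each user ends up on a single row with both sums side by side, which is the shape needed to fill the table columns.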

Combine two queries to get the data in two columns

SELECT
tblEmployeeMaster.TeamName, SUM(tblData.Quantity) AS 'TotalQuantity'
FROM
tblData
INNER JOIN
tblEmployeeMaster ON tblData.EntryByHQCode = tblEmployeeMaster.E_HQCode
INNER JOIN
tblPhotos ON tblEmployeeMaster.TeamNo = tblPhotos.TeamNo
WHERE
IsPSR = 'Y'
GROUP BY
tblPhotos.TeamSort, tblPhotos.TeamNo, tblPhotos.Data,
tblEmployeeMaster.TeamName
ORDER BY
tblPhotos.TeamSort DESC, TotalQuantity DESC
This returns
Using this statement
select TeamName, count(TeamName) AS 'Head Count'
from dbo.tblEmployeeMaster
where IsPSR = 'Y'
group by teamname
Which returns
I would like to combine these 2 queries in 1 to get the below result.
Tried union / union all but no success :(
Any help would be much appreciated.
You can simply use the sub-query as follows:
SELECT tblEmployeeMaster.TeamName, SUM(tblData.Quantity) AS 'TotalQuantity',
MAX(HEAD_COUNT) AS HEAD_COUNT, -- USE THIS VALUE FROM SUB-QUERY
CASE WHEN MAX(HEAD_COUNT) <> 0
THEN SUM(tblData.Quantity)/MAX(HEAD_COUNT)
END AS PER_MAN_CONTRIBUTION -- column asked in comment
FROM tblData INNER JOIN
tblEmployeeMaster ON tblData.EntryByHQCode = tblEmployeeMaster.E_HQCode INNER JOIN
tblPhotos ON tblEmployeeMaster.TeamNo = tblPhotos.TeamNo
-- FOLLOWING SUB-QUERY CAN BE USED
LEFT JOIN (select TeamName, count(TeamName) AS HEAD_COUNT
from dbo.tblEmployeeMaster
where IsPSR = 'Y' group by teamname) AS HC
ON HC.TeamName = tblEmployeeMaster.TeamName
where IsPSR = 'Y'
GROUP BY tblPhotos.TeamSort, tblPhotos.TeamNo, tblPhotos.Data,tblEmployeeMaster.TeamName
order by tblPhotos.TeamSort desc, TotalQuantity desc
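To see why the head count is computed in its own subquery before joining, here is a small sketch on SQLite through Python. The `employee` and `data` tables and their rows are invented for illustration and leave out tblPhotos and the IsPSR filter.

```python
import sqlite3

# Hypothetical tables: employees per team, and quantity entries per employee.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (hq_code TEXT, team TEXT);
    CREATE TABLE data (hq_code TEXT, quantity INTEGER);
    INSERT INTO employee VALUES ('A1', 'Red'), ('A2', 'Red'), ('B1', 'Blue');
    INSERT INTO data VALUES ('A1', 10), ('A1', 5), ('A2', 5), ('B1', 7);
""")

# The head count is aggregated separately, so it is not inflated by the
# number of data rows each employee has.
rows = conn.execute("""
    SELECT e.team,
           SUM(d.quantity) AS total_quantity,
           MAX(hc.head_count) AS head_count
    FROM data AS d
    JOIN employee AS e ON d.hq_code = e.hq_code
    LEFT JOIN (
        SELECT team, COUNT(*) AS head_count
        FROM employee
        GROUP BY team
    ) AS hc ON hc.team = e.team
    GROUP BY e.team
    ORDER BY e.team
""").fetchall()

print(rows)
```

Counting employees directly in the outer query would give 3 for team Red here, because employee A1 has two data rows; the pre-aggregated subquery keeps it at 2.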

counting items that match within select statement

We have a stored procedure that executes against some fairly large tables and while joining to a larger table it is also keeping a tally of how many records match the corresponding batch_id. What I am trying to figure out is can I improve this with a function for the count or some other means? Trying to get rid of the nested SELECT COUNT(*) statement. The CCTransactions table is 1.4 million rows and the BatchItems is 6.6 million rows.
SELECT a.ItemAuthID, a.FeeAuthID, a.Batch_ID, a.ItemAuthCode,
a.FeeAuthCode, b.Amount, b.Fee,
(SELECT COUNT(*) FROM BatchItems WHERE Batch_ID = a.Batch_ID) AS BatchCount,
ItemBillDate, FeeBillDate, b.AccountNumber,
b.Itemcode, ItemAuthToken, FeeAuthToken,
cc.ItemMerchant, cc.FeeMerchant
FROM CCTransactions a WITH(NOLOCK)
INNER JOIN BatchItems b WITH(NOLOCK)
ON a.Batch_ID = b.Batch_ID
INNER JOIN CCConfig cc WITH(NOLOCK)
ON a.ClientCode = cc.ClientCode
WHERE ((ItemAuthCode > '' AND ItemBillDate IS NULL)
OR (FeeAuthCode > '' AND FeeBillDate IS NULL))
AND TransactionDate BETWEEN DATEADD(d,-7,GETDATE())
AND convert(char(20),getdate(),101) + ' ' + @Cutoff
ORDER BY TransactionDate
If your DBMS supports windowed aggregate functions, you can rewrite it to
COUNT(*) OVER (PARTITION BY Batch_ID)
Of course, this only returns the number of rows per Batch_ID returned by the SELECT. If the inner join results in fewer rows, it is not the correct number.
In that case it might be more efficient (depending on your DBMS) to rewrite the scalar subquery as a join:
SELECT a.ItemAuthID, a.FeeAuthID, a.Batch_ID, a.ItemAuthCode,
a.FeeAuthCode, b.Amount, b.Fee,
dt.BatchCount,
ItemBillDate, FeeBillDate, b.AccountNumber,
b.Itemcode, ItemAuthToken, FeeAuthToken,
cc.ItemMerchant, cc.FeeMerchant
FROM CCTransactions a WITH(NOLOCK)
INNER JOIN BatchItems b WITH(NOLOCK)
ON a.Batch_ID = b.Batch_ID
INNER JOIN CCConfig cc WITH(NOLOCK)
ON a.ClientCode = cc.ClientCode
INNER JOIN
(
SELECT Batch_ID, COUNT(*) AS BatchCount
FROM BatchItems
GROUP BY Batch_ID
) AS dt ON a.Batch_ID = dt.Batch_ID
WHERE ((ItemAuthCode > '' AND ItemBillDate IS NULL)
OR (FeeAuthCode > '' AND FeeBillDate IS NULL))
AND TransactionDate BETWEEN DATEADD(d,-7,GETDATE())
AND convert(CHAR(20),getdate(),101) + ' ' + @Cutoff
ORDER BY TransactionDate
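The difference between the two counts can be seen on a toy example. This sketch uses SQLite (3.25 or newer) through Python; the `transactions` and `batch_items` tables, their rows, and the `amount >= 20` filter are assumptions chosen only so that the result set contains fewer rows than the batch holds.

```python
import sqlite3

# Hypothetical tables: batch 1 has three items, batch 2 has one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE transactions (batch_id INTEGER, auth_code TEXT);
    CREATE TABLE batch_items (batch_id INTEGER, amount REAL);
    INSERT INTO transactions VALUES (1, 'X'), (2, 'Y');
    INSERT INTO batch_items VALUES (1, 10.0), (1, 20.0), (1, 30.0), (2, 5.0);
""")

# Window count: counts only the rows that survive the join and the filter.
windowed = conn.execute("""
    SELECT a.batch_id, b.amount,
           COUNT(*) OVER (PARTITION BY a.batch_id) AS batch_count
    FROM transactions AS a
    JOIN batch_items AS b ON a.batch_id = b.batch_id
    WHERE b.amount >= 20
    ORDER BY a.batch_id, b.amount
""").fetchall()

# Derived table: counts every item in the batch, independent of the filter.
joined = conn.execute("""
    SELECT a.batch_id, b.amount, dt.batch_count
    FROM transactions AS a
    JOIN batch_items AS b ON a.batch_id = b.batch_id
    JOIN (
        SELECT batch_id, COUNT(*) AS batch_count
        FROM batch_items
        GROUP BY batch_id
    ) AS dt ON a.batch_id = dt.batch_id
    WHERE b.amount >= 20
    ORDER BY a.batch_id, b.amount
""").fetchall()

print(windowed)
print(joined)
```

The window version reports 2 for batch 1 because only two of its rows remain in the result, while the derived table reports the full 3, matching what the original scalar subquery returned.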

how to get the count in SQL Server?

I have tried a lot to figure out how to get the count from two tables with respect to a master table.
I have three tables
Using these table values I need to get this output..
I tried but could not get the desired result.
http://en.wikipedia.org/wiki/Join_(SQL)
SQL - LEFT OUTER JOIN and WHERE clause
http://forums.devshed.com/oracle-development-96/combination-of-left-outer-join-and-where-clause-383248.html
You have to first GROUP BY in subqueries, then JOIN to the main table:
SELECT
a.AttributeId
, COALESCE(cntE, 0) AS cntE
, COALESCE(cntM, 0) AS cntM
FROM
AttributeMaster AS a
LEFT JOIN
( SELECT
AttributeId
, COUNT(*) AS cntE
FROM
EmployeeMaster
GROUP BY
AttributeId
) em
ON em.AttributeId = a.AttributeId
LEFT JOIN
( SELECT
AttributeId
, COUNT(*) AS cntM
FROM
MonthlyDerivedMaster
GROUP BY
AttributeId
) mdm
ON mdm.AttributeId = a.AttributeId
SELECT AttributeId,
(SELECT COUNT(Eid) FROM EmployeeMaster WHERE AttributeMaster.AttributeId = EmployeeMaster.AttributeId) as master_eid,
(SELECT COUNT(Eid) FROM MonthlyDerivedMaster WHERE AttributeMaster.AttributeId = MonthlyDerivedMaster.AttributeId) as monthly_eid
FROM AttributeMaster
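Both answers can be compared on sample data. The following sketch runs on SQLite through Python; the table layouts (an `Eid` and an `AttributeId` column in each child table) and the rows are assumptions, since the original tables were only shown as images.

```python
import sqlite3

# Hypothetical schema and data for the three tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AttributeMaster (AttributeId INTEGER PRIMARY KEY);
    CREATE TABLE EmployeeMaster (Eid INTEGER, AttributeId INTEGER);
    CREATE TABLE MonthlyDerivedMaster (Eid INTEGER, AttributeId INTEGER);
    INSERT INTO AttributeMaster VALUES (1), (2), (3);
    INSERT INTO EmployeeMaster VALUES (100, 1), (101, 1), (102, 2);
    INSERT INTO MonthlyDerivedMaster VALUES (100, 1), (102, 2), (103, 2), (104, 2);
""")

# Variant 1: aggregate each child table first, then LEFT JOIN to the master.
joined = conn.execute("""
    SELECT a.AttributeId,
           COALESCE(em.cntE, 0) AS cntE,
           COALESCE(mdm.cntM, 0) AS cntM
    FROM AttributeMaster AS a
    LEFT JOIN (SELECT AttributeId, COUNT(*) AS cntE
               FROM EmployeeMaster GROUP BY AttributeId) AS em
           ON em.AttributeId = a.AttributeId
    LEFT JOIN (SELECT AttributeId, COUNT(*) AS cntM
               FROM MonthlyDerivedMaster GROUP BY AttributeId) AS mdm
           ON mdm.AttributeId = a.AttributeId
    ORDER BY a.AttributeId
""").fetchall()

# Variant 2: one correlated scalar subquery per count.
correlated = conn.execute("""
    SELECT a.AttributeId,
           (SELECT COUNT(Eid) FROM EmployeeMaster AS e
             WHERE e.AttributeId = a.AttributeId) AS cntE,
           (SELECT COUNT(Eid) FROM MonthlyDerivedMaster AS m
             WHERE m.AttributeId = a.AttributeId) AS cntM
    FROM AttributeMaster AS a
    ORDER BY a.AttributeId
""").fetchall()

print(joined)
print(correlated)
```

Both variants return the same counts on this data, including zeros for attribute 3, which has no rows in either child table. Joining both child tables directly to the master and counting afterwards would multiply the rows for attributes that appear in both.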