SQL Server - Conditional Join to whatever table is not null - sql

Badly phrased title my apologies.
I am trying to join a table to one of two other tables
MasterTable
SubTable
SubTableArchive
So, MasterTable contains an ID field.
SubTable and SubTableArchive contain a MasterTableId field for the join.
But, data will only ever exist in one of these SubTables. So I just want to join to whichever table has the data in it.
But the only way I know of doing it is by joining to both and using isnull's on all the fields im selecting, and its quickly becoming complicated to read (and write).
Especially because some of the fields are already wrapped in ISNULL's
SELECT M.Id, ISNULL(S.Field1, SA.field1), ISNULL(S.field2, SA.Field2),
SUM(CASE WHEN (ISNULL(S.Finished,SA.Finished)=1 AND ISNULL( ISNULL(S.ItemCode,SA.ItemCode),'')='') THEN 1 WHEN (ISNULL(S.Finished,SA.Finished)=0 AND ISNULL( ISNULL(S.AltItemCode,SA.AltItemCode),'')='') THEN 1 ELSE 0 END) AS SummaryField
FROM MAsterTable M
LEFT OUTER JOIN SubTable S ON S.MasterTableId = M.Id
LEFT OUTER JOIN SubTableArchive SA ON S.MasterTableId = M.Id
GROUP BY M.Id, ISNULL(S.Field1, SA.field1), ISNULL(S.field2, SA.Field2)
So that is working, but its not pretty.
Thats a sample, but the real queries are longer and more convoluted.
I was hoping SQL might have had some sort of conditional joining functionality built in.
Something to do what I am trying to do and leave the query itslef a little friendlier.

Another option is to use a UNION and then use INNER JOINs
SELECT M.x, S.x
FROM MAsterTable M INNER JOIN SubTable S ON S.MasterTableId = M.Id
UNION
SELECT M.x, STA.x
FROM MAsterTable M INNER JOIN SubTableArchive STA ON STA.MasterTableId = M.Id
From a maintenance point of view, if you make the above union a view, you can then apply where filters and sorts to the view, which should simplify matters.

Use sub-select or better WITH statement (code not tested):
WITH SubTable_WithArchive
AS (SELECT Field1, Field2, Finished, ItemCode, AltItemCode, MasterTableID
FROM SubTable
UNION ALL
SELECT Field1, Field2, Finished, ItemCode, AltItemCode, MasterTableID
FROM SubTableArchive
)
SELECT M.Id,
S.Field1,
S.Field2,
SUM(CASE
WHEN s.Finished = 1 AND ISNULL(s.ItemCode, '') == '' THEN 1
WHEN s.Finished = 0 AND ISNULL(s.AltItemCode, '') == '' THEN 1
ELSE 0
END)
AS SummaryField
FROM MasterTable M
LEFT OUTER JOIN SubTable_WithArchive S
ON S.MasterTableId = M.Id
GROUP BY M.Id, S.Field1, s.field2

No, unfortunately, there isn't.

Try this
SELECT M.Id, S.Field1,S.field2,
SUM(CASE WHEN S.Finished=1 AND ISNULL( S.ItemCode,'')='')
THEN 1
WHEN S.Finished=0 AND ISNULL( S.AltItemCode,'')='')
THEN 1 ELSE 0 END) AS SummaryField
FROM MAsterTable M
JOIN (
SELECT id,field1,field2,ItemCode,AltItemCode,finished
FROM subTable
UNION
SELECT id,field1,field2,ItemCode,AltItemCode,finished
FROM subTableArchive
) S ON S.id = M.Id
GROUP BY M.Id, S.Field1,S.field2

Because a ID value from MasterTable will exist in only one table (SubTable or SubTableArchive) you can use this query:
SELECT MasterTableId Id, Field1, Field2,
SUM(CASE
WHEN Finished=1 AND ItemCode IS NULL THEN 1 --OR ISNULL(ItemCode,'') = ''
WHEN Finished=0 AND AltItemCode IS NULL THEN 1 --OR ISNULL(AltItemCode,'') = ''
ELSE 0
END) AS SummaryField
FROM SubTable
GROUP BY 1, 2, 3
UNION ALL
SELECT MasterTableId Id, Field1, Field2,
SUM(CASE
WHEN Finished=1 AND ItemCode IS NULL THEN 1 --OR ISNULL(ItemCode,'') = ''
WHEN Finished=0 AND AltItemCode IS NULL THEN 1 --OR ISNULL(AltItemCode,'') = ''
ELSE 0
END) AS SummaryField
FROM SubTableArchive
GROUP BY 1, 2, 3

Related

Join 2 tables even with null values and sum each row

See pics below of what i want to do with my sql statement. I have used a left join and ISNULL to get all the results fine except for I don't get the total, which I want to sum the numbers for each customer. All table b values are integers.
Select a.CustomerId, a.FName, a.LName, b.mtg1, b.mtg2, b.mtg3, b.mtg4 From Customer a Left Join Hours b On a.CustomerID = b.CustomerID group by a.Lname
The GROUP BY will make it a bit difficult to captured all the necessary data. Also, the sum of different columns will require the use of COALESCE(column, 0) so as to use zero as the value if the column is null because if not done, your total will come back asNULL`.
One possible solution will is:
SELECT a.CustId, a.FName, a.LName, SUM(b.mtg1), SUM(b.mtg2), SUM(b.mtg3), SUM(b.mtg4), (COALESCE(SUM(b.mtg1), 0) + COALESCE(SUM(b.mtg2), 0) + COALESCE(SUM(b.mtg3), 0) + COALESCE(SUM(b.mtg4), 0)) AS total
FROM table_a a
LEFT JOIN table_b b ON(b.CustID = a.CustId)
GROUP BY a.CustID, a.FName, a.LName
with cte as
(
select flname+', '+fname as Name,
mtg1,mtg2,mtg3,mtg4,
isnull(((case when mtg1 is null then 0 else mtg1 end)+
(case when mtg2 is null then 0 else mtg2 end)+
(case when mtg3 is null then 0 else mtg3 end)+
(case when mtg4 is null then 0 else mtg4 end)),0) as total
from a left join b
on a.custid = b.custid
)
select name,
isnull(mtg1,0) as mtg1,
isnull(mtg2,0) as mtg2,
isnull(mtg3,0) as mtg3,
isnull(mtg4,0) as mtg4,
total
from cte
order by 1

CTE TABLE WITH ORDER BY ERROR MESSAGE SQL SERVER

;WITH myTree AS
(
SELECT
y.User_id, y.user_usercode, y.user_username,
y.user_uplineID,
trans_WinLose, y.User_ID AS sourceID, trans_id,
trans_Rolling, y.User_Level,
lvl1.User_Level AS Level_lvl1, lvl2.User_Level AS Level_lvl2,
y.User_GivenPT, lvl1.User_GivenPT AS GivenPT_lvl1,
lvl2.User_GivenPT AS GivenPT_lvl2, y.User_GivenComm,
lvl1.User_GivenComm AS downline_Comm
FROM
tbl_user y
INNER JOIN
tbl_trans x ON x.trans_Robot_ID = y.User_RobotID
INNER JOIN
tbl_user lvl1 ON y.user_uplineID = lvl1.User_ID
INNER JOIN
tbl_user lvl2 ON lvl1.user_uplineID = lvl2.User_ID
UNION ALL
SELECT
u.User_ID, u.user_usercode, u.user_username, u.user_uplineID,
t.trans_WinLose, t.sourceID AS sourceID, t.trans_Id,
t.trans_Rolling, u.User_Level, t.User_Level, t.Level_lvl1,
u.User_GivenPT, t.User_GivenPT, t.GivenPT_lvl1,
u.User_GivenComm, t.User_GivenComm
FROM
myTree t
INNER JOIN
tbl_user u ON t.user_uplineID = u.User_ID
)
SELECT *
FROM
(SELECT
mytree.*,
(SELECT
CASE
WHEN Level_lvl1 = 7 THEN GivenPT_lvl1
WHEN level_lvl2 = 7 THEN User_GivenPT-GivenPT_Lvl2
ELSE (CASE
WHEN (User_GivenPT-GivenPT_lvl1) > 0
THEN User_GivenPT - GivenPT_lvl1
ELSE 0
END)
END) AS Net_PT
FROM
Mytree
ORDER BY
mytree.trans_ID) AS c
I would like to order by mytree.trans_ID, but I get an error:
Msg 1033, Level 15, State 1, Line 17
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.
How to solve this problem?
Another possibility is to remove the outer query when querying the CTE - it doesn't seem to serve any purpose.
;with myTree as (
select y.User_id,y.user_usercode,y.user_username,y.user_uplineID,trans_WinLose, y.User_ID as sourceID, trans_id,trans_Rolling,y.User_Level,lvl1.User_Level as Level_lvl1,lvl2.User_Level as Level_lvl2,y.User_GivenPT,lvl1.User_GivenPT as GivenPT_lvl1 ,lvl2.User_GivenPT as GivenPT_lvl2,y.User_GivenComm,lvl1.User_GivenComm as downline_Comm from tbl_user y
Inner join tbl_trans x on x.trans_Robot_ID = y.User_RobotID
Inner join tbl_user lvl1 on y.user_uplineID = lvl1.User_ID
Inner join tbl_user lvl2 on lvl1.user_uplineID = lvl2.User_ID
union all
select u.User_ID,u.user_usercode,u.user_username,u.user_uplineID,t.trans_WinLose, t.sourceID as sourceID, t.trans_Id,t.trans_Rolling,u.User_Level,t.User_Level,t.Level_lvl1,u.User_GivenPT,t.User_GivenPT,t.GivenPT_lvl1,u.User_GivenComm,t.User_GivenComm
from myTree t
inner join tbl_user u on t.user_uplineID = u.User_ID
)
SELECT mytree.*,
(SELECT CASE
WHEN Level_lvl1=7 THEN GivenPT_lvl1
WHEN level_lvl2=7 THEN User_GivenPT-GivenPT_Lvl2
ELSE (CASE WHEN (User_GivenPT-GivenPT_lvl1) > 0 THEN User_GivenPT-GivenPT_lvl1 else 0 END)
END) as Net_PT
From Mytree order by mytree.trans_ID
SQL behaves with data in Mathematics Sets manner. In Sets the order of data and elements are meaningless. So when you are dealing with CTE, sub-queries, views, inline table value functions etc, that are used to return a set of data, you are not allowed to order and sort them.
You need to sort data when you need to show them in output or you are using TOP or OFFSET-Fetch or etc commands, where the Order By helps these commands to work as they should be.
You should rewrite your query as:
;with myTree as (
select y.User_id,y.user_usercode,y.user_username,y.user_uplineID,trans_WinLose, y.User_ID as sourceID, trans_id,trans_Rolling,y.User_Level,lvl1.User_Level as Level_lvl1,lvl2.User_Level as Level_lvl2,y.User_GivenPT,lvl1.User_GivenPT as GivenPT_lvl1 ,lvl2.User_GivenPT as GivenPT_lvl2,y.User_GivenComm,lvl1.User_GivenComm as downline_Comm from tbl_user y
Inner join tbl_trans x on x.trans_Robot_ID = y.User_RobotID
Inner join tbl_user lvl1 on y.user_uplineID = lvl1.User_ID
Inner join tbl_user lvl2 on lvl1.user_uplineID = lvl2.User_ID
union all
select u.User_ID,u.user_usercode,u.user_username,u.user_uplineID,t.trans_WinLose, t.sourceID as sourceID, t.trans_Id,t.trans_Rolling,u.User_Level,t.User_Level,t.Level_lvl1,u.User_GivenPT,t.User_GivenPT,t.GivenPT_lvl1,u.User_GivenComm,t.User_GivenComm
from myTree t
inner join tbl_user u on t.user_uplineID = u.User_ID
)select * from (SELECT mytree.*,
(SELECT CASE
WHEN Level_lvl1=7 THEN GivenPT_lvl1
WHEN level_lvl2=7 THEN User_GivenPT-GivenPT_Lvl2
ELSE (CASE WHEN (User_GivenPT-GivenPT_lvl1) > 0 THEN User_GivenPT-GivenPT_lvl1 else 0 END)
END) as Net_PT
From Mytree) as c
order by c.trans_ID
From Mytree order by mytree.trans_ID OFFSET 0 ROWS
change it to this, solved

How to optimize the sql query

This query takes dynamic input in the place of cg.ownerid IN (294777,228649 ,188464).when the input increases in the IN condition the query is taking too much time to execute. Please suggest me a way to optimize it.
For example, the below query is taking 4 seconds, if I reduce the list to just IN(188464) its just taking 1 second.
SELECT *
FROM
(SELECT *,
Row_number() OVER(
ORDER BY datecreated DESC) AS rownum
FROM
(SELECT DISTINCT c.itemid,
(CASE WHEN (Isnull(c.password, '') <> '') THEN 1 ELSE 0 END) AS password,
c.title,
c.encoderid,
c.type,
(CASE WHEN c.author = 'education' THEN 'Discovery' ELSE c.type END) AS TYPE,
c.publisher,
c.description,
c.author,
c.duration,
c.copyright,
c.rating,
c.userid,
Stuff(
(SELECT DISTINCT ' ' + NAME AS [text()]
FROM firsttable SUB
LEFT JOIN secondtable AS rgc ON thirdtable = rgc.id
WHERE SUB.itemid = c.itemid
FOR xml path('')), 1, 1, '')AS [Sub_Categories]
FROM fourthtable AS cg
LEFT JOIN item AS c ON c.itemid = cg.itemid
WHERE Isnull(title, '') <> ''
AND c.active = '1'
AND c.systemid = '20'
AND cg.ownerid IN (294777,
228649,
188464)) AS a) AS b
WHERE rownum BETWEEN 1 AND 32
ORDER BY datecreated DESC
As I haven't further information, I just would suggest a first change of your where clause. They should be moved to a subquery as you left join those columns.
SELECT *
FROM(
SELECT *,
Row_number() OVER(
ORDER BY datecreated DESC) AS rownum
FROM
(SELECT DISTINCT c.itemid,
(CASE WHEN (Isnull(c.password, '') <> '') THEN 1 ELSE 0 END) AS password,
c.title,
c.encoderid,
c.type,
(CASE WHEN c.author = 'education' THEN 'Discovery' ELSE c.type END) AS TYPE,
c.publisher,
c.description,
c.author,
c.duration,
c.copyright,
c.rating,
c.userid,
Stuff(
(
SELECT DISTINCT ' ' + NAME AS [text()]
FROM firsttable SUB
LEFT JOIN secondtable AS rgc ON thirdtable = rgc.id
WHERE SUB.itemid = c.itemid
FOR xml path('')
), 1, 1, ''
) AS [Sub_Categories]
FROM (
SELECT cg.itemid
FROM fourthtable as cg
WHERE cg.ownerid IN (294777,228649, 188464)
) AS cg
LEFT JOIN (
SELECT DISTINCT c.itemid, c.type, c.author, c.title, c.encoderid, c.type, c.publisher, c.description, c.author, c.duration, c.copyright, c.rating,c.userid
FROM item as c
WHERE Isnull(c.title, '') <> ''
AND c.active = '1'
AND c.systemid = '20'
) AS c
ON c.itemid = cg.itemid
) AS a
) AS b
WHERE rownum BETWEEN 1 AND 32
ORDER BY datecreated DESC
But not quite sure if everything is connected right away, your missing some aliases which makes it hard for me to get through your query. But I thing you'll get my idea. :-)
With this little information it's impossible to give any specific ideas, but the normal general things apply:
Turn on statistics io and check what's causing most of the logical I/O and try to solve that
Look at the actual plan and check if there's something that doesn't look ok, for example:
Clustered index / table scans (new index could solve this)
Key lookups with a huge amount of rows (adding more columns to index could solve this, either as normal or included fields)
Spools (new index could solve this)
Big difference between estimated and actual number of rows (10x, 100x and so on)
To give any better hints you should really include the actual plan, table / index structure at least on the essential parts and tell what is too much time (seconds, minutes, hours?)

How can I optimize the SQL query?

I have a query an SQL query as follows, can anybody suggest any optimization for this; I think most of the effort is being done for the Union operation - is there anything else can be done to get the same result ?
Basically I wanna query first portion of the UNION and if for each record there is no result then the second portion need to be run. Please help.
:
SET dateformat dmy;
WITH incidentcategory
AS (
SELECT 1 ord, i.IncidentId, rl.Description Category FROM incident i
JOIN IncidentLikelihood l ON i.IncidentId = l.IncidentId
JOIN IncidentSeverity s ON i.IncidentId = s.IncidentId
JOIN LikelihoodSeverity ls ON l.LikelihoodId = ls.LikelihoodId AND s.SeverityId = ls.SeverityId
JOIN RiskLevel rl ON ls.RiskLevelId = rl.riskLevelId
UNION
SELECT 2 ord, i.incidentid,
rl.description Category
FROM incident i
JOIN incidentreportlikelihood l
ON i.incidentid = l.incidentid
JOIN incidentreportseverity s
ON i.incidentid = s.incidentid
JOIN likelihoodseverity ls
ON l.likelihoodid = ls.likelihoodid
AND s.severityid = ls.severityid
JOIN risklevel rl
ON ls.risklevelid = rl.risklevelid
) ,
ic AS (
SELECT ROW_NUMBER() OVER (PARTITION BY i.IncidentId ORDER BY (CASE WHEN incidentTime IS NULL THEN GETDATE() ELSE incidentTime END) DESC,ord ASC) rn,
i.incidentid,
dbo.Incidentdescription(i.incidentid, '',
'',
'', '')
IncidentDescription,
dbo.Dateconverttimezonecompanyid(closedtime,
i.companyid)
ClosedTime,
incidenttime,
incidentno,
Isnull(c.category, '')
Category,
opencorrectiveactions,
reportcompleted,
Isnull(classificationcompleted, 0)
ClassificationCompleted,
Cast (( CASE
WHEN closedtime IS NULL THEN 0
ELSE 1
END ) AS BIT)
IncidentClosed,
Cast (( CASE
WHEN investigatorfinishedtime IS NULL THEN 0
ELSE 1
END ) AS BIT)
InvestigationFinished,
Cast (( CASE
WHEN investigationcompletetime IS NULL THEN 0
ELSE 1
END ) AS BIT)
InvestigationComplete,
Cast (( CASE
WHEN investigatorassignedtime IS NULL THEN 0
ELSE 1
END ) AS BIT)
InvestigatorAssigned,
Cast (( CASE
WHEN (SELECT Count(*)
FROM incidentinvestigator
WHERE incidentid = i.incidentid
AND personid = 1588
AND tablename = 'AdminLevels') = 0
THEN 0
ELSE 1
END ) AS BIT)
IncidentInvestigator,
(SELECT dbo.Strconcat(osname)
FROM (SELECT TOP 10 osname
FROM incidentlocation l
JOIN organisationstructure o
ON l.locationid = o.osid
WHERE incidentid = i.incidentid
ORDER BY l.locorder) loc)
Location,
Isnull((SELECT TOP 1 teamleader
FROM incidentinvestigator
WHERE personid = 1588
AND tablename = 'AdminLevels'
AND incidentid = i.incidentid), 0)
TeamLeader,
incidentstatus,
incidentstatussearch
FROM incident i
LEFT OUTER JOIN incidentcategory c
ON i.incidentid = c.incidentid
WHERE i.isdeleted = 0
AND i.companyid = 158
AND incidentno <> 0
--AND reportcompleted = 1
--AND investigatorassignedtime IS NOT NULL
--AND investigatorfinishedtime IS NULL
--AND closedtime IS NULL
),
ic2 AS (
SELECT * FROM ic WHERE rn=1
)
SELECT * FROM ic2
--WHERE rownumber >= 0
-- AND rownumber < 0 + 10
--WHERE ic2.incidentid in(53327,53538)
--WHERE ic2.incidentid = 53338
ORDER BY incidentid DESC
Following is the execution plan I got:
https://www.dropbox.com/s/50dcpelr1ag4blp/Execution_Plan.sqlplan?dl=0
There are several issues:
1) use UNION ALL instead of UNION ALL to avoid the additional operation to aggregate the data.
2) try to modify the numerous function calls (e.g. dbo.Incidentdescription() ) to be an in-lie table valued function so you can reference it using CROSS APPLY or OUTER APPLY. Especially, if those functions referencing a table again.
3) move the subqueries from the SELECT part of the query to the FROM part using CROSS APPLY or OUTER APPLY again.
4) after the above is done, check the execution plan again for any missing indexes. Also, run the query with STATISTICS TIME, IO on to verify that the number of times a table
is referenced is correct (sometimes the execution plan put you in the wrong direction, especially if function calls are involved)...
Since the first inner query produces rows with ord=1 and the second produces rows with ord=2, you should use UNION ALL instead of UNION. UNION will filter out equal rows and since you will never get equal rows it is more efficient to use UNION ALL.
Also, rewrite your query to not use the WITH construct. I've had very bad experiences with this. Just use regular derived tables instead. In the case the query is still abnormally slow, try to serialize some derived tables to a temporary table and query the temporary table instead.
Try alternate approach by removing
(SELECT dbo.Strconcat(osname)
FROM (SELECT TOP 10 osname
FROM incidentlocation l
JOIN organisationstructure o
ON l.locationid = o.osid
WHERE incidentid = i.incidentid
ORDER BY l.locorder) loc)
Location,
Isnull((SELECT TOP 1 teamleader
FROM incidentinvestigator
WHERE personid = 1588
AND tablename = 'AdminLevels'
AND incidentid = i.incidentid), 0)
TeamLeader
from the SELECT. Avoid using complex functions/sub-queries in select.

Inner join that ignore singlets

I have to do an self join on a table. I am trying to return a list of several columns to see how many of each type of drug test was performed on same day (MM/DD/YYYY) in which there were at least two tests done and at least one of which resulted in a result code of 'UN'.
I am joining other tables to get the information as below. The problem is I do not quite understand how to exclude someone who has a single result row in which they did have a 'UN' result on a day but did not have any other tests that day.
Query Results (Columns)
County, DrugTestID, ID, Name, CollectionDate, DrugTestType, Results, Count(DrugTestType)
I have several rows for ID 12345 which are correct. But ID 12346 is a single row of which is showing they had a row result of count (1). They had a result of 'UN' on this day but they did not have any other tests that day. I want to exclude this.
I tried the following query
select
c.desc as 'County',
dt.pid as 'PID',
dt.id as 'DrugTestID',
p.id as 'ID',
bio.FullName as 'Participant',
CONVERT(varchar, dt.CollectionDate, 101) as 'CollectionDate',
dtt.desc as 'Drug Test Type',
dt.result as Result,
COUNT(dt.dru_drug_test_type) as 'Count Of Test Type'
from
dbo.Test as dt with (nolock)
join dbo.History as h on dt.pid = h.id
join dbo.Participant as p on h.pid = p.id
join BioData as bio on bio.id = p.id
join County as c with (nolock) on p.CountyCode = c.code
join DrugTestType as dtt with (nolock) on dt.DrugTestType = dtt.code
inner join
(
select distinct
dt2.pid,
CONVERT(varchar, dt2.CollectionDate, 101) as 'CollectionDate'
from
dbo.DrugTest as dt2 with (nolock)
join dbo.History as h2 on dt2.pid = h2.id
join dbo.Participant as p2 on h2.pid = p2.id
where
dt2.result = 'UN'
and dt2.CollectionDate between '11-01-2011' and '10-31-2012'
and p2.DrugCourtType = 'AD'
) as derived
on dt.pid = derived.pid
and convert(varchar, dt.CollectionDate, 101) = convert(varchar, derived.CollectionDate, 101)
group by
c.desc, dt.pid, p.id, dt.id, bio.fullname, dt.CollectionDate, dtt.desc, dt.result
order by
c.desc ASC, Participant ASC, dt.CollectionDate ASC
This is a little complicated because the your query has a separate row for each test. You need to use window/analytic functions to get the information you want. These allow you to do calculate aggregation functions, but to put the values on each line.
The following query starts with your query. It then calculates the number of UN results on each date for each participant and the total number of tests. It applies the appropriate filter to get what you want:
with base as (<your query here>)
select b.*
from (select b.*,
sum(isUN) over (partition by Participant, CollectionDate) as NumUNs,
count(*) over (partition by Partitipant, CollectionDate) as NumTests
from (select b.*,
(case when result = 'UN' then 1 else 0 end) as IsUN
from base
) b
) b
where NumUNs <> 1 or NumTests <> 1
Without the with clause or window functions, you can create a particularly ugly query to do the same thing:
select b.*
from (<your query>) b join
(select Participant, CollectionDate, count(*) as NumTests,
sum(case when result = 'UN' then 1 else 0 end) as NumUNs
from (<your query>) b
group by Participant, CollectionDate
) bsum
on b.Participant = bsum.Participant and
b.CollectionDate = bsum.CollectionDate
where NumUNs <> 1 or NumTests <> 1
If I understand the problem, the basic pattern for this sort of query is simply to include negating or exclusionary conditions in your join. I.E., self-join where columnA matches, but columns B and C do not:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
and t1.PkId != t2.PkId
and t1.category != t2.category
)
Put the conditions in the WHERE clause if it benchmarks better:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
And it's often easiest to start with the self-join, treating it as a "base table" on which to join all related information:
select
[columns]
from
(select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
) bt
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
This can allow you to focus on getting that self-join right, without interference from other tables.