I am trying to build a query that references multiple CTEs. To check that the code worked, I wrote each query on its own and, once its result looked right, moved it into a CTE and completed the inner joins.
When I ran the temp_table3 query on its own, the data returned was correct, but once I added it as a CTE and ran the next query, the results started to multiply. That is wrong, as I should only ever have 3 records. I ended up commenting this out in the main query :(
I then added a new CTE called temp_table4, which correctly returned the summed data on its own. However, when I add it to the final query, I get an error:
Recursive common table expression 'temp_table4' does not contain a top-level UNION ALL operator
I need to add another 7 CTEs to finish the full query but, to be honest, I am now stuck.
Can somebody help me understand where I have gone wrong with the multiplying records, and help resolve the UNION ALL error?
Below are the table, the raw data and the query I have been working on.
Table is:
CREATE TABLE [dbo].[SUC_AS_PBATCH_ISK]
(
[pbatch_code] [nvarchar](80) NULL,
[outflow_rate] [int] NULL,
[qty] [int] NULL,
[start_inflow] [datetime] NULL,
[end_outflow] [datetime] NULL,
[duration] [float] NULL,
[custom_string_1] [nvarchar](80) NULL,
[custom_string_2] [nvarchar](80) NULL,
[model_name] [nvarchar](80) NULL
)
The raw data is:
INSERT INTO dbo.SUC_AS_PBATCH_ISK ([pbatch_code],
[outflow_rate], [qty], [start_inflow],
[end_outflow], [duration], [custom_string_1],
[custom_string_2], [model_name])
VALUES
('P31200','1200','44342','2021-05-25 03:10:00','2021-05-26 14:23:00','2113','used by macro','15000351','ISK_AS_VI'),
('P31202','1200','42279','2021-05-25 02:23:00','2021-05-26 13:36:00','2113','used by macro','15000351','ISK_AS_VI'),
('P31204','1200','42280','2021-05-25 07:46:00','2021-05-26 19:01:00','2114','used by macro','15000351','ISK_AS_VI');
My full query is:
WITH temp_table1 AS
(
SELECT
MAX(start_inflow) AS Latest_start_time,
custom_string_2 AS item_code,
COUNT(custom_string_2) AS number_batches,
SUM(duration) / 60 AS Delta_duration_sum
FROM
dbo.SUC_AS_PBATCH_ISK AS pb1
WHERE
(custom_string_1 = 'used by macro')
GROUP BY
custom_string_2
)
---- Latest_start_time, DeltahoursLatestStart_time, Delta_duration_Sum
, temp_table2
AS
(
SELECT pb2.custom_string_2 AS item_code,
DATEDIFF(minute, pb2.start_inflow, tt1.Latest_start_time) / 60.0 AS [DeltaHoursLatestStart_time]
FROM temp_table1 AS tt1
INNER JOIN dbo.SUC_AS_PBATCH_ISK AS pb2
ON pb2.custom_string_2 = tt1.item_code
WHERE (custom_string_1 = 'used by macro')
)
----tt3.[Duration to delta]
,temp_table3
AS
(
select
pb2.custom_string_2 AS item_code,
sum(tt2.DeltaHoursLatestStart_time) AS Duration_to_delta
,pb2.pbatch_code
FROM temp_table1 AS tt1
INNER JOIN dbo.SUC_AS_PBATCH_ISK AS pb2
ON pb2.custom_string_2 = tt1.item_code
INNER JOIN temp_table2 AS tt2
ON tt1.item_code = tt2.item_code
WHERE (custom_string_1 = 'used by macro')
GROUP BY Pb2.custom_string_2,pb2.pbatch_code
)
,temp_table4
AS
(
select
pb2.custom_string_2 AS item_code,
sum(tt2.DeltaHoursLatestStart_time+tt3.Duration_to_delta)
as Test
,pb2.pbatch_code
FROM temp_table1 AS tt1
INNER JOIN dbo.SUC_AS_PBATCH_ISK AS pb2
ON pb2.custom_string_2 = tt1.item_code
INNER JOIN temp_table2 AS tt2
ON tt1.item_code = tt2.item_code
INNER JOIN temp_table3 AS tt3
ON tt2.item_code = tt3.item_code
INNER JOIN temp_table4 AS tt4
ON tt2.item_code = tt4.item_code
WHERE (custom_string_1 = 'used by macro')
GROUP BY Pb2.custom_string_2,pb2.pbatch_code
)
select distinct
pb2.start_inflow
,tt1.latest_start_time
,tt1.item_code
,tt1.number_batches
,tt1.Delta_duration_sum
,pb2.end_outflow
,pb2.duration
,pb2.pbatch_code
,pb2.outflow_rate
,pb2.qty
--,tt2.DeltaHoursLatestStart_time
,tt3.Duration_to_delta
,tt4.Test
FROM temp_table1 AS tt1
INNER JOIN dbo.SUC_AS_PBATCH_ISK AS pb2
ON pb2.custom_string_2 = tt1.item_code
INNER JOIN temp_table2 AS tt2
ON pb2.custom_string_2 = tt2.item_code
INNER JOIN temp_table3 AS tt3
ON pb2.custom_string_2 = tt3.item_code AND PB2.pbatch_code= tt3.pbatch_code
INNER JOIN temp_table4 AS tt4
ON pb2.custom_string_2 = tt4.item_code
WHERE(pb2.custom_string_2 IS NOT NULL) AND
(pb2.custom_string_1 = 'used by macro')
order by item_code
Your temp_table4 CTE refers to itself in this line:
INNER JOIN temp_table4 AS tt4
This is legal, but it makes the CTE recursive. A recursive CTE always has two parts: an anchor query that 'seeds' the table and does not refer to the CTE itself, and a recursive query that does. The result is the UNION ALL of the anchor's rows and the rows obtained by repeatedly running the second part until no more rows are added.
A typical example is something like this:
WITH t(id, value) AS (
SELECT id, value FROM some_table WHERE value = 10
UNION ALL
SELECT st.id, st.value FROM some_table st JOIN t ON st.value = t.value + 1
)
SELECT id, value FROM t
This will first select all ids where value is 10, and then keep selecting ids where value is one greater than that of any record already selected. So it will first select all 11's, then all 12's, and so on until it runs out of new records.
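The anchor-plus-recursive-part behaviour described above can be tried out in a minimal sketch. It assumes Python's built-in sqlite3 module rather than SQL Server, and the contents of some_table are made up for illustration. SQLite spells the clause WITH RECURSIVE, whereas SQL Server uses plain WITH.

```python
import sqlite3

# In-memory database with a hypothetical some_table (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [(1, 10), (2, 11), (3, 12), (4, 14), (5, 9)],
)

recursive_rows = conn.execute(
    """
    WITH RECURSIVE t(id, value) AS (
        -- anchor part: does not reference t
        SELECT id, value FROM some_table WHERE value = 10
        UNION ALL
        -- recursive part: references t, runs until it adds no new rows
        SELECT st.id, st.value
        FROM some_table AS st
        JOIN t ON st.value = t.value + 1
    )
    SELECT id, value FROM t ORDER BY value
    """
).fetchall()

print(recursive_rows)  # [(1, 10), (2, 11), (3, 12)]
```

The row with value 14 is never reached because there is no 13 to continue the chain, and the row with value 9 is below the seed.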
Thank you for the help, everyone. Simplifying the code did make a difference: it let me pinpoint the multiplying records and the incorrect inner join I had added inside temp_table4:
INNER JOIN temp_table4 AS tt4 ON tt2.item_code = tt4.item_code
This join referenced the CTE it was defined in, which caused the error: Recursive common table expression 'temp_table4' does not contain a top-level UNION ALL operator
After removing that join from temp_table4, the query ran successfully.
As for the multiplying records, I noticed that I had not joined on pbatch_code. Once I added it to temp_table2 and to the main query as below, the query returned the correct number of records. :)
INNER JOIN dbo.SUC_AS_PBATCH_ISK AS pb2 ON pb2.custom_string_2 = tt1.item_code
INNER JOIN temp_table2 AS tt2 ON pb2.custom_string_2 = tt2.item_code
AND PB2.pbatch_code = tt2.pbatch_code
INNER JOIN temp_table3 AS tt3 ON pb2.custom_string_2 = tt3.item_code
AND PB2.pbatch_code = tt3.pbatch_code
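The effect of the missing join column can be reproduced in a small sketch. It assumes Python's built-in sqlite3 module and a cut-down, hypothetical table called pbatch that keeps only three columns of the three sample rows. The CTE tt2 is a stand-in for temp_table2, not the real one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for dbo.SUC_AS_PBATCH_ISK (hypothetical column subset).
conn.execute(
    "CREATE TABLE pbatch (pbatch_code TEXT, item_code TEXT, start_inflow TEXT)"
)
conn.executemany(
    "INSERT INTO pbatch VALUES (?, ?, ?)",
    [
        ("P31200", "15000351", "2021-05-25 03:10:00"),
        ("P31202", "15000351", "2021-05-25 02:23:00"),
        ("P31204", "15000351", "2021-05-25 07:46:00"),
    ],
)

# Stand-in for temp_table2: one row per batch.
cte = "WITH tt2 AS (SELECT item_code, pbatch_code FROM pbatch) "

# Joining on item_code alone pairs every batch with every batch of the item.
loose_count = conn.execute(
    cte
    + "SELECT COUNT(*) FROM pbatch AS pb "
    "JOIN tt2 ON pb.item_code = tt2.item_code"
).fetchone()[0]

# Adding pbatch_code to the join keeps one row per batch.
tight_count = conn.execute(
    cte
    + "SELECT COUNT(*) FROM pbatch AS pb "
    "JOIN tt2 ON pb.item_code = tt2.item_code "
    "AND pb.pbatch_code = tt2.pbatch_code"
).fetchone()[0]

print(loose_count, tight_count)  # 9 3
```

With three batches sharing one item code, the looser join yields 3 x 3 rows, which is the kind of multiplication described above.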
This is a copy of my final code; hopefully it can help someone else in the future.
WITH temp_table1 AS
(
SELECT
MAX(start_inflow) AS Latest_start_time, -- used in tt2
custom_string_2 AS item_code,
COUNT(custom_string_2) AS number_batches,
SUM(duration) / 60 AS Delta_duration_sum
FROM dbo.SUC_AS_PBATCH_ISK AS pb1
WHERE (custom_string_1 = 'used by macro')
GROUP BY custom_string_2
)
---- Latest_start_time, DeltahoursLatestStart_time, Delta_duration_Sum
, temp_table2 AS
(
SELECT
pb2.custom_string_2 AS item_code,
DATEDIFF(minute, pb2.start_inflow, tt1.Latest_start_time) / 60.0 AS [DeltaHoursLatestStart_time],
tt1.Delta_duration_sum,
pb2.pbatch_code
FROM temp_table1 AS tt1
INNER JOIN dbo.SUC_AS_PBATCH_ISK AS pb2
ON pb2.custom_string_2 = tt1.item_code
WHERE (custom_string_1 = 'used by macro')
)
----tt3.[Duration to delta]
, temp_table3 AS
(
select
pb2.custom_string_2 AS item_code,
sum(tt2.DeltaHoursLatestStart_time) AS Duration_to_delta,
pb2.pbatch_code
FROM temp_table1 AS tt1
INNER JOIN dbo.SUC_AS_PBATCH_ISK AS pb2
ON pb2.custom_string_2 = tt1.item_code
INNER JOIN temp_table2 AS tt2
ON tt1.item_code = tt2.item_code
WHERE (custom_string_1 = 'used by macro')
GROUP BY pb2.custom_string_2, pb2.pbatch_code
)
select
pb2.start_inflow,
tt1.latest_start_time,
tt1.item_code,
tt1.number_batches,
tt1.Delta_duration_sum,
pb2.end_outflow,
pb2.duration,
pb2.pbatch_code,
pb2.outflow_rate,
pb2.qty,
tt2.DeltaHoursLatestStart_time,
(tt2.DeltaHoursLatestStart_time + tt3.Duration_to_delta) as [sumdeltahours],
tt3.Duration_to_delta
FROM temp_table1 AS tt1
INNER JOIN dbo.SUC_AS_PBATCH_ISK AS pb2
ON pb2.custom_string_2 = tt1.item_code
INNER JOIN temp_table2 AS tt2
ON pb2.custom_string_2 = tt2.item_code AND pb2.pbatch_code = tt2.pbatch_code
INNER JOIN temp_table3 AS tt3
ON pb2.custom_string_2 = tt3.item_code AND pb2.pbatch_code = tt3.pbatch_code
WHERE (pb2.custom_string_2 IS NOT NULL) AND
(pb2.custom_string_1 = 'used by macro')
order by item_code
The big query below returns 2860 records, which is the correct number. Now I need to add invoice lines to this query and do the same thing as I did with the sale order lines, "sum(l.price_subtotal / COALESCE(cr.rate, 1.0)) AS price_subtotal"; that is, I need the sum of price_subtotal over the invoice lines.
My first thought was to join the tables like this:
JOIN sale_order_invoice_rel so_inv_rel on (so_inv_rel.order_id = s.id )
JOIN account_invoice inv on (inv.id = so_inv_rel.invoice_id and inv.state in ('open','paid'))
JOIN account_invoice_line ail on (inv.id = ail.invoice_id)
and then
sum(ail.price_subtotal / COALESCE(cr.rate, 1.0)) as price_subtotal
But after the first JOIN the number of rows I'm selecting changes, and once all the joins are in place the numbers are way off: I get roughly 5 x 2860 rows. I probably need some kind of subquery, but at this point I don't know how to write it, so I'm asking for help.
WITH currency_rate AS (
SELECT r.currency_id,
COALESCE(r.company_id, c.id) AS company_id,
r.rate,
r.name AS date_start,
( SELECT r2.name
FROM res_currency_rate r2
WHERE r2.name > r.name AND r2.currency_id = r.currency_id AND (r2.company_id IS NULL OR r2.company_id = c.id)
ORDER BY r2.name
LIMIT 1) AS date_end
FROM res_currency_rate r
JOIN res_company c ON r.company_id IS NULL OR r.company_id = c.id
)
SELECT min(l.id) AS id,
l.product_id,
l.color_id,
l.product_size_id,
t.uom_id AS product_uom,
sum(l.product_uom_qty / u.factor * u2.factor) AS product_uom_qty,
sum(l.qty_delivered / u.factor * u2.factor) AS qty_delivered,
sum(l.qty_invoiced / u.factor * u2.factor) AS qty_invoiced,
sum(l.qty_to_invoice / u.factor * u2.factor) AS qty_to_invoice,
sum(l.price_total / COALESCE(cr.rate, 1.0)) AS price_total,
l.price_unit / COALESCE(cr3.rate, 1.0) AS price_total_by_cmp_curr,
sum(l.price_subtotal / COALESCE(cr.rate, 1.0)) AS price_subtotal,
count(*) AS nbr,
s.date_order AS date,
s.state,
s.partner_id,
s.user_id,
s.company_id,
date_part('epoch'::text, avg(date_trunc('day'::text, s.date_order) - date_trunc('day'::text, s.create_date))) / (24 * 60 * 60)::numeric(16,2)::double precision AS delay,
t.categ_id,
s.pricelist_id,
s.project_id AS analytic_account_id,
s.team_id,
p.product_tmpl_id,
partner.country_id,
partner.commercial_partner_id
FROM sale_order_line l
JOIN sale_order s ON l.order_id = s.id
JOIN res_partner partner ON s.partner_id = partner.id
LEFT JOIN product_product p ON l.product_id = p.id
LEFT JOIN product_template t ON p.product_tmpl_id = t.id
LEFT JOIN product_uom u ON u.id = l.product_uom
LEFT JOIN product_uom u2 ON u2.id = t.uom_id
LEFT JOIN res_company rc ON rc.id = s.company_id
LEFT JOIN product_pricelist pp ON s.pricelist_id = pp.id
LEFT JOIN currency_rate cr ON cr.currency_id = pp.currency_id AND cr.company_id = s.company_id AND cr.date_start <= COALESCE(s.date_order::timestamp with time zone, now()) AND (cr.date_end IS NULL OR cr.date_end > COALESCE(s.date_order::timestamp with time zone, now()))
LEFT JOIN currency_rate cr3 ON cr.currency_id = rc.currency_id AND cr.company_id = s.company_id AND cr.date_start <= COALESCE(s.date_order::timestamp with time zone, now()) AND (cr.date_end IS NULL OR cr.date_end > COALESCE(s.date_order::timestamp with time zone, now()))
GROUP BY l.product_id, t.uom_id, t.categ_id, s.date_order, s.partner_id, s.user_id, s.state, s.company_id, s.pricelist_id, s.project_id, s.team_id, l.color_id, cr3.rate, l.price_unit, l.product_size_id, p.product_tmpl_id, partner.country_id, partner.commercial_partner_id;
You can move the part that would otherwise be a subquery into the WITH clause you already have, which avoids the increase in the number of rows, like so:
WITH currency_rate AS (
SELECT r.currency_id,
COALESCE(r.company_id, c.id) AS company_id,
r.rate,
r.name AS date_start,
( SELECT r2.name
FROM res_currency_rate r2
WHERE r2.name > r.name AND r2.currency_id = r.currency_id AND (r2.company_id IS NULL OR r2.company_id = c.id)
ORDER BY r2.name
LIMIT 1) AS date_end
FROM res_currency_rate r
JOIN res_company c ON r.company_id IS NULL OR r.company_id = c.id
)
, order_line_subtotal AS(
SELECT so_inv_rel.order_id, sum(ail.price_subtotal) as price_subtotal
FROM sale_order_invoice_rel so_inv_rel
JOIN account_invoice inv on (inv.id = so_inv_rel.invoice_id and inv.state in ('open','paid'))
JOIN account_invoice_line ail on (inv.id = ail.invoice_id)
GROUP BY so_inv_rel.order_id )
SELECT min(l.id) AS id,
....
From there it should be straightforward to add it to the query without increasing the number of rows, since the CTE returns at most one row per order before the outer aggregation / GROUP BY.
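Why pre-aggregating helps can be seen in a minimal sketch. It assumes Python's built-in sqlite3 module and two hypothetical, heavily simplified tables (order_line and invoice_line, both keyed directly by order_id), not the real schema from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified tables: one order with 2 order lines and 3 invoice lines.
conn.execute("CREATE TABLE order_line (order_id INTEGER, price_subtotal INTEGER)")
conn.execute("CREATE TABLE invoice_line (order_id INTEGER, price_subtotal INTEGER)")
conn.executemany("INSERT INTO order_line VALUES (?, ?)", [(1, 100), (1, 50)])
conn.executemany("INSERT INTO invoice_line VALUES (?, ?)", [(1, 60), (1, 60), (1, 30)])

# Direct join: every order line is repeated once per invoice line.
naive = conn.execute(
    """
    SELECT COUNT(*), SUM(l.price_subtotal)
    FROM order_line AS l
    JOIN invoice_line AS il ON il.order_id = l.order_id
    """
).fetchone()

# Pre-aggregated CTE: one row per order, so the join cannot multiply rows.
pre_aggregated = conn.execute(
    """
    WITH invoice_total AS (
        SELECT order_id, SUM(price_subtotal) AS invoiced
        FROM invoice_line
        GROUP BY order_id
    )
    SELECT COUNT(*), SUM(l.price_subtotal), MAX(it.invoiced)
    FROM order_line AS l
    LEFT JOIN invoice_total AS it ON it.order_id = l.order_id
    """
).fetchone()

print(naive)           # (6, 450)
print(pre_aggregated)  # (2, 150, 150)
```

Because the invoice total is repeated on every order line of the same order, the sketch takes MAX of it rather than SUM; summing it again would count it once per order line.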
In my query I need to join with a subquery on a derived column:
select w1.wk_id,
(floor(td1.military_hour/4)*4) as military_hour_group ,
w1.end_date as end_date
from work_instances w1
inner join time_table td1 on w1.end_time = td1.time_id
inner join
(
select (floor(td2.military_hour/4)*4) as td2_military_hour_group,
(floor(td3.military_hour/4)*4) as td3_military_hour_group, wk_id
from task_instances t1
inner join time_table td2 on t1.end_time = td1.time_id
inner join time_table td3 on t1.start_time = td3.time_id
) tq1
on tq1.td2_military_hour_group = military_hour_group
and tq1.td3_military_hour_group = military_hour_group
and tq1.wk_id = w1.wf_id
It says Invalid operation: column "military_hour_group" does not exist in w1, td1, unnamed_join, tq1;
What am I doing wrong?
Please help.
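Try the query below. military_hour_group is a calculated column alias from the SELECT list, so it is not visible in the ON clause; that is why you get the error. Repeat the expression in the join condition instead: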
select w1.wk_id,
(floor(td1.military_hour/4)*4) as military_hour_group ,
w1.end_date as end_date
from work_instances w1
inner join time_table td1 on w1.end_time = td1.time_id
inner join
(
select (floor(td2.military_hour/4)*4) as td2_military_hour_group,
(floor(td3.military_hour/4)*4) as td3_military_hour_group, wk_id
from task_instances t1
inner join time_table td2 on t1.end_time = td2.time_id
inner join time_table td3 on t1.start_time = td3.time_id
) tq1
on tq1.td2_military_hour_group = (floor(td1.military_hour/4)*4)
and tq1.td3_military_hour_group = (floor(td1.military_hour/4)*4)
and tq1.wk_id = w1.wf_id
Alternatively, wrap each side in a derived table so that the calculated column can be referenced by name:
select *
from (select w1.wk_id, w1.wf_id, (floor(td1.military_hour/4)*4) as military_hour_group, w1.end_date as end_date
from work_instances w1
inner join time_table td1 on w1.end_time = td1.time_id) table1
inner join (select (floor(td2.military_hour/4)*4) as td2_military_hour_group, (floor(td3.military_hour/4)*4) as td3_military_hour_group, wk_id
from task_instances t1
inner join time_table td2 on t1.end_time = td2.time_id
inner join time_table td3 on t1.start_time = td3.time_id) tabl2
on tabl2.td2_military_hour_group = table1.military_hour_group
and tabl2.td3_military_hour_group = table1.military_hour_group
and tabl2.wk_id = table1.wf_id
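The derived-table approach can be checked with a minimal sketch, assuming Python's built-in sqlite3 module. The tables work and task and their columns are hypothetical simplifications: the hour is stored directly instead of going through time_table, and integer division stands in for floor().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified tables with the hour stored directly.
conn.execute("CREATE TABLE work (wk_id INTEGER, hour INTEGER)")
conn.execute("CREATE TABLE task (wk_id INTEGER, start_hour INTEGER, end_hour INTEGER)")
conn.executemany("INSERT INTO work VALUES (?, ?)", [(1, 9), (2, 14)])
conn.executemany(
    "INSERT INTO task VALUES (?, ?, ?)",
    [(1, 8, 11), (2, 10, 13), (2, 12, 15)],
)

matched = conn.execute(
    """
    SELECT w.wk_id, w.hour_group
    FROM (
        -- the derived column gets a name here ...
        SELECT wk_id, (hour / 4) * 4 AS hour_group
        FROM work
    ) AS w
    JOIN (
        SELECT wk_id,
               (start_hour / 4) * 4 AS start_group,
               (end_hour / 4) * 4 AS end_group
        FROM task
    ) AS t
      -- ... so the outer ON clause can refer to it like any other column
      ON t.wk_id = w.wk_id
     AND t.start_group = w.hour_group
     AND t.end_group = w.hour_group
    ORDER BY w.wk_id
    """
).fetchall()

print(matched)  # [(1, 8), (2, 12)]
```

The task (2, 10, 13) is dropped because its start falls in the 8 o'clock group while the work row for wk_id 2 falls in the 12 o'clock group.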
I have 2 with clauses like this:
WITH T
AS (SELECT tfsp.SubmissionID,
tfsp.Amount,
tfsp.campaignID,
cc.Name
FROM tbl_FormSubmissions_PaymentsMade tfspm
INNER JOIN tbl_FormSubmissions_Payment tfsp
ON tfspm.SubmissionID = tfsp.SubmissionID
INNER JOIN tbl_CurrentCampaigns cc
ON tfsp.CampaignID = cc.ID
WHERE tfspm.isApproved = 'True'
AND tfspm.PaymentOn >= '2013-05-01 12:00:00.000' AND tfspm.PaymentOn <= '2013-05-07 12:00:00.000')
SELECT SUM(Amount) AS TotalAmount,
campaignID,
Name
FROM T
GROUP BY campaignID,
Name;
and also:
WITH T1
AS (SELECT tfsp.SubmissionID,
tfsp.Amount,
tfsp.campaignID,
cc.Name
FROM tbl_FormSubmissions_PaymentsMade tfspm
INNER JOIN tbl_FormSubmissions_Payment tfsp
ON tfspm.SubmissionID = tfsp.SubmissionID
INNER JOIN tbl_CurrentCampaigns cc
ON tfsp.CampaignID = cc.ID
WHERE tfspm.isApproved = 'True'
AND tfspm.PaymentOn >= '2013-05-08 12:00:00.000' AND tfspm.PaymentOn <= '2013-05-14 12:00:00.000')
SELECT SUM(Amount) AS TotalAmount,
campaignID,
Name
FROM T1
GROUP BY campaignID,
Name;
Now I want to combine the results of both outputs. How can I do it?
Edited: added the <= clause as well.
Results from my first query (T):
Amount   ID   Name
1000     2    Annual Fund
83       1    Athletics Fund
300      3    Library Fund
Results from my second query (T1):
Amount   ID   Name
850      2    Annual Fund
370      4    Other
The output I require:
Amount   ID   Name
1850     2    Annual Fund
83       1    Athletics Fund
300      3    Library Fund
370      4    Other
You don't need a join. You can use a single query that covers both date ranges:
SELECT SUM(tfsp.Amount) AS Amount,
tfsp.campaignID,
cc.Name
FROM tbl_FormSubmissions_PaymentsMade tfspm
INNER JOIN tbl_FormSubmissions_Payment tfsp
ON tfspm.SubmissionID = tfsp.SubmissionID
INNER JOIN tbl_CurrentCampaigns cc
ON tfsp.CampaignID = cc.ID
WHERE tfspm.isApproved = 'True'
AND ( tfspm.PaymentOn BETWEEN '2013-05-01 12:00:00.000'
AND '2013-05-07 12:00:00.000'
OR tfspm.PaymentOn BETWEEN '2013-05-08 12:00:00.000'
AND '2013-05-14 12:00:00.000' )
GROUP BY tfsp.campaignID,
cc.Name
If I am right, a WITH clause has to be followed immediately by the statement that selects from it. So IMHO your best bet for joining the two would be to save each result into a temporary table and then join the contents of those two together.
UPDATE: after re-reading your question I realized that you probably don't want a (SQL) join but just your 2 results packed together in one. You could easily achieve that with what I described above: just select the contents of both temporary tables and put a UNION in between them.
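A rough sketch of the "pack both results together" idea follows, assuming Python's built-in sqlite3 module and a hypothetical single payments table in place of the three joined tables. It departs from the description above in two labelled ways: it uses two CTEs in one WITH clause instead of temporary tables, and it uses UNION ALL followed by a second GROUP BY, because stacking the two result sets alone would leave Annual Fund on two separate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table; sample amounts mirror the question's two result sets.
conn.execute(
    "CREATE TABLE payments (campaign_id INTEGER, name TEXT, amount INTEGER, paid_on TEXT)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?)",
    [
        (2, "Annual Fund", 1000, "2013-05-02 09:00:00"),
        (1, "Athletics Fund", 83, "2013-05-03 09:00:00"),
        (3, "Library Fund", 300, "2013-05-04 09:00:00"),
        (2, "Annual Fund", 850, "2013-05-09 09:00:00"),
        (4, "Other", 370, "2013-05-10 09:00:00"),
    ],
)

combined = conn.execute(
    """
    WITH week1 AS (
        SELECT campaign_id, name, SUM(amount) AS total
        FROM payments
        WHERE paid_on BETWEEN '2013-05-01 12:00:00' AND '2013-05-07 12:00:00'
        GROUP BY campaign_id, name
    ),
    week2 AS (
        SELECT campaign_id, name, SUM(amount) AS total
        FROM payments
        WHERE paid_on BETWEEN '2013-05-08 12:00:00' AND '2013-05-14 12:00:00'
        GROUP BY campaign_id, name
    )
    -- stack both result sets, then add up rows that share a campaign
    SELECT campaign_id, name, SUM(total) AS total
    FROM (SELECT * FROM week1 UNION ALL SELECT * FROM week2) AS both_weeks
    GROUP BY campaign_id, name
    ORDER BY campaign_id
    """
).fetchall()

for row in combined:
    print(row)
```

UNION ALL is used rather than UNION so that two identical rows from different weeks would both be kept before summing.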
I was thinking about it the wrong way. Thanks for the help. This is how I achieved exactly what I wanted:
WITH
T AS (
SELECT tfsp.SubmissionID , Amount1 =
CASE
WHEN tfspm.PaymentOn >= '2013-01-10 11:34:54.000' AND tfspm.PaymentOn <= '2013-04-10 11:34:54.000' THEN tfsp.Amount
END
, Amount2 =
CASE
WHEN tfspm.PaymentOn >= '2013-05-01 11:34:54.000' AND tfspm.PaymentOn <= '2013-05-23 11:34:54.000' THEN tfsp.Amount
END
, tfsp.campaignID , cc.Name FROM tbl_FormSubmissions_PaymentsMade tfspm
INNER JOIN tbl_FormSubmissions_Payment tfsp ON tfspm.SubmissionID = tfsp.SubmissionID
INNER JOIN tbl_CurrentCampaigns cc ON tfsp.CampaignID = cc.ID
WHERE tfspm.isApproved = 'True'
)
SELECT ISNULL(SUM(Amount1),0) AS TotalAmount1, ISNULL(SUM(Amount2),0) AS TotalAmount2, campaignID , Name FROM T GROUP BY campaignID, Name;
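The same CASE-inside-SUM pattern can be tried in a small sketch, assuming Python's built-in sqlite3 module, a hypothetical flattened payments table, and made-up date ranges. COALESCE plays the role of ISNULL here, since SQLite has no ISNULL function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table standing in for the three joined tables.
conn.execute(
    "CREATE TABLE payments (campaign_id INTEGER, name TEXT, amount INTEGER, paid_on TEXT)"
)
conn.executemany(
    "INSERT INTO payments VALUES (?, ?, ?, ?)",
    [
        (2, "Annual Fund", 1000, "2013-05-02 09:00:00"),
        (1, "Athletics Fund", 83, "2013-05-03 09:00:00"),
        (3, "Library Fund", 300, "2013-05-04 09:00:00"),
        (2, "Annual Fund", 850, "2013-05-09 09:00:00"),
        (4, "Other", 370, "2013-05-10 09:00:00"),
    ],
)

per_period = conn.execute(
    """
    SELECT campaign_id,
           name,
           -- CASE yields NULL outside the range, and SUM skips NULLs
           COALESCE(SUM(CASE WHEN paid_on BETWEEN '2013-05-01 12:00:00'
                                              AND '2013-05-07 12:00:00'
                             THEN amount END), 0) AS total_amount1,
           COALESCE(SUM(CASE WHEN paid_on BETWEEN '2013-05-08 12:00:00'
                                              AND '2013-05-14 12:00:00'
                             THEN amount END), 0) AS total_amount2
    FROM payments
    GROUP BY campaign_id, name
    ORDER BY campaign_id
    """
).fetchall()

for row in per_period:
    print(row)
```

Each campaign ends up on one row with one column per period, and a campaign with no payments in a period shows 0 instead of NULL.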
I have to do a self join on a table. I am trying to return a list of several columns to see how many of each type of drug test were performed on the same day (MM/DD/YYYY), for days on which at least two tests were done and at least one of them had a result code of 'UN'.
I am joining other tables to get the information, as below. The problem is that I do not quite understand how to exclude someone who has a single result row: they did have a 'UN' result on a day but did not have any other tests that day.
Query Results (Columns)
County, DrugTestID, ID, Name, CollectionDate, DrugTestType, Results, Count(DrugTestType)
I have several rows for ID 12345, which are correct. But ID 12346 has a single row with a count of 1. They had a result of 'UN' on this day but did not have any other tests that day. I want to exclude this.
I tried the following query:
select
c.desc as 'County',
dt.pid as 'PID',
dt.id as 'DrugTestID',
p.id as 'ID',
bio.FullName as 'Participant',
CONVERT(varchar, dt.CollectionDate, 101) as 'CollectionDate',
dtt.desc as 'Drug Test Type',
dt.result as Result,
COUNT(dt.dru_drug_test_type) as 'Count Of Test Type'
from
dbo.Test as dt with (nolock)
join dbo.History as h on dt.pid = h.id
join dbo.Participant as p on h.pid = p.id
join BioData as bio on bio.id = p.id
join County as c with (nolock) on p.CountyCode = c.code
join DrugTestType as dtt with (nolock) on dt.DrugTestType = dtt.code
inner join
(
select distinct
dt2.pid,
CONVERT(varchar, dt2.CollectionDate, 101) as 'CollectionDate'
from
dbo.DrugTest as dt2 with (nolock)
join dbo.History as h2 on dt2.pid = h2.id
join dbo.Participant as p2 on h2.pid = p2.id
where
dt2.result = 'UN'
and dt2.CollectionDate between '11-01-2011' and '10-31-2012'
and p2.DrugCourtType = 'AD'
) as derived
on dt.pid = derived.pid
and convert(varchar, dt.CollectionDate, 101) = convert(varchar, derived.CollectionDate, 101)
group by
c.desc, dt.pid, p.id, dt.id, bio.fullname, dt.CollectionDate, dtt.desc, dt.result
order by
c.desc ASC, Participant ASC, dt.CollectionDate ASC
This is a little complicated because your query has a separate row for each test. You need to use window/analytic functions to get the information you want. These let you calculate aggregate functions but put the values on each row.
The following query starts with your query. It then calculates, for each participant and date, the number of UN results and the total number of tests, and applies the appropriate filter to get what you want:
with base as (<your query here>)
select b.*
from (select b.*,
sum(IsUN) over (partition by Participant, CollectionDate) as NumUNs,
count(*) over (partition by Participant, CollectionDate) as NumTests
from (select b.*,
(case when result = 'UN' then 1 else 0 end) as IsUN
from base b
) b
) b
where NumUNs <> 1 or NumTests <> 1
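The window-function idea can be exercised in a minimal sketch. It assumes Python's built-in sqlite3 module linked against SQLite 3.25 or later (needed for window functions) and a hypothetical, flattened tests table. Because this toy table is not pre-filtered to days that contain a 'UN' result, the sketch spells out the stated requirement directly (at least one UN and at least two tests) instead of reusing the filter above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table: one row per test (sample rows are made up).
conn.execute("CREATE TABLE tests (participant TEXT, collection_date TEXT, result TEXT)")
conn.executemany(
    "INSERT INTO tests VALUES (?, ?, ?)",
    [
        ("A", "2012-01-05", "UN"),
        ("A", "2012-01-05", "NEG"),
        ("A", "2012-01-05", "NEG"),
        ("B", "2012-01-05", "UN"),   # single test that day: should be excluded
        ("C", "2012-01-06", "NEG"),  # two tests but no UN: should be excluded
        ("C", "2012-01-06", "NEG"),
    ],
)

kept = conn.execute(
    """
    SELECT participant, collection_date, result
    FROM (
        SELECT t.*,
               -- per participant and day: number of UN results ...
               SUM(CASE WHEN result = 'UN' THEN 1 ELSE 0 END)
                   OVER (PARTITION BY participant, collection_date) AS num_uns,
               -- ... and total number of tests, repeated on every row
               COUNT(*)
                   OVER (PARTITION BY participant, collection_date) AS num_tests
        FROM tests AS t
    ) AS counted
    WHERE num_uns >= 1 AND num_tests >= 2
    ORDER BY participant, result
    """
).fetchall()

for row in kept:
    print(row)
```

Only participant A's three rows survive: B has a lone UN test, and C has no UN result on that day.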
Without the with clause or window functions, you can create a particularly ugly query to do the same thing:
select b.*
from (<your query>) b join
(select Participant, CollectionDate, count(*) as NumTests,
sum(case when result = 'UN' then 1 else 0 end) as NumUNs
from (<your query>) b
group by Participant, CollectionDate
) bsum
on b.Participant = bsum.Participant and
b.CollectionDate = bsum.CollectionDate
where NumUNs <> 1 or NumTests <> 1
If I understand the problem, the basic pattern for this sort of query is simply to include negating or exclusionary conditions in your join, i.e. a self-join where column A matches but columns B and C do not:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
and t1.PkId != t2.PkId
and t1.category != t2.category
)
Put the conditions in the WHERE clause if it benchmarks better:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
And it's often easiest to start with the self-join, treating it as a "base table" on which to join all related information:
select
[columns]
from
(select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
) bt
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
This can allow you to focus on getting that self-join right, without interference from other tables.
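The "self-join as a base table" pattern might look like the following sketch. It assumes Python's built-in sqlite3 module and two hypothetical tables (tests and participants) with made-up rows. The exclusion condition here is simply "a different test id for the same participant and day".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables for illustration only.
conn.execute(
    "CREATE TABLE tests (id INTEGER, participant TEXT, collection_date TEXT, result TEXT)"
)
conn.execute("CREATE TABLE participants (name TEXT, county TEXT)")
conn.executemany(
    "INSERT INTO tests VALUES (?, ?, ?, ?)",
    [
        (1, "A", "2012-01-05", "UN"),
        (2, "A", "2012-01-05", "NEG"),
        (3, "B", "2012-01-05", "UN"),  # no second test that day
    ],
)
conn.executemany(
    "INSERT INTO participants VALUES (?, ?)", [("A", "North"), ("B", "South")]
)

with_other_test = conn.execute(
    """
    SELECT bt.id, p.county
    FROM (
        -- self-join: same participant and day, but a different test row
        SELECT DISTINCT t1.id, t1.participant
        FROM tests AS t1
        JOIN tests AS t2
          ON t1.participant = t2.participant
         AND t1.collection_date = t2.collection_date
        WHERE t1.id != t2.id
    ) AS bt
    -- related information is joined onto the finished base table
    JOIN participants AS p ON p.name = bt.participant
    ORDER BY bt.id
    """
).fetchall()

print(with_other_test)  # [(1, 'North'), (2, 'North')]
```

DISTINCT guards against a row appearing several times when it has more than one partner row in the self-join.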