SQL Server: how to write a DELETE statement with a GROUP BY - sql

I am using SQL Server 2008.
I have a SELECT query as follows:
SELECT
Apples.ID, COUNT(p.Apples_ID)
FROM
Apples
LEFT JOIN
Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN
Table_C tc ON tb.xID = tc.xID
LEFT JOIN
Pips p ON tb.Apples_ID = p.Apples_ID
WHERE
tc.X IS NULL
GROUP BY
Apples.ID
The tables are:
Apples, which has a unique entry (ID) for each apple.
Pips, which can have dozens of pips belonging to one apple.
Table_B and Table_C, which are mapping tables to refine the search.
I need to group the results because I do not want a separate Apples row for every pip an apple can have. The SELECT statement works and returns a list of unique Apple IDs.
I now want to DELETE these Apples. I changed my statement to:
DELETE Apples
FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
LEFT JOIN Pips p ON tb.Apples_ID = p.Apples_ID
WHERE tc.X IS NULL
GROUP BY Apples.ID
but got a syntax error on the GROUP BY.
I tried:
DELETE x
FROM
(SELECT Apples.ID
FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
LEFT JOIN Pips p ON tb.Apples_ID = p.Apples_ID
WHERE tc.X IS NULL
GROUP BY Apples.ID) x;
But I got an error:
View or function not updatable because the modification affects multiple base tables
How can I delete these rows I have identified in the SELECT, without using a temporary table or script?

As others have pointed out, the sub-query approach can be adapted to work by using an IN ( ... ) clause on a normal single-table delete. This is the simplest way of adapting any select statement to a delete:
DELETE FROM Apples
WHERE ID IN (
-- Sub-query selecting a single column of ID values
)
The sub-query can then be as complex as you like, using GROUP BY, HAVING, etc, as long as it only has one column in the SELECT list.
In your specific case, however, there is no need:
You have no HAVING clause, so the COUNT() doesn't change the rows to delete
The LEFT JOIN to the Pips table has no effect on the result other than the COUNT()
Mentioning the same row twice in a DELETE has no effect, so eliminating duplicates is unnecessary
You can therefore simplify this particular case without using the sub-query:
DELETE Apples
FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
WHERE tc.X IS NULL
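A minimal sketch of both points above, using Python's sqlite3 module as a stand-in for SQL Server. That is an assumption made only so the example is self-contained: SQLite does not accept the T-SQL DELETE ... FROM ... JOIN form, so only the IN ( ... ) form is actually executed, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Apples  (ID INTEGER PRIMARY KEY);
CREATE TABLE Table_B (Apples_ID INTEGER, xID INTEGER);
CREATE TABLE Table_C (xID INTEGER, X TEXT);
CREATE TABLE Pips    (Apples_ID INTEGER);
-- Apple 1: mapped, X is set. Apple 2: mapped, X is NULL. Apple 3: no mapping.
INSERT INTO Apples  VALUES (1), (2), (3);
INSERT INTO Table_B VALUES (1, 10), (2, 20);
INSERT INTO Table_C VALUES (10, 'keep'), (20, NULL);
INSERT INTO Pips    VALUES (1), (1), (2), (2), (2);
""")

# The original SELECT, reduced to a single ID column.
grouped = """
SELECT Apples.ID FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
LEFT JOIN Pips p ON tb.Apples_ID = p.Apples_ID
WHERE tc.X IS NULL
GROUP BY Apples.ID
"""

# The same query without the Pips join and without GROUP BY.
simplified = """
SELECT Apples.ID FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
WHERE tc.X IS NULL
"""

grouped_ids = sorted(row[0] for row in conn.execute(grouped))
simplified_ids = sorted({row[0] for row in conn.execute(simplified)})

# Single-table DELETE driven by the sub-query.
conn.execute(f"DELETE FROM Apples WHERE ID IN ({grouped})")
remaining = [row[0] for row in conn.execute("SELECT ID FROM Apples ORDER BY ID")]

print(grouped_ids, simplified_ids, remaining)
```

With this data both sub-queries pick out the same set of IDs, which is why dropping the Pips join and the GROUP BY should not change which rows are deleted.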

DELETE FROM Apples WHERE ID IN
(
SELECT a.ID FROM Apples a
LEFT JOIN Table_B tb ON a.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
LEFT JOIN Pips p ON tb.Apples_ID = p.Apples_ID
WHERE tc.X IS NULL
GROUP BY a.ID
);

Are you trying to achieve this:
DELETE FROM APPLES WHERE ID IN
(
SELECT Apples.ID FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
LEFT JOIN Pips p ON tb.Apples_ID = p.Apples_ID
WHERE tc.X IS NULL
GROUP BY Apples.ID
);

The only condition that affects the result is tc.X IS NULL. tc.X can be null either because there is no matching row, or because a matching row exists but its X field is null:
DELETE FROM Apples
WHERE ID IN
(
SELECT Apples.ID FROM Apples
LEFT JOIN Table_B tb ON Apples.ID = tb.Apples_ID
LEFT JOIN Table_C tc ON tb.xID = tc.xID
WHERE tc.X IS NULL
);

Related

Most efficient way to join two tables on multiple fields?

I'm working with an Oracle SQL DB and attempting to join 2 tables together. My issue is that there are 3 different dimensions (4 total fields) upon which the two tables may be joined and I'm looking to identify all records where any one of those methods delivers a match and then pull in a certain field from that 2nd table in those instances.
My current plan is as follows:
SELECT a.*,
CASE
WHEN b.field_1 IS NOT NULL THEN b.field_5
WHEN c.field_2 IS NOT NULL THEN c.field_5
WHEN d.field_3 IS NOT NULL THEN c.field_5
END AS match
FROM table_1 a
LEFT JOIN table_2 b ON a.field_1 = b.field_1
LEFT JOIN table_3 c ON a.field_2 = c.field_2
LEFT JOIN table_4 d ON a.field_3 = d.field3 AND a.field_4 = d.field4
I believe this will give me the results I'm looking for, but I imagine this isn't the most efficient way to accomplish that. Any thoughts on a better approach?
[TL;DR] Your query is fine.
You need JOINs to correlate the four tables.
To keep rows from the driving table when there are no matching rows in the related tables, the join must be an OUTER JOIN.
If you put the driving table first, that is a LEFT OUTER JOIN (or just LEFT JOIN).
There is not much choice about this.
If you want to get the field_5 values, then you either want:
SELECT a.*,
b.field_5 AS b_match,
c.field_5 AS c_match,
d.field_5 AS d_match
FROM table_1 a
LEFT JOIN table_2 b ON a.field_1 = b.field_1
LEFT JOIN table_3 c ON a.field_2 = c.field_2
LEFT JOIN table_4 d ON a.field_3 = d.field3 AND a.field_4 = d.field4
If you want all the matches.
Or, you want to use your query:
SELECT a.*,
CASE
WHEN b.field_1 IS NOT NULL THEN b.field_5
WHEN c.field_2 IS NOT NULL THEN c.field_5
WHEN d.field_3 IS NOT NULL THEN c.field_5 -- Should this be d.field_5?
END AS match
FROM table_1 a
LEFT JOIN table_2 b ON a.field_1 = b.field_1
LEFT JOIN table_3 c ON a.field_2 = c.field_2
LEFT JOIN table_4 d ON a.field_3 = d.field3 AND a.field_4 = d.field4
If you want to get a single match in preference order of tables b, c and then d.
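To see that preference order in action, here is a small sketch using Python's sqlite3 as a stand-in for Oracle. The sample data and the id column are made up, the alias is renamed to matched to stay clear of SQLite's MATCH keyword, and the third branch uses d.field_5 on the assumption that this is what was intended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_1 (id INTEGER, field_1 TEXT, field_2 TEXT, field_3 TEXT, field_4 TEXT);
CREATE TABLE table_2 (field_1 TEXT, field_5 TEXT);
CREATE TABLE table_3 (field_2 TEXT, field_5 TEXT);
CREATE TABLE table_4 (field_3 TEXT, field_4 TEXT, field_5 TEXT);
INSERT INTO table_1 VALUES
  (1, 'a',  'x',  'p', 'q'),     -- matches b, c and d
  (2, 'zz', 'x',  'p', 'q'),     -- matches c and d
  (3, 'zz', 'yy', 'p', 'q'),     -- matches d only
  (4, 'zz', 'yy', 'p', 'nope');  -- matches nothing
INSERT INTO table_2 VALUES ('a', 'from_b');
INSERT INTO table_3 VALUES ('x', 'from_c');
INSERT INTO table_4 VALUES ('p', 'q', 'from_d');
""")

# CASE returns the first branch whose join found a row: b, then c, then d.
rows = conn.execute("""
SELECT a.id,
       CASE
         WHEN b.field_1 IS NOT NULL THEN b.field_5
         WHEN c.field_2 IS NOT NULL THEN c.field_5
         WHEN d.field_3 IS NOT NULL THEN d.field_5
       END AS matched
FROM table_1 a
LEFT JOIN table_2 b ON a.field_1 = b.field_1
LEFT JOIN table_3 c ON a.field_2 = c.field_2
LEFT JOIN table_4 d ON a.field_3 = d.field_3 AND a.field_4 = d.field_4
ORDER BY a.id
""").fetchall()

print(rows)
```

Row 4 keeps its place in the output with a NULL match, which is the behaviour the LEFT JOINs are there to provide.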
If you are using Oracle 12 or later, a third alternative could be to use UNION ALL in a LATERAL join:
SELECT a.*, l.field_5
FROM table_1 a
LEFT OUTER JOIN LATERAL (
SELECT 1 AS priority, b.field_5
FROM table_2 b
WHERE a.field_1 = b.field_1
UNION ALL
SELECT 2 AS priority, c.field_5
FROM table_3 c
WHERE a.field_2 = c.field_2
UNION ALL
SELECT 3 AS priority, d.field_5
FROM table_4 d
WHERE a.field_3 = d.field_3
AND a.field_4 = d.field_4
ORDER BY priority ASC
FETCH FIRST ROW WITH TIES
) l
ON (1 = 1)
This may reduce the number of duplicate rows, since it avoids the multiple JOINs (whose extra rows your CASE expression is potentially ignoring), but you should test whether it returns your desired results and whether it is more or less performant.

MS Access Jet SQL Error: Join expression not supported with multiple join conditions

I'm trying to run this SQL Expression in Access:
Select *
From ((TableA
Left Join TableB
On TableB.FK = TableA.PK)
Left Join TableC
On TableC.FK = TableB.PK)
Left Join (SELECT a,b,c FROM TableD WHERE b > 1) AS TableD
On (TableD.FK = TableC.PK AND TableA.a = TableD.a)
but I keep getting the error: Join expression not supported.
What's the problem?
Sorry, I'm just starting with Jet SQL, and in T-SQL it all works fine.
Thanks
The issue is that the final outer join condition TableA.a = TableD.a makes the outer joins ambiguous: which TableD records end up joined to TableA depends on the results of the joins TableA->TableB, TableB->TableC and TableC->TableD, and therefore on the order in which those joins are evaluated.
To avoid this, you'll likely need to structure your query with the joins between tables TableA, TableB & TableC existing within a subquery, the result of which is then outer joined to TableD. This unambiguously defines the order in which the joins are evaluated.
For example:
select * from
(
select TableA.a, TableC.PK from
(
TableA left join TableB on TableA.PK = TableB.FK
)
left join TableC on TableB.PK = TableC.FK
) q1
left join
(
select TableD.a, TableD.b, TableD.c, TableD.FK from TableD
where TableD.b > 1
) q2
on q1.a = q2.a and q1.PK = q2.FK
Consider relating every join to the FROM table to avoid having to nest relations.
SELECT *
FROM ((TableA
LEFT JOIN TableB
ON TableB.FK = TableA.PK)
LEFT JOIN TableC
ON TableC.FK = TableA.PK)
LEFT JOIN
(SELECT FK,a,b,c
FROM TableD WHERE b > 1
) AS TableD
ON (TableD.FK = TableA.PK)
AND (TableD.a = TableA.a)

Using a result of an inner join as part of a query for another join

I have to use the result from one inner join table and subsequently get the records that are not present in another linking table:
To check whether a value is not in a set of values, use NOT IN:
SELECT *
FROM A
WHERE some_ID NOT IN (SELECT C.some_ID
FROM B
JOIN C ON ...)
Alternatively, use a correlated subquery, which does a separate lookup for each record in the outer query:
SELECT *
FROM A
WHERE NOT EXISTS (SELECT 1
FROM B
JOIN C ON ...
WHERE C.some_ID = A.some_ID)
Alternatively, use an outer join and check which records did not match:
SELECT A.*
FROM A
LEFT JOIN (B JOIN C ON ...)
ON A.some_ID = C.some_ID
WHERE C.some_ID IS NULL
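As a rough check that the three forms agree, here is a sketch using Python's sqlite3. The B-to-C join condition is elided above, so the b_ID column used for it is purely hypothetical, as is the sample data. The outer-join variant wraps the inner join in an aliased subquery (bc) rather than a parenthesized join, which should be equivalent. The last step inserts a NULL to show one case where NOT IN behaves differently from the other two.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE A (some_ID INTEGER);
CREATE TABLE B (b_ID INTEGER);                   -- b_ID is a hypothetical join key
CREATE TABLE C (b_ID INTEGER, some_ID INTEGER);
INSERT INTO A VALUES (1), (2), (3);
INSERT INTO B VALUES (10), (20);
INSERT INTO C VALUES (10, 1), (20, 2);
""")

queries = {
    "not_in": """
        SELECT A.some_ID FROM A
        WHERE A.some_ID NOT IN (SELECT C.some_ID
                                FROM B JOIN C ON C.b_ID = B.b_ID)""",
    "not_exists": """
        SELECT A.some_ID FROM A
        WHERE NOT EXISTS (SELECT 1
                          FROM B JOIN C ON C.b_ID = B.b_ID
                          WHERE C.some_ID = A.some_ID)""",
    "left_join": """
        SELECT A.some_ID FROM A
        LEFT JOIN (SELECT C.some_ID
                   FROM B JOIN C ON C.b_ID = B.b_ID) bc
               ON A.some_ID = bc.some_ID
        WHERE bc.some_ID IS NULL""",
}

def run_all():
    # Run every variant and collect the some_ID values it returns.
    return {name: [row[0] for row in conn.execute(sql)]
            for name, sql in queries.items()}

before = run_all()

# A NULL in the subquery result makes "x NOT IN (...)" evaluate to NULL
# for every x that is not in the list, so NOT IN stops returning rows.
conn.execute("INSERT INTO C VALUES (10, NULL)")
after = run_all()

print(before)
print(after)
```

If C.some_ID can be NULL, NOT EXISTS or the outer join is usually the safer choice.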

Oracle joining multiple tables?

I am trying to do the following
select
TA.C1 ,TB.C1 ,TC.C1
from TableA TA ,TableB TB , TableC TC
where TA.C1 = "ABC"
AND TA.C2 = TB.C1
and TA.C3 = TC.C1
Result is
My aim is to add a couple more tables to this query
select
TA.C1,TB.C1,TC.C1,TD.C1,TE.C1
from TableA TA ,TableB TB , TableC TC , TableD TD, TableE TE
where TA.C1 = "ABC"
and TA.C2 = TB.C1
and TA.C3 = TC.C1
and TA.C4 = TD.C1
and TD.C2 = TE.C1
But since the column TD.C1 contains null values, whereas TA.C4 always has some value, I get the results below.
The expected result is
I have tried using explicit JOINs to join four of the tables:
select
TA.C1,TB.C1,TC.C1,TD.C1
from TableA TA
JOIN TableB TB ON (TA.C2 = TB.C1)
JOIN TableC TC ON (TA.C3 = TC.C1)
LEFT JOIN TableD TD ON (TA.C4 = TD.C1)
AND TA.C1 = "ABC"
The result is pretty near what I expect:
The issue is that I'm not sure how to join the 5th table (Table E), as it doesn't have any join with Table A.
You can just include table E with another left join to table D. Basically, the relation exists between tables D and E, and data entered into it has to be in accordance with it. If there is no data, the relation still exists, so the join will return nulls as you want.
select
TA.C1,TB.C1,TC.C1,TD.C1, TE.C1
from TableA TA
INNER JOIN TableB TB ON (TA.C2 = TB.C1)
INNER JOIN TableC TC ON (TA.C3 = TC.C1)
LEFT JOIN TableD TD ON (TA.C4 = TD.C1)
LEFT JOIN TableE TE ON (TD.C2 = TE.C1)
WHERE TA.C1 = 'ABC'
Best practice: use explicit joins, as you do in your later example.
When joining multiple tables, the joins need not all start from the same table; each one must simply be related to a table already in the query. That is, you can write:
select *
from a
inner join b on a.id = b.id_a
inner join c on b.id = c.id_b
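One detail worth checking in queries like these is where the 'ABC' filter sits. The sketch below uses Python's sqlite3 as a stand-in for Oracle, with invented rows in which TableD has no match for TA.C4. It compares putting the filter in a WHERE clause against appending it to the ON clause of the last LEFT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (C1 TEXT, C2 INTEGER, C3 INTEGER, C4 INTEGER);
CREATE TABLE TableB (C1 INTEGER);
CREATE TABLE TableC (C1 INTEGER);
CREATE TABLE TableD (C1 INTEGER, C2 INTEGER);
CREATE TABLE TableE (C1 INTEGER);
INSERT INTO TableA VALUES ('ABC', 1, 2, 3), ('XYZ', 1, 2, 3);
INSERT INTO TableB VALUES (1);
INSERT INTO TableC VALUES (2);
INSERT INTO TableD VALUES (NULL, 5);   -- TD.C1 is NULL, so TA.C4 never matches
INSERT INTO TableE VALUES (5);
""")

joins = """
SELECT TA.C1, TB.C1, TC.C1, TD.C1, TE.C1
FROM TableA TA
INNER JOIN TableB TB ON TA.C2 = TB.C1
INNER JOIN TableC TC ON TA.C3 = TC.C1
LEFT JOIN TableD TD ON TA.C4 = TD.C1
LEFT JOIN TableE TE ON TD.C2 = TE.C1
"""

# Filter in WHERE: applied after the joins, so only the 'ABC' row survives.
filter_in_where = conn.execute(joins + " WHERE TA.C1 = 'ABC'").fetchall()

# Filter appended to the last ON: it only limits which TableE rows match,
# so every TableA row is still returned.
filter_in_on = conn.execute(
    joins + " AND TA.C1 = 'ABC' ORDER BY TA.C1"
).fetchall()

print(filter_in_where)
print(filter_in_on)
```

In both cases the unmatched TableD and TableE columns come back as NULL, so chaining the second LEFT JOIN from TableD to TableE keeps the row as intended.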

Sql NOT IN optimization

I'm having trouble optimizing a query. Here are two example tables I am working with:
Table 1:
UID
A
B
Table 2:
UID Parent
A 2
B 2
C 3
D 2
E 3
F 2
Here is what I am doing now:
Select Table1.UID
FROM Table1 R
INNER JOIN Table2 T ON
R.UID = T.UID
INNER JOIN Table2 E ON
T.PARENT = E.PARENT
AND E.UID NOT IN (SELECT UID FROM Table1)
I'm trying to avoid using the NOT IN clause because of obvious hindrances in performance for large numbers of records.
I know the typical ways to avoid NOT IN clauses like the LEFT JOIN where the other table is null, but can't seem to get what I want with all of the other Joins going on.
I will continue working and post if I find a solution.
EDIT: Here is what I am trying to end up with
After the first Inner Join I would have
A
B
After the second inner join I would have:
A D
A F
B D
B F
The second column above is just to represent that it is matching to the other UIDs with the same parent, but I still need the As and Bs as the UID.
EDIT: RDBMS is SQL server 2005, 2008r2, 2012
Table1 is declared in the query with no index
DECLARE @Table1 TABLE ( [UNIQUE_ID] INT PRIMARY KEY )
Table2 has a clustered index on Unique ID
The general approach is to use a LEFT JOIN with a WHERE clause that keeps only the non-matching rows:
SELECT R.UID
FROM Table1 R
JOIN Table2 T ON R.UID = T.UID
JOIN Table2 E ON T.PARENT = E.PARENT
LEFT JOIN Table1 E2 ON E.UID = E2.UID
WHERE E2.UID IS NULL
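A quick way to confirm that the anti-join returns the same rows as the NOT IN version is to run both against the sample data from the question. The sketch below does that with Python's sqlite3, which is an assumption made for portability (the question targets SQL Server). It also selects E.UID as a second column so the pairs listed in the question's edit are visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (UID TEXT PRIMARY KEY);
CREATE TABLE Table2 (UID TEXT PRIMARY KEY, Parent INTEGER);
INSERT INTO Table1 VALUES ('A'), ('B');
INSERT INTO Table2 VALUES ('A', 2), ('B', 2), ('C', 3),
                          ('D', 2), ('E', 3), ('F', 2);
""")

# Original formulation: exclude siblings that appear in Table1 via NOT IN.
not_in = """
SELECT R.UID, E.UID
FROM Table1 R
JOIN Table2 T ON R.UID = T.UID
JOIN Table2 E ON T.Parent = E.Parent
             AND E.UID NOT IN (SELECT UID FROM Table1)
ORDER BY R.UID, E.UID
"""

# Anti-join formulation: outer join back to Table1, keep rows with no match.
anti_join = """
SELECT R.UID, E.UID
FROM Table1 R
JOIN Table2 T ON R.UID = T.UID
JOIN Table2 E ON T.Parent = E.Parent
LEFT JOIN Table1 E2 ON E.UID = E2.UID
WHERE E2.UID IS NULL
ORDER BY R.UID, E.UID
"""

not_in_rows = conn.execute(not_in).fetchall()
anti_join_rows = conn.execute(anti_join).fetchall()

print(not_in_rows)
print(anti_join_rows)
```

Whether the anti-join is actually faster depends on the optimizer and the indexes, so it is worth comparing the execution plans on the real data.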
SELECT Table2.*
FROM Table2
INNER JOIN (
SELECT UID FROM Table2
EXCEPT
SELECT UID FROM Table1
) AS Filter ON (Table2.UID = Filter.UID)