Remove Duplicate Code from UPDATE - sql

I have the following query.
UPDATE A SET b = (SELECT b FROM B WHERE A.a_id = B.a_id AND B.value = ?)
This might fill A with NULL values if no a_id exists in B where value = ?, but this is okay, because before executing this query it is certain that A.b contains only NULL values to begin with.
However, I need the reported update count to reflect the number of rows that actually received a value. So I changed it into this:
UPDATE A SET b = (SELECT b FROM B WHERE A.a_id = B.a_id AND B.value = ?)
WHERE EXISTS (SELECT b FROM B WHERE A.a_id = B.a_id AND B.value = ?)
I don't like this solution, because now I have duplicate code and have to fill the parameter multiple times. This gets even uglier when the where clause gets more complicated.
Is there a way to get rid of this duplicate code?
(BTW I'm on Oracle 10, but I prefer DB independent solutions)

Update using an inner join:
UPDATE A
INNER JOIN B ON A.a_id = B.a_id
SET A.b = B.b
WHERE B.value = ?
If that isn't allowed with your particular RDBMS, perhaps you could SELECT the old and new values into an aliased table expression and update using that. See Update statement with inner join on Oracle
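To make the row-count difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle, so treat it as an illustration only. The sample rows are made up. It also assumes a driver that supports named placeholders (here `:v`), which at least lets you bind the parameter once even when the SQL text repeats it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (a_id INTEGER PRIMARY KEY, b TEXT);
    CREATE TABLE B (a_id INTEGER, value INTEGER, b TEXT);
    INSERT INTO A (a_id) VALUES (1), (2), (3);
    INSERT INTO B VALUES (1, 10, 'x'), (2, 20, 'y');
""")

# Unfiltered form: every row of A is touched, matched or not.
unfiltered = conn.execute(
    "UPDATE A SET b = (SELECT b FROM B WHERE A.a_id = B.a_id AND B.value = :v)",
    {"v": 10},
).rowcount

# Reset A.b so the second statement starts from all NULLs again.
conn.execute("UPDATE A SET b = NULL")

# Filtered form: only rows with a match in B are touched.
# The named placeholder :v appears twice but is bound once.
filtered = conn.execute(
    """
    UPDATE A SET b = (SELECT b FROM B WHERE A.a_id = B.a_id AND B.value = :v)
    WHERE EXISTS (SELECT 1 FROM B WHERE A.a_id = B.a_id AND B.value = :v)
    """,
    {"v": 10},
).rowcount

final_rows = conn.execute("SELECT a_id, b FROM A ORDER BY a_id").fetchall()
print(unfiltered, filtered, final_rows)
```

Both statements leave A in the same state; only the reported count differs.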

Trying to return a select statement which show records from two tables that match records from a third table

Here is the exercise we were given to practice our SQL refresher:
Get all rows from TableA. If a match is available in both TableB and TableC, include it. This means that if data is available in TableB or TableC, but not both, data from both will be excluded and only TableA data will be shown.
This is the full query I am currently using.
SELECT *
FROM dbo.TableA a
LEFT JOIN dbo.TableB b ON a.ID = b.ID
LEFT JOIN dbo.TableC c ON a.ID = b.ID AND b.ID = c.ID
WHERE a.ID <100;
go
And this is the corresponding output I am getting.
I am trying to make the TableB columns of record 2 NULL, since that record has no match in TableC. Is there any way to get something like the following to work? When I try it, it fails with an error saying the identifier could not be found.
LEFT JOIN dbo.TableB b
on a.ID = b.ID and TableC.ID = b.ID
Expecting
All From Table A
Rows from TableB that match TableA and TableC
Rows from TableC that match TableA and TableB
I figured out that my logic was a bit wrong.
I decided to lay out the tables differently to see if that would work, and it ended up giving me what I needed:
select *
from dbo.TableB b
inner join dbo.TableC c
on b.ID = c.ID
right outer join dbo.TableA a
on b.ID = a.ID
where a.ID < 100;
go
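For readers who want to try this, here is a small sketch in Python's sqlite3 module (not SQL Server). Older SQLite builds lack RIGHT JOIN, so this rewrites the query above as an equivalent LEFT JOIN from TableA onto a derived table that inner-joins TableB and TableC. The column names b_val and c_val and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY);
    CREATE TABLE TableB (ID INTEGER PRIMARY KEY, b_val TEXT);
    CREATE TABLE TableC (ID INTEGER PRIMARY KEY, c_val TEXT);
    INSERT INTO TableA VALUES (1), (2), (3);
    INSERT INTO TableB VALUES (1, 'b1'), (2, 'b2');   -- ID 2 has no match in TableC
    INSERT INTO TableC VALUES (1, 'c1'), (3, 'c3');   -- ID 3 has no match in TableB
""")

# B and C are inner-joined first, so only IDs present in both survive;
# the outer LEFT JOIN then keeps every row of TableA.
rows = conn.execute("""
    SELECT a.ID, bc.b_val, bc.c_val
    FROM TableA a
    LEFT JOIN (
        SELECT b.ID, b.b_val, c.c_val
        FROM TableB b
        INNER JOIN TableC c ON b.ID = c.ID
    ) bc ON a.ID = bc.ID
    WHERE a.ID < 100
    ORDER BY a.ID
""").fetchall()

for row in rows:
    print(row)
```

Only ID 1 carries TableB and TableC data; IDs 2 and 3 come back with NULLs on both sides, which is what the exercise asks for.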

Why does this join-update query update all values?

I made a mistake and all values of a column were updated.
I did this in SQL Server 2008 R2.
I should have run some query like this:
UPDATE TABLE_A
SET FEEL = 'HAPPY'
FROM TABLE_A A
INNER JOIN TABLE_B B ON A.SN = B.SN
WHERE B.WEATHER = 'SUNNY';
However, I made a mistake and ran this:
UPDATE TABLE_A
SET FEEL = 'HAPPY'
FROM TABLE_C A
INNER JOIN TABLE_B B ON A.SN = B.SN
WHERE B.WEATHER = 'SUNNY';
(TABLE_C also has an [SN] column.)
I expected this query to set FEEL in TABLE_A to 'HAPPY' only where WEATHER in TABLE_B is 'SUNNY', via the inner join between the two tables, but every value of the FEEL column was updated to 'HAPPY'.
What does UPDATE A SET ~ FROM C mean in SQL Server, and when should it be used? And why does the inner join update all values?
This query:
UPDATE TABLE_A
SET FEEL = 'HAPPY'
FROM TABLE_C A INNER JOIN
TABLE_B B
ON A.SN = B.SN
WHERE B.WEATHER = 'SUNNY';
is saying to update all rows in TABLE_A that match the conditions in the ON and WHERE clauses. But, none of those conditions involve TABLE_A. So, nothing is being filtered. Actually, what you are doing is equivalent to:
UPDATE AA
SET FEEL = 'HAPPY'
FROM TABLE_A AA CROSS JOIN
TABLE_C A INNER JOIN
TABLE_B B
ON A.SN = B.SN
WHERE B.WEATHER = 'SUNNY';
This is a bit of weirdness in the UPDATE.
When you do:
UPDATE TABLE_A
SET FEEL = 'HAPPY'
FROM TABLE_A A INNER JOIN
TABLE_B B
ON A.SN = B.SN
WHERE B.WEATHER = 'SUNNY';
SQL Server makes an exception to the rule that an alias always replaces the table reference. It still allows TABLE_A in the UPDATE to refer to A. So, there is no CROSS JOIN.
Personally, I consider this broken, because a table alias should always replace the table reference. The developers at Microsoft think otherwise, and there is no standard that guides this syntax.
If you have a FROM clause, I recommend that you always use the table alias in the UPDATE:
UPDATE A
SET FEEL = 'HAPPY'
FROM TABLE_A A INNER JOIN
TABLE_B B
ON A.SN = B.SN
WHERE B.WEATHER = 'SUNNY';
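The cross-join effect is easy to see by running the two FROM clauses as plain SELECTs. The sketch below uses Python's sqlite3 module, not SQL Server, so it only illustrates the join logic, not the T-SQL alias rule itself. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (SN INTEGER PRIMARY KEY, FEEL TEXT);
    CREATE TABLE TABLE_B (SN INTEGER PRIMARY KEY, WEATHER TEXT);
    CREATE TABLE TABLE_C (SN INTEGER PRIMARY KEY);
    INSERT INTO TABLE_A (SN) VALUES (1), (2), (3);
    INSERT INTO TABLE_B VALUES (1, 'SUNNY'), (2, 'RAINY');
    INSERT INTO TABLE_C VALUES (1);
""")

# Mistaken statement: the target table is not tied to the join at all,
# so it is effectively cross-joined with whatever the FROM clause returns.
mistaken_targets = conn.execute("""
    SELECT DISTINCT AA.SN
    FROM TABLE_A AA CROSS JOIN
         TABLE_C A INNER JOIN
         TABLE_B B
         ON A.SN = B.SN
    WHERE B.WEATHER = 'SUNNY'
    ORDER BY AA.SN
""").fetchall()

# Intended statement: the target table takes part in the join.
intended_targets = conn.execute("""
    SELECT DISTINCT A.SN
    FROM TABLE_A A INNER JOIN
         TABLE_B B
         ON A.SN = B.SN
    WHERE B.WEATHER = 'SUNNY'
    ORDER BY A.SN
""").fetchall()

print(mistaken_targets, intended_targets)
```

As long as the join between TABLE_C and TABLE_B returns at least one row, every row of TABLE_A qualifies in the mistaken form.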

What kind of join is used in a Vertica UPDATE statement?

Vertica has an interesting update syntax when updating a table based on a join value. Instead of using a join to find the update rows, it mandates a syntax like this:
UPDATE a
SET col = b.val
where a.id = b.id
(Note that this syntax is indeed mandated in this case, because Vertica prohibits us from using a where clause that includes a "self-join", that is a join referencing the table being updated, in this case a.)
This syntax is nice, but it's less explicit about the join being used than other SQL dialects. For example, what happens in this case?
UPDATE a
SET col = CASE WHEN b.id IS NULL THEN 0 ELSE b.val END
where a.id = b.id
What happens when a.id has no match in b.id? Does a.col not get updated, as though the condition a.id = b.id represented an inner join of a and b? Or does it get updated to zero, as if the condition were a left outer join?
I think Vertica uses the Postgres standard for this syntax:
UPDATE a
SET col = b.val
FROM b
WHERE a.id = b.id;
This is an INNER JOIN. I agree that it would be nice if Postgres and the derived databases supported explicit JOINs to the update table (as some other databases do). But the answer to your question is that this is an INNER JOIN.
I should note that if you want a LEFT JOIN, you have two options. One is a correlated subquery:
UPDATE a
SET col = (SELECT b.val FROM b WHERE a.id = b.id);
The other is an additional level of JOIN (assuming that id is unique in a):
UPDATE a
SET col = b.val
FROM a a2 LEFT JOIN
b
ON a2.id = b.id
WHERE a.id = a2.id;
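As a rough illustration of the correlated-subquery option, here is a sketch using Python's sqlite3 module (not Vertica or Postgres). Wrapping the subquery in COALESCE is my own addition to get the "zero when there is no match" behaviour from the question. Note that it also turns a matched NULL b.val into 0, which the CASE expression in the question would not do. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, col INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, val INTEGER);
    INSERT INTO a VALUES (1, 99), (2, 99);
    INSERT INTO b VALUES (1, 5);   -- a.id = 2 has no match in b
""")

# Plain correlated subquery: behaves like a LEFT JOIN, unmatched rows become NULL.
conn.execute("UPDATE a SET col = (SELECT b.val FROM b WHERE a.id = b.id)")
with_nulls = conn.execute("SELECT id, col FROM a ORDER BY id").fetchall()

# Same idea with COALESCE: unmatched rows get 0 instead of NULL.
conn.execute("""
    UPDATE a
    SET col = COALESCE((SELECT b.val FROM b WHERE a.id = b.id), 0)
""")
with_zeros = conn.execute("SELECT id, col FROM a ORDER BY id").fetchall()

print(with_nulls, with_zeros)
```

Either way every row of a is updated, which is the difference from the inner-join form with FROM b.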

INNER JOIN where **every** row must match the WHERE clause?

Here's a simplified example of what I'm trying to do. I have two tables, A and B.
A        B
-----    -----
id       id
name     a_id
         value
I want to select only the rows from A where ALL the values of the rows from B match a where clause. Something like:
SELECT * from A INNER JOIN B on B.a_id = A.id WHERE B.value > 2
The problem with the above query is that if ANY row from B has a value > 2 I'll get the corresponding row from A, and I only want the row from A if
1.) ALL the rows in B for B.a_id = A.id match the WHERE, OR
2.) There are no rows in B that reference A
B is basically a table of filters.
SELECT *
FROM a
WHERE NOT EXISTS
(
SELECT NULL
FROM b
WHERE b.a_id = a.id
AND (b.value <= 2 OR b.value IS NULL)
)
This should solve your problem:
SELECT *
FROM a
WHERE NOT EXISTS (SELECT *
FROM b
WHERE b.a_id = a.id
AND b.value <= 2)
Here is the way in which this is obtained.
Suppose that we have available a universal quantifier (parallel to EXISTS, the existential quantifier), with a syntax like:
FORALL table WHERE condition1 : condition2
(to be read: FORALL the elements of table that satisfy the condition1, then condition2 is true)
So you could write your query in this way:
SELECT *
FROM a
WHERE FORALL b WHERE b.a_id = a.id : b.value > 2
(Note that forall is true even when no element in b exists with a value of a.id)
Then we can transform the universal quantifier in the existential one, with a double negation, as usual:
SELECT *
FROM a
WHERE NOT EXISTS b WHERE b.a_id = a.id : NOT (b.value > 2)
In plain SQL this can be written as:
SELECT *
FROM a
WHERE NOT EXISTS (SELECT *
FROM b
WHERE b.a_id = a.id
AND (b.value > 2) IS NOT TRUE)
This technique is very handy in case of universal quantification.
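Here is a small runnable sketch of the double-negation form, using Python's sqlite3 module with made-up rows. It spells the inverted test as (value <= 2 OR value IS NULL) instead of IS NOT TRUE, only because older SQLite builds may not accept IS NOT TRUE. The naive inner join is included for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER, value INTEGER);
    INSERT INTO a VALUES (1, 'all pass'), (2, 'one fails'),
                         (3, 'no filters'), (4, 'null value');
    INSERT INTO b VALUES (1, 1, 3), (2, 1, 5), (3, 2, 3), (4, 2, 1), (5, 4, NULL);
""")

# Naive inner join: returns a row of a as soon as ANY b row passes.
naive = [r[0] for r in conn.execute("""
    SELECT DISTINCT a.id
    FROM a INNER JOIN b ON b.a_id = a.id
    WHERE b.value > 2
    ORDER BY a.id
""")]

# FORALL expressed as NOT EXISTS (a failing row), with NULL treated as failing.
for_all = [r[0] for r in conn.execute("""
    SELECT a.id
    FROM a
    WHERE NOT EXISTS (SELECT *
                      FROM b
                      WHERE b.a_id = a.id
                        AND (b.value <= 2 OR b.value IS NULL))
    ORDER BY a.id
""")]

print(naive, for_all)
```

Row 3 has no rows in b at all and is still returned by the NOT EXISTS form, matching the remark that FORALL is true over an empty set.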
Answering this question (which it seems you actually meant to ask):
Return all rows from A, where all rows in B with B.a_id = A.id also pass the test B.value > 2.
Which is equivalent to:
Return all rows from A, where no row in B with B.a_id = A.id fails the test B.value > 2.
SELECT a.* -- "rows from A" (so don't include other columns)
FROM a
LEFT JOIN b ON b.a_id = a.id
AND (b.value > 2) IS NOT TRUE -- safe inversion of logic
WHERE b.a_id IS NULL;
When inverting a WHERE condition, carefully consider NULL. IS NOT TRUE is the simple and safe way to perfectly invert a WHERE condition. The alternative would be (b.value <= 2 OR b.value IS NULL), which is longer but may be faster (easier to support with an index).
Select rows which are not present in other table
Try this
SELECT * FROM A
LEFT JOIN B ON B.a_id = A.id
WHERE B.value > 2 OR B.a_id IS NULL
SELECT * FROM A LEFT JOIN B ON B.a_id = A.id
WHERE B.a_id IS NULL OR NOT EXISTS (
SELECT 1
FROM B b2
WHERE b2.a_id = A.id
AND b2.value <= 2)
SELECT a.id, a.name, c.id as B_id, c.value from A
INNER JOIN (Select b.id, b.a_id, b.value from B WHERE B.value > 2) C
on C.a_id = A.id
Note that it is poor practice to use SELECT *. You should specify only the fields you need. In this case, I might remove the b.id references because they are probably not needed. If you have a join, there is a 100% chance you are wasting resources sending data you don't need, because the join fields will be repeated. That is why I did not include a_id in the final result set.
If you prefer not to use EXISTS, you can use an outer join.
SELECT A.*
FROM
A
LEFT JOIN B ON
B.a_id = A.id
AND B.value <= 2 -- note: condition reversed!!
WHERE B.id IS NULL
This works by searching for the existence of a failing record in B. If it finds one, then the join will match, and the final WHERE clause will exclude that record.
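A quick way to check the outer-join version is to run it on a few made-up rows. The sketch below uses Python's sqlite3 module. As written in the query above, the reversed condition is B.value <= 2, so a NULL value would not count as failing here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, value INTEGER);
    INSERT INTO A VALUES (1, 'all pass'), (2, 'one fails'), (3, 'no filters');
    INSERT INTO B VALUES (1, 1, 3), (2, 1, 5), (3, 2, 3), (4, 2, 1);
""")

# The join looks for FAILING rows in B; rows of A with no failing match
# come back with B.id NULL and are the ones we keep.
rows = conn.execute("""
    SELECT A.id, A.name
    FROM A
    LEFT JOIN B ON B.a_id = A.id
               AND B.value <= 2
    WHERE B.id IS NULL
    ORDER BY A.id
""").fetchall()

print(rows)
```

Because the join only ever matches failing rows, each surviving row of A appears exactly once, even when it has several passing rows in B.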

filter duplicates in SQL join

When using a SQL join, is it possible to keep only rows that have a single row for the left table?
For example:
select * from A, B where A.id = B.a_id;
a1 b1
a2 b1
a2 b2
In this case, I want to remove all except the first row, where a single row from A matched exactly 1 row from B.
I'm using MySQL.
This should work in MySQL:
select * from A, B where A.id = B.a_id GROUP BY A.id HAVING COUNT(*) = 1;
For those of you not using MySQL, you will need to use aggregate functions (like min() or max()) on all the columns (except A.id) so your database engine doesn't complain.
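A sketch of that portable variant, run through Python's sqlite3 module on rows modelled after the example in the question. The name and label columns are invented. Since each surviving group contains exactly one row, MIN() just returns that row's value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, label TEXT);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO B VALUES (1, 1, 'b1'), (2, 2, 'b1'), (3, 2, 'b2');
""")

# Every non-grouped column is wrapped in an aggregate so that strict
# engines accept the statement; HAVING keeps single-match groups only.
rows = conn.execute("""
    SELECT A.id, MIN(A.name), MIN(B.id), MIN(B.label)
    FROM A
    JOIN B ON A.id = B.a_id
    GROUP BY A.id
    HAVING COUNT(*) = 1
""").fetchall()

print(rows)
```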
It helps if you specify the keys of your tables when asking a question such as this. It isn't obvious from your example what the key of B might be (assuming it has one).
Here's a possible solution assuming that ID is a candidate key of table B.
SELECT *
FROM A, B
WHERE B.id =
(SELECT MIN(B.id)
FROM B
WHERE A.id = B.a_id);
First, I would recommend using the JOIN syntax instead of the outdated syntax of separating tables by commas. Second, if A.id is the primary key of the table A, then you need only inspect table B for duplicates:
Select ...
From A
Join B
On B.a_id = A.id
Where Exists (
Select 1
From B B2
Where B2.a_id = A.id
Having Count(*) = 1
)
This avoids the cost of counting matching rows, which can be expensive for large tables.
As usual, when comparing various possible solutions, benchmarking / comparing the execution plans is suggested.
select
*
from
A
join B on A.id = B.a_id
where
not exists (
select
1
from
B B2
where
A.id = b2.a_id
and b2.id != b.id
)
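To see the NOT EXISTS version in action, here is a minimal sketch with Python's sqlite3 module, using rows modelled after the question's example. The name and label columns are made up, and B.id is assumed to be a key of B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE B (id INTEGER PRIMARY KEY, a_id INTEGER, label TEXT);
    INSERT INTO A VALUES (1, 'a1'), (2, 'a2');
    INSERT INTO B VALUES (1, 1, 'b1'), (2, 2, 'b1'), (3, 2, 'b2');
""")

# Keep a joined row only if no OTHER row of B points at the same A row.
rows = conn.execute("""
    SELECT A.name, B.label
    FROM A
    JOIN B ON A.id = B.a_id
    WHERE NOT EXISTS (
        SELECT 1
        FROM B B2
        WHERE B2.a_id = A.id
          AND B2.id != B.id
    )
""").fetchall()

print(rows)
```

Only a1 survives, since a2 matches two rows of B.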