SQL advice about inner select and join - sql

I'm having a debate with a colleague about a piece of SQL. In a project, I wrote something like this:
update MyTable
set field1 = (select count(distinct blabla)
from anotherTable t
inner join againAnotherTable t2 on t2.fk = t.pk
where t2.fk = MyTable.fk)
After some unit tests, the field "field1" of MyTable is properly populated with valid values. My colleague tells me that I got lucky: the link I make inside the inner query (t2.fk = MyTable.fk) is inconsistent, so I might sometimes get an error, update the wrong row, or update the whole table. Instead, I should put a join after the closing parenthesis.
Did I miss something? Is there indeed a huge mistake on my side?

Your query looks fine to me.
Do note that it does update the entire table because there is no where clause (or other filtering) for the update. Unmatched rows will have a value of 0. I don't know if that is what you intend.
Of course, this all depends on whether t2.fk = MyTable.fk is the logic you really want. I don't know what "inconsistent" would mean in this case.
I don't see how changing the data could result in an error. You might get unexpected values. For instance, if you intend for NULL values of fk to match, they won't. If there are no matches, then you'll get 0. That may not be the correct result (based on the logic you intend), but the query would be doing something sensible.
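To make the behaviour described above concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than the asker's database. The column layout (pk, fk, and blabla living on anotherTable) and the sample rows are assumptions invented for illustration; only the shape of the update follows the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema; the question does not show the real one.
    CREATE TABLE MyTable (fk INTEGER, field1 INTEGER);
    CREATE TABLE anotherTable (pk INTEGER, blabla TEXT);
    CREATE TABLE againAnotherTable (fk INTEGER);

    INSERT INTO MyTable VALUES (1, NULL), (2, NULL), (NULL, NULL);
    INSERT INTO anotherTable VALUES (1, 'x'), (1, 'y'), (1, 'x');
    INSERT INTO againAnotherTable VALUES (1);
""")

# No WHERE clause on the UPDATE itself, so every row of MyTable is touched.
conn.execute("""
    UPDATE MyTable
    SET field1 = (SELECT COUNT(DISTINCT t.blabla)
                  FROM anotherTable t
                  INNER JOIN againAnotherTable t2 ON t2.fk = t.pk
                  WHERE t2.fk = MyTable.fk)
""")

# SQLite sorts NULL first, so the row with a NULL fk comes first.
rows = conn.execute("SELECT fk, field1 FROM MyTable ORDER BY fk").fetchall()
print(rows)
```

With this data, the row with fk = 1 gets the distinct count, while the row with fk = 2 and the row with a NULL fk both get 0, since neither matches anything in the subquery.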

This query is fine if that is how you actually want to update, but if you want to update fields using a join that includes the table being updated, do not write it like that.
Your colleague is trying to make sure that you write safer update queries in the future. He does not want you to someday forget the condition t2.fk = MyTable.fk in the subquery; without it, every row would be set to the same count.
In that case, write it as shown below:
update a set a.field1 = b.value
FROM MyTable a INNER JOIN anotherTable b ON a.condition1 = b.condition2
INNER JOIN yetAnotherTable c ON a.condition1 = c.condition2
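So you could change your update query to something like the one below. SQL Server does not allow an aggregate directly in the SET list of an UPDATE, so the count is computed in a derived table first. Note that rows of MyTable without a match are left unchanged rather than set to 0.
update a
set a.field1 = x.cnt
FROM MyTable a
INNER JOIN (select t2.fk, count(distinct blabla) as cnt
from anotherTable t
inner join againAnotherTable t2 on t2.fk = t.pk
group by t2.fk) x ON x.fk = a.fk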

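-- Compute the count for one key into a variable, then update only that row.
-- T-SQL variables start with @ (# denotes a temporary table).
declare @fk int = 1 -- assumed type and placeholder value: use the key of the row to update
declare @field1 int
set @field1 = (select count(distinct blabla)
from anotherTable t
inner join againAnotherTable t2 on t2.fk = t.pk
where t2.fk = @fk)
update MyTable set field1 = @field1 where fk = @fk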

Related

SQL subquery to joins

Is it possible to remove the subquery from this SQL?
The table has 2 attributes, "id" and "field".
Many rows can have the same id, each with a different value.
I need to get all rows that share an id, using one of their values as the filter.
select *
from Table
where id = (select id from Table where value = 'someValue')
I think it could be really easy, but I don't know how to do it.
A self join can be used:
select T.Id,T.Field
from Table T
INNER JOIN Table TT
ON T.ID = TT.ID
AND TT.Value = 'someValue'
Not sure if you oversimplified your example, but you could make this a little simpler.
select *
from Table
where value = 'someValue'
This should work
select T1.* from Table T1 JOIN Table T2 ON T1.id = T2.id AND T2.value = 'someValue'
Edited (Correct Answer):
What I assume your problem is:
You have a value. Let's pretend it's "testValue". Now you want to get the id of this value and find all other rows with the same id.
What needs to be made clear is that "ID" is not the primary key and is not unique.
You should be able to solve this with a simple self join:
select t.* from Table t right join Table tt on tt.id = t.id where tt.value = 'someValue';
The self join pairs each row with every row that shares its id. The where clause then keeps only the pairs whose partner row has your value, so you get all rows with that id.
Old Answer:
This should do the trick:
select * from Table a inner join Table2 b on a.id = b.id where b.value = 'someValue';
You mentioned only one table in your question. I think this must be a mistake. If not, you only have to change Table2 in my query. But that would make no sense, as you could do a simple query, too:
select * from Table where value = 'someValue';
This would be the result of the first query with a self join.
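The answers above disagree on whether the self join and the plain filter return the same rows. A small sketch with Python's sqlite3 module makes the difference visible, under the reading that the asker wants every row sharing an id with the matching row. The table name items (Table is a reserved word) and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table standing in for "Table" from the question.
    CREATE TABLE items (id INTEGER, value TEXT);
    INSERT INTO items VALUES (1, 'someValue'), (1, 'other'), (2, 'another');
""")

# Self join: every row whose id also appears on a row with value = 'someValue'.
self_join = conn.execute("""
    SELECT t.id, t.value
    FROM items t
    INNER JOIN items tt ON tt.id = t.id AND tt.value = 'someValue'
    ORDER BY t.value
""").fetchall()

# Plain filter: only the rows that carry the value themselves.
simple_filter = conn.execute(
    "SELECT id, value FROM items WHERE value = 'someValue'"
).fetchall()

print(self_join)
print(simple_filter)
```

With this data the self join returns both rows with id 1, while the plain filter returns only the row whose value is 'someValue'. If several rows with the same id carried 'someValue', the self join would return duplicates, which SELECT DISTINCT would remove.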

Two different update statements - Only one working

I have two different variations of update statements for a stored procedure. The top one does not work and the bottom one does.
Could any of you please provide insight as to why it doesn't?
UPDATE table1
SET outcome = (
SELECT outcome
FROM table2
WHERE table1.StatusID = table2.StatusID
AND table1.IDUser = table2.UserID
)
The one below works, even though I have exactly the same constraints.
UPDATE a
SET a.outcome = b.outcome
FROM table1 A
INNER JOIN table2 B ON A.IDUser = B.UserID AND A.StatusID = B.StatusID
The first update will fail when more than one row in table2 matches the conditions for a row of table1, because a scalar subquery may return only one value. The second update will pick an arbitrary value for outcome from the join and use that value in the update.
This change to the first update should work, or rather give the same result:
UPDATE table1
SET outcome = (
SELECT TOP 1 outcome
FROM table2
WHERE table1.StatusID = table2.StatusID
AND table1.IDUser = table2.UserID
)
Maybe this would be better than your existing update. This way you have some control over which value ends up in outcome in table1:
UPDATE table1
SET outcome = (
SELECT MAX(outcome)
FROM table2
WHERE table1.StatusID = table2.StatusID
AND table1.IDUser = table2.UserID
)
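As a rough illustration of the MAX variant, here is a sketch using Python's sqlite3 module with invented sample rows. One caveat: SQLite does not raise an error when a scalar subquery matches several rows (it silently uses the first one), so the failure described above cannot be reproduced here. The sketch only shows that MAX makes the chosen value well defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Column names follow the question; the rows are made up.
    CREATE TABLE table1 (IDUser INTEGER, StatusID INTEGER, outcome TEXT);
    CREATE TABLE table2 (UserID INTEGER, StatusID INTEGER, outcome TEXT);

    INSERT INTO table1 VALUES (1, 10, NULL), (2, 20, NULL);
    -- Two candidate outcomes for user 1, none for user 2.
    INSERT INTO table2 VALUES (1, 10, 'pass'), (1, 10, 'fail');
""")

conn.execute("""
    UPDATE table1
    SET outcome = (SELECT MAX(outcome)
                   FROM table2
                   WHERE table1.StatusID = table2.StatusID
                     AND table1.IDUser = table2.UserID)
""")

result = conn.execute(
    "SELECT IDUser, outcome FROM table1 ORDER BY IDUser"
).fetchall()
print(result)
```

User 1 ends up with 'pass' (the larger of the two strings), and user 2 ends up with NULL because MAX over no rows is NULL.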
It is expected that the first query does not work the way you want, because it is written incorrectly.
Your first statement has a main query and a subquery.
In your subquery, you join the tables and get a result set.
But in your main query, every row of table1 is updated with the subquery's result, since there is no where clause. For rows with no match in table2, the subquery returns NULL, which is why you have NULL values after the update.
You must do the join outside the subquery, exactly as you do in the second statement.
UPDATE table1
SET outcome = (
SELECT TOP (1) outcome
FROM table2
WHERE table2.StatusID = table1.StatusID
AND table2.UserID = table1.IDUser
)

Update with where ANSI join syntax in PostgreSQL updates all rows

I'm trying to do an update with joins in the where clause. I understand with PostgreSQL there is a from clause that I can use with implicit joins like this:
update tbl1 t1 set name = 'foo'
from tbl2 t2
where t2.id = t1.table2_id
and t2.region = 'bar'
However, I have existing code that generates ANSI joins instead of implicit joins. Looking around Stack Overflow, I read that I could do something like this:
update tbl1 set name = 'foo'
from tbl1 t1
inner join tbl2 t2 on t2.id = t1.table2_id
where t2.region = 'bar'
Unfortunately, this doesn't seem to work, and instead of updating just 2 rows, it updates all the rows, regardless of what's in the from/where clause.
What am I missing?
Yes, that happens because PostgreSQL treats t1 in the from clause as a separate reference to tbl1, not as the table being updated. The target table is then effectively cross joined with the from clause, so every row qualifies. There are 2 ways of getting around this.
Use the first query that you posted to UPDATE.
Add a condition to the second query like tbl1.id = t1.id so that it forces 1 to 1 mapping of the table being updated.
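The effect can be modelled with an ordinary SELECT. The sketch below uses Python's sqlite3 module, not PostgreSQL, and only imitates which target rows qualify: the alias target stands for the table being updated, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl2 (id INTEGER, region TEXT);
    CREATE TABLE tbl1 (id INTEGER, table2_id INTEGER, name TEXT);

    INSERT INTO tbl2 VALUES (1, 'bar'), (2, 'baz');
    INSERT INTO tbl1 VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c'), (4, 2, 'd');
""")

# "target" plays the role of the updated table; t1 is the second,
# unrelated reference to tbl1 that appears in the from clause.
BASE = """
    SELECT DISTINCT target.id
    FROM tbl1 AS target
    CROSS JOIN tbl1 AS t1
    INNER JOIN tbl2 AS t2 ON t2.id = t1.table2_id
    WHERE t2.region = 'bar'
"""

# Without a link between target and t1, every target row finds a partner.
uncorrelated = [r[0] for r in conn.execute(BASE + " ORDER BY target.id")]

# Adding target.id = t1.id restores the intended one-to-one mapping.
correlated = [
    r[0]
    for r in conn.execute(BASE + " AND target.id = t1.id ORDER BY target.id")
]

print(uncorrelated)
print(correlated)
```

Only two rows belong to region 'bar', yet all four ids qualify until the extra condition ties the target row to t1.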

refer to outside field value in subselect?

I want to do a query to update values that I forgot to copy over in a mass insert. However I'm not sure how to phrase it.
UPDATE table
SET text_field_1 = (SELECT text_field_2
FROM other_table
WHERE id = **current row in update statement, outside parens**.id )
How do I do this? It seems like a job for recursion.
Use:
UPDATE YOUR_TABLE
SET text_field_1 = (SELECT t.text_field_2
FROM other_table t
WHERE t.id = YOUR_TABLE.id)
Warning
If there's no supporting record in other_table, text_field_1 will be set to NULL.
Explanation
In standard SQL, you can't have table aliases on the table defined for the UPDATE (or DELETE) statement, so you need to use full table name to indicate the source of the column.
It's called a correlated subquery -- the correlation is because of the evaluation against the table from the outer query.
Clarification
MySQL (and SQL Server) support table aliases in UPDATE and DELETE statements, in addition to JOIN syntax:
UPDATE YOUR_TABLE a
JOIN OTHER_TABLE b ON b.id = a.id
SET a.text_field_1 = b.text_field_2
...is not identical to the provided query, because only the rows that match will be updated; the text_field_1 values of rows that don't match remain untouched. This is equivalent to the provided query:
UPDATE YOUR_TABLE a
LEFT JOIN OTHER_TABLE b ON b.id = a.id
SET a.text_field_1 = b.text_field_2
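The difference between overwriting unmatched rows with NULL and leaving them untouched can be sketched with Python's sqlite3 module. SQLite is used only for convenience, and the sample rows are made up. Because the MySQL JOIN syntax above is not portable, the sketch swaps in a WHERE EXISTS guard, which gives the same "only matching rows" behaviour as the inner join version.

```python
import sqlite3

def fresh():
    """Build a throwaway database with one matched and one unmatched row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE your_table (id INTEGER, text_field_1 TEXT);
        CREATE TABLE other_table (id INTEGER, text_field_2 TEXT);
        INSERT INTO your_table VALUES (1, 'old1'), (2, 'old2');
        INSERT INTO other_table VALUES (1, 'new1');
    """)
    return conn

SUBQUERY_UPDATE = """
    UPDATE your_table
    SET text_field_1 = (SELECT t.text_field_2
                        FROM other_table t
                        WHERE t.id = your_table.id)
"""
READ_BACK = "SELECT id, text_field_1 FROM your_table ORDER BY id"

# Plain correlated subquery: the unmatched row (id 2) is set to NULL.
conn = fresh()
conn.execute(SUBQUERY_UPDATE)
unguarded = conn.execute(READ_BACK).fetchall()

# Same update restricted to rows that have a partner in other_table.
conn = fresh()
conn.execute(SUBQUERY_UPDATE + """
    WHERE EXISTS (SELECT 1 FROM other_table t WHERE t.id = your_table.id)
""")
guarded = conn.execute(READ_BACK).fetchall()

print(unguarded)
print(guarded)
```

In the unguarded run the row with id 2 loses its old value, while in the guarded run it keeps 'old2'.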
If there is one ID field:
UPDATE updtable t1
SET t1.text_field_1 = (
SELECT t2.text_field_2
FROM seltable t2
WHERE t1.ID = t2.ID
)
;
UPDATE Table1, Table2
SET Table1.myField = Table2.SomeField
WHERE Table1.ID = Table2.ID
Note: I have not tried it.
This will only update records where IDs match.
Try this:
UPDATE table
SET text_field_1 = (SELECT text_field_2
FROM other_table
WHERE id = table.id )

How can I compare two tables and delete the duplicate rows in SQL?

I have two tables and I need to remove rows from the first table if an exact copy of a row exists in the second table.
Does anyone have an example of how I would go about doing this in MSSQL server?
Well, at some point you're going to have to check all the columns - might as well get joining...
DELETE a
FROM a -- first table
INNER JOIN b -- second table
ON b.ID = a.ID
AND b.Name = a.Name
AND b.Foo = a.Foo
AND b.Bar = a.Bar
That should do it... there is also CHECKSUM(*), but this only helps - you'd still need to check the actual values to preclude hash-conflicts.
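For a quick check of the column-by-column idea, here is a sketch using Python's sqlite3 module. SQLite does not accept the DELETE ... FROM ... JOIN form shown above, so the sketch swaps in an equivalent correlated EXISTS. The table layout mirrors the answer's columns, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (ID INTEGER, Name TEXT, Foo TEXT, Bar TEXT);
    CREATE TABLE b (ID INTEGER, Name TEXT, Foo TEXT, Bar TEXT);

    INSERT INTO a VALUES (1, 'n1', 'f1', 'b1'),
                         (2, 'n2', 'f2', 'b2'),
                         (3, 'n3', 'f3', 'b3');
    -- Row 1 is an exact copy; row 2 differs in Foo; row 3 has no partner.
    INSERT INTO b VALUES (1, 'n1', 'f1', 'b1'),
                         (2, 'n2', 'DIFFERENT', 'b2');
""")

# Delete rows of a for which an identical row exists in b.
cur = conn.execute("""
    DELETE FROM a
    WHERE EXISTS (SELECT 1 FROM b
                  WHERE b.ID = a.ID AND b.Name = a.Name
                    AND b.Foo = a.Foo AND b.Bar = a.Bar)
""")
deleted = cur.rowcount
remaining = conn.execute("SELECT ID FROM a ORDER BY ID").fetchall()

print(deleted, remaining)
```

Only the exact copy is removed. Note that comparing with = never matches NULL against NULL, so rows containing NULL columns would need extra handling in either form.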
If you're using SQL Server 2005 or later, you can use intersect, for example:
delete a from table1 a where exists (select a.* intersect select * from table2)
I think the pseudocode below would do it:
DELETE FirstTable, SecondTable
FROM FirstTable
FULL OUTER JOIN SecondTable
ON FirstTable.Field1 = SecondTable.Field1
... continue for all fields
WHERE FirstTable.Field1 IS NOT NULL
AND SecondTable.Field1 IS NOT NULL
Chris's INTERSECT post is far more elegant though and I'll use that in future instead of writing out all of the outer join criteria :)
I would try a DISTINCT query and do a union of the two tables.
You can use a scripting language like ASP/PHP to format the output into a series of insert statements to rebuild the table from the resulting unique data.
try this:
DELETE t1 FROM t1 INNER JOIN t2 ON t1.name = t2.name WHERE t1.id = t2.id