DB2 performance of coalesce and inner join - sql

I have a pretty poorly performing query (which I inherited) and I'm not too sure how to optimize it... As far as I understand it's setting the value of a 2nd column as the value of the 1st column PLUS a value from another table, where a relationship is found.
update table1 set
col2 = col1 || coalesce ((
select table2.the_column_wanted from table2 where table2.fk = table1.pk and
table2.flag = 'Y'))
where flag = 'Y' and pk in ( select distinct fk from table2 );

The exact speed issue depends on the characteristics of your table, but in general I would look into MERGE for this sort of problem. Something like this:
MERGE INTO table1 USING table2
ON table1.pk = table2.fk and
table1.flag = 'Y' and
table2.flag = 'Y'
WHEN MATCHED THEN UPDATE SET
table1.col2 = table1.col1 || table2.the_column_wanted;
The query as written is very questionable. You should probably take a good look at all of the code you have "inherited."

Related

Aster UPDATE Column Does Not Exist

I am attempting to UPDATE the values of a column in one table based on the count of values in a source table. I am using Teradata Aster.
When I submit the following correlated subquery, I get an error stating the column does not exist despite verifying that it does exist.
UPDATE table2
SET column =
(
SELECT count(*)
FROM table1
WHERE table2.column = table1.column
)
I feel there is something idiosyncratic about Aster, but I'm not certain.
You could use the query below for a simple column update from another table.
UPDATE table1
SET col2 = table2.col2
FROM table2
WHERE table1.col1 = table2.col1;
For an aggregate function in an update query, you could use the query below.
UPDATE table1
SET col2 = table2.col2
FROM (select col1, count(col2) col2 from table2 group by col1 ) table2
WHERE table1.col1 = table2.col1;
Both queries work fine for me.

SQL advice about inner select and join

I'm having some sort of debate with a colleague about a piece of sql. In a project, I wrote something like this :
update MyTable
set field1 = (select count(distinct blabla)
from anotherTable t1
inner join againAnotherTable t2 on t2.fk = t1.pk
where t2.fk = MyTable.fk)
After some unit tests, the field "field1" of MyTable is properly populated with valid values. My colleague tells me that I got lucky, because the link I make inside the inner query (t2.fk = MyTable.fk) is inconsistent, and that I might sometimes get an error, update the wrong row, or update the whole table. Instead, I should put a join after the closing parenthesis.
Did I miss something? Is there indeed a huge mistake on my side?
Your query looks fine to me.
Do note that it does update the entire table because there is no where clause (or other filtering) for the update. Unmatched rows will have a value of 0. I don't know if that is what you intend.
Of course, this all depends on whether t2.fk = MyTable.fk is the logic you really want. I don't know what "inconsistent" would mean in this case.
I don't see how changing the data could result in an error. You might get unexpected values. For instance, if you intend for NULL values of fk to match, they won't. If there are no matches, then you'll get 0. That may not be the correct result (based on the logic you intend), but the query would be doing something sensible.
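The behaviour described above can be checked on a toy dataset. This is a minimal sketch, assuming SQLite (through Python's standard sqlite3 module) as a stand-in for the real database; the sample rows are invented for illustration, and the join alias is written as t so that the query runs.

```python
import sqlite3

# In-memory SQLite database used as a stand-in (assumption: not the asker's DBMS).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MyTable (fk INTEGER, field1 INTEGER);
    CREATE TABLE anotherTable (pk INTEGER, blabla TEXT);
    CREATE TABLE againAnotherTable (fk INTEGER);
    -- Invented sample data: one matching key, one unmatched key, one NULL key.
    INSERT INTO MyTable VALUES (1, NULL), (2, NULL), (NULL, NULL);
    INSERT INTO anotherTable VALUES (1, 'a'), (1, 'a'), (1, 'b');
    INSERT INTO againAnotherTable VALUES (1);
""")

# No WHERE clause on the UPDATE itself, so every row of MyTable is rewritten.
conn.execute("""
    UPDATE MyTable
    SET field1 = (SELECT COUNT(DISTINCT t.blabla)
                  FROM anotherTable t
                  INNER JOIN againAnotherTable t2 ON t2.fk = t.pk
                  WHERE t2.fk = MyTable.fk)
""")

rows = conn.execute("SELECT fk, field1 FROM MyTable ORDER BY fk").fetchall()
for fk, field1 in rows:
    print(fk, field1)
```

In this sketch the row with a matching key gets its distinct count, while the unmatched key and the NULL key both end up with 0 rather than an error.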
This query is fine if that is really how you want to update the table, but if you want to update fields through a join that includes the table being updated, do not write it like that.
Your colleague is trying to make sure that you write safer update queries in the future. He does not want you to forget the where t2.fk = MyTable.fk condition in the sub-query someday; without it, the table would be updated incorrectly (every row would get the same count).
In that case, write it as shown below:
update a set a.field1 = b.value
FROM MyTable a INNER JOIN anotherTable b ON a.condition1 = b.condition2
INNER JOIN yetAnotherTable c ON a.condition1 = c.condition2
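So, you should change your update query to something like the following. An aggregate cannot appear directly in the SET clause, so the count is computed in a derived table first:
update a
set a.field1 = c.cnt
FROM MyTable a
INNER JOIN (select t2.fk, count(distinct blabla) as cnt
from anotherTable t1
inner join againAnotherTable t2 on t2.fk = t1.pk
group by t2.fk) c ON c.fk = a.fk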
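declare @field1 datatype
-- MyTable.fk is not in scope in a standalone select; replace it with the key of the row being updated
set @field1 = (select count(distinct blabla)
from anotherTable t1
inner join againAnotherTable t2 on t2.fk = t1.pk
where t2.fk = MyTable.fk)
update table set column=@field1 where id=''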

How to select a value that can come from two different tables?

First, SQL is not my strength. So I need help with the following problem. I'll simplify the table contents to describe the problem.
Let's start with three tables : table1 with columns id_1 and value, table2 with columns id_2 and value, and table3 with columns id_3 and value. As you'll notice, a field value appears in all three tables, while ids have different column names. Modifying column names is not an option because they are used by Java legacy code.
I need to set table3.value using table1.value or table2.value according to the fields table1.id_1, table2.id_2 and table3.id_3.
My last attempt, which describes what I try to do, is the following:
UPDATE table3
SET value=(IF ((SELECT COUNT(*) FROM table1 t1 WHERE t1.id_1=id_3) > 0)
SELECT value FROM table1 t1 WHERE t1.id_1=id_3
ELSE IF ((SELECT COUNT(*) FROM table2 t2 WHERE t2.id_2=id_3)) > 0)
SELECT value FROM table2 t2 WHERE t2.id_2=id_3)
Here is some information about the tables and the update:
1. This update will be included in an XML file used by Liquibase.
2. It must work with Oracle or SQL Server.
3. An id from table3.id_3 can be found at most once in table1.id_1 or in table2.id_2, but not in both tables simultaneously.
4. If table3.id_3 is found neither in table1.id_1 nor in table2.id_2, table3.value remains null.
As you can imagine, my last attempt failed. In that case, the IF command was not recognized during the Liquibase update. If anyone has any ideas on how to deal with this, I'd appreciate it. Thanks in advance.
I don't know Oracle very well, but a SQL Server approach would be the following using COALESCE() and OUTER JOINs.
Update T3
Set Value = Coalesce(T1.Value, T2.Value)
From Table3 T3
Left Join Table2 T2 On T3.Id_3 = T2.Id_2
Left Join Table1 T1 On T3.Id_3 = T1.Id_1
The COALESCE() will return the first non-NULL value from the LEFT JOIN to tables 1 and 2, and if a record was not found in either, it would be set to NULL.
This is Siyual's UPDATE rewritten as a MERGE statement.
MERGE INTO table3
USING (
SELECT COALESCE(t1.value, t2.value) AS value, t3.id_3 AS id
FROM table3 t3
LEFT JOIN table1 t1 ON t1.id_1 = t3.id_3
LEFT JOIN table2 t2 ON t2.id_2 = t3.id_3
) t ON (table3.id_3 = t.id)
WHEN MATCHED THEN
UPDATE SET table3.value = t.value
This should work in Oracle.
In Oracle
UPDATE table3 t
SET value=COALESCE((SELECT value FROM table1 t1 WHERE t1.id_1=t.id_3),
(SELECT value FROM table2 t2 WHERE t2.id_2=t.id_3))
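The COALESCE-over-two-subqueries form above can be tried on a small dataset. This is a minimal sketch, assuming SQLite via Python's sqlite3 module rather than Oracle or SQL Server; the sample rows are invented, and the outer table is referenced by its full name instead of an alias.

```python
import sqlite3

# In-memory SQLite database (assumption: used only to illustrate the semantics).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id_1 INTEGER, value TEXT);
    CREATE TABLE table2 (id_2 INTEGER, value TEXT);
    CREATE TABLE table3 (id_3 INTEGER, value TEXT);
    -- Invented sample data: id 1 lives in table1, id 2 in table2, id 3 in neither.
    INSERT INTO table1 VALUES (1, 'from table1');
    INSERT INTO table2 VALUES (2, 'from table2');
    INSERT INTO table3 VALUES (1, NULL), (2, NULL), (3, NULL);
""")

# COALESCE returns the first subquery result that is not NULL, or NULL if both are.
conn.execute("""
    UPDATE table3
    SET value = COALESCE(
        (SELECT t1.value FROM table1 t1 WHERE t1.id_1 = table3.id_3),
        (SELECT t2.value FROM table2 t2 WHERE t2.id_2 = table3.id_3))
""")

result = conn.execute("SELECT id_3, value FROM table3 ORDER BY id_3").fetchall()
for id_3, value in result:
    print(id_3, value)
```

In this sketch the id found in neither table keeps a NULL value, which matches the requirement stated in the question.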
Given your assumption #3, you can use union all to put together tables 1 and 2 without running the risk of duplicating information (at least for the id's of interest). So a simple merge solution like the one below should work (in all DB products that implement the merge operation).
merge into table3
using (
select id_1 as id, value from table1
union all
select id_2, value from table2
) t
on table3.id_3 = t.id
when matched
then update set table3.value = t.value;
You may want to test the various solutions and see which is most effective for your specific tables.
(Note: merge should be more efficient than the update solution using coalesce, at least when relatively few of the id's in table3 have a match in the other tables. This is because the update solution will re-insert NULL where NULL was already stored when there is no match. The merge solution avoids this unnecessary activity.)

Inserting values generated from query into a blank column

I am trying to count the number of points (stored in table2) that are found in each polygon of table1. The query works, but I have tried to alter it to add the values generated to a blank column in table1.
So far it only works by appending the results to the bottom of the table. Any help? To summarise, I am trying to add the values generated by this query to table1. At the moment the query inserts them into the blank column in table1, not matched against the ID but appended at the bottom.
INSERT INTO table1(field3)
SELECT COUNT(table2.id) AS count1
FROM table1 LEFT JOIN table2
ON ST_Contains(table1.geom,table2.geom)
GROUP BY table1.id;
The only change I made here was to switch your left join to an inner join. In the case where a geometry in table1 contains no geometries in table2, the value of field3 will stay null, so you might want to start by doing an "update table1 set field3 = 0" first (it can turn out to be a bit faster doing that in two steps depending on how many features you have and how many points each geometry has).
update table1 a
set field3 = b.count1
from
(
SELECT table1.id,
COUNT(table2.id) AS count1
FROM table1
JOIN table2
ON ST_Contains(table1.geom,table2.geom)
GROUP BY table1.id
) b
where a.id = b.id
Alternative:
update table1 a
set field3 = b.count1
from
(
SELECT table1.id,
COUNT(table2.id) AS count1
FROM table1
left JOIN table2
ON ST_Contains(table1.geom,table2.geom)
GROUP BY table1.id
) b
where a.id = b.id
Also, this site just showed up on reddit this morning. I haven't spent much time digging through it but it looks promising as (yet another) resource for learning sql (in a postgres-specific environment).
Edit: I'm starting to doubt myself with regards to the two step approach that I first posted - I think it's almost entirely wrong about the performance, so I included an alternative query.
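The difference between the two subqueries above can be seen without PostGIS. This is a minimal sketch, assuming SQLite via Python's sqlite3 module; the hypothetical zone column and its equality join stand in for ST_Contains(table1.geom, table2.geom), and the sample rows are invented.

```python
import sqlite3

# In-memory SQLite database; "zone" equality is a stand-in for the spatial test.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, zone TEXT, field3 INTEGER);
    CREATE TABLE table2 (id INTEGER, zone TEXT);
    -- Invented sample data: polygon 1 contains two points, polygon 2 contains none.
    INSERT INTO table1 VALUES (1, 'north', NULL), (2, 'south', NULL);
    INSERT INTO table2 VALUES (10, 'north'), (11, 'north');
""")

count_sql = """
    SELECT table1.id, COUNT(table2.id) AS count1
    FROM table1 {join} table2 ON table1.zone = table2.zone
    GROUP BY table1.id
    ORDER BY table1.id
"""

# Inner join: polygons without points produce no row at all.
inner_counts = conn.execute(count_sql.format(join="JOIN")).fetchall()
# Left join: COUNT(table2.id) ignores the NULL padding, so empty polygons get 0.
left_counts = conn.execute(count_sql.format(join="LEFT JOIN")).fetchall()

print(inner_counts)
print(left_counts)
```

With the left join, the subquery already yields a 0 for polygons that contain no points, so a separate "set field3 = 0" step is not needed in that variant.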

refer to outside field value in subselect?

I want to do a query to update values that I forgot to copy over in a mass insert. However I'm not sure how to phrase it.
UPDATE table
SET text_field_1 = (SELECT text_field_2
FROM other_table
WHERE id = **current row in update statement, outside parens**.id )
How do I do this? It seems like a job for recursion.
Use:
UPDATE YOUR_TABLE
SET text_field_1 = (SELECT t.text_field_2
FROM other_table t
WHERE t.id = YOUR_TABLE.id)
Warning
If there's no supporting record in other_table, text_field_1 will be set to NULL.
Explanation
In standard SQL, you can't have table aliases on the table defined for the UPDATE (or DELETE) statement, so you need to use full table name to indicate the source of the column.
It's called a correlated subquery -- the correlation is because of the evaluation against the table from the outer query.
Clarification
MySQL (and SQL Server) support table aliases in UPDATE and DELETE statement, in addition to JOIN syntax:
UPDATE YOUR_TABLE a
JOIN OTHER_TABLE b ON b.id = a.id
SET a.text_field_1 = b.text_field_2
...is not identical to the provided query, because only the rows that match will be updated -- rows that don't match keep their text_field_1 values untouched. This is equivalent to the provided query:
UPDATE YOUR_TABLE a
LEFT JOIN OTHER_TABLE b ON b.id = a.id
SET a.text_field_1 = b.text_field_2
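The warning about unmatched rows can be reproduced with plain correlated subqueries. This is a minimal sketch, assuming SQLite via Python's sqlite3 module and invented sample rows; the WHERE EXISTS guard is one common way to get the "only matching rows" behaviour without JOIN syntax, not something taken from the answer itself.

```python
import sqlite3

# In-memory SQLite database (assumption: a stand-in for the asker's DBMS).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (id INTEGER, text_field_1 TEXT);
    CREATE TABLE other_table (id INTEGER, text_field_2 TEXT);
    -- Invented sample data: id 2 has no supporting record in other_table.
    INSERT INTO your_table VALUES (1, 'old 1'), (2, 'old 2');
    INSERT INTO other_table VALUES (1, 'new 1');
""")

update_sql = """
    UPDATE your_table
    SET text_field_1 = (SELECT t.text_field_2
                        FROM other_table t
                        WHERE t.id = your_table.id)
"""
# Hypothetical guard: restricts the update to rows that have a match.
guard_sql = " WHERE EXISTS (SELECT 1 FROM other_table t WHERE t.id = your_table.id)"
select_sql = "SELECT id, text_field_1 FROM your_table ORDER BY id"

# Guarded update: the unmatched row keeps its old value.
conn.execute(update_sql + guard_sql)
guarded = conn.execute(select_sql).fetchall()

# Unguarded update: the unmatched row is overwritten with NULL.
conn.execute(update_sql)
unguarded = conn.execute(select_sql).fetchall()

print(guarded)
print(unguarded)
```

The guarded form behaves like the inner JOIN variant, and the unguarded form behaves like the LEFT JOIN variant.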
If there is one ID field:
UPDATE updtable t1
SET t1.text_field_1 = (
SELECT t2.text_field_2
FROM seltable t2
WHERE t1.ID = t2.ID
)
;
UPDATE Table1, Table2
SET Table1.myField = Table2.SomeField
WHERE Table1.ID = Table2.ID
Note: I have not tried it.
This will only update records where IDs match.
Try this:
UPDATE table
SET text_field_1 = (SELECT text_field_2
FROM other_table
WHERE id = table.id )