I need to correct some data in a table (a correlation table was messed up, so I need to gather the correlation data and then update another table with it). To avoid touching the production data directly, I want to collect everything I need into a separate table (temptablecorrection) and then update the respective production DB tables from there.
The separate table already has some rows filled from a previous SQL statement (t2.field4).
MERGE INTO temptablecorrection t1
USING (
SELECT DISTINCT
cfield1,
cfield2,
cfield3
FROM
maintable
WHERE
ccreationtime BETWEEN 1656396000000 AND 1656550800000
) t2 ON ( t1.field3 = t2.field4 )
WHEN MATCHED THEN UPDATE
SET t1.field1 = t2.field1,
t1.field2 = t2.field2;
t1.field3 is unique in the timeframe (-> distinct count of field3 and the row count are the same).
t2.field4 is unique in the timeframe (-> distinct count of field4 and the row count are the same).
I get the following SQL error:
MERGE into dataUMCContrl t1 USING (select DISTINCT cfield1, cfield2,cfield3 from mainTable where ccrea...
ORA-30926: unable to get a stable set of rows in the source tables [SQL State=99999, DB Errorcode=30926]
1 statement failed.
I have no idea what the issue is. On my test system the SQL works without issues.
When I researched the error, the pointers I found all related to non-unique rows.
But as far as I can tell, t1.field3 and t2.field4 are unique.
In total the timeframe covers almost 74000 rows.
Any ideas to point me in the right direction?
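The uniqueness check referred to above boils down to something like this (a sketch; it assumes field4 is an actual column of maintable, which is what the ON clause implies):
SELECT COUNT(*) AS row_count,
       COUNT(DISTINCT field4) AS distinct_field4
FROM maintable
WHERE ccreationtime BETWEEN 1656396000000 AND 1656550800000;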
Since field4 is unique, use it in the SELECT statement in the USING clause. Also, how are you referencing field4 in the ON clause without having field4 in the SELECT statement?
MERGE INTO temptablecorrection t1
USING (
SELECT DISTINCT
cfield1,
cfield2,
cfield3,
field4,
rowid row_id
FROM
maintable
WHERE
ccreationtime BETWEEN 1656396000000 AND 1656550800000
) t2 ON ( t1.field3 = t2.field4 )
WHEN MATCHED THEN UPDATE
SET t1.field1 = t2.cfield1,
t1.field2 = t2.cfield2;
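If the error persists, the usual way to track down ORA-30926 is to look for join-key values that occur more than once in the USING subquery, for example (a sketch, reusing the names from the question):
SELECT field4, COUNT(*) AS cnt
FROM (
    SELECT DISTINCT cfield1, cfield2, cfield3, field4
    FROM maintable
    WHERE ccreationtime BETWEEN 1656396000000 AND 1656550800000
)
GROUP BY field4
HAVING COUNT(*) > 1;
Any row returned by this check identifies a field4 value that matches the same target row more than once.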
I found a very useful delete query that will delete duplicates based on specific columns:
DELETE FROM table USING table alias
WHERE table.field1 = alias.field1 AND table.field2 = alias.field2 AND
table.max_field < alias.max_field
How to delete duplicate entries?
However, is there an equivalent SELECT query that will allow filtering the same way? I was trying USING but had no success.
Thank you.
You can join your table with itself using the specific columns, field1 and field2, and then filter based on a comparison between max_field on both tables.
select t1.*
from mytable t1
join mytable t2 on (t1.field1 = t2.field1 and t1.field2 = t2.field2)
where t1.max_field < t2.max_field;
You will get all the duplicates whose max_field is not the greatest.
sqlfiddle here.
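If instead you want the rows that would survive (the greatest max_field per field1/field2 group), a NOT EXISTS variant of the same idea works; a sketch against the same table:
select t1.*
from mytable t1
where not exists (
  select 1
  from mytable t2
  where t2.field1 = t1.field1
    and t2.field2 = t1.field2
    and t2.max_field > t1.max_field
);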
For example, I have table1 with field1 and field2 and want to do something like:
UPDATE table1
SET field1, field2 = (SELECT field1, field2 FROM tableXYZ)
WHERE field3 = 'foobar'
or do I have to do multiple SETs, running the same SELECT query several times?
Assuming whatever database you are using supports it, you can join the tables. So:
Update table1
set field1 = tbx.field1,
    field2 = tbx.field2
from table1 join tablexyz tbx on --some key value join
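For example, in SQL Server, with an assumed key column id on both tables, the join form could look like this:
UPDATE t1
SET field1 = tbx.field1,
    field2 = tbx.field2
FROM table1 t1
JOIN tablexyz tbx ON tbx.id = t1.id;  -- 'id' is an assumed key column, not from the question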
You can do a tuple assignment by putting the columns on the left hand side between parentheses.
UPDATE table1
SET (column1, column2) = (SELECT col1, col2
FROM tableXYZ
WHERE ...)
WHERE column3 = 'foobar';
The above is standard SQL, but not every DBMS supports it.
Note that you have to use a WHERE clause in the sub-select to make sure it returns only a single row (you would typically make it a correlated sub-query).
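A correlated version could look like this (key_col is an assumed join column, not something named in the question):
UPDATE table1
SET (column1, column2) = (SELECT x.col1, x.col2
                          FROM tableXYZ x
                          WHERE x.key_col = table1.key_col)  -- key_col is an assumed join column
WHERE column3 = 'foobar';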
First, SQL is not my strength. So I need help with the following problem. I'll simplify the table contents to describe the problem.
Let's start with three tables: table1 with columns id_1 and value, table2 with columns id_2 and value, and table3 with columns id_3 and value. As you'll notice, a value column appears in all three tables, while the ids have different column names. Modifying column names is not an option because they are used by Java legacy code.
I need to set table3.value using table1.value or table2.value according to the fields table1.id_1, table2.id_2 and table3.id_3.
My last attempt, which describes what I try to do, is the following:
UPDATE table3
SET value=(IF ((SELECT COUNT(*) FROM table1 t1 WHERE t1.id_1=id_3) > 0)
SELECT value FROM table1 t1 WHERE t1.id_1=id_3
ELSE IF ((SELECT COUNT(*) FROM table2 t2 WHERE t2.id_2=id_3) > 0)
SELECT value FROM table2 t2 WHERE t2.id_2=id_3)
Here is some information about the tables and the update.
This update will be included in an XML file used by Liquibase.
It must work with Oracle or SQL Server.
An id from table3.id_3 can be found at most once in table1.id_1 or in table2.id_2, but not in both tables simultaneously.
If table3.id_3 is not found in table1.id_1 nor in table2.id_2, table3.value remains null.
As you can imagine, my last attempt failed. Specifically, the IF command was not recognized during the Liquibase update. If anyone has any ideas on how to deal with this, I'd appreciate it. Thanks in advance.
I don't know Oracle very well, but a SQL Server approach would be the following using COALESCE() and OUTER JOINs.
Update T3
Set Value = Coalesce(T1.Value, T2.Value)
From Table3 T3
Left Join Table2 T2 On T3.Id_3 = T2.Id_2
Left Join Table1 T1 On T3.Id_3 = T1.Id_1
The COALESCE() will return the first non-NULL value from the LEFT JOIN to tables 1 and 2, and if a record was not found in either, it would be set to NULL.
This is Siyual's UPDATE rewritten with the MERGE operator.
MERGE into table_1
USING (
SELECT COALESCE(t2.value, t3.value) as value, t1.id_1 as id
FROM table_1 t1, table_2 t2, table_3 t3
WHERE t2.id_2 = t3.id_3 and t1.id_1 = t2.id_2
) t on (table_1.id_1 = t.id)
WHEN MATCHED THEN
UPDATE SET table_1.value = t.value
This should work in Oracle.
In Oracle
UPDATE table3 t
SET value=COALESCE((SELECT value FROM table1 t1 WHERE t1.id_1=t.id_3),
(SELECT value FROM table2 t2 WHERE t2.id_2=t.id_3))
Given your assumption #3, you can use union all to put together tables 1 and 2 without running the risk of duplicating information (at least for the id's of interest). So a simple merge solution like the one below should work (in all DB products that implement the merge operation).
merge into table3
using (
select id_2 as id, value from table2
union all
select id_1, value from table1
) t
on (table3.id_3 = t.id)
when matched
then update set table3.value = t.value;
You may want to test the various solutions and see which is most effective for your specific tables.
(Note: merge should be more efficient than the update solution using coalesce, at least when relatively few of the id's in table3 have a match in the other tables. This is because the update solution will re-insert NULL where NULL was already stored when there is no match. The merge solution avoids this unnecessary activity.)
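If you prefer to stay with the UPDATE form, those unnecessary writes can be avoided by restricting it to rows that actually have a match; a sketch in Oracle syntax, mirroring the UPDATE above:
UPDATE table3 t
SET value = COALESCE((SELECT value FROM table1 t1 WHERE t1.id_1 = t.id_3),
                     (SELECT value FROM table2 t2 WHERE t2.id_2 = t.id_3))
WHERE EXISTS (SELECT 1 FROM table1 t1 WHERE t1.id_1 = t.id_3)
   OR EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id_2 = t.id_3);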
I am trying to write the following MySQL query in PostgreSQL 8.0 (specifically, using Redshift):
DELETE t1 FROM table t1
LEFT JOIN table t2 ON (
t1.field = t2.field AND
t1.field2 = t2.field2
)
WHERE t1.field > 0
PostgreSQL 8.0 does not support DELETE FROM table USING. The examples in the docs say that you can reference columns in other tables in the where clause, but that doesn't work here as I'm joining on the same table I'm deleting from. The other example is a subselect query, but the primary key of the table I'm working with has four columns so I can't see a way to make that work either.
Amazon Redshift was forked from Postgres 8.0, but it is a very different beast. The manual states that the USING clause is supported in DELETE statements:
Just use the modern form:
DELETE FROM tbl
USING tbl t2
WHERE t2.field = tbl.field
AND t2.field2 = tbl.field2
AND t2.pkey <> tbl.pkey -- exclude self-join
AND tbl.field > 0;
This assumes JOIN instead of the LEFT JOIN in your MySQL statement, which would not make any sense there. I also added the condition AND t2.pkey <> tbl.pkey to make it a useful query; this excludes rows joining themselves, pkey being the primary key column.
What this query does:
Delete all rows for which at least one other row with the same non-null values in field and field2 exists in the same table. All such duplicates are deleted, without keeping even a single row per set.
To keep (for example) the row with the smallest pkey per set of duplicates, use t2.pkey < tbl.pkey instead.
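Spelled out, that keep-the-smallest variant would be (same assumptions about pkey as above):
DELETE FROM tbl
USING tbl t2
WHERE t2.field = tbl.field
AND t2.field2 = tbl.field2
AND t2.pkey < tbl.pkey  -- only a row with a smaller pkey counts as a witness
AND tbl.field > 0;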
An EXISTS semi-join (as #wilplasser already hinted) might be a better choice, especially if multiple rows could be joined (a row can only be deleted once anyway):
DELETE FROM tbl
WHERE field > 0
AND EXISTS (
SELECT 1
FROM tbl t2
WHERE t2.field = tbl.field
AND t2.field2 = tbl.field2
AND t2.pkey <> tbl.pkey
);
I don't understand the MySQL syntax, but you probably want this:
DELETE FROM mytable t1
WHERE t1.field > 0
-- don't need this self-join if {field,field2}
-- are a candidate key for mytable
-- (in that case, the exists-subquery would detect _exactly_ the
-- same tuples as the ones to be deleted, which always succeeds)
-- AND EXISTS (
-- SELECT *
-- FROM mytable t2
-- WHERE t1.field = t2.field
-- AND t1.field2 = t2.field2
-- )
;
Note: For testing purposes, you can replace the DELETE keyword by SELECT * or SELECT COUNT(*), and see which rows would be affected by the query.
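For instance, a dry run of the statement above (with the EXISTS part still commented out) boils down to:
-- shows how many rows the DELETE would affect
SELECT COUNT(*)
FROM mytable t1
WHERE t1.field > 0;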
I am trying to count the number of points (stored in table2) that fall within each polygon of table1. The query works, but I have tried to alter it so that the generated values are added to a blank column in table1.
So far it only works by appending the results to the bottom of the table. Any help? To summarise, I am trying to take the values generated by this query and add them to table1. At the moment the query inserts them into the blank column in table1, but they are not matched against the ID; they are simply appended at the bottom.
INSERT INTO table1(field3)
SELECT COUNT(table2.id) AS count1
FROM table1 LEFT JOIN table2
ON ST_Contains(table1.geom,table2.geom)
GROUP BY table1.id;
The only change I made here was to switch your left join to an inner join. In the case where a geometry in table1 contains no geometries in table2, the value of field3 will stay null, so you might want to start by doing an "update table1 set field3 = 0" first (it can turn out to be a bit faster doing that in two steps depending on how many features you have and how many points each geometry has).
update table1 a
set field3 = b.count1
from
(
SELECT table1.id,
COUNT(table2.id) AS count1
FROM table1
JOIN table2
ON ST_Contains(table1.geom,table2.geom)
GROUP BY table1.id
) b
where a.id = b.id
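The two-step variant mentioned above would simply prepend the zero initialisation (a sketch):
-- step 1: default every polygon to zero points
update table1 set field3 = 0;
-- step 2: run the inner-join update above to overwrite the counts for polygons that contain points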
Alternative:
update table1 a
set field3 = b.count1
from
(
SELECT table1.id,
COUNT(table2.id) AS count1
FROM table1
left JOIN table2
ON ST_Contains(table1.geom,table2.geom)
GROUP BY table1.id
) b
where a.id = b.id
Edit: I'm starting to doubt myself with regard to the two-step approach I first posted. I think my claim about the performance was almost entirely wrong, so I included an alternative query.