Best way to update/insert into a table based on a remote table - sql

I have two very large enterprise tables in an Oracle 10g database. One table keeps the historical information of the other. The problem is that there are now so many records that my insert/update takes too long and my session gets killed by the governor.
Here's a pseudocode of my update process:
sqlsel := 'SELECT col1, col2, col3, col4, sysdate
           FROM table2@remote_location dpi
           WHERE (col1, col2, col3) IN
                 (
                   SELECT col1, col2, col3
                   FROM table2@remote_location
                   MINUS
                   SELECT DISTINCT col1, col2, col3
                   FROM table1 mpc
                   WHERE facility = '''||load_facility||'''
                 )';
EXECUTE IMMEDIATE sqlsel BULK COLLECT
INTO table1;
I've tried the MERGE statement:
MERGE INTO table1 t1
USING (
    SELECT col1, col2, col3 FROM table2@remote_location
) t2
ON (
    t1.col1 = t2.col1 AND
    t1.col2 = t2.col2 AND
    t1.col3 = t2.col3
)
WHEN NOT MATCHED THEN
    INSERT (t1.col1, t1.col2, t1.col3, t1.update_dttm)
    VALUES (t2.col1, t2.col2, t2.col3, sysdate);
But there appears to be a confirmed bug in versions prior to Oracle 10.2.0.4 when a MERGE statement uses a remote database. The chance of getting an enterprise-wide upgrade is slim, so is there a way to further optimize my first query, or to rewrite it so it performs better?
Thanks.

Have you looked at Materialized Views to perform your sync? A pretty good intro can be found at Ask Anantha. This Oracle white paper is good, too.
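For illustration, a minimal sketch of that approach, using the table and database-link names from the question and assuming a materialized view log exists on the remote table so FAST refresh is possible (use REFRESH COMPLETE otherwise):
CREATE MATERIALIZED VIEW table2_mv
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
AS
  SELECT col1, col2, col3, col4
  FROM table2@remote_location;

-- Periodic sync, e.g. from a scheduler job:
BEGIN
  DBMS_MVIEW.REFRESH('TABLE2_MV');
END;
/
The insert/merge into table1 can then run against the local table2_mv instead of the remote table, which also sidesteps the remote MERGE bug mentioned above.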

If there are duplicate col1/col2/col3 entries in table2@remote_location, your query will return them. If they are not needed, you could do:
SELECT col1, col2, col3, sysdate
FROM (
      SELECT col1, col2, col3
      FROM table2@remote_location
      MINUS
      SELECT col1, col2, col3
      FROM table1 mpc
      WHERE facility = '''||load_facility||'''
     )
You can get rid of the DISTINCT too; MINUS is a set operation that already returns distinct rows, so it is unnecessary.
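If the goal is just to add the missing rows, the cleaned-up query can also be run as one set-based insert instead of bulk-collecting into a collection. A sketch, with the column list and the update_dttm column assumed from the question (load_facility is a PL/SQL variable, so the statement no longer needs to be dynamic):
INSERT INTO table1 (col1, col2, col3, update_dttm)
SELECT col1, col2, col3, SYSDATE
FROM (
      SELECT col1, col2, col3
      FROM table2@remote_location
      MINUS
      SELECT col1, col2, col3
      FROM table1
      WHERE facility = load_facility
     );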

Related

How Can I Use the Max Function to Filter a List and Insert Into

Is there a way to do something like:
Insert Into (col1, col2, col3)
Select col1, col2, col3, max(col4)
From mytable
Group By col1, col2, col3
That gives me: The select list for the INSERT statement contains more items than the insert list.
I want to use the max function to filter out dupes but when I select this extra field, the order of fields and number of fields doesn’t match up. How can I filter a list from a table, use the max function, and insert all records except the ones in the max field?
I want to use the max function to filter out dupes
Well, I suspect that you actually want distinct:
insert into my_target_table(col1, col2, col3)
select distinct col1, col2, col3 from my_source_table
This will insert one record in the target table for each distinct (col1, col2, col3) tuple in the source table.
You are describing something like this:
insert into my_target_table(col1, col2, col3)
select col1, col2, col3
from mytable t
where t.col4 = (select max(t2.col4)
                from mytable t2
                where t2.col1 = t.col1 and t2.col2 = t.col2 and t2.col3 = t.col3
               );
However, this is pretty much equivalent to select distinct (NULL values might be treated differently). You probably want dupes defined on only one column, so I'm thinking:
insert into my_target_table(col1, col2, col3)
select col1, col2, col3
from mytable t
where t.col4 = (select max(t2.col4)
                from mytable t2
                where t2.col1 = t.col1
               );

using partition by clause in delete statement postgresql

I am trying to debug the code below. It throws an error: ERROR: syntax error at or near "(".
My aim is to delete the duplicate records in the table:
delete FROM (SELECT *,
ROW_NUMBER() OVER (partition BY snapshot,col1,col2,col3,col4,col5) AS rnum
FROM table where snapshot='2019-08-31') as t
WHERE t.rnum > 1;
Try something like this:
DELETE FROM table a
WHERE a.ctid <> (SELECT min(b.ctid)
                 FROM table b
                 WHERE a.snapshot = b.snapshot
                   AND a.col1 = b.col1 AND a.col2 = b.col2
                   AND a.col3 = b.col3 AND a.col4 = b.col4 AND a.col5 = b.col5);
Postgres does not allow deleting from a subquery, although you can join in other tables. In this case, though, I think a correlated subquery is sufficient, assuming you have a unique id of some sort:
delete from t
where snapshot = '2019-08-31' and
      id > (select min(id)
            from t t2
            where t2.snapshot = t.snapshot and
                  t2.col1 = t.col1 and
                  t2.col2 = t.col2 and
                  t2.col3 = t.col3 and
                  t2.col4 = t.col4 and
                  t2.col5 = t.col5
           );
Note: This also assumes that the columns are not NULL. You can replace = with is not distinct from if NULLs are a possibility.
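For example, the NULL-safe version of the correlated comparison would look like this (same assumed id column):
delete from t
where snapshot = '2019-08-31' and
      id > (select min(id)
            from t t2
            where t2.snapshot = t.snapshot and
                  t2.col1 is not distinct from t.col1 and
                  t2.col2 is not distinct from t.col2 and
                  t2.col3 is not distinct from t.col3 and
                  t2.col4 is not distinct from t.col4 and
                  t2.col5 is not distinct from t.col5
           );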
If you have lots of duplicates and no identity column, you might find it simpler to remove and re-insert the data:
create table temp_snapshot as
select distinct on (col1, col2, col3, col4, col5) t.*
from t
where snapshot = '2019-08-31'
order by col1, col2, col3, col4, col5;
delete from t
where snapshot = '2019-08-31';
insert into t
select *
from temp_snapshot;
If your table is partitioned by snapshot (possibly a very good idea), then you can drop the partition instead and then add the data back in. That process is typically faster than deleting records.
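A sketch of that, assuming declarative range partitioning by snapshot with one partition per day (the partition name and bounds are assumptions):
-- keep the de-duplicated rows
create table temp_snapshot as
select distinct on (col1, col2, col3, col4, col5) t.*
from t
where snapshot = '2019-08-31'
order by col1, col2, col3, col4, col5;

-- drop the whole partition instead of deleting row by row
alter table t detach partition t_2019_08_31;
drop table t_2019_08_31;

-- recreate the partition and put the clean data back
create table t_2019_08_31 partition of t
    for values from ('2019-08-31') to ('2019-09-01');

insert into t
select * from temp_snapshot;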

Loop variable SQL query to append tables

I have a database that has 100+ tables, all with the same header. I want to merge these tables into one. Also within the database is a table that lists all the other tables (an inventory of the database per se).
I'm looking for a way to loop the following SQL append query so VaryingTableName changes to follow through my inventory table:
INSERT INTO MainTable IN 'C:\newDBFile.accdb'
SELECT VaryingTableName.*
FROM VaryingTableName;
If there were a way to do this without the inventory table, that's fine too.
It's not the prettiest solution and involves no automation, but you could do:
INSERT INTO MainTable (Col1, Col2, Col3, Col4) IN 'C:\newDBFile.accdb'
SELECT Col1, Col2, Col3, Col4
FROM (
SELECT Col1, Col2, Col3, Col4
FROM OldTable1
UNION ALL
SELECT Col1, Col2, Col3, Col4
FROM OldTable2
...)
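If you do want to drive it from the inventory table without resorting to code, one low-tech option is to let a query generate the SQL text for you. A sketch, assuming the inventory table is called TableList with a TableName column (both names assumed):
SELECT 'SELECT Col1, Col2, Col3, Col4 FROM [' & TableName & '] UNION ALL' AS SqlFragment
FROM TableList;
Copy the resulting rows into the append query above, remove the trailing UNION ALL, and run it.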

Oracle: Insert into a select of a table

What is the advantage of inserting into a select of a table over simply inserting into the table?
For example:
insert into
  ( select COL1
         , COL2
      from Table1
     where 1=2   -- this and above is the focus of the question
  )
select COL3, COL4 from Table2;
It seems to do the same thing as:
insert into Table1
( COL1, COL2 )
select COL3, COL4 from Table2 ;
This is the first time I've seen this; our Sr Dev says there is some advantage but he can't remember what it is.
It might make sense if one were inserting a "select * ..." from a table with lots of columns and wanted to be lazy, but... we're not. We're enumerating each column in the table.
Database is Oracle 11gR2, but this query was written probably in 10g or before.
we want to be lazy
No, we use insert into table(col1, col2) select col1, col2 from ... when there are a lot of records (for example 1M) and we don't want to write a values clause for each one. Imagine how much time it would take if you wrote:
insert into table (col1, col2)
  select col1, col2 from (select col1, col2, rownum rn from ...) where rn = 1;
insert into table (col1, col2)
  select col1, col2 from (select col1, col2, rownum rn from ...) where rn = 2;
...
insert into table (col1, col2)
  select col1, col2 from (select col1, col2, rownum rn from ...) where rn = 1000000;
insert ... select is a faster way to copy data from one table (or several tables) into another table.
In a nutshell, it's a lot easier, especially when you have a massive query that you don't want to rebuild, or a large number of objects or values you are inserting.
Without WITH CHECK OPTION specified, I don't know of any purpose for this syntax. If you specify WITH CHECK OPTION, you can effectively implement an ad-hoc check constraint within your insert statement.
insert into
  ( select COL1
         , COL2
      from Table1
     where 1=2 WITH CHECK OPTION
  )
select COL3, COL4 from Table2;
The above will never insert a record, because 1 will never equal 2.
The statement below will insert a record as long as COL3 is less than 100, otherwise an exception is raised.
insert into
  ( select COL1
         , COL2
      from Table1
     where COL1 < 100 WITH CHECK OPTION
  )
select COL3, COL4 from Table2;

Updating records in a specific order

I am trying to update all records in my table. As I read through the records I need to update a column in the current record with a value from the NEXT record in the set. The catch is the updates need to be done in a specified order.
I was thinking of something like this ...
Update t1
Set col1 = (select LEAD(col2,1) OVER (ORDER BY col3, col4, col5)
from t1);
This doesn't compile, but you see what I'm driving at... any ideas?
Update:
This piece does run successfully but writes only NULLs:
Update t1 A
Set t1.col1 = (select LEAD(col2,1) OVER (ORDER BY col3, col4, col5)
from t1 B
where A.col3 = B.col3 AND
A.col4 = B.col4 AND
A.col5 = B.col5);
This should do it:
merge into t1
using
(
  select rowid as rid,
         LEAD(col2,1) OVER (ORDER BY col3, col4, col5) as ld
  from t1
) lv on ( lv.rid = t1.rowid )
when matched then
  update set col1 = lv.ld;
Not 100% sure if I got the syntax completely right, but as you didn't supply any test data, I'll leave potential syntax errors for you to fix.
You can also replace the usage of rowid with the real primary key columns of your table.
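For example, if the table had a single-column primary key (here assumed to be called id), the same merge would look like:
merge into t1
using
(
  select id,
         lead(col2, 1) over (order by col3, col4, col5) as ld
  from t1
) lv on ( lv.id = t1.id )
when matched then
  update set col1 = lv.ld;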
Why not use a cursor? You can do the update within a cursor loop in the specified order, as sketched below.
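A minimal sketch of that approach in PL/SQL (column names taken from the question; this processes one row at a time, so it will be slower than the single merge above):
DECLARE
  CURSOR c IS
    SELECT rowid AS rid,
           LEAD(col2, 1) OVER (ORDER BY col3, col4, col5) AS next_val
    FROM t1
    ORDER BY col3, col4, col5;
BEGIN
  FOR r IN c LOOP
    -- rows arrive in the specified order; update each with the next row's col2
    UPDATE t1
    SET col1 = r.next_val
    WHERE rowid = r.rid;
  END LOOP;
  COMMIT;
END;
/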
You can do this using the with statement:
with toupdate as (
      select t1.*,
             lead(col2, 1) over (order by col3, col4, col5) as nextval
      from t1
     )
update toupdate
set col1 = nextval;
By the way, this does not guarantee the ordering of the updates. However, col2 is only read and never updated here, so it should do the right thing regardless of the order in which rows are updated.
The above syntax works in SQL Server, but not in Oracle. The original question did not specify the database (and lead() is a valid function in SQL Server 2012). For Oracle, the MERGE statement shown above seems to be the way to apply the values from the subquery.