Updating records in a specific order - sql

I am trying to update all records in my table. As I read through the records I need to update a column in the current record with a value from the NEXT record in the set. The catch is the updates need to be done in a specified order.
I was thinking of something like this ...
Update t1
Set col1 = (select LEAD(col2,1) OVER (ORDER BY col3, col4, col5)
from t1);
This doesn't compile, but you see what I'm driving at ... any ideas?
... update
This piece does run successfully but writes only NULLs:
Update t1 A
Set A.col1 = (select LEAD(col2,1) OVER (ORDER BY col3, col4, col5)
              from t1 B
              where A.col3 = B.col3 AND
                    A.col4 = B.col4 AND
                    A.col5 = B.col5);

This should do it:
merge into t1
using
(
select rowid as rid,
LEAD(col2,1) OVER (ORDER BY col3, col4, col5) as ld
from t1
) lv on ( lv.rid = t1.rowid )
when matched then
update set col1 = lv.ld;
Not 100% sure if I got the syntax completely right, but as you didn't supply any test data, I'll leave potential syntax errors for you to fix.
You can also replace the usage of rowid with the real primary key columns of your table.
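For example, with a hypothetical primary key column pk, the same statement might look like this (a sketch along the same lines, untested):
merge into t1
using
(
select pk,
       LEAD(col2,1) OVER (ORDER BY col3, col4, col5) as ld
from t1
) lv on ( lv.pk = t1.pk )
when matched then
update set col1 = lv.ld;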

Why don't you use a cursor? You can run the update inside a cursor loop that fetches the rows in the specified order.
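For example, a minimal PL/SQL sketch of that idea, assuming Oracle and the t1 columns from the question (untested):
DECLARE
  CURSOR c IS
    SELECT rowid AS rid,
           LEAD(col2, 1) OVER (ORDER BY col3, col4, col5) AS nextval
      FROM t1
     ORDER BY col3, col4, col5;
BEGIN
  FOR r IN c LOOP
    -- lead() is computed over the whole ordered set before the loop runs,
    -- so each row gets col2 from the next row in the specified order
    UPDATE t1
       SET col1 = r.nextval
     WHERE rowid = r.rid;
  END LOOP;
  COMMIT;
END;
/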

You can do this using the with statement:
with toupdate as (
select t1.*,
lead(col2, 1) over (order by col3, col4, col5) as nextval
from t1
)
update toupdate
set col1 = nextval;
By the way, this does not guarantee the ordering of the updates. However, the lead() values are all computed over the ordered set before any row is modified, so it should do the right thing.
The above syntax works in SQL Server, but not in Oracle. The original question did not specify the database (and lead() is a valid function in SQL Server 2012). In Oracle, the merge statement shown above seems to be the way to apply the values from the subquery.

Related

using partition by clause in delete statement postgresql

I am trying to debug the below code. It throws an error saying ERROR: syntax error at or near "(".
My aim is to delete duplicate records in the table:
delete FROM (SELECT *,
ROW_NUMBER() OVER (partition BY snapshot,col1,col2,col3,col4,col5) AS rnum
FROM table where snapshot='2019-08-31') as t
WHERE t.rnum > 1;
Try like below:
DELETE FROM table a
WHERE a.ctid <> (SELECT min(b.ctid)
FROM table b
WHERE a.snapshot = b.snapshot
and a.col1=b.col1 and a.col2=b.col2);
Postgres does not allow deleting from subqueries. You can join in other tables. But in this case, I think a correlated subquery is sufficient, assuming you have a unique id of some sort:
delete from t
where snapshot = '2019-08-31' and
id > (select min(id)
from t t2
where t2.snapshot = t.snapshot and
t2.col1 = t.col1 and
t2.col2 = t.col2 and
t2.col3 = t.col3 and
t2.col4 = t.col4 and
t2.col5 = t.col5
);
Note: This also assumes that the columns are not NULL. You can replace = with is not distinct from if NULLs are a possibility.
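For instance, the same delete with the NULL-safe comparison swapped in (same assumed columns as above):
delete from t
where snapshot = '2019-08-31' and
      id > (select min(id)
            from t t2
            where t2.snapshot is not distinct from t.snapshot and
                  t2.col1 is not distinct from t.col1 and
                  t2.col2 is not distinct from t.col2 and
                  t2.col3 is not distinct from t.col3 and
                  t2.col4 is not distinct from t.col4 and
                  t2.col5 is not distinct from t.col5
           );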
If you have lots of duplicates and no identity column, you might find it simpler to remove and re-insert the data:
create table temp_snapshot as
select distinct on (col1, col2, col3, col4, col5) t.*
from t
where snapshot = '2019-08-31'
order by col1, col2, col3, col4, col5;
delete from t
where snapshot = '2019-08-31';
insert into t
select *
from temp_snapshot;
If your table is partitioned by snapshot (possibly a very good idea), then you can drop the partition instead and then add the data back in. That process is typically faster than deleting records.
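A sketch of the drop-partition variant, assuming declarative list partitioning by snapshot (Postgres 10+); the partition name t_2019_08_31 is hypothetical:
-- detach and drop the partition holding the duplicated snapshot
alter table t detach partition t_2019_08_31;
drop table t_2019_08_31;
-- recreate an empty partition for that snapshot, then reload it
create table t_2019_08_31 partition of t
  for values in ('2019-08-31');
insert into t
select *
from temp_snapshot;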

Oracle: Insert into select...

What is the advantage of inserting into a select of a table over simply inserting into the table?
eg
insert into
( select COL1
, COL2
from Table1
where 1=2 -- <= this and above is the focus of the question
) select COL3, COL4 from Table2 ;
It seems to do the same thing as:
insert into Table1
( COL1, COL2 )
select COL3, COL4 from Table2 ;
This is the first time I've seen this; our Sr Dev says there is some advantage but he can't remember what it is.
It may make sense in a way if one was inserting a "select *..." from a table with lots of columns, and we want to be lazy, but... we're not. We're enumerating each column in the table.
Database is Oracle 11gR2, but this query was written probably in 10g or before.
we want to be lazy
No, we use insert into table(col1, col2) select col1, col2 from ... when there are a lot of records (for example 1M) and we don't want to write out a values clause for each one. Imagine how long it would take if you wrote
insert into table (col1, col2)
values (select col1, col2 from (select col1, col2, rownum rn from ...) where rn = 1);
insert into table (col1, col2)
values (select col1, col2 from (select col1, col2, rownum rn from ...) where rn = 2);
...
insert into table (col1, col2)
values (select col1, col2 from (select col1, col2, rownum rn from ...) where rn = 1000000);
insert ... select is a faster way of copying data from one table (or several tables) into another table.
In a nutshell: it's a lot easier, especially when you have a massive query that you don't want to rebuild, or when you have a huge number of objects or values to insert.
Without WITH CHECK OPTION specified, I don't know of any purpose for this syntax. If you specify WITH CHECK OPTION, you can effectively implement an ad-hoc check constraint within your insert statement.
insert into
( select COL1
, COL2
from Table1
where 1=2 WITH CHECK OPTION
) select COL3, COL4 from Table2 ;
The above will never insert a record, because 1 will never equal 2.
The statement below will insert a record as long as COL3 is less than 100, otherwise an exception is raised.
insert into
( select COL1
, COL2
from Table1
where COL1 < 100 WITH CHECK OPTION
) select COL3, COL4 from Table2 ;
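For example, a hypothetical single-row insert that violates the predicate (the values are made up; 150 fails the COL1 < 100 check, so Oracle rejects the row):
insert into
( select COL1
, COL2
from Table1
where COL1 < 100 WITH CHECK OPTION
) values (150, 'x');
-- raises ORA-01402: view WITH CHECK OPTION where-clause violation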

Efficiently duplicate some rows in PostgreSQL table

I have PostgreSQL 9 database that uses auto-incrementing integers as primary keys. I want to duplicate some of the rows in a table (based on some filter criteria), while changing one or two values, i.e. copy all column values, except for the ID (which is auto-generated) and possibly another column.
However, I also want to get the mapping from old to new IDs. Is there a better way to do it then just querying for the rows to copy first and then inserting new rows one at a time?
Essentially I want to do something like this:
INSERT INTO my_table (col1, col2, col3)
SELECT col1, 'new col2 value', col3
FROM my_table old
WHERE old.some_criteria = 'something'
RETURNING old.id, id;
However, this fails with ERROR: missing FROM-clause entry for table "old" and I can see why: Postgres must be doing the SELECT first and then inserting it, and the RETURNING clause only has access to the newly inserted rows.
RETURNING can only refer to the columns in the final, inserted row. You cannot refer to the "OLD" id this way unless there is a column in the table to hold both it and the new id.
Try running this which should work and will show all the possible values that you can get via RETURNING:
INSERT INTO my_table (col1, col2, col3)
SELECT col1, 'new col2 value', col3
FROM my_table AS old
WHERE old.some_criteria = 'something'
RETURNING *;
It won't get you the behavior you want, but should illustrate better how RETURNING is designed to work.
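One workaround in that spirit is a scratch column that carries the old id through the insert (the old_id column is hypothetical, not part of the original schema):
-- assumes you can alter the table; integer matches the auto-incrementing key
ALTER TABLE my_table ADD COLUMN old_id integer;

INSERT INTO my_table (old_id, col1, col2, col3)
SELECT id, col1, 'new col2 value', col3
FROM my_table src
WHERE src.some_criteria = 'something'
RETURNING old_id, id;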
This can be done with the help of data-modifying CTEs (Postgres 9.1+):
WITH sel AS (
SELECT id, col1, col3
, row_number() OVER (ORDER BY id) AS rn -- order any way you like
FROM my_table
WHERE some_criteria = 'something'
ORDER BY id -- match order or row_number()
)
, ins AS (
INSERT INTO my_table (col1, col2, col3)
SELECT col1, 'new col2 value', col3
FROM sel
ORDER BY id -- redundant to be sure
RETURNING id
)
SELECT s.id AS old_id, i.id AS new_id
FROM (SELECT id, row_number() OVER (ORDER BY id) AS rn FROM ins) i
JOIN sel s USING (rn);
SQL Fiddle demonstration.
This relies on the undocumented implementation detail that rows from a SELECT are inserted in the order provided (and returned in the order provided). It works in all current versions of Postgres and is not going to break. Related:
Does Postgres preserve insertion order of records?
Window functions are not allowed in the RETURNING clause, so I apply row_number() in another subquery.
More explanation in this related later answer:
INSERT INTO ... FROM SELECT ... RETURNING id mappings
Good! I tested this code, but I changed
(FROM my_table AS old) to (FROM my_table) and
(WHERE old.some_criteria = 'something') to (WHERE some_criteria = 'something').
This is the final code that I use:
INSERT INTO my_table (col1, col2, col3)
SELECT col1, 'new col2 value', col3
FROM my_table
WHERE some_criteria = 'something'
RETURNING *;
Thanks!
DROP TABLE IF EXISTS tmptable;
CREATE TEMPORARY TABLE tmptable as SELECT * FROM products WHERE id = 100;
UPDATE tmptable SET id = sbq.id from (select max(id)+1 as id from products) as sbq;
INSERT INTO products (SELECT * FROM tmptable);
DROP TABLE IF EXISTS tmptable;
Add another update before the insert to modify another field:
UPDATE tmptable SET another = 'data';
'old' is a reserved word, used by the rule rewrite system.
[ I presume this query fragment is not part of a rule; in that case you would have phrased the question differently ]

sql insert into table from select without duplicates (need more than a DISTINCT)

I am selecting multiple rows and inserting them into another table. I want to make sure that a row doesn't already exist in the table I am inserting into.
DISTINCT works when there are duplicate rows in the select, but not when comparing the select against the data already in the target table.
If I selected one row at a time I could do an IF EXISTS check, but since it's multiple rows (sometimes 10+) that doesn't seem workable.
INSERT INTO target_table (col1, col2, col3)
SELECT DISTINCT st.col1, st.col2, st.col3
FROM source_table st
WHERE NOT EXISTS (SELECT 1
FROM target_table t2
WHERE t2.col1 = st.col1
AND t2.col2 = st.col2
AND t2.col3 = st.col3)
If the distinct should only be on certain columns (e.g. col1, col2) but you need to insert all column, you will probably need some derived table (ANSI SQL):
INSERT INTO target_table (col1, col2, col3)
SELECT st.col1, st.col2, st.col3
FROM (
SELECT col1,
col2,
col3,
row_number() over (partition by col1, col2 order by col1, col2) as rn
FROM source_table
) st
WHERE st.rn = 1
AND NOT EXISTS (SELECT 1
FROM target_table t2
WHERE t2.col1 = st.col1
AND t2.col2 = st.col2)
If you already have a unique index on whatever fields need to be unique in the destination table, you can just use INSERT IGNORE (here's the official documentation - the relevant bit is toward the end), and have MySQL throw away the duplicates for you.
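A minimal sketch of that approach, assuming MySQL and that the three columns should be unique together (the index name is made up):
-- once the unique index exists, INSERT IGNORE silently skips duplicates
CREATE UNIQUE INDEX ux_target_cols ON target_table (col1, col2, col3);

INSERT IGNORE INTO target_table (col1, col2, col3)
SELECT col1, col2, col3
FROM source_table;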
Hope this helps!
So you're looking to retrieve all unique rows from source table which do not already exist in target table?
SELECT DISTINCT * FROM source
WHERE primaryKey NOT IN (SELECT primaryKey FROM target)
That's assuming you have a primary key which you can base the uniqueness on... otherwise, you'll have to check each column for uniqueness.
Pseudo code for what might work:
insert into <target_table> select col1 etc
from <source_table>
where <source_table>.keycol not in
(select <target_table>.keycol from <target_table>)
There are a few MSDN articles out there about this, but by far this one is the best:
http://msdn.microsoft.com/en-us/library/ms162773.aspx
They made it real easy to implement and my problem is now fixed.

Best way to update/insert into a table based on a remote table

I have two very large enterprise tables in an Oracle 10g database. One table keeps the historical information of the other table. The problem is, I'm getting to the point where there are just too many records: my insert/update takes too long and my session gets killed by the governor.
Here's a pseudocode of my update process:
sqlsel := 'SELECT col1, col2, col3, col4, sysdate
FROM table2@remote_location dpi
WHERE (col1, col2, col3) IN
(
SELECT col1, col2, col3
FROM table2@remote_location
MINUS
SELECT DISTINCT col1, col2, col3
FROM table1 mpc
WHERE facility = '''||load_facility||'''
)';
EXECUTE IMMEDIATE sqlsel BULK COLLECT
INTO table1;
I've tried the MERGE statement:
MERGE INTO table1 t1
USING (
SELECT col1, col2, col3 FROM table2@remote_location
) t2
ON (
t1.col1 = t2.col1 AND
t1.col2 = t2.col2 AND
t1.col3 = t2.col3
)
WHEN NOT MATCHED THEN
INSERT (t1.col1, t1.col2, t1.col3, t1.update_dttm )
VALUES (t2.col1, t2.col2, t2.col3, sysdate )
But there seems to be a confirmed bug in versions prior to Oracle 10.2.0.4 affecting the merge statement when merging across a database link. The chance of getting an enterprise upgrade is slim, so is there a way to further optimize my first query, or to write it another way so that it performs better?
Thanks.
Have you looked at Materialized Views to perform your sync? A pretty good intro can be found at Ask Anantha. This Oracle white paper is good, too.
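A sketch of that idea, assuming a db link named remote_location; the view name and refresh interval are made up:
CREATE MATERIALIZED VIEW table2_mv
REFRESH COMPLETE
START WITH SYSDATE NEXT SYSDATE + 1/24 -- complete refresh every hour
AS
SELECT col1, col2, col3
FROM table2@remote_location;
The insert/update can then run against the local table2_mv instead of reaching across the link each time.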
If there are duplicate col1/col2/col3 entries in table2@remote_location, then your query will return them. If they are not needed, then you could do a
SELECT col1, col2, col3, sysdate
FROM (
SELECT col1, col2, col3
FROM table2@remote_location
MINUS
SELECT col1, col2, col3
FROM table1 mpc
WHERE facility = '''||load_facility||'''
)
You can get rid of the DISTINCT too: MINUS is a set operation that already removes duplicates, so the DISTINCT is unnecessary.
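A tiny illustration of that point, runnable as-is in Oracle:
-- returns a single row (1): MINUS removes duplicates from its result,
-- even though the first branch produces the value 1 twice
select 1 from dual union all select 1 from dual
minus
select 2 from dual;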