In an Oracle database (11gR2), I have a table my_table with columns (sequence, col1, col2, col3). I want to insert values into the table that are queried from other tables, i.e. insert into my_table select <query from other tables>. The problem is that the primary key spans all four columns, so I need to add a sequence value starting from 0 up to the count of the rows to be inserted (order is not a problem).
I tried using a loop like this:
DECLARE
  j NUMBER;
  r_count NUMBER;
BEGIN
  SELECT COUNT(1) INTO r_count FROM <my query to be inserted>;
  FOR j IN 0 .. r_count
  LOOP
    INSERT INTO my_table
    SELECT <my query, incorporating r_count as sequence column>;
  END LOOP;
END;
But it didn't work: it looped r_count times and tried to insert the entire result set on every iteration, which is what the loop logically does. How can I achieve the expected result and add a sequence column to the inserted rows?
Don't do this in a loop. Just use row_number():
INSERT INTO my_table(seq, . . .)
select row_number() over (order by NULL) - 1, . . .
from . . .;
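For example, applied to the table from the question it might look like this (a sketch only; other_table stands in for whatever your real source query is, and the sequence column is assumed to be named seq):
INSERT INTO my_table (seq, col1, col2, col3)
SELECT ROW_NUMBER() OVER (ORDER BY NULL) - 1,
       o.col1,
       o.col2,
       o.col3
FROM other_table o;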
Let's create a table with sample data (to simulate your source of data):
-- This is your source query table (can be anything)
CREATE TABLE source_table
(
source_a VARCHAR(255),
source_b VARCHAR(255),
source_c VARCHAR(255)
);
insert into source_table (source_a, source_b, source_c) values ('A', 'B', 'C');
insert into source_table (source_a, source_b, source_c) values ('D', 'E', 'F');
insert into source_table (source_a, source_b, source_c) values ('G', 'H', 'I');
Then create the target table, with an id and three data columns.
-- This is your target_table
CREATE TABLE target_table
(
id NUMBER(9,0),
target_a VARCHAR2(255),
target_b VARCHAR2(255),
target_c VARCHAR2(255)
);
-- This is the sequence used to ensure a unique number in the first column
CREATE sequence target_table_id_seq start with 0 minvalue 0 increment BY 1;
Finally, perform the insert, loading the id from the sequence and the rest of the data from the source table.
INSERT INTO target_table
SELECT target_table_id_seq.nextval,
source_a,
source_b,
source_c
FROM source_table;
Results might look like
0 A B C
1 D E F
2 G H I
If you add more values later, they will continue the numbering with 3, 4, 5 and so on. Or do you want the numbering only within each group? In that case, if you added 2 more rows JKL and MNO, the target table would look like this
1 A B C
2 D E F
3 G H I
1 J K L
2 M N O
For that you need a different solution (you don't even need a sequence):
SELECT
RANK() OVER (ORDER BY source_a, source_b, source_c),
source_a,
source_b,
source_c
FROM source_table;
Technically you could use ROWNUM directly, but I opted for the RANK() analytic function because it gives a consistent result (subtract 1 from it if you need the numbering to start at 0). Please note that this will violate your composite primary key if you try to insert the same rows twice (my first solution doesn't).
Clearly, you should use an Oracle sequence.
First, create a sequence:
create sequence seq_my_table start with 0 minvalue 0 increment by 1;
Then use it:
INSERT INTO my_table (sequence, ...)
select seq_my_table.nextval, <the rest of my query>;
Sequence numbers will be inserted in succession.
So, you already have the table, it has the required number of rows, and now you want to add numbers from 0 to the total number of rows minus one in the column named sequence? (Perhaps not "sequence", but something less likely to clash with Oracle reserved words?)
Then this should work:
update my_table set seq = rownum - 1;
Related
I have a table that looks like:
ID|CREATED |VALUE
1 |1649122158|200
1 |1649122158|200
1 |1649122158|200
That I'd like to look like:
ID|CREATED |VALUE
1 |1649122158|200
And I run the following query:
DELETE FROM MY_TABLE T
USING (SELECT ID, CREATED,
              ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CREATED DESC) AS RANK_IN_KEY
       FROM MY_TABLE T) X
WHERE X.RANK_IN_KEY <> 1
  AND T.ID = X.ID
  AND T.CREATED = X.CREATED
But it removes everything from MY_TABLE, not just the other rows with the same value. This is more than just selecting distinct records: I'd like to enforce a unique constraint that keeps the latest value for each ID and just one record for it, even if there were exact duplicates.
So
ID|CREATED |VALUE
1 |1649122158|200
1 |1649122159|300
2 |1649122158|200
2 |1649122158|200
3 |1649122170|500
3 |1649122160|200
Would become (using the same final unique constraint statement):
ID|CREATED |VALUE
1 |1649122159|300
2 |1649122158|200
3 |1649122170|500
How can I improve my logic to properly handle these unique constraint modifications?
Check out this post: https://community.snowflake.com/s/question/0D50Z00008EJgemSAD/how-to-delete-duplicate-records-
If all columns together make up a unique record, the recommended solution is to insert all the records into a new table with SELECT DISTINCT * and do a swap. You could also do an INSERT OVERWRITE INTO the same table.
Something like INSERT OVERWRITE INTO tableA SELECT DISTINCT * FROM tableA;
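For the swap variant, a sketch could look like the following (MY_TABLE is the table from the question; MY_TABLE_DEDUP is a hypothetical staging name, and ALTER TABLE ... SWAP WITH is Snowflake's atomic name swap):
CREATE TABLE MY_TABLE_DEDUP AS SELECT DISTINCT * FROM MY_TABLE;
ALTER TABLE MY_TABLE_DEDUP SWAP WITH MY_TABLE;
DROP TABLE MY_TABLE_DEDUP; -- after the swap this holds the old, duplicated data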
The following setup should leave the rows with id 1 and 3, and not delete all rows as you describe.
Schema
create table t (
  id int,
  created int,
  value int
);
insert into t values (1, 1649122158, 200);
insert into t values (1, 1649122159, 300);
insert into t values (2, 1649122158, 200);
insert into t values (2, 1649122158, 200);
insert into t values (3, 1649122170, 500);
insert into t values (3, 1649122160, 200);
Delete statement
with x as (
  SELECT
    id, created,
    row_number() over (partition by id order by created desc) as r
  FROM t
)
delete from t
using x
where x.id = t.id and x.r <> 1 and x.created = t.created
;
Output
select * from t;
1 1649122159 300
3 1649122170 500
The logic is that the table in the USING clause is joined with the table being deleted from. Following the join logic, rows are matched by some key; in your case the key is {id, created}. That key is duplicated for the rows with id 2, so the whole group is deleted.
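If you want to sidestep the join problem entirely, one option (a sketch, not the statement above; it assumes Snowflake's QUALIFY clause and the INSERT OVERWRITE behaviour mentioned earlier) is to rebuild the table keeping only the latest row per id:
INSERT OVERWRITE INTO my_table
SELECT *
FROM my_table
QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY created DESC) = 1;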
I'm not savvy with database schemas, but as a thought, you could add a rank column to the existing table and proceed with the deletion after that. This way you do not need to create another table and insert values into it. Be warned that the data may become fragmented (physically, on disk), so you may need to run some kind of tune-up later.
Update
You may find this almost one-liner interesting:
SO answer
I will duplicate code here, as it is so small and well written.
WITH
u AS (SELECT DISTINCT * FROM your_table),
x AS (DELETE FROM your_table)
INSERT INTO your_table SELECT * FROM u;
I have 2 tables (TABLE_A & TABLE_B) where I'm using the MINUS command to see if there are differences in the tables.
In my example below you can see that TABLE_A has an additional row.
Is there a way to capture the numeric difference between the two tables, in this case 1 row?
If there is a difference >0 then display the value. Although my example is small, it could contain many rows, so I would only like to run the MINUS command once if possible. I'm also amenable to alternative solutions and not tied to the MINUS command; if this can be done with SQL only, that will work too.
Thanks in advance for your expertise and all who answer.
CREATE TABLE TABLE_A(
seq_num NUMBER GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
nm VARCHAR(30)
);
/
CREATE TABLE TABLE_B(
seq_num NUMBER GENERATED BY DEFAULT AS IDENTITY (START WITH 1) NOT NULL,
nm VARCHAR(30)
);
/
BEGIN
FOR I IN 1..4 LOOP
INSERT INTO TABLE_A (nm) VALUES('Name '||I);
end loop;
FOR I IN 1..3 LOOP
INSERT INTO TABLE_B (nm) VALUES('Name '||I);
end loop;
END;
-- MINUS operation
SELECT nm FROM TABLE_A
MINUS
SELECT nm FROM TABLE_B;
Output:
NM
Name 4
Pseudo code
Do minus command
If difference >0 then display rows
There are many ways to do this; you can try one as below:
SELECT COUNT(*)
FROM (SELECT nm FROM TABLE_A
MINUS
SELECT nm FROM TABLE_B);
Another method may be:
SELECT COUNT(*)
FROM TABLE_A A
WHERE NOT EXISTS (SELECT NULL
FROM TABLE_B B
WHERE A.nm = B.nm)
If I understood the question correctly you can do it using analytic count:
select *
from (
select v.*,count(*)over() cnt
from (
SELECT nm FROM TABLE_A
MINUS
SELECT nm FROM TABLE_B
) v
)
where cnt>=4;
DBFiddle: https://dbfiddle.uk/?rdbms=oracle_21&fiddle=0ac62f3d1ea835f60427a1da8efb965e
I have a main table (say tableA, which has columns tab_a_id, field_code, field_id). There is another table, say tableB, with columns area_id, area_code. tab_a_id is the primary key of tableA. I want to update field_id of tableA based on field_code. field_code of tableA and area_code of tableB match, but not for every row: field_code also has values that do not match the area_code column. I want to set field_id = area_id where field_code = area_code; if there is no match, field_id should be set to the default value -1, which is the 'unknown' field. I tried a subquery bulk update (e.g. UPDATE tableA SET field_id = (SELECT area_id FROM tableB WHERE area_code = field_code)). This worked for a limited set of data, but I have 3 million matching records, which means 3 million subqueries. Another problem is that there are 7 million records in total, leaving 4 million unmatched records and useless subqueries.
Is there an optimal way to update such records with minimum time and better efficiency? I tried the MERGE command, but it performed poorly compared to a FORALL loop.
Updating 3 out of 7 million rows seems to be the problem here.
I've created a test set in a database on a small machine, and the fastest way to get your results is to create a new table (CTAS) with the desired data and swap the names afterwards. I have not used the primary key column tab_a_id, to keep the answer simple.
CREATE TABLE a (field_id NUMBER, field_code VARCHAR2(30)) NOLOGGING;
CREATE TABLE b (area_id NUMBER, area_code VARCHAR2(30)) NOLOGGING;
Using MERGE and UPDATE is quite slow (15 minutes), presumably because of the amount of changes:
UPDATE a SET field_id=-1 WHERE field_code NOT IN (SELECT area_code FROM b);
5,599,989 rows updated. (560 seconds)
MERGE INTO a USING b ON (a.field_code=b.area_code)
WHEN MATCHED THEN UPDATE SET a.field_id = b.area_id;
2,400,011 rows merged. (232 seconds)
However, creating a new table with the changed data is 20 times faster and takes only 38 seconds:
CREATE TABLE x NOLOGGING AS
SELECT NVL(b.area_id, -1) AS field_id, a.field_code
FROM a LEFT JOIN b ON a.field_code=b.area_code;
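The final name swap could then look like this (a sketch; indexes, constraints and grants on the original table would still have to be recreated on the new one):
ALTER TABLE a RENAME TO a_old;
ALTER TABLE x RENAME TO a;
DROP TABLE a_old;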
Here is the test data generation:
INSERT /*+ APPEND */ INTO a (field_id, field_code)
  SELECT id, TO_CHAR(id) FROM (SELECT LEVEL AS id FROM dual CONNECT BY ROWNUM <= 1000000);
COMMIT;
INSERT /*+ APPEND */ INTO a (field_id, field_code)
  SELECT field_id+1000000, TO_CHAR(field_id+1000000) FROM a;
COMMIT;
INSERT /*+ APPEND */ INTO a (field_id, field_code)
  SELECT field_id+2000000, TO_CHAR(field_id+2000000) FROM a;
COMMIT;
INSERT /*+ APPEND */ INTO a (field_id, field_code)
  SELECT field_id+4000000, TO_CHAR(field_id+4000000) FROM a;
COMMIT;
EXEC dbms_stats.gather_table_stats(null, 'a');
INSERT /*+ APPEND */ INTO b (area_id, area_code) SELECT -field_id, field_code FROM a SAMPLE (30);
EXEC dbms_stats.gather_table_stats(null, 'b');
I have a table and I would like to replicate/clone records within the same table, but with a condition. I have a column called recordcount with numeric values. For example, row 1 might have a recordcount of 7; then I would like row 1 to be replicated 7 times. Row 2 might have a value of 9; then I would like row 2 to be replicated 9 times.
Any help is appreciated. Thank you
What you can do (and I'm pretty sure it's not a best practice)
is to keep a table with just numbers, in which each number appears as many times as its value.
Join that with your table, and project only your table's columns.
Example:
create table nums(x int);
insert into nums select 1;
insert into nums select 2;
insert into nums select 2;
insert into nums select 3;
insert into nums select 3;
insert into nums select 3;
create table t (txt varchar(10) , recordcount int);
insert into t select 'A',1;
insert into t select 'B',2;
insert into t select 'C',3;
select t.*
from t
inner join nums
on t.recordcount = nums.x
order by 1
;
Will project:
"A",1
"B",2
"B",2
"C",3
"C",3
"C",3
I'm writing a function in node.js to query a PostgreSQL table.
If the row exists, I want to return the id column from the row.
If it doesn't exist, I want to insert it and return the id (insert into ... returning id).
I've been trying variations of case and if else statements and can't seem to get it to work.
A solution in a single SQL statement. It requires PostgreSQL 9.1 or later, though, since the INSERT inside a CTE is a data-modifying CTE.
Consider the following demo:
Test setup:
CREATE TEMP TABLE tbl (
id serial PRIMARY KEY
,txt text UNIQUE -- obviously there is a unique column (or set of columns)
);
INSERT INTO tbl(txt) VALUES ('one'), ('two');
INSERT / SELECT command:
WITH v AS (SELECT 'three'::text AS txt)
,s AS (SELECT id FROM tbl JOIN v USING (txt))
,i AS (
INSERT INTO tbl (txt)
SELECT txt
FROM v
WHERE NOT EXISTS (SELECT * FROM s)
RETURNING id
)
SELECT id, 'i'::text AS src FROM i
UNION ALL
SELECT id, 's' FROM s;
The first CTE v is not strictly necessary, but means you only have to enter your values once.
The second CTE s selects the id from tbl if the "row" exists.
The third CTE i inserts the "row" into tbl if (and only if) it does not exist, returning id.
The final SELECT returns the id. I added a column src indicating the "source" - whether the "row" pre-existed and id comes from a SELECT, or the "row" was new and so is the id.
This version should be as fast as possible as it does not need an additional SELECT from tbl and uses the CTEs instead.
To make this safe against possible race conditions in a multi-user environment, and for updated techniques using the new UPSERT in Postgres 9.5 or later, see:
Is SELECT or INSERT in a function prone to race conditions?
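For reference, a minimal sketch of that 9.5+ variant against the same tbl from the test setup above (the deliberately redundant update makes RETURNING yield the id even when the row already exists; see the linked answer for caveats and race-condition details):
INSERT INTO tbl (txt)
VALUES ('three')
ON CONFLICT (txt) DO UPDATE
SET txt = EXCLUDED.txt -- no-op update so the conflicting row is returned
RETURNING id;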
I would suggest doing the checking on the database side and just returning the id to nodejs.
Example:
CREATE OR REPLACE FUNCTION foo(p_param1 tableFoo.attr1%TYPE, p_param2 tableFoo.attr2%TYPE) RETURNS tableFoo.id%TYPE AS $$
DECLARE
  v_id tableFoo.id%TYPE;
BEGIN
  SELECT id
  INTO v_id
  FROM tableFoo
  WHERE attr1 = p_param1
  AND attr2 = p_param2;
  IF v_id IS NULL THEN
    INSERT INTO tableFoo(id, attr1, attr2) VALUES (DEFAULT, p_param1, p_param2)
    RETURNING id INTO v_id;
  END IF;
  RETURN v_id;
END;
$$ LANGUAGE plpgsql;
And then on the Node.js side (I'm using node-postgres in this example):
var pg = require('pg');
pg.connect('someConnectionString', function(connErr, client) {
  // do some error checking here
  client.query('SELECT id FROM foo($1, $2);', ['foo', 'bar'], function(queryErr, result) {
    // error checking
    var id = result.rows[0].id;
  });
});
Something like this, if you are on PostgreSQL 9.1
with test_insert as (
insert into foo (id, col1, col2)
select 42, 'Foo', 'Bar'
where not exists (select * from foo where id = 42)
returning foo.id, foo.col1, foo.col2
)
select id, col1, col2
from test_insert
union
select id, col1, col2
from foo
where id = 42;
It's a bit longish and you need to repeat the id you are testing for several times, but I can't think of a different solution that involves a single SQL statement.
If a row with id=42 exists, the writeable CTE will not insert anything and thus the existing row will be returned by the second part of the union.
When testing this I actually thought the new row would be returned twice (therefore a union, not a union all), but it turns out that the result of the second select statement is evaluated before the whole statement is run, and it does not see the newly inserted row. So in case a new row is inserted, it will be taken from the "returning" part.
create table t (
  id serial primary key,
  a integer
);

-- insert a row with a = 2 only if no such row exists yet
insert into t (a)
select 2
from (
  select count(*) as s
  from t
  where a = 2
) s
where s.s = 0
;

-- either way, return the id of the row with a = 2
select id
from t
where a = 2
;