MERGE INTO Performance - sql

I have a table that contains {service_id, service_name, region_name}.
As input, my procedure gets service_id and i_svc_region, a list of key/value pairs holding {service_name, region}.
I have to insert into the table if the record does not already exist. I know it is a very simple query, but do the queries below make any difference in performance?
Which one is better, and why?
MERGE INTO SERVICE_REGION_MAP table1
USING
(SELECT i_svc_region(i).key as service_name,i_enabled_regions(i).value as region
FROM dual) table2
ON (table1.service_id =i_service_id and table1.region=table2.region)
WHEN NOT MATCHED THEN
INSERT (service_id,service_name ,region) VALUES (i_service_id ,table2.service_name,table2.region);
Here i_service_id is passed as is.
MERGE INTO SERVICE_REGION_MAP table1
USING
(SELECT i_service_id as service_id, i_svc_region(i).key as service_name,i_enabled_regions(i).value as region
FROM dual) table2
ON (table1.service_id =table2.service_id and table1.region=table2.region)
WHEN NOT MATCHED THEN
INSERT (service_id,service_name ,region) VALUES (table2.service_id,table2.service_name,table2.region);
Here i_service_id is treated as a column of table2.
Does this really make any difference?

You should be using the FORALL statement. It will give much faster performance than any loop we could write. Check out the documentation, starting with https://docs.oracle.com/database/121/LNPLS/forall_statement.htm#LNPLS01321

As @Brian Leach suggests, FORALL gives you a single round trip to the SQL engine for all of the elements (the i's) in your collection. This can give between a 10 and 100 times improvement, depending on table size and many other things beyond me.
Also, you are only using the INSERT capability of MERGE, so a time-honoured INSERT statement should make life easier and faster for the database. MERGE has more bells and whistles, which can slow it down.
So try something like:
-- COUNT belongs to the collection itself, not to one of its elements
FORALL i IN 1..i_svc_region.COUNT
INSERT INTO SERVICE_REGION_MAP
(service_id, service_name, region)
SELECT
i_service_id,
i_svc_region(i).KEY,
i_enabled_regions(i).VALUE
FROM DUAL
WHERE NOT EXISTS
( SELECT 1
FROM SERVICE_REGION_MAP m
-- compare against the bind values directly; select-list aliases are not visible in WHERE
WHERE m.service_id = i_service_id AND m.region = i_enabled_regions(i).VALUE
);
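To try the insert-only-if-missing pattern without an Oracle instance, here is a minimal sketch in Python using the standard sqlite3 module. It is not FORALL: executemany is only a loose analogue that hands the whole batch to the driver in one call, and SQLite stands in for Oracle. The helper name insert_missing and the sample rows are made up for illustration.

```python
import sqlite3

def insert_missing(conn, service_id, pairs):
    """Insert each (service_name, region) pair unless (service_id, region) already exists."""
    sql = """
        INSERT INTO service_region_map (service_id, service_name, region)
        SELECT :service_id, :service_name, :region
        WHERE NOT EXISTS (
            SELECT 1 FROM service_region_map m
            WHERE m.service_id = :service_id AND m.region = :region
        )
    """
    rows = [
        {"service_id": service_id, "service_name": name, "region": region}
        for name, region in pairs
    ]
    with conn:  # one transaction for the whole batch
        conn.executemany(sql, rows)

# Hypothetical sample data: one row already present, one new.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE service_region_map (service_id INTEGER, service_name TEXT, region TEXT)"
)
conn.execute("INSERT INTO service_region_map VALUES (1, 'billing', 'eu')")
insert_missing(conn, 1, [("billing", "eu"), ("billing", "us")])
regions = [r[0] for r in conn.execute(
    "SELECT region FROM service_region_map ORDER BY region")]
```

Running the helper a second time with the same pairs should add nothing, which is the behaviour the NOT EXISTS guard is there to provide.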

Related

CTE: SELECT statement to read rows made by an INSERT Statement

I have a CTE that first inserts rows, and then reads the table with the inserted rows. Right now, the read on the table does not take into account the inserted rows.
The simplest example could be like this:
The Table:
CREATE TABLE mytable (column1 text, column2 text);
The query:
WITH insert_first AS (
INSERT INTO mytable (column1, column2)
VALUES ('value1', 'value2')
RETURNING *
), select_after AS (
SELECT * FROM mytable
LEFT JOIN insert_first ON insert_first.column1 = mytable.column1
) SELECT * FROM select_after
Here, select_after will be empty.
I thought by doing a LEFT JOIN on insert_first, I would hint to SQL to wait for the insert. But, it does not seem to do this.
Is there a way I could make a query that runs over mytable, which sees the inserts made from insert_first?
Here's a playground too: https://www.db-fiddle.com/f/4jyoMCicNSZpjMt4jFYoz5/6796
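As Adrian mentioned in the comments, this is not possible.
From the docs on WITH: "The primary query and the WITH queries are all (notionally) executed at the same time. This implies that the effects of a data-modifying statement in WITH cannot be seen from other parts of the query, other than by reading its RETURNING output. If two such data-modifying statements attempt to modify the same row, the results are unspecified."
So the only way for the rest of the query to see the new rows is to read insert_first itself, for example by combining the pre-existing rows of mytable with the RETURNING output of insert_first through UNION ALL.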

Store result of minus query ( list of varchars) in a variable in Oracle PL/SQL

I'm using below minus query to get the extra project_ids present in TABLE_ONE compared to TABLE_TWO
select project_id from TABLE_ONE minus select project_id from TABLE_TWO;
I want to store the result of the above query, which is a list of varchars, in a variable, since I need to perform the following 2 steps:
1. If the query returns any project_ids, send an email that contains these project_ids in the mail body.
2. Insert those extra project_ids into TABLE_TWO so that all project_ids present in TABLE_ONE are also present in TABLE_TWO.
For step 2 I tried the query below and it worked.
insert into TABLE_TWO columns (project_id) values (select project_id from TABLE_ONE minus select project_id from TABLE_TWO);
However, to perform both steps I need to store the query result in a variable. Please let me know how to do it. I'm using Oracle 12c.
Unfortunately, neither of the two most natural ways to get the missing IDs into table_two (a multi-row INSERT or a MERGE) supports the RETURNING ... BULK COLLECT INTO clause.
So I think your best bet is to get the list of IDs first and then use that list to maintain table_two.
Like this:
DECLARE
-- project_id is a varchar, so use a string collection type
l_missing_id_list SYS.ODCIVARCHAR2LIST;
BEGIN
SELECT project_id
BULK COLLECT INTO l_missing_id_list
FROM
(
SELECT t1.project_id FROM table_one t1
MINUS
SELECT t2.project_id FROM table_two t2 );
-- 1..COUNT is safe for an empty list, where FIRST and LAST are both NULL
FORALL i IN 1..l_missing_id_list.COUNT
INSERT INTO table_two (project_id) VALUES ( l_missing_id_list(i) );
COMMIT;
-- Values are now inserted and you have the list of IDs in l_missing_id_list to add to your email.
END;
That's the basic concept. Presumably you have more columns in TABLE_TWO than just the id, so you'll have to add those.
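The same collect-first, insert-second flow can be sketched outside PL/SQL. The following is a minimal illustration in Python with the standard sqlite3 module, where SQLite's EXCEPT plays the role of Oracle's MINUS. The function name sync_missing, the sample IDs, and the mail body wording are assumptions for the example; actually sending the email is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one (project_id TEXT);
    CREATE TABLE table_two (project_id TEXT);
    INSERT INTO table_one VALUES ('P1'), ('P2'), ('P3');
    INSERT INTO table_two VALUES ('P1');
""")

def sync_missing(conn):
    """Return the IDs missing from table_two after inserting them there."""
    # Step 1: capture the difference in a variable before changing anything.
    missing = [row[0] for row in conn.execute("""
        SELECT project_id FROM table_one
        EXCEPT
        SELECT project_id FROM table_two
        ORDER BY project_id
    """)]
    # Step 2: insert exactly the captured IDs in one batch and one transaction.
    with conn:
        conn.executemany(
            "INSERT INTO table_two (project_id) VALUES (?)",
            [(pid,) for pid in missing],
        )
    return missing

missing_ids = sync_missing(conn)
# The same list feeds the email; only build a body when there is something to report.
mail_body = ("Missing project IDs: " + ", ".join(missing_ids)) if missing_ids else None
```

Because the list is captured before the insert, the email content and the inserted rows are guaranteed to describe the same set of IDs.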
Something like this, using a cursor loop:
begin
for c_record in (select project_id from TABLE_ONE minus select project_id from TABLE_TWO)
loop
-- send your email however it is done using c_record.project_id
insert into TABLE_TWO (project_id) values (c_record.project_id);
end loop;
end;
FYI, there is a potential disadvantage to doing it this way. If you send the email and the transaction is then rolled back, the email has still gone out the door.
A more robust way would be to use Oracle Advanced Queuing, but that starts getting complicated pretty fast.
SELECT LISTAGG(project_id,',') WITHIN GROUP (ORDER BY project_id)
FROM (select project_id from TABLE_ONE minus select project_id from TABLE_TWO) x;

SQL. When I trying to do something like "INSERT INTO Table VALUES(x1,x2,x3) - can the x1 x2 x3 be sql queries, like SELECT <...>

I want to do something like this:
QSqlQuery q;
q.prepare("insert into Norm values(select from Disc id_disc WHERE name_disc=?, select from Spec code_spec WHERE name_spec=?,?");
q.addBindValue(MainModel->data(MainModel->index(MainModel->rowCount()-1, 1)).toString());
q.addBindValue(ui->comboBox->currentText());
q.addBindValue(MainModel->data(MainModel->index(MainModel->rowCount()-1, 2)).toString());
q.exec();
But it's not working. The error is surely obvious to someone; could you tell me how to do it right?
First of all, you have made a spelling mistake: it's "INSERT", not "INCERT".
And yes, we can put a SELECT query inside an INSERT query.
e.g.:
INSERT INTO table2
(column_name(s))
SELECT column_name(s)
FROM table1;
INSERT ... SELECT ... is used when you want to insert multiple records, or when most values to be inserted come from the same record.
If you want to insert one record with values coming from several tables, you can use subqueries like you tried to do, but you have to use the correct syntax:
scalar subqueries must be written inside parentheses, and you must write the SELECT correctly as SELECT value FROM table:
INSERT INTO Norm
VALUES ((SELECT id_disc FROM Disc WHERE name_disc = ?),
(SELECT code_spec FROM Spec WHERE name_spec = ?),
?)
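As a quick way to check the parenthesised scalar-subquery form, here is a minimal sketch in Python with the standard sqlite3 module rather than Qt's QSqlQuery. The column types, the name of the third Norm column (hours), and the sample rows are all assumptions, since the question does not show the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Disc (id_disc INTEGER, name_disc TEXT);
    CREATE TABLE Spec (code_spec TEXT, name_spec TEXT);
    -- 'hours' is a hypothetical name for the third column of Norm
    CREATE TABLE Norm (id_disc INTEGER, code_spec TEXT, hours TEXT);
    INSERT INTO Disc VALUES (7, 'Algebra');
    INSERT INTO Spec VALUES ('CS-01', 'Computer Science');
""")

# Each scalar subquery sits in its own parentheses; the three ? markers
# are bound positionally, in the order they appear in the statement.
with conn:
    conn.execute(
        """
        INSERT INTO Norm
        VALUES ((SELECT id_disc FROM Disc WHERE name_disc = ?),
                (SELECT code_spec FROM Spec WHERE name_spec = ?),
                ?)
        """,
        ("Algebra", "Computer Science", "120"),
    )

norm_rows = conn.execute("SELECT * FROM Norm").fetchall()
```

If a subquery finds no matching row, it yields NULL, so the target columns need to allow NULL or the lookup values need to be validated first.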
If you want data from two tables, you must first write a query that returns the intended data, using JOIN, UNION, subqueries, and so on.
Then just do
INSERT INTO target_table SELECT ...

How to append distinct records from one table to another

How do I append only distinct records from a master table to another table, when the master may have duplicates. Example - I only want the distinct records in the smaller table but I need to insert/append records to what I already have in the smaller table.
Ignoring any concurrency issues:
insert into smaller (field, ... )
select distinct field, ... from bigger
except
select field, ... from smaller;
You can also rephrase it as a join:
insert into smaller (field, ... )
select distinct b.field, ...
from bigger b
left join smaller s on s.key = b.key
where s.key is NULL
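Both forms can be tried on toy data. This is a minimal sketch in Python with the standard sqlite3 module, assuming two-column tables with made-up rows; SQLite happens to support EXCEPT, while other systems may spell it MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bigger (key INTEGER, field TEXT);
    CREATE TABLE smaller (key INTEGER, field TEXT);
    INSERT INTO bigger VALUES (1, 'a'), (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO smaller VALUES (1, 'a');
""")

# Form 1: distinct rows of bigger minus the rows smaller already holds.
with conn:
    conn.execute("""
        INSERT INTO smaller (key, field)
        SELECT DISTINCT key, field FROM bigger
        EXCEPT
        SELECT key, field FROM smaller
    """)
after_except = conn.execute(
    "SELECT key, field FROM smaller ORDER BY key").fetchall()

# Form 2: anti-join. After form 1 every key is present, so nothing is added.
with conn:
    cur = conn.execute("""
        INSERT INTO smaller (key, field)
        SELECT DISTINCT b.key, b.field
        FROM bigger b
        LEFT JOIN smaller s ON s.key = b.key
        WHERE s.key IS NULL
    """)
    added_by_join = cur.rowcount
```

Note that the two forms compare different things: EXCEPT matches on every selected column, while the join matches only on key.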
If you don't like NOT EXISTS and EXCEPT/MINUS (cute, Remus!), you also have the LEFT JOIN solution:
INSERT INTO smaller(a,b)
SELECT DISTINCT master.a, master.b FROM master
LEFT JOIN smaller ON smaller.a=master.a AND smaller.b=master.b
WHERE smaller.pkey IS NULL
You don't say the scale of the problem so I'll mention something I recently helped a friend with.
He works for an insurance company that provides supplemental Dental and Vision benefits management for other insurance companies. When they get a new client they also get a new database that can have 10's of millions of records. They wanted to identify all possible dupes with the data they already had in a master database of 100's of millions of records.
The solution we came up with was to identify two distinct combinations of field values (normalized in various ways) that would indicate a high probability of a dupe. We then created a new table containing MD5 hashes of the combos plus the id of the master record they applied to. The MD5 columns were indexed. All new records would have their combo hashes computed and if either of them had a collision with the master the new record would be kicked out to an exceptions file for some human to deal with it.
The speed of this surprised the hell out of us (in a nice way) and it has had a very acceptable false-positive rate.
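The hashing scheme described above can be sketched roughly as follows. This is only an illustration in Python with the standard hashlib module: the two field combinations, the normalization rule, and the sample records are invented, and a dictionary stands in for the indexed MD5 table.

```python
import hashlib

# Hypothetical field combinations that suggest a likely duplicate.
COMBOS = [
    ("last_name", "birth_date", "zip"),
    ("first_name", "last_name", "phone"),
]

def normalize(value):
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(str(value).lower().split())

def combo_hash(record, combo_no):
    """MD5 of one normalized field combination, tagged with the combination number."""
    parts = [str(combo_no)] + [normalize(record[f]) for f in COMBOS[combo_no]]
    return hashlib.md5("|".join(parts).encode("utf-8")).hexdigest()

def build_index(master_records):
    """Map every combo hash to the id of the master record it came from."""
    index = {}
    for rec in master_records:
        for n in range(len(COMBOS)):
            index.setdefault(combo_hash(rec, n), rec["id"])
    return index

def split_new_records(new_records, index):
    """Accept records with no hash hit; route the rest to an exceptions list."""
    accepted, exceptions = [], []
    for rec in new_records:
        hits = [index[h] for h in (combo_hash(rec, n) for n in range(len(COMBOS)))
                if h in index]
        if hits:
            exceptions.append((rec, hits[0]))  # keep the matching master id
        else:
            accepted.append(rec)
    return accepted, exceptions

master = [{"id": 1, "first_name": "Ann", "last_name": "Lee",
           "birth_date": "1980-01-02", "zip": "10001", "phone": "555-0100"}]
incoming = [
    {"first_name": "ANN ", "last_name": "lee", "birth_date": "1980-01-02",
     "zip": "10001", "phone": "555-9999"},
    {"first_name": "Bob", "last_name": "Ray", "birth_date": "1975-05-05",
     "zip": "94105", "phone": "555-0111"},
]
accepted, exceptions = split_new_records(incoming, build_index(master))
```

In a database the dictionary lookup becomes an indexed equality search on the hash column, which is presumably where the speed comes from.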
You could use the distinct keyword to filter out duplicates:
insert into AnotherTable
(col1, col2, col3)
select distinct col1, col2, col3
from MasterTable
Based on Microsoft SQL Server and its Transact-SQL. Untested as always, and it assumes target_table has the same number of columns as the source table (otherwise put column names between INSERT INTO and SELECT):
INSERT INTO target_table
SELECT DISTINCT s.row1, s.row2
FROM source_table s
WHERE NOT EXISTS(
SELECT 1
FROM target_table t
WHERE t.row1 = s.row1 AND t.row2 = s.row2)
Something like this would work for SQL Server (you don't mention what RDBMS you're using):
INSERT INTO table (col1, col2, col3)
SELECT DISTINCT t2.a, t2.b, t2.c
FROM table2 AS t2
WHERE NOT EXISTS (
SELECT 1
FROM table
WHERE table.col1 = t2.a AND table.col2 = t2.b AND table.col3 = t2.c
)
Tune where appropriate, depending on exactly what defines "distinctness" for your table.

Alternative SQL ways of looking up multiple items of known IDs?

Is there a better solution to the problem of looking up multiple known IDs in a table:
SELECT * FROM some_table WHERE id='1001' OR id='2002' OR id='3003' OR ...
I can have several hundreds of known items. Ideas?
SELECT * FROM some_table WHERE ID IN ('1001', '1002', '1003')
and if your known IDs are coming from another table
SELECT * FROM some_table WHERE ID IN (
SELECT KnownID FROM some_other_table WHERE someCondition
)
The first (naive) option:
SELECT * FROM some_table WHERE id IN ('1001', '2002', '3003' ... )
However, we should be able to do better. IN is very bad when you have a lot of items, and you mentioned hundreds of these ids. What creates them? Where do they come from? Can you write a query that returns this list? If so:
SELECT *
FROM some_table
INNER JOIN ( your query here) filter ON some_table.id=filter.id
See Arrays and Lists in SQL Server 2005
ORs are notoriously slow in SQL.
Your question is short on specifics, but depending on your requirements and constraints I would build a look-up table with your IDs and use the EXISTS predicate:
select t.id from some_table t
where EXISTS (select * from lookup_table l where t.id = l.id)
For a fixed set of IDs you can do:
SELECT * FROM some_table WHERE id IN (1001, 2002, 3003);
For a set that changes each time, you might want to create a table to hold them and then query:
SELECT * FROM some_table WHERE id IN
(SELECT id FROM selected_ids WHERE key=123);
Another approach is to use collections - the syntax for this will depend on your DBMS.
Finally, there is always this "kludgy" approach:
SELECT * FROM some_table WHERE '|1001|2002|3003|' LIKE '%|' || id || '|%';
In Oracle, I always put the id's into a TEMPORARY TABLE to perform massive SELECT's and DML operations:
CREATE GLOBAL TEMPORARY TABLE t_temp (id INT);
SELECT *
FROM mytable
WHERE mytable.id IN
(
SELECT id
FROM t_temp
)
You can fill the temporary table in a single client-server roundtrip using Oracle collection types.
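The temporary-table approach looks roughly like this from the client side. It is a minimal sketch in Python with the standard sqlite3 module, so SQLite's CREATE TEMP TABLE replaces Oracle's global temporary table and executemany replaces the collection bind; the helper name lookup and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO mytable VALUES (1001, 'a'), (2002, 'b'), (3003, 'c'), (4004, 'd');
    CREATE TEMP TABLE t_temp (id INTEGER PRIMARY KEY);
""")

def lookup(conn, ids):
    """Load the wanted ids into the temp table, then filter mytable against it."""
    with conn:
        conn.execute("DELETE FROM t_temp")  # start from an empty id list
        # OR IGNORE skips duplicate ids instead of failing on the primary key
        conn.executemany(
            "INSERT OR IGNORE INTO t_temp (id) VALUES (?)",
            [(i,) for i in ids],
        )
    return conn.execute("""
        SELECT *
        FROM mytable
        WHERE mytable.id IN (SELECT id FROM t_temp)
        ORDER BY id
    """).fetchall()

found = lookup(conn, [1001, 3003, 9999])
```

The query text stays the same however many IDs are supplied, which avoids building a long IN list or OR chain for every call.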
We have a similar issue in an application written for MS SQL Server 7. Although I dislike the solution used, we're not aware of anything better...
'Better' solutions exist in 2008 as far as I know, but we have Zero clients using that :)
We created a table valued user defined function that takes a comma delimited string of IDs, and returns a table of IDs. The SQL then reads reasonably well, and none of it is dynamic, but there is still the annoying double overhead:
1. Client concatenates the IDs into the string
2. SQL Server parses the string to create a table of IDs
There are lots of ways of turning '1,2,3,4,5' into a table of IDs, but the Stored Procedure which uses the function ends up looking like...
CREATE PROCEDURE my_road_to_hell @IDs AS VARCHAR(8000)
AS
BEGIN
SELECT
*
FROM
myTable
INNER JOIN
dbo.fn_split_list(@IDs) AS [IDs]
ON [IDs].id = myTable.id
END
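The answer does not show fn_split_list, so the following is not its implementation. It is one possible way to turn a comma-delimited string into rows and join against it, sketched in Python with the standard sqlite3 module and a recursive common table expression. The table contents and the function name find_by_id_list are assumptions.

```python
import sqlite3

# The recursive CTE peels one token off the front of the string per step.
SPLIT_JOIN_SQL = """
WITH RECURSIVE split(id, rest) AS (
    SELECT '', ? || ','
    UNION ALL
    SELECT substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1)
    FROM split
    WHERE rest <> ''
)
SELECT t.id, t.name
FROM myTable AS t
INNER JOIN split AS s ON t.id = CAST(s.id AS INTEGER)
WHERE s.id <> ''   -- drop the seed row and any empty tokens
ORDER BY t.id
"""

def find_by_id_list(conn, id_list):
    """Return the rows of myTable whose id appears in a string such as '1,3,5'."""
    return conn.execute(SPLIT_JOIN_SQL, (id_list,)).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO myTable VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
""")
matches = find_by_id_list(conn, "1,3,5")
```

As in the stored procedure, none of the SQL is dynamic: the ID string travels as a single bound parameter.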
The fastest is to put the ids in another table and JOIN
SELECT some_table.*
FROM some_table INNER JOIN some_other_table ON some_table.id = some_other_table.id
where some_other_table would have just one field (id), and all of its values would be unique.