How efficiently does Oracle handle a very long IN operator list?

I have the following query (this is the simplified version of a much more complicated query):
SELECT * FROM TPM_TASK
WHERE (PROJECTID, VERSIONID) IN ((3,1), (24,1), (4,1))
In code I will be building that (PROJECTID,VERSIONID) key list programmatically, and this list could potentially be a couple thousand pairs long.
My question is how Oracle will optimize this query given that ProjectId and VersionId are indexed. Will the list be converted to a hash table, similar to a join against a temp table? Or will each key lookup be done one at a time?
I tried this query under my test database and got:
SELECT STATEMENT 68.0 68 2989732 19 8759 68 ALL_ROWS
TABLE ACCESS (FULL) 68.0 68 2989732 19 8759 1 TPMDBO TPM_TASK FULL TABLE ANALYZED 1
However, I believe this database doesn't have enough data to warrant an index scan. I tried the query on production and got:
SELECT STATEMENT 19.0 19 230367 23 9683 19 ALL_ROWS
INLIST ITERATOR 1
TABLE ACCESS (BY INDEX ROWID) 19.0 19 230367 23 9683 1 TPMDBO TPM_TASK BY INDEX ROWID TABLE ANALYZED 1
INDEX (RANGE SCAN) 4.0 4 64457 29 1 TPMDBO TPM_H1_TASK RANGE SCAN INDEX ANALYZED 1
This seems to hit the index; however, I'm not sure what INLIST ITERATOR means. I'm guessing it means that Oracle iterates through the list and does a table access for each item, which would probably not be too efficient with thousands of keys. Then again, perhaps Oracle is smart enough to optimize this better if I actually gave it several thousand keys.
NOTE: I don't want to load these keys into a temp table because, frankly, I don't like the way temp tables work under Oracle, and they usually end up being more frustration than they're worth (in my non-expert opinion, anyway).

The optimizer should base its decision on the number of items in the list and the number of rows in the table. If the table has millions of rows and the list has even a couple of thousand items, I would generally expect it to use the index to do a couple thousand single-row lookups. If the table has a few thousand rows and the list has a couple thousand items, I'd expect the optimizer to do a full scan of the table. In the middle, of course, is where all the interesting stuff happens and where it gets harder to work out exactly what plan the optimizer is going to choose.
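If you want to see which plan you actually get at a given list size, a quick sketch with EXPLAIN PLAN and DBMS_XPLAN (using the simplified query from the question) makes it easy to check:

EXPLAIN PLAN FOR
SELECT *
  FROM tpm_task
 WHERE (projectid, versionid) IN ((3,1), (24,1), (4,1));

SELECT * FROM TABLE(dbms_xplan.display);

Repeating that with a generated list of a few thousand pairs shows you where the plan flips between the INLIST ITERATOR index access and the full scan.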
In general, however, dynamically building this sort of query is going to be problematic from a performance perspective, not because any particular execution is expensive but because the queries you're generating are not sharable: you can't use bind variables (or, if you do use bind variables, each list length needs a different number of them, which still produces a distinct SQL text). That forces Oracle to do a rather expensive hard parse of the query every time and puts pressure on your shared pool, which will likely force out other queries that are sharable and cause still more hard parsing in the system. You'll generally be better served tossing the data you want to match on into a temporary table (or even a permanent table) so that your query can be made sharable and parsed just once.
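A minimal sketch of that temporary-table approach (the TPM_TASK_KEYS table is my invention for illustration, not something from the question):

-- one-time DDL: a global temporary table holding the keys for a session
CREATE GLOBAL TEMPORARY TABLE tpm_task_keys (
  projectid NUMBER,
  versionid NUMBER
) ON COMMIT DELETE ROWS;

-- per execution: insert the pairs (ideally array-bound from the client),
-- then run one statement whose SQL text never changes
INSERT INTO tpm_task_keys (projectid, versionid) VALUES (:1, :2);

SELECT t.*
  FROM tpm_task t
  JOIN tpm_task_keys k
    ON k.projectid = t.projectid
   AND k.versionid = t.versionid;

Both the INSERT and the SELECT now parse once, no matter how many pairs you supply.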
To Branko's comment, while Oracle is limited to 1000 literals in an IN list, that is only if you are using the "normal" syntax, i.e.
WHERE projectID IN (1,2,3,...,N)
If you use the tuple syntax that you posted earlier, however, you can have an unlimited number of elements.
So, for example, I'll get an error if I build up a query with 2000 items in the IN list
SQL> ed
Wrote file afiedt.buf
1 declare
2 l_sql_stmt varchar2(32000);
3 l_cnt integer;
4 begin
5 l_sql_stmt := 'select count(*) from emp where empno in (';
6 for i in 1..2000
7 loop
8 l_sql_stmt := l_sql_stmt || '(1),';
9 end loop;
10 l_sql_stmt := rtrim(l_sql_stmt,',') || ')';
11 -- p.l( l_sql_stmt );
12 execute immediate l_sql_stmt into l_cnt;
13* end;
SQL> /
declare
*
ERROR at line 1:
ORA-01795: maximum number of expressions in a list is 1000
ORA-06512: at line 12
But not if I use the tuple syntax
SQL> ed
Wrote file afiedt.buf
1 declare
2 l_sql_stmt varchar2(32000);
3 l_cnt integer;
4 begin
5 l_sql_stmt := 'select count(*) from emp where (empno,empno) in (';
6 for i in 1..2000
7 loop
8 l_sql_stmt := l_sql_stmt || '(1,1),';
9 end loop;
10 l_sql_stmt := rtrim(l_sql_stmt,',') || ')';
11 -- p.l( l_sql_stmt );
12 execute immediate l_sql_stmt into l_cnt;
13* end;
SQL> /
PL/SQL procedure successfully completed.

A better solution, which doesn't require temp tables, may be to put the data into a PL/SQL table, and then join to it. Tom Kyte has an excellent example here:
PL/SQL Table join example
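In that spirit, here is a minimal sketch using a schema-level collection type and the TABLE() operator (the type names are made up for illustration):

CREATE TYPE project_version_key AS OBJECT (projectid NUMBER, versionid NUMBER);
/
CREATE TYPE project_version_key_tab AS TABLE OF project_version_key;
/

DECLARE
  l_keys project_version_key_tab :=
    project_version_key_tab(project_version_key(3, 1),
                            project_version_key(24, 1),
                            project_version_key(4, 1));
  l_cnt  INTEGER;
BEGIN
  -- the whole collection is bound as a single variable, so the SQL text
  -- stays constant regardless of how many pairs you pass
  SELECT COUNT(*)
    INTO l_cnt
    FROM tpm_task t
   WHERE (t.projectid, t.versionid) IN
         (SELECT k.projectid, k.versionid FROM TABLE(l_keys) k);
  dbms_output.put_line(l_cnt || ' rows matched');
END;
/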
Hope that helps.

Related

Query all views for specific text in Oracle database views

I was wondering if anyone had a query that would search all views to find specific text. The database version we are on is Oracle Database 12c. This will only be run in our dev/test database.
I'm newer to the company, new to this database structure, and new to using Oracle. I've only used MSSQL in the past. I couldn't find a data dictionary and felt bad always having to ask what something meant or where it was located.
I was trying to investigate on my own before asking. I'm trying to learn what all the columns mean and where all the data is connected. I'm open to other suggestions.
For SQL Server, I have a query that searches through the views and columns for data, and it is rather fast (I don't have an exact time). I assumed running something similar in Oracle would behave much the same, unless the database differs enough that such a query won't return as quickly. I found some queries for Oracle that search all tables, but I don't have access to any of the tables. The access we have been given goes through: other users > users > views > then query on that view.
I found this link that I thought might work - Oracle Search all tables all columns for string
When I run the first query in the accepted answer I get this error:
Error report -
ORA-00932: inconsistent datatypes: expected - got CHAR
ORA-06512: at line 6
00932. 00000 - "inconsistent datatypes: expected %s got %s"
*Cause:
*Action:
The string that I am searching for contains numbers and letters. Ex. 123ABC
When I run the second query, I let it run for four hours and still nothing returned. Is there any way to speed that one up?
I'm open to any other queries, suggestions, and help of pointing me in the right direction.
Thank you!
You have to understand that searching every CHAR-family (CHAR, VARCHAR2, ...) column in the whole database (as "123ABC" is a string) is a tedious, long-running process. It finishes in no time on a few relatively small tables, but on a large database it really takes a long time, and no index can help a search like this, so ... be patient.
Also, note that the code behind that link searches the ALL_TAB_COLUMNS view, which contains not only the columns of tables you own but those of everything you have access to, across various users' schemas. Have a look; this is my 11gXE database on a laptop:
SQL> select owner, count(*) from all_tab_columns group by owner;
OWNER                            COUNT(*)
------------------------------ ----------
MDSYS                                 736
CTXSYS                                320
SYSTEM                                 54
APEX_040000                          3327
SCOTT                                  69
XDB                                    35
SYS                                 15211
7 rows selected.
SQL> select count(*) from user_tab_columns;
  COUNT(*)
----------
        69
SQL>
See the difference? Using ALL_TAB_COLUMNS, you're searching through ~20,000 columns. In my own schema (and USER_TAB_COLUMNS), there are only 70 of them.
Therefore, consider switching to USER_TAB_COLUMNS (if you do, remove all OWNER column references).
Also note that a PL/SQL procedure (in this case, the anonymous PL/SQL block you took from the linked question) won't display anything until it has finished.
Alternatively, you could create a "log" table and an autonomous-transaction stored procedure (so that it can insert into the log table and commit independently of the main transaction), and then "trace" execution from another session while the search runs. Something like this:
Log table and procedure:
SQL> create table search_log (table_name varchar2(30), column_name varchar2(30));
Table created.
SQL> create or replace procedure p_log (par_table_name in varchar2,
2 par_column_name in varchar2)
3 is
4 pragma autonomous_transaction;
5 begin
6 insert into search_log (table_name, column_name)
7 values (par_table_name, par_column_name);
8 commit;
9 end;
10 /
Procedure created.
Code from the link you posted; switched to USER_TAB_COLUMNS, searching for table/column that contains the 'KING' string:
SQL> DECLARE
2 match_count integer;
3 v_search_string varchar2(4000) := 'KING';
4 BEGIN
5 FOR t IN (SELECT table_name, column_name
6 FROM user_tab_columns
7 WHERE data_type like '%CHAR%'
8 )
9 LOOP
10 EXECUTE IMMEDIATE
11 'SELECT COUNT(*) FROM '|| t.table_name||
12 ' WHERE '||t.column_name||' = :1'
13 INTO match_count
14 USING v_search_string;
15 IF match_count > 0 THEN
16 --dbms_output.put_line( t.owner || '.' || t.table_name ||' '||t.column_name||' '||match_count );
17 p_log(t.table_name, t.column_name);
18 END IF;
19 END LOOP;
20 END;
21 /
PL/SQL procedure successfully completed.
SQL> select * From search_log;
TABLE_NAME                     COLUMN_NAME
------------------------------ ------------------------------
EMP                            ENAME
SQL>
Only one table found; EMP and its ENAME column.

Updating Millions of Records Oracle

I have created a query to update a column across 35 million records,
but unfortunately it took more than an hour to process.
Did I miss anything in the query below?
DECLARE
  CURSOR exp_cur IS
    SELECT DECODE(COLUMN_NAME,
                  NULL, NULL,
                  standard_hash(COLUMN_NAME)) AS COLUMN_NAME
      FROM TABLE1;
  TYPE nt_fName IS TABLE OF VARCHAR2(100);
  fname nt_fName;
BEGIN
  OPEN exp_cur;
  FETCH exp_cur BULK COLLECT INTO fname LIMIT 1000000;
  CLOSE exp_cur;
  -- print data
  FOR idx IN 1 .. fname.COUNT
  LOOP
    UPDATE TABLE1 SET COLUMN_NAME = fname(idx);
    commit;
    DBMS_OUTPUT.PUT_LINE(idx || ' ' || fname(idx));
  END LOOP;
END;
The reason a BULK COLLECT used with a FORALL construction is generally faster than the equivalent row-by-row loop is that it applies all the updates in one shot, instead of laboriously stepping through the rows one at a time and launching 35 million separate update statements, each one requiring the database to search for the individual row before updating it. But what you have written (even when the bugs are fixed) is still a row-by-row loop with 35 million search-and-update statements, plus the additional overhead of populating a 700 MB array in memory, 35 million commits, and 35 million dbms_output messages. It has to be slower because it has significantly more work to do than a plain update.
If it is practical to copy the data to a new table, INSERT will be a lot faster than UPDATE. At the end you can reapply any grants, indexes, and constraints to the new table, rename both tables, and drop the old one. You can also use insert /*+ parallel enable_parallel_dml */ (prior to Oracle 12c, you have to ALTER SESSION ENABLE PARALLEL DML separately). You could define the new table as NOLOGGING during the copy, but check with your DBA, as that can affect replication and backups, though that might not matter if this is a test system. This will all need careful scripting if it's going to form part of a routine workflow, as sketched below.
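A hedged sketch of that copy-and-swap approach (it assumes TABLE1 has only the one column; a real script must carry every column and then reapply grants, indexes, and constraints):

-- build the replacement table in one parallel, nologging pass
CREATE TABLE table1_new PARALLEL NOLOGGING AS
SELECT CASE
         WHEN column_name IS NULL THEN NULL
         ELSE rawtohex(standard_hash(column_name)) -- the hash is RAW; hex keeps it VARCHAR2
       END AS column_name
  FROM table1;

-- once the copy is verified, swap the names and drop the old table
ALTER TABLE table1 RENAME TO table1_old;
ALTER TABLE table1_new RENAME TO table1;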
Your code updates every row of TABLE1 on each pass through the loop, because the UPDATE has no WHERE clause. (The loop runs once per fetched value, up to the 1,000,000-row LIMIT, and every iteration rewrites all 35 million rows; that's why it is taking so long.)
You can simply use a single update statement as follows:
UPDATE TABLE1 SET COLUMN_NAME = standard_hash(COLUMN_NAME)
WHERE COLUMN_NAME IS NOT NULL;
If you really want to use BULK COLLECT and FORALL, you can do it as follows:
DECLARE
  CURSOR EXP_CUR IS
    SELECT COLUMN_NAME FROM TABLE1
     WHERE COLUMN_NAME IS NOT NULL;
  TYPE NT_FNAME IS TABLE OF VARCHAR2(100);
  FNAME NT_FNAME;
BEGIN
  OPEN EXP_CUR;
  LOOP
    -- fetch in batches; a single FETCH ... LIMIT would stop after the first million rows
    FETCH EXP_CUR BULK COLLECT INTO FNAME LIMIT 1000000;
    EXIT WHEN FNAME.COUNT = 0;
    FORALL IDX IN 1 .. FNAME.COUNT
      UPDATE TABLE1
         SET COLUMN_NAME = STANDARD_HASH(COLUMN_NAME)
       WHERE COLUMN_NAME = FNAME(IDX);
    COMMIT;
  END LOOP;
  CLOSE EXP_CUR;
END;
/

Get number of inserted rows by just plain sql

Is there a way to get the number of inserted rows inside the same transaction?
I see that PL/SQL command:
SQL%ROWCOUNT
does the job, however I don't want to create a procedure just for that!
I tried to simply call
insert into T ...
select SQL%ROWCOUNT;
but it gives me "invalid character".
If I remember correctly, MySQL has a way to obtain this information; does Oracle really not provide any means for that?
I don't want to create a procedure just for that
No need to create any procedure, you could simply use an anonymous PL/SQL block.
For example,
SQL> SET serveroutput ON
SQL> DECLARE
2 var_cnt NUMBER;
3 BEGIN
4 var_cnt :=0;
5 FOR i IN(SELECT empno FROM emp)
6 LOOP
7 INSERT INTO emp(empno) VALUES(i.empno);
8 var_cnt := var_cnt + SQL%ROWCOUNT;
9 END loop;
10 DBMS_OUTPUT.PUT_LINE(TO_CHAR(var_cnt)||' rows inserted');
11 END;
12 /
14 rows inserted
PL/SQL procedure successfully completed.
SQL>
Update: If you cannot use PL/SQL and must use plain SQL, then you cannot use SQL%ROWCOUNT.
The only option that comes to my mind is to have a timestamp column in your table, and query the count based on the timestamp to know the number of rows inserted.
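A minimal sketch of that idea (the CREATED_AT column and the bind variable are hypothetical):

-- count the rows stamped since this batch began
SELECT COUNT(*) AS rows_inserted
  FROM your_table
 WHERE created_at >= :batch_start_ts;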
Alternatively, if the whole load can be written as a single INSERT ... SELECT inside the block, SQL%ROWCOUNT right after that one statement gives the total directly:
DBMS_OUTPUT.put_line(TO_CHAR(SQL%ROWCOUNT)||' rows inserted');
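A minimal runnable sketch of that single-statement variant (the EMPNO + 1000 source is just a way to fabricate rows that don't collide with existing keys):

DECLARE
  l_cnt PLS_INTEGER;
BEGIN
  INSERT INTO emp(empno)
  SELECT empno + 1000 FROM emp;  -- assumes these keys don't already exist
  l_cnt := SQL%ROWCOUNT;         -- row count of the statement just executed
  DBMS_OUTPUT.PUT_LINE(l_cnt || ' rows inserted');
  ROLLBACK;                      -- demo only; leave the table unchanged
END;
/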

PLSQL - Searching for record in a Nested Table that was Bulk Collected

I used BULK COLLECT to fetch records into a nested table. I want to search for a record with the EXISTS method, but it's not working out. I then found out that EXISTS takes an index (subscript), not a value, so it doesn't search for values. Do I need to go across each record and search for a match? Is there a shorter way to do it? I am going to use the same logic for a large set of records.
I read on some websites that BULK COLLECT doesn't work with an associative array indexed by VARCHAR2, so I used a nested table instead. Also, I don't want to read each record and store it in a hashmap, as that degrades performance.
create table sales(
  name varchar2(100)
);
insert into sales(name) values('Test');
insert into sales(name) values('alpha');
insert into sales(name) values(null);
declare
  type sales_tab is table of varchar2(1000);
  t_sal sales_tab;
begin
  select name bulk collect into t_sal from sales;
  if (t_sal.exists('Test')) THEN
    dbms_output.put_line('Test exists');
  END IF;
  dbms_output.put_line(t_sal.count);
end;
The EXISTS() method tells you whether an element of a collection exists at a particular index (an integer or, for associative arrays indexed by VARCHAR2, a string). It does not test for membership. To check whether a collection contains an element with a specific value, the MEMBER OF condition can be used:
SQL> declare
2 type sales_tab is table of varchar2(1000);
3 t_sal sales_tab;
4 begin
5 select name
6 bulk collect into t_sal
7 from sales;
8
9 if('Test' member of t_sal) THEN
10 dbms_output.put_line('Test exists');
11 END IF;
12
13 dbms_output.put_line(t_sal.count);
14 end;
15 /
Test exists
3
PL/SQL procedure successfully completed
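If you have to probe many values, it is worth noting that EXISTS does act as a value test when the value itself is the index of an associative array. A sketch of that alternative (one pass to build the map, then hash-style probes):

DECLARE
  TYPE lookup_tab IS TABLE OF BOOLEAN INDEX BY VARCHAR2(1000);
  t_lookup lookup_tab;
BEGIN
  -- build the lookup once; NULLs cannot be used as keys
  FOR r IN (SELECT name FROM sales WHERE name IS NOT NULL) LOOP
    t_lookup(r.name) := TRUE;
  END LOOP;
  IF t_lookup.EXISTS('Test') THEN
    dbms_output.put_line('Test exists');
  END IF;
  dbms_output.put_line(t_lookup.COUNT);
END;
/

MEMBER OF scans the collection on every test, so for repeated lookups against a large collection the associative array will usually be faster.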

How to fetch, delete, commit from a cursor

I am trying to delete a lot of rows from a table. I want to try the approach of putting the rows I want to delete into a cursor and then doing fetch, delete, commit on each row of the cursor until it is exhausted.
In the code below we are fetching rows and putting them into a TYPE (a collection).
How can I modify the code below to remove the TYPE from the picture and simply fetch, delete, and commit on the cursor itself?
OPEN bulk_delete_dup;
LOOP
   FETCH bulk_delete_dup BULK COLLECT INTO arr_inc_del LIMIT c_rows;
   FORALL i IN arr_inc_del.FIRST .. arr_inc_del.LAST
      DELETE FROM UIV_RESPONSE_INCOME
       WHERE ROWID = arr_inc_del(i);
   COMMIT;
   arr_inc_del.DELETE;
   EXIT WHEN bulk_delete_dup%NOTFOUND;
END LOOP;
arr_inc_del.DELETE;
CLOSE bulk_delete_dup;
Why do you want to commit in batches? That is only going to slow down your processing. Unless there are other sessions that are trying to modify the rows you are trying to delete, which seems problematic for other reasons, the most efficient approach would be simply to delete the data with a single DELETE, i.e.
DELETE FROM uiv_response_income uri
WHERE EXISTS(
SELECT 1
FROM (<<bulk_delete_dup query>>) bdd
WHERE bdd.rowid = uri.rowid
)
Of course, there may well be a more optimal way of writing this depending on how the query behind your cursor is designed.
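Purely for illustration (the real bulk_delete_dup query isn't shown, and SOME_KEY is a hypothetical column that defines a duplicate), a dedup-flavored version of that single statement might look like:

DELETE FROM uiv_response_income
 WHERE rowid IN (SELECT rid
                   FROM (SELECT rowid AS rid,
                                ROW_NUMBER() OVER (PARTITION BY some_key
                                                   ORDER BY rowid) rn
                           FROM uiv_response_income)
                  WHERE rn > 1);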
If you really want to eliminate the BULK COLLECT (which will slow the process down substantially), you could use the WHERE CURRENT OF syntax to do the DELETE
SQL> create table foo
2 as
3 select level col1
4 from dual
5 connect by level < 10000;
Table created.
SQL> ed
Wrote file afiedt.buf
1 declare
2 cursor c1 is select * from foo for update;
3 l_rowtype c1%rowtype;
4 begin
5 open c1;
6 loop
7 fetch c1 into l_rowtype;
8 exit when c1%notfound;
9 delete from foo where current of c1;
10 end loop;
11* end;
SQL> /
PL/SQL procedure successfully completed.
Be aware, however, that since you have to lock the row (with the FOR UPDATE clause), you cannot put a commit in the loop. Doing a commit would release the locks you had requested with the FOR UPDATE and you'll get an ORA-01002: fetch out of sequence error
SQL> ed
Wrote file afiedt.buf
1 declare
2 cursor c1 is select * from foo for update;
3 l_rowtype c1%rowtype;
4 begin
5 open c1;
6 loop
7 fetch c1 into l_rowtype;
8 exit when c1%notfound;
9 delete from foo where current of c1;
10 commit;
11 end loop;
12* end;
SQL> /
declare
*
ERROR at line 1:
ORA-01002: fetch out of sequence
ORA-06512: at line 7
You may not get a runtime error if you remove the locking and avoid the WHERE CURRENT OF syntax, deleting the data based on the value(s) you fetched from the cursor. However, this is still doing a fetch across a commit, which is a poor practice and radically increases the odds that you will, at least intermittently, get an ORA-01555: snapshot too old error. It will also be painfully slow compared to the single SQL statement or the BULK COLLECT option.
SQL> ed
Wrote file afiedt.buf
1 declare
2 cursor c1 is select * from foo;
3 l_rowtype c1%rowtype;
4 begin
5 open c1;
6 loop
7 fetch c1 into l_rowtype;
8 exit when c1%notfound;
9 delete from foo where col1 = l_rowtype.col1;
10 commit;
11 end loop;
12* end;
SQL> /
PL/SQL procedure successfully completed.
Of course, you also have to ensure that your process is restartable in case you process some subset of rows and have some unknown number of interim commits before the process dies. If the DELETE is sufficient to cause the row to no longer be returned from your cursor, your process is probably already restartable. But in general, that's a concern if you try to break a single operation into multiple transactions.
A few things. It seems from your company's "no transaction over 8 seconds" rule (8 seconds, you in Texas?) that you have a production DB instance that traditionally supported apps doing OLTP work (insert 1 row, update 2 rows, etc.) and has now also become the batch-processing DB (remove 50% of the rows and replace them with 1mm new rows).
Batch processing should be separated from the OLTP instance. In a batch ("data factory") instance I wouldn't try deleting in this case; I'd probably take a CTAS, drop old table, rename new table, rebuild indexes/stats, recompile invalid objects approach.
Assuming you are stuck doing batch processing in your "8 second" instance, you'll probably find your company asking for more and more of this in the future, so request as much undo (rollback) space from the DBAs as you can get, and hope you don't get a snapshot too old by fetching across commits (cursor SELECT driving the deletes, commit every 1000 rows or so, delete using ROWID).
If the DBAs can't help, you may be able to first create a temp table containing the rowids you wish to delete and then loop through that temp table to delete from the main table (avoiding fetching across commits), though your company will probably have some rule against this as well, since it is another (basic) batch technique; a sketch of that variant follows the code below.
Something like:
declare
-- assuming index on someCol
cursor sel_cur is
select rowid as row_id
from someTable
where someCol = 'BLAH';
v_ctr pls_integer := 0;
begin
for rec in sel_cur
loop
v_ctr := v_ctr + 1;
-- watch out for snapshot too old...
delete from someTable
where rowid = rec.row_id;
if (mod(v_ctr, 1000) = 0) then
commit;
end if;
end loop;
commit;
end;
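For completeness, a sketch of the temp-table variant mentioned above (table names and predicate are hypothetical; the point is that the driving cursor reads only the staging table, so the batch commits don't modify what it is fetching):

-- stage the target rowids once
create table rowids_to_delete as
select rowid as rid
from someTable
where someCol = 'BLAH';

declare
  type rid_tab is table of rowid;
  l_rids rid_tab;
  cursor sel_rids is
    select rid from rowids_to_delete;
begin
  open sel_rids;
  loop
    fetch sel_rids bulk collect into l_rids limit 1000;
    exit when l_rids.count = 0;
    forall i in 1 .. l_rids.count
      delete from someTable
       where rowid = l_rids(i);
    commit; -- batch commit; the cursor's source table is never modified
  end loop;
  close sel_rids;
end;
/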