DB2 Select clause or using cursor - db2-luw

For reading bulk data from within a DB2 function, is it preferable, performance-wise, to read it via a plain SELECT statement or via a cursor?

Related

Procedure - Dynamic where conditions

I have a procedure in which, depending on the parameters, the WHERE condition differs.
The OUT parameters have to appear in the INTO clause so that I can return the columns from the procedure.
Rather than writing a separate SQL statement for each IF branch, what is an efficient way of doing this?
It looks to me more like a design question.
So it depends on what you need to achieve and how you want to organize your code.
Possibilities:
1 - your "if" chain of queries in the same procedure
2 - one procedure for each query
3 - if the differences between the "where" parts are not too big, use SQL constructs (unions, CASE, AND/OR, etc.) to let the different cases coexist in one query
4 - build the SQL dynamically and use EXECUTE IMMEDIATE
Usually I don't like option 1; I would try 3 or 4 first, then fall back to 2 if I can't. A sketch of option 3 follows below.
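For example, a minimal sketch of option 3, folding optional parameters into one static query (the table, column, and parameter names are illustrative):
SELECT COUNT(*), SUM(amount)
  INTO o_total_count, o_total_sum
  FROM orders
 WHERE (p_status IS NULL OR status = p_status)
   AND (p_dept_id IS NULL OR dept_id = p_dept_id);
Each NULL-check short-circuits the corresponding condition when that parameter is not supplied, so one static statement covers several filter combinations.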
EDIT
With dynamic SQL, you can fetch the results out like this:
EXECUTE IMMEDIATE stmt INTO o_total_count, o_total_sum, o_hold_status, o_normal_status;
If the query takes input parameters, you have to mark them with a leading : and then add the USING clause with the appropriate input parameters.
Example
EXECUTE IMMEDIATE 'select count(*) from departments where department_id=:id' INTO l_cnt USING l_dept_id;

Oracle : Inserting large dataset into a table

I need to load data from a different remote database into our own database. I wrote a single "complex" query using a WITH clause; it returns around 18 million rows of data.
What is the most efficient way to do the insert?
using a cursor and inserting one row at a time
using INSERT INTO ... SELECT
or is there any other way?
The fastest way to do anything should be to use a single SQL statement. The next most efficient approach is to use a cursor doing BULK COLLECT operations to minimize context shifts between the SQL and PL/SQL engines. The least efficient approach is to use a cursor and process the data row-by-row.
As Justin wrote, the most efficient approach is to use a single SQL statement (INSERT INTO ... SELECT ...). Additionally, you can take advantage of a direct-path insert.
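A minimal sketch of a direct-path insert (the table names and database link are illustrative):
-- The APPEND hint requests a direct-path insert: blocks are written above the
-- high-water mark, bypassing the buffer cache and generating minimal undo.
INSERT /*+ APPEND */ INTO local_target
SELECT col1, col2, col3
FROM remote_source@remote_db;
-- the table cannot be queried in this session until the transaction is committed
COMMIT;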
18 million rows will require quite a bit of rollback for your single-insert-statement scenario. A cursor FOR loop would be much slower, but you'd be able to commit every x rows.
Personally, I'd go old school and dump the data out to a file, then load it via SQL*Loader or Data Pump, especially as this is across databases.
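A minimal SQL*Loader control-file sketch for that route (the file, table, and column names are illustrative):
LOAD DATA
INFILE 'export.csv'
APPEND
INTO TABLE target_table
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(
  id,
  created_on DATE "YYYY-MM-DD",
  amount
)
It would be invoked with something like sqlldr userid=... control=load.ctl direct=true, where direct=true requests a direct-path load.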
You could use Data Synchronisation Studio and change the SELECT statement to take 1 million rows at a time (I think 18M at once would probably overload your machine).

Check whether a select statement is valid or not in C#.NET

I want to check whether a SELECT statement (a string) is valid or not in C#.NET. If the statement is valid, retrieve the data and fill a drop-down list box; otherwise, the drop-down should be empty.
How often would the select statement be invalid? Seems like a simple try/catch block around the execution of the SQL might be sufficient.
As an aside, I hope you aren't making an app that would allow someone to type in arbitrary SQL into a box which you would then execute...
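As a minimal sketch of the try/catch approach (assuming SQL Server via System.Data.SqlClient and an ASP.NET DropDownList; the connection string, query variable, control, and column name are illustrative):
// requires: using System.Data; using System.Data.SqlClient;
try
{
    using (var connection = new SqlConnection(connectionString))
    using (var adapter = new SqlDataAdapter(selectSql, connection))
    {
        var results = new DataTable();
        adapter.Fill(results);                // throws SqlException if the SQL is invalid
        dropDownList.DataSource = results;
        dropDownList.DataTextField = "Name";  // hypothetical display column
        dropDownList.DataBind();
    }
}
catch (SqlException)
{
    dropDownList.Items.Clear();               // invalid statement: leave the list empty
}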
One approach which covers most scenarios is to execute the SQL with SET FMTONLY ON
e.g.
SET FMTONLY ON;
SELECT SomeField FROM ExampleQuery
From BOL, SET FMTONLY:
Returns only metadata to the client. Can be used to test the format of the response without actually running the query.
That will error if the query is invalid. You can also check the result to determine the schema of the result set that would be returned (i.e. no schema = not a SELECT statement).
Update:
In general terms when dealing with SQL that you want to protect against SQL injection there are other things you should be thinking about:
Avoid dynamic SQL (concatenating user-entered values into an SQL string to be executed). Use parameterised SQL instead (a sketch follows after this list).
Encapsulate the query as a nested query. e.g.
SELECT * FROM (SELECT Something FROM ADynamicQueryThatsBeenGenerated) x
So if the query contains multiple commands, this results in an error, i.e. the following would become an invalid query when encapsulated as a nested query:
SELECT SomethingFrom FROM MyTable;TRUNCATE TABLE MyTable
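A minimal sketch of parameterised SQL in C# (the table, column, and parameter names are illustrative):
// requires: using System; using System.Data; using System.Data.SqlClient;
const string sql = "SELECT Name FROM Products WHERE CategoryId = @categoryId";
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand(sql, connection))
{
    // the user-supplied value is bound as data, never spliced into the SQL text
    command.Parameters.Add("@categoryId", SqlDbType.Int).Value = categoryId;
    connection.Open();
    using (var reader = command.ExecuteReader())
    {
        while (reader.Read())
        {
            Console.WriteLine(reader.GetString(0));
        }
    }
}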

Query performance difference: PL/SQL FORALL insert vs. plain SQL insert

We have been using a temporary table to store intermediate results in a PL/SQL stored procedure. Could anyone tell me if there is a performance difference between doing a bulk-collect insert through PL/SQL and a plain SQL insert?
INSERT INTO [table name] [SELECT query returning a huge amount of data]
or
CURSOR FOR [SELECT query returning a huge amount of data]
open the cursor
fetch the cursor BULK COLLECT into a collection
use FORALL to perform the insert
Which of the above two options is better for inserting a huge amount of temporary data?
Some experimental data for your problem (Oracle 9.2)
bulk collect
DECLARE
  TYPE t_number_table IS TABLE OF NUMBER;
  v_tab t_number_table;
BEGIN
  SELECT ROWNUM
  BULK COLLECT INTO v_tab
  FROM dual
  CONNECT BY LEVEL < 100000;
  FORALL i IN 1..v_tab.COUNT
    INSERT INTO test VALUES (v_tab(i));
END;
/
-- 2.6 sec
insert
-- test table (used by all three variants)
CREATE GLOBAL TEMPORARY TABLE test (id NUMBER)
ON COMMIT PRESERVE ROWS;
BEGIN
  INSERT INTO test
  SELECT ROWNUM FROM dual
  CONNECT BY LEVEL < 100000;
END;
/
-- 1.4 sec
direct path insert
http://download.oracle.com/docs/cd/B10500_01/server.920/a96524/c21dlins.htm
BEGIN
  INSERT /*+ append */ INTO test
  SELECT ROWNUM FROM dual
  CONNECT BY LEVEL < 100000;
END;
/
-- 1.2 sec
INSERT INTO ... SELECT must certainly be faster; it skips the overhead of storing the data in a collection first.
It depends on the nature of the work you're doing to populate the intermediate results. If the work can be done relatively simply in the SELECT statement for the INSERT, that will generally perform better.
However, if you have some complex intermediate logic, it may be easier (from a code maintenance point of view) to fetch and insert the data in batches using bulk collects/binds. In some cases it might even be faster.
One thing to note very carefully: the query plan used by the INSERT INTO x SELECT ... will sometimes be quite different to that used when the query is run by itself (e.g. in a PL/SQL explicit cursor). When comparing performance, you need to take this into account.
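One way to check this, as a sketch (the INSERT and table names are illustrative):
-- plan for the INSERT ... SELECT
EXPLAIN PLAN FOR
  INSERT INTO bar (a, b, c)
  SELECT a, b, c FROM foo;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- plan for the standalone query, for comparison
EXPLAIN PLAN FOR
  SELECT a, b, c FROM foo;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);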
Tom Kyte of AskTom fame has answered this question more firmly. If you are willing to do some searching, you can find the question and his response, which contains detailed testing results and explanations. He shows a PL/SQL cursor vs. a PL/SQL bulk collect (including the effect of periodic commits) vs. a plain INSERT AS SELECT.
INSERT AS SELECT wins hands down every time, and the difference even on modest datasets is dramatic.
That said, the comment was made earlier about the complexity of intermediary computations. I can think of three situations where this would be relevant:
1) If the computations require going outside of the Oracle database, then clearly a simple INSERT AS SELECT does not do the trick.
2) If the solution requires PL/SQL function calls, then context switching can potentially kill your query, and you may have better results with PL/SQL calling PL/SQL functions. PL/SQL was made to call SQL, not the other way around, so calling PL/SQL from SQL is expensive.
3) If the computations make the SQL code very difficult to read, then even though it may be slower, a PL/SQL bulk-collect solution may be better for those other reasons.
Good luck.
When we declare a cursor explicitly, Oracle allocates a private SQL work area in memory. When a SELECT statement returns multiple rows, they are copied from the table or view into that private SQL work area as the active set; its size is the number of rows that meet your search criteria. Once the cursor is opened, your pointer is placed on the first row of the active set, and from there you can perform DML. For example, if you perform an update operation, the changes are applied to the rows in the work area rather than directly against the table for every change; the rows are fetched into the work area once, the operations are performed, and the update is then applied in one pass. This reduces input/output data transfer between the database and the user.
I suggest using a PL/SQL explicit cursor; you are just going to perform the DML operations in the private workspace allotted to the cursor. This will not hit database server performance during peak hours.

Bulk Insert into Oracle database: Which is better: FOR Cursor loop or a simple Select?

Which would be a better option for bulk insert into an Oracle database ?
A FOR Cursor loop like
DECLARE
  CURSOR C1 IS SELECT * FROM FOO;
BEGIN
  FOR C1_REC IN C1 LOOP
    INSERT INTO BAR(A, B, C)
    VALUES (C1_REC.A, C1_REC.B, C1_REC.C);
  END LOOP;
END;
or a simple select, like:
INSERT INTO BAR(A, B, C)
SELECT A, B, C
FROM FOO;
Any specific reason either one would be better ?
I would recommend the SELECT option because cursors take longer.
Also, using the SELECT is much easier to understand for anyone who has to modify your query.
The general rule-of-thumb is, if you can do it using a single SQL statement instead of using PL/SQL, you should. It will usually be more efficient.
However, if you need to add more procedural logic (for some reason), you might need to use PL/SQL, but you should use bulk operations instead of row-by-row processing. (Note: in Oracle 10g and later, your FOR loop will automatically use BULK COLLECT to fetch 100 rows at a time; however your insert statement will still be done row-by-row).
e.g.
DECLARE
  TYPE tA IS TABLE OF FOO.A%TYPE INDEX BY PLS_INTEGER;
  TYPE tB IS TABLE OF FOO.B%TYPE INDEX BY PLS_INTEGER;
  TYPE tC IS TABLE OF FOO.C%TYPE INDEX BY PLS_INTEGER;
  rA tA;
  rB tB;
  rC tC;
BEGIN
  SELECT * BULK COLLECT INTO rA, rB, rC FROM FOO;
  -- (do some procedural logic on the data?)
  FORALL i IN rA.FIRST..rA.LAST
    INSERT INTO BAR(A, B, C)
    VALUES (rA(i), rB(i), rC(i));
END;
The above has the benefit of minimising context switches between SQL and PL/SQL. Oracle 11g also has better support for tables of records so that you don't have to have a separate PL/SQL table for each column.
Also, if the volume of data is very great, it is possible to change the code to process the data in batches.
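A minimal sketch of such batched processing with BULK COLLECT ... LIMIT (the batch size of 1000 is illustrative, and the tables follow the question's FOO/BAR example):
DECLARE
  CURSOR c_foo IS SELECT A, B, C FROM FOO;
  TYPE tA IS TABLE OF FOO.A%TYPE INDEX BY PLS_INTEGER;
  TYPE tB IS TABLE OF FOO.B%TYPE INDEX BY PLS_INTEGER;
  TYPE tC IS TABLE OF FOO.C%TYPE INDEX BY PLS_INTEGER;
  rA tA;
  rB tB;
  rC tC;
BEGIN
  OPEN c_foo;
  LOOP
    -- fetch one batch per iteration instead of the whole result set
    FETCH c_foo BULK COLLECT INTO rA, rB, rC LIMIT 1000;
    EXIT WHEN rA.COUNT = 0;
    -- (procedural logic on the batch here)
    FORALL i IN 1..rA.COUNT
      INSERT INTO BAR(A, B, C) VALUES (rA(i), rB(i), rC(i));
  END LOOP;
  CLOSE c_foo;
END;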
If your rollback segment/undo segment can accommodate the size of the transaction, then option 2 is better. Option 1 is useful if you do not have the rollback capacity needed and have to break the large insert into smaller commits so you don't get "rollback/undo segment too small" errors.
A simple INSERT/SELECT like your 2nd option is far preferable. Each insert in the 1st option requires a context switch from PL/SQL to SQL. Run each with trace/TKPROF and examine the results.
If, as Michael mentions, your rollback cannot handle the statement, then have your DBA give you more. Disk is cheap, while partial results that come from inserting your data in multiple passes are potentially quite expensive. (There is almost no undo associated with an insert.)
I think one important piece of information is missing from this question:
How many records will you insert?
If from 1 to approx. 10,000, then you should use a single SQL statement (as others said, it is easy to understand and easy to write).
If from approx. 10,000 to approx. 100,000, then you should use a cursor, but add logic to commit every 10,000 records.
If from approx. 100,000 to millions, then you should use BULK COLLECT for better performance.
As you can see by reading the other answers, there are a lot of options available. If you are just doing < 10k rows you should go with the second option.
In short, from approx. 10k up to, say, 100k rows, it is kind of a gray area. A lot of old geezers will bark at big rollback segments, but honestly hardware and software have made amazing progress, to the point where you may be able to get away with option 2 for a lot of records if you only run the code a few times. Otherwise you should probably commit every 1k-10k or so rows. Here is a snippet that I use. I like it because it is short and I don't have to declare a cursor. Plus it has the benefits of bulk collect and forall.
begin
  for r in (select rownum rn, t.* from foo t) loop
    insert into bar (A, B, C) values (r.A, r.B, r.C);
    if mod(r.rn, 1000) = 0 then
      commit;
    end if;
  end loop;
  commit;
end;
I found this link from the oracle site that illustrates the options in more detail.
You can use:
BULK COLLECT along with FORALL, which together are called bulk binding,
because the PL/SQL FORALL operator can run simple table inserts up to 30x faster.
BULK COLLECT and Oracle FORALL together are known as bulk binding. Bulk binds are a PL/SQL technique where, instead of executing multiple individual SELECT, INSERT, UPDATE, or DELETE statements to retrieve data from, or store data in, a table, all of the operations are carried out at once, in bulk. This avoids the context switching you get when the PL/SQL engine has to pass over to the SQL engine, then back to the PL/SQL engine, and so on, as you access rows one at a time. To do bulk binds with INSERT, UPDATE, and DELETE statements, you enclose the SQL statement within a PL/SQL FORALL statement. To do bulk binds with SELECT statements, you include the BULK COLLECT clause in the SELECT statement instead of using INTO.
It improves performance.
I do neither for a daily complete reload of data. For example, say I am loading my Denver site. (There are other strategies for near-real-time deltas.)
I use a CREATE TABLE ... AS SELECT statement, as I have found it is almost as fast as a bulk load.
For example, below a CREATE TABLE statement is used to stage the data, casting the columns to the correct data types needed:
CREATE TABLE sales_dataTemp AS
SELECT
  CAST(column1 AS DATE) AS SALES_QUARTER,
  CAST(sales AS NUMBER) AS SALES_IN_MILLIONS,
  ....
FROM TABLE1;
This temporary table mirrors my target table's structure exactly, which is list-partitioned by site.
I then do a partition swap with the DENVER partition, and I have a new data set.
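A minimal sketch of such a partition swap (the target table name and partition name are illustrative, following the staging example above):
-- Swap the staged data into the DENVER partition of the target table.
-- EXCHANGE PARTITION only updates the data dictionary, so it is nearly instantaneous.
ALTER TABLE sales_data
  EXCHANGE PARTITION denver
  WITH TABLE sales_dataTemp
  INCLUDING INDEXES
  WITHOUT VALIDATION;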