I'm trying to put a condition on spool. I want the file to be created only if my query returns at least one row.
define csvFile=DUMP_EMP.csv
spool &&csvFile
SELECT EMP_NO||','||EMP_NAME||','||DELETED
FROM EMP_TABLE
WHERE DELETED = 0;
spool off
I am having a performance issue with one of my queries.
The structure of the query is as below:
With a02 as
(...
)
SELECT *
FROM
a02
inner join
a03 on a02.id=a03.id;
Table a02 is around 10,000 rows and table a03 is around 40,000 rows. The query takes about 1.5 hours to run.
However, if I create a02 as a global temporary table and then run the query below, it takes less than 5 minutes. Is that normal behavior?
SELECT *
FROM
a02
inner join
a03 on a02.id=a03.id
I am hesitant to use a global temporary table, as we sometimes get the following error when dropping the table:
DROP TABLE A02;
SQL Error: ORA-14452: attempt to create, alter or drop an index on temporary table already in use
14452. 00000 - "attempt to create, alter or drop an index on temporary table already in use"
When you create a global temporary table (or even a local temporary table), Oracle has good statistics on the table -- because you just created it. This can affect the execution plan.
It would seem that Oracle is choosing a suboptimal execution plan for the query. I would suggest creating an index on id in each of the tables -- if possible. Or at least have an index on a03(id).
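For example, the a03 index could be created like this (the index name is arbitrary; an index on a02 only applies if a02 is a real or global temporary table rather than a WITH subquery):
CREATE INDEX a03_id_idx ON a03 (id);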
I would recommend identifying the SQL ID for the query and then using the SQL Monitor Report, as it will tell you exactly what the execution plan is and where the SQL is spending most of its time.
A simple way to get the SQL Monitor Report from SQL*Plus follows:
spool c:\temp\SQL_Monitor_rpt.html
SET LONG 1000000
SET LONGCHUNKSIZE 1000000
SET LINESIZE 1000
SET PAGESIZE 0
SET TRIM ON
SET TRIMSPOOL ON
SET ECHO OFF
SET FEEDBACK OFF
alter session set "_with_subquery" = optimizer;
SELECT DBMS_SQLTUNE.report_sql_monitor(
sql_id => '&SQLID' ,
type => 'HTML',
report_level => 'ALL') AS report
FROM dual;
spool off
Most likely there is a full table scan going on, plus joins without the use of an index. You should probably index the columns in both tables that are involved in the join condition. Also, you can try the /*+ MATERIALIZE */ hint in your WITH clause subqueries to mimic a global temporary table without actually needing one.
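For instance, the hint goes inside the subquery of the WITH clause; this is a sketch only, since the body of a02 is elided above:
WITH a02 AS (
    SELECT /*+ MATERIALIZE */ ...
)
SELECT *
FROM a02
INNER JOIN a03 ON a02.id = a03.id;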
I have a database table, and I need to use the data in it to tell me about segment sizes for the tables listed in that table.
Here's an example of the kind of data in there, it's broken up into 4 columns and there are many rows:
TABLE_A TABLE_B JOIN_COND WHERE_CLAUSE
AZ AT A.AR_ID = B.AR_ID A.DE = 'AJS'
AZ1 AT1 A.AR_ID = B.AR_ID A.DE = 'AJS' AND B.END_DATE > '30-NOV-2015'
AZ2 AT3 A.AR_ID = B.AR_ID A.DE = 'AJS' AND B.END_DATE > '30-NOV-2015'
Here's what I need to accomplish:
1. Some sort of loop, perhaps, that finds the size of each single "TABLE_A" in kilobytes.
2. Build a query that would estimate the data (space) needed to create a new table based on a subset of a query, something like this:
...
SELECT *
FROM TABLE_A a, TABLE_B b
WHERE A.AR_ID = B.AR_ID
AND A.FININS_CDE = 'AJS'
AND B.END_DTE > '30-NOV-2015'
... but for every row in the table. So at the end of the process if there were 100 rows in the table, I would get 200 results:
100 rows telling me the size of each table A
100 results telling me the size that would be taken up by the subset with the WHERE clause.
You're going to need to use dynamic SQL for this. The Oracle documentation is here.
You'll need to build some dynamic SQL for each of your tables:
SELECT TABLE_A, 'select segment_name,segment_type,bytes/1024/1024 MB
from dba_segments
where segment_type=''TABLE'' and segment_name=''' || TABLE_A || ''''
FROM <your meta data table>
Then you'll need to loop over the result set and execute each of the statements and capture the results. Some info about that here.
Once you've executed all of the statements you'll have the answer for 1.
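A rough PL/SQL sketch of that loop (META_TABLE and its TABLE_A column are assumptions - substitute your own meta data table):
DECLARE
    v_kb NUMBER;
BEGIN
    FOR r IN (SELECT table_a FROM meta_table) LOOP
        EXECUTE IMMEDIATE
            'select nvl(sum(bytes)/1024, 0)
               from dba_segments
              where segment_type = ''TABLE''
                and segment_name = :tab'
            INTO v_kb
            USING UPPER(r.table_a);
        DBMS_OUTPUT.PUT_LINE(r.table_a || ': ' || v_kb || ' KB');
    END LOOP;
END;
/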
The next part is a little more tricky: you'll need to find the data type sizes for each of the columns, then add all of these together to get the size of one row for one table. You can use VSIZE to get the size of each column.
Using more dynamic SQL, you can then build your actual statements and execute them as a SELECT COUNT(*) to get the actual number of rows. Multiply the number of rows by the size of a full row from each table and you'll have your answer. You'll obviously need another loop for that too.
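For one pair of tables, the estimate could be computed in a single statement along these lines (using the AZ1/AT1 example row from the meta data above; in practice you would sum VSIZE over every column and generate the statement dynamically):
SELECT COUNT(*) *
       AVG(NVL(VSIZE(a.ar_id), 0) + NVL(VSIZE(a.de), 0)) AS estimated_bytes
FROM az1 a
JOIN at1 b ON a.ar_id = b.ar_id
WHERE a.de = 'AJS'
  AND b.end_date > DATE '2015-11-30';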
Does that all make sense?
Below is the update statement that I am running 32k times, and it has been taking more than 15 hours and is still running.
I have to update the value in TABLE_2 for 32k different M_DISPLAY values.
UPDATE TABLE_2 T2 SET T2.M_VALUE = 'COL_ANC'
WHERE EXISTS (SELECT 1 FROM TABLE_2 T1 WHERE TRIM(T1.M_DISPLAY) = 'ANCHORTST' AND T1.M_LABEL=T2.M_LABEL );
I am not sure why it is taking such a long time, as I have tuned the query.
I have copied the 32,000 update statements into an Update.sql file and am running the script from the command line.
Though it is updating the table, it is a never-ending process.
Please advise if I have gone wrong anywhere.
Using FORALL
If you cannot rewrite the query to run a single bulk-update instead of 32k individual updates, you might still get lucky by using PL/SQL's FORALL. An example:
DECLARE
TYPE rec_t IS RECORD (
m_value table_2.m_value%TYPE,
m_display table_2.m_display%TYPE
);
TYPE tab_t IS TABLE OF rec_t;
data tab_t := tab_t();
BEGIN
-- Fill in data object. Replace this by whatever your logic for matching
-- m_value to m_display is
data.extend(1);
data(1).m_value := 'COL_ANC';
data(1).m_display := 'ANCHORTST';
-- Then, run the 32k updates using FORALL
FORALL i IN 1 .. data.COUNT
UPDATE table_2 t2
SET t2.m_value = data(i).m_value
WHERE EXISTS (
SELECT 1
FROM table_2 t1
WHERE trim(t1.m_display) = data(i).m_display
AND t1.m_label = t2.m_label
);
END;
/
Concurrency
If you're not the only process on the system, 32k updates in a single transaction can hurt. It's definitely worth committing a few thousand rows in sub-transactions to reduce concurrency effects with other processes that might read the same table while you're updating.
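A sketch of that idea, assuming the 32k value/display pairs sit in a driving table (here called update_instructions, as in the staging table below - all names are assumptions):
DECLARE
    CURSOR c IS SELECT m_value, m_display FROM update_instructions;
    TYPE val_tab_t  IS TABLE OF table_2.m_value%TYPE;
    TYPE disp_tab_t IS TABLE OF table_2.m_display%TYPE;
    l_values   val_tab_t;
    l_displays disp_tab_t;
BEGIN
    OPEN c;
    LOOP
        FETCH c BULK COLLECT INTO l_values, l_displays LIMIT 5000;
        EXIT WHEN l_values.COUNT = 0;
        FORALL i IN 1 .. l_values.COUNT
            UPDATE table_2 t2
               SET t2.m_value = l_values(i)
             WHERE EXISTS (
                       SELECT 1
                         FROM table_2 t1
                        WHERE trim(t1.m_display) = l_displays(i)
                          AND t1.m_label = t2.m_label
                   );
        COMMIT; -- one sub-transaction per batch of 5,000 driving rows
    END LOOP;
    CLOSE c;
END;
/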
Bulk update
Really, the goal of any improvement should be bulk updating the entire data set in one go (or perhaps split in a few bulks, see concurrency).
If you had a staging table containing the update instructions:
CREATE TABLE update_instructions (
m_value VARCHAR2(..),
m_display VARCHAR2(..)
);
Then you could pull off something along the lines of:
MERGE INTO table_2 t2
USING (
SELECT u.*, t1.m_label
FROM update_instructions u
JOIN table_2 t1 ON trim(t1.m_display) = u.m_display
) t1
ON t2.m_label = t1.m_label
WHEN MATCHED THEN UPDATE SET t2.m_value = t1.m_value;
This should be even faster than FORALL (but might have more concurrency implications).
Indexing and data sanitisation
Of course, one thing that might definitely hurt you when running 32k individual update statements is the TRIM() function, which prevents using an index on M_DISPLAY efficiently. If you could sanitise your data so it doesn't need trimming first, that would definitely help. Otherwise, you could add a function based index just for the update (and then drop it again):
CREATE INDEX i ON table_2 (trim (m_display));
The query and the subquery work on the same table: TABLE_2. Assuming that M_LABEL is unique, the subquery returns a 1 for every row in TABLE_2 whose M_DISPLAY is ANCHORTST. The update then updates the same (!) TABLE_2 for all 1s returned from the subquery - so for all rows where M_DISPLAY is ANCHORTST.
Therefore, the query can be simplified, exploiting the fact that both the update and the select work on the same table, TABLE_2:
UPDATE TABLE_2 T2 SET T2.M_VALUE = 'COL_ANC' WHERE TRIM(T2.M_DISPLAY) = 'ANCHORTST'
If M_LABEL is not unique, then the above is not going to work - thanks to commentators for pointing that out!
For significantly faster execution:
Ensure that you have created an index on M_DISPLAY and M_LABEL columns that are in your WHERE clause.
Ensure that M_DISPLAY has a function-based index. If it does not, then do not pass it to the TRIM function, because the function will prevent the database from using the index that you have created for the M_DISPLAY column. TRIM the data before storing it in the table.
That's it.
By the way, as has been mentioned, you shouldn't need 32k queries to meet your objective. One will probably suffice. Look into query-based updates. As an example, see the accepted answer here:
Oracle SQL: Update a table with data from another table
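For example, a query-based update could look something like this (a sketch only; it assumes the 32k display/value pairs have been loaded into a driving table, here called update_instructions as in the earlier answer):
-- assumes exactly one row per display value in update_instructions
UPDATE table_2 t2
   SET t2.m_value = (SELECT u.m_value
                       FROM update_instructions u
                      WHERE u.m_display = TRIM(t2.m_display))
 WHERE EXISTS (SELECT 1
                 FROM update_instructions u
                WHERE u.m_display = TRIM(t2.m_display));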
With a simple UPDATE statement we can do it in batches when dealing with huge tables.
WHILE 1 = 1
BEGIN
UPDATE TOP (5000)
dbo.LargeOrders
SET CustomerID = N'ABCDE'
WHERE CustomerID = N'OLDWO';
IF @@ROWCOUNT < 5000
BREAK;
END
When working with the MERGE statement, is it possible to do something similar? As far as I know it is not possible, because you need to perform different operations based on the condition - for example, UPDATE when matched and INSERT when not matched. I just want to confirm this, and I may need to switch to the old-school UPDATE & INSERT if it's true.
Why not use a temp table as the source of your MERGE and then handle batching via the source table? I do this in a project of mine where I batch 50,000 rows into a temp table and then use the temp table as the source for the MERGE. I do this in a loop until all rows are processed.
The temp table can be a physical table or an in-memory table.
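A rough sketch of that pattern (table and column names are made up; the batching key and MERGE actions would follow your own schema):
CREATE TABLE #batch (OrderID INT PRIMARY KEY, CustomerID NVARCHAR(10));

DECLARE @lastId INT = 0;

WHILE 1 = 1
BEGIN
    TRUNCATE TABLE #batch;

    -- stage the next batch of source rows
    INSERT INTO #batch (OrderID, CustomerID)
    SELECT TOP (50000) OrderID, CustomerID
    FROM dbo.SourceOrders
    WHERE OrderID > @lastId
    ORDER BY OrderID;

    IF @@ROWCOUNT = 0
        BREAK;

    SELECT @lastId = MAX(OrderID) FROM #batch;

    -- MERGE from the staged batch only
    MERGE dbo.LargeOrders AS t
    USING #batch AS s
        ON t.OrderID = s.OrderID
    WHEN MATCHED THEN
        UPDATE SET t.CustomerID = s.CustomerID
    WHEN NOT MATCHED THEN
        INSERT (OrderID, CustomerID) VALUES (s.OrderID, s.CustomerID);
END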
I have to delete some unwanted rows from a table based on the result of a select query on another table:
DELETE /*+ parallels(fe) */ FROM fact_x fe
WHERE fe.key NOT IN(
SELECT DISTINCT di.key
FROM dim_x di
JOIN fact_y fa
ON fa.code = di.code
WHERE fa.code_type = 'ABC'
);
The inner select query returns 77 rows and executes in a few milliseconds, but the outer delete query runs forever (for more than 8 hours). I tried to count how many rows would be deleted by converting the delete to a SELECT COUNT(1), and it's around 66.4 million fact_x rows out of a total of 66.8 million rows. I am not trying to truncate, though; I need to retain the remaining rows.
Is there any other way to achieve this? Would deleting the rows with a PL/SQL cursor work better?
Would it not make more sense just to insert the rows you want to keep into another table, then drop the existing table? Even if there are FKs to disable/recreate/etc., it is almost certain to be faster.
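A sketch of that approach, reusing the tables from the question (the new table name is arbitrary, and indexes, constraints and grants would still need to be recreated):
CREATE TABLE fact_x_keep AS
SELECT fe.*
FROM fact_x fe
WHERE fe.key IN (SELECT di.key
                   FROM dim_x di
                   JOIN fact_y fa ON fa.code = di.code
                  WHERE fa.code_type = 'ABC');

DROP TABLE fact_x;
RENAME fact_x_keep TO fact_x;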
Could you add a "toBeDeleted" column? The query to set that wouldn't need the "NOT IN" construction. Deleting the marked rows should also be "simple".
Then again, deleting 99.4% of the 67 million rows will take some time.
Try /*+ parallel(fe) */. No "S".