Mysterious Oracle query - SQL

If a query in Oracle takes 11 minutes the first time it is executed and only 25 seconds the next time, even with the buffers being flushed in between, what is the possible cause? Could it be that the query is written in a bad way?
set timing on;
set echo on
set lines 999;
insert into elegrouptmp select idcll,idgrpl,0 from elegroup where idgrpl = 109999990;
insert into SLIMONTMP (idpartes, indi, grecptseqs, devs, idcll, idclrelpayl)
select rel.idpartes, rel.indi, rel.idgres,rel.iddevs,vpers.idcll,nvl(cdsptc.idcll,vpers.idcll)
from
relbqe rel,
elegrouptmp ele,
vrdlpers vpers
left join cdsptc cdsptc on
(cdsptc.idclptcl = vpers.idcll and
cdsptc.cdptcs = 'NOS')
where
rel.idtits = '10BCPGE ' and
vpers.idbqes = rel.idpartes and
vpers.cdqltptfc = 'N' and
vpers.idcll = ele.idelegrpl and
ele.idgrpl = 109999990;
alter system flush shared_pool;
alter system flush buffer_cache;
alter system flush global context;
select /* original */ mvtcta_part_SLIMONtmp.idpartes,mvtcta_part_SLIMONtmp.indi,mvtcta_part_SLIMONtmp.grecptseqs,mvtcta_part_SLIMONtmp.devs,
mvtcta_part_SLIMONtmp.idcll,mvtcta_part_SLIMONtmp.idclrelpayl,mvtcta_part_vrdlpers1.idcll,mvtcta_part_vrdlpers1.shnas,mvtcta_part_vrdlpers1.cdqltptfc,
mvtcta_part_vrdlpers1.idbqes,mvtcta_part_compte1.idcll,mvtcta_part_compte1.grecpts,mvtcta_part_compte1.seqc,mvtcta_part_compte1.devs,mvtcta_part_compte1.sldminud,
mvtcta.idcll,mvtcta.grecptseqs,mvtcta.devs,mvtcta.termel,mvtcta.dtcptl,mvtcta.nusesi,mvtcta.fiches,mvtcta.indl,mvtcta.nuecrs,mvtcta.dtexel,mvtcta.dtvall,
mvtcta.dtpayl,mvtcta.ioi,mvtcta.mtd,mvtcta.cdlibs,mvtcta.libcps,mvtcta.sldinitd,mvtcta.flagtypei,mvtcta.flagetati,mvtcta.flagwarnl,mvtcta.flagdonei,mvtcta.oriindl,
mvtcta.idportfl,mvtcta.extnuecrs
from SLIMONtmp mvtcta_part_SLIMONtmp
left join vrdlpers mvtcta_part_vrdlpers1 on
(
mvtcta_part_vrdlpers1.idbqes = mvtcta_part_SLIMONtmp.idpartes
and mvtcta_part_vrdlpers1.cdqltptfc = 'N'
and mvtcta_part_vrdlpers1.idcll = mvtcta_part_SLIMONtmp.idcll
)
left join compte mvtcta_part_compte1 on
(
mvtcta_part_compte1.idcll = mvtcta_part_vrdlpers1.idcll
and mvtcta_part_compte1.grecpts = substr (mvtcta_part_SLIMONtmp.grecptseqs, 1, 2 )
and mvtcta_part_compte1.seqc = substr (mvtcta_part_SLIMONtmp.grecptseqs, -1 )
and mvtcta_part_compte1.devs = mvtcta_part_SLIMONtmp.devs
and (mvtcta_part_compte1.devs = ' ' or ' ' = ' ')
and mvtcta_part_compte1.cdpartc not in ( 'L' , 'R' )
)
left join mvtcta mvtcta on
(
mvtcta.idcll = mvtcta_part_SLIMONtmp.idclrelpayl
and mvtcta.devs = mvtcta_part_SLIMONtmp.devs
and mvtcta.grecptseqs = mvtcta_part_SLIMONtmp.grecptseqs
and mvtcta.flagdonei <> 0
and mvtcta.devs = mvtcta_part_compte1.devs
and mvtcta.dtvall > 20101206
)
where 1=1
order by mvtcta_part_compte1.devs,
mvtcta_part_SLIMONtmp.idpartes,
mvtcta_part_SLIMONtmp.idclrelpayl,
mvtcta_part_SLIMONtmp.grecptseqs,
mvtcta.dtvall;

"if a query in oracle takes the first
time it is executed 11 minutes, and
the next time, the same query 25
seconds, with the buffer being
flushed, what is the possible cause?"
The thing is, flushing the database buffers, like this ...
alter system flush buffer_cache
/
... wipes Oracle's own data cache, but there are other places where data gets cached. For instance, the chances are your OS caches its file reads.
EXPLAIN PLAN is good as a general guide to how the database thinks it will execute a query, but it is only a prediction. It can be thrown out by poor statistics or ambient conditions. It is not good at explaining why a specific instance of a query took as much time as it did.
So, if you really want to understand what occurs when the database executes a specific query you need to get down and dirty, and learn how to use the Wait Interface. This is a very powerful tracing mechanism, which allows us to see the individual events that happen over the course of a single query execution. Each version of Oracle has extended the utility and richness of the Wait Interface, but it has been essential to proper tuning since Oracle 9i (if not earlier).
Find out more by reading Roger Schrag's very good overview.
In your case you'll want to run the trace multiple times. In order to make it easier to compare results you should use a separate session for each execution, setting the 10046 event each time.
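For reference, here is a minimal sketch of setting the 10046 event for a single run in a fresh session (level 12 adds wait and bind details); the tracefile_identifier tag is only an illustrative label, not something the test requires:
alter session set tracefile_identifier = 'slow_query_run1';
alter session set events '10046 trace name context forever, level 12';
-- run the query under test here
alter session set events '10046 trace name context off';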

What else was happening on the box when you ran these? You can get different timings based on other processes chewing resources. Also, with a lot of joins, performance will depend on memory usage (hash_area_size, sort_area_size, etc.) and availability, so perhaps you are paging (also check temp space size/usage). In short, try sql_trace and tkprof to analyze it more deeply.
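As a starting point, a minimal sql_trace / tkprof sketch, assuming SQL*Plus for the trace and an OS shell for tkprof (the trace file name and sort options are illustrative):
alter session set sql_trace = true;
-- run the slow query here
alter session set sql_trace = false;
-- then, from the OS shell, format the trace file written to the trace directory:
-- tkprof orcl_ora_12345.trc slow_query.prf sys=no sort=exeela,fchela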

Sometimes a block is written to the file system before it is committed (a dirty block). When that block is read later, Oracle sees that it was uncommitted. It checks the open transaction and, if the transaction isn't still there, it knows the change was committed. Therefore it writes the block back as a clean block. It is called delayed block cleanout.
That is one possible reason why reading blocks for the first time can be slower than a subsequent re-read.

Could be that the second time the execution plan is already known. Maybe the optimizer has a very hard time finding an execution plan for some reason.
Try setting
alter session set optimizer_max_permutations=100;
and rerun the query. See if that makes any difference.

"Could it be that the query is written in a bad way?"
"Bad" is a rather emotional expression - but broadly speaking, yes: if a query performs significantly faster the second time it's run, it usually means there are ways to optimize the query.
Actually optimizing the query is - as APC says - rather a question of getting "down and dirty". An obvious candidate in your example might be the SUBSTR: if the table is huge and the SUBSTR defeats the index, I'd imagine that would take a bit of time, and I'd imagine the results of all those SUBSTR operations are cached somewhere.
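A hedged sketch only, not a verified fix: if the SUBSTR() expressions really are what defeat index access, you can either index the join columns on COMPTE so the probe is cheap, or build a function-based index matching the expressions (both index names below are made up):
create index compte_join_ix on compte (idcll, grecpts, seqc, devs);
create index slimontmp_grecpt_fbi on slimontmp (substr(grecptseqs, 1, 2), substr(grecptseqs, -1));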

Here's Tom Kyte's take on flushing Oracle buffers as a testing practice. Suffice it to say he's not a fan. He favors the approach of attempting to emulate your production load with your test data ("real life") and tossing out the first and last runs. @APC's point about OS caching is Tom's point - to get rid of that (non-trivial!) effect you'd need to bounce the server, not just the database.

Related

Optimize joining two big tables ORACLE 19C

How can I optimize the query below?
SELECT A.CNACT, A.FACML, A.LCACT, H.CAECH, H.CMECH, H.MCCMP, H.DAHIS,
       RANK() OVER (PARTITION BY H.CNACT, H.CAECH, H.CMECH ORDER BY H.DAHIS DESC) RK
FROM NATACF A, HISTER H
WHERE A.CNACT = H.CNACT;
select count (*) FROM NATACF; -->74794
select count (*) FROM HISTER; -->2100720
You will find the execution plan in the attachment.
Thank you.
As you can see, the window sort and hash join are not optimised effectively. What is the best way to optimise this?
The screenshot below is from the prod database:
Long story short: you want ALL data from *both* tables - there is no filtering in place.
Oracle reads the whole smaller (driving) table into a hash map,
using the joining column CNACT as the key.
Then it reads the whole bigger table and performs a lookup in the hash map for each row read.
The complexity is O(N+M); each row is read only once.
There is no way to evaluate such a query any faster (aside from dirty tricks like putting both tables into a CLUSTER, pinning the tables in the buffer cache, and so on).
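If you want to double-check that the optimizer is already doing exactly this, here is a hedged sketch on the question's own tables - the hints merely spell out a hash join driven by the smaller table, they should not change anything:
SELECT /*+ LEADING(A) USE_HASH(H) */
       A.CNACT, A.FACML, A.LCACT, H.CAECH, H.CMECH, H.MCCMP, H.DAHIS,
       RANK() OVER (PARTITION BY H.CNACT, H.CAECH, H.CMECH ORDER BY H.DAHIS DESC) RK
FROM   NATACF A
JOIN   HISTER H ON A.CNACT = H.CNACT;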
PS: it is strange that the explain plan shows 2 seconds - neither table is actually that big -
while the prod DB says 5 hours.
Try executing the query using:
set timing on
set echo off
set pagesize 0
set termout off
set feedback off
set pause off
set verify off
set heading off
Basically, read the whole result, then discard it and print the execution time. And you will see.
Maybe it is the app (or the network) that has a problem transferring the whole big result set. In such a case you would see the "SQL*Net message to client" wait event in AWR - as if the database were waiting for the application to accept more data. You are sending something like 14GB of data into the application.
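To check for that from the database side, a minimal sketch against the cumulative system counters (an AWR report shows the same events per snapshot interval; TIME_WAITED is in centiseconds):
select event, total_waits, time_waited
from   v$system_event
where  event like 'SQL*Net%client';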
For example, Java may have problems with GC, or each row may trigger a costly Java object creation.
We resolved the problem using a WITH clause:
WITH Z AS (
    SELECT X.DATRA, X.COINI, X.COINT, X.NUCPT, N.COBCN, X.CNACT, N.LCACT, N.CNACR,
           DECODE(X.CSOPT, NULL, X.CAECH, O.CAECR) CAECH,
           DECODE(X.CSOPT, NULL, X.CMECH, O.CMECR) CMECH,
           X.CSOPT, X.MTSNA, N.COTSJ, X.CSENS, X.QTCCP, D.TXCHA, R.MAINT, X.CODEV,
           D.CDVRF, C.TYEDI, C.NUSES
    FROM   CUMPOR X,
           MRXIDE C, .....
The WITH clause: the materialized subquery data is persistent throughout the query.
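For comparison, a hedged sketch of the same idea on the tables from the question; the (undocumented) MATERIALIZE hint asks Oracle to spool the subquery result once and reuse it, which is roughly what the poster is relying on:
WITH z AS (
    SELECT /*+ MATERIALIZE */ H.CNACT, H.CAECH, H.CMECH, H.MCCMP, H.DAHIS
    FROM   HISTER H
)
SELECT A.CNACT, A.FACML, A.LCACT, z.CAECH, z.CMECH, z.MCCMP, z.DAHIS
FROM   NATACF A
JOIN   z ON A.CNACT = z.CNACT;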

Can I avoid locking rows for unnecessary amounts of time [in Django]?

Take the following code for example (ignore the lack of usefulness in its functionality, as it's just a simple example to include the things I need):
@transaction.commit_on_success
def test_for_success(username):
    person = Person.objects.select_for_update().get(username=username)
    response = urllib2.urlopen(URL_TO_SOME_SLOW_API, some_data)
    if '<YES>' in response.read():
        person.successes += 1
        person.save()
My question about the example has to do with when the queries hit the database. Clearly the first query will lock the Person row, and then I'm calling a slow API that could take 3 seconds to respond, causing the row to be locked for those 3 seconds. Am I understanding this correctly? And, in the case of slow API hits happening in my transaction, if I move my queries so that the SELECT FOR UPDATE doesn't happen until after all the slow API requests, will this have the seemingly obvious effect of not locking my rows for seconds at a time (the use of select_for_update in my application is unavoidable)? Or am I misunderstanding, and does none of the SQL actually hit the database until the end of the transaction?
Your assumptions about your code are correct. If you look at the select_for_update() docs, this call locks the selected rows in the database until the end of the transaction. That would, in effect, hold the lock for the duration of your urllib request.
If you were to move the database call into the conditional after the request, you are correct again that the rows would be locked for a much shorter amount of time (though if that path is called a lot, some clients will still block on the call due to contention).
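For what it's worth, here is roughly the SQL that the ORM emits in this flow (the table name, the username value, and the generic BEGIN/COMMIT syntax are illustrative); it makes clear why the lock spans the slow API call:
BEGIN;
SELECT * FROM person WHERE username = 'bob' FOR UPDATE;  -- row lock taken here
-- ... the slow urllib2 call runs in application code while the lock is held ...
UPDATE person SET successes = successes + 1 WHERE username = 'bob';
COMMIT;  -- row lock released only here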

SQL update working not insert

OK, I am going to do my best describing this. I have a stored procedure that takes XML as input and updates and inserts into another table. This was working yesterday. All I changed today was loading the temp table with OPENXML instead of xml.nodes(); I even changed it back, and I am still getting this interesting issue. I have an update and an insert in the same transaction. The update works and then the insert hangs - no error, nothing - going on 9 minutes, when it normally takes 10 seconds. There are no blocking processes according to master.sys.sysprocesses. The funny thing is that the SELECT feeding the INSERT returns no rows, as they are already in the database. The update updates 72438 rows in:
SQL Server Execution Times:
CPU time = 1359 ms, elapsed time = 7955 ms.
ROWS AFFECTED(72438)
I am out of ideas as to what could be causing my issue. Permissions? I don't think so. Space? I don't think so, because an error would be returned.
queries:
UPDATE [Sales].[dbo].[WeeklySummary]
SET [CountryId] = I.CountryId
,[CurrencyId] = I.CurrencyId
,[WeeklySummaryType] = @WeeklySummaryTypeId
,[WeeklyBalanceAmt] = M.WeeklyBalanceAmt + I.WeeklyBalanceAmt
,[CurrencyFactor] = I.CurrencyFactor
,[Comment] = I.Comment
,[UserStamp] = I.UserStamp
,[DateTimeStamp] = I.DateTimeStamp
FROM
[Sales].[dbo].[WeeklySummary] M
INNER JOIN
#WeeklySummaryInserts I
ON M.EntityId = I.EntityId
AND M.EntityType = I.EntityType
AND M.WeekEndingDate = I.WeekEndingDate
AND M.BalanceId = I.BalanceId
AND M.ItemType = I.ItemType
AND M.AccountType = I.AccountType
and
INSERT INTO [Sales].[dbo].[WeeklySummary]
([EntityId]
,[EntityType]
,[WeekEndingDate]
,[BalanceId]
,[CountryId]
,[CurrencyId]
,[WeeklySummaryType]
,[ItemType]
,[AccountType]
,[WeeklyBalanceAmt]
,[CurrencyFactor]
,[Comment]
,[UserStamp]
,[DateTimeStamp])
SELECT
I.[EntityId]
, I.[EntityType]
, I.[WeekEndingDate]
, I.[BalanceId]
, I.[CountryId]
, I.[CurrencyId]
, @WeeklySummaryTypeId
, I.[ItemType]
, I.[AccountType]
, I.[WeeklyBalanceAmt]
, I.[CurrencyFactor]
, I.[Comment]
, I.[UserStamp]
, I.[DateTimeStamp]
FROM
#WeeklySummaryInserts I
LEFT OUTER JOIN
[Sales].[dbo].[WeeklySummary] M
ON I.EntityId = M.EntityId
AND I.EntityType = M.EntityType
AND I.WeekEndingDate = M.WeekEndingDate
AND I.BalanceId = M.BalanceId
AND I.ItemType = M.ItemType
AND I.AccountType = M.AccountType
WHERE M.WeeklySummaryId IS NULL
UPDATE: Trying the advice here worked for a while. I run the following before my stored procedure call:
UPDATE STATISTICS Sales.dbo.WeeklySummary;
UPDATE STATISTICS Sales.dbo.ARSubLedger;
UPDATE STATISTICS dbo.AccountBalance;
UPDATE STATISTICS dbo.InvoiceUnposted;
UPDATE STATISTICS dbo.InvoiceItemUnposted;
UPDATE STATISTICS dbo.InvoiceItemUnpostedHistory;
UPDATE STATISTICS dbo.InvoiceUnpostedHistory;
EXEC sp_recompile N'dbo.proc_ChargeRegister'
Still stalling at the Insert Statement, which again inserts 0 rows.
There are really only a few things that can be going on, and the trick here is to eliminate them in order, from simplest to most complex.
STEP 1: Hand craft a set of XML to run that will produce exactly one insert and no updates, so you can go "back to basics" as it were and establish that the code is still doing what you expect, and the result is exactly what you expect. This may seem silly or unnecessary but you really need this reality check to start.
STEP 2: Hand craft a set of XML that will produce a medium-size set of inserts, still with no updates. Based on your experience with the routine, try to find something that will run in a 3-4 seconds. Perhaps 5000 rows. Does it continue to behave as expected?
STEP 3: Assuming steps 1 and 2 pass easily, the next most likely problem is TRANSACTION SIZE. If your update hits 74,000 rows in a single statement, then SQL Server must allocate resources to be able to roll back all 74,000 rows in the case of an abort. Generally you should assume the resources (and time) required to maintain a transaction explode exponentially as the row count goes up. So hand-craft one more set of inserts that contains 50,000 rows. You should find it takes dramatically more time. Let it finish. Is it 10 minutes, an hour? If it takes a long time but finishes, you have an issue with TRANSACTION SIZE, the server is choking trying to keep track of everything required to roll back the insert in the event of failure.
STEP 4: Determine if your entire stored procedure is operating within a single implied transaction. If it is, the matter is even worse, because SQL Server is tracking together everything required to roll back both the 74,000 updates and the ??? inserts in a single transaction. See this page:
http://msdn.microsoft.com/en-us/library/ms687099(v=vs.85).aspx
STEP 5: If you've got a single implicit transaction, you can either: A) turn that off, which may help some but will not entirely fix the problem, or B) break the sproc into two separate calls, one for updates and one for inserts, so that at least the two are in separate transactions.
STEP 6: Consider "chunking". This is a technique for avoiding exploding transaction costs. Considering just the INSERT to get us started, you wrap the insert into a loop that begins and commits a transaction inside each iteration, and exits when affected rows is zero. The INSERT is modified so that you pull only the first 1000 rows from the source and insert them (that 1000 number is kind of arbitrary, you may find 5000 produces better performance, you have to experiment a bit). Once the INSERT affects zero rows, there are no more rows to handle and the loop exits.
QUICK EDIT: The "chunking" system works because the complete throughput for a large set of rows looks something like a quadratic. If you execute an INSERT that affects a huge number of rows, the total time for all rows to be handled explodes. If on the other hand you break it up and go row-by-row, the overhead of opening and committing each statement causes the total time for all rows to explode. Somewhere in the middle, when you've "chunked" out 1k rows per statement, the transaction requirements are at their minimum and the overhead of opening and committing the statement is negligible, and the total time for all rows to be handled is a minimum.
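To make STEP 6 concrete, here is a minimal sketch of the chunking loop using the poster's own table and column names; the 1000-row batch size, the WHILE shape, and the local declaration of @WeeklySummaryTypeId (type assumed) are illustrative, not prescriptive:
DECLARE @WeeklySummaryTypeId int   -- in the real proc this is already populated
DECLARE @rows int
SET @rows = 1

WHILE @rows > 0
BEGIN
    BEGIN TRANSACTION

    INSERT INTO [Sales].[dbo].[WeeklySummary]
        ([EntityId], [EntityType], [WeekEndingDate], [BalanceId], [CountryId], [CurrencyId],
         [WeeklySummaryType], [ItemType], [AccountType], [WeeklyBalanceAmt], [CurrencyFactor],
         [Comment], [UserStamp], [DateTimeStamp])
    SELECT TOP (1000)
        I.[EntityId], I.[EntityType], I.[WeekEndingDate], I.[BalanceId], I.[CountryId], I.[CurrencyId],
        @WeeklySummaryTypeId, I.[ItemType], I.[AccountType], I.[WeeklyBalanceAmt], I.[CurrencyFactor],
        I.[Comment], I.[UserStamp], I.[DateTimeStamp]
    FROM #WeeklySummaryInserts I
    LEFT OUTER JOIN [Sales].[dbo].[WeeklySummary] M
        ON  I.EntityId = M.EntityId
        AND I.EntityType = M.EntityType
        AND I.WeekEndingDate = M.WeekEndingDate
        AND I.BalanceId = M.BalanceId
        AND I.ItemType = M.ItemType
        AND I.AccountType = M.AccountType
    WHERE M.WeeklySummaryId IS NULL

    SET @rows = @@ROWCOUNT
    COMMIT TRANSACTION   -- each chunk commits on its own, keeping the transaction small
END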
I had a problem where the stored proc was actually getting recompiled in the middle of running because it was deleting rows from a temp table. My situation doesn't look like yours, but mine was so odd that reading about it might give you some ideas.
Unexplained SQL Server Timeouts and Intermittent Blocking
I think you should post the full stored proc because the problem doesn't look to be where you think it is.

linq to sql insert locks

Here is some insert code:
gkInfo.data.ToList()
    .ForEach(p => p.hour.ToList()
        .ForEach(r => r.block.ToList()
            .ForEach(q =>
            {
                var v = new VarValues();
                v.dt = DateTime.Parse(p.target_date + " " + (r.value - 1).ToString() + ":00:00");
                v.id_objecttype = config.stations.Where(i => i.text == q.station_name).Single().id_objecttype;
                v.id_object = q.bnum.ToString();
                v.id_param = config.stations.Where(i => i.text == q.station_name).Single().id_param;
                v.pl_lev = 3;
                v.source = 0;
                v.value = q.block_state;
                v.version = version;
                v.description = q.change_type;
                m53500context1.VarValues.InsertOnSubmit(v);
            })));
m53500context1.SubmitChanges();
And this code locks the table.
Can I avoid it, or is that impossible?
Although I don't know all the details of your issue, the pattern seems very familiar. Often you need to do a big update in the database, but at the same time the database still needs to be available - so that, for example, a web site working off the dataset does not time out while the update operations are in progress.
Sometimes the update operations are regular exports from another database; sometimes they calculate some caches, not unlike the example you have provided.
If you need your update to be transactional (i.e. all or nothing), there is no real way around the lock: while the update is underway, the table is locked. If you don't need a transaction, you can try to break your update up into smaller batches. SubmitChanges wraps all the changes in a single transaction, so you are going to need several SubmitChanges calls, so that each individual transaction is fast and thus does not lock the table for long.
If being transactional is a requirement, you can do your inserts in a staging area, i.e. not in the same table that other processes read from. When the insert is finished, you figure out a way to swap the areas. This could be complicated by the fact that there may be updates to this table that you haven't accounted for, but I do not know if that is true in your case.
In the worst case you will need some application logic that knows an update is in progress and, while it's happening, reads the data from an alternate location. Of course you will have to provide that alternate location (a copy) to read from.
There is no hard and fast answer, but there are a few things (above) that you can try. Also, feel free to tell us more about your specific task / requirements.
Please see: LINQ To SQL NO_LOCK. You need to set a different isolation level for your transaction.
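At the SQL level, the linked approach boils down to letting readers run at a looser isolation level so they do not queue behind the insert's locks; whether dirty reads are acceptable here is an assumption only you can make (the table name below is guessed from the LINQ entity):
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT COUNT(*) FROM dbo.VarValues   -- readers at this level do not block on the writer's locks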

What was your coolest SQL optimization, on a slow performing query?

Just speaking to a colleague of mine. He was walking with a hop in his step, on the way to the coffee machine.
I asked him "what's with the 'swarmy' walk?", he said, "I just reduced a two hour long query down to 40 seconds! It feels so good".
He altered a stored procedure that was using cursors and introduced a temp table that was refactored from the original dataset - I will email him soon to get more info on the actual implementation.
But ultimately, he was buzzing.
Question is, what SQL sticks in your mind and has made you buzz whilst optimising slow performing queries?
I have to say when I learned how to create and use covered indexes. Now, THAT was a performance booster.
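A hedged sketch of the idea, with an entirely hypothetical table: the INCLUDE columns let the query below be answered from the index alone, with no key lookups back to the table.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerId_Covering
    ON dbo.Orders (CustomerId)
    INCLUDE (OrderDate, TotalAmount);

-- this query can now be satisfied entirely from the index:
SELECT OrderDate, TotalAmount FROM dbo.Orders WHERE CustomerId = 42;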
Using SQL's BULKIMPORT to reduce several hours of inherited INSERT code to less than a minute.
It's always nice to take a poorly written, cursor-laden query and eliminate cursors, cut the code by half, and improve performance many-fold.
Some of the best improvements are in clarity (and often result in nice performance boosts, too).
Sorry, I don't tend to get a buzz from that sort of thing but most situations have been pretty basic, monitoring performance of queries and adding indexes to speed them up.
Now, increasing the speed of "real" code that I've written, by changing data structures and algorithms within the class - that's where I get my buzz (and my reputation as the go-to man for performance fixes at work).
Hey, on the iPhone, which uses SQLite, I straight away reduced my database processing time from 40 seconds to 2 seconds with the use of exclusive write transactions... I was super happy doing this,
as this was my first experience of SQL on an embedded device - quite different from the usual server-related stuff (indexes, normalizations, etc.).
As far as servers go, indexes are a real blessing. Also, if you take a bit of pain and get rid of as many NULLs as you can in your tables, you will be surprised by the performance gains - not many developers focus on NULLs; they usually go with indexes and other documented stuff.
A few other lesser-exploited ways: using XML to process multiple batch inserts / updates / deletes in one go instead of doing one insert at a time - in SQL 2005 this can be super cool.
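A hedged sketch of that XML batch-insert idea for SQL Server 2005 (table, column, and element names are all made up):
DECLARE @batch xml
SET @batch = N'<rows><r id="1" name="alpha"/><r id="2" name="beta"/></rows>'

INSERT INTO dbo.SomeTable (Id, Name)          -- hypothetical target table
SELECT x.r.value('@id',   'int'),
       x.r.value('@name', 'nvarchar(50)')
FROM   @batch.nodes('/rows/r') AS x(r)        -- shred the XML into rows in one statement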
It's all about indexes. And avoiding stupid things that make them useless.
Changing the order of conditions inside the WHERE clause so that it filters on the most discriminating condition first (while at the same time removing indexes on non-discriminating columns like gender).
Back in the day, I worked on a CICS/DB2 system, written in COBOL. A lot of our queries were doing full table scans (and slow) even though we had all the proper indexes and WHERE clauses.
It turned out (and I may have this backwards, it's been 15 years) that the problem was that we were using PIC S9(n) COMP in WORKING STORAGE for the query parameters, but DB2 wanted PIC S9(n) COMP-3. By using the wrong data type, DB2 had to do a full table scan in order to convert the values in the database to the value we were passing in. We changed our variable definitions and the queries were able to use the indexes now, which dramatically improved our performance.
I had a query that was originally written for SQL Server 6.5, which did not support the SQL 92 join syntax, i.e.
select foo.baz
from foo
left outer join bar
on foo.a = bar.a
was instead written as
select foo.baz
from foo, bar
where foo.a *= bar.a
The query had been around for a while, and the relevant data had accumulated to make the query run too slow, taking about 90 seconds to complete. By the time this problem arose, we had upgraded to SQL Server 7.
After mucking about with indexes and other Easter-egging, I changed the join syntax to be SQL 92 compliant. The query time dropped to 3 seconds.
I don't think I'll ever have that feeling again. I was a f%$^ing hero.
I answered this on a previous question ("Biggest performance improvement you’ve had with the smallest change?"), however, it's such a simple improvement, yet one that is and can be so often overlooked, that it bears repeating:
Indexes!
Well, we had a similar thing where we had a slow query on an Open Freeway site. The answer wasn't so much optimising the query as optimising the server it was on. We increased the cache limit and cache size so that the server would not run the query so often.
This has massively increased the speed of the system and ultimately made the client happy! :)
Not quite the calibre of the original posts optimisation skills, but it definitely made us buzz!
Splitting one ridiculously long stored procedure, which did a great deal of "if it's after 5 pm, return this bit of sql" and which took in excess of 20 seconds to run, into a set of stored procedures that were called by one controlling sp, and got the times down to subsecond responses.
One word: dynamic queries.
If you are searching with large numbers of parameters, you can discount them from the SQL string. This has sped up my queries dramatically and with relative ease.
CREATE PROCEDURE dbo.qryDynamic
(
    @txtParameter1 nvarchar(255),
    @txtParameter2 nvarchar(255)
)
AS
BEGIN
    DECLARE @SQL nvarchar(2500)
    DECLARE @txtJoin nvarchar(50)

    SET @txtJoin = ' WHERE '
    SET @SQL = 'SELECT qry_DataFromAView.*
    FROM qry_DataFromAView'

    IF @txtParameter1 IS NOT NULL
    BEGIN
        SET @SQL = @SQL + @txtJoin + ' Field1 LIKE N''%'' + @dynParameter1 + N''%'' '
        SET @txtJoin = ' AND '
    END

    IF @txtParameter2 IS NOT NULL
    BEGIN
        SET @SQL = @SQL + @txtJoin + ' Field2 LIKE N''%'' + @dynParameter2 + N''%'' '
        SET @txtJoin = ' AND '
    END

    SET @SQL = @SQL + ' ORDER BY Field2'

    EXEC sp_executesql @SQL,
        N'@dynParameter1 nvarchar(255), @dynParameter2 nvarchar(255)',
        @dynParameter1 = @txtParameter1, @dynParameter2 = @txtParameter2
END
GO
I had a warm glow after being able to use a Cross Tab query to scrap oodles (technical term) of processing and lookups...
Usually it's simple things like adding indexes or only getting the data you need, but when you find a problem that fits an answer you've seen before... good times!
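For anyone curious what a cross-tab looks like in T-SQL, a hedged sketch with hypothetical names (PIVOT is available from SQL Server 2005 onwards):
SELECT Region, [2019], [2020]
FROM  (SELECT Region, SalesYear, Amount FROM dbo.RegionalSales) AS src
PIVOT (SUM(Amount) FOR SalesYear IN ([2019], [2020])) AS p;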
(Halfway off topic)
I rewrote a 3000 line stored procedure into LINQ2SQL/C#.
The stored procedure juggled lots of data between a bunch of unindexed temp tables.
The LINQ2SQL version read the data into a couple of Dictionaries and ILookups and then I joined the data manually with plain old C# code.
The stored procedure took about 20 seconds and the LINQ2SQL/C# version took 0.2 seconds.