Can I avoid locking rows for unnecessary amounts of time [in Django]? - sql

Take the following code for example (ignore the lack of usefulness in its functionality, as it's just a simple example to include the things I need):
#transaction.commit_on_success
def test_for_success(username)
person = Person.objects.select_for_update().get(username=username)
response = urllib2.urlopen(URL_TO_SOME_SLOW_API, some_data)
if '<YES>' in response.read():
person.successes += 1
person.save()
My question pertaining the example has to do with when the queries hit the database. Clearly the first query will lock the Person row, and then I'm calling a slow API, which could take 3 seconds to respond, causing the row to be locked for 3 seconds. Am I understanding this correctly, and in the case of slow API hits happening in my transaction, if I move the location of my queries so that a SELECT FOR UPDATE doesn't happen until after all the slow API requests, will this have the seemingly obvious effect of not locking my rows for seconds at a time (the case for select_for_update in my application is unavoidable)? Or, am I misunderstanding, and somehow none of the SQL actually hits the database until the end of the transaction?

Your assumptions about your code are correct. If you look at the select_for_update() docs, this action does lock those rows in the database until they are unlocked. This would in effect lock out for the duration of your urllib request.
If you were to move the database call into the conditional after the request, you are correct again that the database would be locked for a much shorter amount of time (though if that is called alot will still have some clients who block on the call due to contention).

Related

How to discard / ignore one of a stored procedure's many return values

I have a stored procedure that returns 2 values.
In another procedure, I call this (edit: NOT selectable) procedure but only need one of the two returned values.
Is there a way to discard the other value? I'm wondering what is a good practice, and hoping for a small performance gain.
Here is how I call the procedure without error:
CREATE or ALTER procedure my_proc1
as
declare variable v_out1 integer default null;
declare variable v_out2 varchar(10) default null;
begin
execute procedure my_proc2('my_param')
returning_values :v_out1, :v_out2;
end;
That is the only way I found to call this procedure without getting a -607 error 'unsuccessful metadata update request depth exceeded. (Recursive definition?)' whenever I use only one variable v_out1.
So my actual question is: can I avoid creating a v_out2 variable for nothing, as I will never use it (that value is only used in other procedures which also call my_proc2)?
Edit: the stored procedure my_proc2 is actually not selectable. But I made it selectable after all.
Because your stored procedure is selectable, you should call it by SELECT statement, ie
select out1, out2 from my_proc2('my_param')
and in that case you can indeed omit some of the return value(s). However, I wouldn't expect noticeable performance gain as the logic inside the SP which calculates the omitted field is still executed.
If your procedure is not selectable, then creating a wrapper SP is the only way, but again, it woulnd't give any performance gain as the code which does the hard work inside the original SP is still executed.
The answer is made to use text formatting while demonstrating "race conditions" in the multithreading programming (which SQL is) when [ab]using out-of-transaction objects (SQL sequences aka Firebird Generators).
So, the "use case".
Initial condition: table is empty, generator=0.
You start two concurrent transactions, A and B. For ease of imagining you may think those transactions were started from concurrent connections made by two persons working with your program on two networked computers. Though actually it does not matter much, if you open them transactions from one same connection - the scenario would not change a bit. Just for the ease of imagining.
The Tx.A issues UPDATE-OR-INSERT which inserts new row into the table. Doing so it up-ticks the generator. The transaction is not committed yet. Database condition: the table has one invisible (non-committed) row with auto_id=1, the generator = 1.
The Tx.B issues UPDATE-OR-INSERT too which inserts yet another row into the table. Doing so it also up-ticks the generator. The transaction maybe commits now, or maybe later, irrelevant. Database condition: the table has two rows (one or both are invisible (non-committed)) with auto_id=1 and auto_id=2, the generator = 2.
The Tx.A meets some error, throws the exception, DOWNTICKS the generator and rolls back. Database condition: the table has one row with auto_id=2 the generator = 1.
If Tx.B was not committed before, it is committed now. (this "if" just to demonstrate that it does not matter when other transactions would be committed, earlier or later, it only matters that Tx.A downticks the generator after any other transaction upticked it)
So, the final database condition: the table has one committed=visible row with auto_id=2 and the generator = 1.
Any next attempt to add yet one more row would try to up the generator 1+1=2 and then fail to insert new row with PK violation, then it would down the generator to 1 to recreate the faulty condition outlined above.
Your database stuck and without direct intervention by DB Administrator can not have data added further.
The very idea of rolling back the generator is defeating all intentions generators were created for and all expectations about generators behavior that the database and connection libraries and other programmers have.
You just placed a trap on the highway. It is only a matter of time until someone will be caught into it.
Even if you would continue guarding this hack by other hacks for now - wasting a lot of time and attention to do that scrupulously and pervasively - still one unlucky day in the future there would be another programmer, or even you would forget this gory details - and you would start using the generator in standard intended way - and would run into the trap.
Generators were not made to be backtracked during normal work.
existence of primary key is checked in the procedure before doing anything
Yep, that is the first reaction when multithreading programmer meets his first race condition. Let's just add more prior checks.
First few checks indeed can decrease probability of a clash, but it never can alleviate it completely. And the more use your program would see, the more transactions would get opened by more and more concurrent and active users - it is only a matter of time until this somewhat lowered probability would turn out still too much.
Think about it, SQL is about transactions, yet they had to invent and introduce explicitly out-of-transactions device Generator/Sequence is. If there was reliable solution without them - it would be just used instead of creating that so non-SQLish transaction boundary breaking tool.
When you say your SP "checks for PK violation" it is exactly the same as if you would drop the generator altogether and instead just issue "good old"
:new_id = ( select max(auto_id)+1 from MyTable );
By your description you actually do something like that, but in some indirect way. Something like
while exists( select * from MyTable where auto_id = gen_id(MyGen, +1))
do ;
:new_id = gen_id(MyGen, 0);
You may feel, that because you mentioned generators, you somehow overcame the cross-transaction invisibility problem. But you did not, because the very check "if PK was already taken" is done against in-transaction table.
That changes nothing, your two transactions Tx.A and Tx.B would not see each other's records, because they both did not committed yet. Now it only takes some unlucky Tx.C that would fail and downtick the generator to them collide on the same ID.
Or not, you do not even need Tx.C and downticking at all!
Here we bump into the multithreading idea about "atomic operations".
Let's look at it again.
while exists( select * from MyTable where auto_id = gen_id(MyGen, +1))
do ;
:new_id = gen_id(MyGen, 0);
In a single-threaded application that code is okay: you just keep running the generator up until the free slot, then you just query the value without changing it. "What could possibly go wrong?" But in multithreaded environment it is rooks waiting to be stepped over. Example:
Initial condition, table has 100 rows (auto_id goes from 1 to 100), the generator = 100.
Tx.A starts adding the row, upticks the generator in the while loop and exits the loop. It does not yet pass to the second line where local variable gets assigned. Not yet. The generator = 101, rows not added yet.
Tx.B starts adding the row, upticks the generator in the while loop and exits the loop. The generator = 102, rows not added yet.
Tx.A goes to the second line and reads gen_id(MyGen,0) into a variable for new row. While it was 101 out of the loop, it is 102 now!
Tx.B goes to the second line and reads gen_id(MyGen,0) and gets 102 too.
Tx.A and Tx.B both try to insert new row with auto_id=102
RACE CONDITIONS - both Tx.A and Tx.B try to commit their work. One of them succeeds, another fails. Which one? It is not predictable. A lucky one commits, an unlucky one fails.
The failed transaction downticks the generator.
Final condition: the table has 101 rows, the auto_id consistently goes from 1 to 100 and then skips to 102. The generator = 101, which his less than MAX(auto_id)
Now you might want to add more hacks, I mean more prior checks before actually inserting rows and committing. It will make mistakes yet less probable, right? Wrong. The more checks you do - the slower gets the code. The slower gets the code - the greater gets probability, that while one thread runs throw all them checks there happens another thread that interferes and alters the situation that was checked a moment ago.
The fundamental issue with multithreading is that any check is SEPARATE action. And between those actions the situation MAY change. Your procedure may check whatever it wants BEFORE actually inserting the row. It would not warrant much. Because when you finally gets at the row inserting statement, all the checks you did in the PAST are a matter of past. And the situation is potentially already altered. And warrants your checks were giving in the PAST only belong to that past, not to the moment at hands.
And even if you no more look for warranting sure thing, still adding every new check you can not even be sure if doing so you just decreased or increased probability of failure. Because multithreading is a bitch, it is flowing chaotically out of your control.
So, remember the KISS principle. Until proven otherwise - you most probably do not need SP2 at all, you only need one single UPDATE-OR-INSERT statement.
PS. There was a pretty fun game in my school days, it was called Pascal Robots. There are also C Robots I heard and probably implementation for other many languages. With Pascal Robots though came a number of already coded robots, demonstrating different strategies and approaches. Some of them were really thought out in very intrinsic details. And there was one robot which program was PRIMITIVE. It only had two loops: if you do not see an enemy - keep turning your radar around, if you do see an enemy - keep running to it and shooting at it. That was all. What could this idiot do against sophisticated robots having creative attack and defense strategies, flanking maneuvers, optimal distance to maintain by back and forth movements, escape tricks and more? Those sophisticated robots employed very extensive checks and very thought through hacks to be triggered by those checks. So... ...so that primitive idiot was second or maybe third best robot in the shipped set. there was only one or two smarties who could outwit it. With ALL the other robots this lean-and-fast idiot finished them before they could run through all their checks and hacks thrice. That is what multithreading does to programming. It was astonishing to watch those battles, which went so against out single-threaded intuition.

SQL MIN_ACTIVE_ROWVERSION() value does not change for a long while

We're troubleshooting a sort of Sync Framework between two SQL Server databases, in separate servers (both SQL Server 2008 Enterprise 64 bits SP2 - 10.0.4000.0), through linked server connections, and we reached to a point in which we're sort of stuck.
The logic to identify which are the records "pending to be synced" is of course based on ROWVERSION values, including the use of MIN_ACTIVE_ROWVERSION() to avoid dirty reads.
All SELECT operations are encapsulated in SPs on each "source" side. This is a schematic sample of one SP:
PROCEDURE LoaderRetrieve(#LastStamp bigint, #Rows int)
BEGIN
...
(vars handling)
...
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
Select TOP (#Rows) Field1, Field2, Field3
FROM Table
WHERE [RowVersion] > #LastStampAsRowVersionDataType
AND [RowVersion] < #MinActiveVersion
Order by [RowVersion]
END
The approach works just fine, we usually sync records with the expected rate of 600k/hour (job every 30 seconds, batch size = 5k), but at some point, the sync process does not find any single record to be transferred, even though there are several thousand of records with a ROWVERSION value greater than the #LastStamp parameter.
When checking the reason, we've found that the MIN_ACTIVE_ROWVERSION() has a value less than (or slightly greater, just 5 or 10 increments) the #LastStamp being searched. This of course shouldn't be a problem since the MIN_ACTIVE_ROWVERSION() approach was introduced to avoid dirty reads and posterior issues, BUT:
The problem we see in some occasions, during the above scenario occurs, is that the value for MIN_ACTIVE_ROWVERSION() does not change during a long (really long) period of time, like 30/40 minutes, sometimes more than one hour. And this value is by far less than the ##DBTS value.
We first thought this was related to a pending DB transaction not yet committed. As per MSDN definition about the MIN_ACTIVE_ROWVERSION() (link):
Returns the lowest active rowversion value in the current database. A rowversion value is active if it is used in a transaction that has not yet been committed.
But when checking sessions (sys.sysprocesses) with open_tran > 0 during the duration of this issue, we couldn't find any session with a waittime greater than a few seconds, only one or two occurrences of +/- 5 minutes waittime sessions.
So at this point we're struggling to understand the situation: The MIN_ACTIVE_ROWVERSION() does not change during a huge period of time, and no uncommitted transactions with long waits are found within this time frame.
I'm not a DBA and could be the case that we're missing something in the picture to analyze this problem, doing some research on forums and blogs couldn't found any other clue. So far the open_tran > 0 was the valid reason, but under the circumstances I've exposed, it's clear that there's something else and don't know why.
Any feedback is appreciated.
well, I finally find the solution after digging a bit more.
The problem is that we were looking for sessions with a long waittime, but the real deal was to find sessions which have an active batch since a while.
If there's a session where open_tran = 1, to obtain exactly since when this transaction is open (and of course still active, not yet committed), the field last_batch from sys.sysprocesses shall be checked.
Using this query:
select
batchDurationMin= DATEDIFF(second,last_batch,getutcdate())/60.0,
batchDurationSecs= DATEDIFF(second,last_batch,getutcdate()),
hostname,open_tran,* from sys.sysprocesses a
where spid > 50
and a.open_tran >0
order by last_batch asc
we could identify a session with an open tran being active 30+ minutes. And with hostname values and some more checks within the web services (and also using dbcc inputbuffer) we found the responsible process.
So, the final question actually is "there's indeed an active session with an uncommitted transaction", therefore the MIN_ACTIVE_ROWVERSION() does not change. We were just looking processes with the wrong criteria.
Now that we know which process behaves like this, next step will be to improve it.
Hope this results useful to someone else.

how to get next 1000 records the fastest way

I'm using Azure Table Storage.
Let's say i have a Partition in my Table with 10,000 records, and I would like to get records number 1000 to 1999. And next time i would like to get records number 4000 to 4999 etc.
What is the fastest way of doing that?
All I can find till now are two options, which I don't like very much:
1. run a query which returns all 10,000 records, and filter out what I want when I get all 10,000 records.
2. Run a query whichs returns 1000 records at a time, and use a continuation token to get the next 1000 records.
Is it possible to get a continuation token without downloading all corresponding records? It would be great if i can get Continuation Token 1, than get Continuation token 2, and with CT2 get records 2000 to 2999.
Theoretically you should be able to use continuation tokens without downloading the actual data for the first 1000 recors by closing the connection you have after the first request. And I mean closing it at TCP level. And before you read all data. Then open a new connection and use continuation token there. Two WebRequests will not do it since the HTTP implementation will likely use keep alive wchich means all your data is going to be read in the background even though you don't read it in your code. Actually you can configure your HTTP requests to not use keep alive.
However, another way is naturally if you know the RowKey and can search on that but I assume you don't know which row keys will be in each 1000 entity batch.
Last I would ask why you have this problem in the first place. And what your access pattern is. If inserts are common and getting these records is rare I wouldn't bother making it more efficient. if this is like a paging problem i would probably get all data on the first request and cache it (in the cloud). if inserts are rare but you need to run this query often I would consider making the insertion of data have one partion for every 1000 entities and rebalance as needed (due to sorting) as entities are inserted.

mysterious oracle query

if a query in oracle takes the first time it is executed 11 minutes, and the next time, the same query 25 seconds, with the buffer being flushed, what is the possible cause? could it be that the query is written in a bad way?
set timing on;
set echo on
set lines 999;
insert into elegrouptmp select idcll,idgrpl,0 from elegroup where idgrpl = 109999990;
insert into SLIMONTMP (idpartes, indi, grecptseqs, devs, idcll, idclrelpayl)
select rel.idpartes, rel.indi, rel.idgres,rel.iddevs,vpers.idcll,nvl(cdsptc.idcll,vpers.idcll)
from
relbqe rel,
elegrouptmp ele,
vrdlpers vpers
left join cdsptc cdsptc on
(cdsptc.idclptcl = vpers.idcll and
cdsptc.cdptcs = 'NOS')
where
rel.idtits = '10BCPGE ' and
vpers.idbqes = rel.idpartes and
vpers.cdqltptfc = 'N' and
vpers.idcll = ele.idelegrpl and
ele.idgrpl = 109999990;
alter system flush shared_pool;
alter system flush buffer_cache;
alter system flush global context;
select /* original */ mvtcta_part_SLIMONtmp.idpartes,mvtcta_part_SLIMONtmp.indi,mvtcta_part_SLIMONtmp.grecptseqs,mvtcta_part_SLIMONtmp.devs,
mvtcta_part_SLIMONtmp.idcll,mvtcta_part_SLIMONtmp.idclrelpayl,mvtcta_part_vrdlpers1.idcll,mvtcta_part_vrdlpers1.shnas,mvtcta_part_vrdlpers1.cdqltptfc,
mvtcta_part_vrdlpers1.idbqes,mvtcta_part_compte1.idcll,mvtcta_part_compte1.grecpts,mvtcta_part_compte1.seqc,mvtcta_part_compte1.devs,mvtcta_part_compte1.sldminud,
mvtcta.idcll,mvtcta.grecptseqs,mvtcta.devs,mvtcta.termel,mvtcta.dtcptl,mvtcta.nusesi,mvtcta.fiches,mvtcta.indl,mvtcta.nuecrs,mvtcta.dtexel,mvtcta.dtvall,
mvtcta.dtpayl,mvtcta.ioi,mvtcta.mtd,mvtcta.cdlibs,mvtcta.libcps,mvtcta.sldinitd,mvtcta.flagtypei,mvtcta.flagetati,mvtcta.flagwarnl,mvtcta.flagdonei,mvtcta.oriindl,
mvtcta.idportfl,mvtcta.extnuecrs
from SLIMONtmp mvtcta_part_SLIMONtmp
left join vrdlpers mvtcta_part_vrdlpers1 on
(
mvtcta_part_vrdlpers1.idbqes = mvtcta_part_SLIMONtmp.idpartes
and mvtcta_part_vrdlpers1.cdqltptfc = 'N'
and mvtcta_part_vrdlpers1.idcll = mvtcta_part_SLIMONtmp.idcll
)
left join compte mvtcta_part_compte1 on
(
mvtcta_part_compte1.idcll = mvtcta_part_vrdlpers1.idcll
and mvtcta_part_compte1.grecpts = substr (mvtcta_part_SLIMONtmp.grecptseqs, 1, 2 )
and mvtcta_part_compte1.seqc = substr (mvtcta_part_SLIMONtmp.grecptseqs, -1 )
and mvtcta_part_compte1.devs = mvtcta_part_SLIMONtmp.devs
and (mvtcta_part_compte1.devs = ' ' or ' ' = ' ')
and mvtcta_part_compte1.cdpartc not in ( 'L' , 'R' )
)
left join mvtcta mvtcta on
(
mvtcta.idcll = mvtcta_part_SLIMONtmp.idclrelpayl
and mvtcta.devs = mvtcta_part_SLIMONtmp.devs
and mvtcta.grecptseqs = mvtcta_part_SLIMONtmp.grecptseqs
and mvtcta.flagdonei <> 0
and mvtcta.devs = mvtcta_part_compte1.devs
and mvtcta.dtvall > 20101206
)
where 1=1
order by mvtcta_part_compte1.devs,
mvtcta_part_SLIMONtmp.idpartes,
mvtcta_part_SLIMONtmp.idclrelpayl,
mvtcta_part_SLIMONtmp.grecptseqs,
mvtcta.dtvall;
"if a query in oracle takes the first
time it is executed 11 minutes, and
the next time, the same query 25
seconds, with the buffer being
flushed, what is the possible cause?"
The thing is, flushing the DB Buffers, like this ...
alter system flush shared_pool
/
... wipes the Oracle data store but there are other places where data gets cached. For instance the chances are your OS caches its file reads.
EXPLAIN PLAN is good as a general guide to how the database thinks it will execute a query, but it is only a prediction. It can be thrown out by poor statistics or ambient conditions. It is not good at explaining why a specific instance of a query took as much time as it did.
So, if you really want to understand what occurs when the database executes a specific query you need to get down and dirty, and learn how to use the Wait Interface. This is a very powerful tracing mechanism, which allows us to see the individual events that happen over the course of a single query execution. Each version of Oracle has extended the utility and richness of the Wait Interface, but it has been essential to proper tuning since Oracle 9i (if not earlier).
Find out more by reading Roger Schrag's very good overview .
In your case you'll want to run the trace multiple times. In order to make it easier to compare results you should use a separate session for each execution, setting the 10046 event each time.
What else is happening on the box when you ran these? You can get different timings based on other processes chewing resources. Also, with a lot of joins, performance will depend on memory usage (hash_area_size, sort_area_size, etc) and availability, so perhaps you are paging (check temp space size/usage also). In short, try sql_trace and tkprof to analyze deeper
Sometimes a block is written to the file system before it is committed (a dirty block). When that block is read later, Oracle sees that it was uncommitted. It checks the open transaction and, if the transaction isn't still there, it knows the change was committed. Therefore it writes the block back as a clean block. It is called delayed block cleanout.
That is one possible reason why reading blocks for the first time can be slower than a subsequent re-read.
Could be the second time the execution plan is known. Maybe the optimizer has a very hard time finding a execution plan for some reason.
Try setting
alter session set optimizer_max_permutations=100;
and rerun the query. See if that makes any difference.
could it be that the query is written in a bad way?
"bad" is a rather emotional expression - but broadly speaking, yes, if a query performs significantly faster the second time it's run, it usually means there are ways to optimize the query.
Actually optimizing the query is - as APC says - rather a question of "down and dirty". Obvious candidate in your examply might be the substring - if the table is huge, and the substring misses the index, I'd imagine that would take a bit of time, and I'd imagine the result of all those substrin operations are cached somewhere.
Here's Tom Kyte's take on flushing Oracle buffers as a testing practice. Suffice it to say he's not a fan. He favors the approach of attempting to emulate your production load with your test data ("real life"), and tossing out the first and last runs. #APC's point about OS caching is Tom's point - to get rid of that (non-trivial!) effect you'd need to bounce the server, not just the database.

SQL connection lifetime

I am working on an API to query a database server (Oracle in my case) to retrieve massive amount of data. (This is actually a layer on top of JDBC.)
The API I created tries to limit as much as possible the loading of every queried information into memory. I mean that I prefer to iterate over the result set and process the returned row one by one instead of loading every rows in memory and process them later.
But I am wondering if this is the best practice since it has some issues:
The result set is kept during the whole processing, if the processing is as long as retrieving the data, it means that my result set will be open twice as long
Doing another query inside my processing loop means opening another result set while I am already using one, it may not be a good idea to start opening too much result sets simultaneously.
On the other side, it has some advantages:
I never have more than one row of data in memory for a result set, since my queries tend to return around 100k rows, it may be worth it.
Since my framework is heavily based on functionnal programming concepts, I never rely on multiple rows being in memory at the same time.
Starting the processing on the first rows returned while the database engine is still returning other rows is a great performance boost.
In response to Gandalf, I add some more information:
I will always have to process the entire result set
I am not doing any aggregation of rows
I am integrating with a master data management application and retrieving data in order to either validate them or export them using many different formats (to the ERP, to the web platform, etc.)
There is no universal answer. I personally implemented both solutions dozens of times.
This depends of what matters more for you: memory or network traffic.
If you have a fast network connection (LAN) and a poor client machine, then fetch data row by row from the server.
If you work over the Internet, then batch fetching will help you.
You can set prefetch count or your database layer properties and find a golden mean.
Rule of thumb is: fetch everything that you can keep without noticing it
if you need more detailed analysis, there are six factors involved:
Row generation responce time / rate(how soon Oracle generates first row / last row)
Row delivery response time / rate (how soon can you get first row / last row)
Row processing response time / rate (how soon can you show first row / last row)
One of them will be the bottleneck.
As a rule, rate and responce time are antagonists.
With prefetching, you can control the row delivery response time and row delivery rate: higher prefetch count will increase rate but decrease response time, lower prefetch count will do the opposite.
Choose which one is more important to you.
You can also do the following: create separate threads for fetching and processing.
Select just ehough rows to keep user amused in low prefetch mode (with high response time), then switch into high prefetch mode.
It will fetch the rows in the background and you can process them in the background too, while the user browses over the first rows.