Semaphore/Mutex lock/unlock frequency - locking

I have some code which I need to lock using a semaphore or mutex.
The code is something like this:
callA();
callB();
callC();
.
.
.
callZ();
I would like to know the most efficient way to lock it. The options I am considering are:
1. Lock before callA() and unlock after callZ(). My concern is that the lock stays held for a pretty long period.
2. Lock and unlock around each function call. Here I am worried about the overhead of repeatedly grabbing and releasing the lock.
Appreciate your help!

It all depends on your use case. How much lock/unlock/lock/unlock performance penalty can you tolerate? Weighed against this, how long are you willing to make another task block while waiting for the lock? Are some of the threads latency-critical or interactive and other threads bulk or low-priority? Are there other tasks that will take the same lock(s) through other code paths? If so, what do those look like? If the critical sections in callA, callB, etc. are really separate, then do you want to use 26 different locks? Or do they manipulate the same data, forcing you to use a single lock?
By the way, if you are using Linux, definitely use (pthreads) mutexes, not semaphores. The fast path for mutexes is entirely in userspace: locking and unlocking them when there is no contention is quite cheap. There is no fast path for semaphores.
Without knowing anything else, I would advise fine-grained locking, especially if your individual functions are already organized so that they don't make assumptions that would only be true if the lock were held across them all. But as I said, it really depends on what you're doing and why you're doing it.

Related

Concurrent issues in SQL Server

I have a set of validations that decide whether a record can be inserted into the database with a valid status code. The issue we are facing is that many users make requests at the same time; in the middle of one transaction another transaction comes in, and both records get inserted with a valid status, which shouldn't happen. It should return an error that the record already exists, which can easily be handled by a simple query, but in specific scenarios we do allow duplicates to be inserted. I have tried sp_getapplock, which solves my problem, but it compromises performance big time. Are there any more optimal ways to handle concurrent requests?
Thanks.
sp_getapplock is pretty much the beefiest and most arbitrary lock you can take. It functions more like the lock keyword does in OO programming: basically you name a resource, give it a scope (proc or transaction), then lock it. Pretty much nothing can bypass that lock, which is why it has solved your race conditions. It's also probably mad overkill for what you're trying to do.
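For reference, using it typically looks something like this; the resource name is arbitrary (I made it up here), and every code path that needs protection has to agree to lock the same name:
BEGIN TRANSACTION;

-- Take an exclusive, transaction-scoped application lock on a name we invent.
EXEC sp_getapplock @Resource = 'insert_valid_record',
                   @LockMode = 'Exclusive',
                   @LockOwner = 'Transaction',
                   @LockTimeout = 5000;

-- ... duplicate check and INSERT go here ...

COMMIT TRANSACTION;  -- a transaction-scoped applock is released automatically here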
I'm going to assume you have high insert volumes, or you wouldn't be running into these violations. The first code/architecture idea that comes to mind is also the simplest: use a try/catch block, and have the catch block handle the PK violation (retry, or report that the record already exists). Clumsy, but it might just do the trick.
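A rough sketch of that pattern in T-SQL; the table and column names (dbo.Calls, CallId, StatusCode) and the @CallId parameter are made up for illustration:
BEGIN TRY
    -- A primary key / unique constraint on the business columns makes the
    -- database itself enforce the "no duplicate valid record" rule.
    INSERT INTO dbo.Calls (CallId, StatusCode)
    VALUES (@CallId, 'VALID');
END TRY
BEGIN CATCH
    -- 2627 = primary key violation, 2601 = unique index violation
    IF ERROR_NUMBER() IN (2627, 2601)
        RAISERROR('Record already exists', 16, 1);  -- or retry, per your rules
    ELSE
        THROW;  -- anything else is unexpected, re-raise it
END CATCH;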
Next, you could consider altering the structure of the table that receives this stream of inserts throughout the day. Give it a primary key on an identity column and pretty much nothing else. Inserts will be lightning fast, so any blockage will be negligible. You can then move this data in batches into a table better suited for batch processing (as opposed to trying to batch-process in real time).
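Something along these lines, with invented names (CallsStaging as the narrow "inbox" table, dbo.Calls as the real table):
-- Narrow staging table: the only key is an identity, so concurrent inserts never collide.
CREATE TABLE dbo.CallsStaging (
    StagingId  BIGINT IDENTITY(1,1) PRIMARY KEY,
    CallId     INT         NOT NULL,
    StatusCode VARCHAR(10) NOT NULL,
    ReceivedAt DATETIME2   NOT NULL DEFAULT SYSUTCDATETIME()
);

-- A scheduled job then moves the rows across in one set-based pass, where
-- duplicate handling is cheap (wrap the two statements in a transaction).
INSERT INTO dbo.Calls (CallId, StatusCode)
SELECT s.CallId, s.StatusCode
FROM dbo.CallsStaging AS s
WHERE NOT EXISTS (SELECT 1 FROM dbo.Calls AS c WHERE c.CallId = s.CallId);

DELETE FROM dbo.CallsStaging;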
There is also a whole range of transaction isolation settings that adjust SQL Server's regular locking behaviour, either at the session level or inline via query hints. I'd read up on those; in particular, you might look at SERIALIZABLE isolation. Different settings enforce different runtime rules to fit your needs.
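For example, either session-wide or surgically via table hints on the one check-then-insert that races (names are placeholders, as above):
-- Session-wide: every transaction on this connection runs serializable.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Or just on the critical statement: UPDLOCK + HOLDLOCK keeps the range locked
-- between the existence check and the insert, so two sessions can't both pass the check.
BEGIN TRANSACTION;
IF NOT EXISTS (SELECT 1 FROM dbo.Calls WITH (UPDLOCK, HOLDLOCK) WHERE CallId = @CallId)
    INSERT INTO dbo.Calls (CallId, StatusCode) VALUES (@CallId, 'VALID');
COMMIT TRANSACTION;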
Also be sure to check the scope of your transactions. You probably do want to lock this table hard during the insert (and potentially during some related action), but once that need is gone, the lock should go too.

iOS SQLite Slow Performance

I am using SQLite in my iOS app and I have a lot of saving/loading to do while the user is interacting with the UI. This is a problem as it makes the UI very jittery and slow.
I've tried doing the operations in an additional thread but I don't think this is possible in SQLite. I get the error codes SQLITE_BUSY and SQLITE_LOCKED frequently if I do that.
Is there a way to do this in multithreading without those error codes, or should I abandon SQLite?
It's perfectly possible; you just need to serialise access to SQLite in your background thread.
My answer on this recent question should point you in the right direction, I think.
As mentioned elsewhere, SQLite is fine for concurrent reads, but locks at the database level for writes. That means if you're reading and writing in different threads, you'll get SQLITE_BUSY and SQLITE_LOCKED errors.
The most basic way to avoid this is to serialise all DB access (reads and writes) either in a dispatch queue or an NSOperationQueue that has a concurrency of 1. As this access is not taking place on the main thread, your UI will not be impacted.
This will obviously stop reads and writes overlapping, but it will also stop simultaneous reads. It's not clear whether that's a performance hit that you can take or not.
To initialise a queue as described above:
NSOperationQueue *backgroundQueue = [[NSOperationQueue alloc] init];
// A maximum concurrency of 1 makes the queue serial, so queued DB operations never overlap.
[backgroundQueue setMaxConcurrentOperationCount:1];
Then you can just add operations to the queue as you see fit.
Having everything in a dedicated SQLite thread, or in a one-op-at-a-time operation queue, is a great solution, especially for the jittery UI. Another technique (which may not help the jitters) is to spot those error codes and simply loop, retrying the statement until you get a successful return code.
Put SQLite into WAL mode. Then reads won't be blocked. Writes are another matter: you still need to serialize them. There are various ways to achieve that; one is offered by SQLite itself: a WAL hook can be used to signal that the next write can start.
WAL mode should generally improve the performance of your app. Most things will be a bit faster, and reads won't be blocked at all. Only large transactions (several MB) will slow down; generally nothing dramatic.
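For what it's worth, switching the journal mode is a one-time statement against the database file (the 5000 ms busy timeout below is an arbitrary value picked for illustration):
PRAGMA journal_mode=WAL;   -- persistent: readers no longer block while a writer is active
PRAGMA busy_timeout=5000;  -- wait up to 5 s for a competing write instead of returning SQLITE_BUSY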
Don't abandon SQLite. You can definitely do the work on a thread other than the UI thread to avoid the slowness. Just make sure only one thread is accessing the database at a time. SQLite is not great at dealing with concurrent access.
I recommend using Core Data, which sits on top of SQLite. I use it in a multithreaded environment. Here's a guide on Concurrency with Core Data.
Off topic: have you checked out FMDB? It is a SQLite wrapper and is thread-safe. I have used it in all my SQLite projects.

Are database deadlocks a fact of life?

We all know about techniques to prevent db deadlocks - acquire locks in the same order, etc. But at some point, systems under pressure may simply suffer from deadlocks here and there. Should we simply accept that and always be prepared to retry when a deadlock occurs or should deadlocks be considered absolutely verboten and should we do everything in our power to prevent them?
The answer is yes.
You should do everything in your power to prevent them, but are you ever going to be satisfied that you've made them impossible?
Do everything in your power to prevent them, and be prepared to retry when they occur. :)
Keep in mind that "doing everything in your power" can mean things like queueing batch updates, making inserts into temp tables and then merging those into the main tables later, and other non-trivial techniques. Be sure to check your transaction isolation level and your lock escalation policy.
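The "be prepared to retry" part can live in application code or in the database itself. A sketch of the latter, assuming SQL Server (1205 is the deadlock-victim error; the retry limit of 3 is an arbitrary choice):
DECLARE @attempts INT = 0;
WHILE @attempts < 3
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        -- ... the work that occasionally deadlocks goes here ...
        COMMIT TRANSACTION;
        BREAK;  -- success, stop retrying
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
        IF ERROR_NUMBER() = 1205 AND @attempts < 2
            SET @attempts += 1;   -- we were chosen as the deadlock victim: try again
        ELSE
            THROW;                -- any other error (or too many retries) bubbles up
    END CATCH;
END;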
This will probably be closed, but the world is trending toward NoSQL solutions to this problem: breaking problems up so that guaranteed consistency isn't required from the data source, which means locks aren't required either.
Facebook is a good example of this: it doesn't matter when everyone sees your update, or if different users around the world see different versions of your profile. As long as the update works, or eventually fails, that is good enough.

How far can you really go with "eventual" consistency and no transactions (aka SimpleDB)?

I really want to use SimpleDB, but I worry that without real locking and transactions the entire system is fatally flawed. I understand that for high-read/low-write apps it makes sense, since eventually the system becomes consistent, but what about that time in between? Seems like the right query in an inconsistent db would perpetuate havoc throughout the entire database in a way that's very hard to track down. Hopefully I'm just being a worry wart...
This is the pretty classic battle between consistency and scalability and - to some extent - availability. Some data doesn't always need to be that consistent. For instance, look at digg.com and the number of diggs against a story. There's a good chance that value is duplicated in the "digg" record rather than forcing the DB to do a join against the "user_digg" table. Does it matter if that number isn't perfectly accurate? Probably not. Then using something like SimpleDB might be a good fit. However if you are writing a banking system, you should probably value consistency above all else. :)
Unless you know from day 1 that you will have to deal with massive scale, I would stick to simpler, more conventional systems like an RDBMS. If you are working somewhere with a reasonable business model, you will hopefully see a big spike in revenue if there's a big spike in traffic. Then you can use that money to help solve the scaling problems. Scaling is hard, and scaling is hard to predict. Most of the scaling problems that hurt you will be ones you never expected.
I would much rather get a site off the ground and spend a few weeks fixing scale issues when traffic picks up than spend so much time worrying about scale that we never make it to production because we run out of money. :)
Assuming you're talking about this SimpleDB, you're not being a worrywart; there are real reasons not to use it as a real world DBMS.
The properties that you get from transaction support in a DBMS can be abbreviated by the acronym "A.C.I.D.": Atomicity, Consistency, Isolation, and Durability. The A and D have mostly to do with system crashes, and the C and I have to do with regular operation. They're all things people totally take for granted when working with commercial databases, so if you work with a database that doesn't have one or more of them, you might be in for any number of nasty surprises.
Atomicity: Any transaction will either complete fully or not at all (i.e. it will either commit or abort cleanly). This applies to single statements (like "UPDATE table ...") as well as longer, more complicated transactions. If you don't have this, then anything that goes wrong (like, the disk getting full, the computer crashing, etc.) might leave something half-done. In other words, you can't ever rely on the DBMS to really do the things you tell it to, because any number of real-world problems can get in the way, and even a simple UPDATE statement might get partially completed.
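The textbook illustration is a transfer between two rows: with atomicity, either both updates land or neither does (the accounts table, IDs, and amount are invented for the example):
START TRANSACTION;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
-- If anything fails between the two UPDATEs, the half-done transfer is rolled
-- back on recovery rather than being left dangling.
COMMIT;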
Consistency: Any rules you've set up about the database will always be enforced. Like, if you have a rule that says A always equals B, then nothing anybody does to the database system can break that rule - it'll fail any operation that tries. This isn't quite as important if all your code is perfect ... but really, when is that ever the case? Plus, if you're missing this safety net, things get really yucky when you lose ...
Isolation: Any actions taken on the database will execute as if they happened serially (one at a time), even if in reality they're happening concurrently (interleaved with each other). If more than one user is going to hit this database at the same time, and you don't have this, then things you can't even dream up will go wrong; even atomic statements can interact with each other in unforeseen ways and screw things up.
Durability: If you lose power or the software crashes, what happens to the transactions that had already committed? If you have durability, the answer is "nothing - they're all safe", while anything still in progress is rolled back cleanly. Databases do this by using something called "undo/redo logging", where every little thing you do to the database is first logged (typically on a separate disk for safety) in a way that lets you reconstruct a consistent state after a failure. Without that, the other properties above are sort of useless, because you can never be 100% sure that things will stay consistent after a crash.
Do any of these things matter to you? The answer has everything to do with the types of transactions you're doing, and what guarantees you want in a failure situation. There may well be cases (like a read-only database) where you don't need these, but as soon as you start doing anything non-trivial, and something bad happens, you'll wish you had 'em. Maybe it's OK for you to just revert to a backup anytime something unexpected happens, but my guess is that it isn't.
Also note that dropping all of these protections doesn't make it a given that your database will perform better; in fact, it's probably the opposite. That's because real-world DBMS software also has tons of code to optimize query performance. So, if you write a query that joins 6 tables on SimpleDB, don't assume that it'll figure out the optimal way to run that query - you might end up waiting hours for it to complete, when a commercial DBMS could use an indexed hash join and get it in .5 seconds. There are a zillion little tricks that you can do to optimize query performance, and believe me, you'll really miss them when they're gone.
None of this is meant as a knock on SimpleDB; take it from the author of the software: "Although it is a great teaching tool, I can't imagine that anyone would want to use it for anything else."

SELECT FOR UPDATE for locked queries

I'm using MySQL 5.x, and in my environment I have a table named CALLS.
Table CALLS has a column status which takes an enum {inprogress, completed}.
I want reads/updates of the table to be row-locked, so:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SET AUTOCOMMIT = 0;
SELECT amount from CALLS where callId=1213 FOR UPDATE;
COMMIT;
Basically I'm doing a FOR UPDATE even in situations where I only need to read the amount and return. I find that this allows me to ensure that reads and updates don't interfere with each other. However, I've been told this will reduce the concurrency of the app.
Is there any way to achieve the same transactional consistency without incurring the locking overhead? Thanks.
Disclaimer: MySQL is generally full of surprises, so the following could be untrue.
What you are doing doesn't make much sense to me: you are committing right after the SELECT, which releases the lock again. So in my opinion your code shouldn't really incur any significant overhead, but it doesn't give you any consistency improvement, either.
In general, SELECT FOR UPDATE can be a very sound and reasonable way to ensure consistency without taking more locks than are really needed. But it should only be used when needed. Maybe you should have two code paths: one (using FOR UPDATE) for when the retrieved value feeds into a subsequent change, and another (a plain SELECT) for when the value doesn't have to be protected from changes.
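In MySQL terms the two paths might look like this (the UPDATE is just a stand-in for whatever change actually uses the value):
-- Path 1: the value feeds into a change, so hold a row lock until commit.
SET AUTOCOMMIT = 0;
SELECT amount FROM CALLS WHERE callId = 1213 FOR UPDATE;
UPDATE CALLS SET status = 'completed' WHERE callId = 1213;
COMMIT;

-- Path 2: read-only, so no FOR UPDATE and no row lock held.
SELECT amount FROM CALLS WHERE callId = 1213;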
What you've implemented there--in case you weren't familiar with it--is called pessimistic locking. You're sacrificing performance for consistency, which is sometimes a valid choice. In my professional experience, I've found pessimistic locking to be far more of a hindrance than a help.
For one thing, it can lead to deadlock.
The (better, IMHO) alternative is optimistic locking, where you assume that collisions occur infrequently and simply deal with them when they happen. You're doing your work in a transaction, so a collision shouldn't leave your data in an inconsistent state.
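In SQL terms, the usual implementation is a version (or timestamp) column checked at write time. The version column below is an assumption; it would have to be added to the CALLS table, and the literal values are placeholders:
-- Read without locking, remembering the version we saw.
SELECT amount, version FROM CALLS WHERE callId = 1213;

-- Write back only if nobody changed the row in the meantime.
UPDATE CALLS
SET    status = 'completed', version = version + 1
WHERE  callId = 1213 AND version = 7;   -- 7 = the version read above

-- If the statement reports 0 affected rows, someone else won the race:
-- re-read and retry, or report a conflict to the caller.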
Here's more information on optimistic locking in a Java sense, but the ideas are applicable to anything.