iOS SQLite Slow Performance - objective-c

I am using SQLite in my iOS app and I have a lot of saving/loading to do while the user is interacting with the UI. This is a problem as it makes the UI very jittery and slow.
I've tried doing the operations on a separate thread, but I don't think this is possible with SQLite: I frequently get the SQLITE_BUSY and SQLITE_LOCKED error codes when I do that.
Is there a way to do this in multithreading without those error codes, or should I abandon SQLite?

It's perfectly possible; you just need to serialise the access to SQLite on your background thread.
My answer on this recent question should point you in the right direction, I think.
As mentioned elsewhere, SQLite is fine for concurrent reads, but locks at the database level for writes. That means if you're reading and writing in different threads, you'll get SQLITE_BUSY and SQLITE_LOCKED errors.
The most basic way to avoid this is to serialise all DB access (reads and writes), either on a serial dispatch queue or on an NSOperationQueue with a maximum concurrency of 1. As this access is not taking place on the main thread, your UI will not be impacted.
This will obviously stop reads and writes overlapping, but it will also stop simultaneous reads. It's not clear whether that's a performance hit that you can take or not.
To initialise a queue as described above:
NSOperationQueue *backgroundQueue = [[NSOperationQueue alloc] init];
[backgroundQueue setMaxConcurrentOperationCount:1];
Then you can just add operations to the queue as you see fit.
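For example, a write could be wrapped in a block operation like this (a minimal sketch; database is assumed to be your already-open sqlite3 handle, and the table is made up for illustration):
[backgroundQueue addOperationWithBlock:^{
    // All SQLite access happens here, off the main thread, and never
    // overlaps with any other operation on this serial queue.
    sqlite3_stmt *statement = NULL;
    if (sqlite3_prepare_v2(database, "INSERT INTO scores (value) VALUES (?)", -1, &statement, NULL) == SQLITE_OK) {
        sqlite3_bind_int(statement, 1, 42);
        sqlite3_step(statement);
        sqlite3_finalize(statement);
    }
    // Hop back to the main queue if the result needs to update the UI.
    [[NSOperationQueue mainQueue] addOperationWithBlock:^{
        // refresh the UI here
    }];
}];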

Having everything in a dedicated SQLite thread, or a one-op-at-a-time operation queue are great solutions, especially to solve your jittery UI. Another technique (which may not help the jitters) is to spot those error codes, and simply loop, retrying the update until you get a successful return code.
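A rough sketch of that retry loop, assuming a single already-prepared write statement (the retry count and back-off interval are arbitrary and worth tuning):
int rc;
int attempts = 0;
do {
    rc = sqlite3_step(statement);
    if (rc == SQLITE_BUSY || rc == SQLITE_LOCKED) {
        sqlite3_reset(statement);   // rewind so the statement can be retried from the start
        usleep(10000);              // back off for 10 ms before trying again
        attempts++;
    }
} while ((rc == SQLITE_BUSY || rc == SQLITE_LOCKED) && attempts < 100);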

Put SQLite into WAL mode. Then reads won't be blocked. Writes are another matter: you still need to serialize them. There are various ways to achieve this; one of them is offered by SQLite itself, where the WAL hook can be used to signal that the next write can start.
WAL mode should generally improve the performance of your app. Most things will be a bit faster, and reads won't be blocked at all. Only large transactions (several MB) will slow down, but generally nothing dramatic.
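Switching to WAL mode is a one-off pragma issued after opening the connection (a sketch; db is assumed to be your open sqlite3 handle):
char *errorMessage = NULL;
if (sqlite3_exec(db, "PRAGMA journal_mode=WAL;", NULL, NULL, &errorMessage) != SQLITE_OK) {
    NSLog(@"Could not switch to WAL mode: %s", errorMessage);
    sqlite3_free(errorMessage);
}
The setting is stored in the database file, so it only needs to succeed once per database.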

Don't abandon SQLite. You can definitely do it in a thread different than the UI thread to avoid slowness. Just make sure only one thread is accessing the database at a time. SQLite is not great when dealing with concurrent access.
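One way to guarantee that single point of access with GCD is a private serial queue that every read and write goes through (a sketch; the queue label is arbitrary):
dispatch_queue_t databaseQueue = dispatch_queue_create("com.example.app.database", DISPATCH_QUEUE_SERIAL);

// Writes can be fire-and-forget from the UI's point of view.
dispatch_async(databaseQueue, ^{
    // run the INSERT/UPDATE here
});

// Reads deliver their results back to the main queue when done.
dispatch_async(databaseQueue, ^{
    NSArray *rows = nil; // run the SELECT here and build the results
    dispatch_async(dispatch_get_main_queue(), ^{
        NSLog(@"loaded %lu rows", (unsigned long)rows.count);
    });
});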

I recommend using Core Data which sits on top of sqlite. I use it in a multithreaded environment. Here's a guide on Concurrency with Core Data.
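With the Core Data concurrency API (iOS 5 and later), that typically means giving background work a context of its own with private-queue concurrency (a sketch, assuming an existing persistentStoreCoordinator):
NSManagedObjectContext *backgroundContext =
    [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
backgroundContext.persistentStoreCoordinator = persistentStoreCoordinator;

[backgroundContext performBlock:^{
    // Fetches and saves here run on the context's own private queue,
    // never on the main thread.
    NSError *error = nil;
    if (![backgroundContext save:&error]) {
        NSLog(@"Background save failed: %@", error);
    }
}];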

Off topic:
Have you checked out FMDB? It is a SQLite wrapper and is thread-safe. I use it in all my SQLite projects.
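FMDB's FMDatabaseQueue is the part that makes it thread-safe: it serialises every block it is given against a single connection (a short sketch; pathToDatabase stands in for your database file path, and the table is made up for illustration):
FMDatabaseQueue *queue = [FMDatabaseQueue databaseQueueWithPath:pathToDatabase];
[queue inDatabase:^(FMDatabase *db) {
    [db executeUpdate:@"INSERT INTO scores (value) VALUES (?)", @42];
    FMResultSet *results = [db executeQuery:@"SELECT value FROM scores"];
    while ([results next]) {
        NSLog(@"score: %d", [results intForColumnIndex:0]);
    }
    [results close];
}];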

Related

How to check if SQLite database is locked

I have an app that makes quite a few calls to a local SQLite3 database and sometimes these calls happen very close together (from different areas of the app). How can I check, before a call to the database is made, if the database is currently locked?
Ideally I would rewrite the app (which has grown far beyond its original scope) but won't have time in this iteration.
I have no idea what to do in Objective-C, but I have been using sqlite3 from C for quite a long time and I faced the same issue. I used the method below.
Use busy_timeout and keep it configurable.
Use busy_handler to keep retrying n number of times.
These two improvements work well for me, but I did observe some performance issues, which I was able to handle via the configuration parameters above; see the sketch below. You need to make a trade-off between fail-safety and performance.
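In the C API those two knobs look roughly like this (a sketch; the timeout, retry count and sleep interval are the configurable trade-off mentioned above):
// A handler SQLite calls each time it finds the database locked.
static int busyHandler(void *context, int attempts) {
    if (attempts >= 10) {
        return 0;       // give up; the original call returns SQLITE_BUSY
    }
    usleep(50000);      // back off for 50 ms and let SQLite retry
    return 1;
}

// Option 1: a simple, configurable timeout (here 2 seconds).
sqlite3_busy_timeout(db, 2000);

// Option 2: full control over the retry policy (replaces any busy timeout).
sqlite3_busy_handler(db, busyHandler, NULL);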

Semaphore/Mutex lock/unlock frequency

I have some code which I need to lock using a semaphore or mutex.
The code is something like this:
callA();
callB();
callC();
.
.
.
callZ();
I would like to know the efficient way to lock it. The options I am considering are:
lock before callA() and unlock after callZ(). My concern is the lock remains set for a pretty long period.
lock and unlock after each function call. I am worried about the 'too much overhead' of grabbing and releasing the lock.
Appreciate your help!
It all depends on your use case. How much lock/unlock/lock/unlock performance penalty can you tolerate? Weighed against this, how long are you willing to make another task block while waiting for the lock? Are some of the threads latency-critical or interactive and other threads bulk or low-priority? Are there other tasks that will take the same lock(s) through other code paths? If so, what do those look like? If the critical sections in callA, callB, etc.. are really separate then do you want to use 26 different locks? Or do they manipulate the same data, forcing you to use a single lock?
By the way, if you are using Linux, definitely use (pthreads) mutexes, not semaphores. The fast path for mutexes is completely in userspace: locking and unlocking them when there is no contention is quite cheap. There is no fast path for semaphores.
Without knowing anything else, I would advise fine grained locking, especially if your individual functions are already organized to not make assumptions that would only be true if the lock is held across them all. But as I said, it really depends what you're doing and why you're doing it.
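The two options from the question, sketched with a pthreads mutex (assuming callA() through callZ() all touch data guarded by the same lock):
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

// Option 1: coarse-grained - one long critical section.
pthread_mutex_lock(&lock);
callA();
callB();
/* ... */
callZ();
pthread_mutex_unlock(&lock);

// Option 2: fine-grained - other threads can interleave between calls,
// at the cost of one lock/unlock pair per call.
pthread_mutex_lock(&lock);
callA();
pthread_mutex_unlock(&lock);

pthread_mutex_lock(&lock);
callB();
pthread_mutex_unlock(&lock);
/* ... and so on through callZ() ... */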

Erlang ETS tables versus message passing: Optimization concerns?

I'm coming into an existing (game) project whose server component is written entirely in erlang. At times, it can be excruciating to get a piece of data from this system (I'm interested in how many widgets player 56 has) from the process that owns it. Assuming I can find the process that owns the data, I can pass a message to that process and wait for it to pass a message back, but this does not scale well to multiple machines and it kills response time.
I have been considering replacing many of the tasks that exist in this game with a system where information that is frequently accessed by multiple processes would be stored in a protected ets table. The table's owner would do nothing but receive update messages (the player has just spent five widgets) and update the table accordingly. It would catch all exceptions and simply go on to the next update message. Any process that wanted to know if the player had sufficient widgets to buy a fooble would need only to peek at the table. (Yes, I understand that a message might be in the buffer that reduces the number of widgets, but I have that issue under control.)
I'm afraid that my question is less of a question and more of a request for comments. I'll upvote anything that is both helpful and sufficiently explained or referenced.
What are the likely drawbacks of such an implementation? I'm interested in the details of lock contention that I am likely to see in having one-writer-multiple-readers, what sort of problems I'll have distributing this across multiple machines, and especially: input from people who've done this before.
First of all, default ETS behaviour is consistent, as you can see in the documentation: Erlang ETS.
It provides atomicity and isolation, and also covers multiple updates/reads if they are done in the same function (remember that in Erlang a function call is roughly equivalent to a reduction, the unit of measure the Erlang scheduler uses to share time between processes, so an ETS operation spread across multiple functions could be split into several parts, creating a possible race condition).
If you are interested in a multi-node ETS architecture, you should perhaps take a look at Mnesia, which gives you out-of-the-box multi-node concurrency on top of ETS: Mnesia.
(Hint: I'm talking specifically about ram_copies tables and the add_table_copy and change_config functions.)
That being said, I don't understand the problem with a process (possibly backed by an unnamed ETS table).
Let me explain better: the main problem with your project is its first, basic assumption.
It's simple: you don't have a single writing process!
Every time a player takes an object, hits another player and so on, it calls a non-side-effect-free function that updates the game state. So even if you have a single process managing the game state, it must also tell the other players' clients 'hey, you remember that object there? Just forget it!'; this is why the main problem with many multiplayer games is lag: when networking is not the main issue, lag is often due to blocking send/receive routines.
From this point of view, using an ETS table directly, using a persistent table, a process dictionary (BAD!!!) and so on are all the same thing, because you still have to consider synchronization issues, just like in object-oriented languages that use shared memory (Java, anyone?).
In the end, you should have just ONE main concern while developing your application: consistency.
Only after a consistent application has been developed should you concern yourself with performance tuning.
Hope it helps!
Note: I've been talking about something like an MMORPG server because I thought you were describing something similar.
An ETS table would not solve your problems in that regard. Your code (that wants to get or set the player widget count) will always run in a process and the data must be copied there.
Whether that is from a process heap or an ETS table makes little difference (that said, reading from ETS is often faster because it's well optimized and doesn't do any work beyond getting and setting data), especially when getting the data from a remote node. For multiple readers, ETS is most likely faster, since a single process would have to handle the requests sequentially.
What would make a difference, however, is whether the data is cached on the local node or not. That's where self-replicating database systems, such as Mnesia, Riak or CouchDB, come in. Mnesia is in fact implemented using ETS tables.
As for locking, the latest version of Erlang comes with enhancements to ETS which enable multiple readers to simultaneously read from a table plus one writer that writes. The only locked element is the row being written to (thus better concurrent performance than a normal process, if you expect many simultaneous reads for one data point).
Note, however, that all interaction with ETS tables is non-transactional! That means you cannot rely on writing a value based on a previous read, because the value might have changed in the meantime. Mnesia handles that using transactions. You can still use the dirty_* functions in Mnesia to squeeze near-ETS performance out of most operations, if you know what you're doing.
It sounds like you have a bunch of things that can happen at any time, and you need to aggregate the data in a safe, uniform way. Take a look at the Generic Event behaviour. I'd recommend using this to create an event server and have all these processes share their information with your server via events; at that point you can choose to log it or store it somewhere (like an ETS table). As an aside, ETS tables are not good for persistent data like how many "widgets" a player has - consider Mnesia, or an excellent crash-only DB like CouchDB. Both of these replicate very well across machines.
You bring up lock contention - you shouldn't have any locks. Messages are processed in a synchronous order as they are received by each process. In fact, the entire point of the message passing semantics built into the language is to avoid shared-state concurrency.
To summarize, normally you communicate with messages, from process to process. This is hairy for you, because you need information from processes scattered all over the place, so my recommendation for you is based of the idea of concentrating all information that is "interesting" outside of the originating processes into a single, real-time source.

Will pool the connection help threading in sqlite (and how)?

I currently use a singleton to access my database (see my related question), but now that I'm trying to add some background processing, everything falls apart. I read the SQLite docs and found that SQLite can be used safely from multiple threads, but each thread must have its own DB connection. I tried EGODatabase, which promises a thread-safe SQLite wrapper but is very buggy, so I've gone back to my old FMDB library and started looking at how to use it in a multi-threaded way.
Because all my code is built around the singleton idea, changing everything will be expensive (and a lot of open/close connections could become slow), so I wonder whether, as the SQLite docs hint, building a pool of connections would help. If so, how do I build it? And how do I know which connection to take from the pool (because two threads can't share a connection)?
I wonder if somebody has already used SQLite in a multi-threaded way with NSOperation or similar; my searching only returns "yeah, it's possible" but leaves the details to my imagination...
You should look at using thread-local variables to hold the connection; if the variable is empty (i.e., holding something like a NULL) you know you can safely open a connection at that point to serve the thread and store the connection back in the variable. Don't know how to do this with Obj-C though.
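In Objective-C, one way to get that thread-local behaviour is the per-thread dictionary NSThread already provides (a sketch; the key name is arbitrary and pathToDatabase is assumed to hold your database file path):
static NSString * const kConnectionKey = @"com.example.app.sqliteConnection";

// Returns the calling thread's private connection, opening it on first use.
FMDatabase *connectionForCurrentThread(void) {
    NSMutableDictionary *threadStorage = [[NSThread currentThread] threadDictionary];
    FMDatabase *connection = [threadStorage objectForKey:kConnectionKey];
    if (connection == nil) {
        connection = [FMDatabase databaseWithPath:pathToDatabase];
        [connection open];
        [threadStorage setObject:connection forKey:kConnectionKey];
    }
    return connection;
}
Note that connections stored this way still need to be closed when the thread is finished with them.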
Also be aware that SQLite is not tuned for concurrent writes. Writer locks are expensive, so keep any time in a writing transaction (i.e., one that includes an INSERT, UPDATE or DELETE) to a minimum in all threads. Transaction commits are also expensive too.

Is there a way of sharing a Core Data store between processes?

What am I trying to do?
A UI process that reads data from a Core Data store on disk. It wouldn't need to edit the data, just read and display the data.
A command line process that writes to the same data store as accessed by the UI.
Why?
So that the command line process can be running all the time but the user can quit the UI process and forget about the app until they need to look at the data it's captured.
What would be the simplest and most reliable way of achieving this?
What Have I Tried?
I've read up on sharing a data store between threads and implemented this once before, but I can't find anything in the docs or on the web indicating how to share a store between processes.
Is it as simple as pointing both processes at the same data store file? I've experimented with this briefly. It appeared to work OK, but I'm worried I might run into problems with locking etc when it's really put under stress.
Finally
I'd really appreciate someone giving me pointers on what direction to go with this. Thanks.
This might be one of those situations in which you'll simply have to Try It And See™.
Insofar as I can remember, SQLite (which is the data store you'll most likely want to be using) has built in mechanisms for file locking and so on; so the integrity of the file is likely to be assured. If, on the other hand, you use the CoreData/XML approach, you might run into problems.
In other words; use the SQLite backing for your file, and you should likely be fine.
You can do exactly what you want, you probably want to use the SQLite store otherwise saving and committing every time you want to synch out data will be horrifically slow. You just need to use some sort of IPC doorbell between the apps so that you can inform one app it needs to recheck the persistent store on disk and merge in its data.
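One common doorbell on OS X/iOS is a Darwin notification, which crosses process boundaries without carrying any payload (a sketch; the notification name is made up, and each side still re-reads the store itself):
// Writer process: ring the bell after saving.
CFNotificationCenterPostNotification(CFNotificationCenterGetDarwinNotifyCenter(),
                                     CFSTR("com.example.app.storeDidChange"),
                                     NULL, NULL, TRUE);

// Reader process: listen for the bell and refetch/merge.
static void storeDidChange(CFNotificationCenterRef center, void *observer,
                           CFStringRef name, const void *object,
                           CFDictionaryRef userInfo) {
    // Re-run fetch requests or merge changes from the persistent store here.
}

CFNotificationCenterAddObserver(CFNotificationCenterGetDarwinNotifyCenter(),
                                NULL, storeDidChange, CFSTR("com.example.app.storeDidChange"),
                                NULL, CFNotificationSuspensionBehaviorDeliverImmediately);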
Apple documents using multiple persistent store coordinators as a valid option in Multi-Threading with Core Data (in the "General Guidelines" section). That happens to be discussing completely parallel Core Data stacks in the same process, but it is valid if they are in completely separate address spaces as well.
Nearly two years on, and I've just found a much better way of doing this.
The answer seems to lie with Sync Services. I didn't even realise it existed! There's an excellent post about this at:
http://www.timisted.net/blog/archive/core-data-and-sync-services/
I've not tried this with my app yet, but it seems like an excellent way of sharing a core data store between two processes or applications.
If I experience any performance issues, I'll update this answer accordingly, but this seems like the Apple recommended way of doing it.
You need to re-think your architecture. If you want a daemon to own the data store, then have your GUI app connect to the daemon. Trying to share the data store is a can of worms you don't want to open.