Hybrid in-memory SQLite database - sql

I have an app that uses collections of objects and is really fast.
Now I am adding a database for persistence, so I started saving things in a SQLite database.
But I found that it is now much, much slower.
I think it is because the disk is slower than RAM. Is it possible to have the DB in memory?
I found an in-memory DB in the SQLite documentation, but it is only in memory and I need persistence.
So, is it possible to keep the DB in memory for performance and save it to disk every so often?
Thank you and regards.
Update:
In the answers they say it is because I am doing lots of individual inserts, and this is true. They say I should do a bulk insert instead.
My code looks like this for the in-memory collection called Computers:
foreach (var line in lines)
{
    Computers.Add(new Computer { ComputerName = line });
}
Now in the DB:
foreach (var line in lines)
{
    // one INSERT (and one implicit transaction) per row
    string sql = "INSERT INTO computers VALUES (@name)";
    using (SQLiteCommand command = new SQLiteCommand(sql, dbConnection))
    {
        command.Parameters.AddWithValue("@name", line);
        command.ExecuteNonQuery();
    }
}
How can I do this in a single bulk insert?

You can use the Backup API to accomplish this. See http://www.sqlite.org/backup.html, which includes a complete code sample in C.
If you are doing lots of inserts at once, make sure you combine them in a single transaction. Otherwise SQLite commits (and syncs to disk) every insert as its own transaction, which can greatly slow down the insertion process. Doing this may let you keep using a disk-based SQLite database without the slowdown.
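For the code in the question, that means opening one transaction, running all the INSERTs inside it, and committing once at the end. A minimal sketch, reusing the dbConnection, lines and computers table from the question (assumes the System.Data.SQLite provider; DbType comes from System.Data):

using (SQLiteTransaction transaction = dbConnection.BeginTransaction())
using (SQLiteCommand command = new SQLiteCommand(
    "INSERT INTO computers VALUES (@name)", dbConnection, transaction))
{
    SQLiteParameter name = command.Parameters.Add("@name", DbType.String);
    foreach (var line in lines)
    {
        name.Value = line;          // reuse the prepared statement
        command.ExecuteNonQuery();  // no sync to disk yet...
    }
    transaction.Commit();           // ...only one sync, at commit time
}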
See Also
SQLite Insert very slow?

If you have a database in memory and only occasionally flush it to disk, you lose one of the most important features of a database, which is ACID. In other words, your app can write data thinking it is persisted, and then a subsequent problem can cause you to lose that data after all.
Many databases will perform quite well (though still much slower than in-RAM solutions) if you have enough RAM available for the database to cache frequently accessed items.
Start by examining what exactly is slow in your interactions with the database. Are you fetching data that doesn't change over and over? Perhaps cache that in the application. Are you making many separate updates to the database that could be combined into a single update?

Related

When/how to write data in redis cache to SQL database?

I have a fairly small relational database currently set up (SQLite, changing to PostgreSQL) that has some relatively simple many-to-one and many-to-many relations. The app uses WebSockets to give real-time updates to any clients, so I want to make all operations as quick as possible. I was planning on using Redis to cache parts of the data in memory as required (the parts that will be read/written frequently) so that queries will be faster. I know that with the database currently so small, performance gains won't be noticeable, but I want it to be scalable.
There seems to be a lot of material suggesting that using Redis as a cache is a good idea, but I'm struggling to find much information about when it is suitable to write updates to the SQL database and how best to do it.
For example, should I write updates to the redis-store, then send updated data out to clients and then write the same update to the SQL database all in the same request (in that order)? (i.e. more frequent smaller writes)
Or should I just write updates to the redis-store and send the updated data out to clients, and then periodically (every minute?) read back from the redis-store and save it to the SQL database? (i.e. less frequent but larger writes)
Or is there some other best practice for keeping a redis store and SQL database consistent?
Would my first example perform poorly due to the larger number of writes to disk and the CPU being more active, or would this be negligible?
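For what it's worth, the first option (write-through) can be sketched roughly like this, in C# to match the code earlier in this thread. Everything here is illustrative: the redis multiplexer, connectionString, BroadcastAsync and the items table are invented names; Redis access is via StackExchange.Redis and the SQL write is shown with Npgsql since PostgreSQL is mentioned:

// Option 1 as a write-through handler: update Redis, push to clients,
// then persist to SQL, all in the same request.
public async Task UpdateItemAsync(int id, string json)
{
    IDatabase cache = redis.GetDatabase();           // redis: a ConnectionMultiplexer
    await cache.StringSetAsync($"item:{id}", json);  // 1. update the cache

    await BroadcastAsync(id, json);                  // 2. push to WebSocket clients (app-specific)

    using (var conn = new NpgsqlConnection(connectionString))
    using (var cmd = new NpgsqlCommand(
        "UPDATE items SET payload = @json WHERE id = @id", conn))
    {
        cmd.Parameters.AddWithValue("@json", json);
        cmd.Parameters.AddWithValue("@id", id);
        await conn.OpenAsync();
        await cmd.ExecuteNonQueryAsync();            // 3. persist to SQL
    }
}

The second option (write-behind) just moves step 3 into a background job that runs every minute; the trade-off, as the answers earlier on this page point out, is durability if the process dies before the flush.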

Delayed writes of client-accumulated records (Delphi app -> SQL Server)

My multi-threaded Delphi application parses about 100k marketplace offers. Each worker thread writes the parsed data to a remote SQL Server. Currently each thread parses 3-4 offers per second, which means 10 threads fire about 35 update calls to SQL Server every second.
The idea is to implement optimized database writes – a sort of lazy bulk update. Each thread accumulates 20-30 parsed offers and then writes them to the database in a single pass. I assume that would be far more efficient than the current approach.
I would be happy to hear your general comments and suggestions, as well as any light you can shed on techniques for lazy/delayed/chunked writes from a Delphi app to a SQL Server database.
There's also the good old-fashioned BULK INSERT from a flat file into the database. With a large data transfer app I developed (years ago) this was by far the fastest solution. But that was before large INSERT statements, and it only works if you can wait until you have batches of at least 1000 rows or so.
Since you have only two very simple numeric fields, you won't have to worry about Unicode, delimiters, escaping characters, etc. Just write your intermediate results to a simple ASCII file, then BULK INSERT it in one transaction (see the sketch below).
You will have to make sure this works multithreaded (which should not be too difficult with unique file names), and you will have to experiment with how much latency you can tolerate, whether you can use table locks, etc. The larger the bulk-inserted file, the more you gain.
Make sure you set the SQL Server database to a recovery model that allows minimal logging, to prevent huge transaction logs.
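A rough sketch of that flat-file approach, written in C# for consistency with the code earlier in this thread (the Delphi version is the same two steps). The dbo.Offers table, the two integer columns, the share path and the offers collection are all assumptions, and the file path must be visible to the SQL Server machine:

// 1. Write the accumulated rows to a uniquely named delimited file.
string path = $@"\\server\share\offers_{Guid.NewGuid():N}.csv";  // unique per thread
File.WriteAllLines(path, offers.Select(o => $"{o.OfferId},{o.Price}"));

// 2. Load the whole file in a single transaction.
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    using (var tx = conn.BeginTransaction())
    using (var cmd = new SqlCommand(
        $"BULK INSERT dbo.Offers FROM '{path}' " +
        "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n', TABLOCK)",
        conn, tx))
    {
        cmd.ExecuteNonQuery();
        tx.Commit();
    }
}
File.Delete(path);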
Delphi XE4 contains FireDAC, which gives you two approaches for a solution: CachedUpdates and Array DML.
I like @Uwe's suggestion. If you are rolling your own solution without FireDAC, however, you can use an in-memory dataset as a buffer and then blast the data to a stored procedure.
Of course, this would require changes outside of the code, and you would need appropriate permissions to create the stored procedure and so forth. But if this idea appeals to you, here are two links that may help with this technique:
In Memory Datasets
Bulk push to SQL via stored procedure
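To illustrate the "buffer, then push to a stored procedure" idea (shown in C# rather than Delphi, to match the code earlier in this thread), one common shape is a table-valued parameter. dbo.OfferTableType and dbo.BulkPushOffers are hypothetical names for a user-defined table type and a stored procedure that inserts from it:

// Accumulate 20-30 parsed offers in memory, then send them in one round trip.
var buffer = new DataTable();
buffer.Columns.Add("OfferId", typeof(int));
buffer.Columns.Add("Price", typeof(int));

foreach (var offer in parsedOffers)        // parsedOffers: the thread's local batch
    buffer.Rows.Add(offer.OfferId, offer.Price);

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("dbo.BulkPushOffers", conn))
{
    cmd.CommandType = CommandType.StoredProcedure;
    var p = cmd.Parameters.AddWithValue("@Offers", buffer);
    p.SqlDbType = SqlDbType.Structured;    // table-valued parameter
    p.TypeName = "dbo.OfferTableType";
    conn.Open();
    cmd.ExecuteNonQuery();                 // the whole batch in one call
}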

Is having a copy of SQL data in an application a good idea to save SQL SELECTS?

I am working on a multithreaded .NET 4 application which acquires data continuously and writes it into a SQL database (MySQL or SQL Server - not yet sure).
Every time an INSERT is executed, at least one prior SELECT is necessary in order to synchronize with the database. That is, the application gets a block which contains new and old data, and then has to check which records are new and which are already in the database.
This means a lot of SELECTs which return more or less the same data every time.
Would it be a good idea to have a copy of the last x entries per table within the application?
This way the synchronization could be done on the copy instead of the database.
Pro:
Faster
Contra:
Uses a lot of memory
Risk of becoming out of sync with the database
What do you think? What is the best practice for such a use case?
Any other pros and cons?
Unless you have an external program writing to your database at the same time, you can use buffering.
But instead of buffering SELECT results, just add to the insert method a buffer of the last X (a reasonable number) insertion requests, and only insert the new one if it isn't on that list.
You might also want to lock the insertion method, to make sure the inclusion check is always correct.
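A minimal sketch of that idea (the key type, the MaxBuffered size and the InsertIntoDatabase helper are invented; adapt them to your schema). The lock makes the check-then-insert step atomic across threads, and the queue evicts the oldest key so the buffer stays bounded:

private readonly object _sync = new object();
private readonly HashSet<string> _recent = new HashSet<string>();
private readonly Queue<string> _recentOrder = new Queue<string>();
private const int MaxBuffered = 1000;

public void InsertIfNew(string key)
{
    lock (_sync)
    {
        if (_recent.Contains(key))
            return;                    // seen recently, skip the SELECT and the INSERT

        InsertIntoDatabase(key);       // your existing INSERT logic (hypothetical helper)

        _recent.Add(key);
        _recentOrder.Enqueue(key);
        if (_recentOrder.Count > MaxBuffered)
            _recent.Remove(_recentOrder.Dequeue());   // evict the oldest entry
    }
}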
If you have multiple processes writing to the database, it is non-trivial to maintain perfect synchronization between in-memory data and the database. In fact, the only way to confirm you are synchronized is to make a SELECT query against the database. So you have a trade-off between perfect synchronization and synchronization with some tolerance, which is much more efficient.
My suggestions, which may help in both cases, would be:
Tune your SELECT queries. Add indexes if necessary.
Create meta-data, like version numbers. So that you have to only check something very trivial to determine if you need synchronization.
Write a stored proc which implements your SELECT and INSERT logic. Then you do not have to worry about making multiple calls to the database.
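To make the third suggestion concrete: the point is to fold the "is it already there?" SELECT and the INSERT into a single round trip. The answer suggests a stored procedure; the same logic is shown here inline for brevity (SQL Server syntax; dbo.Samples, SampleKey and Payload are invented names):

const string upsertSql = @"
    IF NOT EXISTS (SELECT 1 FROM dbo.Samples WHERE SampleKey = @key)
        INSERT INTO dbo.Samples (SampleKey, Payload) VALUES (@key, @payload);";

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(upsertSql, conn))
{
    cmd.Parameters.AddWithValue("@key", key);
    cmd.Parameters.AddWithValue("@payload", payload);
    conn.Open();
    cmd.ExecuteNonQuery();   // SELECT + INSERT in one call instead of two
}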

Firebird backup restore is frustrating, is there a way to avoid it?

I am using Firebird, but lately the database has grown really seriously.
There are a lot of DELETE statements running, as well as UPDATEs/INSERTs, and the database file size grows really fast.
After deleting tons of records the database size doesn't decrease, and even worse, I have the feeling that the queries are actually getting a bit slower.
In order to fix this, a daily backup/restore process has been introduced, but because of the time it takes to complete, I would say it is really frustrating to use Firebird.
Any ideas on workarounds or solution on this will be welcome.
Also, I am considering switching to InterBase because I heard from a friend that it does not have this issue - is that so?
We have a lot of huge databases on Firebird in production and have never had an issue with database growth. Yes, every time a record is deleted or updated an old version of it is kept in the file. But sooner or later the garbage collector will sweep it away. Once both processes balance each other out, the database file will grow only by the size of new data and indices.
As a general precaution to prevent enormous database growth, try to make your transactions as short as possible. In our applications we use one READ ONLY transaction for reading all the data. This transaction is kept open for the whole application lifetime. For every batch of INSERT/UPDATE/DELETE statements we use short, separate transactions.
Slow database operations can also result from obsolete index statistics. Here you can find an example of how to recalculate statistics for all indices: http://www.firebirdfaq.org/faq167/
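A hedged sketch of that recalculation, using the FirebirdSql.Data.FirebirdClient ADO.NET provider from C# to match the rest of this page (the same statements can just as well be generated and run from isql, as the FAQ shows):

// List the user indices, then run SET STATISTICS INDEX for each of them.
using (var conn = new FbConnection(connectionString))
{
    conn.Open();
    var indexNames = new List<string>();
    using (var list = new FbCommand(
        "SELECT TRIM(RDB$INDEX_NAME) FROM RDB$INDICES " +
        "WHERE COALESCE(RDB$SYSTEM_FLAG, 0) = 0", conn))
    using (var reader = list.ExecuteReader())
    {
        while (reader.Read())
            indexNames.Add(reader.GetString(0));
    }

    foreach (var name in indexNames)
    {
        using (var set = new FbCommand($"SET STATISTICS INDEX \"{name}\"", conn))
            set.ExecuteNonQuery();    // recalculates selectivity for this index
    }
}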
Check if you have unfinished transactions in your applications. If a transaction is started but never committed or rolled back, the database has to keep old record versions for every transaction after the oldest active one.
You can check the database statistics (with gstat or an external tool); look at the oldest transaction and the next transaction. If the difference between those numbers keeps growing, you have a stuck-transaction problem.
There are also monitoring tools to check the situation; one I've used is Sinatica Monitor for Firebird.
Edit: Also, the database file never shrinks automatically. Parts of it get marked as unused (after a sweep operation) and will be reused. http://www.firebirdfaq.org/faq41/
The space occupied by deleted records will be re-used as soon as it is garbage collected by Firebird.
If GC is not happening (transaction problems?), the DB will keep growing until GC can do its job.
Also, there is a problem when you do a massive delete in a table (e.g. millions of records): the next SELECT on that table will "trigger" the garbage collection, and performance will drop until GC finishes. The only way to work around this is to do the massive deletes at a time when the server is not heavily used, and run a sweep after that, making sure there are no stuck transactions.
Also, keep in mind that if you are using "standard" tables to hold temporary data (i.e. info that is inserted and deleted many times), you can get a corrupted database in some circumstances. I strongly suggest you start using the Global Temporary Tables feature.

SQL Server slow down after duplicating database

I recently moved a bunch of tables from an existing database into a new database due to the database getting rather large. After doing so I noticed a dramatic decrease in performance of my queries when running against the new database.
The process I took to recreate the new database is this:
Generate table CREATE scripts using SQL Server's automatic script generator.
Run the CREATE TABLE scripts.
Insert all data into the new database using INSERT INTO with a SELECT from the existing database.
Run all the ALTER scripts to create the foreign keys and any indexes.
Does anyone have any ideas of possible problems with my process, or some key step I'm missing that is causing this performance issue?
Thanks.
First, at a minimum, I would make sure that auto create statistics is enabled; you can also set auto update statistics to true.
After that I would update the stats by running
sp_updatestats
or
UPDATE STATISTICS
Also realize that the first time you hit the queries they will be slower because nothing is cached in RAM yet. The second hit should be much faster.
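For reference, a hedged example of those settings and commands, sent from C# here only for consistency with the rest of this page (they can just as well be run in Management Studio); [NewDb] is a placeholder for the new database's name and the connection is assumed to point at it:

const string refreshStatsSql = @"
    ALTER DATABASE [NewDb] SET AUTO_CREATE_STATISTICS ON;
    ALTER DATABASE [NewDb] SET AUTO_UPDATE_STATISTICS ON;
    EXEC sp_updatestats;";

using (var conn = new SqlConnection(connectionString))   // Initial Catalog = NewDb
using (var cmd = new SqlCommand(refreshStatsSql, conn))
{
    conn.Open();
    cmd.ExecuteNonQuery();   // refresh statistics in the new database
}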
Did you script the indexes from the tables in the original database? Missing indexes could certainly account for poor performance.
Have you tried looking at the execution plans on each server when running these queries - that should allow you to easily see if they're doing something different e.g. table scanning due to a missing index, poor statistics etc.
Are both DBs sat on the same box with their data files on the same drive arrays?
Can you tell what exactly about those queries got slower? New access plans? The same plans but performing slower? Do they execute slower or are they suspended more? Did all queries get slower or just some? And last but not least, how do you know, i.e. what exactly did you measure, and how?
Some of the usual suspects could be:
The new storage is much slower (.mdf on a slow disk, or on a busy disk)
You changed the data structure during the move (i.e. some indexes did not get ported)
You changed the data size (i.e. compression options), resulting in more pages for the same data
Did anything else change at the same time, e.g. new application code?
By extending the data size (you do not mention deleting the old tables) you are now thrashing the buffer pool (has the page life expectancy decreased in the performance counters?)
Look at how you set up the initial size and growth options. If you didn't give the files enough space to begin with, or if you are growing by 1 MB at a time, that could be a cause of the performance issues.
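If it helps, a hedged sketch of checking and fixing that from C# (the same T-SQL can be run directly in Management Studio; [NewDb] and the logical file name NewDb_Data are placeholders):

// Inspect current file sizes and growth settings of the new database.
const string checkSql =
    "SELECT name, size * 8 / 1024 AS size_mb, growth, is_percent_growth " +
    "FROM sys.database_files;";

// Switch a file that grows in tiny steps to a larger fixed increment.
const string fixSql =
    "ALTER DATABASE [NewDb] MODIFY FILE (NAME = N'NewDb_Data', FILEGROWTH = 256MB);";

using (var conn = new SqlConnection(connectionString))   // Initial Catalog = NewDb
{
    conn.Open();
    using (var check = new SqlCommand(checkSql, conn))
    using (var reader = check.ExecuteReader())
    {
        while (reader.Read())
            Console.WriteLine($"{reader["name"]}: {reader["size_mb"]} MB, growth = {reader["growth"]}");
    }
    using (var fix = new SqlCommand(fixSql, conn))
        fix.ExecuteNonQuery();
}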