Crate DB Snapshot size - cratedb

I have setup a python script which creates a crate db snapshot everyday at noon.
The query I ran to setup the repo is:
CREATE REPOSITORY repo_name TYPE FS WITH (LOCATION='/path/to/folder', compress=true);
The query I run everyday in order to create the snapshot is:
CREATE SNAPSHOT repo_name.{} ALL WITH (wait_for_completion=true, ignore_unavailable=true);
On the initial run, the snapshot directory size was same as the database size (30GB).
After about a month, the database has grown to 40GB while the snapshot directory size has grown to ~120GB (almost thrice the size of the database!).
Is this normal?
If yes, are there any options/optimizations I can try out to reduce the size of the snapshots?

According to Crate the table data is not compressed. Only metadata is compressed. (I agree it's confusing.)
Snapshots are incremental so I'm not entirely sure why this has grown so much. (Do you perhaps ingest a lot of data that is then perhaps removed but still present during the snapshot?) May be worth raising a query with Crate directly on their github and checking if this is a bug or not.

Related

Is SELECT retrieving data from the WAL files?

To my understanding, database can postpone writing to table files to boost IO performance. When transaction is COMMITted, data are written to the WAL files.
I'm curious, how much delayed the writing to table files can be. In particular, when I use a simple SELECT, e.g.
SELECT * from myTable;
after COMMIT, is it possible that database has to retrieve data from the WAL files in addition to the table files?
The documentation talks about being able to postpone the flushing of data pages:
If we follow this procedure, we do not need to flush data pages to disk on every transaction commit.
What WAL files allow an RDBMS to do is keeping "dirty" data pages in memory and flushing them to disk at a later time. It does not work in a way that the data pages are modified at a later time.
So the answer to your question is "No, a SELECT is always retrieving data from the data pages, not from the WAL files".
PostgreSQL does not read from WAL for this purpose during normal operations. It would only read WAL in order to apply changes to the data files after a crash (or during replication onto another server).
When data from ordinary data files is changed, the pages of the data files are kept in shared memory (in shared_buffers) until they are written to disk. Any other processes wanting to see that data will find it in that shared memory, and it will be in its changed form. They will always look for it in shared_buffers before they try to read it from disk, so no one will ever see the stale on-disk version of the data, except for the recovery process after a crash.

WAL log files fill up quickly - how to prevent this?

currently the logs in the folder “/engine-rocksdb/journals” are running full (WAL logs).
When does ArangoDB do a cleaning run of these logs and delete them automatically and how to trigger this cleaning run earlier? My ArangoDB 3.10 runs in single mode and in a virtual environment (cloud with a network storage).
The logfile are increasing very fast for me because there are many writes to the DB. What is the best way, any idea?
What I have done so far:
If I set the value “rocksdb.wal-archive-size-limit” it does delete the logs when the set limit is reached, but it shows errors in the logfile:
2022-09-27T17:53:04Z [898948] WARNING [d9793] {engines} forcing removal of RocksDB WAL file '/archive/813371.log' with start sequence 5387062892 because of overflowing archive. configured maximum archive size is 1073741824, actual archive size is: 75401520
However, I still don't understand the meaning of the logfile output: "configured maximum archive size is 1073741824, actual archive size is: 75401520`". The "actual archive size" is smaller?
But what are the consequences of lowering the "wal-archive-size-limit" value? Is it possible to switch off the wal-archive completely. What exactly is it for? As I understand it, ArangoDb need it for transaction security (i.e. in case of power loss), right?
In general, yes, this is a good thing, but how can I get ArangoDb to a) limit this WAL-archive (without error massages) and b) do a cleaning run faster?
thx :-)
When does ArangoDB do a cleaning run of these logs and delete them automatically and how to trigger this cleaning run earlier?
ArangoDB uses RocksDB underneath, and RocksDB will move WAL file (.log files) into its archive as soon as possible. In order to do so all data from the WAL file needs to be safely stored in the column families' .sst files and have been flushed to disk.
ArangoDB will delete files from the WAL archive (and only from there) once it can assure that an archived WAL file is not used anymore. It will not remove files for the archive that are or may be in current use.
There are a few reasons why ArangoDB may keep archived WAL files for some time:
when server-to-server replication is used: while a follower replicates data, it may read from the leader's WAL. Deleting the WAL file on the leader may make the replication fail
when arangodump is used to create a database dump, it will create a snapshot of data on the server, and the WAL files for that snapshot will be kept around until the snapshot isn't needed anymore (i.e. arangodump finishes).
the first 180 seconds after server start, all WAL files are intentionally kept, for forensic reasons, and to allow followers to replay events from a leader's WAL when it is restarted. The value of 180 seconds can be changed by adjusting the startup option --rocksdb.wal-file-timeout-initial.
there can be some background processing of changes that may refer to data from WAL files. For example, each insert into a collection will need to increase the collection's count() value by 1. To save an extra write into RocksDB on each insert, the count() value is only written to the storage engine by a background thread, ideally only once every X insert operations. However, this may lead to WAL files being around for a bit longer, especially if the background thread cannot keep up with the insert workload.
There is the startup option --rocksdb.wal-archive-size-limit to put a hard limit on the cumulated size of the WAL files in the archive. From your question, it appears that you are currently using ArangoDB version 3.10.
From the warning message you posted, it seems that the WAL archive cleanup somehow applies the wrong limit values.
It turns out that there has been a recent bugfix, released in ArangoDB version 3.10.1, 3.9.4, and 3.8.8, that should rectify this behavior. So upgrading to one of these or later versions may actually help when using the WAL archive size limit.
Shared your question in the Speedb hive, on Discord, and here is what we got for you:
"By default, ArangoDB set the max_wal_size to 1G the value of rocksdb.wal-archive-size-limit must be set to at least twice this number (otherwise you may end up with a single WAL file and the delete will fail)."
Hope this help, if it doesn't or you have follow up questions, please join the Speedb Discord and we will be happy to help.

Jackrabbit repository incremental backup

I'm using Jackrabbit v2.2.x. I want to know if is there a way to take incremental backup of a jackrabbit repository? I mean, just the delta (difference) based on date or something else. Actually the problem is that the repository size is in TeraBytes and every time we have to take production data it takes a lot of time to copy full repository.
If the storage backend support incremental backups, an incremental low level backup might be the easiest solution.
If not, possibly you could use the EventJournal to iterate over the changes since the last backup, and just backup those changes. Most likely this will require more work however.
Another solution is to do an incremental backup of the data store (if this is what uses most of the disk space), and do a full backup of the node data (persistence managers).

Core Data sqlite file growing

I have a synchronization process and Im using Core Data to store a lot of information. Several times I downloaded the real SQLite database file with the organizer to check if the data is correct.
Some days ago I recongized that the size difference of two SQLite files was huge. One file was 80MB, the other file about 100MB. As I checked out the data in it with a SQLiteviewer there was no difference. The same tables, same indexes, same rows. How can that be? Is is is possible that some data is still in the file when I delete objects over Core Data?
EDIT:
The solution is an option flag which can be inserted in an option NSDictionary and added as parameter to the addPersistentStore method.
NSSQLiteManualVacuumOption
Option key to rebuild the store file, forcing a database wide defragmentation when the store is added to the coordinator.
This invokes SQLite's VACUUM command. It is ignored by stores other than the SQLite store.
Sqlite does not proactively return unused disk space as it deletes data, for performance reasons. This could be why you see the difference. See this link for more info:
SQLite FAQ

Is having multiple data/log files a good thing even on the same LUN?

I have read that it is a good idea to have one file per CPU/CPU Core so that SQL can more efficiently stream data to and from the disks. Ok, I can see the benefit if they are on different spindles, but what if I only have one spindle (4 drives in Raid 10) for my data files (.mdf and .ndf), will I still benefit from splitting the data files (from just the .mdf file to a .mdf and several .ndf files)? Same goes for the log file, although I see no benefit to it as the data has to be written serially and you're limited by the spindle's sequential write speed...
FYI, this is in regards to SQL Server 2005/2008...
Thanks.
The recommendation for multiple tempdb data files is definitely not about IOPS. It is about contention on the allocation pages (GAM, SGAM, PFS) in tempdb. SQL 2005+ doesn't require as big of a load on these pages, but contention still occurs. Not all system require a 1 file to 1 core mapping. Most sytems will perform well with 1 file to 2 or 4 cores. Having too many files adds overhead for managing the files. A good recommendation is to start with 1:4 or 1:2 and increasing if contention continues. Don't go above 1:1.
For other databases, this is not recommended.
And yes, only 1 log file ... always.
8 Steps to better Transaction Log throughput:
Create only ONE transaction log file.
Even though you can create multiple
transaction log files, you only need
one... SQL Server DOES not "stripe"
across multiple transaction log files.
Instead, SQL Server uses the
transaction log files sequentially.
Misconceptions around TF 1118:
Why is the trace flag not required so
much in 2005 and 2008? In SQL Server
2005, my team changed the allocation
system for tempdb to reduce the
possibility of contention. There is
now a cache of temp tables. When a new
temp table is created on a cold system
(just after startup) it uses the same
mechanism as for SQL 2000. When it is
dropped though, instead of all the
pages being deallocated completely,
one IAM page and one data page are
left allocated, and the temp table is
put into a special cache. Subsequent
temp table creations will look in the
cache to see if they can just grab a
pre-created temp table 'off the
shelf'. If so, this avoids accessing
the allocation bitmaps completely. The
temp table cache isn't huge (I think
it's 32 tables), but this can still
lead to a big drop in latch
contention in tempdb.
So the answer is NO to both questions. Log striping was never an issue, and one-NDF-per-CPU is largely a myth, one that will take a very long time to die out. Multiple files IMHO make sense only if you can stripe IO (separate LUNs). Multiple filegroups though make sense, but not for IO reasons, for administrative purposes: piecemeal restores and archive read-only filegroups.
Still good. This is not about IOPS - it is about SQL Server BLOCKING a file for certain operations. mostly when file extends are allocated to a table / index. If you do a lot of inserts / updates, multiple files basically mean another thread will block another file, not wait on the first one.
So, this is not really about IOPS loads, it is about a blocking behavior.