Why do SQL databases use a write-ahead log over a command log? - sql

I read about VoltDB's command log. The command log records the transaction invocations instead of each row change, as a write-ahead log does. By recording only the invocations, the command log is kept to a bare minimum, limiting the impact disk I/O has on performance.
Can anyone explain the database theory behind why VoltDB uses a command log and why standard SQL databases such as Postgres, MySQL, SQL Server, and Oracle use a write-ahead log?

I think it is better to rephrase:
Why does the new, distributed VoltDB use a command log instead of a write-ahead log?
Let's do an experiment and imagine you are going to write your own storage/database implementation. Undoubtedly you are advanced enough to abstract a file system and use block storage along with some additional optimizations.
Some basic terminology:
State : stored information at a given point of time
Command : directive to the storage to change its state
So your database state is essentially a set of storage blocks. The next step is to execute some command against that state.
Please note several important aspects:
A command may affect many stored entities, so many blocks will get dirty
Next state is a function of the current state and the command
Some intermediate states can be skipped, because it is enough to have a chain of commands instead.
Finally, you need to guarantee data integrity.
Write-Ahead Logging: the central concept is that state changes are logged before any heavy update to permanent storage. Following our model, we log the incremental change to each block.
Command Logging: the central concept is to log only the command that is used to produce the next state.
There are pros and cons to both approaches: the write-ahead log contains all of the changed data, while the command log is fast and lightweight but requires additional processing at replay time.
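To make the granularity difference concrete, here is a rough sketch (the procedure, table, and values are hypothetical, not taken from VoltDB):

    -- Hypothetical transaction: move 10 credits between two accounts.

    -- A command log records only the invocation and its parameters:
    EXEC TransferCredits @from = 42, @to = 99, @amount = 10;

    -- A write-ahead log conceptually records the effects, i.e. the changed
    -- row/block images produced by executing that invocation:
    UPDATE accounts SET credits = credits - 10 WHERE id = 42;  -- old = 120, new = 110
    UPDATE accounts SET credits = credits + 10 WHERE id = 99;  -- old =  35, new =  45

Replaying the command log means re-executing the invocation; replaying the WAL means re-applying the stored changes.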
VoltDB: Command Logging and Recovery
The key to command logging is that it logs the invocations, not the
consequences, of the transactions. By recording only the invocation,
the command logs are kept to a bare minimum, limiting the impact the disk I/O will
have on performance.
Additional notes
SQLite: Write-Ahead Logging
The traditional rollback journal works by writing a copy of the
original unchanged database content into a separate rollback journal
file and then writing changes directly into the database file.
A COMMIT occurs when a special record indicating a commit is appended
to the WAL. Thus a COMMIT can happen without ever writing to the
original database, which allows readers to continue operating from the
original unaltered database while changes are simultaneously being
committed into the WAL.
PostgreSQL: Write-Ahead Logging (WAL)
Using WAL results in a significantly reduced number of disk writes,
because only the log file needs to be flushed to disk to guarantee
that a transaction is committed, rather than every data file changed
by the transaction.
The log file is written sequentially, and so the
cost of syncing the log is much less than the cost of flushing the
data pages. This is especially true for servers handling many small
transactions touching different parts of the data store. Furthermore,
when the server is processing many small concurrent transactions, one
fsync of the log file may suffice to commit many transactions.
Conclusion
Command Logging:
is faster
has a smaller footprint
has a heavier "replay" procedure
requires frequent snapshots
Write-ahead logging is a technique to provide atomicity. Better command-logging performance should also improve transaction processing. (See also: Databases on 1 Foot.)
Confirmation
VoltDB Blog: Intro to VoltDB Command Logging
One advantage of command logging over ARIES style logging is that a
transaction can be logged before execution begins instead of executing
the transaction and waiting for the log data to flush to disk. Another
advantage is that the IO throughput necessary for a command log is
bounded by the network used to relay commands and, in the case of
Gig-E, this throughput can be satisfied by cheap commodity disks.
It is important to remember that VoltDB is distributed by nature, so transactions are a little trickier to handle and the performance impact is noticeable.
VoltDB Blog: VoltDB’s New Command Logging Feature
The command log in VoltDB consists of stored procedure invocations and
their parameters. A log is created at each node, and each log is
replicated because all work is replicated to multiple nodes. This
results in a replicated command log that can be de-duped at replay
time. Because VoltDB transactions are strongly ordered, the command
log contains ordering information as well. Thus the replay can occur
in the exact order the original transactions ran in, with the full
transaction isolation VoltDB offers. Since the invocations themselves
are often smaller than the modified data, and can be logged before
they are committed, this approach has a very modest effect on
performance. This means VoltDB users can achieve the same kind of
stratospheric performance numbers, with additional durability
assurances.

From the description of Postgres' write-ahead log (http://www.postgresql.org/docs/9.1/static/wal-intro.html) and VoltDB's command log (which you referenced), I can't see much difference at all. It appears to be the identical concept with a different name.
Both sync only the log file to disk, not the data, so that the data can be recovered by replaying the log file.
Section 10.4 of the VoltDB documentation explains that the community version does not have a command log, so it would not pass the ACID test. Even in the enterprise edition, I don't see the details of its transaction isolation (e.g. http://www.postgresql.org/docs/9.1/static/transaction-iso.html) needed to make me comfortable that VoltDB is as serious as Postgres.

With WAL, readers read pages from the unflushed log; no modification is made to the main DB. With command logging, you have no ability to read from the command log.
Command logging is therefore vastly different. VoltDB uses command logging to create recovery points and ensure durability, sure, but it is writing to the main DB store (RAM) in real time, with all the attendant locking issues, etc.

The way I read it is as follows: (My own opinion)
Command Logging as described here logs only transactions as they occur and not what happens in or to them. Ok, so here is the magic piece... If you want to rollback you need to restore the last snapshot and then you can replay all the transactions that were applied after that (Described in the link above). So effectively you are restoring a backup and re-applying all your scripts, only VoltDB has now automated it for you.
The real difference I see is that you cannot roll back to a point in time logically, as you can with a normal transaction log. Normal transaction logs (MSSQL, MySQL, etc.) can easily roll back to a point in time (with the correct setup) because the transactions can be 'reversed'.
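For comparison, a conventional point-in-time restore from a transaction log looks roughly like this in SQL Server (the database name, file paths, and timestamp below are placeholders):

    -- Restore the last full backup, but do not recover the database yet.
    RESTORE DATABASE YourDb FROM DISK = N'D:\Backups\YourDb_full.bak' WITH NORECOVERY;

    -- Replay the transaction log, stopping at a chosen moment in time.
    RESTORE LOG YourDb FROM DISK = N'D:\Backups\YourDb_log.trn'
        WITH STOPAT = '2015-06-01T10:30:00', RECOVERY;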
An interesting question comes up, referring to the post by pedz: will it always pass the ACID test, even with the command log? Will do some more reading...
Added: Did more reading and I don't think this is a good idea for very big and busy transactional databases. A DB snapshot is automatically created when the command logs fill up, to save you from big transaction logs and the I/O used for them. You are going to incur large amounts of I/O with your snapshots being done at a regular interval, and you are also pushing your memory to the brink. Also, in my view, you lose the ability to easily roll back to a point in time before the last automatic snapshot; I think this will get very tricky to manage.
I'd rather stick to transaction logs for transactional systems. It's proven and it works.

It's really just a matter of granularity. They log operations at the level of stored procedures, while most RDBMSs log at the level of individual statements (and 'lower'). Also, their blurb regarding advantages is a bit of a red herring:
One advantage of command logging over ARIES style logging is that a
transaction can be logged before execution begins instead of executing
the transaction and waiting for the log data to flush to disk.
They have to wait for the command to be logged too; it's just a much smaller record.
If I'm not mistaken, VoltDB's unit of transaction is a stored proc. Traditional RDBMSs usually need to support ad-hoc transactions containing any number of statements, so procedure-level logging is out of the question. Furthermore, stored procedures are often not truly deterministic in traditional RDBMSs (deterministic meaning: given params + log + data, they always produce the same output), which they would have to be for this to work.
Nevertheless, the performance improvements would be substantial for this constrained RDBMS model.
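To illustrate the determinism point: a command log cannot safely replay something like the hypothetical procedure below, because re-executing the invocation would produce different values than the original run did.

    -- Hypothetical procedure: logging only "EXEC LogVisit @user = 42" is not
    -- enough to reconstruct state, because NEWID() and GETDATE() return
    -- different values on replay.
    CREATE PROCEDURE LogVisit @user int
    AS
    BEGIN
        INSERT INTO visits (visit_id, user_id, visited_at)
        VALUES (NEWID(), @user, GETDATE());
    END

A command-logging engine therefore has to require that its procedures are deterministic, as noted above.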

A few terms before I start explaining:
Logging schemes: the database uses logging schemes such as shadow paging or write-ahead logging (WAL) to implement atomicity and durability (how is a different topic).
In order to understand why WAL is better, let's look at an issue with shadow paging. In shadow paging, the database keeps a master version and a shadow version of the database. If a table has a billion rows and the buffer pool manager does not have enough memory to hold all the tuples (records) in memory, the dirty pages are not written to the master version until the transaction(s) commit.
Once all the transactions are committed, a flag is switched and the shadow version becomes the master version. The superseded pages of the old master (for example, the old copies of page 3 and page 5 in the original diagram) become stale and can be garbage collected.
The issue with this approach is that it leaves behind a large number of fragmented, randomly located pages; writing them requires random I/O, which is slower than the sequential writes a write-ahead log performs.
The other advantage of WAL is better runtime performance (you are not doing random I/O to flush out pages), at the cost of slower recovery. With shadow paging, recovery is faster, but recovery is only needed occasionally.

Related

SQL SERVER TRANSACTION LOG

What are the consequences if transaction log growth is restricted and the log becomes full in SQL Server?
It will explode and burn down your house...
Seriously, it will generate problems such as not being able to perform transactions.
I strongly agree with Kundan.
But I would like to add some more points:
Additionally, transaction log expansion may occur for one of the
following reasons or in one of the following scenarios:
A very large transaction log file.
Transactions may fail and may start to roll back.
Transactions may take a long time to complete.
Performance issues may occur.
Blocking may occur.
The database is participating in an AlwaysOn availability group.
You can take the following actions if the log file is full (a T-SQL sketch of the first few follows the list):
Backing up the log.
Freeing disk space so that the log can automatically grow.
Moving the log file to a disk drive with sufficient space.
Increasing the size of a log file.
Adding a log file on a different disk.
Completing or killing a long-running transaction.
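As a rough sketch, the first few of these actions map to T-SQL along these lines (the database name, file path, logical file name, and size are placeholders):

    -- See how full each database's log currently is.
    DBCC SQLPERF(LOGSPACE);

    -- Back up the log so the inactive portion can be reused
    -- (assumes the FULL recovery model).
    BACKUP LOG YourDb TO DISK = N'D:\Backups\YourDb_log.trn';

    -- Grow the log file explicitly.
    ALTER DATABASE YourDb
        MODIFY FILE (NAME = YourDb_log, SIZE = 8192MB);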
For more info, please refer to the links below:
https://support.microsoft.com/en-in/help/317375/a-transaction-log-grows-unexpectedly-or-becomes-full-in-sql-server
https://msdn.microsoft.com/en-us/library/ms175495.aspx

.NET SqlDataReader isolated read

I have a SQL Server database which stores accounts with credits (about 200,000 records), and a separate table which stores the transactions (about 20,000,000).
Whenever a transaction is added to the database the credits are updated.
What I need to do is update client programs (using a web service) to store the credits locally, and whenever new transactions are added to the server they are sent to the clients as well (using timestamps for the delta). My main problem is creating the first data set for the client. I need to supply the list of all accounts and the last timestamp on the transaction table.
This would mean I have to create this list and the last timestamp within a snapshot, because any updates while I am creating the list would mean a mismatch between the credits total and the last transaction timestamp.
I've researched the ALLOW_SNAPSHOT_ISOLATION setting and using snapshot isolation on the SqlCommand transaction, but from what I've read this will induce a significant performance penalty. Is this true, and can this problem be solved by other means?
but from what I've read this will induce a significant performance penalty.
I don't even want to know where you read that. I'll refer you to the official document. The costs come from additional tempdb space used for row versions and from traversing old row versions. These problems do not concern you if you have a low write rate.
Snapshot isolation is a boon for solving blocking and consistency issues. It is a perfect match for your scenario.
Many SQL Server questions on Stack Overflow lead me to comment "did you investigate snapshot isolation yet?". Most underused feature.
Oracle and Postgres have it always on.
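For reference, enabling and using snapshot isolation for the initial read looks roughly like this (the database, table, and column names are placeholders for your schema):

    -- Enable row versioning for the database.
    ALTER DATABASE YourDb SET ALLOW_SNAPSHOT_ISOLATION ON;

    -- Build the initial client data set from one consistent point in time.
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    BEGIN TRANSACTION;

    SELECT AccountId, Credits FROM dbo.Accounts;
    SELECT MAX(ModifiedAt) AS LastTransactionStamp FROM dbo.AccountTransactions;

    COMMIT TRANSACTION;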
Don't jump on the SI wagon hastily. Like everything else, it has its benefits and its drawbacks.
As far as the drawbacks are concerned, the application might, for example, count on blocking behaviour and/or be willing to wait for the latest version of the data. You should thoroughly test the application under SI to be sure it behaves correctly. Furthermore, an uncommitted transaction could make a mess of the version store and lead to dramatic tempdb growth, so monitoring is a must.
Also, SI might be overkill for you if you don't normally have blocking issues.
Instead, if what you need is a one-off or close to it, create a database snapshot of your database, create the initial list from that snapshot, and then simply drop it.
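A rough sketch of that one-off approach (names and paths are placeholders; the logical file name in the ON clause must match the source database's data file):

    -- Create a static, point-in-time snapshot of the source database.
    CREATE DATABASE YourDb_InitSnapshot
    ON (NAME = YourDb_Data, FILENAME = N'D:\Snapshots\YourDb_InitSnapshot.ss')
    AS SNAPSHOT OF YourDb;

    -- Build the initial list from the snapshot...
    SELECT AccountId, Credits FROM YourDb_InitSnapshot.dbo.Accounts;

    -- ...then simply drop it.
    DROP DATABASE YourDb_InitSnapshot;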

Could a query with the READ UNCOMMITTED isolation level cause locks on the tables it accesses?

My app needs to batch-process 10M rows, the result of a complex SQL query that joins tables.
I plan to iterate over a result set, reading a hundred rows per iteration.
To run this on a busy OLTP production DB and avoid locks, I figured I'd query with the READ UNCOMMITTED isolation level.
Would that get the query out of the way of any DB writes, avoiding any row/table locks?
My main concern is my query blocking any other DB activity, I'm far less concerned with the other way around.
Side Notes:
1. I'll be reading historical data, so I'm unlikely to meet uncommitted data. It's OK if I do.
2. The iteration process could take hours. The DB connection would remain open through this process.
3. I'll have two such concurrent batch instances at most.
4. I can tolerate duplicate rows (a by-product of read uncommitted).
5. DB2 is the target DB, but I want a solution that fits other DBs vendors as well.
6. Will snapshot isolation level help me clear out server memory?
Have you actually encountered any real locks on read?
As far as I'm concerned, the only reason READ UNCOMMITTED existed in the SQL standard was to allow non-locking reads. I don't know DB2, but I would blindly bet that it does not lock data during reads in READ UNCOMMITTED mode. Most modern RDBMSs, however, don't lock data at all during reads (*). So READ UNCOMMITTED is either not available (in Oracle, for example) or is silently promoted to READ COMMITTED (PostgreSQL).
If you can freely choose the engine, either check DB2 transaction isolation level handling or go for Oracle/PostgreSQL/other.
(*) More precisely, they don't exclusively lock the data. Some shared locks can be placed on queried tables so no DDL alters them during read.
My answer applies to SQL Server.
Read committed releases locks after every row read (approximately). Locking is probably not your problem.
I recommend you use the safer READ COMMITTED. Better yet, use snapshot isolation. That removes many locking problems. There are disadvantages as well, so you'd better read a little about it.
My main concern is my query blocking any other DB activity
Snapshot isolation makes all locking concerns go away for read-only transactions. No blocking either way, full data consistency. Be aware that long-running transactions can cause TempDB to fill with snapshot versions.
The DB connection would remain open through this process.
That's a problem because a network hiccup, app deployment or mirroring failover would kill your batch process.
Be aware that read uncommitted can cause queries to sporadically fail outright. You need retry logic, or you must tolerate failed jobs.
In SQL Server, the READ UNCOMMITTED transaction isolation level causes reads to take no shared locks on the table.
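For illustration, the two usual ways to request dirty reads in SQL Server look like this (the table and column names are placeholders; DB2 has its own syntax for uncommitted reads):

    -- Session-wide: subsequent reads in this session are dirty reads.
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

    SELECT t.Id, t.Amount
    FROM dbo.AccountTransactions AS t
    WHERE t.CreatedAt < '20150101';

    -- Per-table: the NOLOCK hint requests the same behaviour for one table only.
    SELECT t.Id, t.Amount
    FROM dbo.AccountTransactions AS t WITH (NOLOCK)
    WHERE t.CreatedAt < '20150101';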

Is it enough to test a stored-procedure safely just by running it in a transaction?

I have a stored procedure called MoveSomeItems which gets some rows from tableA in the Foo DB and moves them to tableA in the Bar DB.
I want to test whether this sp really moves the items.
Is it enough to run this sp in a transaction and select the rows to see whether they are moved, or should I approach it in a different way?
This depends on what the impact of it all going wrong would be. What would the impact of having incorrect data in the destination table be: will it kill someone, simply annoy them, or is it unlikely anyone will notice? Will it be easy to fix?
There are risks associated with the approach you have given. For instance:
If the database is very busy, it is possible to cause excessive locking or even a deadlock with a transaction that may cause other transactions to fail. Setting the TRANSACTION ISOLATION LEVEL to READ UNCOMMITTED and the DEADLOCK PRIORITY to LOW will help to minimise this, but not eliminate it entirely.
There is the possibility that other transactions may be running in READ UNCOMMITTED isolation mode, in which case they will temporarily see the results of the insert until the rollback is issued.
It is worth noting that if the procedure you are testing calls COMMIT TRANSACTION inside it you might not get the result you want when you call the ROLLBACK.
You might push the database or log to run out of disk space.
You might use up all the available CPU, Memory, Disk IO, Network or some other capacity limit.
Finally, I suspect this is not a complete list. The point I’m trying to make is that it could go wrong in strange ways.
If you have a personal development database that is fully backed up then you wouldn't even need the transaction, simply do a restore after the event. The transaction may well save you some time though. This is the safest solution.
If you are using a shared development database your approach might be acceptable enough, but I would still do a backup just in case, especially if you are already on bad terms with the team.
If you are using a live database it may still be acceptable if the system as a whole is not that critical and can sustain some downtime while you repair things. Again do a backup.
If the database you are looking at is controlling a process that is safety-critical or some other mission-critical function, don't even go there; you may lose the no-claims bonus on your liability insurance, or worse. In this instance it is best to restore a backup onto a test server and test there, thus creating my first scenario. But be warned: there are lots of issues that have to be considered when doing this. For instance, it may be illegal to use personal information in a test system. Also, there may be dependencies on other systems that will need to be mocked out to ensure you don't affect them; for example, don't connect a test system to a live email server.
If I have a complex stored proc that I want to be able to test and roll back, I add an input parameter (always as the last parameter), @debug, with a default value of 0 (so you don't need to specify it when you are running in prod).
Then I write code at the end to test whether the parameter = 1 and, if so, I run whatever SELECT queries show me the data I want to see, and then send the program to the CATCH block using RAISERROR (never write multiple transactions without a TRY...CATCH block) and have it roll back.
This way you can easily check your results on dev and roll back automatically.
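A minimal sketch of that pattern, using hypothetical table and column names (only the @debug / TRY...CATCH structure is the point here):

    CREATE PROCEDURE dbo.MoveSomeItems
        @debug bit = 0            -- pass 1 on dev to inspect results and roll back
    AS
    BEGIN
        BEGIN TRY
            BEGIN TRANSACTION;

            -- Hypothetical "move": copy the matching rows, then delete the originals.
            INSERT INTO BarDb.dbo.tableA (Id, Payload)
            SELECT Id, Payload FROM FooDb.dbo.tableA WHERE MoveFlag = 1;

            DELETE FROM FooDb.dbo.tableA WHERE MoveFlag = 1;

            IF @debug = 1
            BEGIN
                -- Show what would have been committed, then force the CATCH block.
                SELECT COUNT(*) AS RowsInDestination FROM BarDb.dbo.tableA;
                RAISERROR('Debug run: rolling back.', 16, 1);
            END

            COMMIT TRANSACTION;
        END TRY
        BEGIN CATCH
            IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
            IF @debug = 0 THROW;  -- re-raise real errors when not debugging
        END CATCH
    END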

Daily Data subset of main database

I have a large DB (500 GB), and one of our customers wants a daily snapshot of only his data. He only has a 3 Mb connection, and I suspect that is the max. What is the most effective method I could use?
1. Views that are updated, but he wants the underlying tables.
2. Replication; I don't know much about this.
3. An alternative method.
Merge replication, which allows you to initialize a subscriber without using a snapshot. You will have to initialize replication on the subscriber from a backup. You can possibly do the same with transactional replication, but it just never worked quite the same for me. YMMV. When it breaks (etc.) you will have to be prepared to ship a new backup and start over (hacking replication sometimes works, but don't count on it). Changing database structures is also a pain once they are in replication. I have seen ~500 GB with less than 3 Mbit work in production, though not without proper planning and preparation (and grey hair).
I have used transactional replication with a read-only subscription, where I invoked the distribution with a batch file on a task schedule once a day. Transactional replication did not feel as well maintained (by MS) or as stable as merge replication, though the data integrity with transactional was more consistent.
I have not tried transaction log shipping, but that might also be an option.
(p.s. notice I didn't say "If it breaks")