Does PostgreSQL guarantee uniqueness of `commit timestamp` for two simultaneous commits?

Context: I expose an API that returns only the new rows present in a PostgreSQL table. "New" means added since the last call.
To do that, I have switched on the track_commit_timestamp option on my PostgreSQL server and I order the table content by commit timestamp. It works well.
To provide the new rows, I keep the most recent commit timestamp present in my SQL SELECT response and use it as the starting point of the next request (timestamp >= this value).
But I have a question about this mechanism (track_commit_timestamp): is it possible for two simultaneous commits to have the same timestamp without becoming visible at the same moment? If so, and my SELECT is executed during that small interval, do I lose the content of the second transaction?

I don't think "simultaneous" makes sense here.
Imagine two transactions writing to your database (A,B) and one reading from it (C).
time------------>
A->|
B------------>|
C-->|
Now, (C) will of course see A with a timestamp of T1. B also has a timestamp of T1 but hasn't committed yet. As a result, C can't see it since it may be rolled back.
So, you can only reliably access data for transactions that completed before C started. The latest viable timestamp is just before the oldest concurrent transaction.
There are two simplifications that might help you and one alternative approach.
Firstly, if all your transactions are short (say under 1 second) then you can simply subtract 1 second from your transaction's start time and read only changes older than that. This is actually quite commonly workable.
Secondly, if concurrency is not a concern, you can enforce serialisation, perhaps with transaction isolation levels or through an explicit lock.
The alternative is to stop worrying about timestamps and instead consider your situation as a stream of changes. This would be the approach taken by various trigger-based replication systems or the logical replication supported as an add-on in 9.5(?) onwards and natively in version 10.
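The first simplification above could be sketched as follows. This is only an illustration under stated assumptions: Python's sqlite3 module stands in for PostgreSQL, and the `events` table, the `stamped_at` column, the `fetch_window` helper and the 1-second `SAFETY_MARGIN` are all hypothetical names chosen for the example. Each poll reads a half-open window that ends one margin before "now", so a row that becomes visible late is still picked up by the following poll, provided no write transaction lasts longer than the margin.

```python
import sqlite3

SAFETY_MARGIN = 1.0  # seconds; assumed to exceed the longest write transaction


def fetch_window(conn, lower, now, margin=SAFETY_MARGIN):
    """Return rows stamped in [lower, now - margin) and the next lower bound."""
    upper = now - margin
    if upper <= lower:
        return [], lower  # nothing is old enough to be read safely yet
    rows = conn.execute(
        "SELECT id, stamped_at FROM events "
        "WHERE stamped_at >= ? AND stamped_at < ? "
        "ORDER BY stamped_at, id",
        (lower, upper),
    ).fetchall()
    return rows, upper


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, stamped_at REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)", [(1, 10.0), (2, 10.4), (3, 10.9)]
)

# First poll at t=11.5 only reads rows stamped before 10.5.
first, bound = fetch_window(conn, lower=0.0, now=11.5)

# A transaction stamped at 10.7 becomes visible only after the first poll.
conn.execute("INSERT INTO events VALUES (4, 10.7)")

# Second poll at t=12.5 reads [10.5, 11.5) and therefore still sees row 4.
second, bound = fetch_window(conn, lower=bound, now=12.5)

first_ids = [row[0] for row in first]
second_ids = [row[0] for row in second]
```

The price of this approach is latency: a row is never delivered sooner than one margin after it was stamped.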

Related

How can data be synchronized between processes in SQL?

I'm wondering something perhaps extremely stupid, but I can't seem to find an answer (which is not a good sign, usually).
Assume we have a SQL server (MySQL, PostgreSQL; this question even applies to SQLite, though there's no server) and several clients connected to it. I have countless times seen queries that, in my opinion, might be hard to synchronize.
So let's assume we have a table (usage statistics, say) with a row per day.
statistic (
day,
num_requests
)
(I avoid mentioning data types, since it's not the point, but the number of requests should be a number of some sort.)
So when a new web request is sent, the web server will ask this table's current statistic and increase the number of requests. No biggie right?
cursor.execute("""
SELECT num_requests FROM statistic
WHERE ...
""")
number = cursor.fetchone()[0]  # current count read by this client
number += 1
cursor.execute("""
UPDATE statistic SET num_requests=?
WHERE ...
""", (number, ))
But what happens if two requests are handled more or less simultaneously, perhaps by several clients or by different processes? They each ask for today's current statistic (just a read operation, non-blocking), they get the number of requests from this row (this step doesn't involve the server) and then they increment it by 1. At this point, if both requests are running more or less simultaneously, they have both incremented the same number once and they each send an UPDATE request with their number.
In the end, the number of requests for today's statistic has increased by one, although there were two requests. I know there are mechanisms to ensure proper data synchronization, but I fail to see how they could address the situation in this case. A read is usually non-blocking as far as I know. A write can be blocking, but since the other process's read has already happened, the second write will still store a stale value. And I don't see any way to express that logically.
In other words, this seems like the point where we would lock the row in most programming languages, and say "from that point onward, you can neither read it nor write it, I'm working on it". The first request will execute its read (lock), increment and write, and then unlock. The second request will have to wait patiently for the lock to be released. I don't see that mechanism in SQL. Is it transparent and not even necessary? And if so, how does it work? Or have we lived our entire lives with problems like this?
Thanks!
# Let the database perform the increment itself; no value is sent by the client.
cursor.execute("""
UPDATE statistic SET num_requests=num_requests+1
WHERE ...
""")

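The difference between the two approaches in the question above can be sketched with Python's sqlite3 module. This is a simplified illustration, not a real concurrency test: both "clients" share one in-memory connection, and the interleaving of their reads and writes is written out by hand. The `racy_pair` and `atomic_increment` helpers and the sample date are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE statistic (day TEXT PRIMARY KEY, num_requests INTEGER NOT NULL)"
)
conn.execute("INSERT INTO statistic VALUES ('2024-01-01', 0)")

SELECT_SQL = "SELECT num_requests FROM statistic WHERE day = ?"


def racy_pair(conn, day):
    """Two clients both read the counter before either of them writes it back."""
    read_a = conn.execute(SELECT_SQL, (day,)).fetchone()[0]
    read_b = conn.execute(SELECT_SQL, (day,)).fetchone()[0]
    conn.execute(
        "UPDATE statistic SET num_requests = ? WHERE day = ?", (read_a + 1, day)
    )
    conn.execute(
        "UPDATE statistic SET num_requests = ? WHERE day = ?", (read_b + 1, day)
    )


def atomic_increment(conn, day):
    """The read and the write happen inside a single UPDATE statement."""
    conn.execute(
        "UPDATE statistic SET num_requests = num_requests + 1 WHERE day = ?", (day,)
    )


racy_pair(conn, "2024-01-01")
after_racy = conn.execute(SELECT_SQL, ("2024-01-01",)).fetchone()[0]

atomic_increment(conn, "2024-01-01")
atomic_increment(conn, "2024-01-01")
after_atomic = conn.execute(SELECT_SQL, ("2024-01-01",)).fetchone()[0]
```

Two racy requests advance the counter by one, while two atomic increments advance it by two.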
How to consistently track all new rows in a SQL database table

What I am trying to do
I am developing a web service that runs as multiple server instances, all accessing the same RDBMS (PostgreSQL). While the database is needed for persistence, it contains very little data, which is why every server instance keeps a cache of all the data. Furthermore, the application is really simple in that it only ever inserts new rows into rather simple tables and selects that data on a schedule from all server instances (no updates or changes, only inserts and reads).
The way it is currently implemented
basically I have a table which roughly looks like this:
id BIGSERIAL,
creation_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-- further data columns...
The server is doing something like this every couple of seconds (pseudocode):
get all rows with creation_timestamp > lastMaxTimestamp
lastMaxTimestamp = max timestamp for all data just retrieved
insert new rows into application cache
The issue I am running into
The application skips certain rows when updating the caches. I analyzed the issue and figured out, that the problem is caused in the following way:
1. One server instance creates a new row in the context of a transaction. An id for the new row is retrieved from the associated sequence (id=n) and the creation_timestamp (with value ts_1) is set.
2. Another server does the same in the context of a different transaction. The new row in this transaction gets id=n+1 and a creation_timestamp ts_2 (where ts_1 < ts_2).
3. Transaction 2 finishes before transaction 1.
4. One of the servers executes a "select all rows with creation_timestamp > lastMaxTimestamp". It gets row n+1, but not row n. It sets lastMaxTimestamp to ts_2.
5. Transaction 1 completes.
6. Some time later the server from step 4 executes "select all rows with creation_timestamp > lastMaxTimestamp" again. But since lastMaxTimestamp=ts_2 and ts_2>ts_1, row n will never be read on that server.
Note: CURRENT_TIMESTAMP has the same value during a transaction, which is the transaction start time.
So the application gets inconsistent data into its cache and can't get new rows based on the insertion timestamp OR based on the sequence id. Transaction isolation levels don't really change anything about the situation, since the problem is created in essence by transaction 2 finishing before transaction 1.
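The sequence of events above can be reproduced with a small in-memory simulation. It is only a sketch: no database is involved, and the `Row` class, the `poll` function and the numeric times are hypothetical stand-ins for the real table, the scheduled select and the actual timestamps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    id: int
    creation_ts: float  # fixed at transaction start (like CURRENT_TIMESTAMP)
    commit_time: float  # moment the row becomes visible to other sessions


def poll(rows, now, last_max_ts):
    """Select visible rows with creation_ts > last_max_ts and advance the mark."""
    visible = [
        r for r in rows if r.commit_time <= now and r.creation_ts > last_max_ts
    ]
    new_max = max((r.creation_ts for r in visible), default=last_max_ts)
    return visible, new_max


rows = [
    Row(id=1, creation_ts=1.0, commit_time=5.0),  # transaction 1: starts first, commits last
    Row(id=2, creation_ts=2.0, commit_time=3.0),  # transaction 2: starts later, commits first
]

seen = []
last_max = 0.0
for now in (4.0, 6.0):  # one poll between the two commits, one after both
    batch, last_max = poll(rows, now, last_max)
    seen.extend(r.id for r in batch)
```

After both polls `seen` contains only row 2: the first poll moved the mark to 2.0, so row 1 (stamped 1.0) can never match again.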
My question
Am I missing something? I am thinking there must be a straightforward way to get all new rows from an RDBMS, but I can't come up with a simple solution, at least not a simple solution that is consistent. Extensive locking (e.g. of tables) wouldn't be acceptable for performance reasons. Simply trying to make sure I get all ids from that sequence seems a) complicated and b) not easy to do, since transactions can be rolled back (which leaves sequence ids unused).
Does anyone have a solution?
After a lot of searching, I found the right keywords to google for: "transaction commit timestamp" leads to all sorts of results on transaction timestamp tracking and system columns like xmin:
https://dba.stackexchange.com/questions/232273/is-there-way-to-get-transaction-commit-timestamp-in-postgres
This post has some more detailed information:
Questions about Postgres track_commit_timestamp (pg_xact_commit_timestamp)
In short:
You can turn on a PostgreSQL option to track the timestamps of commits and compare those instead of the current_timestamp/clock_timestamp values taken inside the transaction.
It seems, though, that the timestamp is only tracked when a transaction is completed, not when it is committed, which makes the solution not bulletproof. There are also further issues to consider, such as transaction id (xmin) rollover.
Logical decoding / replication is something to look into for a proper solution.
Thanks to everyone trying to help me find an answer. I hope this summary is useful to someone in the future.

Global revision without locking

Given this set of rules, would it be possible to implement this in SQL?
1. Two transactions that don't modify the same rows should be able to run concurrently. No locks should occur (or at least their use should be minimized as much as possible).
2. Transactions can only read committed data.
3. A revision is defined as an integer value in the system.
4. A new transaction must be able to increment and query a new revision. This revision will be applied to every row that the transaction modifies.
5. No two transactions can share the same revision.
6. A transaction X that is committed before transaction Y must have a revision lower than the one assigned to transaction Y.
I want to use integer as the revision in order to optimize how I query all changes since a specific revision. Something like this:
SELECT * FROM [DummyTable] WHERE [DummyTable].[Revision] > clientRevision
My current solution uses an SQL table [GlobalRevision] with a single row [LastRevision] to keep the latest revision. All my transactions' isolation level are set to Snapshot.
The problem with this solution is that the [GlobalRevision] table with the single row [LastRevision] becomes a point of contention. This is because I must increment the revision at the start of a transaction so that I can apply the new revision to the modified rows. This will keep a lock on the [LastRevision] row throughout the duration of the transaction, killing the concurrency. Even though two concurrent transactions modify totally different rows, they cannot be executed concurrently (Rule #1: Failed).
Is there any pattern in SQL to solve this kind of issue? One solution is to use GUIDs and keep a history of revisions (like git revisions), but this is less convenient than having an integer that we can compare to see whether one revision is newer than another.
UPDATE:
The business case for this is to create a BaaS (Backend as a Service) system with data synchronization between client and server. Here are some use cases for this kind of system:
Client while online modifies an asset, pushes the update to the server, server updates DB [this is where my question relates to], server sends update notifications to interested clients that synchronize their local data with the new changes.
Client connects to server, client requests a pull to the server, server finds all changes that were applied after client's revision and return them to the client, client applies the changes and sets its new revision.
...
As you can see, the global revision lets me put a revision on every change committed on the server, and from this revision I can determine which updates need to be sent to each client depending on its specific revision.
This needs to scale to multiple thousands of users who can push updates in parallel, and those changes must be synchronized to other connected users. So the longer it takes to execute a transaction, the longer it takes for other users to receive the change notifications.
For this reason I want to avoid contention as much as possible. I am not an expert in SQL, so I just want to make sure I am not missing something that would let me do that easily.
Probably the easiest thing for you to try would be to use a SEQUENCE for your revision number, assuming you're on SQL Server 2012 or newer. This is a lighter-weight way of generating an auto-incrementing value that you can use as a revision ID per your rules. Acquiring sequence values at scale should be far less subject to the contention issues you describe than using a full-fledged table.
You do need to know that you could end up with revision number gaps if a given transaction rolled back, because SEQUENCE values operate outside of transactional scope. From the article:
"Sequence numbers are generated outside the scope of the current transaction. They are consumed whether the transaction using the sequence number is committed or rolled back."
If you can relax the requirement for an integer revision number and settle for knowing what the data was at a given point in time, you might be able to use Change Data Capture or, in SQL Server 2016, Temporal Tables. Both of these technologies allow you to "turn back time" and see what the data looked like at a known timestamp.

SQL Server - On-Insert Trigger vs. Per-Minute Job

My question is mainly concerned with "what's best for performance", but kinda "philosophically" speaking as well (if it makes a difference)... so let's jump right in.
[TableA].[ColumnB] stores a value that needs to exist in [TableC].[ColumnD]. Right off the bat, no answers involving Foreign-keys - just assume that they're "not allowed" in this environment for whatever reason.
But due to "circumstances x,y,z", [TableA].[ColumnB] sometimes gets values that do not exist in [TableC].[ColumnD], because, let's say, [TableA] gets populated from an object that exists in running code as a "serialized blob", an in-memory representation of the data, and the [ColumnB] values got populated before those values were deleted from [TableC].[ColumnD] by some other process. ANYWAY, this is for example's sake, so don't get bogged down in the "why does this condition happen", just accept that it does.
To "fix" the problem, which method is best of these two: 1. make a Trigger that fires on-INSERT on [TableA], to Update [ColumnB] to the value that it should be (and assume I have a "mapping" of bad-to-good values). Or, 2. run a scheduled-Job every hour/minute/whatever that runs Update queries to change all possible "bad" values to their corresponding "good" values.
To put it more generally, what's better for performance and/or what is best practice: a Trigger, or a periodic Scheduled-Job? In context, let's say [TableA] is typically on the order of hundreds of thousands of rows, with Inserts happening 10-100 records at-a-time, as frequently as every few minutes to as rarely as a few times per day.
On-insert.
Triggers are like callbacks: they are more logically sound, and they spread any lag across every query. With continual checks (polling or cron jobs), you end up with more severe moments of lag every now and then. In almost all cases, triggers/callbacks are the better way to go, as having 1ms of lag added to every query is better than 100ms of lag at seemingly random intervals.
Use of triggers is generally discouraged, but your load is light and your case seems to be a natural trigger case. Consider using instead-of trigger to avoid two operations on the same row (one insert instead of insert and update). It may be the simplest and most reliable solution (as long as you have written reliable code in the trigger that won't cause the whole operation to crash).
Since you are considering a batch job, you are presumably not concerned with timing issues, i.e. it's OK for your application that the tables may be out of sync for 1 minute or even 1 hour. That's the major difference from the trigger approach, which guarantees that the tables are in sync all the time. Potential timing issues would make me uncomfortable. On the plus side, you won't be at risk of crashing the original insert operation with your trigger.
If you go this route, please consider the Change Tracking feature. Change tracking will indicate which rows have been inserted since the last time you checked, so you won't have to scan the whole table for new records. Alternatively, if your TableA has an IDENTITY primary or unique key, you can implement a similar design without the change tracking functionality.
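The IDENTITY-based variant of the batch job could be sketched as below. This is a minimal illustration, not SQL Server code: Python's sqlite3 module and an AUTOINCREMENT column stand in for SQL Server and IDENTITY, the `table_a`/`column_b` names follow the question, and the `BAD_TO_GOOD` mapping and `fix_new_rows` helper are hypothetical. It also assumes rows become visible in increasing key order, which concurrent inserts do not always guarantee.

```python
import sqlite3

# Hypothetical mapping of known-bad values to their replacements.
BAD_TO_GOOD = {"old-1": "new-1", "old-2": "new-2"}


def fix_new_rows(conn, last_id):
    """Repair rows with id > last_id and return the highest id examined."""
    rows = conn.execute(
        "SELECT id, column_b FROM table_a WHERE id > ? ORDER BY id", (last_id,)
    ).fetchall()
    for row_id, value in rows:
        if value in BAD_TO_GOOD:
            conn.execute(
                "UPDATE table_a SET column_b = ? WHERE id = ?",
                (BAD_TO_GOOD[value], row_id),
            )
    return rows[-1][0] if rows else last_id


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (id INTEGER PRIMARY KEY AUTOINCREMENT, column_b TEXT)"
)
conn.executemany(
    "INSERT INTO table_a (column_b) VALUES (?)", [("ok",), ("old-1",)]
)

last_id = fix_new_rows(conn, 0)        # first scheduled run
conn.execute("INSERT INTO table_a (column_b) VALUES ('old-2')")
last_id = fix_new_rows(conn, last_id)  # second run only looks at the new row

values = [r[0] for r in conn.execute("SELECT column_b FROM table_a ORDER BY id")]
```

Each run touches only the rows added since the previous run, so the job stays cheap even when the table is large.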
Triggers are both best performance and practice, as they maintain referential integrity as well as allowing the server to optimise for performance.
You didn't say what version of SQL Server you were using, but if it's 2008+, you can use Change Data Capture to keep track of data changes to your "primary" table. Then, periodically, you can run a batch over the change table and do whatever processing is required over that small set.

Efficiently detecting concurrent insertions using standard SQL

The Requirements
I have the following table (pseudo DDL):
CREATE TABLE MESSAGE (
    MESSAGE_GUID GUID PRIMARY KEY,
    INSERT_TIME DATETIME
);
CREATE INDEX MESSAGE_IE1 ON MESSAGE (INSERT_TIME);
Several clients concurrently insert rows in that table, possibly many times per second. I need to design a "Monitor" application that will:
1. Initially, fetch all the rows currently in the table.
2. After that, periodically check if there are any new rows inserted and then fetch these rows only.
There may be multiple Monitors concurrently running. All the Monitors need to see all the rows (i.e. when a row is inserted, it must be "detected" by all the currently running Monitors).
This application will be developed for Oracle initially, but we need to keep it portable to every major RDBMS and would like to avoid as much database-specific stuff as possible.
The Problem
The naive solution would be to simply find the maximal INSERT_TIME in rows selected in step 1 and then...
SELECT * FROM MESSAGE WHERE INSERT_TIME >= :max_insert_time_from_previous_select
...in step 2.
However, I'm worried this might lead to race conditions. Consider the following scenario:
1. Transaction A inserts a new row but does not yet commit.
2. Transaction B inserts a new row and commits.
3. The Monitor selects rows and sees that the maximal INSERT_TIME is the one inserted by B.
4. Transaction A commits. At this point, A's INSERT_TIME is actually earlier than B's (A's INSERT was actually executed before B's, before we even knew who was going to commit first).
5. The Monitor selects rows newer than B's INSERT_TIME (as a consequence of step 3). Since A's INSERT_TIME is earlier than B's insert time, A's row is skipped.
So, the row inserted by transaction A is never fetched.
Any ideas how to design the client SQL or even change the database schema (as long as it is mildly portable), so these kinds of concurrency problems are avoided, while still keeping a decent performance?
Thanks.
Without using any of the platform-specific change data capture (CDC) technologies, there are a couple of approaches.
Option 1
Each Monitor registers a sort of subscription to the MESSAGE table. The code that writes messages then writes each MESSAGE once per Monitor, i.e.
CREATE TABLE message_subscription (
    message_subscription_id NUMBER PRIMARY KEY,
    message_id RAW(32) NOT NULL,
    monitor_id NUMBER NOT NULL,
    CONSTRAINT uk_message_sub UNIQUE (message_id, monitor_id)
);
INSERT INTO message_subscription
SELECT message_subscription_seq.nextval,
       sys_guid(),
       monitor_id
FROM monitor_subscribers;
Each Monitor then deletes the message from its subscription once that is processed.
Option 2
Each Monitor maintains a cache of the recent messages it has processed that is at least as long as the longest-running transaction could be. If the Monitor maintained a cache of the messages it has processed for the last 5 minutes, for example, it would query your MESSAGE table for all messages later than its LAST_MONITOR_TIME. The Monitor would then be responsible for noting that some of the rows it had selected had already been processed. The Monitor would only process MESSAGE_ID values that were not in its cache.
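Option 2 might look roughly like the sketch below. It is an illustration only: Python's sqlite3 module stands in for the real database, times are plain numbers, and the `Monitor` class, its `recent` cache and the 300-second `OVERLAP` are hypothetical. The overlap is assumed to be at least as long as the longest-running insert transaction.

```python
import sqlite3

OVERLAP = 300.0  # seconds; assumed >= the longest-running insert transaction


class Monitor:
    def __init__(self, conn):
        self.conn = conn
        self.last_monitor_time = 0.0
        self.recent = {}  # message_guid -> insert_time of processed messages

    def poll(self):
        """Re-read the overlap window and return only unprocessed GUIDs."""
        since = self.last_monitor_time - OVERLAP
        rows = self.conn.execute(
            "SELECT message_guid, insert_time FROM message "
            "WHERE insert_time >= ? ORDER BY insert_time",
            (since,),
        ).fetchall()
        fresh = [(g, t) for g, t in rows if g not in self.recent]
        for g, t in fresh:
            self.recent[g] = t
        if rows:
            newest = max(t for _, t in rows)
            self.last_monitor_time = max(self.last_monitor_time, newest)
        # Entries older than the next query window can no longer be returned.
        cutoff = self.last_monitor_time - OVERLAP
        self.recent = {g: t for g, t in self.recent.items() if t >= cutoff}
        return [g for g, _ in fresh]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE message (message_guid TEXT PRIMARY KEY, insert_time REAL)")

monitor = Monitor(conn)
conn.execute("INSERT INTO message VALUES ('b', 1000.0)")  # B commits first
first = monitor.poll()
conn.execute("INSERT INTO message VALUES ('a', 990.0)")   # A commits late, earlier time
conn.execute("INSERT INTO message VALUES ('c', 1010.0)")
second = monitor.poll()
third = monitor.poll()
```

The late row 'a' is picked up by the second poll even though its INSERT_TIME is older than the mark, and 'b' is not processed twice.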
Option 3
Just like Option 1, you set up subscriptions for each Monitor, but you use some queuing technology to deliver the messages to the Monitor. This is less portable than the other two options, but most databases can deliver messages to applications via queues of some sort (e.g. JMS queues if your Monitor is a Java application). This saves you from reinventing the wheel by building your own queue table and gives you a standard interface in the application tier to code against.
You need to be able to identify all rows added since the last time you checked (i.e. the monitor checks). You have a "time of insert" column. However, as you spell it out, that time of insert column cannot be used with "greater than [last check]" logic to reliably identify subsequently inserted new items. Commits do not occur in the same order as (initial) inserts. I am not aware of anything that works on all major RDBMSs that would clearly and safely apply such an "as of" tag at the actual time of commit. [This is not to say I would know it if such a thing existed, but it seems a pretty safe guess to me.] Thus, you will have to use something other than a "greater than last check" algorithm.
That leads to filtering. Upon insert, an item (row) is flagged as "not yet checked"; when a monitor logs in, it reads all not yet checked items, returns that set, and flips the flag to "checked" (and if there are multiple monitors, this must all be done within its own transaction). The monitors' queries will have to read all the data and pick out which rows have not yet been checked. The implication is, however, that this will be a fairly small set of data, at least relative to the entire set of data. From here, I see two likely options:
Add a column, perhaps "Checked", storing a binary 1/0 value for checked/not checked. The distribution of this value will be extremely skewed, with nearly all rows Checked and very few Unchecked, so finding the unchecked rows should be rather efficient. (Some RDBMSs provide filtered indexes, such that the Checked rows won't even be in the index; once flipped to checked, a row will presumably never be flipped back, so the overhead to support this shouldn't be too great.)
Add a separate table to identify those rows in the "primary" table that have not yet been checked. When a monitor logs in, it reads and deletes the items from that table. This doesn't seem efficient... but again, if the data set involved is small, the overall performance pain might be acceptable.
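The second option (a separate table of unchecked rows) could be sketched like this. It is a minimal single-monitor illustration using Python's sqlite3 module; the `message_unchecked` table and the `insert_message` and `take_unchecked` helpers are hypothetical names. Supporting several monitors would require one pending row per message per monitor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE message (message_guid TEXT PRIMARY KEY, insert_time REAL);
CREATE TABLE message_unchecked (message_guid TEXT PRIMARY KEY);
""")


def insert_message(conn, guid, insert_time):
    """Write the message and its 'unchecked' marker in the same transaction."""
    with conn:
        conn.execute("INSERT INTO message VALUES (?, ?)", (guid, insert_time))
        conn.execute("INSERT INTO message_unchecked VALUES (?)", (guid,))


def take_unchecked(conn):
    """Read the pending GUIDs and delete exactly those, in one transaction."""
    with conn:
        guids = [
            row[0]
            for row in conn.execute(
                "SELECT message_guid FROM message_unchecked ORDER BY message_guid"
            )
        ]
        conn.executemany(
            "DELETE FROM message_unchecked WHERE message_guid = ?",
            [(g,) for g in guids],
        )
    return guids


insert_message(conn, "a", 1.0)
insert_message(conn, "b", 2.0)
first = take_unchecked(conn)
insert_message(conn, "c", 3.0)
second = take_unchecked(conn)
third = take_unchecked(conn)
```

Because a marker only becomes visible when its message commits, the order in which transactions commit no longer matters: a late commit simply shows up in a later call.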
You should use Oracle AQ with a multi-subscriber queue.
This is Oracle specific, but you can create an abstraction layer of stored procedures (or abstract in Java if you like) so that you have a common API to enqueue the new messages and have each subscriber (monitor) dequeue any pending messages. Behind that API, for Oracle you use AQ.
I am not sure if there is a queuing solution for other databases.
I don't think you will be able to come up with a totally database agnostic approach that meets your requirements. You could extend the example above that included the 'checked' column, to have a second table called monitor_checked - that would contain one row per message per monitor. That is basically what AQ does behind the scenes, so it is sort of reinventing the wheel.
With PostgreSQL, use PgQ. It has all those little details worked out for you.
I doubt you will find a robust and manageable database-agnostic solution for this.