Cassandra query execution sequencing vs. eventual consistency issue - datastax

I am confused about Cassandra's eventual consistency vs. query sequencing. I have the following questions.
1) If I send two queries in sequence (without turning on the isIdempotent property), where the first query deletes a record and the second creates a record, is it possible that query 2 executes before query 1?
My Java code will look like this:
public void foo() {
    delete(entity); // first, delete a record
    create(entity); // second, create a record
}
Another thing: I am not specifying any timestamp in my queries.
2) My second question: Cassandra is eventually consistent. If I send both of the above queries in sequential order and they have not yet been replicated to all nodes, will they maintain that order when they actually get replicated to all nodes?
I tried looking at the Cassandra documentation; it talks about query sequencing in batch operations, but it does not talk about query sequencing in non-batch operations.
I am using Cassandra 2.1.

By default, modern versions of the driver use client-side timestamps. Check the driver documentation here:
https://datastax.github.io/java-driver/manual/query_timestamps/
Based on the timestamp, C* resolves conflicts using LWW (last write wins) semantics: if the create has an earlier timestamp than the delete, a query won't return data; if the create has a newer timestamp, it will.
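For illustration, here is a minimal sketch of pinning the order with explicit timestamps, assuming the 2.1 Java driver and a hypothetical table ks.entities (id int PRIMARY KEY, data text); with explicit USING TIMESTAMP values, LWW resolution is deterministic no matter in which order the writes reach a replica:
// A sketch against a hypothetical table ks.entities.
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
Session session = cluster.connect();
long deleteTs = System.currentTimeMillis() * 1000; // CQL timestamps are microseconds
long createTs = deleteTs + 1;                      // strictly newer than the delete
session.execute("DELETE FROM ks.entities USING TIMESTAMP " + deleteTs
        + " WHERE id = 1");
session.execute("INSERT INTO ks.entities (id, data) VALUES (1, 'fresh')"
        + " USING TIMESTAMP " + createTs);
// The insert carries the newer timestamp, so reads return the row even
// if the delete reaches a replica after the insert does.
cluster.close();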
If you need linearizability, i.e. the guarantee that certain operations will be executed in sequence, you can use lightweight transactions, which are based on Paxos:
http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
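As a hedged sketch with the Java driver (same hypothetical ks.entities table as above), conditional statements go through Paxos and return an [applied] flag:
// Lightweight transactions: IF clauses route these statements through Paxos.
ResultSet rs = session.execute("DELETE FROM ks.entities WHERE id = 1 IF EXISTS");
boolean deleted = rs.one().getBool("[applied]");
rs = session.execute("INSERT INTO ks.entities (id, data) VALUES (1, 'fresh') IF NOT EXISTS");
boolean created = rs.one().getBool("[applied]");
Note that LWTs are considerably more expensive than plain writes, since each one requires multiple round trips between replicas.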

Related

What happens to an SQL SELECT query if the data changes whilst the query is running

We have a large database which is regularly being written to. We have a SELECT query on this DB which takes a couple of minutes to run. If we run this query and newer data is inserted/updated during those couple of minutes, will those changes be picked up by the query (assuming they match its parameters)? Or will it instead return the results as they were when the query started?
You are using Oracle: when executing a query, you read the data as of a specific point in time to guarantee consistency. By default, that is the time the execution started.
Have a read about it in the Database Concepts guide: https://docs.oracle.com/en/database/oracle/oracle-database/19/cncpt/introduction-to-oracle-database.html#GUID-4D3C43F5-4EC6-4A21-9E91-8E4F33FE7790 . How Oracle achieves this is one of the features that sets it apart from most other RDBMSs.
Each query, no matter what the underlying database is, operates within a TRANSACTION, whether explicit or implied. And each transaction has a specific "isolation level".
The best strategy is to BEGIN TRANSACTION explicitly, but in reality you need to make yourself aware of what your chosen DB interface is doing behind the scenes; that will give you your answer. When you ran your query, your software did something specific.
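A minimal JDBC sketch of making that explicit (connection details and the big_table name are hypothetical):
import java.sql.*;

// Make the transaction and its isolation level explicit, so you know
// which snapshot the long-running SELECT reads from.
try (Connection conn = DriverManager.getConnection(url, user, pass)) {
    conn.setAutoCommit(false); // begin an explicit transaction
    conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
    try (Statement st = conn.createStatement();
         ResultSet rs = st.executeQuery("SELECT id FROM big_table")) {
        while (rs.next()) {
            // all rows reflect one consistent snapshot
        }
    }
    conn.commit();
}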

How to consistently track all new rows in a SQL database table

What I am trying to do
I am developing a web service which runs in multiple server instances, all accessing the same RDBMS (PostgreSQL). While the database is needed for persistence, it contains very little data, which is why every server instance keeps a cache of all the data. Furthermore, the application is really simple in that it only ever inserts new rows into rather simple tables and selects that data in a scheduled fashion from all server instances (no updates or changes... only inserts and reads).
The way it is currently implemented
Basically, I have a table which roughly looks like this:
id BIGSERIAL,
creation_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-- further data columns...
The server is doing something like this every couple of seconds (pseudocode):
get all rows with creation_timestamp > lastMaxTimestamp
lastMaxTimestamp = max timestamp for all data just retrieved
insert new rows into application cache
The issue I am running into
The application skips certain rows when updating the caches. I analyzed the issue and figured out that the problem is caused in the following way:
1. One server instance creates a new row in the context of a transaction. An id for the new row is retrieved from the associated sequence (id=n) and the creation_timestamp (with value ts_1) is set.
2. Another server does the same in the context of a different transaction. The new row in this transaction gets id=n+1 and a creation_timestamp ts_2 (where ts_1 < ts_2).
3. Transaction 2 finishes before transaction 1.
4. One of the servers executes "select all rows with creation_timestamp > lastMaxTimestamp". It gets row n+1, but not row n. It sets lastMaxTimestamp to ts_2.
5. Transaction 1 completes.
6. Some time later, the server from step 4 executes "select all rows with creation_timestamp > lastMaxTimestamp" again. But since lastMaxTimestamp=ts_2 and ts_2 > ts_1, row n will never be read on that server.
Note: CURRENT_TIMESTAMP has the same value during a transaction, which is the transaction start time.
So the application gets inconsistent data into its cache and can't get new rows based on the insertion timestamp OR based on the sequence id. Transaction isolation levels don't really change anything about the situation, since the problem is created in essence by transaction 2 finishing before transaction 1.
My question
Am I missing something? I keep thinking there must be a straightforward way to get all new rows of an RDBMS, but I can't come up with a simple solution... at least not with a simple solution that is consistent. Extensive locking (e.g. of tables) wouldn't be acceptable for performance reasons. Simply trying to ensure that all ids from the sequence are seen seems like a) a complicated solution and b) one that can't be done easily, since transactions can be rolled back (which would lead to sequence ids never being used).
Does anyone have a solution?
After a lot of searching, I found the right keywords to google for... "transaction commit timestamp" leads to all sorts of transaction timestamp tracking and system columns like xmin:
https://dba.stackexchange.com/questions/232273/is-there-way-to-get-transaction-commit-timestamp-in-postgres
This post has some more detailed information:
Questions about Postgres track_commit_timestamp (pg_xact_commit_timestamp)
In short:
you can turn on a PostgreSQL option (track_commit_timestamp) to track the timestamps of commits and compare those instead of the current_timestamp/clock_timestamp values from inside the transaction (see the sketch after this list)
it seems, though, that the timestamp is only recorded when a transaction completes, not at the moment it commits, which makes the solution not bulletproof. There are also further issues to consider, like transaction id (xmin) rollover, for example
logical decoding / replication is something to look into for a proper solution
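For reference, a minimal JDBC sketch of the commit-timestamp approach, assuming track_commit_timestamp = on and a hypothetical table my_table; pg_xact_commit_timestamp(xmin) returns the commit time of the transaction that inserted each row:
// Poll for rows whose inserting transaction committed after the last
// commit timestamp we saw (lastMaxCommitTimestamp is application state).
PreparedStatement ps = conn.prepareStatement(
        "SELECT id, pg_xact_commit_timestamp(xmin) AS committed_at"
      + " FROM my_table"
      + " WHERE pg_xact_commit_timestamp(xmin) > ?"
      + " ORDER BY committed_at");
ps.setTimestamp(1, lastMaxCommitTimestamp);
try (ResultSet rs = ps.executeQuery()) {
    while (rs.next()) {
        // add rs.getLong("id") to the application cache and advance
        // lastMaxCommitTimestamp to rs.getTimestamp("committed_at")
    }
}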
Thanks to everyone trying to help me find an answer. I hope this summary is useful to someone in the future.

Does PostgreSQL guarantee uniqueness of `commit timestamp` for two simultaneous commits?

Context: I expose an API that provides only the new rows present in a PostgreSQL table. "New" means added since the last call.
To do that, I have switched on the track_commit_timestamp option on my PostgreSQL server and order the table content by this commit date. It works well.
To provide the new rows, I keep the most recent commit timestamp present in my SQL select response and use it as the starting point for the next request (date >= this value).
But I have a question about this mechanism (track_commit_timestamp): is it possible for two simultaneous commits to have the same timestamp but not become "visible" at the same moment? And so, if my SQL select request is executed during this small interval, do I lose the second transaction's content?
I don't think "simultaneous" makes sense here.
Imagine two transactions writing to your database (A,B) and one reading from it (C).
time------------>
A->|
B------------>|
C-->|
Now, C will of course see A with a timestamp of T1. B also has a timestamp of T1 but hasn't committed yet. As a result, C can't see it, since it may be rolled back.
So, you can only reliably access data for transactions that completed before C started. The latest viable timestamp is just before the oldest concurrent transaction.
There are two simplifications that might help you and one alternative approach.
Firstly, if all your transactions are short (say, under 1 second) then you can simply subtract 1 second from your transaction's start time and read changes older than that. This is actually quite commonly workable.
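For example, a hedged JDBC sketch of that lag-based read (the table my_table and its columns are hypothetical):
// Read only rows old enough that any transaction writing them has
// finished; the 1-second safety lag assumes short transactions.
PreparedStatement ps = conn.prepareStatement(
        "SELECT id FROM my_table"
      + " WHERE creation_timestamp > ?"
      + " AND creation_timestamp < now() - interval '1 second'");
ps.setTimestamp(1, lastMaxTimestamp);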
Secondly, if concurrency is not a concern you can enforce serialisation perhaps with transaction levels or through an explicit lock.
The alternative is to stop worrying about timestamps and instead treat your situation as a stream of changes. This would be the approach taken by various trigger-based replication systems, or by the logical replication supported as an add-on from 9.5(?) onwards and natively in version 10.

Bigquery caching when hitting table would provide a different result?

As part of our BigQuery solution we have a cron job which checks the latest table created in a dataset and will create more if this table is out of date. This check is done with the following query:
SELECT table_id FROM [dataset.__TABLES_SUMMARY__] WHERE table_id LIKE 'table_root%' ORDER BY creation_time DESC LIMIT 1
Our integration tests have recently been throwing errors because this query is hitting BigQuery's internal cache, even though running the query against the underlying table would provide a different result. This caching also occurs if I run the query in the web interface of the Google Cloud Console.
If I specify that the query should not use the cache with the
queryRequest.setUseQueryCache(false)
flag in the code, then the tests pass correctly.
My understanding was that BigQuery's automatic caching would not occur if running the query against the underlying table would provide a different result. Am I incorrect in this assumption, in which case when does caching occur, or is this a bug?
Well, the answer to your question is: your approach is conceptually wrong. You always need to set the no-cache parameter if you want non-cached data. Even in the web UI there are options you need to set. The default is to use the cached version.
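For instance, a minimal sketch with the same QueryRequest API the question already uses:
// Always bypass the cache when checking table freshness.
QueryRequest queryRequest = new QueryRequest()
        .setQuery("SELECT table_id FROM [dataset.__TABLES_SUMMARY__]"
                + " WHERE table_id LIKE 'table_root%'"
                + " ORDER BY creation_time DESC LIMIT 1")
        .setUseQueryCache(false);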
But fundamentally you need to change the process and use a more recent feature:
Automatic table creation using template tables
A common usage pattern for streaming data into BigQuery is to split a logical table into many smaller tables, either for creating smaller sets of data (e.g., by date or by user ID) or for scalability (e.g., streaming more than the current limit of 100,000 rows per second). To split a table into many smaller tables without adding complex client-side code, use the BigQuery template tables feature to let BigQuery create the tables for you.
To use a template table via the BigQuery API, add a templateSuffix parameter to your insertAll request.
By using a template table, you avoid the overhead of creating each table individually and specifying the schema for each table. You need only create a single template, and supply different suffixes so that BigQuery can create the new tables for you. BigQuery places the tables in the same project and dataset. Templates also make it easier to update the schema because you need only update the template table.
Tables created via template tables are usually available within a few seconds.
This way you don't need to have a cron job, as BigQuery will automatically create the missing tables.
Read more here: https://cloud.google.com/bigquery/streaming-data-into-bigquery#template-tables
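As a hedged sketch with the google-cloud-bigquery Java client (table and suffix names are hypothetical), one streaming insert with a templateSuffix is enough to have BigQuery create dataset.table_root_20160101 from the dataset.table_root template:
BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
// The first insert with an unseen suffix creates the table from the template.
InsertAllResponse response = bigquery.insertAll(
        InsertAllRequest.newBuilder(TableId.of("dataset", "table_root"))
                .setTemplateSuffix("_20160101")
                .addRow(Collections.singletonMap("col", "value"))
                .build());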

Does Aerospike have something similar to HBase's co-processor?

HBase's coprocessor is a good example of "moving calculations, not data". I'm not sure whether Aerospike supports something similar to this?
Aerospike supports user-defined functions (UDFs), which are functions users load into the database and execute.
Aerospike provides two types of UDF, record and stream, both of which are equivalent to HBase's Endpoint coprocessors in that they execute against the data and return a result. A record UDF is executed against a single record, allowing for record modifications and calculations on that record. A stream UDF is executed against query results, providing the ability to analyze or aggregate the data. Both types of UDF execute on the node containing the data, and return a user-defined result.
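As a hedged illustration with the Aerospike Java client, here is how a record UDF can be invoked (the Lua module "mymodule" and function "adjust" are hypothetical names, assumed to have been registered with client.register beforehand):
AerospikeClient client = new AerospikeClient("127.0.0.1", 3000);
Key key = new Key("test", "demo", "user1");
// Runs on the node that owns the record and returns the UDF's result.
Object result = client.execute(null, key, "mymodule", "adjust", Value.get(42));
client.close();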
Aerospike does not support the concept of HBase's Observer coprocessors, which are executed based on an event.
This isn't quite a direct answer to your question, but VoltDB supports near-arbitrary Java processing in the database process local to the partition of data you're interested in. You can mix Java and SQL in a fully transactional environment and still scale to millions of ACID transactions per second.