This is a general question that has been puzzling me for a long time.
Between the endorsing peer responding to a transaction proposal (call it N) from the client and the client sending the R/W set to the orderer, what happens if another transaction proposal (call it M) is submitted that changes the same values as N? How is this handled? In this case the committed version will differ between N and M, and hence M will fail.
If the answer is that whichever comes first wins, whether at simulation by the endorser or at commit by the peer, is that fair?
How do you explain it?
As you state above, conflicts are not detected during simulation; they are handled as part of validation and commit. All transactions are ordered, so the first one added to a block by the orderer will be processed as valid and committed. Subsequent transactions will be marked as invalid due to conflict in the persisted block(s) and the state change will be ignored.
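The validation step described above can be sketched as a toy model. This is not Fabric code: the `Tx` class, the integer versions, and the result labels are assumptions made for illustration (Fabric itself tracks versions differently). It only shows that two transactions simulated against the same version cannot both pass the version check once they are processed in block order.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    tx_id: str
    read_set: dict   # key -> version seen during simulation (endorsement)
    write_set: dict  # key -> new value


def validate_and_commit(block, state):
    """Validate transactions in block order against `state`.

    `state` maps key -> (value, version). Returns tx_id -> result label.
    """
    results = {}
    for tx in block:
        # Compare the versions read at simulation time with the current ones.
        current = {k: state.get(k, (None, 0))[1] for k in tx.read_set}
        if current == tx.read_set:
            for key, value in tx.write_set.items():
                version = state.get(key, (None, 0))[1]
                state[key] = (value, version + 1)
            results[tx.tx_id] = "VALID"
        else:
            # The transaction stays in the block, but its writes are ignored.
            results[tx.tx_id] = "MVCC_READ_CONFLICT"
    return results


state = {"asset": (100, 1)}
# N and M were both simulated while "asset" was at version 1.
n = Tx("N", read_set={"asset": 1}, write_set={"asset": 90})
m = Tx("M", read_set={"asset": 1}, write_set={"asset": 80})
results = validate_and_commit([n, m], state)
print(results)  # {'N': 'VALID', 'M': 'MVCC_READ_CONFLICT'}
print(state)    # {'asset': (90, 2)}
```

In this model the client that submitted M would have to re-run the proposal against the new version and submit it again.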
I have a pipeline like this:
env.addSource(kafkaConsumer)
    .keyBy { value -> value.f0 }
    .window(EventTimeSessionWindows.withGap(Time.minutes(2)))
    .reduce(::reduceRecord)
    .addSink(kafkaProducer)
I want to expire keyed data with a TTL.
Some blog posts point out that I need a ValueStateDescriptor for that.
I made one like this:
val desc = ValueStateDescriptor("val state", MyKey::class.java)
desc.enableTimeToLive(ttlConfig)
But how do I actually apply this descriptor to my pipeline so it will actually do the TTL expiry?
The pipeline you've described doesn't use any keyed state that would benefit from setting state TTL. The only keyed state in your pipeline is the contents of the session windows, and that state is being purged as soon as possible -- as the sessions close. (Furthermore, since you are using a reduce function, that state consists of just one value per key.)
For the most part, expiring state is only relevant for state you explicitly create, in which case you will have ready access to the state descriptor and can configure it to use State TTL. Flink SQL does create state on your behalf that might not automatically expire, in which case you will need to use Idle State Retention Time to configure it. The CEP library also creates state on your behalf, and in this case you should ensure that your patterns either eventually match or time out.
When I try to use the Aerospike client Write() I get this error:
22 AS_PROTO_RESULT_FAIL_FORBIDDEN
The error occurs only when the Write operation is called after a Truncate() and only on specific keys.
I tried to:
change the key type (string, long, small numbers, big numbers)
change the Key type passed (Value, long, string)
change the retries number on WritePolicy
add a delay (200ms, 500ms) before every write
generate completely new keys (GUID.NewGuid().ToString())
None of these solved it, so I think the only cause is the Truncate operation.
The error is systematic: for the same set of keys, the writes fail on exactly the same keys.
The error also occurs when, after calling Truncate, I wait X seconds and the Management Console shows an object count of "0" for the set.
I have to wait minutes (1 to 5) before I can run the process without hitting the problem.
The cluster has 3 nodes with a replication factor of 2 and SSD persistence.
I'm using the NuGet C# Aerospike.Client v3.4.4.
Running the process on a single local node (docker, in memory) does not give any error.
How can I know when the Truncate() process (the delete operation behind it) has completely finished and I can safely use the set?
[Solution]
As suggested, our devops checked the time synchronization. He found that NTP was not enabled on the machine images (by mistake).
We enabled it and tested again. No more errors.
Thanks,
Alex
Sounds like a potential issue with time synchronization across nodes; make sure you have NTP set up correctly. That would be my only guess at this point, especially as you mention that it works on a single node. The truncate command captures the current time (if you don't specify a time) and uses it to reject writes that are stamped 'prior' to that time. Check (from the top of my head, sorry if it's not exactly this) /opt/aerospike/smd/truncate.smd on each node to see the timestamp of the truncate command, and compare the clocks across the different nodes.
[Thanks @kporter for the comment. So the time would be the same in every truncate.smd file, but a time discrepancy between machines would still cause writes to fail against some of the nodes.]
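A toy model may help to picture why clock skew produces failures on some nodes only. It is a sketch of the behaviour described in this answer, not Aerospike's implementation: the `Node` class, the millisecond offsets, and the `write_allowed` rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    clock_offset_ms: int  # hypothetical skew relative to "true" time

    def now(self, true_time_ms):
        return true_time_ms + self.clock_offset_ms


def truncate(issuing_node, true_time_ms):
    """Return the cutoff shared by all nodes: the issuing node's local time."""
    return issuing_node.now(true_time_ms)


def write_allowed(node, true_time_ms, cutoff_ms):
    """A write is stamped with the receiving node's clock and is rejected
    while that clock has not yet passed the truncate cutoff."""
    return node.now(true_time_ms) > cutoff_ms


nodes = [Node("a", 0), Node("b", -90_000), Node("c", 500)]
cutoff = truncate(nodes[0], true_time_ms=1_000_000)

# One second after the truncate, node "b" (90 s behind) still rejects writes.
after_1s = {n.name: write_allowed(n, 1_001_000, cutoff) for n in nodes}
# Two minutes later every node's clock has passed the cutoff.
after_2min = {n.name: write_allowed(n, 1_120_000, cutoff) for n in nodes}
print(after_1s)    # {'a': True, 'b': False, 'c': True}
print(after_2min)  # {'a': True, 'b': True, 'c': True}
```

Under these assumptions only the keys handled by the lagging node fail, and they keep failing until that node's clock catches up with the cutoff, which would match both the "same keys every time" and the "wait a few minutes" observations in the question.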
I have multiple writers overwriting the same key in Redis. How do I guarantee that only the chosen one writes last?
Can I synchronise writes in Redis without synchronising the writers first?
Background:
In my system a single dispatcher sends work to various workers. Each worker then writes its result to Redis, overwriting the same key. I need to be sure that only the last worker to receive work from the dispatcher writes to Redis.
Use a sorted set (ZSET): add your entry with a score equal to the Unix timestamp, then delete all but the top rank.
A Redis sorted set is a set in which each entry also has a score. The set is ordered by score, and the position of an element in the sorted set is called its rank.
In order:
Remove all the entries with a score less than or equal to the one you are adding (zremrangebyscore). Since you are adding to a set, if your value is a duplicate your new entry would be ignored; you want instead to keep the entry with the highest rank.
Add your value to the zset (zadd).
Delete by rank all the entries except the one with the HIGHEST rank (zremrangebyrank).
You should do it inside a transaction (pipeline).
Example in python:
# t is the time when the dispatcher sent a message to this worker
key = "key_zset:%s" % id
pipeline = self._redis_connection.db.pipeline(transaction=True)
pipeline.zremrangebyscore(key, 0, t)  # avoid duplicate scores and identical data
pipeline.zadd(key, {"value": t})  # redis-py >= 3.0 takes a {member: score} mapping
pipeline.zremrangebyrank(key, 0, -2)  # keep only the highest-ranked entry
pipeline.execute(raise_on_error=True)
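To see why these three steps keep the most recently dispatched result regardless of the order in which workers finish, here is a small in-memory imitation. It does not talk to Redis: `write_latest` and the plain dict standing in for the sorted set are hypothetical helpers that only mimic the three commands.

```python
def write_latest(zset, member, score):
    """Apply the three steps to `zset`, a dict of member -> score."""
    # 1. ZREMRANGEBYSCORE key 0 score: drop entries at or below the new score.
    for m in [m for m, s in zset.items() if 0 <= s <= score]:
        del zset[m]
    # 2. ZADD key score member
    zset[member] = score
    # 3. ZREMRANGEBYRANK key 0 -2: keep only the highest-ranked entry.
    ranked = sorted(zset.items(), key=lambda item: (item[1], item[0]))
    for m, _ in ranked[:-1]:
        del zset[m]


zset = {}
# Worker B got its job second (t=20) but finishes first.
write_latest(zset, "result-from-worker-B", 20)
# Worker A got its job first (t=10) and finishes last.
write_latest(zset, "result-from-worker-A", 10)
print(zset)  # {'result-from-worker-B': 20}
```

The late write from worker A is added and then immediately removed by the rank-based delete, so the entry with the newest dispatch time survives.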
If I were you, I would use redlock.
Before you write to that key, you acquire the lock for it, then update it and then release the lock.
I use Node.js, so it would look something like this. It's not actually correct code, but you get the idea.
Promise.all(startPromises)
  .bind(this)
  .then(acquireLock)
  .then(withLock)
  .then(releaseLock)
  .catch(handleErr)

function acquireLock(key) {
  return redis.rl.lock(`locks:${key}`, 3000)
}

function withLock(lock) {
  this.lock = lock
  // do stuff here after get the lock
}

function releaseLock() {
  this.lock.unlock()
}
You can use a Redis pipeline with a transaction.
Redis is a single-threaded server and executes commands synchronously. When a pipeline with a transaction is used, the server executes all commands in the pipeline atomically.
Transactions
MULTI, EXEC, DISCARD and WATCH are the foundation of transactions in Redis. They allow the execution of a group of commands in a single step, with two important guarantees:
All the commands in a transaction are serialized and executed sequentially. It can never happen that a request issued by another client is served in the middle of the execution of a Redis transaction. This guarantees that the commands are executed as a single isolated operation.
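A simple example in Python. Commands queued in a transactional pipeline only return their values at execute(), so the read has to happen under WATCH before MULTI. The transaction() helper in redis-py does this and retries if the watched key changes:
def square_mod_10(pipe):
    val = int(pipe.get("mykey"))  # runs immediately; "mykey" is being WATCHed
    pipe.multi()  # commands from here on are queued for MULTI/EXEC
    pipe.set("mykey", val * val % 10)

# WATCHes "mykey", calls square_mod_10, EXECs, and retries on WatchError
redis_client.transaction(square_mod_10, "mykey")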
127.0.0.1:6379> keys *
1) "trending_showrooms"
2) "trending_hashtags"
3) "trending_mints"
127.0.0.1:6379> sort trending_mints by *->id DESC LIMIT 0 12
1) "mint_14216"
2) "mint_14159"
3) "mint_14158"
4) "mint_14153"
5) "mint_14151"
6) "mint_14146"
The keys have expired, but their names are still in the set. I need the expired keys to be removed from the set automatically in Redis.
You can't set a TTL on individual members within the SET.
This blog post dives a bit deeper on the issue and provides a few workarounds.
https://quickleft.com/blog/how-to-create-and-expire-list-items-in-redis/
Hope that helps.
Please read this page entirely: https://redis.io/topics/notifications
Summing up, you must have a sentinel program listening to PUB/SUB messages, and you must alter the redis.conf file to enable keyevent expire notifications:
in redis.conf:
notify-keyspace-events Ex
In order to enable the feature a non-empty string is used, composed of
multiple characters, where every character has a special meaning
according to the following table
E Keyevent events, published with __keyevent@<db>__ prefix.
x Expired events (events generated every time a key expires)
Then the sentinel program must listen to the channel __keyevent@0__:expired, if your database is 0. Change the database number if using any other than zero.
Then, when you subscribe to the channel and receive the name of the key that has expired, you simply issue SREM trending_mints <key> to remove it from the set.
IMPORTANT
The expired events are generated when a key is accessed and is found
to be expired by one of the above systems, as a result there are no
guarantees that the Redis server will be able to generate the expired
event at the time the key time to live reaches the value of zero.
If no command targets the key constantly, and there are many keys with
a TTL associated, there can be a significant delay between the time
the key time to live drops to zero, and the time the expired event is
generated.
Basically expired events are generated when the Redis server deletes
the key and not when the time to live theoretically reaches the value
of zero.
So keys will be deleted due to expiration, but the notification is not guaranteed to occur in the moment TTL reaches zero.
ALSO, if your sentinel program misses the PUB/SUB message, well... that's it, you won't be notified another time! (this is also on the link above)
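A minimal sketch of such a listener, assuming the redis-py package, database 0, and the set name trending_mints from the question. The function names and the `RecordingClient` stand-in are made up for illustration; with the Ex setting, Redis publishes the name of each expired key on `__keyevent@0__:expired`.

```python
EXPIRED_CHANNEL = "__keyevent@0__:expired"  # database 0


def handle_message(client, message, set_name="trending_mints"):
    """Remove an expired key's name from `set_name`; return the name or None."""
    if message.get("type") != "message":
        return None  # e.g. the confirmation sent when subscribing
    key = message["data"]
    if isinstance(key, bytes):
        key = key.decode()
    client.srem(set_name, key)
    return key


def run_listener(host="localhost", port=6379):
    """Block forever, cleaning the set as expired events arrive."""
    import redis  # third-party redis-py package

    client = redis.Redis(host=host, port=port, db=0)
    pubsub = client.pubsub()
    pubsub.subscribe(EXPIRED_CHANNEL)
    for message in pubsub.listen():
        handle_message(client, message)


class RecordingClient:
    """Stand-in for a Redis client that records SREM calls (demo only)."""

    def __init__(self):
        self.removed = []

    def srem(self, name, member):
        self.removed.append((name, member))


demo = RecordingClient()
handle_message(demo, {"type": "subscribe", "data": 1})
handle_message(demo, {"type": "message", "data": b"mint_14216"})
print(demo.removed)  # [('trending_mints', 'mint_14216')]
```

Calling run_listener() against a real server starts the loop. Because missed messages are not replayed, a periodic sweep that checks each set member with EXISTS may still be worth adding as a backstop.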
In timestamp-based concurrency control, why do you have to reject a write by transaction T_i on element x_k if some transaction T_j with j > i has already read it?
As stated in the document.
If T_j is not planning to do any update at all, why is it necessary to be so restrictive about T_i's actions?
Assume that T_i occurs first and T_j second. Assume T_i also writes to x. The second read by T_j should fail because T_i is already using the value of x. T_i has the earlier timestamp (i < j), and if T_j uses the last committed version of x, it will be using a stale value if T_i writes to x.
You need to abort the writing transaction T_j during a read, a write, or at commit time because of the potential for a stale value being used. If the writing transaction didn't abort and someone else read and used the old value, the database would not be serializable, as you would get a different result if you ran the transactions in a different order. This is what the quoted text means by timestamp order.
Any two reads of the same value at the same time are dangerous, as they give an inaccurate view of the database and reveal a non-serializable order. If three transactions are running and all use x, then the serializable order is undefined. You need to enforce one read of x at a time, and this forces the transactions to go single file and see the last transaction's x. So T_i, then T_j, then T_k in order, each finishing before the next one starts.
Think about what could happen even if T_j were not to write: it would use a stale value that technically doesn't exist in the database, and it would have ignored the outcome of T_i if T_i wrote.
If three transactions all read x and don't write x, then it is safe to run them at the same time. You would need to know in advance that all three transactions don't write to x.
As the Serializable Snapshot Isolation whitepaper attests, the dangerous structure is two read-write dependencies. But a read-write of x followed by a read of x is also dangerous, because the value is stale if both transactions run at the same time. It needs to be serializable, so you abort the second read of x as there is a younger transaction using x.
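The rule asked about in the question can be shown with a short sketch of basic timestamp ordering as it is usually presented in textbooks. The `Item` class and the function names are illustrative, and a smaller timestamp means the transaction started earlier.

```python
class Rejected(Exception):
    """Raised when an operation arrives too late in timestamp order."""


class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # largest timestamp that has read the item
        self.write_ts = 0  # largest timestamp that has written the item


def read(item, ts):
    if ts < item.write_ts:
        raise Rejected("a later transaction has already written the item")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(item, ts, value):
    if ts < item.read_ts:
        raise Rejected("a later transaction has already read the item")
    if ts < item.write_ts:
        raise Rejected("a later transaction has already written the item")
    item.value = value
    item.write_ts = ts


x = Item(value=10)
seen_by_tj = read(x, ts=2)    # T_j (timestamp 2) reads x and sees 10
try:
    write(x, ts=1, value=99)  # T_i (timestamp 1) now tries to write x
    outcome = "accepted"
except Rejected:
    outcome = "rejected"
print(seen_by_tj, outcome, x.value)  # 10 rejected 10
```

In the serial order T_i, T_j that the timestamps promise, T_j would have seen 99. It has already seen 10, so accepting the write would make the schedule differ from that serial order even though T_j never writes anything.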
I wrote a multiversion concurrency implementation in a simulation. See the simulation runner. My simulation simulates 100 threads all trying to read and write two numbers, A and B. They want to increment the number by 1. We set A to 1 and B to 2 at the beginning of the simulation.
The desired outcome is that A and B should be set to 101 and 102 at the end of the simulation. This can only happen if there is locking or serialization due to multiversion concurrency control. Without concurrency control or locking, the results will be less than 101 and 102 due to data races.
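The difference can be reproduced deterministically without threads. The sketch below is not the simulation described in this answer. It is a single-threaded stand-in, with hypothetical function names, that forces the worst-case interleaving in which every worker reads before anyone writes.

```python
def run_without_control(workers, start):
    """Every worker reads before any worker writes: the classic lost update."""
    snapshots = [start for _ in range(workers)]  # all read the same value
    value = start
    for snap in snapshots:
        value = snap + 1  # each write clobbers the previous one
    return value


def run_with_validation(workers, start):
    """Optimistic increments: a commit succeeds only if the version is
    unchanged since the read; otherwise the worker aborts and retries."""
    value, version, aborts = start, 0, 0
    pending = [(start, 0)] * workers  # (value read, version read) per worker
    while pending:
        read_value, read_version = pending.pop(0)
        if read_version == version:
            value, version = read_value + 1, version + 1
        else:
            aborts += 1
            pending.append((value, version))  # retry with a fresh read
    return value, aborts


print(run_without_control(100, 1))     # 2
print(run_with_validation(100, 1)[0])  # 101
```

The abort count returned by run_with_validation is large here only because the interleaving is deliberately the worst possible one; a real run with threads aborts far less often.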
When a thread reads A or B, we iterate over the versions of that key to see if there is a version that is <= transaction.getTimestamp() and for which committed.get(key) == that version. If successful, it records the transaction as the last one to read that value: rts.put("A", transaction).
At commit time, we check whether rts.get("A").getTimestamp() != committingTransaction.getTimestamp(). If this check is true, we abort the transaction and try again.
We also check if someone committed since the transaction began - we don't want to overwrite their commit.
For each write, we also check whether the other writing transaction is younger than us; if so, we abort. The if statement is in a method called shouldRestart, which is called on reads, at commit time, and on all transactions that touched a value.
public boolean shouldRestart(Transaction transaction, Transaction peek) {
    boolean defeated = (((peek.getTimestamp() < transaction.getTimestamp() ||
            (transaction.getNumberOfAttempts() < peek.getNumberOfAttempts())) && peek.getPrecommit()) ||
            peek.getPrecommit() && (peek.getTimestamp() > transaction.getTimestamp() ||
            (peek.getNumberOfAttempts() > transaction.getNumberOfAttempts() && peek.getPrecommit())
            && !peek.getRestart()));
    return defeated;
}
See the code here. The "or ... && peek.getPrecommit()" part means that a younger transaction can abort if a later transaction gets ahead and the later transaction hasn't been restarted (aborted). Precommit occurs at the beginning of a commit.
During a read of a key, we check the RTS to see if it is lower than our transaction's timestamp. If so, we abort the transaction and restart: someone is ahead of us in the queue and they need to commit.
On average, the system reaches 101 and 102 after fewer than about 300 transaction aborts, with many runs finishing well below 200 attempts.
EDIT: I changed the formula for deciding which transaction wins. If another transaction is younger, or the other transaction has a higher number of attempts, the current transaction aborts. This reduces the number of attempts.
EDIT: The reason for the high abort counts was that a committing thread would be starved by reading threads that kept aborting and restarting because of the committing thread. I added a Thread.yield when a read fails due to a transaction that is ahead; this reduces restart counts to <200.