I am using Chronicle Map v3.10.1. My map holds approximately 77K entries. When I iterate over it using the entrySet() method, the iteration does not complete; partway through, a Chronicle-specific exception is thrown. Here are the logs produced by Chronicle Map:
2016-09-17 06:39:15 [ERROR] n.o.c.m.i.CompiledMapIterationContext - Contexts locked on this segment:
net.openhft.chronicle.map.impl.CompiledMapIterationContext#205cd34b: used, segment 62, local state: UNLOCKED, read lock count: 0, update lock count: 0, write lock count: 0
Current thread contexts:
net.openhft.chronicle.map.impl.CompiledMapIterationContext#205cd34b: used, segment 62, local state: UNLOCKED, read lock count: 0, update lock count: 0, write lock count: 0
and the exception:
2016-09-17 06:39:15 [ERROR] akka.dispatch.TaskInvocation - Failed to acquire the lock in 60 seconds.
Possible reasons:
- The lock was not released by the previous holder. If you use contexts API,
for example map.queryContext(key), in a try-with-resources block.
- This Chronicle Map (or Set) instance is persisted to disk, and the previous
process (or one of parallel accessing processes) has crashed while holding
this lock. In this case you should use ChronicleMapBuilder.recoverPersistedTo() procedure
to access the Chronicle Map instance.
- A concurrent thread or process, currently holding this lock, spends
unexpectedly long time (more than 60 seconds) in
the context (try-with-resource block) or one of overridden interceptor
methods (or MapMethods, or MapEntryOperations, or MapRemoteOperations)
while performing an ordinary Map operation or replication. You should either
redesign your logic to spend less time in critical sections (recommended) or
acquire this lock with tryLock(time, timeUnit) method call, with sufficient
time specified.
- Segment(s) in your Chronicle Map are very large, and iteration over them
takes more than 60 seconds. In this case you should
acquire this lock with tryLock(time, timeUnit) method call, with longer
timeout specified.
- This is a dead lock. If you perform multi-key queries, ensure you acquire
segment locks in the order (ascending by segmentIndex()), you can find
an example here: https://github.com/OpenHFT/Chronicle-Map#multi-key-queries
java.lang.RuntimeException: Failed to acquire the lock in 60 seconds.
Possible reasons:
- The lock was not released by the previous holder. If you use contexts API,
for example map.queryContext(key), in a try-with-resources block.
- This Chronicle Map (or Set) instance is persisted to disk, and the previous
process (or one of parallel accessing processes) has crashed while holding
this lock. In this case you should use ChronicleMapBuilder.recoverPersistedTo() procedure
to access the Chronicle Map instance.
- A concurrent thread or process, currently holding this lock, spends
unexpectedly long time (more than 60 seconds) in
the context (try-with-resource block) or one of overridden interceptor
methods (or MapMethods, or MapEntryOperations, or MapRemoteOperations)
while performing an ordinary Map operation or replication. You should either
redesign your logic to spend less time in critical sections (recommended) or
acquire this lock with tryLock(time, timeUnit) method call, with sufficient
time specified.
- Segment(s) in your Chronicle Map are very large, and iteration over them
takes more than 60 seconds. In this case you should
acquire this lock with tryLock(time, timeUnit) method call, with longer
timeout specified.
- This is a dead lock. If you perform multi-key queries, ensure you acquire
segment locks in the order (ascending by segmentIndex()), you can find
an example here: https://github.com/OpenHFT/Chronicle-Map#multi-key-queries
at net.openhft.chronicle.hash.impl.BigSegmentHeader.deadLock(BigSegmentHeader.java:59)
at net.openhft.chronicle.hash.impl.BigSegmentHeader.updateLock(BigSegmentHeader.java:231)
at net.openhft.chronicle.map.impl.CompiledMapIterationContext$UpdateLock.lock(CompiledMapIterationContext.java:768)
at net.openhft.chronicle.map.impl.CompiledMapIterationContext.forEachSegmentEntryWhile(CompiledMapIterationContext.java:3810)
at net.openhft.chronicle.map.impl.CompiledMapIterationContext.forEachSegmentEntry(CompiledMapIterationContext.java:3816)
at net.openhft.chronicle.map.ChronicleMapIterator.fillEntryBuffer(ChronicleMapIterator.java:61)
at net.openhft.chronicle.map.ChronicleMapIterator.hasNext(ChronicleMapIterator.java:77)
at java.lang.Iterable.forEach(Iterable.java:74)
The map is persisted to disk and accessed from a single thread.
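For reference, here is a minimal sketch of the setup described above; the key/value types, sizes and file path are placeholders, not my actual code:

import net.openhft.chronicle.map.ChronicleMap;
import java.io.File;
import java.io.IOException;
import java.util.Map;

public class IterateMap {
    public static void main(String[] args) throws IOException {
        // Placeholder types and file; the real map holds ~77K entries and is persisted to disk.
        try (ChronicleMap<Long, String> map = ChronicleMap
                .of(Long.class, String.class)
                .entries(77_000)
                .averageValue("representative value")
                .createPersistedTo(new File("/tmp/my-map.dat"))) {
            // entrySet() iterates segment by segment; the stack trace above shows the
            // iterator taking an update lock on each segment while filling its buffer.
            for (Map.Entry<Long, String> entry : map.entrySet()) {
                // process entry.getKey() / entry.getValue()
            }
        }
    }
}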
What I'm looking to do is a multithreaded Python script (more than 100 threads); each thread will read a value from an API every second and put it into a table ("ALLVALUES"), specifying its key and overwriting the existing value.
Every 5 seconds the main script will read the ALLVALUES table and retrieve the last value.
I tried SQLite, but sometimes a write from a specific thread fails because SQLite locks the database while writing; I use the WAL configuration, but the results are the same.
What architecture could I use to solve my problem?
Writes and reads must be very fast, so I'm looking for something local.
Thank you
I would like to commit a transaction that includes almost 20K objects with Apache Ignite.
I've configured the IgniteCache with a CacheConfiguration in Transactional Mode:
CacheConfiguration<Long, CDonneeAffichage> ccda =
        new CacheConfiguration<>(CacheConstant.CST_CACHE_DONNEE_AFFICHAGE);
ccda.setIndexedTypes(Long.class, CDonneeAffichage.class);
ccda.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
The transaction is created with IgniteTransactions:
IgniteTransactions transactions = igniteInstance.transactions();
try (Transaction tx = transactions.txStart()){
//20K put in the Cache
tx.commit();
}
The update takes around 20 seconds, and during this time it is possible to retrieve partial data; for example, after 12 seconds I can see 11K of the 20K objects.
I really need the data to be consistent:
Before the transaction commits, I should get 0 data returned.
After the transaction commits, I should get the whole 20K.
Does anybody know if it's possible to do this kind of transaction with Apache Ignite?
Thanks,
Transactions in Ignite satisfy ACID requirements, but some operations are not transactional. There are no transactional operations that involve all entries in the cache, because that would require locking all keys, which is quite a heavy action: no other transaction would be able to do its work, because all entries would be locked.
So when you call the IgniteCache#size() method, no transaction is started, and partial results may be returned. Transactions are only isolated from other transactions, not from operations outside transactions.
To determine whether an API method is transactional, check the list of exceptions it throws: if it includes TransactionException, the method supports transactions.
SQL is also currently non-transactional. An experimental release of transactional SQL is planned for version 2.7, which is about to be released.
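In practice that means readers which must not observe the half-committed batch have to read inside a transaction themselves. A rough sketch, reusing the cache and types from the question (someId is just a placeholder key):

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.transactions.Transaction;
import org.apache.ignite.transactions.TransactionConcurrency;
import org.apache.ignite.transactions.TransactionIsolation;

static CDonneeAffichage readCommitted(Ignite igniteInstance, Long someId) {
    IgniteCache<Long, CDonneeAffichage> cache =
            igniteInstance.cache(CacheConstant.CST_CACHE_DONNEE_AFFICHAGE);
    // A PESSIMISTIC / REPEATABLE_READ transaction takes a lock on the key it reads,
    // so the get() waits until the writing transaction releases its locks and then
    // sees either the old value or the fully committed new one.
    try (Transaction tx = igniteInstance.transactions().txStart(
            TransactionConcurrency.PESSIMISTIC, TransactionIsolation.REPEATABLE_READ)) {
        CDonneeAffichage value = cache.get(someId);
        tx.commit();
        return value;
    }
}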
Where can I find information about the flow of a read/write request through the cluster when it is fired from the client API?
The Aerospike configuration reference (http://www.aerospike.com/docs/reference/configuration) mentions transaction queues, service threads, transaction threads, etc., but they are not discussed in the architecture documentation. I want to understand how this works so that I can configure it accordingly.
From client to cluster node
In your application, a record's key is the 3-tuple (namespace, set, identifier). The key is passed to the client for all key-value methods (such as get and put).
The client then hashes the (set, identifier) portion of the key through RIPEMD-160, resulting in a 20B digest. This digest is the actual unique identifier of the record within the specified namespace of your Aerospike cluster. Each namespace has 4096 partitions, which are distributed across the nodes of the cluster.
The client uses 12 bits of the digest to determine the partition ID of this specific key. Using the partition map, the client looks up the node that owns the master partition corresponding to the partition ID. As the cluster grows, the cost of finding the correct node stays constant (O(1)), as it does not depend on the number of records or the number of nodes.
The client converts the operation and its data into an Aerospike wire protocol message, then uses an existing TCP connection from its pool (or creates a new one) to send the message to the correct node (the one holding this partition ID's master replica).
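As a rough illustration of the digest and partition-ID computation described above, here is a sketch using Bouncy Castle's RIPEMD-160 implementation; the real client also folds a key-type byte into the hash input, so treat the exact byte layout as an assumption:

import org.bouncycastle.crypto.digests.RIPEMD160Digest;
import java.nio.charset.StandardCharsets;

public class PartitionSketch {
    static final int PARTITIONS = 4096;

    // Hash the (set, identifier) portion of the key into a 20-byte digest.
    // Simplification: the real client also includes a key-type byte in the hash input.
    static byte[] digest(String set, String userKey) {
        RIPEMD160Digest ripemd = new RIPEMD160Digest();
        byte[] setBytes = set.getBytes(StandardCharsets.UTF_8);
        byte[] keyBytes = userKey.getBytes(StandardCharsets.UTF_8);
        ripemd.update(setBytes, 0, setBytes.length);
        ripemd.update(keyBytes, 0, keyBytes.length);
        byte[] out = new byte[ripemd.getDigestSize()]; // 20 bytes
        ripemd.doFinal(out, 0);
        return out;
    }

    // 12 bits of the digest select one of the 4096 partitions (exactly which 12 bits is an
    // implementation detail; a little-endian mask over the first bytes is assumed here).
    static int partitionId(byte[] digest) {
        return ((digest[0] & 0xFF) | ((digest[1] & 0xFF) << 8)) & (PARTITIONS - 1);
    }

    public static void main(String[] args) {
        byte[] d = digest("demo", "user1");
        System.out.println("partition = " + partitionId(d));
    }
}

The real Java client performs this computation internally for every key-value call, so applications never deal with digests or partition IDs directly.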
Service threads and transaction queues
When an operation message comes in as a NIC transmit/receive queue interrupt,
a service thread picks up the message from the NIC. What happens next depends on the namespace this operation is supposed to execute against. If it is an in-memory namespace, the service thread will perform all of the following steps. If it's a namespace whose data is stored on SSD, the service thread will place the operation on a transaction queue. One of the queue's transaction threads will perform the following steps.
Primary index lookup
Every record has a 64B metadata entry in the in-memory primary index. The primary index is organized as a collection of sprigs per partition, with each sprig implemented as a red-black tree.
The thread (either a transaction thread or the service thread, as mentioned above) finds the partition ID from the record's digest, and skips to the correct sprig of the partition.
Exist, Read, Update, Replace
If the operation is an exists, a read, an update or a replace, the thread acquires a record lock, during which other operations wait to access the specific sprig. This is a very short lived lock. The thread walks the red-black tree to find the entry with this digest. If the operation is an exists, and the metadata entry does exist, the thread will package the appropriate message and respond. For a read, the thread will use the pointer metadata to read the record from the namespace storage.
An update needs to read the record as described above, and then merge in the bin data. A replace is similar to an update, but it skips first reading the current record. If the namespace is in-memory the service thread will write the modified record to memory. If the namespace stores on SSD the merged record is placed in a streaming write buffer, pending a flush to the storage device. The metadata entry in the primary index is adjusted, updating its pointer to the new location of the record. Aerospike performs a copy-on-write for create/update/replace.
Updates and replaces also need to be communicated to the replica(s) if the replication factor of the namespace is greater than 1. After the record locking process, the operation is also parked in the RW Hash (Serializer) while the replica write completes. This is where other transactions on the same record queue up until they hit the transaction pending limit (AKA a hot key). The replica write(s) are handled by a different thread (rw-receive), releasing the transaction or service thread to move on to the next operation. When the replica writes complete, the RW Hash lock is released, and the rw-receive thread packages the reply message and sends it back to the client.
Create and Delete
If the operation is a new record being written, or a record being deleted, the partition sprig needs to be modified.
Like update/replace, these operations acquire the record-level lock and go through the RW hash. Because they add or remove a metadata entry from the red-black tree representing the sprig, they must also acquire the index tree reduction lock. The same thing happens when the namespace supervisor thread finds expired records and removes them from the primary index. A create operation adds an element to the partition sprig.
If the namespace stores on SSD, the create will load the record into a streaming write buffer, pending a flush to SSD, and ahead of the replica write. It will update the metadata entry in the primary index, adjusting its pointer to the new block.
A delete removes the metadata entry from the partition sprig of the primary index.
Summary
exists/read grab the record-level lock, and hold it for the shortest amount of time. That's also the case for update/replace when replication factor is 1.
update/replace also grab the RW hash lock, when replication factor is higher than 1.
create/delete also grab the index tree reduction lock.
For in-memory namespaces the service thread does all the work up to potentially the point of replica writes.
For data on SSD namespaces the service thread throws the operation onto a transaction queue, after which one of its transaction threads handles things such as loading the record into a streaming write buffer for writes, up until the potential replica write.
The rw-receive thread deals with replica writes and returning the message after the update/replace/create/delete write operation.
Further reading
I've addressed key-value operations, but not batch, scan or query. The difference between batch-reads and single-key reads is easier to understand once you know how single-read works.
Durable deletes do not remove the record's metadata entry from the primary index. Instead, they are a new write of a tombstone: there will be a new 64B entry in the primary index, and a 128B entry on the SSD for the record (see the client-side sketch after these notes).
Performance optimizations with CPU pinning. See: auto-pin, service-threads, transaction-queues.
Set service threads and transaction queues equal to the number of cores in your CPU, or use CPU pinning (the auto-pin config parameter) if it is available in your version and possible in your OS environment.
Set transaction threads per queue to 3 (the default is 4; for object sizes under 1KB in a non data-in-memory namespace, 3 is optimal).
This changes with server version 4.7+: the transaction is now handled by the service thread itself. By default, the number of service threads is now set to 5x the number of CPU cores. Once a service thread picks a transaction off the socket buffer, it carries it through to completion, unless the transaction ends up in the rwHash (e.g. writes waiting for replication). The transaction queue still exists internally, but it is only relevant for transaction restarts when transactions queue up in the rwHash (multiple pending transactions for the same digest).
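For completeness, on the client side a durable delete is just an ordinary delete with the durableDelete policy flag set; a small sketch assuming an already-connected AerospikeClient:

import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Key;
import com.aerospike.client.policy.WritePolicy;

static void durableDelete(AerospikeClient client, Key key) {
    WritePolicy policy = new WritePolicy();
    policy.durableDelete = true; // write a tombstone rather than only dropping the index entry
    client.delete(policy, key);
}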
I have a large cursor-based query that runs within a stored procedure under a job. It performs tons of calculations for batches of market data in a loop, all day long. Each iteration pulls pieces of historical time series from disk, fetches them into temporary tables with appropriate indexing, joins them through a number of transformations with intermediate results, and stores the calculation output to disk. At the end of each loop I drop (mostly) or truncate all temporary tables to deallocate the pages of user objects inside tempdb and get the namespace ready for the next iteration.
My problem is that after each cycle, all the internal objects that the DB Engine creates for query execution and spills to tempdb keep their disk space reserved, even after being deallocated when the transactions commit. And it adds up on every cycle as the next batch of new internal objects is spilled to disk.
This leads to permanent tempdb growth, with the accumulating reserved space all tied to newly deallocated internal objects. The DB Engine releases/shrinks (whatever) these tons of wasted disk space only after the session closes, when the proc finishes its cycles.
I can work around the problem by reducing the number of cycles in each job run and simply starting it again, but I would like a proper, fundamental solution: I need a command, or any kind of trick usable inside a session, to force garbage collection on demand, so that deallocated internal objects are cleaned up completely and the tempdb disk space reserved for them is released. Days of googling did not help. Folks, help!
We have exactly the same issue:
time-consuming recalculations are executed every night;
a lot of temporary tables are used in order to get parallel execution plans.
To fix the issue, we divided the work into small processes, each executing in a separate session, but chained (to avoid blocking issues): when the first part finishes, it fires up the next part; when that one finishes, it fires the next one, and so on.
For example, if you have a way to chain your calculations, you can break your loop iterations into separate calls of the procedure with different parameters. When these are executed in different sessions, the pages are released as soon as each session finishes.
Suppose we have a blog tool where, each time a user modifies an Article(Id, Body, Revisions), the revision counter is incremented by 1. If we execute the following query (in MS SQL), and assuming that many people are trying to update the article, will we get the 'right' Revisions?
Since I'm using EF, I have expressed the query in the following way:
context.Database.ExecuteSqlCommand("UPDATE dbo.Articles SET Revisions = Revisions + 1 WHERE Id=@p0;", articleId);
NB: By the 'right' Revisions I mean that if 100 people update the article simultaneously, then once they have all finished, Revisions will be set to 100.
Yes, this is thread-safe. The database engine will lock the record during the update, which means any other threads will have to wait for it to finish its update.
During that time the field is indeed incremented by one, without any interference from other threads. Once done, the resource is unlocked, and the next waiting thread locks it in turn and does the same.
As explained in the docs, the lock is an exclusive one:
Exclusive (X) Used for data-modification operations, such as INSERT, UPDATE, or DELETE. Ensures that multiple updates cannot be made to the same resource at the same time.
and:
Exclusive Locks
Exclusive (X) locks prevent access to a resource by concurrent transactions. No other transactions can read or modify data locked with an exclusive (X) lock.
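If you want to convince yourself, you can simulate the scenario. The sketch below uses plain JDBC rather than EF, with a made-up connection string and article id: it fires the same single-statement UPDATE from 100 threads, and because each statement holds an exclusive row lock for its duration, the increments serialize and Revisions ends up exactly 100 higher.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class RevisionIncrementDemo {
    // Hypothetical connection string and article id, for illustration only.
    private static final String URL =
            "jdbc:sqlserver://localhost;databaseName=Blog;integratedSecurity=true";
    private static final long ARTICLE_ID = 1L;

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(100);
        for (int i = 0; i < 100; i++) {
            pool.submit(() -> {
                // Each task runs the same single-statement UPDATE; the exclusive row
                // lock serializes the increments, so none of them is lost.
                try (Connection con = DriverManager.getConnection(URL);
                     PreparedStatement ps = con.prepareStatement(
                             "UPDATE dbo.Articles SET Revisions = Revisions + 1 WHERE Id = ?")) {
                    ps.setLong(1, ARTICLE_ID);
                    ps.executeUpdate();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        // After all tasks finish, Revisions has increased by exactly 100.
    }
}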