Which is Faster, Fine-grained or Coarse-grained? - locking

I'm the sophomore studying the subject of operating system now.
I would like to compare the fine-grained and coarse-grained by implementing binary search tree and using mutex.
The lock and unlock were used for node insert and delete function, and the execution time was printed.
I thought Fine Grained would be faster. However, even if I change the number, the coarse grained is faster.
Is there anyone who can explain about this? Did I make the wrong code?
Result Page
Result Page 2
int lab2_node_insert_fg(lab2_tree *tree, lab2_node *new_node){
if (tree->root == NULL) {
tree->root = new_node;
return LAB2_ERROR;
}
if (search_key(tree,new_node->key)) {
return LAB2_ERROR;
}
lab2_node* cur = tree->root;
while (1) {
if (cur->key < new_node->key) {
if (cur->right == NULL) {
pthread_mutex_lock(&lock); // LOCK
cur->right = new_node;
pthread_mutex_unlock(&lock); // UNLOCK
return LAB2_SUCCESS;
}
cur = cur->right;
}
else {
if (cur->left == NULL) {
pthread_mutex_lock(&lock); // LOCK
cur->left = new_node;
pthread_mutex_unlock(&lock); // UNLOCK
return LAB2_SUCCESS;
}
cur = cur->left;
}
}
}
int lab2_node_insert_cg(lab2_tree *tree, lab2_node *new_node){
pthread_mutex_lock(&lock); // LOCK
if (tree->root == NULL) {
tree->root = new_node;
pthread_mutex_unlock(&lock); // UNLOCK
return LAB2_ERROR;
}
if (search_key(tree,new_node->key)) {
pthread_mutex_unlock(&lock); // UNLOCK
return LAB2_ERROR;
}
lab2_node* cur = tree->root;
while (1) {
if (cur->key < new_node->key) {
if (cur->right == NULL) {
cur->right = new_node;
pthread_mutex_unlock(&lock); // UNLOCK
return LAB2_SUCCESS;
}
cur = cur->right;
}
else {
if (cur->left == NULL) {
cur->left = new_node;
pthread_mutex_unlock(&lock); // UNLOCK
return LAB2_SUCCESS;
}
cur = cur->left;
}
}
}

I assume that the first snippet is supposed to represent the fine-grained version, the other the coarse-grained version.
The "fine-grained" version is missing a lock for creating the root node. Also, you hold the lock for assigning the new node, you don't do so for the condition. While you iterate the tree, locking a node at a time is insufficient in the presence of concurrent deletes. All of which are concurrency issues.
Usually, the difference between fine-grained and coarse-grained locking is not the "size" of your lock sections (the amount of code you cover with a lock), it refers to the lock granularity in relation to the data structure itself.
So, one approach is to have a single lock for the whole tree - coarse-grained locking. The other approach is to introduce locks at lower levels (e.g. per node) to eliminate lock contention if two operations running in parallel are operating on different tree nodes. This is similar to the difference between table locks and row-level locks in the database space.
One approach for the fine-grained version is to have a per-node lock and use so-called hand-over-hand locking. For the insert operation, you'd have to lock the current node until you identified and locked the next node as you descend.
Do additional research, change your implementation accordingly and retest.
Generally speaking, fine-grained locking can lead to a more scalable implementation if the access pattern tends to hit different locks. If that's not the case most of the time, then a coarse-grained version might perform as good or slightly better because a single lock can be held throughout the operation rather than having to acquire different locks with most of them leading to lock contention again. As always, it depends on the actual usage pattern.

Related

Entity Framework CORE batch processing taking long time

I am passing a few records with Jquery ajax() to a .Net CORE MVC controller to batch update a SQL table. It then calls TransactionalDeleteAndInsert() in a repository to delete then insert the records as shown in the following code.
When done, _repoContext.SaveChangesAsync() is executed. The delete/update actions take a few seconds to complete, but when I refresh the screen or navigate to another page, the get method to get the updated list took more than 2 hours. What am I doing wrong?
public int BatchInsert(IList<T> entityList)
{
int inserted = 0;
foreach(T entity in entityList)
{
this.RepositoryContext.Set<T>().Add(entity);
inserted++;
}
return inserted;
}
public int BatchDelete(IList<T> entityList)
{
int deleted = 0;
foreach(T entity in entityList)
{
this.RepositoryContext.Set<T>().Remove(entity);
deleted++;
}
return deleted;
}
public List<int> TransactionalDeleteAndInsert(IList<T> deleteEntityList, IList<T> insertEntityList)
{
using (var transaction = this.RepositoryContext.Database.BeginTransaction())
{
int totalDeleted = this.BatchDelete(deleteEntityList);
int totalInserted = this.BatchInsert(insertEntityList);
transaction.Commit();
List<int> result = new List<int>();
result.Add(totalDeleted);
result.Add(totalInserted);
return result;
}
}
In the snippet above there is not SaveChangesAsync() or SaveChanges() method executed (I take it that you execute it later on).
That means the whole process occurs locally in your context/memory only.
Transactions
The fact that BatchDelete() and BatchInsert() methods are wrapped in a transaction does not make a difference because these operations occur in your context, which will probably be recreated in your next request (given that it's lifetime is scoped).
The transaction would make more sense if your code was like this
using (var transaction = this.RepositoryContext.Database.BeginTransaction())
{
int totalDeleted = this.BatchDelete(deleteEntityList);
this.SaveChanges();
int totalInserted = this.BatchInsert(insertEntityList);
this.SaveChanges();
transaction.Commit();
List<int> result = new List<int>();
result.Add(totalDeleted);
result.Add(totalInserted);
return result;
}
So if for any reason your second db operation would fail the first one would rollback too. (I am aware that this example would not make sense in your case, you could simply execute SaveChanges() method at the end of your TransactionalDeleteAndInsert() method and you you could avoid any unwanted data saved in your db in case insert fails)
Slow db operations
That could be due to many reasons. It could be a slow sql server, very big tables, or long add/remove lists. This is the reason that when you refresh it takes long, because by refreshing you query again your database while it is already under heavy pressure.

Reactive programming - running jobs in a cluster

I need to run some jobs in a cluster, only one at a time.
Because my team uses Hazelcast, I ended up with a solution based on
Hazelcast ILock implementation. For the purpose of the question, I am going to make a generalisation about it. Let's suppose we have the following interfaces (that could be easily implemented e.g. by Hazelcast or Reddison (Redis)):
public interface MyDistributedLock {
boolean lock();
void unlock();
boolean isLockedByCurrentThread();
}
public interface MyLockDistributedFactory {
MyDistributedLock getLock(String name);
}
And lock method waiting if lock cannot be acquired:
private Mono<Void> lock(String name, Publisher<?> publisher, MyLockDistributedFactory myLockFactory) {
// important to release lock on the same thread as
// it was aquired
Scheduler scheduler = Schedulers.newSingle(name.toLowerCase());
return Mono.defer(() -> Mono.just(myLockFactory.getLock(name)))
publishOn(scheduler)
.doOnNext(MyDistributedLock::lock)
.doOnNext(lock -> LOGGER.info("Process acquired lock for resource {}", name))
.flatMapMany(lock -> Flux.from(publisher))
.publishOn(scheduler)
.doFinally(signalType -> {
MyDistributedLock lock = myLockFactory.getLock(name);
if (signalType == SignalType.CANCEL) {
// cancel ignores publishOn
scheduler.schedule(() -> {
lock.unlock();
LOGGER.info("Process released lock for resource {} due to signal type {}", name, signalType);
});
} else if (lock.isLockedByCurrentThread()) {
lock.unlock();
LOGGER.info("Process released lock for resource {} due to signal type {}", name, signalType);
}
})
.then();
}
And example of some job
private Mono<Void> someJobRunEveryOneHourOnEveryNodeInCluster() {
MyLockDistributedFactory hazelcast = ...;
return lock("some-job", Flux.just(1,2), hazelcast)
.repeatWhen(afterOneHour());
}
I wonder whether this is a good approach of using Project reactor (and correct implementation) or it should be done in a different way. Please advice.
it is a correct approach when using Reactor, because you took care of offsetting the blocking portion into a dedicated Scheduler/Thread.
But I'd say mutually exclusive code like this is not a very good fit for reactive programming in general: you lose one of the key benefits of doing more with less threads, you risk blocking other parts of the application should you forget to publishOn a dedicated thread, etc...

Doing an atomic read in objective-C

I have a thread-safe class, a cancel token, that transitions from an unstable mutable state (not cancelled) to a stable immutable state (cancelled). Once an instance has become immutable, I'd like to stop paying the cost of acquiring a lock before checking the state.
Here's a simplification of what things look like now:
-(bool) isCancelled {
#synchronized(self) {
return _isCancelled;
}
}
-(bool) tryCancel {
#synchronized(self) {
if (_isCancelled) return false;
_isCancelled = true;
}
return true;
}
and what I want to try:
-(bool) isCancelled {
bool result;
// is the following correct?
// can the two full barriers be reduced to a single read-acquire barrier somehow?
OSMemoryBarrier();
result = _isCancelled != 0;
OSMemoryBarrier();
return result;
}
-(bool) tryCancel {
return OSAtomicCompareAndSwap32Barrier(0, 1, &_isCancelled);
}
Is using two memory barriers the correct approach? How should I expect it to compare to the cost of acquiring a lock (insert standard refrain about profiling here)? Is there a cheaper way to do it?
Edit: this sounds like possible premature optimization. is this lock acquisition slowing things down?
Edit2: its possible compiler optimization will defeat this. be aware.
if you are concerned about the gotchas with double checked locking, perhaps dispatch_once() could be useful for you?
would double checked locking work in this case?
-(void) doSomething {
if (!_isCanceled) { //only attempt to acquire lock if not canceled already
#synchronized(self) {
if (!_isCanceled) // now check again (the double check part)
doSomethingElse();
}
}
}
read the wikipedia entry on double checked locking for more info

Neo4j Java API Concurrency v2.0M3: Exception when iterating over relationships while other threads creating new relationships concurrently

What I try to achieve here is to get the number of relationships of a particular node, while other threads adding new relationships to it concurrently. I run my code in a unit test with
TestGraphDatabaseFactory().newImpermanentDatabase() graph service.
My code is executed by ~50 threads, and it looks something like this:
int numOfRels = 0;
try {
Iterable<Relationship> rels = parentNode.getRelationships(RelTypes.RUNS, Direction.OUTGOING);
while (rels.iterator().hasNext()) {
numOfRels++;
rels.iterator().next();
}
}
catch(Exception e) {
throw e;
}
// Enforce relationship limit
if (numOfRels > 10) {
// do something
}
Transaction tx = graph.beginTx();
try {
Node node = createMyNodeAndConnectToParentNode(...);
tx.success();
return node;
}
catch (Exception e) {
tx.failure();
}
finally {
tx.finish();
}
The problem is once a while I get a "ArrayIndexOutOfBoundsException: 1" in the try-catch block above (the one surrounding the getRelationships()). If I understand correctly Iterable is not thread-safe and causing this problem.
My question is what is the best way to iterate over constantly changing relationships and nodes using Neo4j's Java API?
I am getting the following errors:
Exception in thread "Thread-14" org.neo4j.helpers.ThisShouldNotHappenError: Developer: Stefan/Jake claims that: A property key id disappeared under our feet
at org.neo4j.kernel.impl.core.NodeProxy.setProperty(NodeProxy.java:188)
at com.inbiza.connio.neo4j.server.extensions.graph.AppEntity.createMyNodeAndConnectToParentNode(AppEntity.java:546)
at com.inbiza.connio.neo4j.server.extensions.graph.AppEntity.create(AppEntity.java:305)
at com.inbiza.connio.neo4j.server.extensions.TestEmbeddedConnioGraph$appCreatorThread.run(TestEmbeddedConnioGraph.java:61)
at java.lang.Thread.run(Thread.java:722)
Exception in thread "Thread-92" java.lang.ArrayIndexOutOfBoundsException: 1
at org.neo4j.kernel.impl.core.RelationshipIterator.fetchNextOrNull(RelationshipIterator.java:72)
at org.neo4j.kernel.impl.core.RelationshipIterator.fetchNextOrNull(RelationshipIterator.java:36)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:55)
at com.inbiza.connio.neo4j.server.extensions.graph.AppEntity.create(AppEntity.java:243)
at com.inbiza.connio.neo4j.server.extensions.TestEmbeddedConnioGraph$appCreatorThread.run(TestEmbeddedConnioGraph.java:61)
at java.lang.Thread.run(Thread.java:722)
Exception in thread "Thread-12" java.lang.ArrayIndexOutOfBoundsException: 1
at org.neo4j.kernel.impl.core.RelationshipIterator.fetchNextOrNull(RelationshipIterator.java:72)
at org.neo4j.kernel.impl.core.RelationshipIterator.fetchNextOrNull(RelationshipIterator.java:36)
at org.neo4j.helpers.collection.PrefetchingIterator.hasNext(PrefetchingIterator.java:55)
at com.inbiza.connio.neo4j.server.extensions.graph.AppEntity.create(AppEntity.java:243)
at com.inbiza.connio.neo4j.server.extensions.TestEmbeddedConnioGraph$appCreatorThread.run(TestEmbeddedConnioGraph.java:61)
at java.lang.Thread.run(Thread.java:722)
Exception in thread "Thread-93" java.lang.ArrayIndexOutOfBoundsException
Exception in thread "Thread-90" java.lang.ArrayIndexOutOfBoundsException
Below is the method responsible of node creation:
static Node createMyNodeAndConnectToParentNode(GraphDatabaseService graph, final Node ownerAccountNode, final String suggestedName, Map properties) {
final String accountId = checkNotNull((String)ownerAccountNode.getProperty("account_id"));
Node appNode = graph.createNode();
appNode.setProperty("urn_name", App.composeUrnName(accountId, suggestedName.toLowerCase().trim()));
int nextId = nodeId.addAndGet(1); // I normally use getOrCreate idiom but to simplify I replaced it with an atomic int - that would do for testing
String urn = App.composeUrnUid(accountId, nextId);
appNode.setProperty("urn_uid", urn);
appNode.setProperty("id", nextId);
appNode.setProperty("name", suggestedName);
Index<Node> indexUid = graph.index().forNodes("EntityUrnUid");
indexUid.add(appNode, "urn_uid", urn);
appNode.addLabel(LabelTypes.App);
appNode.setProperty("version", properties.get("version"));
appNode.setProperty("description", properties.get("description"));
Relationship rel = ownerAccountNode.createRelationshipTo(appNode, RelTypes.RUNS);
rel.setProperty("date_created", fmt.print(new DateTime()));
return appNode;
}
I am looking at org.neo4j.kernel.impl.core.RelationshipIterator.fetchNextOrNull()
It looks like my test generates a condition where else if ( (status = fromNode.getMoreRelationships( nodeManager )).loaded() || lastTimeILookedThereWasMoreToLoad ) is not executed, and where currentTypeIterator state is changed in between.
RelIdIterator currentTypeIterator = rels[currentTypeIndex]; //<-- this is where is crashes
do
{
if ( currentTypeIterator.hasNext() )
...
...
while ( !currentTypeIterator.hasNext() )
{
if ( ++currentTypeIndex < rels.length )
{
currentTypeIterator = rels[currentTypeIndex];
}
else if ( (status = fromNode.getMoreRelationships( nodeManager )).loaded()
// This is here to guard for that someone else might have loaded
// stuff in this relationship chain (and exhausted it) while I
// iterated over my batch of relationships. It will only happen
// for nodes which have more than <grab size> relationships and
// isn't fully loaded when starting iterating.
|| lastTimeILookedThereWasMoreToLoad )
{
....
}
}
} while ( currentTypeIterator.hasNext() );
I also tested couple locking scenarios. The one below solves the issue. Not sure if I should use a lock every time I iterate over relationships based on this.
Transaction txRead = graph.beginTx();
try {
txRead.acquireReadLock(parentNode);
long numOfRels = 0L;
Iterable<Relationship> rels = parentNode.getRelationships(RelTypes.RUNS, Direction.OUTGOING);
while (rels.iterator().hasNext()) {
numOfRels++;
rels.iterator().next();
}
txRead.success();
}
finally {
txRead.finish();
}
I am very new to Neo4j and its source base; just testing as a potential data store for our product. I will appreciate if someone knowing Neo4j inside & out explains what is going on here.
This is a bug. The fix is captured in this pull request: https://github.com/neo4j/neo4j/pull/1011
Well I think this a bug. The Iterable returned by getRelationships() are meant to be immutable. When this method is called, all the available Nodes till that moment will be available in the iterator. (You can verify this from org.neo4j.kernel.IntArrayIterator)
I tried replicating it by having 250 threads trying to insert a relationship from a node to some other node. And having a main thread looping over the iterator for the first node. On careful analysis, the iterator only contains the relationships added when getRelationship() was last called. The issue never came up for me.
Can you please put your complete code, IMO there might some silly error. The reason it cannot happen is that the write locks are in place when adding a relationship and reads are hence synchronized.

NHibernate FlushMode Auto Not Flushing Before Find

All right, I've seen some posts asking almost the same thing but the points were a little bit different.
This is a classic case: I'm saving/updating an entity and, within the SAME SESSION, I'm trying to get them from the database (using criteria/find/enumerable/etc) with FlushMode = Auto. The matter is: NHibernate isn't flushing the updates before querying, so I'm getting inconsistent data from the database.
"Fair enough", some people will say, as the documentation states:
This process, flush, occurs by default at the following points:
from some invocations of Find() or Enumerable()
from NHibernate.ITransaction.Commit()
from ISession.Flush()
The bold "some invocations" clearly says that NH has no responsibility at all. IMO, though, we have a consistency problem here because the doc also states that:
Except when you explicity Flush(), there are absolutely no guarantees about when the Session executes the ADO.NET calls, only the order in which they are executed. However, NHibernate does guarantee that the ISession.Find(..) methods will never return stale data; nor will they return the wrong data.
So, if I'm using CreateQuery (Find replacement) and filtering for entities with property Value = 20, NH may NOT return entities with Value = 30, right? But that's what happens in fact, because the Flush is not happening automatically when it should.
public void FlushModeAutoTest()
{
ISession session = _sessionFactory.OpenSession();
session.FlushMode = FlushMode.Auto;
MappedEntity entity = new MappedEntity() { Name = "Entity", Value = 20 };
session.Save(entity);
entity.Value = 30;
session.SaveOrUpdate(entity);
// RETURNS ONE ENTITY, WHEN SHOULD RETURN ZERO
var list = session.CreateQuery("from MappedEntity where Value = 20").List<MappedEntity>();
session.Flush();
session.Close();
}
After all: am I getting it wrong, is it a bug or simply a non predictable feature so everybody have to call Flush to assure its work?
Thank you.
Filipe
I'm not very familiar with the NHibernate source code but this method from the ISession implementation in the 2.1.2.GA release may answer the question:
/// <summary>
/// detect in-memory changes, determine if the changes are to tables
/// named in the query and, if so, complete execution the flush
/// </summary>
/// <param name="querySpaces"></param>
/// <returns></returns>
private bool AutoFlushIfRequired(ISet<string> querySpaces)
{
using (new SessionIdLoggingContext(SessionId))
{
CheckAndUpdateSessionStatus();
if (!TransactionInProgress)
{
// do not auto-flush while outside a transaction
return false;
}
AutoFlushEvent autoFlushEvent = new AutoFlushEvent(querySpaces, this);
IAutoFlushEventListener[] autoFlushEventListener = listeners.AutoFlushEventListeners;
for (int i = 0; i < autoFlushEventListener.Length; i++)
{
autoFlushEventListener[i].OnAutoFlush(autoFlushEvent);
}
return autoFlushEvent.FlushRequired;
}
}
I take this to mean that auto flush will only guarantee consistency inside a transaction, which makes some sense. Try rewriting your test using a transaction, I'm very curious if that will fix the problem.
If you think about it, the query in your example must always go to the db. The session is not a complete cache of all records in the db. So there could be other entities with the value of 20 on disk. And since you didn't commit() a transaction or flush() the session NH has no way to know which "view" you want to query (DB | Session).
It seems like the "Best Practice" is to do everything (gets & sets) inside of explicit transactions:
using(var session = sessionFactory.OpenSession())
using(var tx = session.BeginTransaction())
{
// execute code that uses the session
tx.Commit();
}
See here for a bunch of details.
managing and tuning hibernate is an artform.
why do you set an initial value of 20, save, then change it to 30?
As a matter of practice, if you are going modify the session, then query the session, you might want to explicitly flush between those operations. You might have a slight performance hit (after all, you then don't let hibernate optimize session flushing), but you can revisit if it becomes a problem.
You quoted that "session.find methods will never return stale data". I would modify your code to use a find instead of createQuery to see if it works.