Processes coordinating using Chubby Lock Service - locking

Do processes racing to acquire the same coarse-grained lock through the Chubby service need to wait if the lock is unavailable? Is there an API similar to try_lock (acquire the lock if it is available, otherwise return without waiting)?

Related

Synchronize multiple instances of Spring Cache with a Redis lock

I'm building a Spring Boot application that uses Spring Cache with a Redis backing store and needs to synchronize the updates made to the cache.
The caching is not done on the fly, but by a scheduled process that updates the cache periodically.
The algorithm I came up with is:
periodically the instances will check if the Redis cache is older than some predetermined time
if that's the case, the instance will try to acquire a lock on some Redis key
if the instance successfully locks the key, it will then proceed with the update
if some other instance already locked the key, move on
all instances can still read the cache
Everything is more or less already built, all I need is to implement the locking/releasing mechanism.
Spring Cache is using Lettuce to interact with Redis. What is the best way to get a connection to Redis and manage the locking mechanism?
As you may already be aware, Spring's Cache Abstraction provides simple coordination amongst multiple Threads in a single Spring [Boot] application process using the sync attribute on the @Cacheable annotation (see ref doc).
NOTE: Despite the comment ("... use the sync attribute to instruct the underlying cache provider to lock the cache entry while the value is being computed. As a result, only one thread is busy computing the value, while the others are blocked until the entry is updated in the cache.") in the documentation, the locking is handled by the core framework itself and, in most cases, not by the provider. Anyway...
However, this "coordination" is only per-process and will not work for multiple Spring [Boot] application instances, or (OS) JVM processes. In this case, you need some form of distributed locking across your multiple Spring [Boot] application instances to coordinate access to shared cache entries stored in the single Redis server (cluster) shared by your Spring [Boot] application instances.
I am no Redis expert (I am still learning), but I am familiar with similar NoSQL stores (Apache Geode/VMware GemFire, Hazelcast, etc) and distributed locking mechanisms. I see that distributed locking is possible to achieve with Redis as well. In a quick search, I found "Distributed Locking" in Redis, and specifically, "Building a lock in Redis". This is probably the best way to go.
In addition, if you want to make this distributed locking automatically/transparently available through Spring's Cache Abstraction, then one idea is to create a custom AOP Aspect and weave this Aspect together with the framework-provided Caching Aspect (Interceptor), being conscious of ordering.
Alternatively, you could implement wrapper implementations for the Spring Cache and CacheManager SPI interfaces that implement distributed locking on top of the core Redis Cache and CacheManager provider implementations provided by Spring Boot/Spring Data Redis.
Of course, there are multiple ways to go about this. Just tossing out more ideas, but have a look at the distributed locking information in the book.
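The try-once locking step from the question can be sketched as follows. This is only an illustration under stated assumptions, not the Spring or Lettuce API: InMemoryStore is a hypothetical stand-in for a Redis client so that the sketch runs without a server, and every name in it is made up. Against a real Redis server, the acquire step is usually a single SET key token NX PX ttl command, and the release step is a small script that deletes the key only if it still holds the caller's token.

```python
import time
import uuid


class InMemoryStore:
    """Hypothetical stand-in for a Redis client (assumption, not a real API)."""

    def __init__(self):
        self._data = {}  # key -> (value, expiry as monotonic seconds)

    def set_if_absent(self, key, value, ttl_seconds):
        """Mimics SET key value NX PX ttl: store only if missing or expired."""
        now = time.monotonic()
        current = self._data.get(key)
        if current is not None and current[1] > now:
            return False
        self._data[key] = (value, now + ttl_seconds)
        return True

    def delete_if_equals(self, key, value):
        """Mimics compare-and-delete: remove the key only if we still own it."""
        now = time.monotonic()
        current = self._data.get(key)
        if current is not None and current[1] > now and current[0] == value:
            del self._data[key]
            return True
        return False


def refresh_cache_if_lock_free(store, lock_key, refresh, ttl_seconds=30.0):
    """Try the lock once; run refresh() if acquired, otherwise move on."""
    token = uuid.uuid4().hex  # identifies this instance as the lock owner
    if not store.set_if_absent(lock_key, token, ttl_seconds):
        return False  # some other instance is already updating the cache
    try:
        refresh()
    finally:
        # Release only our own lock, never one taken over after expiry.
        store.delete_if_equals(lock_key, token)
    return True
```

The expiry time matters: if the instance holding the lock crashes, the key disappears on its own and a later scheduled run can pick up the update. The random token keeps a slow instance from deleting a lock that has since expired and been taken by someone else.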

How to re-queue celery task until after lock is released

I have a number of celery tasks that currently use a non-blocking redis lock to ensure atomic edits of a cluster of db tables for a particular user. The timing of the tasks is unpredictable and the frequency is high but sporadic. For example, some are triggered by web hooks, others by user interaction.
Right now I am using autoretry_for and retry_backoff to respond to failures to acquire the non-blocking lock. This is sub-optimal since it leaves lots of idle time once the lock has been released. I can't use a blocking lock because I need to let other tasks that don't require the lock to still run.
What I need is a way to re-run any tasks that failed to acquire the non-blocking lock as soon as possible after the lock is released. It doesn't matter if some handful of non-locking tasks are run in the meantime, I just need the tasks that failed to acquire to run reasonably soon but certainly without idle time.
How can this be done? It's like I want to add the tasks to some kind of global celery chain, but post-hoc, and then fire any waiting chains every time the non-blocking lock is released. Or something like that. Which user each task is crunching is calculated from the task's arguments; it is not passed in directly as an argument (so key-based queues won't work).

Trying to understand process state differences

I was wondering what the difference is between Microsoft's process states and other OS process states. From my research, I see that there is a basic model for process states comprising 5 states: New (added to ready queue), Ready (list of processes ready to execute), Running (currently running process), Waiting or Blocked (process put on hold to wait for an I/O event or for a resource), and Terminated (all done).
All operating systems seem to have these 5 states. Is there really a difference between Microsoft and others?

Guidelines for using lucene.net in a web service app?

I just started reading up on Lucene.net and I would like some of my REST based web services to use the powerful searching facilities of Lucene.net.
However I came across a link which said that I should create a windows service (with WCF) to do all the lucene searches/indexes etc as IIS recycles the application pool which will cause all sorts of locking issues.
My question is, is this correct? If so, is there another way of resolving this problem without creating a windows service (with WCF)? Also since I have REST based services, would I make a call from these services to the Windows WCF service which would make things slower?
Indexing
During your reading you would have picked up that indexing is done using the IndexWriter class. Lucene will only allow 1 IndexWriter instance open at a time. When using the default locking it creates a lock file in the index directory and prevents any other IndexWriter instances from being created. For this reason it may be better to implement indexing in a process that you have more control over.
If your indexing process is terminated with extreme prejudice and your IndexWriter does not get closed, the lock on your index folder is maintained and no other instances will be allowed. Because of this, Lucene allows you to lift a lock from an index folder (using IndexWriter.unlock). This is a dangerous method, because if two IndexWriters are open on the same index it will corrupt the index. If you have a windows service that is performing the indexing, and it's the only process in your solution that does the indexing (and any updates), you can confidently unlock the indexing folder on startup of the service. In a web service based environment where you are performing indexing from a web method, controlling and recovering from locking issues becomes problematic.
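The lock-file behaviour described above can be illustrated with a small sketch. It is not Lucene's implementation: the class and function names are hypothetical, and it only shows why a crashed writer leaves the index locked and why a forced unlock is safe only when a single process owns all indexing.

```python
import os
import tempfile


class IndexLockedError(Exception):
    """Raised when another writer already holds the lock file."""


class SingleWriterLock:
    LOCK_NAME = "write.lock"  # illustrative file name

    def __init__(self, index_dir):
        self.path = os.path.join(index_dir, self.LOCK_NAME)
        self.held = False

    def acquire(self):
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            raise IndexLockedError(self.path) from None
        os.close(fd)
        self.held = True

    def release(self):
        if self.held:
            os.remove(self.path)
            self.held = False


def force_unlock(index_dir):
    """Remove a leftover lock file. Only safe if no other writer can be running."""
    path = os.path.join(index_dir, SingleWriterLock.LOCK_NAME)
    if os.path.exists(path):
        os.remove(path)
        return True
    return False


def demo():
    """Simulate a crashed writer, then recover on 'service startup'."""
    index_dir = tempfile.mkdtemp()
    crashed = SingleWriterLock(index_dir)
    crashed.acquire()  # never released: simulates a killed process
    restarted = SingleWriterLock(index_dir)
    try:
        restarted.acquire()
        blocked = False
    except IndexLockedError:
        blocked = True
    cleared = force_unlock(index_dir)  # what the service does on startup
    restarted.acquire()
    restarted.release()
    return blocked, cleared
```

In a dedicated indexing service, the force_unlock call belongs at startup, before the one and only writer is opened. In a web application with several worker processes there is no safe moment to call it, which is the problem the answer describes.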
Searching
The IndexSearcher class is used for the searches. Searching in read-only mode can be done from your service-based code. I don't think it's necessary to create a separate set of WCF methods for this purpose.
Optimization
The index may need to be optimized periodically for performance, depending on the volumes. Once again, with the indexing in a separate process you can schedule the optimization nightly, weekly or whatever is required. Optimization is done by a call to one method.
Indexing new data
How and when to get the indexing process to index new data.... I don't know what data you're indexing so it's hard to tell. In my scenario I have WCF methods that are responsible for input data - high volume. I require the data that has been received to be available for searching as soon as possible. So,
my Model layer has a notification layer that when new records of the required type have been successfully committed, a simple notification message is inserted into a local queue in MSMQ.
The reason for MSMQ is that the queue is persisted and transactional, and that any messages in there are available even after a crash or system reboot, allowing me to never (cough!) lose any messages.
The indexing service takes the notification, builds the Lucene Document and indexes the data.
The indexing service can also be triggered to do a full re-index by deleting the existing index and crawling the Db.
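The notification flow above can be sketched roughly like this. It is a simplified illustration with hypothetical names: Python's in-memory queue.Queue stands in for MSMQ (so it is neither persisted nor transactional), and a plain dict stands in for the Lucene index.

```python
import queue


def notify_committed(notifications, record_id):
    """Called by the (hypothetical) model layer after a successful commit."""
    notifications.put({"record_id": record_id})


def build_document(record):
    """Turn a stored record into the fields that should be searchable."""
    return {"id": record["id"], "text": record["title"] + " " + record["body"]}


def drain_and_index(notifications, load_record, index):
    """Index every pending notification; returns how many documents were indexed."""
    indexed = 0
    while True:
        try:
            note = notifications.get_nowait()
        except queue.Empty:
            return indexed
        record = load_record(note["record_id"])
        index[record["id"]] = build_document(record)
        indexed += 1
```

The notification carries only the record id, so the indexing service always reads the committed state from the database when it builds the document. A full re-index is then the same build step applied to every record instead of to the queued ids.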
EDIT:
Example architecture:
WCF Service Methods take in data and commit it to the Model layer. The Model layer notifies a listening client that a CRUD operation occurred successfully on items. The listening client posts the notification in a queue.
Windows Service handles Indexing of data, watching the queue for indexing requests.
ASP.Net app provides user interface with search features.
You can simply disable application pool recycling and host your application/service in IIS.
To disable recycling on config changes, use the disallowRotationOnConfigChange parameter.
You can also split your application in two parts: Index updates and searches.
Handle index updates from a windows service, and have your IIS portion handle searches (read-only). You would do this by having a mechanism that detects index updates and refreshes the IndexSearchers. This way, if the performance penalty of using services is a concern for you, it won't impact search time, which is the important aspect for the users. With this configuration you can even have a master index update node, and distribute searches across different web servers in a farm. The only downside is that you don't have the near-real-time searching functionality that's built into the IndexWriter class.
http://wiki.apache.org/lucene-java/NearRealtimeSearch
That being said, I've never had performance issues with setups that have the Lucene functions exposed over a WCF service, especially if you're running either on the same machine with NetNamedPipe or on a local LAN with NetTcp.
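The "detect index updates and refresh the searchers" mechanism from the split setup above might look roughly like this. It is a toy model, not the Lucene.net API: Index, Searcher and SearcherHolder are hypothetical names, and a version counter stands in for whatever change signal the real index exposes.

```python
class Index:
    """Hypothetical index whose version is bumped on every commit."""

    def __init__(self):
        self.version = 0
        self.docs = []

    def commit(self, doc):
        self.docs = self.docs + [doc]  # new list, so old snapshots stay valid
        self.version += 1


class Searcher:
    """Read-only snapshot of the index at one version."""

    def __init__(self, index):
        self.version = index.version
        self._docs = index.docs

    def search(self, term):
        return [d for d in self._docs if term in d]


class SearcherHolder:
    """Hands out a searcher, reopening it only when the index has changed."""

    def __init__(self, index):
        self._index = index
        self._searcher = Searcher(index)

    def current(self):
        if self._searcher.version != self._index.version:
            self._searcher = Searcher(self._index)
        return self._searcher
```

A search request calls current() and uses whatever searcher it gets back. Results lag behind the writer until the next reopen, which is the missing near-real-time behaviour mentioned above.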

Real time application on Microsoft Azure

I'm working on a real-time application and building it on Azure.
The idea is that every user reports something about himself and all the other users should see it immediately (they poll the service every second or so for new info).
My approach for now was using a Web Role for a WCF REST Service where I'm doing all the writing to the DB (SQL Azure) without a Worker Role so that it will be written immediately.
I've come to think that using a Worker Role and a Queue to do the writing might be much more scalable, but might interfere with the real-time side of the service. (The worker role might not take the job from the queue immediately.)
Is it true? How should I go about this issue?
Thanks
While it's true that the queue will add a bit of latency, you'll be able to scale out the number of Worker Role instances to handle the sheer volume of messages.
You can also optimize queue-reading by getting more than one message at a time. Since a single queue has a scalability target of 500 TPS, this lets you go well beyond 500 messages per second on reads.
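As a rough worked example of the batching arithmetic, assuming each batched read counts as one transaction against the 500 TPS target and that a single read may return up to 32 messages (the batch size is an assumption here; check the current service limits):

```python
def max_messages_per_second(transactions_per_second, messages_per_read):
    """Upper bound on read throughput when every transaction returns a full batch."""
    return transactions_per_second * messages_per_read


def reads_needed(pending_messages, messages_per_read):
    """Number of batched reads needed to drain the queue (ceiling division)."""
    return -(-pending_messages // messages_per_read)


print(max_messages_per_second(500, 1))   # 500 messages/s, one message per read
print(max_messages_per_second(500, 32))  # 16000 messages/s with full batches
print(reads_needed(1000, 32))            # 32 reads to drain 1000 messages
```

These are upper bounds: they only hold while the queue is deep enough to fill every batch.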
You might look into a Cache for buffering the latest user updates, so when polling occurs, your service reads from cache instead of SQL Azure. That might help as the volume of information increases.
You could have a look at SignalR. It does not support farm scenarios out of the box, but it should be able to work by using internal endpoint calls to update every instance, the Azure Service Bus, or the AppFabric Cache. This way you get a Push scenario rather than a Pull scenario, so you don't have to poll your endpoints for potential updates.