I have a Kogito setup (data-index, jobs-service, Infinispan, Kafka, etc.) that works with my Quarkus application.
My application has a parent .bpmn2 process, and within that process a subprocess is started when a signal is received.
The child process has the same logic, and the nesting goes about three levels deep, so I end up with a hierarchy in the data-index resources. When I fetch the parent process instance from the data-index via GraphQL, I can see its childProcessInstances, and so on. I also persist the data in Infinispan, so the state survives application restarts.
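For reference, a data-index GraphQL query of roughly this shape returns the hierarchy described above (the process id "parentProcess" is just a placeholder, not my real process id):

    {
      ProcessInstances(where: { processId: { equal: "parentProcess" } }) {
        id
        processId
        state
        childProcessInstances {
          id
          processId
          state
        }
      }
    }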
Right now this is only a small application, and I do not know how this would behave in a large project with many nested parent and child processes.
How would my Quarkus application handle a heavy workload of these processes? Is it practical to run an arbitrarily large number of nested process instances? Thanks in advance.
Related
We are developing a Web API using .NET Core. To perform background tasks, we have used Hosted Services.
The system is hosted in an AWS Elastic Beanstalk environment with a load balancer, so Beanstalk creates or removes instances of the system based on the load.
Our problem is:
Since the background services run inside the API, when the load balancer increases the number of instances, the number of background services increases as well, and the same task may be executed multiple times. Ideally there should be only one instance of the background services.
One way to tackle this is to stop running the background services in the load-balanced environment and have a dedicated, non-load-balanced, single-instance environment that runs only the background services.
That is a bit of an ugly solution. So:
1) Is there a better solution for this?
2) Is there a way to identify the primary instance in a load-balanced environment? If so, I could conditionally register the Hosted Services.
Any help is really appreciated.
Thanks
I am facing the same scenario and am thinking of implementing a custom service architecture that runs normally on all of the instances but takes advantage of a pub/sub broker and a distributed memory service, so that those small services can contact each other and coordinate what needs to be done. Yes, it is complicated to develop, but it is a very robust solution IMO.
You'll "have to" use a distributed "lock" system. You'll have to use, for example, a distributed memory cache who put a lock when someone (a node of your cluster) is working on background. If another node is trying to do the same job, he'll be locked by the first lock if the work isn't done yet.
What i mean, if all your nodes doesn't have a "sync handler" you can't handle this kind of situation. It could be SQL app lock, distributed memory cache or other things ..
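To illustrate the pattern (the question is about .NET, but the idea is the same in any stack), here is a minimal Java sketch using Hazelcast's distributed FencedLock; the lock name "nightly-job-lock" and the job method are hypothetical:

    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.cp.lock.FencedLock;

    public class LockedJobRunner {
        public static void main(String[] args) {
            // Join the cluster that all application nodes share.
            HazelcastInstance hz = Hazelcast.newHazelcastInstance();

            // A named, cluster-wide lock: only one node can hold it at a time.
            FencedLock lock = hz.getCPSubsystem().getLock("nightly-job-lock");

            if (lock.tryLock()) {          // non-blocking: skip if another node already holds it
                try {
                    runBackgroundJob();    // hypothetical job
                } finally {
                    lock.unlock();         // always release so other nodes are not blocked forever
                }
            }
            hz.shutdown();
        }

        private static void runBackgroundJob() {
            System.out.println("Running the background job on exactly one node");
        }
    }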
There is something called a Mutex, but even that won't control this in a multi-instance environment. However, there are ways to control it to some extent (maybe even 100%). One way is to keep a tracker in the database: for example, if the job has to run daily, then before starting the job in the background service you can query the database for an entry for today; if there is none, insert an entry and start the job.
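The original context is .NET, but the database-tracker idea is stack-agnostic. Here is a sketch in Java/JDBC, assuming a hypothetical table job_runs with run_date as its primary key, so the insert itself decides the winner and avoids a check-then-insert race:

    import java.sql.Connection;
    import java.sql.Date;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.sql.SQLIntegrityConstraintViolationException;
    import java.time.LocalDate;

    public class DailyJobTracker {
        // Assumes a hypothetical table: CREATE TABLE job_runs (run_date DATE PRIMARY KEY)
        static boolean tryClaimTodaysRun(String jdbcUrl) throws SQLException {
            try (Connection con = DriverManager.getConnection(jdbcUrl);
                 PreparedStatement ps = con.prepareStatement(
                         "INSERT INTO job_runs (run_date) VALUES (?)")) {
                ps.setDate(1, Date.valueOf(LocalDate.now()));
                ps.executeUpdate();                       // succeeds only for the first instance today
                return true;
            } catch (SQLIntegrityConstraintViolationException duplicate) {
                return false;                             // another instance already claimed today's run
            }
        }

        public static void main(String[] args) throws SQLException {
            if (tryClaimTodaysRun("jdbc:h2:mem:jobs")) {  // hypothetical JDBC URL
                System.out.println("This instance runs today's job");
            } else {
                System.out.println("Another instance already ran it");
            }
        }
    }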
We have a particular scenario in our application - all the child actors deal with a huge volume of data (around 50-200 MB).
Because of this, we have decided to create the child actors on the same machine (worker process) on which the parent actor was created.
Currently this is achieved through the use of Roles. We also use the .NET memory cache to transfer the data (several MBs) between child actors.
Question: Is it OK to turn off clustering for the child actors to achieve the result we are expecting?
Edit: To be more specific, I have explained our application setup in detail below.
The whole process happens inside an Akka.NET cluster of around 5 machines.
Worker processes (which contain both parent and child actors) are deployed on each of those machines.
Both parent and child actors are cluster-enabled in this setup.
When we discovered the network overhead caused by distributing the child actors across machines, we decided to restrict child-actor creation to the machine that received the primary request, and to distribute only the parent actors across machines.
When we approached an Akka.NET expert with this problem, we were advised to use "Roles" to restrict child-actor creation to a single machine in the cluster (e.g., Worker1Child and Worker2Child roles instead of a single "Child" role).
Question (contd.): I just want to know whether simply disabling the cluster option for the child actors will achieve the same result, and whether that is a best practice.
Please advise.
Sounds to me like you've been using a clustered pool router to remotely deploy worker actors across the cluster - you didn't explicitly mention this in your description, but that's what it sounds like.
It also sounds like what you're really trying to do here is take advantage of local affinity: have the child worker actors for the same entities all work together inside the same process.
Here's what I would recommend:
Have all worker actors created as children of their parents, locally, inside the same process, using either something like the child-per-entity pattern or a LOCAL pool router.
Distribute work between the worker nodes using a clustered group router, with roles etc.
All of the work in that high-volume workload should flow directly from parent to children, without needing to round-trip back and forth across the rest of the cluster.
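To make that concrete, here is a rough HOCON sketch of the two pieces (a local pool router under the parent, plus a clustered group router over the parents); the actor paths, router types, and the "worker" role are illustrative assumptions, not your actual setup:

    akka {
      actor {
        provider = cluster
        deployment {
          # Local pool router: workers live in the same process as the parent,
          # so the 50-200 MB payloads never cross the network.
          /parent/local-workers {
            router = round-robin-pool
            nr-of-instances = 10
          }
          # Clustered group router: spreads incoming requests across the parent
          # actors running on every node that carries the "worker" role.
          /api/parent-router {
            router = round-robin-group
            routees.paths = ["/user/parent"]
            nr-of-instances = 100          # upper bound on total routees
            cluster {
              enabled = on
              allow-local-routees = on
              use-role = "worker"
            }
          }
        }
      }
    }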
Given the information that you've provided here, this is as close to a "general" answer as I can provide - hope you find it helpful!
I have a fairly simple Akka.NET system that tracks in-memory state, but contains only derived data. Any actor can load its up-to-date state from a backend database on startup, then start receiving messages and maintain its state from there. So I can just let actors fail and restart the process whenever I want; it will rebuild itself.
But I would like to run across multiple nodes (mostly for the memory requirements), to increase/decrease the number of nodes according to demand, and also to release new versions without downtime.
What would be the most lightweight (in terms of Persistence) setup of clustering to achieve this? Can you run Clustering without Persistence?
This is not a single question, so let me answer the parts one by one:
"So I can just let actors fail and restart the process whenever I want" - yes, but keep in mind that a hard reset of the process is a lot more expensive than a graceful shutdown. In a distributed system, if a node is going down, it's better for it to announce that to the rest of the nodes beforehand than to force them to detect the dead node - failure detection is part of cluster membership and can take some time (even tens of seconds).
"I'd like to increase/decrease the number of nodes according to demand" - this is standard cluster behavior. In the case of Akka.NET, depending on which feature set you are going to use, you may sometimes need to specify an upper bound on the cluster size.
"Also for releasing a new version without downtime" - most cluster features can be scoped to a particular set of nodes using so-called roles. Each node can have its own set of roles, which can be used to describe what services it provides and to detect whether other nodes have the required capabilities. For that reason you can use roles for things like versioning.
"Can you run Clustering without Persistence?" - yes, and this is the default configuration (in Akka, cluster nodes don't need any form of persistent backend in order to work).
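For reference, a minimal persistence-free Akka.NET cluster configuration could look roughly like this; the system name, port, and role names are illustrative assumptions:

    akka {
      actor.provider = cluster
      remote.dot-netty.tcp {
        hostname = "127.0.0.1"    # address other nodes can reach
        port = 0                  # 0 = pick any free port
      }
      cluster {
        # well-known nodes that new members contact in order to join; no persistent backend involved
        seed-nodes = ["akka.tcp://MySystem@127.0.0.1:4053"]
        # roles scope features to a subset of nodes, e.g. for rolling version upgrades
        roles = ["worker", "v2"]
      }
    }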
I was reading about processes. I want to know what really happens. My situation:
"I opened an application, which creates a process, say process1. I have other applications interfaced with this one, and all of these open up when I click a button inside my running application. I want to know: does process1 create the new processes on demand and then IPC happens, or are the processes for all the linked applications created at once and then IPC happens?"
A running application is a bunch of processes, or maybe a single process with multiple threads running inside it.
So your activity decides when processes are created and deleted. Say you are running an application such as a media player and you suddenly start searching for related info about the album: a completely new process is created to handle the interaction with the web, and after returning the output it may or may not die, but it was created on your request. Also, IPC mostly happens between processes exactly as you are thinking, though shared-memory communication is also an option; it is more complicated and less common.
One more thing to point out is that there are several 'daemon processes' running in the background that don't die until a shutdown instruction. These processes are also sometimes related to the running application and serve its requests. But mostly, new processes are created when we switch tasks or perform certain actions in the application.
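As a small illustration of the "create a process on demand and talk to it over IPC" flow described above, here is a Java sketch that launches a child process and reads its output through a pipe; the child command (java -version) is just an example:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;

    public class ChildProcessDemo {
        public static void main(String[] args) throws IOException, InterruptedException {
            // Create a child process on demand (like process1 launching a helper
            // application when a button is clicked).
            ProcessBuilder pb = new ProcessBuilder("java", "-version");
            pb.redirectErrorStream(true);        // merge stderr into stdout for simplicity
            Process child = pb.start();          // the new process exists only from this point on

            // Simple pipe-based IPC: read whatever the child writes to its output stream.
            try (BufferedReader out = new BufferedReader(
                    new InputStreamReader(child.getInputStream()))) {
                String line;
                while ((line = out.readLine()) != null) {
                    System.out.println("child says: " + line);
                }
            }
            System.out.println("child exited with code " + child.waitFor());
        }
    }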
Consider the following scenario.
There are 2 Hazelcast nodes. One is stopped, the other is running under quite heavy load.
Now the second node comes up. The application starts, and its Hazelcast instance hooks up to the first one. Hazelcast starts repartitioning the data. For 2 nodes, this essentially means that each IMap entry gets copied to the new node, and the two nodes are arbitrarily assigned as master/backup for it.
PROBLEM:
If the first node is brought down during this process, and the replication is not done completely, part of the IMap contents and ITopic subscriptions may be lost.
QUESTION:
How can I ensure that the repartitioning process has finished and that it is safe to shut down the first node?
(The whole setup is made to enable software updates without downtime, while preserving current application state).
I tried using getPartitionService().addMigrationListener(...), but the listener does not seem to cover the migration process as a whole. Instead, I get tens to hundreds of migrationStarted()/migrationCompleted() calls, one for each migrated chunk.
1- When you gracefully shut down the first node, the shutdown process will wait (block) until the data is safely backed up:
hazelcastInstance.getLifecycleService().shutdown();
2- If you use Hazelcast Management Center, it shows the count of ongoing migration/repartitioning operations on the home screen.
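If you also want to verify from code that repartitioning has settled before taking the first node down, a minimal sketch could look like the following, using PartitionService.isClusterSafe(); the one-second polling loop is just an illustrative choice, and on Hazelcast 3.x the PartitionService import lives in com.hazelcast.core instead:

    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.partition.PartitionService;
    import java.util.concurrent.TimeUnit;

    public class SafeShutdown {
        // Blocks until the cluster reports that all partition backups are in place,
        // then gracefully shuts this node down (as in point 1 above).
        public static void shutdownWhenSafe(HazelcastInstance hz) throws InterruptedException {
            PartitionService partitionService = hz.getPartitionService();
            while (!partitionService.isClusterSafe()) {   // false while migrations/backups are pending
                TimeUnit.SECONDS.sleep(1);
            }
            hz.getLifecycleService().shutdown();
        }
    }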