I'm having a problem with a Java EE 6 application running in a clustered environment on WebSphere Application Server 8.
A Lucene search index is used for quick search in the UI and must be re-indexed after new data arrives in the corresponding DB layer. To achieve this we send a JMS message to the application, which then refreshes the search index.
The problem is that the message only arrives at one of the cluster members, so the search index is up to date only there; on the other servers it remains outdated.
How can I ensure that the search index gets updated on all cluster members?
Can I receive the message somehow on all servers?
Or is there a better way to do this?
I found a possible solution:
Generally, a JMS message delivered via a queue goes to only one of the cluster members. I found a possible way to get the information to all of the cluster members, using an EJB timer: creating a non-persistent timer should invoke the callback method on every cluster member. This might be a convenient way to recreate the local search index on all the cluster members.
It is important that the timer be non-persistent, because persistent EJB timers are synchronized across the cluster and executed on only one of the cluster members.
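A minimal sketch of this approach, assuming an automatic (calendar-based) non-persistent timer; the bean name, the schedule, and the rebuild method are placeholders, not from the original application:

    import javax.ejb.Schedule;
    import javax.ejb.Singleton;

    @Singleton
    public class SearchIndexRefresher {

        // A non-persistent automatic timer is created independently in every
        // cluster member's JVM, so this callback runs on each member. A
        // persistent timer (persistent = true, the default) would instead be
        // synchronized across the cluster and fire on only one member.
        @Schedule(minute = "*/5", hour = "*", persistent = false)
        public void refresh() {
            // rebuild the local Lucene search index here
        }
    }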
We are trying to use the Continuous Query feature of Ignite, but we are facing an issue handling the events. Below is our problem statement:
We have defined a Continuous Query with a remote filter for a cache and shared the filter definition with the thick client.
We are running multiple replicas of the "Thin Client" in a Kubernetes cluster.
Now the problem is that each instance of the "Thin Client" running in the k8s cluster has registered the remote filter, so each instance receives the event and tries to process the data in parallel. This results in duplicate processing, or even overwriting of the data in my store.
Is there any way to form a consumer group and ensure that only one instance of the "Thin Client" receives the notification and processes the data?
My thick client and thin clients are written in .NET.
I couldn't find any details in the Ignite documentation:
https://ignite.apache.org/docs/latest/key-value-api/continuous-queries
Here each thin client starts its own continuous query, so by design each thin client gets its own copy of the event to consume. If you want to route an event to a specific client, you would need to start only one continuous query and distribute that event to your app as you see fit.
Take a look at Ignite messaging to see whether it fits your use case.
Also check out the distributed Queue/Set, which have unique delivery guarantees.
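Going back to the single-query approach, a minimal Java sketch (the question's clients are .NET, but the shape of the API is the same; the cache name "orders" is an assumption). Only the process that starts the query receives the events, so it alone can fan them out:

    import javax.cache.event.CacheEntryEvent;
    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.query.ContinuousQuery;

    public class SingleConsumer {
        public static void main(String[] args) {
            Ignite ignite = Ignition.start();
            IgniteCache<Integer, String> cache = ignite.getOrCreateCache("orders");

            ContinuousQuery<Integer, String> qry = new ContinuousQuery<>();
            // The local listener fires only in this process, so exactly one
            // consumer sees each update; distribute from here as needed.
            qry.setLocalListener(events -> {
                for (CacheEntryEvent<? extends Integer, ? extends String> e : events)
                    System.out.println("updated: " + e.getKey() + " -> " + e.getValue());
            });

            cache.query(qry); // keep the returned cursor open while listening
        }
    }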
We are shifting from a monolithic to a microservice architecture for our e-commerce marketplace application. We chose Redis pub/sub for microservice-to-microservice communication and also for some push notifications. The push notification strategy is as follows:
Whenever an order is created (i.e. a customer creates an order), the backend publishes an event on the respective channel (queue) and the specific push-notification microservice consumes this event (a JSON message) and sends a push notification to the seller's mobile.
For the time being we are using redis-server installed on our Ubuntu machine without any hassle. But the headache comes in the future, when millions of orders are generated at one point in time. How can we handle this situation? That means we need to scale the Redis queue, right?
My exact question (regardless of the above scenario) is:
How can I horizontally scale the Redis queue instead of increasing the RAM on the same machine?
Whenever an order is created (i.e. a customer creates an order), the backend publishes an event on the respective channel (queue) and the specific push-notification microservice consumes this event (a JSON message) and sends a push notification to the seller's mobile.
IIUC you're sending messages over Redis PUB/SUB. That is not durable: if the producer is up while other services/consumers are down, the consumers will miss messages. Any service that is down loses all the messages sent while it was down.
Now let's assume you're using a Redis LIST, combined with other data structures, to solve the missing-events issue.
Scaling a Redis queue is a little tricky, since the entire dataset is stored in a single list that resides on a single Redis machine/host. What you can do is create your own partitioning scheme and design your Redis keys around it, much as Redis does internally when a new master is added to the cluster; implementing consistent hashing would require some effort.
Very simply, you can distribute load based on the userId: for example, if the userId is between 0 and 1000 use queue_0, between 1000 and 2000 use queue_1, and so on. This is a manual process that can be automated with a script. Whenever a new queue is added to the set, all consumers have to be notified and the publisher updated as well.
Dividing based on a number range is a range-partitioning scheme; you can use a hash-partitioning scheme as well. Whichever one you use, whenever a new queue is added to the queue set the consumers must be notified of the update. Consumers can spawn a new worker for the new queue; removing a queue can be tricky, as all consumers must first have drained their respective queues.
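A minimal Java sketch of the range-partitioning idea using Jedis (the key prefix queue_, the bucket width of 1000, and the payload are assumptions):

    import java.util.List;
    import redis.clients.jedis.Jedis;

    public class PartitionedQueue {

        // Map a userId to a partition: 0-999 -> queue_0, 1000-1999 -> queue_1, ...
        static String queueFor(long userId) {
            return "queue_" + (userId / 1000);
        }

        public static void main(String[] args) {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                // Producer: push the order event onto the user's partition.
                jedis.lpush(queueFor(1500), "{\"orderId\":42,\"userId\":1500}");

                // Consumer: block for up to 5 seconds waiting for work on queue_1.
                List<String> msg = jedis.brpop(5, "queue_1");
                if (msg != null)
                    System.out.println("got " + msg.get(1) + " from " + msg.get(0));
            }
        }
    }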
You might consider using Rqueue.
I am trying to set up Akka.NET cluster sharding by creating a simple project.
Project layout:
Actors - a class library that defines one actor and a message; it is referenced by the other projects.
Inbound - starts the ShardRegion and is the only node participating in cluster sharding; it should also be the one hosting the coordinator.
MessageProducer - hosts only the ShardRegion proxy, used to send messages to the ProcessorActor.
Lighthouse - seed node
The uploaded images show that the coordinator singleton is not initialized and that messages sent through the ShardRegion proxy are not delivered.
Based on the Petabridge blog post petabridge.com/blog/cluster-sharding-technical-overview-akkadotnet/, I have excluded Lighthouse from participating in cluster sharding (by setting akka.cluster.sharding.role) so that the coordinator is not created on it.
Not sure what I am missing to get this to work.
This was already answered on Gitter, but here's the tl;dr:
The shard region proxy needs to share the same role as the corresponding shard region. Otherwise the proxy may not be able to find the shard coordinator, and therefore cannot resolve the initial location of the shard it wants to send a message to.
The IMessageExtractor.GetMessage method is used to extract the actual message that is going to be sent to the sharded actor. In the example, the message extractor was used to extract a string property from the enveloping message, yet the receiving actor had its Receive handler set up for the envelope, not for a string.
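For the first point, a minimal HOCON sketch of the role alignment (the role name "sharding" is an assumption, not from the original project):

    # on both the shard-region node and the proxy node
    akka.cluster.roles = ["sharding"]

    # confine sharding (and its coordinator singleton) to that role,
    # so that seed-only nodes such as Lighthouse stay out of it
    akka.cluster.sharding.role = "sharding"

With Lighthouse not assigned to this role, the coordinator singleton is never created on the seed node.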
I have a middleware based on Apache Camel which does a transaction like this:
from("amq:job-input")
to("inOut:businessInvoker-one") // Into business processor
to("inOut:businessInvoker-two")
to("amq:job-out");
Currently it works perfectly, but I can't scale it up, let's say from 100 TPS to 500 TPS. To speed up the transactions I have already:
- raised the concurrent-consumers settings and used an empty businessProcessor
- configured JAVA_XMX and PERMGEN
According to the ActiveMQ web console, there are many messages waiting to be processed in the 500 TPS scenario. I guess one of the solutions is to scale ActiveMQ up, so I want to use multiple brokers in a cluster.
According to http://fuse.fusesource.org/mq/docs/mq-fabric.html (section "Topologies"), configuring ActiveMQ in clustering mode is suitable for non-persistent messages. IMHO it is true that it's not suitable for persistent ones, because all running brokers use the same store file. But what about separating the store files? That's possible now, right?
Could anybody explain this? If it's not possible, what is the best way to load-balance persistent messages?
Thanks
You can share the load of persistent messages by creating two master/slave pairs. The master and slave share their state either through a database or a shared filesystem, so you need to duplicate that setup.
Create two master/slave pairs and configure so-called "network connectors" between the two pairs. This will double your performance without the risk of losing messages.
See http://activemq.apache.org/networks-of-brokers.html
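For illustration, a minimal networkConnector sketch for one master's activemq.xml (the other broker's hostname and port are placeholders):

    <networkConnectors>
        <!-- forward messages to the other master/slave pair on demand -->
        <networkConnector name="bridge"
                          uri="static:(tcp://other-master:61616)"
                          duplex="true"/>
    </networkConnectors>

Setting duplex="true" lets a single connector carry traffic in both directions, so the bridge only needs to be defined on one side of each pair.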
This answer relates to a version of the question before the Camel details were added.
It is not immediately clear what exactly you want to load-balance, and why. Messages across consumers? Producers across brokers? What sort of concern are you trying to address?
In general you should avoid using networks of brokers unless you are trying to address some sort of geographical use case, have too many connections for a single broker to handle, or a single broker (which could be a pair of brokers configured in HA) is not giving you the throughput you require (in 90% of cases it will).
In a broker network, each node has its own store and passes messages around by way of a mechanism called store-and-forward. Have a read of Understanding broker networks for an explanation of how this works.
ActiveMQ already works as a kind of load balancer by distributing messages evenly, in round-robin fashion, among the subscribers on a queue. So if you have two subscribers on a queue and send it a stream of messages A, B, C, D, one subscriber will receive A & C while the other receives B & D.
If you want to take this a step further and group related messages on a queue so that they are processed consistently by only one subscriber, you should consider Message Groups.
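A minimal JMS sketch of message groups against ActiveMQ (the queue name "orders" and the group key are assumptions); all messages carrying the same JMSXGroupID are pinned to one consumer:

    import javax.jms.*;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class GroupedProducer {
        public static void main(String[] args) throws JMSException {
            ConnectionFactory cf = new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection conn = cf.createConnection();
            conn.start();
            Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
            MessageProducer producer = session.createProducer(session.createQueue("orders"));

            TextMessage msg = session.createTextMessage("order update");
            // ActiveMQ routes all messages with the same JMSXGroupID to the
            // same consumer, preserving per-group processing order.
            msg.setStringProperty("JMSXGroupID", "customer-42");
            producer.send(msg);

            conn.close();
        }
    }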
Adding consumers might help up to a point (it depends on the number of cores/CPUs your server has). Adding threads beyond the point where your "Camel server" is utilizing all available CPU for the business processing makes no sense and can be counterproductive.
Adding more ActiveMQ machines is probably needed. You can use an ActiveMQ "network" to communicate between instances that have separate persistence files. It should be straightforward to add more brokers and put them into a network.
Make sure you performance-test along the way, to establish what kind of load the broker can handle and what load the Camel processor can handle (if on different machines).
When you do persistent messaging, you likely also want transactions. Make sure you are using them.
If all running brokers use the same store file or tx-supported database for persistence, then only the first broker to start will be active, while others are in standby mode until the first one loses its lock.
If you want to load-balance your persistence, there are two ways you could try:
1. Configure several brokers in network-bridge mode, then send messages to any one of them and consume messages from more than one of them. This load-balances both the brokers and the persistence.
2. Override the persistenceAdapter and use database-sharding middleware (such as tddl: https://github.com/alibaba/tb_tddl) to store the messages in partitions.
Your first step is to increase the number of workers processing from ActiveMQ. The way to do this is to add the ?concurrentConsumers=10 attribute to the starting URI. The default behaviour is that only one thread consumes from that endpoint, leading to a pile-up of messages in ActiveMQ. Adding more brokers won't help.
Secondly, what you appear to be doing could benefit from a Staged Event-Driven Architecture (SEDA). In a SEDA, processing is broken down into a number of stages which can have different numbers of consumers on them to even out throughput. Your threads consuming from ActiveMQ only do one step of the process, hand off the Exchange to the next phase, and go back to pulling messages from the input queue.
Your route can therefore be rewritten as two smaller routes:
from("activemq:input?concurrentConsumers=10").id("FirstPhase")
.process(businessInvokerOne)
.to("seda:invokeSecondProcess");
from("seda:invokeSecondProcess?concurentConsumers=20").id("SecondPhase")
.process(businessInvokerTwo)
.to("activemq:output");
The two stages can have different numbers of concurrent consumers so that the rate of message consumption from the input queue matches the rate of output. This is useful if one of the invokers is much slower than another.
The seda: endpoint can be replaced with another intermediate activemq: endpoint if you want message persistence.
Finally to increase throughput, you can focus on making the processing itself faster, by profiling the invokers themselves and optimising that code.
In a web application, if I need to write an event to a queue, I would make a connection to Redis to write the event.
Now if I want another backend process (say a daemon or cron job) to process or react to the publishing of the event in Redis, do I need a persistent connection?
I'm a little confused about how this pub/sub process works in a web application.
Basically in Redis there are two different messaging models:
Fire and Forget / One to Many: Pub/Sub. At the time a message is PUBLISH-ed all the subscribers will receive it, but this message is then lost forever. If a client was not subscribed there is no way it can get it back.
Persisting Queues / One to One: Lists, possibly used with blocking commands such as BLPOP. With lists you have a producer pushing into a list, and one or many consumers waiting for elements, but one message will reach only one of the waiting clients. With lists you have persistence, and messages will wait for a client to pop them instead of disappearing. So even if no one is listening there is a backlog (as big as your available memory, or you can limit the backlog using LTRIM).
I hope this is clear. I suggest you study the following commands to understand more about Redis and its messaging semantics:
LPUSH/RPUSH, RPOP/LPOP, BRPOP/BLPOP
PUBLISH, SUBSCRIBE, PSUBSCRIBE
Documentation for these commands is available at redis.io.
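To make the list-based model concrete, a minimal Java sketch with Jedis (the key name "events" and the payload are assumptions):

    import java.util.List;
    import redis.clients.jedis.Jedis;

    public class QueueDemo {
        public static void main(String[] args) {
            // Producer (e.g. the web application): push an event onto a list.
            // The element stays in Redis even if no consumer is connected.
            try (Jedis producer = new Jedis("localhost", 6379)) {
                producer.lpush("events", "{\"type\":\"signup\",\"user\":7}");
            }

            // Consumer (daemon/cron job): block for up to 10 seconds waiting
            // for an event; exactly one waiting client pops each element.
            try (Jedis consumer = new Jedis("localhost", 6379)) {
                List<String> popped = consumer.brpop(10, "events");
                if (popped != null)
                    System.out.println("processing " + popped.get(1));
            }
        }
    }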
I'm not totally sure, but I believe that yes, pub/sub requires a persistent connection.
For an alternative, I would take a peek at Resque and how it handles that. Instead of using pub/sub, it simply adds an item to a list in Redis, and then whatever daemon or cron job you have can use the LPOP command to grab the first one.
Sorry for only giving a pseudo answer and then a plug.