I am exploring the NATS for queuing and currently i am using redis lists. I stuck in below scenario, which easily manageable in redis:
1) There is one daemon which pushing the value in queue and one daemon which continuously reading from queue. If my reading daemon get stopped, redis starts storing data in queue . Once i start read daemon it pops from that last value where it got stopped like FIFO. In this there is no chance to loss my data. Is there any same provision providing by NATS?
2) If my redis server goes down, i can retrieve data ( leaving few) which already available in queue. If NATS server goes down can i retrieve my data?
In addition to the features of the core NATS platform, NATS Streaming provides the following:
At-least-once-delivery - NATS Streaming offers message acknowledgements between publisher and server (for publish operations) and between subscriber and server (to confirm message delivery). Messages are persisted by the server in memory or secondary storage (or other external storage) and will be redelivered to eligible subscribing clients as needed.
Message/event persistence - NATS Streaming offers configurable message persistence either in-memory or via flat files. The storage subsystem uses a public interface that allows contributors to develop their own custom implementations.
Related
I'm looking at turning a monolith application into a microservice-oriented application and in doing so will need a robust messaging system for interprocesses-communication. The idea is for the microserviceprocesses to be run on a cluster of servers for HA, with requests to be processed to be added on a message queue that all the applications can access. I'm looking at using Redis as both a KV-store for transient data and also as a message broker using the ServiceStack framework for .Net but I worry that the concept of eventual consistency applied by Redis will make processing of the requests unreliable. This is how I understand Redis to function in regards to Mq:
Client 1 posts a request to a queue on node 1
Node 1 will inform all listeners on that queue using pub/sub of the existence of
the request and will also push the requests to node 2 asynchronously.
The listeners on node 1 will pull the request from the node, only 1 of them will obtain it as should be. An update of the removal of the request is sent to node 2 asynchronously but will take some time to arrive.
The initial request is received by node 2 (assuming a bit of a delay in RTT) which will go ahead and inform listeners connected to it using pub/sub. Before the update from node 1 is received regarding the removal of the request from the queue a listener on node 2 may also pull the request. The result being that two listeners ended up processing the same request, which would cause havoc in our system.
Is there anything in Redis or the implementation of ServiceStack Redis Mq that would prevent the scenario described to occur? Or is there something else regarding replication in Redis that I have misunderstood? Or should I abandon the Redis/SS approach for Mq and use something like RabbitMQ instead that I have understood to be ACID-compliant?
It's not possible for the same message to be processed twice in Redis MQ as the message worker pops the message off the Redis List backed MQ and all Redis operations are atomic so no other message worker will have access to the messages that have been removed from the List.
ServiceStack.Redis (which Redis MQ uses) only supports Redis Sentinel for HA which despite Redis supporting multiple replicas they only contain a read only view of the master dataset, so all write operations like List add/remove operations can only happen on the single master instance.
One notable difference from using Redis MQ instead of specific purpose MQ like Rabbit MQ is that Redis doesn't support ACK's, so if the message worker process that pops the message off the MQ crashes then it's message is lost, as opposed to Rabbit MQ where if the stateful connection of an un Ack'd message dies the message is restored by the RabbitMQ server back to the MQ.
In my project I am calculating about 10-100mbs of data on a zookeeper worker. I then use HTTP PUT to transfer the data from the worker process to my webserver, which eventually gets delivered to the client. Is there anyway using Zookeeper or Curator to transfer that data or am I on my own to get the data out of the Worker process and onto a process outside my ensemble?
I wouldn't recommend to use Zookeeper to transfer data, especially of such relatively large size. It is not really designed to do it. Zookeeper works best when it used to synchronize distributed processes or to store some relatively small configuration data that is shared among multiple hosts.
There is a hard limit of 1 Mb per ZK node and if you try to push it to the limit, Zookeeper clients may get timeouts and go into disconnected state while Zookeeper service processes large chunk of data.
I am trying to set up cluster of brokers, which should have same feature like rabbitMQ cluster, but over WAN (my machines are in different locations), so rabbitMQ cluster does not work.
I am looking to alternatives, rabbitMQ federation is just backup the messages in the downstream, can not make sure they have exactly the same messages available at any time (downstream still keeps the old messages already consumed in the upstream)
how about ActiveMQ Master/Slave, I have found :
http://activemq.apache.org/how-do-distributed-queues-work.html
"queues and topics are all replicated between each broker in the cluster (so often to a master and maybe a single slave). So each broker in the cluster has exactly the same messages available at any time so if a master fails, clients failover to a slave and you don't loose a message."
My concern is that if it can automatically update to make sure Master/Slave always have the same messages, which means the consumed messages in Master will also disappear in Slaves.
Thanks :)
ActiveMQ has various clustering features.
First there is High Availability - "Master/Slave". The idea is that several physical servers act as a single logical ActiveMQ broker. If one goes down, another takes it place without losing data. You can do that by sharing the message store (shared file system or shared JDBC), or you could setup a replicated cluster, which replicates read/writes to the master down to all slaves (you need three+ servers). ActiveMQ is using LevelDB and Apache Zookeeper to achieve this.
The other format of cluster available in ActiveMQ is to be able to distribute load and separate security over several logical brokers. Brokers are then connected in a network of brokers. Messages are by default passed around to the broker with available consumers for that message. However, there is a rich toolbox of features in ActiveMQ to tweak a network of brokers to do things as always send a copy of a message to specific broker etc. It takes some messing with the more advanced features though (static network connectors and queue mirroring, maybe more).
Maybe there is a better way to solve your requirements, which is not really specified in the question?
I'm trying to get some feedback on the recommendations for a service 'roster' in my specific application. I have a server app that maintains persistant socket connections with clients. I want to further develop the server to support distributed instances. Server "A" would need to be able to broadcast data to the other online server instances. Same goes for all other active instances.
Options I am trying to research:
Redis / Zookeeper / Doozer - Each server instance would register itself to the configuration server, and all connected servers would receive configuration updates as it changes. What then?
Maintain end-to-end connections with each server instance and iterate over the list with each outgoing data?
Some custom UDP multicast, but I would need to roll my own added reliability on top of it.
Custom message broker - A service that runs and maintains a registry as each server connects and informs it. Maintains a connection with each server to accept data and re-broadcast it to the other servers.
Some reliable UDP multicast transport where each server instance just broadcasts directly and no roster is maintained.
Here are my concerns:
I would love to avoid relying on external apps, like zookeeper or doozer but I would use them obviously if its the best solution
With a custom message broker, I wouldnt want it to become a bottleneck is throughput. Which would mean I might have to also be able to run multiple message brokers and use a load balancer when scaling?
multicast doesnt require any external processes if I manage to roll my own, but otherwise I would need to maybe use ZMQ, which again puts me in the situation of depends.
I realize that I am also talking about message delivery, but it goes hand in hand with the solution I go with.
By the way, my server is written in Go. Any ideas on a best recommended way to maintain scalability?
* EDIT of goal *
What I am really asking is what is the best way to implement broadcasting data between instances of a distributed server given the following:
Each server instance maintains persistent TCP socket connections with its remote clients and passes messages between them.
Messages need to be able to be broadcasted to the other running instances so they can be delivered to relavant client connections.
Low latency is important because the messaging can be high speed.
Sequence and reliability is important.
* Updated Question Summary *
If you have multiple servers / multiple end points that need to pub/sub between each other, what is a recommended mode of communication between them? One or more message brokers to re-pub messages to a roster of the discovered servers? Reliable multicast directly from each server?
How do you connect multiple end points in a distributed system while keeping latency low, speed high, and delivery reliable?
Assuming all of your client-facing endpoints are on the same LAN (which they can be for the first reasonable step in scaling), reliable UDP multicast would allow you to send published messages directly from the publishing endpoint to any of the endpoints who have clients subscribed to the channel. This also satisfies the low-latency requirement much better than proxying data through a persistent storage layer.
Multicast groups
A central database (say, Redis) could track a map of multicast groups (IP:PORT) <--> channels.
When an endpoint receives a new client with a new channel to subscribe, it can ask the database for the channel's multicast address and join the multicast group.
Reliable UDP multicast
When an endpoint receives a published message for a channel, it sends the message to that channel's multicast socket.
Message packets will contain ordered identifiers per server per multicast group. If an endpoint receives a message without receiving the previous message from a server, it will send a "not acknowledged" message for any messages it missed back to the publishing server.
The publishing server tracks a list of recent messages, and resends NAK'd messages.
To handle the edge case of a server sending only one message and having it fail to reach a server, server can send a packet count to the multicast group over the lifetime of their NAK queue: "I've sent 24 messages", giving other servers a chance to NAK previous messages.
You might want to just implement PGM.
Persistent storage
If you do end up storing data long-term, storage services can join the multicast groups just like endpoints... but store the messages in a database instead of sending them to clients.
All,
I'm looking for advice over the following scenario:
I have a component running in one part of the corporate network that sends messages to an application logic component for processing. These components might reside on the same server, different servers in the same network (LAN ot WAN) or live outside in the cloud. The application server should be scalable and resilient.
The messages are related in that the sequence they arrive is important. They are time-stamped with the client timestamp.
My thinking is that I'll get the clients to use WCF basicHttpBinding (some are based on .NET CF which only has basic) to send messages to the Application Server (this is because we can guarantee port 80/443 will be open for outgoing connections). Server accepts these, and writes these into a queue. This queue can be scaled out if needed over multiple machines.
I'm hesitant to use MSMQ for the queue though as to properly scale out we are going to have to install seperate private queues on each application server and round-robin monitor the queues. I'm concerned though that we could lose a message on a server that's gone down until the server is restored, and we could end up processing a later message from a different server and disrupt the sequence.
What I'd prefer is a central queue (e.g. a database table) that all application servers monitor.
With this in mind, what I'd like to do is to create a custom WCF binding, similar to netMsmqBinding, but that uses the DB table instead but I'm confused as to whether I can simply create a custom transport or a I need a full binding, and whether the binding will allow the client to send over HTTP. I've looked around the internet but I'm a little confused as to where to start.
I could not bother with the custom WCF binding but it seems a good way to introduce scalability if I do need to seperate the servers.
Any suggestions please would be helpful, including alternatives.
Many thanks
I would start with MSMQ because it is exactly for this purpouse. Use single transactional queue on clustered machine and let application servers to take messages for processing from this queue. Each message processing has to be part of distributed transaction (MSDTC).
This scenario will ensure:
clustered queue host will ensure that if one cluster node fails the other will still be able to handle requests
sending each message as recoverable - it means that message will be persisted on hard drive (not only in memory) so in critical failure of the whole cluster you will still have all messages.
transactional queue will ensure that all message transport operations will be atomic - moving message from outgoing queue to destination queue will be processed as transaction. It means that original message from outgoing queue will be kept in queue until ack from destination queue arrives. Transactional processing can ensure in order delivery.
Distributed transaction will allow application servers consuming messages in transaction. Message will not be deleted from queue until application server commits transaction or transaction time outs.
MSMQ is also available on .NET CF so you can send messages directly to queue without intermediate non-reliable web service layer.
It should be possible to configure MSMQ over HTTP (but I have never used it so I'm not sure how it cooperates with previous mentioned features).
Your proposed solution will be pretty hard. You will end up in building BizTalk's MessageBox. But if you really want to do it, check Omar's post about building database queue table.