OpenStack uses messaging (RabbitMQ by default, I think?) for communication between nodes. Kubernetes (a descendant of Google's internal Borg), on the other hand, uses RPC, and Docker Swarm uses RPC as well. Both are gRPC/protobuf based, which also seems to be used heavily inside Google.
I understand that messaging platforms like Kafka are widely used for streaming data and log aggregation. But systems like OpenStack, Kubernetes, Docker Swarm etc. need specific interactions between the nodes and RPC seems like a natural choice because it allows APIs to be defined for specific operations.
Did OpenStack choose messaging after evaluating the pros and cons of messaging vs RPC? Are there any good blogs/system reviews comparing the success of large-scale systems using messaging vs RPC? Does messaging offer any advantage over RPC in scaled distributed systems?
Does messaging offer any advantage over RPC in scaled distributed systems?
Persistence is the biggest advantage of a messaging system. Another point is broadcasting, which you would have to implement yourself on top of gRPC. Service discovery and security can be further reasons: with a messaging system you only need to keep one component highly secure, while with gRPC you may have many points where somebody could break into the system. Message queue systems usually already have some kind of service discovery built in; with gRPC you have to use at least one additional library for this.
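To make the persistence and broadcasting points concrete, here is a minimal sketch (my illustration, not part of the original answer) using the Python pika client, assuming a RabbitMQ broker on localhost; the queue, exchange and message names are placeholders.

```python
# Minimal sketch (pika 1.x, RabbitMQ assumed on localhost) showing two things
# the broker gives you out of the box: persistence and broadcasting.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# Persistence: a durable queue plus persistent delivery mode means the message
# survives a broker restart until it is consumed and acknowledged.
channel.queue_declare(queue="tasks", durable=True)
channel.basic_publish(
    exchange="",
    routing_key="tasks",
    body=b"do-something",
    properties=pika.BasicProperties(delivery_mode=2),  # 2 = persistent
)

# Broadcasting: every queue bound to a fanout exchange gets a copy of the
# message, without the publisher knowing who the consumers are.
channel.exchange_declare(exchange="events", exchange_type="fanout")
channel.basic_publish(exchange="events", routing_key="", body=b"node-joined")

connection.close()
```

With gRPC, equivalent durability and fan-out behaviour would have to be built into your own service code.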
Is there any good literature comparing the success of large-scale systems using messaging vs RPC?
It's not really a versus; there are different use cases. Messaging systems are generally slower than RPC protocols, not only gRPC. The reason is simple: you introduce a middleware between two or more nodes. In exchange, they provide persistence, broadcasting, pub/sub, etc.
Did OpenStack choose messaging after evaluating the pros and cons of messaging vs RPC?
Probably
Does messaging offer any advantage over RPC in scaled distributed systems?
Ready to use solution, just use a client
Persistence
Ready to use Service Discovery
Pub/Sub Pattern
Failure tolerance
Most of these points would need to be implemented by yourself with gRPC.
Related
I would like to know how, if possible, a client app (WinForms) can send NServiceBus command A to be processed by an MSMQ queue and command B to be processed by an Azure Storage queue or Azure Service Bus. If that's not possible, how can I work around it?
Since this question was asked, there is now a transport bridge which is specifically for this scenario: bridging messages between two different transports.
Will this help? https://docs.particular.net/samples/azure/azure-service-bus-msmq-bridge/
Common examples include:
A hybrid solution that spans across endpoints deployed on-premises and in a cloud environment.
Departments within an organization integrating their systems, which use different messaging technologies for historical reasons.
Traditionally, such integrations would require native messaging or relaying. Bridging is an alternative, allowing endpoints to communicate over different transports without a need to get into low-level messaging technology code. With time, when endpoints can standardize on a single transport, bridging can be removed with minimal impact on the entire system.
I am in a situation where I can use Service Fabric (locally) but cannot leverage Azure Service Bus (or anything "cloud"). What would be the equivalent for queuing/pub-sub? Service Fabric is allowed since it is able to run in a local container and is "free". Other third-party messaging infrastructure, like RabbitMQ, is also off the table (at the moment).
I've built systems using a home-grown bus built on MSMQ and WCF, but I don't see how to accomplish the same thing in SF. I suspect I can have SF services use a custom ICommunicationListener that exposes MSMQ, but that would only be available inside the cluster (the way I understand it). I could build an HTTP bridge (in SF) in front of those to make them available outside the cluster, but then I'd lose the lifetime decoupling (a client being able to call a service, via queues, even if that service isn't online at the time) since the bridge itself wouldn't benefit from any of the aspects of queuing.
I have a few possibilities but all suffer from some malady that only exists because of SF, locally. Also, the same code needs to easily deploy to full Azure SF (where I can use ASB and this issue disappears) so I don't want to build two separate systems just because of where I am hosting it in some instances.
Thanks for any tips.
You can build this yourself, for example like this. This uses a BrokerService that will distribute message-data to subscribed services and actors.
You can also run a containerized queuing platform like RabbitMQ with volumes.
By running the queue system inside the cluster you won't introduce an external dependency.
The problem is not SF. The main issue with your design is that you are coupling architectural requirements to implementations. SF runs on top of virtual machines; in the end, the only difference is that SF places the services on those machines, whereas with another solution you would have an agent deploying the services there, or a manual deployment. The challenges are the same.
It is clear from the description that the requirement in your design is a message queue, and the concept of a queue is the same whether it is Service Bus, RabbitMQ or MSMQ. Each of them has the basic foundations of queues plus the specifics of its implementation; some add transactions, some implement multiple patterns, and so on.
If you design based on a specific implementation, you will couple your solution to that implementation, make it hard to maintain, and face challenges like the ones you described.
Solutions like NServiceBus and MassTransit remove a lot of this coupling from your code, and if you think they are not enough, you can create your own abstraction. Then you use configuration to tie your business logic to an implementation.
Despite the above advice, I would not recommend using different solutions per environment, because, as said previously, each solution has its own implementation and they might not behave the same as each other. For example, you might face issues in production because you developed against MSMQ in the DEV and TEST environments, and when deployed to production you use Service Bus; they have different limitations, like message size, retention period and so on.
If you are willing to use MSMQ, you can add MSMQ to the VMs running your cluster and connect from your services without any issue. Take a look at this SO question first: How can I use MSMQ in Azure Service Fabric
I am new to:
Apache ZooKeeper: ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
Apache Mesos: Apache Mesos is a cluster manager that simplifies the complexity of running applications on a shared pool of servers.
Apache Helix: Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes.
Erlang Language: Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability.
It sounds to me that Helix and Mesos are both useful as cluster management systems. How are they related to ZooKeeper? It would be great if someone could give me a real-world example of their usage.
I am also curious to know how BOINC distributes tasks to its clients. Does it use any of the above technologies? (Forget about Erlang.)
I just need a brief view on it :)
Erlang was built by Ericsson and designed for use in phone systems. By design, it runs hundreds, thousands, or even tens of thousands of small processes that handle tasks by sending information between them instead of sharing memory or state. This enables all sorts of interesting features that are great for high-availability distributed systems, such as:
Hot code reloading. Each process is paused, its relevant module code is swapped out, and it is resumed where it left off, so deploys can happen without restarting or causing significant interruption.
Easy distributed messaging and clustering. Sending a message to a local process or a remote one is fairly seamless in most instances.
Process-local GC. Garbage collection happens in each process independently instead of a global stop-the-world event like Java's, aiding low-latency results.
Supervision trees and complex process hierarchy and monitoring/managing.
A few concrete real-world examples that make great use of Erlang:
MongooseIM, a highly performant and incredibly scalable distributed XMPP/chat server.
Riak, a distributed key/value store.
Mesos, on the other hand, you can think of as a platform for turning a datacenter of servers into a platform for teams and developers. Say I, as a company, own a datacenter with 10,000 physical servers and have 1,000 engineers developing hundreds of services; Mesos is a good way to let those engineers deploy and manage services across that hardware without needing to worry about the servers directly. It's an abstraction layer on top of the physical servers that allows you to share and intelligently allocate resources.
As a user of Mesos, I might say that I have Service X. It's an executable bundle that lives in location Y. Each instance of Service X needs 4 GB of RAM and 2 cores. And I need 8 instances which will be attached to a load balancer. You can specify this in configuration and deploy based on that config. Mesos will find hardware that has enough ram and CPU capacity available to handle each instance of that service and start it running in each of those locations.
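As a rough illustration of that declarative style (my own sketch, not from the original answer), this is roughly what such a request could look like through Marathon, a commonly used scheduler framework on top of Mesos; the endpoint URL, app id and executable path are hypothetical.

```python
# Sketch of "Service X: 2 cores, 4 GB RAM, 8 instances" submitted to Marathon,
# a framework running on top of Mesos. Mesos then finds agents with enough
# free CPU and RAM to run each instance. URL and paths are placeholders.
import requests

app_definition = {
    "id": "/service-x",
    "cmd": "/opt/service-x/run.sh",  # hypothetical executable bundle location
    "cpus": 2,                       # cores per instance
    "mem": 4096,                     # MB of RAM per instance
    "instances": 8,                  # desired number of running copies
}

response = requests.post(
    "http://marathon.example.local:8080/v2/apps",  # placeholder Marathon endpoint
    json=app_definition,
    timeout=10,
)
response.raise_for_status()
print("Deployment accepted:", response.json().get("deployments"))
```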
It can handle a lot of other more complex topics about the orchestration of them as well, but that's probably a bit in-depth for this :)
ZooKeeper's most common use cases are service discovery and configuration management. You can think of it, fundamentally, a bit like a nested key/value store, where services can look at pre-defined paths to see where other services currently live.
A simple example is that I have a web service using a shared database cluster. I know a simple name for that database cluster and where the configuration for it lives in zookeeper. I can look up (or repeatedly poll) that path in zookeeper to check what the addresses of the active database hosts are. And on the other side, if I take a database node out of rotation and replace it with a new one, the config in zookeeper gets updated with the new address, and anything continually looking at it will detect this change and change where it's connected to.
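As a small sketch of that lookup-and-watch pattern (my illustration, not from the original answer), here is roughly how it could be done with the Python kazoo client; the ZooKeeper address, znode path and host-list format are placeholders.

```python
# Watch a config path in ZooKeeper for the current database hosts (kazoo client).
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

DB_PATH = "/config/my-app/database-hosts"  # placeholder path agreed on by services
zk.ensure_path(DB_PATH)

# The watch fires immediately and then again whenever the data changes, so the
# application can reconnect when a database node is swapped out.
@zk.DataWatch(DB_PATH)
def on_db_hosts_change(data, stat):
    hosts = data.decode().split(",") if data else []
    print("Active database hosts:", hosts)

# Elsewhere (e.g. an ops tool taking a node out of rotation), updating the
# value notifies every watcher:
zk.set(DB_PATH, b"10.0.0.5:5432,10.0.0.6:5432")
```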
A more complex use case for ZooKeeper is how Kafka uses it (or did at the time I last used Kafka). Kafka has streams, and streams have many shards. Each consumer of each stream uses ZooKeeper to save checkpoints for each shard after it has read and processed up to a certain point in the stream. That way, if the consumer crashes or is restarted, it knows where to pick up in the stream.
I don't know about Mesos and the Erlang language, but this article might help you with Helix and ZooKeeper.
This article tells us:
ZooKeeper is responsible for gluing all parts together, whereas Helix is the cluster management component that registers all cluster details (the cluster itself, nodes, resources).
The article is about clustering in jBPM using Helix and ZooKeeper, but it will give you a basic idea of what Helix and ZooKeeper are used for.
From most of the articles I have read online, it seems that ZooKeeper and Helix are used together.
Apache Zookeeper can be installed on a single machine or on a cluster.
It can be used to keep track of logs. It can provide various services on a distributed platform.
Storm and Kafka rely on Zookeeper.
Storm uses Zookeeper to store all state so that it can recover from an outage in any of its (distributed) component services.
Kafka queue consumers can use Zookeeper to store information on what has been consumed from the queue.
I have explored the web on Mule and come to understand that for apps to communicate among themselves, even if they are deployed in the same Mule instance, they have to use TCP, HTTP or JMS transports.
VM isn't supported.
However, I find this a bit contradictory to ESB principles. Ideally we should be able to define endpoints in an ESB and connect to them using any transport? I may be wrong.
Also, since all the apps share the same JVM, one would expect to be able to communicate via the in-memory VM queue rather than relying on a transaction-less HTTP protocol, or on TCP, where the number of connections one can make depends on server resources. Even for JMS we need to define and manage another queue, and for heavy usage that may have an impact on performance. Though I agree that if we have distributed and clustered systems, HTTP or JMS may be the only options.
Is there any plan to incorporate VM as an inter-app communication protocol, or is there any other way one flow can communicate with a flow endpoint in a different app?
EDIT: Answer from MuleSoft
http://forum.mulesoft.org/mulesoft/topics/concept_of_endpoint_and_inter_app_communication
Yes, we are thinking about inter-app communication for a future release.
It's still not clear when we are going to do it, but we have a couple of ideas on how we want this feature to behave. We may create a server-level configuration in which you can define resources to use in all your apps. There you would be able to define a VM connector and use it to send messages between apps on the same server.
As I said, this is just an idea.
Regarding the usage of VM for inter-app communication, only MuleSoft can answer whether VM will get such a feature in the future or not.
I don't think it's contradictory to the ESB principle. The "container" feature is pretty well defined in chapter 6 of David A. Chappell's "Enterprise Service Bus" book. The container should try its best to keep the applications isolated.
This provides some benefits like "independently deployable integration services" (same chapter), easier clustering, and other goodies.
You should approach same-VM inter-app communication as if it were between apps placed on different servers.
It seems that Mule added, in version 3.5, a feature to enable communication between apps deployed on the same server, but sharing a VM connector is only available in the Enterprise Edition.
Info:
http://www.mulesoft.org/documentation/display/current/Shared+Resources#SharedResources-DefiningDomains
Example:
http://blogs.mulesoft.org/optimize-resource-utilization-mule-shared-resources/
I don't have a specific query here; I just need some design guidelines.
I came across this article on Node.js, MQTT and WebSockets.
I guess we can achieve a similar purpose using Node/Java + ActiveMQ + WebSockets. My query is: how do I choose between MQ and MQTT? Can I safely use an "open" server like Mosquitto in a medium-to-large-scale project, compared to ActiveMQ?
This article offered some insight, and it seems like I should use both MQ and MQTT, as MQTT may help if I get lightweight clients in the future.
Thanks!
Adding to what Shashi has said, these have different capabilities and use cases.
MQTT defines a standard wire protocol for pub/sub and, as Shashi noted, is designed for very lightweight environments. As such it has a very minimal wire format, a few basic qualities of service and a basic feature set.
Traditional message queueing systems on the other hand are generally proprietary (although AMQP aims to change that), cover both point-to-point and pub/sub, offer many qualities of service and tend to have a more heavyweight wire format, although this exists to support enhanced feature sets such as reply-to addressing, protocol conversion, etc.
A good example of MQTT would be where you have endpoints in phones, tablets and set-top boxes. These have minimal horsepower, memory and system resources. Typically connections from these either stay MQTT and they talk amongst themselves, or they are bridged to an enterprise-class MQ where they can intercommunicate with back-end applications. For example, an MQTT-based chat client might talk directly to another through the MQTT broker. Alternatively, an MQTT-based content-delivery system would bridge to an enterprise messaging network which hosted the ads and other content to be delivered to apps running on phones and tablets. The enterprise back-end would manage all the statistics of ad delivery and views upon which billings are based and the MQTT leg allows the content to be pushed with minimal battery or horsepower consumption on the end-user device.
So MQTT is used for embedded systems and end-user devices where power, bandwidth and network stability are issues. This is often in combination with traditional MQ messaging, although I haven't ever seen MQTT used as the exclusive transport for traditional messaging applications. Presumably, this is because MQTT lacks some of the more robust features such as message correlation, reply-to addressing and point-to-point addressing that have been core to messaging for 20 years.
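To give a sense of how small the client side is, here is a minimal sketch using the Python paho-mqtt package (assuming its 1.x callback API); the broker address, client id and topic names are placeholders.

```python
# Minimal MQTT client for a constrained device (paho-mqtt 1.x API assumed).
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    # Subscribe inside on_connect so the subscription is restored on reconnect.
    client.subscribe("devices/valve-7/commands", qos=1)

def on_message(client, userdata, msg):
    print("Command received:", msg.topic, msg.payload)

client = mqtt.Client(client_id="temp-sensor-42")
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.example.local", 1883, keepalive=60)

# Publish a small reading with QoS 1 ("at least once"); the tiny fixed header
# is what keeps MQTT cheap on battery and bandwidth.
client.publish("sensors/pipeline/temperature", payload="87.5", qos=1)

client.loop_forever()
```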
The MQTT protocol is suited to small devices like sensors and mobile phones that have a small memory footprint. These devices are typically located on brittle networks and typically have low computing power.
These devices connect to an organization's back-end network via the MQTT protocol for sending and receiving messages. For example, a temperature sensor in an oil pipeline would collect the temperature of the oil flowing through the pipe and send it to the control center. In response, a command message could be sent over MQTT to another device to reduce or stop the flow of oil through that pipe.
WebSphere MQ has the capability to send/receive messages to/from MQTT devices. So if you plan to implement a messaging-based solution that involves devices and sensors, you can consider MQ and MQTT.
HTH
As already discussed, MQTT defines an application-level wire protocol (i.e. how the information is organized and serialized before being transferred).
Mosquitto, or any other MQTT broker, is just an implementation of the hub-and-spoke integration pattern, just like JMS- and AMQP-based brokers; the difference lies in the wire protocol at the transport level. AMQP defines a standardized transport wire protocol, whereas JMS brokers like ActiveMQ define their own proprietary format, namely OpenWire. Of course, non-standard implementations use proprietary wire transport protocols (this impacts interoperability, but can be a better choice in terms of performance).
Back to the question: brokers like Mosquitto can be used in real scenarios, depending on your needs in terms of scalability and reliability. Normally, clustering is needed to ensure (i) availability, (ii) reliability and (iii) scalability. Brokers designed for PANs (Personal Area Networks) normally do not provide such features out of the box (OTB); ActiveMQ does.
In conclusion, it's up to your requirements to pick the best solution for you.