I'm fairly new to the realm of microservices but know basics about load balancing. I recently read an article about the microservices: Enough with the microservices.
There it's mentioned that both the microservices and load balancers have clusters/different VM's for deploying many copies of application but in the case of microservices, we have a separate database in contrast to load balancers which backs a single database. Is it the only difference between them?
Here's the quoted text:
"multiple copies of the same microservice can be deployed in order to
achieve a form of scalability. However, most companies that adopt
microservices too early will use the same storage subsystem (most
often a database) to back all of their microservices. What that means
is that you don’t really have horizontal scalability for your
application, only for your service. If this is the scalability method
you plan to use, why not just deploy more copies of your monolith
behind a load balancer? You’ll accomplish the same goal with less
complexity."
You can not compare Micro-services with load balancer... you should compare it with monolithic or SOA architecture.
In monolithic approach you mainly have only one database for the whole system and a monolithic application as a single project for your business.
monolithic is single unit But SOA is a coarse-grain approach and Microservice is fine-grain approach. In microservice architecture instead of designing a monolithic system you design different micro-services around your business capabilities and base on your domain and bounded-context.
each micro-services may have their own database. for e.g. order micro-service may have mysql database, recommendation micro-service may have Cassandra database and user-search micro service may have Elasticsearch or SOLR database.
In microservices each micro-service can talk to another base on two different communication style:
Sync (Rest is suggested)
Async (via message brokers like Kafka, RabbitMQ, ActiveMQ or NATS
and etc.)
Scaling up-down in micro-services architecture is much easier than monolithic systems and you can even change a part of system and redeploy it independently without affecting the whole system.
Also micro-services adhere to let-it-crash paradigm and with using EIP patterns like Circuit-Breaker you can let user think system is always up and working and Base on CAP theorem you can have high-available system by compensating for consistency and having Eventual Consistency according to BASE instead of ACID
For load balancing Client-side Load Balancing with Ribbon devised by Netflix is very viable approach.
Also with using NginX, Docker Swarm and kubernetes you can implement load balancing.
In a nutshell there is nothing to do about comparing Microservices with Load balancer.
Here's the (hopefully) simplest answer to your question:
Microservices are a different (micro-) application each. Each with its own application logic and database.
Load Balancers are usually used to distribute client requests to a cluster of instances of the same application.
That means: You can also use a load balancer to distribute requests for a microservice that is deployed in a cluster with many instances. But a load balancer can also be used to distribute requests to many instances of a large monolithic application (as opposed to micro).
The probably best overview for what Microservices are supposed to be.
Related
I am in a situation where I can use Service Fabric (locally) but cannot leverage Azure Service Bus (or anything "cloud"). What would be the corollary for queuing/pub-sub? Service Fabric is allowed since it is able to run in a local container, and is "free". Other 3rd party messaging infrastructure, like RabbitMQ, are also off the table (at the moment).
I've built systems using a locally grown bus, built on MSMQ and WCF, but I don't see how to accomplish the same thing in SF. I suspect I can have SF services use a custom ICommunicationListener that exposes msmq, but that would only be available inside the cluster (the way I understand it). I can build an HTTPBridge (in SF) in front of those to make them available outside the cluster, but then I'd lose the lifetime decoupling (client being able to call a service, using queues, even if that service isn't online at the time) since the bridge itself wouldn't benefit from any of the aspects of queuing.
I have a few possibilities but all suffer from some malady that only exists because of SF, locally. Also, the same code needs to easily deploy to full Azure SF (where I can use ASB and this issue disappears) so I don't want to build two separate systems just because of where I am hosting it in some instances.
Thanks for any tips.
You can build this yourself, for example like this. This uses a BrokerService that will distribute message-data to subscribed services and actors.
You can also run a containerized queuing platform like RabbitMQ with volumes.
By running the queue system inside the cluster you won't introduce an external dependency.
The problem is not SF, The main issue with your design is that you are coupling architectural requirements to implementations. SF runs on top of VirtualMachines, in the end, the only difference is that SF put the services in those machines, using another solution you would have an Agent Deploying these services in there or doing a Manual deployment. The challenges are the same.
It is clear from the description that the requirement in your design is a need for a message queue, the concept of queues are the same does not matter if it is Service Bus, RabbitMQ or MSMQ. Each of then will have the basic foundations of queues with specifics of each implementation, some might add transactions, some might implement multiple patterns, and so on.
If you design based on specific implementation, you will couple your solution to the implementation and make your solution hard to maintain and face challenges like you described.
Solutions like NServiceBus and Masstransit reduce a lot of these coupling from your code, and if you think these are not enough, you can create your own abstraction. Then you use configurations to tied your business logic to implementations.
Despite the above advice, I would not recommend you using different
solutions per environment, because as said previously, each solution
has it's own implementations and they might not assimilate to each other, as example, you might face issues in
production because you developed against MSMQ on DEV and TEST
environments, and when deployed to Production you use ServiceBus, they
have different limitations, like message size, retention period and son
on.
If you are willing to use MSMQ, you can add MSMQ to the VMs running your cluster and connect from your services without any issue. Take a look into this SO first: How can I use MSMQ in Azure Service Fabric
Recently several service discovery tools have become popular/"mainstream", and I’m wondering under what primary use cases one should employ them instead of traditional load balancers.
With LBs, you cluster a bunch of nodes behind the balancer, and then clients make requests to the balancer, who then (typically) round robins those requests to all the nodes in the cluster.
With service discovery (Consul, ZK, etc.), you let a centralized “consensus” service determine what nodes for particular service are healthy, and your app connects to the nodes that the service deems as being healthy. So while service discovery and load balancing are two separate concepts, service discovery gives you load balancing as a convenient side effect.
But, if the load balancer (say HAProxy or nginx) has monitoring and health checks built into it, then you pretty much get service discovery as a side effect of load balancing! Meaning, if my LB knows not to forward a request to an unhealthy node in its cluster, then that’s functionally equivalent to a consensus server telling my app not to connect to an unhealty node.
So to me, service discovery tools feel like the “6-in-one,half-dozen-in-the-other” equivalent to load balancers. Am I missing something here? If someone had an application architecture entirely predicated on load balanced microservices, what is the benefit (or not) to switching over to a service discovery-based model?
Load balancers typically need the endpoints of the resources it balances the traffic load. With the growth of microservices and container based applications, runtime created dynamic containers (docker containers) are ephemeral and doesnt have static end points. These container endpoints are ephemeral and they change as they are evicted and created for scaling or other reasons. Service discovery tools like Consul are used to store the endpoints info of dynamically created containers (docker containers). Tools like consul-registrator running on container hosts registers container end points in service discovery tools like consul. Tools like Consul-template will listen for changes to containers end points in consul and update the load balancer (nginx) for sending the traffic to. Thus both Service Discovery Tools like Consul and Load Balancing tools like Nginx co-exist to provide runtime service discovery and load balancing capability respectively.
Follow up: what are the benefits of ephemeral nodes (ones that come and go, live and die) vs. "permanent" nodes like traditional VMs?
[DDG]: Things that come quickly to my mind: Ephemeral nodes like docker containers are suited for stateless services like APIs etc. (There is traction for persistent containers using external volumes - volume drivers etc)
Speed: Spinning up or destroying ephemeral containers (docker containers from image) takes less than 500 milliseconds as opposed to minutes in standing up traditional VMs
Elastic Infrastructure: In the age of cloud we want to scale out and in according to users demand which implies there will be be containers of ephemeral in nature (cant hold on to IPs etc). Think of a markerting campaign for a week for which we expect 200% increase in traffic TPS, quickly scale with containers and then post campaign, destroy it.
Resource Utilization: Data Center or Cloud is now one big computer (compute cluster) and containers pack the compute cluster for max resource utilization and during weak demand destroy the infrastructure for lower bill/resource usage.
Much of this is possible because of lose coupling with ephemeral containers and runtime discovery using service discovery tool like consul. Traditional VMs and tight binding of IPs can stifle this capability.
Note that the two are not necessarily mutually exclusive. It is possible, for example, that you might still direct clients to a load balancer (which might perform other roles such as throttling) but have the load balancer use a service registry to locate instances.
Also worth pointing out that service discovery enables client-side load balancing i.e. the client can invoke the service directly without the extra hop through the load balancer. My understanding is that this was one of the reasons that Netflix developed Eureka, to avoid inter-service calls having to go out and back through the external ELB for which they would have had to pay. Client-side load balancing also provides a means for the client to influence the load-balancing decision based on its own perspective of service availability.
If you look at the tools from a completely different perspective, namely ITSM/ITIL, load balancing becomes "just that", whereas service discovery is a part of keeping your CMDB up to date, and ajour with all your services, and their interconnectivity, for better visibility of impact, in case of downtime, and an overview of areas that may need supplementing, in case of High availability applications.
Furthermore, service-discovery only gives you a picture as of the last scan, and not near-real-time (of course dependent on which scanning interval you have set), whereas load balancing will keep an up-to-date picture of your application's health.
I am currently evaluating solutions for WCF Webservice HA with load balancing. I see 2 feasible approaches for the type of WS i am authoring which are
1) Using the RoutingService API / Class provided by .NET
2) Using a HTTP load balancer like nginx.
Which one is a better approach for WCF WS hosted on IIS.
It depends on a lot of factors. The main one being, is the load balancing requirement a pure availability/scalability driven requirement or is it a business requirement?
If you simply require scale, eg a round robin distribution, or high availability, eg. active/passive failover, and you already have a network load balancer in front of your servers, then I would definitely use that.
The simple reason is that then it will be looked after by your infrastructure people, which is how it should be. Load balancing for scale/availability etc is not normally a development concern.
However, if you have a requirement for routing based on message content, eg routing of high priority calls to one endpoint only, or meeting call processing SLAs for different content, then this becomes a business requirement, because routing logic will then be determined on the business context of the call.
This most definitely is a development concern. In this instance I would certainly use the routing service to implement these various business cases.
Hope this helps you.
I am new in
Apache ZooKeeper : ZooKeeper is a centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services.
Apache Mesos : Apache Mesos is a cluster manager that simplifies the complexity of running applications on a shared pool of servers.
Apache Helix : Apache Helix is a generic cluster management framework used for the automatic management of partitioned, replicated and distributed resources hosted on a cluster of nodes.
Erlang Langauge : Erlang is a programming language used to build massively scalable soft real-time systems with requirements on high availability.
It sounds to me that Helix and Mesos both are useful for Clustering management System. How they are related to ZooKeeper? It'd better if someone give me a real world example for their usage.
I am curious to know How [BOINC][1] are distributing tasks to their clients? Are they using any of the above technologies? (Forget about Erlang).
I just need a brief view on it :)
Erlang was built by Ericsson, designed for use in phone systems. By design, it runs hundreds, thousands, or even 10s of thousands of small processes to handle tasks by sending information between them instead of sharing memory or state. This enables all sorts of interesting features that are great for high availability distributed systems such as:
hot code reloading. Each process is paused, it's relevant module code is swapped out, and it is resumed where it left off, so deploys can happen without restarting or causing significant interruption.
Easy distributed messaging and clustering. Sending a message to a local process or a remote one is fairly seamless in most instances.
Process-local GC. Garbage collection happens in each process independently instead of a global stop-the-world even like java, aiding in low-latency results.
Supervision trees and complex process hierarchy and monitoring/managing.
A few concrete real-world examples that makes great use of Erlang would be:
MongooseIM A highly performant and incredibly scalable, distributed XMPP / Chat server
Riak A distributed key/value store.
Mesos, on the other hand, you can sort of think of as a platform effectively for turning a datacenter of servers into a platform for teams and developers. If I, say as a company, own a datacenter with 10,000 physical servers, and I have 1,000 engineers developing hundreds of services, a good way to allow the engineers to deploy and manage services across that hardware without them needing to worry about the servers directly. It's an abstraction layer over-top of the physical servers to that allows you to share and intelligently allocate resources.
As a user of Mesos, I might say that I have Service X. It's an executable bundle that lives in location Y. Each instance of Service X needs 4 GB of RAM and 2 cores. And I need 8 instances which will be attached to a load balancer. You can specify this in configuration and deploy based on that config. Mesos will find hardware that has enough ram and CPU capacity available to handle each instance of that service and start it running in each of those locations.
It can handle a lot of other more complex topics about the orchestration of them as well, but that's probably a bit in-depth for this :)
Zookeepers most common use cases are Service Discover and configuration management. You can think of it, fundamentally, a bit like a nested key value store, where services can look at pre-defined paths to see where other services currently live.
A simple example is that I have a web service using a shared database cluster. I know a simple name for that database cluster and where the configuration for it lives in zookeeper. I can look up (or repeatedly poll) that path in zookeeper to check what the addresses of the active database hosts are. And on the other side, if I take a database node out of rotation and replace it with a new one, the config in zookeeper gets updated with the new address, and anything continually looking at it will detect this change and change where it's connected to.
A more complex use case for zookeeper is how Kafka uses it (or did at the time that I last used Kafka). Kafka has streams, and streams have many shards. Each consumer of each stream use zookeeper to save checkpoints in each shard after they have read and processed up to a certain point in the stream. That way if the consumer crashes or is restarted, it knows where to pick up in the stream.
I dont know about Meos and Earlang language. But this article might help you with Helix and Zookeeper.
This article tells us:
Zookeeper is responsible for gluing all parts together where Helix is cluster management component that registers all cluster details (cluster itself, nodes, resources).
The article is related to clustering in JBPM using helix and zookeeper.But with this you will get a basic idea on what helix and zookeeper is used for.
And from most of the articles i read online it seems like zookeeper and helix are used together.
Apache Zookeeper can be installed on a single machine or on a cluster.
It can be used to keep track of logs. It can provide various services on a distributed platform.
Storm and Kafka rely on Zookeeper.
Storm uses Zookeeper to store all state so that it can recover from an outage in any of its (distributed) component services.
Kafka queue consumers can use Zookeeper to store information on what has been consumed from the queue.
I've heard a lot of people touting success using Linux based proxies to handle routing for high availability of web applications, but what are others doing with web services? I have a bank of WCF services that need to be moved to a high availability (failover) model, meaning that if a particular server hosting the WCF services goes down, the request is routed to another of the servers in the bank. I would rather stay away from implementing a Linux based solution, since there are no Linux knowledgeable people in the environment.
If you don't need durability, you can load balance WCF service requests just like normal web requests without doing anything special. If you need durability and want requests to survive being cut off mid-process, use the netMsmqBinding.
I would rather stay away from
implementing a Linux based solution,
since there are no Linux knowledgeable
people in the environment.
This is probably a strong enough reason to not use a Linux-based solution. Doing what you describe well requires reasonable expertise beyond a simple recipe approach, and substantial maintenance.