WCF .NET web service load balancing using RoutingService vs nginx

I am currently evaluating solutions for WCF web service HA with load balancing. I see two feasible approaches for the type of web service I am authoring:
1) Using the RoutingService API/class provided by .NET
2) Using an HTTP load balancer like nginx.
Which is the better approach for a WCF web service hosted in IIS?

It depends on a lot of factors. The main one: is the load balancing requirement purely an availability/scalability requirement, or is it a business requirement?
If you simply require scale (e.g. round-robin distribution) or high availability (e.g. active/passive failover), and you already have a network load balancer in front of your servers, then I would definitely use that.
The simple reason is that it will then be looked after by your infrastructure people, which is how it should be. Load balancing for scale or availability is not normally a development concern.
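For illustration, a minimal nginx round-robin configuration for two IIS hosts might look like the sketch below. The host names and ports are assumptions, not taken from the question:

```nginx
# Minimal round-robin load balancing sketch; serverA/serverB are placeholders.
upstream wcf_backend {
    # round-robin is nginx's default distribution policy;
    # add "ip_hash;" here if clients must stick to one node
    server serverA:8777;
    server serverB:8777;
}

server {
    listen 80;

    location /ServiceX/ {
        proxy_pass http://wcf_backend;
        # preserve the original Host header for the WCF endpoint
        proxy_set_header Host $host;
    }
}
```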
However, if you have a requirement for routing based on message content, e.g. routing high-priority calls to one endpoint only, or meeting call-processing SLAs for different content, then this becomes a business requirement, because the routing logic is determined by the business context of the call.
This most definitely is a development concern. In this instance I would certainly use the routing service to implement these business cases.
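As a sketch of that second case, here is roughly what content-based routing with RoutingService looks like when configured in code. The backend addresses, the Priority element, and the XPath used to detect high-priority messages are all assumptions for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.ServiceModel;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;
using System.ServiceModel.Routing;

class RouterHost
{
    static void Main()
    {
        // Backends are addressed via the router's universal request/reply contract.
        ContractDescription contract =
            ContractDescription.GetContract(typeof(IRequestReplyRouter));
        var priorityBackend = new ServiceEndpoint(contract, new BasicHttpBinding(),
            new EndpointAddress("http://serverA:8777/ServiceX")); // assumed address
        var standardBackend = new ServiceEndpoint(contract, new BasicHttpBinding(),
            new EndpointAddress("http://serverB:8777/ServiceX")); // assumed address

        var config = new RoutingConfiguration();
        config.RouteOnHeadersOnly = false; // allow the XPath filter to inspect the body

        // Messages whose body contains <Priority>High</Priority> go to server A...
        config.FilterTable.Add(
            new XPathMessageFilter("//*[local-name()='Priority'] = 'High'"),
            new List<ServiceEndpoint> { priorityBackend }, 1);
        // ...and everything else falls through to server B.
        config.FilterTable.Add(new MatchAllMessageFilter(),
            new List<ServiceEndpoint> { standardBackend }, 0);

        var host = new ServiceHost(typeof(RoutingService),
            new Uri("http://localhost:8000/Router"));
        host.AddServiceEndpoint(typeof(IRequestReplyRouter),
            new BasicHttpBinding(), string.Empty);
        host.Description.Behaviors.Add(new RoutingBehavior(config));
        host.Open();

        Console.WriteLine("Router running; press Enter to stop.");
        Console.ReadLine();
        host.Close();
    }
}
```

In practice the same filter table is usually declared in the `system.serviceModel/routing` configuration section rather than in code, but the moving parts are the same.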
Hope this helps you.

Related

Difference between Microservices and load balancer?

I'm fairly new to the realm of microservices but know the basics of load balancing. I recently read an article about microservices: Enough with the microservices.
It mentions that both microservices and load-balanced applications use clusters/separate VMs to deploy many copies of an application, but that each microservice has its own database, in contrast to a load-balanced application, which is backed by a single database. Is that the only difference between them?
Here's the quoted text:
"multiple copies of the same microservice can be deployed in order to
achieve a form of scalability. However, most companies that adopt
microservices too early will use the same storage subsystem (most
often a database) to back all of their microservices. What that means
is that you don’t really have horizontal scalability for your
application, only for your service. If this is the scalability method
you plan to use, why not just deploy more copies of your monolith
behind a load balancer? You’ll accomplish the same goal with less
complexity."
You cannot compare microservices with a load balancer; you should compare them with a monolithic or SOA architecture.
In the monolithic approach you typically have a single database for the whole system and a single monolithic application project for your business.
A monolith is a single unit, SOA is a coarse-grained approach, and microservices are a fine-grained approach. In a microservice architecture, instead of designing one monolithic system, you design separate microservices around your business capabilities, based on your domain and its bounded contexts.
Each microservice may have its own database: for example, the order microservice may use a MySQL database, the recommendation microservice Cassandra, and the user-search microservice Elasticsearch or Solr.
In microservices, each microservice can talk to another using two different communication styles (see the sketch below):
Sync (REST is suggested)
Async (via message brokers like Kafka, RabbitMQ, ActiveMQ, NATS, etc.)
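For the sync style, a call between two microservices is just an HTTP request. A minimal C# sketch, with the service host name and route invented for illustration:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// The order service asking the recommendation service for suggestions
// over plain HTTP/REST. Host name and route are assumptions.
class RecommendationClient
{
    private static readonly HttpClient Http = new HttpClient
    {
        BaseAddress = new Uri("http://recommendation-service/")
    };

    public static async Task<string> GetRecommendationsAsync(string userId)
    {
        HttpResponseMessage response =
            await Http.GetAsync($"api/recommendations/{userId}");
        response.EnsureSuccessStatusCode(); // fail loudly on non-2xx
        return await response.Content.ReadAsStringAsync();
    }
}
```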
Scaling up and down is much easier in a microservices architecture than in monolithic systems, and you can change one part of the system and redeploy it independently without affecting the whole system.
Microservices also adhere to the let-it-crash paradigm. Using EIP patterns like Circuit Breaker (sketched below), you can let users think the system is always up and working, and per the CAP theorem you can build a highly available system by compensating on consistency, accepting eventual consistency according to BASE instead of ACID.
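To make the Circuit Breaker idea concrete, here is a deliberately minimal, non-thread-safe sketch of the pattern. In practice you would reach for a library such as Polly or Hystrix; the thresholds here are arbitrary:

```csharp
using System;

// Minimal circuit-breaker sketch. After N consecutive failures the circuit
// "opens" and calls are short-circuited to a fallback until a cool-down
// period has passed, so a struggling downstream service isn't hammered.
public class CircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _openDuration;
    private int _consecutiveFailures;
    private DateTime _openedAtUtc;

    public CircuitBreaker(int failureThreshold, TimeSpan openDuration)
    {
        _failureThreshold = failureThreshold;
        _openDuration = openDuration;
    }

    public T Execute<T>(Func<T> call, Func<T> fallback)
    {
        bool circuitOpen = _consecutiveFailures >= _failureThreshold
            && DateTime.UtcNow - _openedAtUtc < _openDuration;
        if (circuitOpen)
            return fallback(); // fail fast: the user still gets an answer

        try
        {
            T result = call();
            _consecutiveFailures = 0; // success closes the circuit
            return result;
        }
        catch
        {
            if (++_consecutiveFailures >= _failureThreshold)
                _openedAtUtc = DateTime.UtcNow; // trip the breaker
            return fallback();
        }
    }
}
```

A caller would wrap a remote call as something like `breaker.Execute(() => FetchRecommendations(), () => cachedFallback)`, which is what keeps the system appearing up even when a dependency is down.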
For load balancing, client-side load balancing with Ribbon, devised by Netflix, is a very viable approach.
You can also implement load balancing using nginx, Docker Swarm, or Kubernetes.
In a nutshell, it makes no sense to compare microservices with a load balancer.
Here's the (hopefully) simplest answer to your question:
Microservices are each a separate (micro) application, each with its own application logic and database.
Load Balancers are usually used to distribute client requests to a cluster of instances of the same application.
That means: You can also use a load balancer to distribute requests for a microservice that is deployed in a cluster with many instances. But a load balancer can also be used to distribute requests to many instances of a large monolithic application (as opposed to micro).
Probably the best overview of what microservices are supposed to be.

How can I use Apache to load balance a MarkLogic cluster?

Hi, I am new to MarkLogic and Apache. I have been given the task of using Apache as a load balancer for our MarkLogic cluster of 3 machines. The cluster is currently running on Linux servers.
How can we achieve this? Any information regarding this would be helpful.
You could use mod_proxy_balancer. How you configure it depends on which MarkLogic client you would like to use. If you would like to use the Java Client API, please follow the second example here to allow Apache to generate stickiness cookies. If you would like to use XCC, please configure it to use the ML-Server-generated or backend-generated "SessionID" cookie.
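As a starting point, a mod_proxy_balancer setup with cookie-based stickiness might look like the following sketch. The node names and ports are placeholders, the cookie pattern follows the example in the Apache documentation, and mod_headers is needed in addition to the proxy modules:

```apache
# Requires mod_proxy, mod_proxy_http, mod_proxy_balancer,
# mod_lbmethod_byrequests and mod_headers to be loaded.

# Issue a routing cookie the first time a client is assigned to a node,
# so later requests (e.g. within a multi-request transaction) stick to it.
Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED

<Proxy "balancer://mlcluster">
    BalancerMember "http://ml-node1:8000" route=node1
    BalancerMember "http://ml-node2:8000" route=node2
    BalancerMember "http://ml-node3:8000" route=node3
    ProxySet stickysession=ROUTEID
</Proxy>

ProxyPass        "/" "balancer://mlcluster/"
ProxyPassReverse "/" "balancer://mlcluster/"
```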
The difference here is that XCC uses sessions whereas the Java Client API builds on the REST API which is stateless, so there are no sessions. However, even in the Java Client API when you use multi-request transactions, that imposes state for the duration of that transaction so the load balancer needs a way to route requests during that transaction to the correct node in the MarkLogic cluster. The stickiness cookie will be resent by the Java Client API with every request that uses a Transaction so the load balancer can maintain that stickiness for requests related to that transaction.
As always, do some testing of your configuration to make sure you got it right. Properly configuring Apache plugins is an advanced skill. Since you are new to Apache, your best hope of ensuring you got it right is checking with an HTTP monitoring tool like Wireshark, looking at the HTTP traffic between your application and MarkLogic Server to make sure things are going to the correct node in the cluster as expected.
Note that even with the client APIs (Java, Node.js) it's not always obvious or explicit at the language API layer what might cause a session to be created. Explicitly creating multi-statement transactions definitely will, but other operations may do so as well. If you are using the same connection for UI (browser) and API (REST or XCC), then the browser app is likely to be doing things that create session state.
The safest, but least flexible, configuration is "TCP Session Affinity". If it is supported, it will eliminate most concerns related to load balancing. Cookie session affinity relies on guaranteeing that the load balancer uses the correct cookie. Not all code is equal: I have had cases where the load balancer didn't always use the cookie provided, and changing the configuration to "Load Balancer provided Cookie Affinity" fixed that.
None of this is needed if all your communications are stateless at the TCP layer, the HTTP layer, and the app layer. The latter cannot be inferred by the server.
Another concern is whether your app or middle tier is co-resident with other apps, or with the same app, connecting to the same load balancer and port. It can be difficult to make sure there are no 'crossed wires' in that setup. When MarkLogic gets a request, it associates its identity with the client IP and port. Even without load balancers, most modern HTTP and TCP client libraries implement socket caching: a great performance win, but a hidden source of subtle, random, severe errors if the library or app is sharing "cookie jars" (not uncommon). A TCP and cookie-jar cache used by different application contexts can end up sending state information from one unrelated app in the same process to another. Mostly this happens in middle-tier app servers that simply pass on requests from the first tier without domain knowledge, presuming that the low-level TCP libraries will "do the right thing". They are doing the right thing -- for the use case the library programmers had in mind -- so don't assume that your case is the one the library authors assumed. The symptoms tend to be very rare but catastrophic: transaction failures, possibly data corruption, and security problems (at the application layer), because the server cannot tell the difference between two connections from the same middle tier.
Sometimes a better strategy is to load balance between the first tier and the middle tier, and connect directly from the middle tier to MarkLogic, especially if caching is done at the load balancer. Caching is more commonly useful between the middle tier and the client than between the middle tier and the server. This is also more analogous to the classic 3-tier architecture used with RDBMSs, where load balancing sits between the client and business-logic tiers, not between the business-logic and database tiers.

Do companies that provide APIs use a shim or proxy in front of their APIs?

I'm researching how large companies manage their public APIs. I'm thinking of companies with mature established APIs such as Google, Facebook, Twitter, and Amazon.
These companies have a number of different APIs that they expose to the public. Google, for example, has Plus, AdSense, AdWords etc. APIs that are publicly consumable. I'd like to understand if they use a cluster of reverse-proxy servers in front of those APIs to provide common functionality so that their specialist API servers don't need to implement that.
For example: Throttling and Authentication could be handled at this layer instead of implementing it in each API cluster.
The questions: Does anyone use a shim or reverse proxy in front of their APIs to handle common tasks? What are the use cases that make a reverse-proxy a good or bad idea for a cluster of API servers?
Most large companies explore a variety of things to handle the traffic and load on their servers. Roughly speaking:
A load balancer sits between the entry point and the actual servers, spreading client requests across them.
A reverse proxy oftentimes sits between these to handle static files, pre-computed/rendered views, and other largely static assets.
Anycast is used for DNS purposes, so that you are routed to the nearest server that handles that URL.
Backpressure is employed in systems to limit the number of requests feeding through a single pipeline, so that services don't tip over.
Memcached, Redis, and the like are used as short-term caches: if a request is going to return roughly the same result every 5 seconds, that result can be cached in memory for faster delivery (sketched below). Some proxies can be configured to read out of these caches.
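That short-term caching item is usually implemented as cache-aside. A sketch in C# using StackExchange.Redis, where the Redis address, key naming, and 5-second TTL are assumptions:

```csharp
using System;
using System.Threading.Tasks;
using StackExchange.Redis; // NuGet package: StackExchange.Redis

class ShortTermCache
{
    private static readonly ConnectionMultiplexer Redis =
        ConnectionMultiplexer.Connect("localhost:6379"); // assumed address

    // Cache-aside: return the cached value if present; otherwise compute it
    // and store it with a short TTL, so identical requests within the window
    // never reach the backend.
    public static async Task<string> GetOrComputeAsync(
        string key, Func<Task<string>> compute)
    {
        IDatabase db = Redis.GetDatabase();
        RedisValue cached = await db.StringGetAsync(key);
        if (cached.HasValue)
            return cached;

        string fresh = await compute();
        await db.StringSetAsync(key, fresh, expiry: TimeSpan.FromSeconds(5));
        return fresh;
    }
}
```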
If you're really interested, start reading some of the Netflix blog. Take a look at some of the open source they've released, like Hystrix and Zuul. You can also take a look at some of their videos. They make heavy use of proxies and have built in some very advanced distributed behavior.
As far as a reverse proxy being a good idea, think in terms of failure. If your service calls out to another API by a direct route and that service fails, then your service will fail and the failure will cascade up to the end user. On the other hand, if it's hitting a reverse proxy, then that proxy can be configured to detect failures (or even auto-detect them) and divert traffic to backup servers.
As far as a reverse proxy being a good idea, think in terms of load. Sometimes an individual server can only handle a fraction of the traffic, so the load must be shared across many servers. This is true not just of CPU-capped but also IO-capped resources (even if the return signal itself is not the cause of the IO capping).
Daisy-chaining like this presents its own special little hell, but it's sometimes unavoidable. The downside, and what makes it a really bad choice if you can avoid it at all, is the loss of deterministic behavior. Sometimes the stupidest things will bring your servers down. And by stupid, I mean really, really dumb stuff that you never thought in a million years might bite you in the butt (think server clocks out of sync). You have to start using rolling deploys of code, take down servers manually or forcefully if they stop responding, and keep those proxy configs in good order.
HTTP/1.1 support can also be an issue. Not all reverse proxies adhere to the spec; in fact, some of them only cover ~50% of it. HAProxy does not do SSL. If you have only limited hardware, then a thread-based proxy can unexpectedly swamp the system with threads.
Finally, adding in a proxy is one more thing that will break (not can, will). You have to monitor proxies just like any other piece of the platform, aggregate their logs, and run mock drills on them too.

Load Balancing with the WSHttp Binding: Do not use reliable sessions? WHY?

We have WCF service X deployed on server A and server B, with the host address:
http://127.0.0.1:8777/ServiceX/
And we load balance the two servers. We access the service via http://myappserver/ServiceX.
We need to use per-session mode, and we set [reliable sessions] to true.
We haven't found any issues so far based on our testing. But the MSDN article linked below says "Do not use reliable sessions" for load balancing with the WSHttp binding. Please can someone give more details? Thanks a lot.
WCF Load Balancing http://msdn.microsoft.com/en-us/library/ms730128.aspx
Reliable messaging means all the messages from your established client must reach the same endpoint behind any intermediaries like routers and load balancers.
Load balancing means your calls will be distributed across all nodes as the load balancer sees fit.
Those two goals are mutually exclusive. You can have one or the other, not both.
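For context, this is roughly what the setting under discussion looks like in code. A minimal sketch; the security mode and timeout are arbitrary choices, not taken from the question:

```csharp
using System;
using System.ServiceModel;

class BindingSetup
{
    static WSHttpBinding CreateBinding()
    {
        var binding = new WSHttpBinding(SecurityMode.Message);
        // Every message now travels inside a WS-ReliableMessaging session
        // that must terminate on one specific server. A plain round-robin
        // load balancer breaks this; the balancer must pin ("stick") the
        // session to a single node instead.
        binding.ReliableSession.Enabled = true;
        binding.ReliableSession.Ordered = true;
        binding.ReliableSession.InactivityTimeout = TimeSpan.FromMinutes(10);
        return binding;
    }
}
```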
I have not had time to try this myself yet, but I found this old blog entry (https://blogs.msdn.microsoft.com/drnick/2007/07/13/sticky-sessions/):
This division according to groups would allow a feature like reliable messaging to work because the same server would be used to process all of the messages in the reliable session. The feature that this division method represents is typically called “sticky sessions” or some other phrase for affinitization in the load balancer.
Given that you mention that your firewall supports sticky sessions, I suspect/hope you will be fine.

What would be the best approach to designing a highly available pool of web services?

I've heard a lot of people touting success using Linux-based proxies to handle routing for high availability of web applications, but what are others doing with web services? I have a bank of WCF services that need to be moved to a high-availability (failover) model, meaning that if a particular server hosting the WCF services goes down, requests are routed to another server in the bank. I would rather stay away from implementing a Linux-based solution, since there are no Linux-knowledgeable people in the environment.
If you don't need durability, you can load balance WCF service requests just like normal web requests without doing anything special. If you need durability and want requests to survive being cut off mid-process, use the netMsmqBinding.
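A sketch of the durable option, hosting a service over netMsmqBinding. The queue path and the IOrderService/OrderService contract and implementation are hypothetical names for illustration:

```csharp
using System;
using System.ServiceModel;

[ServiceContract]
public interface IOrderService
{
    // Queued operations must be one-way: there is no reply channel.
    [OperationContract(IsOneWay = true)]
    void SubmitOrder(string orderXml);
}

public class OrderService : IOrderService
{
    public void SubmitOrder(string orderXml) =>
        Console.WriteLine($"Processing: {orderXml}");
}

class QueuedHost
{
    static void Main()
    {
        // Each request becomes a durable MSMQ message, so a node failing
        // mid-call leaves the message in the queue to be picked up again.
        var binding = new NetMsmqBinding(NetMsmqSecurityMode.Transport)
        {
            Durable = true,    // persist messages to disk
            ExactlyOnce = true // requires a transactional queue
        };

        var host = new ServiceHost(typeof(OrderService));
        host.AddServiceEndpoint(typeof(IOrderService), binding,
            "net.msmq://localhost/private/ServiceXQueue"); // assumed queue path
        host.Open();

        Console.WriteLine("Listening on the queue; press Enter to stop.");
        Console.ReadLine();
        host.Close();
    }
}
```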
"I would rather stay away from implementing a Linux-based solution, since there are no Linux-knowledgeable people in the environment."
This is probably a strong enough reason to not use a Linux-based solution. Doing what you describe well requires reasonable expertise beyond a simple recipe approach, and substantial maintenance.