Say I have my applications running in two different regions. If I use a traffic manager to route packets to the nearest region, how do I manage data replication between these regions from a high availability perspective?
For example, if the application server in region A fails, all my traffic will be routed to the application server in region B. Does this mean I will need bi-directional replication across both regions?
The short answer to your question is Yes.
For example, suppose we have three layers: web, application, and database.
We can create two web servers and two application servers in different locations and use SQL Server Always On for the database tier.
This way we can use Traffic Manager to point to the two different regions.
If you need data parity between the two regions, you need to implement that yourself, since Traffic Manager operates at the DNS layer and is agnostic to your data model. What it will do is direct your traffic to the best healthy endpoint in terms of latency (assuming you are using the performance routing mode).
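Since Traffic Manager is purely DNS-based, you can observe its routing decision from any client simply by resolving the TM hostname. A minimal sketch (the hostname `myapp.trafficmanager.net` is a hypothetical example):

```java
import java.net.InetAddress;

public class TmLookup {
    public static void main(String[] args) throws Exception {
        // Traffic Manager answers the DNS query with the best healthy
        // regional endpoint; the client then talks to that endpoint
        // directly, so no application traffic flows through TM itself.
        for (InetAddress addr : InetAddress.getAllByName("myapp.trafficmanager.net")) {
            System.out.println(addr.getHostAddress());
        }
    }
}
```

Each client then connects straight to the regional endpoint the DNS answer points at, which is why TM can shift traffic but never sees, let alone replicates, your data.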
What I'd like to achieve is to be able to scale out Azure SQL Database.
The Business Critical tier has a feature that enables several read-only replicas. This is a great feature that would let me offload some traffic onto those replicas.
The problem for me is that I don't understand how to manage those replicas or how load balancing works across them. Basically, I should be able to control how many replicas there are (I probably need around 10) and have traffic equally balanced across them.
Is this something that I could do?
If you look at the note here, it says
In Premium and Business Critical service tiers, only one of the read-only replicas is accessible at any given time. Hyperscale supports multiple read-only replicas.
This means the Premium and Business Critical service tiers may have multiple replicas (3-4), but only one of them is accessible as read-only. There is no control over which one, and there are no load-balancing capabilities. It is only useful when a separate application requires read-only access (for example, analytical workloads).
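For completeness, that single accessible read-only replica is reached through the `ApplicationIntent=ReadOnly` connection-string option; there is no per-replica endpoint to target. A minimal JDBC sketch, assuming the Microsoft JDBC driver (server, database, and credentials are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ReadOnlyReplicaDemo {
    public static void main(String[] args) throws Exception {
        // ApplicationIntent=ReadOnly asks the gateway to route this session
        // to the read-only replica instead of the read-write primary.
        String url = "jdbc:sqlserver://myserver.database.windows.net:1433;"
                + "database=mydb;user=myuser;password=secret;"
                + "encrypt=true;ApplicationIntent=ReadOnly";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT DATABASEPROPERTYEX(DB_NAME(), 'Updateability')")) {
            rs.next();
            System.out.println(rs.getString(1)); // prints READ_ONLY on a replica
        }
    }
}
```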
For Hyperscale you can refer to this.
Hyperscale allows 1-4 secondary replicas (1 by default). The link states:
If more than one secondary replica is present, the workload is distributed across all available secondaries.
There is no additional information, and it seems the load-balancing control is abstracted away from us.
You definitely cannot achieve your requirement of 10 read replicas with either of these configurations.
Mirroring is replicating data between Kafka clusters, while replication is replicating data across nodes within a single Kafka cluster.
Is there any specific use for replication if mirroring has already been set up?
They are used for different use cases. Let's try to clarify.
As described in the documentation,
The purpose of adding replication in Kafka is for stronger durability and higher availability. We want to guarantee that any successfully published message will not be lost and can be consumed, even when there are server failures. Such failures can be caused by machine error, program error, or more commonly, software upgrades.
Inside a cluster there might be network partitions (a single server fails, and so forth), therefore we want to provide replication between the nodes. Given a setup of three nodes in one cluster, if server1 fails there are two replicas Kafka can choose from. Same cluster implies similar response times (OK, it also depends on how these servers are configured, sure, but in a normal scenario they should not differ much).
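As a concrete example, with the three brokers described above you would create topics with a replication factor of 3, so any single server can fail without losing data. A sketch using the Kafka Java `AdminClient` (broker addresses are placeholders):

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateReplicatedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
                "server1:9092,server2:9092,server3:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // 3 partitions, each copied to all 3 brokers: if server1 fails,
            // the two remaining replicas keep every partition available.
            NewTopic topic = new NewTopic("events", 3, (short) 3);
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
```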
Mirroring, on the other hand, seems to be very valuable, for example, when you are migrating a data center, or when you have multiple data centers (e.g., AWS in the US and AWS in Ireland). Of course, these are just a couple of use cases. So what you do here is to give applications belonging to the same data center a faster and better way to access data - data locality in some contexts is everything.
If you had one node in each cluster instead, in case of failure you might see far higher response times going, let's say, from AWS located in Ireland to AWS in the US.
You might claim that in order to achieve data locality (services in cluster one read from Kafka in cluster one) one still needs to copy the data from one cluster to the other. That's definitely true, but the advantages you get with mirroring can outweigh those of reading directly (via an SSH tunnel?) from Kafka in another data center: single connections going down, longer client connection/session times (depending on the location of the data center), legislation (some data can be collected in one country while other data shouldn't).
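Conceptually, mirroring is just a consumer on the source cluster feeding a producer on the target cluster; Kafka's MirrorMaker tool does exactly this for you, with batching and failure handling. A bare-bones sketch of the idea (cluster addresses and the topic name are hypothetical; in practice you would run MirrorMaker rather than hand-rolling this loop):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class NaiveMirror {
    public static void main(String[] args) {
        Properties src = new Properties();              // source data center
        src.put("bootstrap.servers", "kafka-us.example.com:9092");
        src.put("group.id", "mirror");
        src.put("key.deserializer", StringDeserializer.class.getName());
        src.put("value.deserializer", StringDeserializer.class.getName());

        Properties dst = new Properties();              // target data center
        dst.put("bootstrap.servers", "kafka-ie.example.com:9092");
        dst.put("key.serializer", StringSerializer.class.getName());
        dst.put("value.serializer", StringSerializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(src);
             KafkaProducer<String, String> producer = new KafkaProducer<>(dst)) {
            consumer.subscribe(Collections.singleton("events"));
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    // Re-publish each record in the remote data center so
                    // local consumers there get data locality.
                    producer.send(new ProducerRecord<>(r.topic(), r.key(), r.value()));
                }
            }
        }
    }
}
```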
Replication is the basis of high availability. You shouldn't use mirroring to handle high availability in a context where data locality matters. At the same time, you should not use just replication where you need to duplicate data across data centers (I don't even know if you can without mirroring or an SSH tunnel).
I am currently developing a system that makes heavy use of Redis for a series of web services.
One of the key criteria of this system is fast responses.
At present the layout (ignoring load balancers etc) is as follows:
2 x Front End Play Framework 2.x Servers
2 x Job Handling/Persistence Play Framework 2.x Servers
1 x MySQL Server
2 x Redis Servers, 1 master, 1 slave
In this setup, Redis serves two tasks: as a shared cache and as a message bus.
Currently the front end servers host a service which interacts in its entirety with Redis.
The front end servers try to balance reads across the pool of read servers (currently the master and one slave), but being Redis, writes have to go to the master server. They handle cache updates etc. by sending messages on the queues, which are picked up by the job handling servers.
The job handling servers do blocking listens (BLPOP) against the Redis write server and process tasks when necessary. They hold the only connections to MySQL.
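As an illustration of that job-handling loop, here is a minimal worker sketch using the Jedis client (the host name and queue key are hypothetical):

```java
import java.util.List;
import redis.clients.jedis.Jedis;

public class JobWorker {
    public static void main(String[] args) {
        // Workers connect to the write master, since BLPOP mutates the list.
        try (Jedis redis = new Jedis("redis-master.internal", 6379)) {
            while (true) {
                // Block until a job arrives (timeout 0 = wait forever);
                // BLPOP returns [queueName, payload].
                List<String> job = redis.blpop(0, "jobs");
                String payload = job.get(1);
                System.out.println("processing: " + payload);
                // ... process the task, persist results to MySQL ...
            }
        }
    }
}
```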
At present the read replica is a dedicated server, there mostly so that it can be switched to write master if the current master fails.
I was thinking of putting a Redis read replica (slave) on each of the front end servers, which means read latency would be even lower, while writes (messages for the queues) get pushed to the write server on a separate connection.
If I need to scale, I could just add more front end servers with read slaves.
It sounds like a win/win to me: even if the write server temporarily drops out, the front end servers can still read data from their local slave and act accordingly.
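In code, the proposed split would look roughly like this: reads hit the replica on localhost while writes and queue messages go out on a separate connection to the master. A small Jedis sketch (host names and keys are again hypothetical):

```java
import redis.clients.jedis.Jedis;

public class FrontEndRedis {
    public static void main(String[] args) {
        try (Jedis localReplica = new Jedis("localhost", 6379);          // replica on this FE box
             Jedis master = new Jedis("redis-master.internal", 6379)) {  // remote write master
            // Reads come from the local replica: no network hop at all.
            String cached = localReplica.get("cache:user:42");
            System.out.println("cached value: " + cached);

            // Writes and queue messages must go to the master; even if it
            // drops out briefly, the reads above keep working locally.
            master.lpush("jobs", "refresh:user:42");
        }
    }
}
```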
Can anyone think of reasons why this might not be such a good idea?
I understand the advantages of this approach... but consider this: what happens when you need to scale just one component (i.e., FE servers or Redis) but not the other? For example, more traffic could mean you'll need more app servers to handle it while the Redises will be considerably less loaded. On the other hand, if your dataset grows and/or more load is put on the Redises, you'll need to scale those and not the app.
The design should fit your requirements, and the simplicity of your suggested setup has a definite appeal (i.e., to scale, just add another identical Lego block), but from my meager experience, anything that sounds too good to be true usually is. In the longer run, even if this works for you now, you may find yourself in a jam down the road. My advice: separate your Redis(es) from your app servers, deal with and/or work around the network, and make sure each layer is available and scalable in its own right.
From the active/active documentation:
we have developed active/active high availability for queues
This solution still requires a RabbitMQ cluster, which means that it will not cope seamlessly with network partitions within the cluster and, for that reason, is not recommended for use across a WAN (though of course, clients can still connect from as near and as far as needed)
What does "not recommended for use across a WAN" mean?
I can't understand this remark:
If I buy three machines on EC2, will I need to set up a domain controller/DNS server?
What does this restriction mean, and why?
Replication is time-sensitive: timing assumptions have to be made in order to keep the distributed state synchronized across the replicas.
The Internet is an asynchronous network by definition; the network evolves asynchronously and there is no way to make assumptions about delivery times, not even when MPLS (Multiprotocol Label Switching) paths are defined: BGP (Border Gateway Protocol) introduces a lot of unpredictability, paths can be very unpredictable, and this translates into unpredictable latency.
Given the above, unpredictable latency is a killer factor for active/active replication (i.e., mirroring the state synchronously among the replicas to reach a consistent distributed state).
Another problem to take into consideration is network partitioning: in a set of replicas, one or more can become isolated, creating "islands" of non-consistent replicas. Let's assume the replica set R = {R1, R2, R3, ..., RN}; for network connectivity reasons (e.g., BGP problems) a subset of replicas like {R1, R2, R3} may be isolated from the remaining ones. Network partitioning implies distributed-state inconsistencies: each subset of replicas will be internally consistent, but globally the subsets evolve independently towards a corrupted distributed state.
The CAP theorem deals with the replication problem over the WAN (Wide Area Network, i.e., the Internet). It states that Consistency, Availability, and Partition tolerance cannot all be achieved simultaneously over the WAN or any other asynchronous network; 2 out of 3 must be chosen for large-scale distributed systems (e.g., Availability and Partition tolerance for the well-known NoSQL databases).
Coming back to the original question: in light of the above, that statement from the RabbitMQ documentation is a pragmatic summary of the problems highlighted here (i.e., active/active replication cannot be achieved reliably over the WAN). For this reason, if you need to replicate your broker instances over the WAN, techniques like the Shovel and Federation plugins are commonly used in RabbitMQ deployments.
It means that if you have 3 EC2 instances in your cluster, they should be in the same data center (not US East and US West, for example*). RabbitMQ uses Erlang's node communication and is pretty chatty; low-latency communication is critical to having a performant cluster.
*Ideally even the same subnet, but that's not always possible.
I have a service-based architecture where a web farm of ASP clients hits an application server farm of WCF services. Obviously all the database access is done by the WCF services. Now I would like to cache my frequently used database objects at the service tier using Velocity. I am considering making each physical application server part of the cache cluster as well.
According to the Velocity documentation, if I use regions, objects are stored only on a single host. I actually wouldn't have any problem with each host keeping its own cache, provided that I could somehow synchronize them.
So my questions are:
1. If I create one region on one host, is it also created on another one?
2. When I clear a cache region, is it cleared on one host only?
3. If I subscribe to region-level notifications on all the hosts, can I catch events from one host on another one?
4. In this scenario should I use regions at all, or stay away from them?
I hope my questions are clear. Actually, I am more interested in a solution to my problem than in answers to my questions.
Yes, you are right in reading the doc that a region will exist only on one host.
"I actually wouldn't have any problem with each host keeping its own cache, provided that I could somehow synchronize them."
When you say synchronize, do you mean when HA is enabled? Velocity would actually take care of that, if that's what you meant.
For the questions:
1. No.
2. Yes.
3. Notifications are sent to the client, so I am not sure there is any way to send notifications from one host to another.
4. Regions give you search capabilities but take HA away from you. In your case, you could use the advantages of HA.
Having regions does not necessarily mean that you don't have HA. If you create your own cache (and don't use the 'default' one), you can create it with Secondaries = 1 (HA on).
Now let's say you have 4 cache hosts; when you define a region, it will have both a primary and a secondary host, so each action on the region will be applied to both.
Named caches distribute across participating nodes. Named regions live on a single node. Regions can be HA, but they cannot take full advantage of distributed cache scaling, as their object load does not distribute across the participating nodes in the cluster. Also, using named caches with HA requires a minimum of three nodes, rather than the two nodes needed if you use only the "default" cache.