How does the VMware FT test-and-set work when the primary is actually down?

VMware FT uses an atomic test-and-set operation on shared storage to avoid the split-brain issue.
In the original paper ("The Design of a Practical System for Fault-Tolerant Virtual Machines"), the system uses a network disk server, shared by both the primary and the backup. That network disk server has a "test-and-set service". The test-and-set service maintains a flag that is initially set to false. If the primary or the backup thinks the other server is dead, and thus that it should take over by itself, it first sends a test-and-set operation to the disk server. The operation atomically sets the flag to true, and only the server whose operation found the flag still false is allowed to go live.
My question is: when the new primary goes down in turn, who is responsible for setting the flag back to false so that the new backup can pass the test-and-set?
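For concreteness, here is a minimal sketch of what such a test-and-set service might look like (the class and method names are illustrative; the paper describes the behavior but not an implementation, and it does not spell out exactly who clears the flag):

    import threading

    class TestAndSetService:
        """Sketch of the disk server's test-and-set flag (illustrative only)."""

        def __init__(self):
            self._lock = threading.Lock()
            self._flag = False  # False: no server has claimed the right to go live

        def test_and_set(self):
            # Atomically: succeed only if the flag was still False,
            # and leave it True afterwards either way.
            with self._lock:
                won = not self._flag
                self._flag = True
                return won

        def clear(self):
            # Presumably invoked as part of the repair workflow, e.g. when
            # a new primary/backup pair is re-established; the paper leaves
            # this reset mechanism unspecified.
            with self._lock:
                self._flag = False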

Related

Connection strings for SQL Azure with geo-replica

With read-only routing, we can have a Failover Group listener direct the connection to a read-only secondary automatically, which can provide additional capacity.
I have set this up, but I am confused by the fact that the FG provides two different FQDNs for the connection: one is servername.database.windows.net and the other is servername.secondary.database.windows.net. These work as expected when the system is up and running, but what is not clear is what happens to the secondary connection if the primary goes offline and a failover takes place. Would the secondary connection automatically route to the new primary/only server, or would it simply stop working because there would be no secondaries available?
I would test it, but I can't find a way to take the secondary offline to simulate it being unavailable.
Alternatively, when I tried using the primary connection with ApplicationIntent=ReadOnly, it seems to send all traffic to the primary server, so that doesn't work either.
what happens to the secondary connection if the primary goes offline and a failover takes place?
Auto-failover groups provide read-write and read-only listener end-points that remain unchanged during geo-failovers. This means you do not have to change the connection string for your application after a geo-failover, because connections are automatically routed to the current primary. Whether you use manual or automatic failover activation, a geo-failover switches all secondary databases in the group to the primary role.
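As an illustration of the two listener endpoints (the server, database, and credential values below are placeholders, not from your setup), this is how they would be used with pyodbc:

    import pyodbc

    # Read-write listener: keeps routing to the current primary,
    # even after a geo-failover.
    rw = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=tcp:myfailovergroup.database.windows.net,1433;"
        "Database=mydb;Uid=myuser;Pwd=mypassword;Encrypt=yes;"
    )

    # Read-only listener: routes to a readable secondary.
    ro = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=tcp:myfailovergroup.secondary.database.windows.net,1433;"
        "Database=mydb;Uid=myuser;Pwd=mypassword;Encrypt=yes;"
        "ApplicationIntent=ReadOnly;"
    )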
would it simply stop working because there would be no secondaries available?
If you add a single database to the failover group, it automatically creates a secondary database using the same edition and compute size on the secondary server.
If you add a database that already has a secondary database in the secondary server, that geo-replication link is inherited by the group.
When you add a database that already has a secondary database in a server that is not part of the failover group, a new secondary is created in the secondary server.
Refer: Auto-failover groups overview & best practices

Is the horizontal scaling (scale-out) option available in Azure SQL Managed Instance?

Is the horizontal scaling (scale-out) option available in Azure SQL Managed Instance?
Yes, Azure SQL Managed Instance supports scale-out.
You can refer to the document Peter Bons provided in the comments. Document here:
Scale up/down: Dynamically scale database resources with minimal downtime
Azure SQL Database and SQL Managed Instance enable you to dynamically add more resources to your database with minimal downtime; however, there is a switch-over period where connectivity is lost to the database for a short amount of time, which can be mitigated using retry logic.
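The retry logic it refers to could be as simple as this sketch (pyodbc, the attempt count, and the backoff values are my assumptions, not part of the documentation):

    import time
    import pyodbc

    def connect_with_retry(conn_str, attempts=5, backoff=2.0):
        # Retries transient connection failures during the short
        # switch-over window of a scale operation.
        for attempt in range(attempts):
            try:
                return pyodbc.connect(conn_str, timeout=30)
            except pyodbc.OperationalError:
                if attempt == attempts - 1:
                    raise
                time.sleep(backoff * (attempt + 1))  # simple linear backoff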
Scale out: Use read-only replicas to offload read-only query workloads
As part of High Availability architecture, each single database, elastic pool database, and managed instance in the Premium and Business Critical service tier is automatically provisioned with a primary read-write replica and several secondary read-only replicas. The secondary replicas are provisioned with the same compute size as the primary replica. The read scale-out feature allows you to offload read-only workloads using the compute capacity of one of the read-only replicas, instead of running them on the read-write replica.
HTH.
Yes, the scale-out option is available in the Business Critical (BC) tier. BC utilizes three nodes: one primary and two secondaries. They use Always On on the backend. If you need to use them for reporting, just set ApplicationIntent=ReadOnly in the connection string and your application will be routed to one of the secondary nodes, as shown in the sketch below.
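For example (the host name is a placeholder), you can also confirm which kind of replica served the connection with DATABASEPROPERTYEX:

    import pyodbc

    # ApplicationIntent=ReadOnly asks the gateway to route the session
    # to one of the secondary replicas.
    conn = pyodbc.connect(
        "Driver={ODBC Driver 18 for SQL Server};"
        "Server=tcp:myinstance.mydnszone.database.windows.net,1433;"
        "Database=mydb;Uid=myuser;Pwd=mypassword;Encrypt=yes;"
        "ApplicationIntent=ReadOnly;"
    )

    # DATABASEPROPERTYEX reports 'READ_ONLY' when the session landed on
    # a readable secondary, 'READ_WRITE' when it is on the primary.
    row = conn.execute(
        "SELECT DATABASEPROPERTYEX(DB_NAME(), 'Updateability')"
    ).fetchone()
    print(row[0])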

Dirty Reads in SQL Server AlwaysOn

I have a pair of SQL Server 2014 databases set up as a synchronous AlwaysOn availability group.
Both servers are set to the Synchronous commit availability mode, with a session timeout of 50 seconds. The secondary is set to be a Read-intent only readable secondary.
If I write to the primary and then immediately read from the secondary (via ApplicationIntent=ReadOnly), I consistently read dirty data (i.e. the state before the write). If I wait for around a second between writing and reading, I get the correct data.
Is this expected behaviour? If so, is there something I can do to ensure that reads from the secondary are up-to-date?
I'd like to use the secondary as a read-only version of the primary (as well as a fail-over), to reduce the load on the primary.
There is no way you can get dirty reads unless you are using the NOLOCK hint.
When you enable read-only secondaries in AlwaysOn, SQL Server internally uses row versioning to store the previous version of each row.
Further, you are using synchronous-commit mode; this ensures log records are hardened on the secondary before the commit is acknowledged on the primary.
What you are seeing is data latency.
This whitepaper deals with this scenario. Below is the relevant part, which helps in understanding it:
The reporting workload running on the secondary replica will incur some data latency, typically a few seconds to minutes depending upon the primary workload and the network latency.
The data latency exists even if you have configured the secondary replica to synchronous mode. While it is true that a synchronous replica helps guarantee no data loss in ideal conditions (that is, RPO = 0) by hardening the transaction log records of a committed transaction before sending an ACK to the primary, it does not guarantee that the REDO thread on secondary replica has indeed applied the associated log records to database pages.
So there is some data latency. You may wonder if this data latency is more likely when you have configured the secondary replica in asynchronous mode. This is a more difficult question to answer. If the network between the primary replica and the secondary replica is not able to keep up with the transaction log traffic (that is, if there is not enough bandwidth), the asynchronous replica can fall further behind, leading to higher data latency.
In the case of a synchronous replica, insufficient network bandwidth does not cause higher data latency on the secondary, but it can slow down the transaction response time and throughput for the primary workload.
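If you want to see the latency concretely, a rough probe like the following works (the connection strings, the dbo.LatencyProbe table, and the polling interval are all assumptions for illustration):

    import time
    import pyodbc

    PRIMARY_CONN_STR = "DSN=ag_primary"      # placeholder: read-write replica
    SECONDARY_CONN_STR = "DSN=ag_secondary"  # placeholder: read-intent secondary

    primary = pyodbc.connect(PRIMARY_CONN_STR, autocommit=True)
    secondary = pyodbc.connect(SECONDARY_CONN_STR, autocommit=True)

    # Write a marker row on the primary (dbo.LatencyProbe is assumed to exist).
    marker = int(time.time() * 1000)
    primary.execute("INSERT INTO dbo.LatencyProbe (id) VALUES (?)", marker)

    # Poll the secondary until the REDO thread has applied the change.
    start = time.time()
    while not secondary.execute(
            "SELECT 1 FROM dbo.LatencyProbe WHERE id = ?", marker).fetchone():
        time.sleep(0.05)
    print("visible on secondary after %.2f s" % (time.time() - start))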

Is the communication between primary and active secondary secured and how it works

The Premium service tier of Azure SQL Database provides active geo-replication, through which up to 4 readable secondaries can be created. I want to know whether the communication between the primary and secondary databases is secure, and whether there is any chance of data being hacked in transit.
For more information: Azure SQL Database Inside#High Availability
First, a transaction is not considered to be committed unless the primary replica and at least one secondary replica can confirm that the transaction log records were successfully written to disk. Second, if both a primary replica and a secondary replica must report success, small failures that might not prevent a transaction from committing but that might point to a growing problem can be detected.
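The commit rule in that quote boils down to a simple predicate; the sketch below only restates the rule, it is not Azure's actual code:

    def commit_hardened(primary_ok, secondary_acks):
        # The transaction commits only if the primary wrote its log
        # records AND at least one secondary confirmed it did too.
        return primary_ok and any(secondary_acks)

    # One of three secondaries acknowledged -> the commit can complete.
    assert commit_hardened(True, [False, True, False])
    # No secondary acknowledged -> the commit cannot complete yet.
    assert not commit_hardened(True, [False, False, False])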

How intelligent is SQL Server Mirroring?

While working on my current development product I have setup SQL server mirroring between the primary data center and the secondary data center. In the primary data center the SQL .mdf and .ldf files are stored on the SAN.
Now admittedly it should be very unlikely for us to lose the SAN, but if, for example, the connection to the SAN was lost and database integrity was compromised, would the mirroring still happen? I.e., would SQL now mirror the broken database so that both copies are equally broken?
From googling, it's not clear when mirroring will and will not happen, so I was hoping the community may be able to share some of their experiences.
I also have backup schedules set up, which would be a final failsafe, but realistically I would hope that the mirrored database would be our quickest way to bring everything back online.
In this scenario there is at present no witness server in the mirroring process, although with the benefit of automatic failover I am thinking of adding one.
Thanks
As far as mirroring corruption between PRIMARY and SECONDARY goes: unfortunately, it depends. If the corruption is immediate and physical, then not normally -- the corruption is typically picked up by checks done at the end of the transaction and rolled back.
However, a database can exist in a corrupted state for some time before anything realises it is corrupted. If the underlying data pages are not touched, the engine never has cause to check them. So it is possible that underlying storage issues may mean that either database can become corrupted, and you won't know until you attempt to access the affected pages. Traditionally, this would be a write operation, since your client connection will only read from the current active database (and not the partner).
This is why it is important to perform regular maintenance checks on your databases (e.g. DBCC CHECKDB). This becomes harder in a mirrored environment because only PRIMARY can typically be checked, so you really have to induce a manual failover to test your SECONDARY (unless you are running Enterprise, where you might be able to snapshot the mirror and check that -- I've not tried).
Starting with SQL Server 2008, the engine will attempt something called Automatic Page Repair, where it tries to automatically recover corrupted pages it encounters during the mirroring process. You should probably keep an eye on sys.dm_db_mirroring_auto_page_repair if this is something you are worried about.
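If you want to keep that eye on it, a monitoring snippet might look like this (the DSN is a placeholder; the DMV and the columns queried are documented):

    import pyodbc

    conn = pyodbc.connect("DSN=mirrored_db")  # placeholder connection
    rows = conn.execute("""
        SELECT DB_NAME(database_id) AS database_name,
               file_id, page_id, error_type, page_status, modification_time
        FROM sys.dm_db_mirroring_auto_page_repair
        ORDER BY modification_time DESC
    """).fetchall()
    for r in rows:
        # page_status indicates where each repair attempt stands.
        print(r.database_name, r.file_id, r.page_id, r.page_status)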
If it is logical corruption, where the wrong data is entered, this will push across to SECONDARY without any means of stopping it.
However, I should point out that your approach might leave you with other issues. Mirroring isn't backup. And mirroring isn't great over WAN links.
In synchronous mode, it receives the client request, then writes to PRIMARY, then writes to SECONDARY, gets the OK back from SECONDARY and then sends an OK back to the client. If it can't write to SECONDARY, or doesn't get the response from SECONDARY, it rolls back the operation on PRIMARY (even though it was successful) and sends a failure back to the client.
A failing WAN link (even temporarily) can cause PRIMARY to choose not to accept connections (because it can't see SECONDARY). A failover mid-connection can leave you in an invalid logical data state, so make sure your transactions are sound.
With a WITNESS server, this can be a little more robust -- placing the witness server alongside PRIMARY in the same LAN allows WITNESS and PRIMARY to form quorum and agree that PRIMARY is still working, even though it can't see SECONDARY (thus not locking you out of a perfectly functioning database).
Instead, over my slower site-to-site links, I prefer to use log shipping between PRIMARY and SECONDARY. With a bit of effort I can control the transport between sites so as to rate-limit over the WAN link, and it is possible to keep the log-shipped SECONDARY in a single-user standby mode. This allows me to run the standard DBCC CHECKDB commands against SECONDARY, as well as query SECONDARY for data reconciliation purposes. I can also put a delay on the restoration, so I have some leeway to fail over before a major logical data error reaches the SECONDARY (although that really depends on the RDO).
If I have a high-availability requirement, I might put in mirroring at the main site only -- i.e. two servers + witness. The relatively quick few-second automatic failover time provided by the witnessed environment has saved me a few late-night calls in the past.
Hope this helps.
J.