Is the communication between primary and active secondary secured, and how does it work? - sql

The Premium service tier of Azure SQL Database provides active geo-replication, through which up to 4 readable secondaries can be created. I want to know whether the communication between the primary and secondary databases is secure, and whether there is any chance of the data being intercepted in transit.

For more information: Azure SQL Database Inside#High Availability
First, a transaction is not considered to be committed unless the
primary replica and at least one secondary replica can confirm that
the transaction log records were successfully written to disk. Second,
if both a primary replica and a secondary replica must report success,
small failures that might not prevent a transaction from committing
but that might point to a growing problem can be detected.
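
The commit rule quoted above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the quote; the function name and its parameters are hypothetical and do not correspond to any Azure SQL API.

```python
from typing import List


def is_committed(primary_hardened: bool, secondaries_hardened: List[bool]) -> bool:
    """Hypothetical helper: apply the quoted commit rule.

    A transaction counts as committed only when the primary replica and
    at least one secondary replica have written the log records to disk.
    """
    return primary_hardened and any(secondaries_hardened)


# The primary and one of two secondaries confirmed the write: committed.
print(is_committed(True, [False, True]))
# No secondary confirmed the write: not committed.
print(is_committed(True, [False, False]))
```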

Related

What is the order of asynchronous replication in Active Geo-Replication feature of Azure SQL?

In SQL Azure, Active Geo-Replication asynchronously replicates committed transactions from the primary database to up to four secondary databases on different servers. Does the asynchronous replication happen independently for each of the four secondaries? Or is it chained, so that replication goes first from the primary to the first secondary, then from the first secondary to the second, from the second to the third, and finally from the third to the fourth? If the process is chained, does Azure SQL let the user identify the chain from the root secondary to the leaf-most secondary, in order to mitigate replication lag?
The replication is asynchronous to up to four secondaries. It is not chained.
That being said, you can create secondaries of a secondary. In that scenario, replication is chained to the secondaries of the secondary.
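
To make the difference between the two layouts concrete, here is a minimal Python sketch that models a topology as a map from each database to the database it replicates from. The database names and helper functions are made up for illustration; they are not part of any Azure tooling.

```python
from typing import Dict, List, Optional

# Hypothetical topology: each database maps to the one it replicates from.
REPLICATES_FROM: Dict[str, Optional[str]] = {
    "primary": None,
    "secondary-1": "primary",        # direct secondary (not chained)
    "secondary-2": "primary",        # direct secondary (not chained)
    "secondary-1a": "secondary-1",   # secondary of a secondary (chained)
}


def replication_path(db: str, topology: Dict[str, Optional[str]]) -> List[str]:
    """Return the chain of databases from the primary down to `db`."""
    path = [db]
    while topology[path[-1]] is not None:
        path.append(topology[path[-1]])
    return list(reversed(path))


def hops_from_primary(db: str, topology: Dict[str, Optional[str]]) -> int:
    """Number of asynchronous replication hops between the primary and `db`."""
    return len(replication_path(db, topology)) - 1


print(replication_path("secondary-1a", REPLICATES_FROM))
print(hops_from_primary("secondary-2", REPLICATES_FROM))
```

Direct secondaries are one hop from the primary; only a secondary of a secondary adds a further hop.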

Passive Replication in Distributed Systems - Replacing the Primary Server

In a passive replication based distributed system, if the primary server fails, one of the backups is promoted as primary. However, suppose that the original primary server recovers, then how do we switch back the primary server to it from the current backup?
My thinking is as follows: if the failed primary server recovers, it must be incorporated into the system as a secondary and brought up to date with the current state. To restore it as the primary, it can be promoted when the current primary (which was originally a backup) fails; otherwise, if required, the current primary can be blocked for a while, the original primary promoted again, and the blocked server reintroduced as a backup.
I could not find an answer to this question elsewhere, so this is only my own view. Please suggest any better alternatives.
It depends on what system you're looking at. Usually there's no immediate need to replace the backup when the original primary server recovers; if there is, you'd need to synchronize the two and promote the original primary.
Distributed synchronization (or consensus) is a hard problem. There's a lot of literature out there and I recommend that you read up. An example of a passively replicated system (with Leaders/Followers/Candidates) is Raft, which you could start with. A good online visualization can be found here, and the paper is here.
ZAB and Paxos are worth a read as well!
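
The failover and switch-back steps discussed in this thread can be sketched as a toy Python model. It ignores state synchronization and consensus entirely (the hard parts), and the class and method names are invented for illustration only.

```python
from typing import List


class ReplicaGroup:
    """Toy model of a passively replicated group: one primary plus backups."""

    def __init__(self, primary: str, backups: List[str]) -> None:
        self.primary = primary
        self.backups = list(backups)

    def fail_primary(self) -> str:
        """The primary crashed: promote the first backup, return the failed node."""
        failed = self.primary
        self.primary = self.backups.pop(0)
        return failed

    def rejoin_as_backup(self, node: str) -> None:
        """A recovered node re-enters as a backup (catch-up is not modelled)."""
        self.backups.append(node)

    def switch_over(self, node: str) -> None:
        """Planned switch: promote `node` and demote the current primary."""
        if node not in self.backups:
            raise ValueError(f"{node} is not a backup in this group")
        self.backups.remove(node)
        self.backups.append(self.primary)
        self.primary = node


group = ReplicaGroup("A", ["B", "C"])
failed = group.fail_primary()    # A fails, B becomes primary
group.rejoin_as_backup(failed)   # A recovers and rejoins as a backup
group.switch_over("A")           # optional: promote A again, B becomes a backup
print(group.primary, group.backups)
```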

Dirty Reads in SQL Server AlwaysOn

I have a pair of SQL Server 2014 databases set up as a synchronous AlwaysOn availability group.
Both servers are set to the Synchronous commit availability mode, with a session timeout of 50 seconds. The secondary is set to be a Read-intent only readable secondary.
If I write to the primary and then immediately read from the secondary (via ApplicationIntent=ReadOnly), I consistently read dirty data (i.e. the state before the write). If I wait for around a second between writing and reading, I get the correct data.
Is this expected behaviour? If so, is there something I can do to ensure that reads from the secondary are up-to-date?
I'd like to use the secondary as a read-only version of the primary (as well as a fail-over), to reduce the load on the primary.
There is no way you can get dirty reads unless you are using the NOLOCK hint.
When you enable read-only secondaries in AlwaysOn, SQL Server internally uses row versioning to store the previous version of each row.
Further, you are using synchronous commit mode, which ensures that log records are hardened on the secondary before the commit is acknowledged on the primary.
What you are seeing is data latency.
This white paper deals with this scenario. Below is the relevant part, which helps in understanding it:
The reporting workload running on the secondary replica will incur some data latency, typically a few seconds to minutes depending upon the primary workload and the network latency.
The data latency exists even if you have configured the secondary replica to synchronous mode. While it is true that a synchronous replica helps guarantee no data loss in ideal conditions (that is, RPO = 0) by hardening the transaction log records of a committed transaction before sending an ACK to the primary, it does not guarantee that the REDO thread on secondary replica has indeed applied the associated log records to database pages.
So there is some data latency. You may wonder if this data latency is more likely when you have configured the secondary replica in asynchronous mode. This is a more difficult question to answer. If the network between the primary replica and the secondary replica is not able to keep up with the transaction log traffic (that is, if there is not enough bandwidth), the asynchronous replica can fall further behind, leading to higher data latency.
In the case of a synchronous replica, insufficient network bandwidth does not cause higher data latency on the secondary, but it can slow down the transaction response time and throughput for the primary workload.
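
The gap between "log hardened" and "redo applied" described in the quoted passage can be illustrated with a small Python model. It is a deliberately simplified sketch: the class, its attributes, and the `balance` key are hypothetical and do not mirror SQL Server internals.

```python
from typing import Dict, List, Optional, Tuple


class SecondaryReplica:
    """Toy secondary: hardening the log and applying it are separate steps."""

    def __init__(self) -> None:
        self.hardened_log: List[Tuple[str, int]] = []  # log records on disk
        self.pages: Dict[str, int] = {}                # what readers can see
        self.redo_position = 0                         # next record to apply

    def harden(self, key: str, value: int) -> bool:
        """Write the log record and ACK; the data pages are not touched yet."""
        self.hardened_log.append((key, value))
        return True

    def redo(self) -> None:
        """Apply every hardened record that has not been applied so far."""
        for key, value in self.hardened_log[self.redo_position:]:
            self.pages[key] = value
        self.redo_position = len(self.hardened_log)

    def read(self, key: str) -> Optional[int]:
        return self.pages.get(key)


secondary = SecondaryReplica()
secondary.pages["balance"] = 100

acked = secondary.harden("balance", 150)  # primary may now report the commit
stale = secondary.read("balance")         # 100: redo has not run yet
secondary.redo()
fresh = secondary.read("balance")         # 150: redo has caught up
print(acked, stale, fresh)
```

The commit is acknowledged as soon as `harden` returns, so a read that arrives before `redo` runs sees the older value even though no data has been lost.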

How intelligent is SQL Server Mirroring?

While working on my current development product I have set up SQL Server mirroring between the primary data center and the secondary data center. In the primary data center the SQL .mdf and .ldf files are stored on the SAN.
Admittedly it should be very unlikely for us to lose the SAN, but suppose, for example, that the connection to the SAN was lost and the database integrity was lost with it. Would the mirroring still happen? I.e. would SQL Server now mirror the broken database, leaving both copies equally broken?
From googling, it's not clear when mirroring will and will not happen, so I was hoping that the community might be able to share some of their experiences.
I also have backup schedules set up, which would be a final fail-safe, but realistically I would hope that the mirrored database would be our quickest way to bring everything back online.
In this scenario there is at present no witness server in the mirroring process, although given the benefits of automatic failover I am thinking of adding one.
Thanks
As far as mirroring corruption between PRIMARY and SECONDARY goes: unfortunately, it depends. If the corruption is immediate and physical, then not normally -- the corruption is typically picked up by checks done at the end of the transaction and rolled back.
However, a database can exist in a corrupted state for some time before anything realises it is corrupted. If the underlying data pages are not touched, the engine never has cause to check them. So it is possible that underlying storage issues may mean that either database can become corrupted and you won't know until you attempt to access the affected pages. Traditionally, this would be a write operation, since your client connection will only read from the current active database (and not the partner).
This is why it is important to perform regular maintenance checks on your databases (e.g. DBCC CHECKDB). This becomes harder in a mirrored environment because only PRIMARY can typically be checked, so you really have to induce a manual failover to test your SECONDARY (unless you are running Enterprise, where you might be able to snapshot the mirror and check that -- I've not tried).
Starting with SQL Server 2008, the engine will attempt something called Automatic Page Repair, where it tries to automatically recover corrupted pages it encounters during the mirroring process. You should probably keep an eye on sys.dm_db_mirroring_auto_page_repair if this is something you are worried about.
If it is logical corruption, where the wrong data is entered, this will push across to SECONDARY without any means of stopping it.
However, I should point out that your approach might leave you with other issues. Mirroring isn't backup. And mirroring isn't great over WAN links.
In synchronous mode, it receives the client request, then writes to PRIMARY, then writes to SECONDARY, gets the OK back from SECONDARY and then sends an OK back to the client. If it can't write to SECONDARY, or doesn't get the response from SECONDARY, it rolls back the operation on PRIMARY (even though it was successful) and sends a failure back to the client.
A failing WAN link (even temporarily) can cause PRIMARY to choose not to accept connections (because it can't see SECONDARY). A failover mid-connection can leave you in an invalid logical data state, so make sure your transactions are sound.
With a WITNESS server, this can be a little more robust -- placing the witness server alongside PRIMARY in the same LAN allows WITNESS and PRIMARY to form quorum and agree that PRIMARY is still working, even though it can't see SECONDARY (thus not locking you out of a perfectly functioning database).
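
The quorum idea can be sketched as a simple majority check over the three members (PRIMARY, SECONDARY, WITNESS). This is a simplified model of the reasoning above, not SQL Server's actual implementation, and the function name is hypothetical.

```python
def has_quorum(visible_partners: int, total_members: int = 3) -> bool:
    """A member has quorum when it and the partners it can see form a majority.

    `visible_partners` counts the other members this server can reach;
    the server itself is added to the count.
    """
    return (visible_partners + 1) * 2 > total_members


# PRIMARY loses the WAN link to SECONDARY but still sees WITNESS on its LAN.
print(has_quorum(visible_partners=1))
# PRIMARY can see neither SECONDARY nor WITNESS.
print(has_quorum(visible_partners=0))
```

In this model PRIMARY keeps serving as long as it can reach at least one of the other two members, which is why placing WITNESS on the same LAN as PRIMARY protects against a WAN outage.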
Instead, over my slower site-to-site links, I prefer to use log shipping between PRIMARY and SECONDARY. With a bit of effort I can control the transport between sites so as to rate-limit over the WAN link, and it is possible to keep the log-shipped SECONDARY in a single-user standby mode. This allows me to run the standard DBCC CHECKDB commands against SECONDARY, and also to query the SECONDARY for data reconciliation purposes. I can also put a delay on the restoration, so I have some leeway to fail over before a major logical data error reaches the SECONDARY (although that really depends on the RDO).
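
The effect of a restore delay can be sketched as follows: only log backups older than the delay are eligible to be restored on SECONDARY, so a recent logical error has not been applied there yet. The file names, timestamps, and function are made up for illustration; real log shipping is configured through SQL Server, not through code like this.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def backups_ready_to_restore(
    backups: List[Tuple[str, datetime]],
    now: datetime,
    restore_delay: timedelta,
) -> List[str]:
    """Return the names of log backups taken at least `restore_delay` ago."""
    cutoff = now - restore_delay
    return [name for name, taken_at in backups if taken_at <= cutoff]


# Hypothetical log backups shipped from PRIMARY.
shipped = [
    ("log_0900.trn", datetime(2024, 1, 1, 9, 0)),
    ("log_0915.trn", datetime(2024, 1, 1, 9, 15)),
    ("log_0930.trn", datetime(2024, 1, 1, 9, 30)),
]

ready = backups_ready_to_restore(
    shipped,
    now=datetime(2024, 1, 1, 10, 0),
    restore_delay=timedelta(minutes=40),
)
print(ready)  # the 09:30 backup is still held back
```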
If I have a high-availability requirement, I might put in mirroring at the main site only -- i.e. two servers + witness. The relatively-quick few-second automatic failover time provided by the witnessed environment has saved me a few late-night calls, in the past.
Hope this helps.
J.

Database Replication or Mirroring?

What is the difference between Replication and Mirroring in SQL Server 2005?
In short, mirroring allows you to have a second server be a "hot" stand-by copy of the main server, ready to take over any moment the main server fails. So mirroring offers fail-over and reliability.
Replication, on the other hand, allows two or more servers to stay "in sync" - that means the secondary servers can answer queries and (depending on setup) actually change data (it will be merged in the sync). You can also use it for local caching, load balancing, etc.
Mirroring is a feature that creates a copy of your database at bit level. Basically you have the same, identical database in two places. You cannot optionally leave out parts of the database. You can have only one mirror, and the 'mirror' is always offline (it cannot be modified). Mirroring works by shipping the database log to the mirror as it is being created and applying (redoing) the log on the mirror. Mirroring is a technology for high availability and disaster recoverability.
Replication is a feature that allows 'slices' of a database to be replicated between several sites. The 'slice' can be a set of database objects (i.e. tables), but it can also contain parts of a table, such as only certain rows (horizontal slicing) or only certain columns. You can have multiple replicas, and the 'replicas' are available to query and can even be updated. Replication works by tracking/detecting changes (either by triggers or by scanning the log) and shipping the changes, as T-SQL statements, to the subscribers (replicas). Replication is a technology for making data available at off sites and for consolidating data to central sites. Although it is sometimes used for high availability or for disaster recoverability, that is an artificial use for a problem that mirroring and log shipping address better.
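
As a rough illustration of the 'slice' idea, the following Python sketch filters a table (a list of dictionaries) by rows and by columns. The table contents, column names, and function are hypothetical; actual replication filters are defined on the publication in SQL Server.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def slice_table(
    rows: List[Row],
    row_filter: Callable[[Row], bool],
    columns: List[str],
) -> List[Row]:
    """Keep only matching rows (horizontal slice) and listed columns (vertical slice)."""
    return [{col: row[col] for col in columns} for row in rows if row_filter(row)]


# Hypothetical publisher table.
customers = [
    {"id": 1, "name": "Ana", "region": "EU", "credit_card": "xxxx-1111"},
    {"id": 2, "name": "Bob", "region": "US", "credit_card": "xxxx-2222"},
    {"id": 3, "name": "Cy", "region": "EU", "credit_card": "xxxx-3333"},
]

# Replicate only EU customers, and leave the credit_card column out.
eu_slice = slice_table(customers, lambda r: r["region"] == "EU", ["id", "name"])
print(eu_slice)
```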
There are several types and flavours of replication (merge, transactional, peer-to-peer, etc.), and they differ in how they implement change tracking or update propagation. If you want to know more details, you should read the MSDN documentation on the subject.
Database mirroring is used to increase database uptime and reliability.
Replication is used primarily to distribute portions of your primary database -- the publisher -- to one or more subscriber databases. This is often done to make data available (typically read-only) on remote servers so that remote clients can access the data locally (to them) rather than directly from the publisher across a slower WAN connection. As the previous posts indicate, however, there are more complex scenarios where updates are permitted on the subscribers. Replication can also have the benefit of reducing the I/O load on the publisher.