Is Hazelcast async write transitive? - replication

I am doing some simple benchmarking with Hazelcast to see if it might fit our needs for a distributed data grid. The idea is to have an odd number of servers (e.g. 5) with more than n/2 replication (e.g. 3).
With all servers and the client running on my local machine (no network latency) I get the following results:
5 x H/C server (sync backup = 2, async backup = 0); 100 Client Threads : 35,196 puts/second
5 x H/C server (sync backup = 1, async backup = 1); 100 Client Threads : 41,918 puts/second
5 x H/C server (sync backup = 0, async backup = 2); 100 Client Threads : 52,007 puts/second
As expected, async backups allow higher throughput than sync backups. For our use case we would probably opt for the middle option (1x sync and 1x async), as this gives us an acceptable balance between resilience and performance.
My question is: if Hazelcast is configured with 1x sync and 1x async backup, and the node crashes after the sync backup is performed (the server returns 'OK' to the client and the client thread carries on) but before the async backup is performed (so the data is on only one replica rather than two), will the node that received the sync backup pick up the task of creating the async backup, or will it just wait until the entire cluster rebalances and the 'missing' data from the crashed node is redistributed from the remaining copies? And if the latter, once the cluster rebalances, will there be a total of 3 copies of the data, as there would have been if the node hadn't crashed, or only 2 copies because the synced node assumes that another node already received its copy?
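For reference, a map with one sync and one async backup is configured roughly like this in Java (a minimal sketch; the map name and server bootstrap are illustrative):
import com.hazelcast.config.Config;
import com.hazelcast.config.MapConfig;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class SyncAsyncBackupServer {
    public static void main(String[] args) {
        Config config = new Config();
        // 1 sync backup: put() returns only after this copy is confirmed.
        // 1 async backup: written in the background, off the put() path.
        config.addMapConfig(new MapConfig("benchmark") // map name is illustrative
                .setBackupCount(1)
                .setAsyncBackupCount(1));
        HazelcastInstance server = Hazelcast.newHazelcastInstance(config);
    }
}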

The partition owner is responsible for creating all backups.
In other words: the 1st backup does NOT create a new backup request for the 2nd backup - it is all the responsibility of the owner.
If a member holding a backup replica is stale, then the anti-entropy mechanism kicks in and the backup partition is updated to match the owner.
When a member goes down, the 1st (= sync) backup is eventually promoted to be the new partition owner. It is the responsibility of the new owner to make sure the configured redundancy is honoured: a new backup will be created so that there are again 2 backups, as configured.
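If you want to observe this from your own code, Hazelcast exposes a partition-lost listener; here is a minimal sketch, assuming a recent Hazelcast release where the listener API lives in com.hazelcast.partition:
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.partition.PartitionLostEvent;
import com.hazelcast.partition.PartitionLostListener;

public class BackupLossMonitor implements PartitionLostListener {
    @Override
    public void partitionLost(PartitionLostEvent event) {
        // getLostBackupCount() reports how many backup replicas were lost
        // together with the owner for this partition.
        System.out.println("Partition " + event.getPartitionId()
                + " lost " + event.getLostBackupCount() + " backup replica(s)");
    }

    public static void main(String[] args) {
        HazelcastInstance instance = Hazelcast.newHazelcastInstance();
        instance.getPartitionService().addPartitionLostListener(new BackupLossMonitor());
    }
}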

Azure: How to clone a SQL virtual machine

I am trying to clone a SQL VM to another resource group.
Cloning a normal VM is simple:
Create disk snapshots (OS & data disks)
Create disks from the snapshots
Create a VM from the managed disks
The image I have is 'Sql Server 2019 Standard on Windows Server 2022-Gen2'. Following the above steps creates only a plain VM, not a SQL virtual machine.
Please let me know if anyone knows the correct steps or any documentation.
Thanks in advance.
You can perform this activity through PowerShell. Please go through the steps below.
To create a snapshot using the Azure portal, complete these steps (you can skip this if you have already created a snapshot):
In the Azure portal, select Create a resource.
Search for and select Snapshot.
In the Snapshot window, select Create. The Create snapshot window appears.
For Resource group, select an existing resource group or enter the name of a new one.
Enter a Name, then select a Region and Snapshot type for the new snapshot. If you would like to store your snapshot in zone-resilient storage, you need to select a region that supports availability zones. For a list of supporting regions, see Azure regions with availability zones.
For Source subscription, select the subscription that contains the managed disk to be backed up.
For Source disk, select the managed disk to snapshot.
For Storage type, select Standard HDD, unless you require zone-redundant storage or high-performance storage for your snapshot.
If needed, configure settings on the Encryption, Networking, and Tags tabs. Otherwise, default settings are used for your snapshot.
Select Review + create.
Moving the snapshot of a SQL virtual machine to a different resource group in another subscription:
PowerShell:
#Provide the subscription Id of the subscription where snapshot exists
$sourceSubscriptionId='yourSourceSubscriptionId'
#Provide the name of your resource group where snapshot exists
$sourceResourceGroupName='yourResourceGroupName'
#Provide the name of the snapshot
$snapshotName='yourSnapshotName'
#Set the context to the subscription Id where snapshot exists
Select-AzSubscription -SubscriptionId $sourceSubscriptionId
#Get the source snapshot
$snapshot= Get-AzSnapshot -ResourceGroupName $sourceResourceGroupName -Name $snapshotName
#Provide the subscription Id of the subscription where snapshot will be copied to
#If snapshot is copied to the same subscription then you can skip this step
$targetSubscriptionId='yourTargetSubscriptionId'
#Name of the resource group where snapshot will be copied to
$targetResourceGroupName='yourTargetResourceGroupName'
#Set the context to the subscription Id where snapshot will be copied to
#If snapshot is copied to the same subscription then you can skip this step
Select-AzSubscription -SubscriptionId $targetSubscriptionId
#Store your snapshots in Standard storage to reduce cost. Please use Standard_ZRS in regions where zone redundant storage (ZRS) is available, otherwise use Standard_LRS
#Please check out the availability of ZRS here: https://docs.microsoft.com/en-us/Az.Storage/common/storage-redundancy-zrs#support-coverage-and-regional-availability
$snapshotConfig = New-AzSnapshotConfig -SourceResourceId $snapshot.Id -Location $snapshot.Location -CreateOption Copy -SkuName Standard_LRS
#Create a new snapshot in the target subscription and resource group
New-AzSnapshot -Snapshot $snapshotConfig -SnapshotName $snapshotName -ResourceGroupName $targetResourceGroupName

Ignite with backup count zero

I have set the backup count of my Ignite cache to zero. I have created two server nodes (say s1 and s2) and one client node (c1). I have set the cache mode to Partitioned and inserted data into the Ignite cache. I then stopped server 2 and tried to access the data, and some of it is no longer available. If the backup count is 0, how can data be copied from one server node to another? Does Ignite do this automatically when a node is stopped?
The way Ignite manages this is with backups. If you set the backup count to zero, you have no resilience, and removing a node will result in data loss (unless you enable persistence). You can configure how Ignite responds to this situation with the Partition Loss Policy.
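A minimal Java sketch of a partitioned cache with one backup and an explicit partition loss policy (the cache name and policy choice are illustrative):
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.PartitionLossPolicy;
import org.apache.ignite.configuration.CacheConfiguration;

public class IgniteBackupExample {
    public static void main(String[] args) {
        CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("myCache");
        cacheCfg.setCacheMode(CacheMode.PARTITIONED);
        // One backup per partition, so losing a single node loses no data.
        cacheCfg.setBackups(1);
        // Fail operations on lost partitions instead of silently returning nulls.
        cacheCfg.setPartitionLossPolicy(PartitionLossPolicy.READ_WRITE_SAFE);

        Ignite ignite = Ignition.start();
        ignite.getOrCreateCache(cacheCfg);
    }
}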

Database copy limit per database reached. The database X cannot have more than 10 concurrent database copies (Azure SQL)

In our application, we have a master database 'X'. For each new client, we will create a new database copy of master database 'X'.
I am using the following SQL command which will be executed against Azure SQL server.
CREATE DATABASE [NEW NAME] AS COPY OF [MASTER DB]
We are using a custom queue tier so that we can create more than one client at a time, in parallel.
I am facing issues in the following scenario:
I am trying to create 70 clients. Once 25 clients have been created, I get the error below.
Database copy limit per database reached. The database 'BlankDBClient' cannot have more than 10 concurrent database copies
Can you please share your thoughts on this?
SQL Azure has logic to do various operations online/automatically for you (backups, upgrades, etc.). There is I/O required to do each copy, so there are limits in place because the machine does not have infinite IOPS. (Those limits may change a bit over time as we work to improve the service, get newer hardware, etc.)
In terms of what options you have, you could:
Restore N databases from a database backup (which would still have I/O limits, but they may be higher for you depending on your reservation size)
Consider models to copy in parallel using a single source to hierarchically create what you need (copy 2 from one, then copy 2 from each of the ones you just copied, etc.), as sketched below
Stage out the copies over time based on the limits you get back from the system
Try a larger reservation size for the source and target during the copy to get more IOPS and lower the time to perform the operations
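A rough Java/JDBC sketch of the hierarchical fan-out idea from the second option above (the connection string, database names, and batch size are all illustrative; a real script would also poll for copy completion before reusing a database as a source):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.ArrayDeque;
import java.util.Queue;

public class HierarchicalDbCopy {
    // Illustrative connection string; server and credentials are placeholders.
    private static final String URL = "jdbc:sqlserver://yourserver.database.windows.net:1433;"
            + "databaseName=master;user=youradmin;password=yourpassword";

    public static void main(String[] args) throws Exception {
        int totalCopiesNeeded = 70;
        Queue<String> sources = new ArrayDeque<>();
        sources.add("BlankDBClient"); // the master template database
        int created = 0;

        try (Connection conn = DriverManager.getConnection(URL);
             Statement stmt = conn.createStatement()) {
            while (created < totalCopiesNeeded) {
                String source = sources.remove();
                // Issue a small batch per source to stay well under the
                // 10-concurrent-copies-per-source limit.
                for (int i = 0; i < 2 && created < totalCopiesNeeded; i++) {
                    String target = "Client" + (++created);
                    stmt.execute("CREATE DATABASE [" + target + "] AS COPY OF [" + source + "]");
                    // NOTE: the copy runs asynchronously. In practice, poll
                    // sys.dm_database_copies until it completes before using
                    // the new database as a source itself.
                    sources.add(target);
                }
                sources.add(source); // the source can be reused in a later round
            }
        }
    }
}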
In addition to Connor's answer, you can consider having a dacpac or bacpac of that master database stored on Azure Storage, and once you have submitted 25 concurrent database copies you can start restoring the dacpac from Azure Storage.
You can also monitor how many database copies are showing COPYING in the state_desc column of the following queries. After sending the first batch of 25 copies, wait until the queries return fewer than 25 rows, then send more copies until you reach the 25 limit again. Keep doing this until the queue of required copies is finished.
SELECT d.[name],
       d.[state_desc],
       c.[start_date],
       c.[modify_date],
       c.[percent_complete],
       c.[error_code],
       c.[error_desc],
       c.[error_severity],
       c.[error_state]
FROM [sys].[databases] AS d
LEFT OUTER JOIN [sys].[dm_database_copies] AS c
    ON d.[database_id] = c.[database_id]
WHERE d.[state_desc] = 'COPYING'

SELECT state_desc, *
FROM sys.databases
WHERE [state_desc] = 'COPYING'

CEPH Write Acknowledgement in case a replica node is down

During a Ceph write operation (a standard PUT), if the data node that holds the partition (based on the hash) is found dead, does the coordinator node still send a SUCCESS ACK back to the client for the write operation?
So the question is: if one of the 3 replica nodes is found unhealthy, is the WRITE operation ACKed as a failure?
It seems the write acknowledgement will fail if a replica node is down and the replication factor is greater than 1 (for example, 2).
Data management begins with clients writing data to pools. When a client writes data to a Ceph pool, the data is sent to the primary OSD. The primary OSD commits the data locally and sends an immediate acknowledgement to the client if the replication factor is 1. If the replication factor is greater than 1 (as it should be in any serious deployment), the primary OSD issues write subops to each subsidiary (secondary, tertiary, etc.) OSD and awaits their responses. Since there is always exactly one primary OSD, the number of subsidiary OSDs is the replication size - 1. Once all responses have arrived, it sends an acknowledgement (or failure, depending on the outcome) back to the client.
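As a toy model of that ack flow in Java (not actual Ceph code, just the logic described above):
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class PrimaryOsdModel {

    interface Osd {
        CompletableFuture<Boolean> writeSubop(byte[] data);
    }

    // The primary commits locally, then waits for every subsidiary OSD;
    // the client is ACKed only after all responses arrive successfully.
    static CompletableFuture<Boolean> write(byte[] data, Osd primary, List<Osd> subsidiaries) {
        return primary.writeSubop(data).thenCompose(localOk -> {
            if (!localOk) {
                return CompletableFuture.completedFuture(false);
            }
            // replication size - 1 subops, one per subsidiary OSD
            CompletableFuture<Boolean> allOk = CompletableFuture.completedFuture(true);
            for (Osd osd : subsidiaries) {
                CompletableFuture<Boolean> subop = osd.writeSubop(data);
                allOk = allOk.thenCombine(subop, (a, b) -> a && b);
            }
            return allOk;
        });
    }
}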

data sync azure, first synchronization large database

I'm trying to sync a local database to an Azure database, but since there is a lot of data the first sync takes a long time. I have already loaded the data into my Azure database through the migration program. Even so, the first synchronization transfers all the data again. How can I make it sync only the changes?
Scenario:
local database
Azure database (copy of local database 1)
Azure should not synchronize all data on the first synchronization
Azure Data Sync currently does not support this scenario. When two databases are put in a sync group, they will be synchronized until every row has been resolved.
Data Sync won't know the two databases are identical until it compares the data row by row. This is a very costly process and may take a long time if you have large databases/tables. Our recommendation is to have data on only one side and keep the same tables empty in the other databases. In that case, Data Sync will use bulk load during initialization, which is much faster than row-by-row comparison.
Hope this helps.
I found a kludgy way to skip the initial sync.
First, set up and start the sync once and wait for the provisioning to complete. Then stop the sync.
Then execute
update [YOURDB].[DataSync].[provision_marker_dss]
set [state] = 0
where object_id = object_id('YourSyncTable')
Then start the sync again. It will complete almost instantly.
Then execute
update [YOURDB].[DataSync].[provision_marker_dss]
set [state] = 1
where object_id = object_id('YourSyncTable')
And then you should be good to go.