Azure SQL Pool: what is it really, and can it be used for a Postgres database? - azure-sql-database

I have a question regarding SQL Pool. I am not sure I understood what it is. Is the SQL Pool service only for SQL Server type databases? I have a Postgres database and am considering moving it to Azure. Is there any use for the SQL Pool service with Azure Postgres, or is it only for Azure SQL Server databases? Lastly: is SQL Pool also used by Synapse ETL?

Azure SQL Pool is used with Azure Synapse Analytics to query Big Data. You can consider it as a Data Warehouse. Once your dedicated SQL pool is created, you can import big data with simple PolyBase T-SQL queries, and then use the power of the distributed query engine to run high-performance analytics.
How does SQL Pool work? In a cloud data solution, data is ingested into big data stores from a variety of sources. Once in a big data store, Hadoop, Spark, and machine learning algorithms prepare and train the data. When the data is ready for complex analysis, dedicated SQL pool uses PolyBase to query the big data stores. PolyBase uses standard T-SQL queries to bring the data into dedicated SQL pool tables.
No, PostgreSQL can't be used in SQL Pool. There is actually no link between these two services. If you want to migrate an on-premises PostgreSQL database to Azure, you can use Azure Database for PostgreSQL. See "Tutorial: Migrate PostgreSQL to Azure DB for PostgreSQL online using DMS via the Azure CLI".

Related

How to increase performance on Azure inbuilt SQL Serverless Pool in Synapse

We are currently extracting multiple tables from the Azure serverless SQL pool in Synapse. With a regular Azure SQL Database, it is very easy to increase performance by moving from Basic all the way up to Premium or Business Critical.
Can someone let me know how to go about increasing the performance of the Azure serverless SQL pool in Synapse?
Serverless SQL pool is a distributed data processing system and it doesn't have any built-in storage for data. It uses external tables to query the data in Azure Data Lake Storage (ADLS). Therefore, data cannot be copied to the serverless SQL pool. If data needs to be extracted from serverless SQL pool, you can extract it directly from the underlying external storage. If the target datastore supports PolyBase data loading, use that to load into the target table from ADLS.
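As a rough illustration of querying the files in place, the sketch below builds the kind of OPENROWSET query a serverless SQL pool can run against Parquet files in the data lake. The helper name `build_openrowset_query` and the storage URL are hypothetical and only there for illustration; the Python code just assembles the T-SQL text and does not connect to anything.

```python
from typing import Optional


def build_openrowset_query(storage_url: str, top: Optional[int] = None) -> str:
    """Return T-SQL text that reads Parquet files in place via OPENROWSET.

    storage_url is an assumption: any path the serverless pool can reach.
    """
    if top is not None and top <= 0:
        raise ValueError("top must be a positive integer")
    # Double any single quotes so the URL stays a valid T-SQL string literal.
    safe_url = storage_url.replace("'", "''")
    top_clause = f"TOP {top} " if top is not None else ""
    return (
        f"SELECT {top_clause}*\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{safe_url}',\n"
        "    FORMAT = 'PARQUET'\n"
        ") AS rows;"
    )


if __name__ == "__main__":
    # Hypothetical storage account and folder, for illustration only.
    print(build_openrowset_query(
        "https://example.dfs.core.windows.net/data/sales/*.parquet", top=10
    ))
```

Because nothing is stored in the pool itself, speeding this up is mostly a matter of how the underlying files are laid out rather than of picking a bigger tier.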

Where is data physically stored in Azure Synapse Dedicated SQL Pool?

Documentation from Microsoft and others strongly emphasizes the separation between storage and compute in Azure Synapse Analytics.
In the case of a Serverless SQL pool, it is clearly explained that the data is stored in Azure Data Lake Storage (ADLS) Gen2.
However, in the case of a Dedicated SQL Pool, the documentation is not explicit enough on data storage.
In a book that deals with Azure Synapse, it is stated that in the case of Dedicated SQL Pool, data is stored in Storage Nodes which are completely separate from Compute Nodes.
Since this claim is not in Microsoft's documentation, I dare not trust it.
So, is there an official resource that sheds light on this question?
This is a question that has been on my mind for a long time as well. However, I have come to the conclusion that data is actually stored in Dedicated SQL Pools.
Let me explain why I believe this.
Take a look at the documentation given here,
https://learn.microsoft.com/en-us/azure/synapse-analytics/quickstart-copy-activity-load-sql-pool
Notice that it is about loading data into a Dedicated SQL Pool. Further, to quote part of the documentation,
A dedicated SQL pool offers T-SQL based compute and storage
capabilities. After creating a dedicated SQL pool in your Synapse
workspace, data can be loaded, modeled, processed, and delivered for
faster analytic insight.
It is said that Dedicated SQL Pools provide both compute and storage capabilities.
Furthermore, with Dedicated SQL Pools, you may already know that it is possible to create traditional tables. We can organize these tables into something along the lines of a star or snowflake schema to model our data warehouses.
Creation of such tables, however, is not possible with Serverless SQL Pools. Only the creation of metadata objects, i.e. views or external tables, is allowed. This is explained here,
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/on-demand-workspace-overview
To quote the relevant passage of the article,
Serverless SQL pool has no local storage, only metadata objects are
stored in databases. Therefore, T-SQL related to the following
concepts isn't supported:
Tables
Triggers
Materialized views
DDL statements other than ones related to views and security
DML statements
To me, the fact that tables can actually be created in Dedicated SQL Pools is further proof that the data is physically stored in them.
My final argument is around the idea of distributions. The concept is explained here,
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/massively-parallel-processing-mpp-architecture
This talks about how data is divided up among the compute nodes and how queries are executed in parallel on the distributions in these nodes. It would not be possible to implement this if the data was not actually stored in these nodes.
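To make the idea of distributions a little more concrete, here is a small Python sketch. It assumes only what the linked article states, namely that a dedicated SQL pool splits a table into 60 distributions that are spread over the compute nodes. The hash function and the distribution-to-node mapping below are stand-ins chosen for illustration, not the ones the service uses, and the sketch says nothing about where the data physically lives.

```python
import hashlib
from collections import Counter

NUM_DISTRIBUTIONS = 60  # a dedicated SQL pool always uses 60 distributions


def distribution_for(key) -> int:
    """Map a distribution-column value to one of the 60 distributions.

    SHA-256 is an assumption; the real hash function is internal to the service.
    """
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_DISTRIBUTIONS


def node_for(distribution: int, compute_nodes: int) -> int:
    """Assign a distribution to a compute node (illustrative modulo mapping)."""
    if not 1 <= compute_nodes <= NUM_DISTRIBUTIONS:
        raise ValueError("compute_nodes must be between 1 and 60")
    return distribution % compute_nodes


def rows_per_node(keys, compute_nodes: int) -> dict:
    """Count how many rows each compute node would work on for the given keys."""
    counts = Counter(node_for(distribution_for(k), compute_nodes) for k in keys)
    return dict(counts)


if __name__ == "__main__":
    # 6000 hypothetical customer ids spread over 4 compute nodes.
    print(rows_per_node(range(6000), compute_nodes=4))
```

The same key always lands in the same distribution, which is what lets each node run its part of a query in parallel on its own slice of the table.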
In my humble opinion, Azure Storage comes into the picture (at least when it comes to Dedicated SQL Pools) as the place where data is stored as files in a data lake before being ingested into the pool for analysis.
An explanation can be found here,
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/overview-architecture
Yet another quote,
Serverless SQL pool allows you to query your data lake files, while
dedicated SQL pool allows you to query and ingest data from your data
lake files. When data is ingested into dedicated SQL pool, the data is
sharded into distributions to optimize the performance of the system.
This is where PolyBase comes into play. You can define various data loading patterns (into Dedicated SQL Pools) using PolyBase as explained here,
https://learn.microsoft.com/en-us/azure/synapse-analytics/sql/load-data-overview
The Microsoft documentation on Design tables using dedicated SQL pool in Azure Synapse Analytics, found at https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview, states the following:
Table persistence: Tables store data either permanently in Azure
Storage, temporarily in Azure Storage, or in a data store external to
dedicated SQL pool.
Regular table: A regular table stores data in Azure Storage as part of
dedicated SQL pool...

Can I use an Azure SQL Database as the source for a replication publication?

Due to reasons (I've been told it's a networking issue with Managed Instances (MIs); regardless, we can't fix it, and we're waiting on a solution from MS that may or may not come out this year), we cannot talk from on-prem to MIs. However, we can reach Azure SQL Databases (ASDs).
We would like to replicate lookup data from on-prem to MIs as well as ASDs. Is there any way to use an ASD as a "jump" box for replication, maybe by putting the Distributor on an MI that can talk to the ASD?
Looked at Azure Data Sync, but the 5-minute-minimum makes it a no-go.
Otherwise, our current fallback is to run an Azure VM/AKS instance, replicate to it, then from there to the ASDs/MIs. But man, I'd rather not have to do that.
Any suggestions appreciated.
One-way transactional replication using SQL Data Sync for Azure.
If they wish to keep replication running after the migration to Managed Instances, transactional replication will be the best option at this time. See: Replication to Azure SQL Database.
Or use ETL via Azure Data Factory:
Transfer data from a SQL Server database to an Azure SQL Database using Azure Blob Storage and Azure Data Factory (ADF): this is a supported legacy technique that benefits from a replicated staging copy.
The ADF pipeline consists of two data migration processes that work together to transfer data between a SQL Server database and an Azure SQL Database on a regular basis. The two actions are as follows:
Data should be copied from a SQL Server database to an Azure Blob Storage account

Azure Elastic Pool with Azure SQL Databases and MySQL databases

Fast question:
Is it possible to have an Elastic Pool in Azure containing both Azure SQL Databases and MySQL databases?
Or, alternatively, an Elastic Pool made up of Managed Instances and MySQL databases?
Thank you, Francesco Mantovani. For now, let's post the answer that we have. Once you have your article ready, you can still post it as an additional answer here.
===
Cut from https://dba.stackexchange.com/questions/279553/azure-elastic-pool-is-it-supported-for-mysql
Azure DB for MySQL is similar to an Azure SQL DB elastic pool or an Azure SQL DB managed instance.
With Azure DB for MySQL server, we can create one or multiple DBs. We can:
Create a single DB per server to use all the resources, or
Create multiple databases to share the resources.
The pricing is structured per-server, based on the configuration of pricing tier, vCores, and storage (GB).
Reference : https://learn.microsoft.com/en-us/azure/mysql/concepts-servers
Similarly in Azure SQL DB Elastic Pool
Azure SQL DB elastic pools are a cost-effective solution for managing
and scaling multiple databases that have varying and unpredictable
usage demands. The DBs in an elastic pool are on a single server and
share a set number of resources at a set price. Elastic pools in Azure
SQL DB enable SaaS developers to optimize the price performance for a
group of databases within a prescribed budget while delivering
performance elasticity for each database.
Reference : https://learn.microsoft.com/en-us/azure/azure-sql/database/elastic-pool-overview
Only Azure SQL DB has the feature to have multiple databases with separate physical resources in the same logical server.
In Azure DB for MySQL, if you wish to have two DBs with their own dedicated resources, you need to have two separate Azure DB for MySQL servers.
I totally agree with MadhurajVadde-MT:
In Azure DBs for MySQL, If you wish to have two DBs with their own
dedicated resources, you need to have two separate Azure DBs for MySQL
Servers.
It might sound ridiculous but all Azure OSS servers are made to store several databases by default:
https://www.jeeja.biz/2021/08/26/lets-get-confused-azure-database-for-mysql-mariadb-postgresql-part-2/
You kinda have Elastic Pool by default.

Migrate on-prem SQL Server database to Azure SQL database

We're in the process of a server migration from an on-prem server (Win2008R2) to Azure PaaS.
To move the DBs, we used the Microsoft Data Migration Assistant (DMA) tool, which worked great and we can connect to the migrated Azure DB via SQL Server Management Studio.
Considering:
Made quite a few changes to the migrated Azure DB (tables, stored procedures, indexes) to work with the apps in Azure
Combined multiple on-prem DBs into one DB in Azure via DMA to save costs
On-prem DB is continually being modified by insert/update operations (multiple tables) during the migration process
Question: what is the best and fastest way to migrate data (all vs missing/updated) considering the above?
I would recommend first migrating only the schema of your on-premises databases to Azure SQL Database, and then letting Azure SQL Data Sync migrate the data to Azure and keep it updated on the Azure SQL Database.
I suggest starting with an empty schema on the Azure SQL Database side because, when SQL Data Sync finds data both on-premises and on Azure, it starts comparing both databases, and that consumes a lot of resources.
On the initial sync, SQL Data Sync may consume a lot of resources on the on-premises database server even with an empty schema on the Azure side. For that, you can use SQL Server Resource Governor to cap the CPU used by the Data Sync sessions on your on-premises SQL Server, and so avoid a big performance impact on database users.
When you are ready, you can switch your users (gradually or not, if SQL Data Sync is in bi-directional mode) to Azure. Once your users have been migrated, you can remove the member database (the on-premises database) from the SQL Data Sync configuration and stop the SQL Data Sync operation.
I disagree with all the answers here.
If you are running on Win2008R2, there is a high chance that you are on an old SQL Server (2008? 2012?), both of which are deprecated and unsuitable for Azure SQL Database. The application is probably also old and not suited to the cloud in general. I suggest a good testing phase.
Here is my to-do list:
Upgrade SQL Server to SQL Server 2016 on-prem and test whether all your queries still run correctly.
Test how ready your SQL Server is to go to Azure SQL Database, using the Microsoft Data Migration Assistant (DMA) tool or the new Azure SQL Migration extension for Azure Data Studio (which came out this month).
Don't even think for a second that merging databases will reduce your overall costs. Decide between multi-tenant and single-tenant for reasons other than the price of the database.
Plan for hours of downtime based on the size of the migration. Don't migrate while your database is being modified. Expect downtime. The best way is to take a backup from the day before and then resume the logs.
And test like crazy. This is not going to be easy because the app is old.
Good luck.
Visual Studio also has a great tool for comparing both schema and data between two databases on different servers.
It can then update the target database with any changes after which you can switch over to use the Azure DB.
This method would require downtime of around 5-30 minutes depending on the amount of data, but that might be acceptable depending on your requirements.
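The idea behind a data compare, and behind the "all vs missing/updated" part of the question, can be sketched in a few lines of Python. This is not what the Visual Studio tool does internally; it is a minimal illustration that assumes both tables fit in memory as dictionaries keyed by primary key, and the function name `diff_rows` is made up for the example.

```python
def diff_rows(source: dict, target: dict):
    """Compare two tables held as {primary_key: row} dictionaries.

    Returns (missing, updated, extra):
      missing - rows present in source but absent from target
      updated - rows whose key exists in both but whose values differ
                (the source version is returned)
      extra   - keys present only in target
    """
    missing = {k: v for k, v in source.items() if k not in target}
    updated = {k: v for k, v in source.items() if k in target and target[k] != v}
    extra = [k for k in target if k not in source]
    return missing, updated, extra


if __name__ == "__main__":
    # Hypothetical rows: on-prem (source) vs the migrated Azure DB (target).
    on_prem = {1: ("alice", 10), 2: ("bob", 20), 3: ("carol", 30)}
    azure = {1: ("alice", 10), 2: ("bob", 25)}
    missing, updated, extra = diff_rows(on_prem, azure)
    print("insert:", missing)
    print("update:", updated)
    print("only in target:", extra)
```

Only the missing and updated rows need to be written to the target, which is why a compare-based cutover can be much shorter than copying everything again.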