Azure SQL Database data archive solution

We have an Azure SQL database, and we need to build a data archival solution to manage data that is more than two years old.
Requirements:
Archive and delete data older than two years from certain transaction tables.
Archive the data to low-cost storage.
Be able to quickly restore the data if required.
Run as a recurring job that executes every week.
Please recommend an Azure solution to achieve this.

Here are the approaches you can try:
If you want to archive the complete database, you can export a BACPAC directly from the Azure portal. The BACPAC file will be stored in an existing Azure Storage account. Once done, you can delete the data from the database. Refer to Export to a BACPAC file - Azure SQL Database.
If you only need to archive the data older than two years, you can create a stored procedure for each affected table in your database and run it using the Stored Procedure activity in Azure Data Factory; a minimal sketch follows.
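A minimal sketch of such a procedure, assuming a hypothetical dbo.Orders transaction table with an OrderDate column and a matching dbo.Orders_Archive table as the archive target:

```sql
-- Sketch only: dbo.Orders, OrderDate and dbo.Orders_Archive are
-- illustrative names; adapt the procedure per table.
CREATE OR ALTER PROCEDURE dbo.usp_ArchiveOldOrders
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @Cutoff date = DATEADD(year, -2, GETUTCDATE());

    -- Delete in batches, capturing the deleted rows into the archive
    -- table so each transaction (and the log growth) stays small.
    WHILE 1 = 1
    BEGIN
        DELETE TOP (10000) FROM dbo.Orders
        OUTPUT deleted.* INTO dbo.Orders_Archive
        WHERE OrderDate < @Cutoff;

        IF @@ROWCOUNT = 0 BREAK;
    END;
END;
```

An ADF Copy activity can then move the contents of the archive table to Blob Storage before the archive table is truncated.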
The cheapest (free) option for storing archived data is to create a BACPAC file and keep it on a local machine. Alternatively, you can use the Blob Storage cool or archive access tiers for the archived data.
To restore data from a BACPAC, simply import the BACPAC file into your database. To restore from Blob Storage, change the blob's access tier from cool back to hot and use ADF to copy from that file into the destination database.
If you are using the Stored Procedure activity in ADF as mentioned in the second point, you can trigger the ADF pipeline to run your SP weekly, monthly, or on whatever schedule your requirement calls for. Refer to Schedule Trigger in ADF.

Related

Excel into Azure Data Factory into SQL

I read a few threads on this but noticed most are outdated, since Excel became a supported integration in 2020.
I have a few Excel files stored in Dropbox. I would like to automate the extraction of that data into Azure Data Factory, perform some ETL functions with data coming from other sources, and finally push the final, complete table to Azure SQL.
What is the most efficient way of doing this?
Would it be to automate a Logic App to extract the .xlsx files into Azure Blob Storage, use Data Factory for ETL, join with other SQL tables, and finally push the final table to Azure SQL?
Appreciate it!
Before using a Logic App to extract the Excel files, review the known issues and limitations of the Excel connectors.
If you are importing large files using a Logic App, also consider this thread, depending on the size of the files you are importing: logic apps vs azure functions for large files.
To summarize the approach, here are the steps:
Step 1: Use an Azure Logic App to upload the Excel files from Dropbox to Blob Storage.
Step 2: Create a Data Factory pipeline with a Copy Data activity.
Step 3: Use the Blob Storage file as the source dataset.
Step 4: Create the SQL database table with the required schema (see the sketch after these steps).
Step 5: Do the schema mapping.
Step 6: Finally, use the SQL database table as the sink.
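For Step 4, a minimal sketch of a sink table, with hypothetical columns; adjust the names and types to match your actual workbook:

```sql
-- Hypothetical schema for the Excel extract; illustrative only.
CREATE TABLE dbo.ExcelSales
(
    SaleId   int            NOT NULL PRIMARY KEY,
    SaleDate date           NOT NULL,
    Region   nvarchar(50)   NULL,
    Amount   decimal(18, 2) NULL
);
```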

Options for ingesting and processing data in Azure SQL

I need an expert opinion on a project I am working on. We currently get data files that we load into our Azure SQL database using a local script that calls stored procedures. I am planning on replacing the script with SSIS jobs to load the data into Azure SQL, but I am wondering if that's a good option given our needs. I am open to different suggestions too. The process we go through is to load the data files into staging tables and validate them before making updates to the live tables. The validation and updates are done by calling stored procedures, so the SSIS package will just load the data and make calls to those stored procedures. I have looked at ADF IR and Databricks, but they seem overkill; still, I am open to hearing from people with experience using those as well. I am currently running the SSIS package locally as well. Any suggestion on a better architecture or tools for this scenario? Thanks!
I would definitely have a look at Azure Data Factory Data Flows. With these you can easily build your ETL pipelines in the Azure Data Factory GUI.
For example, two text files from Blob Storage are read and joined, a surrogate key is added, and finally the data is loaded into Azure Synapse Analytics (it would be the same for Azure SQL).
You finally put this Mapping Data Flow into a pipeline and can trigger it, e.g. when new data arrives.
You can also just BULK INSERT data from Azure Blob Storage (a minimal sketch follows the link):
https://learn.microsoft.com/en-us/sql/relational-databases/import-export/examples-of-bulk-access-to-data-in-azure-blob-storage?view=sql-server-ver15#accessing-data-in-a-csv-file-referencing-an-azure-blob-storage-location
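A minimal sketch of that pattern, with illustrative names for the credential, external data source, file path, and staging table:

```sql
-- All object names and the SAS token are placeholders.
CREATE DATABASE SCOPED CREDENTIAL BlobSasCredential
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
     SECRET = '<sas-token-without-leading-?>';

CREATE EXTERNAL DATA SOURCE StagingBlobStore
WITH (TYPE = BLOB_STORAGE,
      LOCATION = 'https://<account>.blob.core.windows.net/<container>',
      CREDENTIAL = BlobSasCredential);

-- Load a CSV straight from Blob Storage into a staging table; your
-- existing stored procedures can then validate and update live tables.
BULK INSERT dbo.StagingSales
FROM 'incoming/sales.csv'
WITH (DATA_SOURCE = 'StagingBlobStore',
      FORMAT = 'CSV',
      FIRSTROW = 2);
```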
Then you can use ADF (no SSIS IR needed), Databricks, Azure Batch, or Azure Elastic Jobs to schedule the execution.
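For the Elastic Jobs route, a hedged sketch of a weekly schedule, assuming a job agent and a target group named SalesDbGroup already exist and that dbo.usp_LoadStagedData is your (hypothetical) load procedure:

```sql
-- Run in the job database; all names and the schedule are illustrative.
EXEC jobs.sp_add_job
     @job_name = 'WeeklyLoad',
     @description = 'Run the load procedure once a week',
     @schedule_interval_type = 'Weeks',
     @schedule_interval_count = 1,
     @enabled = 1;

EXEC jobs.sp_add_jobstep
     @job_name = 'WeeklyLoad',
     @step_name = 'RunLoadProc',
     @command = N'EXEC dbo.usp_LoadStagedData;',  -- hypothetical procedure
     @target_group_name = 'SalesDbGroup';
```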

U-SQL: direct output to SQL DB

Is there a way to output U-SQL results directly to a SQL DB such as Azure SQL DB? Couldn't find much about that.
Thanks!
U-SQL currently only outputs to files or internal tables (i.e. tables within ADLA databases), but you have a couple of options. Azure SQL Database has recently gained the ability to load files from Azure Blob Storage using either BULK INSERT or OPENROWSET, so you could try that. This article shows the syntax and gives a reminder that:
Azure Blob storage containers with public blobs or public containers access permissions are not currently supported.
wasb://<BlobContainerName>@<StorageAccountName>.blob.core.windows.net/yourFolder/yourFile.txt
BULK INSERT and OPENROWSET with Azure Blob Storage are shown here:
https://blogs.msdn.microsoft.com/sqlserverstorageengine/2017/02/23/loading-files-from-azure-blob-storage-into-azure-sql-database/
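A minimal sketch of the OPENROWSET variant, assuming an external data source like the one in the blog post above and a hypothetical format file stored alongside the data:

```sql
-- MyAzureBlobStorage, the file path, and the format file are illustrative.
SELECT *
FROM OPENROWSET(
    BULK 'yourFolder/yourFile.txt',
    DATA_SOURCE = 'MyAzureBlobStorage',
    FORMAT = 'CSV',
    FORMATFILE = 'yourFolder/yourFile.fmt',
    FORMATFILE_DATA_SOURCE = 'MyAzureBlobStorage') AS rows;
```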
You could also use Azure Data Factory (ADF). Its Copy Activity can load the data from Azure Data Lake Storage (ADLS) into an Azure SQL Database in two steps:
execute a U-SQL script which creates output files in ADLS (internal tables are not currently supported as a source in ADF)
move the data from ADLS to Azure SQL Database
As a final option, if your data is likely to get into larger volumes (i.e. terabytes), then you could use Azure SQL Data Warehouse, which supports PolyBase. PolyBase now supports both Azure Blob Storage and ADLS as a source; a sketch of declaring such a source follows.
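A hedged sketch of how such a PolyBase source could be declared in Azure SQL Data Warehouse, with placeholder names (the credential is assumed to exist already):

```sql
-- Placeholder names throughout; adjust the LOCATION for ADLS if needed.
CREATE EXTERNAL DATA SOURCE AzureBlobStore
WITH (TYPE = HADOOP,
      LOCATION = 'wasbs://<container>@<account>.blob.core.windows.net',
      CREDENTIAL = StorageCredential);

CREATE EXTERNAL FILE FORMAT CsvFileFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ',', FIRST_ROW = 2));
```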
Perhaps if you can tell us a bit more about your process we can refine which of these options is most suitable for you.

Existing SSIS Package conversion to point to Azure SQL Data Warehouse

Migration of on-premises SSIS packages to Azure SQL Data Warehouse.
Can someone suggest references or ideas/steps involved in modifying existing SSIS packages that load an on-premises SQL data warehouse so that they populate an Azure SQL Data Warehouse?
Is this possible?
Regards,
KK
If you would like to just use your existing SSIS package without changing much, it can be as simple as re-configuring the OLE DB destination to connect to the Azure SQL Data Warehouse endpoint.
But then, the right way to load data into Azure DW depends on the amount of data involved and the load intervals. If you are exporting large amounts of data at regular intervals, you might want to edit your SSIS package to first stage the data in Azure Blob Storage as flat files. Next, use an Execute SQL task to create external tables via PolyBase, and then load the internal table with CREATE TABLE ... AS SELECT (CTAS), as sketched below.
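A minimal sketch of that PolyBase pattern, with illustrative object names (the external data source and file format are assumed to exist already):

```sql
-- External table over the staged flat files in Blob Storage.
CREATE EXTERNAL TABLE blob.ExternalSales
(
    SaleId   int,
    SaleDate date,
    Amount   decimal(18, 2)
)
WITH (LOCATION = '/staged/sales/',
      DATA_SOURCE = AzureBlobStore,    -- assumed external data source
      FILE_FORMAT = CsvFileFormat);    -- assumed external file format

-- CTAS pulls the staged rows into an internal, distributed table.
CREATE TABLE dbo.InternalSales
WITH (DISTRIBUTION = HASH(SaleId))
AS SELECT * FROM blob.ExternalSales;
```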
Please check this guidance from Microsoft.
If you use the latest SSIS feature pack, you can load (via Blob Storage) into the destination table.
https://msdn.microsoft.com/en-US/library/mt146770.aspx

Data movement from SQL on-premise to SQL Azure

I have migrated my database schema to SQL Azure, but I have a huge number (millions) of data records to be migrated. Please suggest an approach to move the data.
Approaches I have tried:
SQLAzureMW tool (but it takes 14 hours, which is not feasible for me)
Import/export on SQL Server (even this is taking time)
Any other approaches? Need help!
For large datasets you usually have to take a more imaginative approach to migration!
One possible approach is to take a full data backup, ensuring that transaction logs are committed and cleared at the same time.
Upload, or use Azure Import/Export, to get the backup into Azure Blob storage.
Synchronise your transaction logs with Azure Blob storage.
Create an Azure SQL database and import the backup.
Replay the transaction logs.
Keep in sync with the transaction logs until you are ready to switch over.
If the SQLAzure Migration Wizard takes 14 hours and your database is Azure-compatible, you have four other choices:
Export locally to a BACPAC, upload the BACPAC to Azure, and import the BACPAC into Azure SQL Database.
Export the BACPAC directly to Azure and then import it into Azure SQL Database.
Use the SSMS migration wizard with the most recent version of SSMS (it includes a number of functional and performance enhancements).
Use SQL Server transactional replication (see the additional requirements for this option). This last option enables you to incrementally migrate to SQL DB, and then, when SQL DB is current with your on-premises database, just cut your application(s) over to SQL DB with minimal downtime.
For more information, see https://azure.microsoft.com/en-us/documentation/articles/sql-database-cloud-migrate/#options-to-migrate-a-compatible-database-to-azure-sql-database