Azure Data Factory Copy Activity to Copy to Azure Data Lake Table - azure-data-lake

I need to Copy data incrementally from On-Prem SQL server into Table in Azure Data Lake Store.
But when creating Copy Activity using Azure Portal, in the Destination I only see the folders(No option for Tables).
How can I do scheduled On-prem table to Data Lake Table Syncs?

Data Lake Store does not have a notion of tables. It is file storage system (like HDFS). You can however use capabilities such as Hive or Data Lake Analytics on top of your data stored in Data Lake Store to conform your data to a schema. In hive, you can do that using external tables, while in Data Lake Analytics you can run a simple extract script.
I hope this helps!

Azure Data Lake Analytics (ADLA) does have the concept of databases which have tables. However they are not currently exposed as a target in Data Factory. I believe it's on the backlog, although I can't find the reference right now.
What you could do is use Data Factory to copy data into Data Lake Store then run a U-SQL script which imports it into the ADLA database.
If you do feel this is an important feature, you can create a request here and vote for it:
https://feedback.azure.com/forums/327234-data-lake
ADLA Databases and tables:

Related

How to store data either in Parquet or Jason files in "Azure Datalake Gen2 blob storage" before purge data in Azure SQL table

How to store data either in Parquet or Jason files in "Azure Datalake Gen2 blob storage" before purge data in Azure SQL table. What are the steps & services required to achieve this.
You can use Copy activity in Azure Data Factory to store data from SQL table as parquet files. Rather simplified, but here are the steps just to give you an idea.
If source sql table is in-house, install Azure Integration Runtime on in-house server.
Connect to source table using linked service and dataset using IR from above step.
Connect to Azure Gen 2 stortage from ADF, again using linked service.
Configure copy activity with source and sink.
Read more details here

Access Azure Data Lake Analytics Tables from SQL Server Polybase

I need to export a multi terabyte dataset processed via Azure Data Lake Analytics(ADLA) onto a SQL Server database.
Based on my research so far, I know that I can write the result of (ADLA) output to a Data Lake store or WASB using built-in outputters, and then read the output data from SQL server using Polybase.
However, creating the result of ADLA processing as an ADLA table seems pretty enticing to us. It is a clean solution (no files to manage), multiple readers, built-in partitioning, distribution keys and the potential for allowing other processes to access the tables.
If we use ADLA tables, can I access ADLA tables via SQL Polybase? If not, is there any way to access the files underlying the ADLA tables directly from Polybase?
I know that I can probably do this using ADF, but at this point I want to avoid ADF to the extent possible - to minimize costs, and to keep the process simple.
Unfortunately, Polybase support for ADLA Tables is still on the roadmap and not yet available. Please file a feature request through the SQL Data Warehouse User voice page.
The suggested work-around is to produce the information as Csv in ADLA and then create the partitioned and distributed table in SQL DW and use Polybase to read the data and fill the SQL DW managed table.

U SQL: direct output to SQL DB

Is there a way to output U-SQL results directly to a SQL DB such as Azure SQL DB? Couldn't find much about that.
Thanks!
U-SQL only currently outputs to files or internal tables (ie tables within ADLA databases), but you have a couple of options. Azure SQL Database has recently gained the ability to load files from Azure Blob Storage using either BULK INSERT or OPENROWSET, so you could try that. This article shows the syntax and gives a reminder that:
Azure Blob storage containers with public blobs or public containers
access permissions are not currently supported.
wasb://<BlobContainerName>#<StorageAccountName>.blob.core.windows.net/yourFolder/yourFile.txt
BULK INSERT and OPENROWSET with Azure Blob Storage is shown here:
https://blogs.msdn.microsoft.com/sqlserverstorageengine/2017/02/23/loading-files-from-azure-blob-storage-into-azure-sql-database/
You could also use Azure Data Factory (ADF). Its Copy Activity could load the data from Azure Data Lake Storage (ADLS) to an Azure SQL Database in two steps:
execute U-SQL script which creates output files in ADLS (internal tables are not currently supported as a source in ADF)
move the data from ADLS to Azure SQL Database
As a final option, if your data is likely to get into larger volumes (ie Terabytes (TB) then you could use Azure SQL Data Warehouse which supports Polybase. Polybase now supports both Azure Blob Storage and ADLS as a source.
Perhaps if you can tell us a bit more about your process we can refine which of these options is most suitable for you.

Is it possible to export data from MS Azure SQL directly Into the Azure Table Storage?

Is there any direct way within the Azure MSSQL ecosystem to export SQL returned data set into the Azure table storage?
Something like BCP but with the Table Storage connection string on the -Output end?
There is a service named Azure Data Factory which can directly copy data from Azure SQL Database to Azure Table Storage, even between other supported data stores, please see the section Supported data stores of the article "Data movement and the Copy Activity: migrating data to the cloud and between cloud stores" to know, but it is for Web, not like BCP command tool.
You can refer to the tutorial Build your first Azure data factory using Azure Portal/Data Factory Editor to know how to use it.
And as references, you can refer to the articles Move data to and from Azure SQL Database using Azure Data Factory & Move data to and from Azure Table using Azure Data Factory to know how it works.

Error trying to move data from Azure table to DataLake store with DataFactory

I've been building a Datafactory pipeline to move data from my azure table storage to a datalake store, but the tasks fail with an exception that I can't find any information on. The error is
Copy activity encountered a user error: ErrorCode=UserErrorTabularCopyBehaviorNotSupported,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=CopyBehavior property is not supported if the source is tabular data source.,Source=Microsoft.DataTransfer.ClientLibrary,'.
I don't know where the problem lies, if in the datasets, the linked services or the pipeline, and can't seem to find any info at all on the error I'm seeing on the console.
Since the copy behavior from Azure Table Storage to Azure Data Lake Store is not currently supported as a temporary work around you could go from Azure Table Storage to Azure Blob Storage to Azure Data Lake store.
Azure Table Storage to Azure Blob Storage
Azure Blob Storage to Azure Data Lake Store
I know this is not ideal solution but if you are under time constraints, it is just an intermediary step to get the data into the data lake.
HTH
The 'CopyBehaviour' property is not supported for Table storage (which is not a file based store) that you are trying to use as a source in ADF copy activity. That is the reason why you are seeing this error message.