How to Parameterize Copy Activity to SQL DB with Azure Data Factory - azure-data-factory-2

I'm trying to automatically update tables in Azure SQL Database from another SQLDB with Azure Data Factory. At the moment, the only way to update a table in Azure SQL Database is to manually select the table you want to update in Azure SQL Database, as shown here:
My configuration to automatically select a table from the SQLDB that I want to copy to Azure SQL Database is as follows:
The parameters are as follows:
@concat('SELECT * FROM ',pipeline().parameters.Domain,'.',pipeline().parameters.TableName)
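For example, assuming the pipeline parameters are Domain = dbo and TableName = Customer, this expression resolves to the query SELECT * FROM dbo.Customer.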
Can someone let me know how to configure my SINK and/or connection to automatically insert the table selected from SOURCE?
My SINK looks like the following:
And my connection looks like the following:

You can use the Edit option in the SQL sink dataset.
Create a dataset parameter for the sink table name. In the SQL sink dataset, check the Edit checkbox and use the dataset parameter for the table name. If you want, you can use a dataset parameter for the schema name as well; here I have given it directly (dbo).
Now, in the copy activity sink, you can give the table name dynamically from any pipeline parameter (your parameter in this case) or from any variable, using dynamic content.
Also, enable Auto create table, which creates a new table if no table with the given name exists; if the table already exists, creation is skipped and the data is copied into it.
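For example, assuming the sink dataset has a parameter named TableName and the pipeline uses the same parameters as the source query above, the dataset parameter value in the copy activity sink could be set with dynamic content such as:
@pipeline().parameters.TableName
and, if the schema is parameterized as well, @pipeline().parameters.Domain for the schema name.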
My sample result:

Related

Get CSV Data from Blob Storage to SQL server Using ADF

I want to transfer data from a CSV file that is in Azure Blob Storage to a SQL Server table with the correct data types.
How can I get the table structure from the CSV file? (I mean like when we script a table to a new query in SSMS.)
Note that the CSV file is not available on-premises.
If your target table is already created in SSMS, the copy activity will take care of mapping the schema of the source to the target table.
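For example, a target table pre-created in SSMS might look like the following; the column names and types here are only illustrative, so use whatever matches your CSV headers:
CREATE TABLE dbo.MyCsvTarget
(
    Id INT NOT NULL,
    Name NVARCHAR(100) NULL,
    CreatedDate DATE NULL
);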
This is my sample csv file from blob:
In the sink, I have used a table from an Azure SQL database. In your case, you can create a SQL Server dataset using a SQL Server linked service.
You can see the schemas of the CSV and target tables and their mapping.
Result:
If your target table is not already created in SSMS, you can use data flows and define the schema that you want in the Projection.
Create a data flow and, in the source, give the blob CSV file. In the Projection of the source, we can give the data types that we want for the CSV columns.
As our target table has not been created yet, check Edit in the sink dataset and give the name for the table.
In the sink, use this dataset (a SQL Server dataset in your case) and make sure you select Recreate table in the sink Settings, so that a new table with that name will be created.
Execute this data flow, and your target table will be created with your user-defined data types.

How to create a database and table, insert data into it, and use it as a source in another data flow in SSIS?

I need to create a SQL database and a table, and insert data into the table from another SQL database. I also need to use this newly created database as an OLE DB source in another data flow in the same SSIS package. The table and database names are fixed.
I tried using a Script Task to create the database and tables. But when I have to insert data, I am not able to give the database name in the connection manager, as the database is only created at runtime.
I have tried setting ValidateExternalMetadata to false, but that doesn't seem to help either.
Any ideas or suggestions on how to accomplish this would be of great help. Thanks
I think you just need two things to make this work:
While developing the package, the database and table will need to exist.
Set DelayValidation to true on the connection manager and data flow tasks in order to avoid validation failures before the database and table are created.
Use a variable to hold the new table name, create and populate the table using the variable, then use the variable in the source object.
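As a minimal sketch of the create-and-populate step, the first Execute SQL Task (or Script Task) could run T-SQL along these lines; the database, table, and column names here are only placeholders for your fixed names:
-- create the database and table if they do not exist yet
-- (depending on your setup, you may need to run the CREATE DATABASE in a separate batch or SQL Task)
IF DB_ID('StagingDB') IS NULL
    CREATE DATABASE StagingDB;

IF OBJECT_ID('StagingDB.dbo.StagingTable') IS NULL
    CREATE TABLE StagingDB.dbo.StagingTable
    (
        Id INT NOT NULL,
        SomeValue NVARCHAR(100) NULL
    );

-- populate the new table from the other database
INSERT INTO StagingDB.dbo.StagingTable (Id, SomeValue)
SELECT Id, SomeValue
FROM SourceDB.dbo.SourceTable;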

Bulk copy multiple csv files from Blob Container to Azure SQL Database

Environment:
MS Azure:
Blob container, with multiple CSV files saved in a folder. This is my source.
Azure SQL Database. This is my target.
Goal:
Use Azure Data Factory and build a pipeline to "copy" all files from the container and store them in their respective tables in the Azure SQL database by automatically creating those tables.
How do I do that? I tried following this, but I just end up with tables created incorrectly in the database, where each table is created with a single column that has the same name as the table name.
I believe I followed the instructions from that link pretty much as they are.
My CSV file is as follows; one column contains the table name.
The previous steps will not be repeated; they are the same as in the link.
At Step 3, inside the ForEach activity, we should add a Lookup activity to query the table name from the source dataset.
We can declare a String-type variable tableName beforehand, then set its value via the expression @activity('Lookup1').output.firstRow.tableName.
In the sink settings of the Copy activity, we can key in @variables('tableName').
ADF will auto-create the table for us.
The debug result is as follows:

Azure Machine Learning Write output to Azure SQL Database

I am using Azure Machine Learning to cluster data.
The input data is from an Azure SQL Database, and it works fine.
At the end of everything, I want to write the output to a table in the same Azure SQL Database, but I get this error:
Error: Error 1000: AFx Library library exception:
Sql encountered an error: Login failed for user
Does anyone have any idea?
Thank you very much!
Please follow the instructions and examine the examples provided here to properly use the Export Data module to save the ML output data to Azure SQL Database.
How to Export Data to an Azure SQL Database
Add the Export Data module to your experiment. You can find this module in the Data Input and Output group in the experiment items list in Azure Machine Learning Studio.
Connect it to the module that produces the data that you want to export to Azure SQL DB.
For Data destination, select Azure SQL Database. This option supports Azure SQL Data Warehouse as well.
Set the following options specific to Azure SQL Database or Azure SQL Data Warehouse.
Database server name
Type the server name that is generated by Azure. Typically it has the form <generated_identifier>.database.windows.net.
Database name
Type the name of a database on the server you just specified. The database must already exist; Export Data cannot create it.
Server user account name
Type the user name of an account that has access permissions for the database.
Server user account password
Provide the password for the specified user account.
Comma-separated list of columns to be saved
Type the names of the columns in the experiment that you want to write to the database.
Data table name
Type the name of the table where data will be stored.
For Azure SQL Database, if the table does not exist, it will be created. For Azure SQL Data Warehouse, the table must already exist and have the correct schema, so be sure to create it in advance.
Comma-separated list of datatable columns
Type the names of the columns as you wish them to appear in the destination table. The columns should correspond in order with the column names that you list in Comma-separated list of columns to be saved.
If you are writing to Azure SQL Data Warehouse, the column names must match those already in the destination table schema.
Number of rows written per SQL Azure operation
Indicate how many rows should be written to the destination table in each batch. By default, the value is set to 50, which is the default batch size for Azure SQL Database. However, you should increase this value if you have a large number of rows to write.
TIP:
For Azure SQL Data Warehouse, we recommend that you set this value to 1. If you use a larger batch size, the size of the command string that is sent to Azure SQL Data Warehouse can exceed the allowed string length, causing an error.
If you don't want to write new results each time you run the experiment, select the Use cached results option. If there are no other changes to module parameters, the experiment will write the data the first time the module is run, and thereafter not perform writes.
However, a write will always be performed if any parameters have been changed in Export Data that would change the results.
Run the experiment.
Found the issue!
I needed to create a specific user with this SQL code:
CREATE USER AMLApplicationUser WITH PASSWORD = '************';
and then add the user to these roles on the database I want to write to:
ALTER ROLE db_datareader ADD MEMBER AMLApplicationUser;
ALTER ROLE db_datawriter ADD MEMBER AMLApplicationUser;
I guess the datawriter role alone would be enough, but I needed datareader too.
So in conclusion, it seems that the database admin role can be used to read data, but not to write data from AML.
Thank you for your help!

SSIS how to use a table created in a SQL Task as destination in a following Data Flow Task

In SSIS I have a SQL Task which drops and creates a table T. Then I have a Data Flow Task which needs to use T as the destination to write data.
The Destination Assistant and the fast-load option need table T to already be present in the database in order to show it as a possible destination.
Maybe I could use SQL Command as the data access mode, but I don't know how to access the incoming data columns from the stream.
How can I use table T as destination in the data flow task?
Store the table name inside a package variable, select the destination type Table name from variable and use it, but make sure to set the DelayValidation property to True (change this property on the data flow task and the destination).
Note: when designing the package, table T must exist in the database so that its structure can be read in the destination; also, if the table name is fixed, you can achieve this without using a variable.
Instead of dropping table T in the first SQL Task, truncate table T, and table T will be permanently available for the Destination Assistant. Hope this helps.
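As a minimal sketch of that approach (the table name is just a placeholder), the first SQL Task could run:
-- keep the table so it stays visible to the Destination Assistant, and only remove its rows
TRUNCATE TABLE dbo.T;
-- or, if TRUNCATE permissions are an issue:
-- DELETE FROM dbo.T;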
In the SQL Task, instead of drop and create, can you just delete or truncate the data in table T?