U-SQL Dynamic External Data Source

Greetings fellow developers!
Does anyone know if there is a way to query from dynamic external data source names in U-SQL?
For example, in the MS sample script below, we would like the "MyAzureSQLDWDataSource" to be generated dynamically.
@results =
    SELECT DateTime.Now AS dayTime, *
    FROM EXTERNAL MyAzureSQLDWDataSource LOCATION "dbo.AdventureWorksDWBuildVersion";
OUTPUT @results
TO "/Output/ReferenceGuide/DDL/DataSources/Query2B.csv"
USING Outputters.Csv(outputHeader: true);
Thanks!

You can't connect to an external data source.
You can use Data Factory instead: a Copy activity to download your data to Data Lake, and a U-SQL activity to execute your U-SQL script.
You can trigger the pipeline on demand or schedule it.
https://learn.microsoft.com/en-us/azure/data-factory/concepts-pipelines-activities
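
If the data source name cannot be supplied from inside U-SQL, one possible workaround is to generate the script text before the job is submitted (for example, by whatever tool or pipeline step submits it). The sketch below is only an illustration under that assumption: `USQL_TEMPLATE` and `render_usql` are hypothetical names, and submitting the script is not shown.

```python
import re

# Hypothetical template based on the sample script in the question.
# Only the {data_source} placeholder varies between runs.
USQL_TEMPLATE = """@results =
    SELECT DateTime.Now AS dayTime, *
    FROM EXTERNAL {data_source} LOCATION "dbo.AdventureWorksDWBuildVersion";

OUTPUT @results
TO "/Output/ReferenceGuide/DDL/DataSources/Query2B.csv"
USING Outputters.Csv(outputHeader: true);
"""

# Accept only plain identifiers so arbitrary text cannot be spliced into the script.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def render_usql(data_source: str) -> str:
    """Return the script text with the data source name filled in."""
    if not _IDENTIFIER.fullmatch(data_source):
        raise ValueError(f"not a plain identifier: {data_source!r}")
    return USQL_TEMPLATE.format(data_source=data_source)


if __name__ == "__main__":
    print(render_usql("MyAzureSQLDWDataSource"))
```

The identifier check is deliberately strict; since the name is pasted into the script as text, anything that is not a simple identifier is rejected rather than escaped.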

Related

How to Parameterize Copy Activity to SQL DB with Azure Data Factory

I'm trying to automatically update tables in Azure SQL Database from another SQLDB with Azure Data Factory. At the moment, the only way to update a table in Azure SQL Database is to manually select the table you want to update, as shown here:
My configuration to automatically select the table from the SQLDB that I want to copy to Azure SQL Database is as follows:
The parameters are as follows:
@concat('SELECT * FROM ',pipeline().parameters.Domain,'.',pipeline().parameters.TableName)
Can someone let me know how to configure my SINK and/or connection to automatically insert into the table selected from SOURCE?
My SINK looks like the following:
And my connection looks like the following:

"Can someone let me know how to configure my SINK and/or connection to automatically insert the table selected from SOURCE."
You can use the Edit option in the SQL dataset.
Create a dataset parameter for the sink table name. In the SQL sink dataset, tick the Edit checkbox and use the dataset parameter as the table name. If you want, you can use a dataset parameter for the schema name as well. Here I have entered it directly (dbo).
Now, in the copy activity sink, you can supply the table name dynamically from any pipeline parameter (your own parameter in this case) or any variable, using dynamic content.
Also, enable the Auto create table option. It creates a new table if no table with the given name exists; if the table already exists, creation is skipped and the data is copied into it.
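
For readers who prefer the JSON view over the UI, the settings described above might look roughly like the following. This is a sketch rather than an export from a real factory: it is written as Python dictionaries so it can be printed as JSON, and the names `SinkTableDataset`, `AzureSqlLinkedService`, and the `TableName` parameter are placeholders.

```python
import json


def sink_dataset(linked_service: str, schema: str = "dbo") -> dict:
    """Dataset definition whose table name comes from a dataset parameter."""
    return {
        "name": "SinkTableDataset",  # placeholder name
        "properties": {
            "type": "AzureSqlTable",
            "linkedServiceName": {
                "referenceName": linked_service,
                "type": "LinkedServiceReference",
            },
            # The dataset parameter created for the sink table name.
            "parameters": {"TableName": {"type": "string"}},
            "typeProperties": {
                "schema": schema,  # entered directly, as in the answer
                "table": {"value": "@dataset().TableName", "type": "Expression"},
            },
        },
    }


def copy_sink(dataset_name: str) -> dict:
    """Sink side of a Copy activity: pass the pipeline parameter to the dataset."""
    return {
        "outputs": [
            {
                "referenceName": dataset_name,
                "type": "DatasetReference",
                "parameters": {
                    "TableName": {
                        "value": "@pipeline().parameters.TableName",
                        "type": "Expression",
                    }
                },
            }
        ],
        # "autoCreate" corresponds to the Auto create table option.
        "sink": {"type": "AzureSqlSink", "tableOption": "autoCreate"},
    }


if __name__ == "__main__":
    dataset = sink_dataset("AzureSqlLinkedService")  # placeholder linked service
    print(json.dumps(dataset, indent=2))
    print(json.dumps(copy_sink(dataset["name"]), indent=2))
```

The point of the split is that the dataset only declares the parameter, while the Copy activity decides which pipeline parameter or variable feeds it.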
My sample result:

SSIS - connection to new database

I am new to SSIS so the question might seem simple. What I'm trying to do is to extract data from a source and load it into a new database which should be created in the process (not beforehand). I create that DB using Execute SQL task. However I encounter a problem as I'm unable to connect to that DB using data destination because DB does not exist at that moment.
Can you please help me with ideas on how to solve this problem? Or is there another way to create the kind of package I described?

I think you need to create the database on your SQL Server first and then point the destination connection at it. Then map the columns of your source query or table to your destination table.

Your requirement is to extract data from, say, Database1 and copy it into Database2, with Database2 being created while the SSIS package runs.
For this you need to use an Execute SQL Task for the destination as well.
For example:
CREATE DATABASE Database2;
-- The new database has no tables yet, so SELECT ... INTO creates the table while copying.
SELECT * INTO Database2.dbo.TableName
FROM Database1.dbo.TableName;

ADF Copy into SQL table without creating source file

I have a scenario where I need to copy the output of a Get Metadata activity into a SQL table. Can I do this directly, without using a Databricks notebook?

You can use a Lookup activity.
GetMetadata -> Lookup
Write an INSERT SQL statement in the Lookup's Query field, or call a stored procedure.

U-SQL job to query multiple tables with dynamic names

Our challenge is the following:
In an Azure SQL database, we have multiple tables named table_num, where num is just an integer. These tables are created dynamically, so the number of tables can vary (table_1, table_2, ... table_N). All tables have the same columns.
As part of a U-SQL script file, we would like to execute the same query on all of these tables and generate an output csv file with the combined results of all these queries.
We tried several things:
U-SQL does not allow looping, so we thought about creating a View in our Azure SQL database that would combine all the tables using a cursor of some sort. The U-SQL file would then query this View (using an external source). However, a View in Azure SQL database can only be created via a function, and a function cannot execute dynamic SQL or even call a stored procedure...
We did not find a way to call a stored procedure of the external data source directly from U-SQL.
We don't want to update our U-SQL job each time a new table is added...
Is there a way to do that in U-SQL through a custom extractor for instance? Any other ideas?

One solution I can think of is to use Azure Data Factory (v2) to assist in this.
You could create a pipeline with the following activities:
1. A Lookup activity configured to execute a stored procedure (presumably one that returns the table names).
2. A For Each activity that uses the output of the Lookup activity as its source.
2.1. As a child item, a U-SQL activity that executes your U-SQL script, which writes the output for a single table (the current item of the For Each activity) to blob storage or Data Lake.
3. A Copy activity that merges the blobs from step 2.1 into one final blob.
If you have little or no experience working with ADF v2, bear in mind that it takes some time to get to know it, but once you do, you won't regret it. Having a GUI to create the pipeline is a nice bonus.
Edit: as @wBob mentions, another (far easier) solution is to somehow create a single table with all rows, since all dynamically generated tables have the same schema. You can create a stored procedure for populating this table, for example.
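
To illustrate the single-table idea, here is a small sketch of how a combined query over table_1 ... table_N could be generated from a list of table names. It assumes the names have already been fetched somehow (for instance from the database catalog) and that every table lives in the dbo schema; `build_union_all` is a hypothetical helper, not part of any Azure tooling.

```python
import re

# Only names of the form table_<integer> are accepted, matching the question.
_TABLE_NAME = re.compile(r"table_\d+")


def build_union_all(table_names, schema="dbo"):
    """Build one SELECT ... UNION ALL statement over all matching tables."""
    valid = [name for name in table_names if _TABLE_NAME.fullmatch(name)]
    if not valid:
        raise ValueError("no table_<num> tables found")
    # Sort numerically so table_10 comes after table_2.
    valid.sort(key=lambda name: int(name.split("_")[1]))
    selects = [f"SELECT * FROM [{schema}].[{name}]" for name in valid]
    return "\nUNION ALL\n".join(selects)


if __name__ == "__main__":
    print(build_union_all(["table_2", "table_10", "table_1", "audit_log"]))
```

The generated statement could serve as the body of the stored procedure that populates the combined table, so the U-SQL job itself never needs to change when a new table appears.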

SSIS how to use a table created in a SQL Task as destination in a following Data Flow Task

In SSIS I have a SQL Task which drops and creates a table T. Then I have a Data Flow Task which needs to use T as the destination to write data.
The Destination Assistant and the fast-load option need table T to be already present in the database in order to show it as a possible destination.
Maybe I could use SQL Command as data access mode but I don't know how to access the incoming data columns from the stream.
How can I use table T as destination in the data flow task?

Store the table name in a package variable, set the destination's data access mode to "Table name or view name variable", and use that variable. Make sure to set the Delay Validation property to True (change this property on the Data Flow Task and on the destination).
Note: while you are designing the package, table T must exist in the database so that the destination can read its structure. Also, if the table name is fixed, you can achieve this without a variable.

Instead of dropping table T in the first SQL Task, truncate it. Table T will then be permanently available in the Destination Assistant. Hope this helps.

In the SQL Task instead of drop and create, can you just Delete or Truncate the data in table T?
