Generic SSIS package to create output text files from multiple stored procedures

I have a few SQL Server stored procedures whose output I currently save by hand into text files with headers.
I need a generic SSIS package that takes procedure_name and output_filename as inputs and produces the output text file.
Because each procedure returns a different set of columns, the package has to handle the metadata dynamically.
Ex: proc1 produces following result
col1|col2
a|b
proc2 produces following result:
col1|col2|col3
1|2|a
One solution is to change every procedure to return its output as a single string column, so that SSIS sees one column and writes it straight to the file.
Ex:
'col1|col2|col3' AS rowdata
'1|2|a' AS rowdata
However, there are quite a few procedures; changing every one would take a lot of time and invite human error.
So I'd like to know whether there is a way in SSIS to handle the metadata dynamically and produce the output file.

You can look into Biml, which uses metadata to build packages dynamically so that they can then be executed.

Well, this is hardly possible in a traditional SSIS data flow, where all output columns are defined at design time and validated at runtime. Changing the number of columns causes validation errors and stops package execution.
You can work around this with an Execute SQL Task and a script. The Execute SQL Task runs the stored procedure and returns its result as a 'Full result set' stored in an Object variable. In the script you then open this variable, which (with an OLE DB connection) is in fact an ADO recordset; it can be read with System.Data.OleDb.OleDbDataAdapter. For example:
// Requires: using System.Data; using System.Data.OleDb;
// Set up the DataAdapter to extract the data,
// and the DataTable object to capture those results
OleDbDataAdapter da = new OleDbDataAdapter();
DataTable dt = new DataTable();
// Extract the data from the object variable into the table
da.Fill(dt, Variables.vResults);
You can then work with the DataTable and flatten it into a single string variable for later use, or create and save the file directly in the C# script.
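The core of this approach is that the column list is read from the result set at run time instead of being fixed at design time. The following is a minimal, language-neutral sketch of that idea in Python, not SSIS code: sqlite3 stands in for SQL Server, the two SELECT statements are hypothetical stand-ins for proc1 and proc2, and write_result_set is an illustrative name. In the C# script the equivalent step would be to walk the columns of the filled DataTable.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def write_result_set(cursor, out_path, delimiter="|"):
    """Write whatever columns the cursor returned to a delimited text file."""
    # Column names come from the result set itself, so nothing is fixed in advance.
    header = [col[0] for col in cursor.description]
    with open(out_path, "w", newline="", encoding="utf-8") as fh:
        writer = csv.writer(fh, delimiter=delimiter, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(cursor)
    return header


def demo():
    conn = sqlite3.connect(":memory:")
    out_dir = Path(tempfile.mkdtemp())
    # Hypothetical queries standing in for proc1 and proc2 from the question.
    queries = {
        "proc1.txt": "SELECT 'a' AS col1, 'b' AS col2",
        "proc2.txt": "SELECT 1 AS col1, 2 AS col2, 'a' AS col3",
    }
    results = {}
    for filename, sql in queries.items():
        path = out_dir / filename
        write_result_set(conn.execute(sql), path)
        results[filename] = path.read_text(encoding="utf-8")
    conn.close()
    return results
```

The same function handles both result shapes because it never names a column; only the query and the output file name change between calls.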

Related

Variable values stored outside of SSIS

This is an SSIS question for advanced users. I have a SQL table that holds clientid, clientname, Filename, Ftplocationfolderpath, filelocationfolderpath.
This table holds one unique record for each of my clients. As my client list grows, I add a new row to the table for each new client.
My question is this: can I use the values in my SQL table and reference them in my SSIS package variables, based on client id?
The reason for the SQL table is that we sometimes get requests to change the delivery location or file name of a file we send externally. We would like to change those things on the fly in the SQL table instead of having to export the package each time, change it manually, and re-import it. Each client has its own SSIS package.
Let me know if this is feasible. I'd appreciate any insight.
Yes, it is possible. There are two ways to approach this, depending on how the job runs: either a single job run handles a single client, or a single job run handles multiple clients.
Either way, you will use the Execute SQL Task to retrieve data from the database and assign it to your variables.
Single client per run. This is fairly straightforward. In Result Set, select Single row, map the row's columns to the package variables, and go about your processing.
Multiple clients per run. In Result Set, select Full result set and assign the result to a single package variable of type Object; give it a meaningful name like ObjectRs. Then add a Foreach Loop container configured as follows:
Type: Foreach ADO Enumerator
ADO object source variable: Select the ObjectRs.
Enumerator Mode: Rows in all the tables (ADO.NET dataset only)
In Variable mappings, map all of the columns in their sequential order to the package variables. This effectively transforms the package into a series of single transactions that are looped.
Yes.
I assume that you run your package once per client or use some loop.
At the beginning of the "per client" code, read all required values from the database into SSIS variables and then use these variables to define what you need. You should not hardcode client-specific information in the package.
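Both answers describe the same pattern: query the settings table, then bind each row's columns to variables before doing the per-client work. Here is a rough Python sketch of that pattern, with sqlite3 standing in for SQL Server. The column names are taken from the question; the table name client_delivery, the sample rows, and the function names are assumptions made for illustration.

```python
import sqlite3


def client_settings(conn, client_id=None):
    """Return one settings dict per client; pass client_id to fetch a single row."""
    sql = (
        "SELECT clientid, clientname, Filename, "
        "Ftplocationfolderpath, filelocationfolderpath FROM client_delivery"
    )
    params = ()
    if client_id is not None:
        # Single-client run: the equivalent of the 'Single row' result set.
        sql += " WHERE clientid = ?"
        params = (client_id,)
    cursor = conn.execute(sql + " ORDER BY clientid", params)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE client_delivery (clientid INTEGER PRIMARY KEY, clientname TEXT, "
        "Filename TEXT, Ftplocationfolderpath TEXT, filelocationfolderpath TEXT)"
    )
    conn.executemany(
        "INSERT INTO client_delivery VALUES (?, ?, ?, ?, ?)",
        [
            (1, "Acme", "acme.csv", "/ftp/acme", "/files/acme"),
            (2, "Globex", "globex.csv", "/ftp/globex", "/files/globex"),
        ],
    )
    conn.commit()
    # Multi-client run: loop over every row, like the Foreach ADO enumerator.
    targets = [
        s["Ftplocationfolderpath"] + "/" + s["Filename"] for s in client_settings(conn)
    ]
    single = client_settings(conn, client_id=2)
    conn.close()
    return targets, single
```

Changing a file name or delivery folder then only means updating a row in the table; the code that consumes the settings stays untouched.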

SSIS FOREACH: Remove data from SQL Table if filename already present

Apologies if I've phrased this terribly. I only started using SSIS today.
I've written a FOREACH which loops through all the files in a folder, and updates my table f_actuals together with the filename without the extension - this filename is a combination of a PeriodKey and Business Unit. It works well.
However, this is intended to be a daily upload from our system covering the entire month for each business unit (the month-to-date figures refresh daily until we close that period), so what I really need is for the FOREACH to include something which does the following:
Checks the filenames due for import in the designated folder against the filenames already in the f_actuals table
Removes all the matches from the f_actuals table
Continues with the FOREACH I've already built
I know this is probably a massively inefficient way to do this (preference would be daily incremental uploads), but the files need to be month-to-date, as our system cannot provide anything else easily.
Hope this makes sense.
Any help greatly appreciated.
You can use an Execute SQL Task within the For Each Loop to do this.
You can either use an SQL statement:
DELETE
FROM f_actuals
WHERE filename = ?
Or perhaps a stored procedure (accepting your filename as a parameter and doing the same thing as the statement above), e.g.:
EXEC DeleteFromActuals ?
For each filename in your loop, you would store this in a variable, and pass the variable as a parameter in the Execute SQL Task (this is what the ? is).
To map the parameter in the Execute SQL Task, go to 'Parameter Mapping' and add a new parameter. Select the variable containing the filename from the dropdown list, choose a data type of VARCHAR, and set the 'Parameter Name' to 0 (with an OLE DB connection, parameters are identified by their zero-based position). The 'Direction' should be 'Input', which is the default.
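The delete-then-reload step can be sketched outside SSIS as follows. This is a minimal Python illustration with sqlite3 standing in for SQL Server: f_actuals and its filename column come from the question, while the amount column, the sample values, the file path, and the function names are assumptions. The ? placeholder plays the same role as the mapped parameter in the Execute SQL Task.

```python
import sqlite3
from pathlib import PurePath


def reload_file(conn, file_path, amounts):
    """Delete rows previously loaded from this file, then insert the fresh rows."""
    key = PurePath(file_path).stem  # file name without extension, as stored in f_actuals
    with conn:  # one transaction per file: the delete and the inserts commit together
        conn.execute("DELETE FROM f_actuals WHERE filename = ?", (key,))
        conn.executemany(
            "INSERT INTO f_actuals (filename, amount) VALUES (?, ?)",
            [(key, amount) for amount in amounts],
        )
    return key


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE f_actuals (filename TEXT, amount REAL)")
    # Yesterday's month-to-date load for one business unit, plus an unrelated file.
    conn.executemany(
        "INSERT INTO f_actuals VALUES (?, ?)",
        [("201901_BU1", 10.0), ("201901_BU1", 20.0), ("201901_BU2", 5.0)],
    )
    conn.commit()
    # Today's file replaces yesterday's rows for the same key only.
    reload_file(conn, "C:/import/201901_BU1.csv", [10.0, 20.0, 30.0])
    rows = conn.execute(
        "SELECT filename, amount FROM f_actuals ORDER BY filename, amount"
    ).fetchall()
    conn.close()
    return rows
```

Running the delete and the insert for a file in one transaction means a failed import does not leave the table without that file's data.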

SSIS Script Component - only to change variables

I have a series of tasks that are very similar:
SELECT a,b FROM c
Lookup in another table and change value in column b.
Save new value back to c and if not match, send the result on to an error table.
That part is pretty straightforward and illustrated here:
Source ==> Lookup =match=> SQL Update command
                  =No match=> SQL Save Error command
(Hope you understand what I mean - but it works!)
I now have to repeat this a number of times, where my source-sql changes. So what I want to do is to insert a Script Component in front of the Source and set my User::Sql variable like:
Variables.Sql = "SELECT d, e FROM f"
All of the above is contained in a Data Flow. When I have created one I can then copy that one and only change the Sql variable in the script and then it should all work.
My problem is: when I insert the Script Component, it asks me whether it is a Source, Destination or Transformation script. And because it only sets the variable, it does not produce any rows for output and cannot connect to my Source.
Anyone know how to make that work?
(I have simplified the above. I actually want to update multiple variables and use those in my Source, Lookup and Error update as well, so simply changing the SQL in the initial Source is not enough. But if I can do the above, I will be able to achieve what I want :-))
You should set your variable containing the SQL query in the control flow, before you execute the dataflow.
Then you need to use that variable in your data flow, for example through the source's 'SQL command from variable' option or through property expressions. You can parametrize the query used in the lookup or any other parameters of your data flow in the same way.
If your data flows really always have the same structure, you could even generate a list of queries and call a single Data Flow task in a loop, which avoids duplicating the same tasks.

From an SSIS package how should I execute a sql script stored in a table as a varchar(max)?

I'm storing large (varchar(max)) SQL scripts in a table. I'd like to execute the scripts in an SSIS package.
Looking at other posts on this site, it's easy enough to get the varchar(max) into an Object variable. But then what? Is there a way for an Execute SQL Task (SQLSourceType of Variable) to use an Object variable rather than a String variable?
Is there an approach that will work?
Here's how I might approach it:
Add a Data Flow task to your control flow
Add a Source (ADO.NET) that connects to your database
Create a Package level Object variable (for the next step)
Add a Recordset destination that populates your data into the Object variable created in the previous step
Back on the control flow:
Create a package level String variable for the "current" query (see next step)
Add a Foreach Loop container that uses the Foreach ADO enumerator
Connect the previous Data Flow task to the Foreach Loop
Configure the Foreach Loop to use the Object variable as its source, and map the index of the column holding the SQL to the String variable
Add an Execute SQL task inside the Foreach Loop
Configure it with SQLSourceType set to Variable, and pick the String variable containing the current query
Basically it will collect the queries from the table, then for each collected query, assign it to a variable, and then the Execute SQL command can pull the command text from that variable.
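The steps above amount to a simple collect-then-execute loop. Here is a minimal Python sketch of that flow, with sqlite3 standing in for SQL Server; the table name sql_scripts, its columns id and script_text, the sample scripts, and the function names are all hypothetical.

```python
import sqlite3


def run_stored_scripts(conn):
    """Fetch every stored script, then execute them one at a time in id order."""
    # Collect first (the Recordset destination step), then loop (the Foreach step).
    scripts = conn.execute("SELECT script_text FROM sql_scripts ORDER BY id").fetchall()
    for (script_text,) in scripts:
        # Each script may contain several statements, so run it as a batch.
        conn.executescript(script_text)
    return len(scripts)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sql_scripts (id INTEGER PRIMARY KEY, script_text TEXT)")
    conn.executemany(
        "INSERT INTO sql_scripts (id, script_text) VALUES (?, ?)",
        [
            (1, "CREATE TABLE t (x INTEGER); INSERT INTO t VALUES (1);"),
            (2, "INSERT INTO t VALUES (2); INSERT INTO t VALUES (3);"),
        ],
    )
    conn.commit()
    executed = run_stored_scripts(conn)
    values = [row[0] for row in conn.execute("SELECT x FROM t ORDER BY x")]
    conn.close()
    return executed, values
```

Fetching all scripts before executing any of them mirrors the two-stage package layout and keeps the loop from reading the script table while the scripts themselves are running.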

SSIS package to execute a stored procedure for each xml document in a specific directory

I have a table with a column of type xml. I also have a directory that can contain 0 to n xml documents. For each xml document, I need to insert a new row in the table and store the xml in the xml column.
To fit our client's needs, I need to perform this operation using an SSIS package. I plan to use a stored procedure to insert the xml, passing in the file path.
I've created and tested the stored procedure; it functions as expected.
My question is, how do I execute the stored procedure from an SSIS package for each xml document in a specific directory?
Thanks in advance for any help.
Basically you just need to loop through the files and get the full file paths to pass to the stored proc. This can be done easily using a For Each Loop and the ForEach File Enumerator. This page has a good description of how to set that up:
http://www.sqlis.com/post/Looping-over-files-with-the-Foreach-Loop.aspx
Within the loop, you then just read the variable that is populated on each iteration (each time an XML file is found) and pass it as a parameter to an Execute SQL Task (placed inside your For Each Loop container) to call your stored procedure. Here is an example of passing variables as parameters:
http://geekswithblogs.net/stun/archive/2009/03/05/mapping-stored-procedure-parameters-in-ssis-ole-db-source-editor.aspx
You don't need to use a stored procedure for this. You can do all of this within an SSIS package. Here's how:
Have a For-Each Loop task read all available files in the folder. Put the full path of the file into a variable called XMLFileName
Inside the For-Each loop, use a Data-Flow task to read the contents.
The OLE_SRC is reading from the same SQL Server and its statement is SELECT GetDate() as CurrentDateTime
The DerivedColumn component creates a column called XMLFilePath with the full path of the XML file
The ImportColumn component is the one that does the magic. It will take the XMLFilePath as an input column, give it the LineageId of a new output column you create and it will import the full XML for you. Read more on how to set it up here:
http://www.bimonkey.com/2009/09/the-import-column-transformation/
Use the OleDB Destination to write to the table.
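Whichever variant you choose, the package boils down to one loop: enumerate the XML files in the folder and insert one row per file. A rough Python sketch of that loop follows, with sqlite3 standing in for SQL Server; the table xml_documents, its columns, the sample files, and the function names are assumptions made for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def import_xml_files(conn, directory):
    """Insert one row per *.xml file found in directory; return the file names loaded."""
    loaded = []
    # sorted() makes the load order deterministic; an empty folder yields no rows.
    for path in sorted(Path(directory).glob("*.xml")):
        xml_text = path.read_text(encoding="utf-8")
        conn.execute(
            "INSERT INTO xml_documents (file_path, xml_data) VALUES (?, ?)",
            (str(path), xml_text),
        )
        loaded.append(path.name)
    conn.commit()
    return loaded


def demo():
    folder = Path(tempfile.mkdtemp())
    (folder / "a.xml").write_text("<order id='1'/>", encoding="utf-8")
    (folder / "b.xml").write_text("<order id='2'/>", encoding="utf-8")
    (folder / "notes.txt").write_text("ignored", encoding="utf-8")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE xml_documents (file_path TEXT, xml_data TEXT)")
    loaded = import_xml_files(conn, folder)
    count = conn.execute("SELECT COUNT(*) FROM xml_documents").fetchone()[0]
    conn.close()
    return loaded, count
```

The file mask does the same job as the Files setting of the Foreach File enumerator: anything that is not an .xml file is skipped.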