Execute Informatica repository SQL from an Informatica workflow

I am new to Informatica. I am using Informatica 10.1.0 and have created a workflow like the one below.
How can I make this workflow execute the Informatica repository SQL below and fail the workflow if the count is greater than 0?
select count(*) as cnt
from REP_TASK_INST_RUN
where workflow_run_id = (select max(workflow_run_id) from OPB_WFLOW_RUN where WORKFLOW_NAME = 'wf_Load_Customer_Transactions')
and RUN_STATUS_CODE <> 0

You have shared a view of the Workflow Manager. In the Informatica Designer, you can create a mapping with your table as the source. In the Source Qualifier, add your query as the SQL override and load the result into a designated target. After that, you can create the workflow for your mapping and run it.
https://www.guru99.com/mappings-informatica.html
The above link should be a good reference.
Once you have a functional workflow, you can add a Control task that checks the loaded row count and fails the workflow, e.g. when the target row count is greater than 0 (failed tasks were found).

Design an Informatica mapping:
- The Source Qualifier contains the query you provided, and its output is passed to an Expression transformation. Create a mapping variable that stores this count.
- In the workflow, use the post-session variable assignment to assign the mapping variable to a workflow variable.
- Create an Assignment task that checks the value of this workflow variable; if the count > 0, use a Control task to fail the workflow.

One way would be to create a mapping with your query inside a SQL transformation. Set it up to write either to a flat file or to a table in the DB. Add a Filter transformation so the count is written to the target only when it is greater than 0.
Then, in the workflow, set up a session and link it to a Control task that fails the workflow when $TgtSuccessRows > 0, i.e. a row reached the target, meaning the count was non-zero.
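Alternatively, you could push that filter into the query itself so the Filter transformation isn't needed. A hedged sketch, using HAVING on an ungrouped aggregate (which the major databases accept): the query returns a row only when failed tasks exist.
-- Returns one row (the count) when failures exist, zero rows otherwise
SELECT COUNT(*) AS cnt
FROM REP_TASK_INST_RUN
WHERE workflow_run_id = (SELECT MAX(workflow_run_id)
                         FROM OPB_WFLOW_RUN
                         WHERE WORKFLOW_NAME = 'wf_Load_Customer_Transactions')
  AND RUN_STATUS_CODE <> 0
HAVING COUNT(*) > 0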

You can create a dummy session that runs your query, then link it to the next session. On the link you can put a condition such as $count = 0, so the next session in the workflow runs only when the count is 0.

Related

Variable values stored outside of SSIS

This is an SSIS question for advanced programmers. I have a SQL table that holds clientid, clientname, Filename, Ftplocationfolderpath, filelocationfolderpath.
This table holds a unique record for each of my clients. As my client list grows, I add a new row to the SQL table for that client.
My question is this: can I use the values in my SQL table and somehow reference each of them in my SSIS package variables based on client id?
The reason for the SQL table is that we sometimes get requests to change the delivery location or file name of a file we send externally. We would like to change those things dynamically on the fly in the SQL table instead of having to export the package, manually change it, and then re-import it each time. Each client has its own SSIS package.
Let me know if this is feasible. I'd appreciate any insight.
Yes, it is possible. There are two ways to approach this, depending on how the job runs: a single client per job run, or multiple clients per job run.
Either way, you will use an Execute SQL Task to retrieve data from the database and assign it to your variables.
Running for a single client: this is fairly straightforward. In the Result Set pane, select Single row, map the row's columns to your package variables, and go about your processing.
Running for multiple clients: in the Result Set pane, select Full result set and assign the result to a single package variable of type Object; give it a meaningful name like ObjectRs. Then add a Foreach Loop container and configure it:
Type: Foreach ADO Enumerator
ADO object source variable: Select the ObjectRs.
Enumerator Mode: Rows in all the tables (ADO.NET dataset only)
In Variable Mappings, map the columns, in their sequential order, to your package variables. This effectively turns the package into a series of single-client transactions executed in a loop.
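As a hedged sketch, the query behind the Execute SQL Task might look like the following. The table name ClientConfig is invented (the question doesn't name it), and the ? parameter marker assumes an OLE DB connection mapped to a ClientId variable (an ADO.NET connection would use a named parameter instead). For the multi-client case, drop the WHERE clause and return the full result set.
-- Single-client case: one row, mapped to package variables via a Single row result set
SELECT clientname, Filename, Ftplocationfolderpath, filelocationfolderpath
FROM dbo.ClientConfig      -- hypothetical table name
WHERE clientid = ?         -- mapped from the ClientId package variable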
Yes.
I assume that you run your package once per client, or use some loop.
At the beginning of the per-client code, read all required values from the database into SSIS variables, then use these variables to define what you need. You should not hardcode client-specific information in the package.

SSIS: how to use a table created in a SQL Task as the destination in a following Data Flow Task

In SSIS I have a SQL Task that drops and creates a table T. Then I have a Data Flow task that needs to use T as the destination to write data.
The Destination Assistant and the fast-load option need table T to already exist in the database in order to show it as a possible destination.
Maybe I could use SQL Command as the data access mode, but I don't know how to access the incoming data columns from the stream.
How can I use table T as destination in the data flow task?
Store the table name in a package variable, select Table name from variable as the destination's data access mode, and use that. Make sure to set the Delay Validation property to True (on both the Data Flow task and the destination).
Note: while designing the package, table T must exist in the database so the destination can read its structure. Also, if the table name is fixed, you can achieve this without a variable.
Instead of dropping table T in the first SQL Task, truncate it; table T will then be permanently available to the Destination Assistant. Hope this helps.
In the SQL Task, instead of drop and create, can you just DELETE or TRUNCATE the data in table T?
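A minimal sketch of that SQL Task body, assuming table T already exists in dbo with the schema the destination expects:
-- Keep the table and its metadata intact; just clear the rows before each load
TRUNCATE TABLE dbo.T;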

SSIS Script Component - only to change variables

I have a series of tasks that are very similar:
SELECT a,b FROM c
Look up in another table and change the value in column b.
Save the new value back to c and, if there is no match, send the row on to an error table.
That part is pretty straightforward and illustrated here:
Source ==> Lookup =match=> SQL Update command
=No match=> SQL Save Error command
(Hope you understand what I mean - but it works!)
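The two SQL commands at the ends of that flow might look roughly like this (a sketch: the error table name is invented, and the ? markers stand for the mapped input columns):
-- Match path: write the looked-up value back
UPDATE c SET b = ? WHERE a = ?;
-- No-match path: save the row to an error table (hypothetical name)
INSERT INTO c_errors (a, b) VALUES (?, ?);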
I now have to repeat this a number of times, where my source SQL changes. So what I want to do is insert a Script Component in front of the Source and set my User::Sql variable like:
Variables.Sql = "SELECT d, e FROM f"
All of the above is contained in a Data Flow. When I have created one I can then copy that one and only change the Sql variable in the script and then it should all work.
My problem is: when I insert the Script Component it asks me whether it is a Source, a Destination, or a Transformation script. And a script that only sets the variable does not produce any rows of output, so it cannot connect to my Source.
Anyone know how to make that work?
(I have simplified the above. I actually want to update multiple variables and use them in my Source, Lookup, and error update as well, so it is not simpler to just change the SQL in the initial Source! But if I can do the above, I will be able to achieve what I want :-))
You should set your variable containing the SQL query in the control flow, before you execute the dataflow.
Then use that variable in an expression in your data flow. You can parameterize the query used in the Lookup, or any other property of the data flow.
If your data flows really do all have the same structure, you could even generate a list of queries and call your Data Flow task in a loop, avoiding duplication of the same tasks.
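For instance, the queries that vary per iteration could be kept in a small control table that a Foreach loop reads; a sketch, with invented table and column names:
-- One row per data flow iteration; each column feeds one package variable
SELECT SourceQuery, LookupQuery, ErrorTable
FROM dbo.EtlTaskConfig
ORDER BY RunOrder;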

Pentaho Data Integration (PDI): after selecting records, I need to update the field value in the table using a Pentaho transformation

I have a requirement to create a transformation that runs a select statement. After selecting the values, it should update the status so it doesn't process the same record again.
Select file_id, location, name, status
from files
OUTPUT:
1, c/user/, abc, PROCESS
Updated output should be:
1, c/user/, abc, INPROCESS
Is it possible, in a single PDI transformation, to do a database select and cache the records so the same record isn't reprocessed, meaning I wouldn't need to update the status in the database? Something similar to a dynamic lookup in Informatica. If not, what's the best way to update the database after doing the select?
You wouldn't do this in a single transformation, because of the multi-threaded execution model of PDI transformations: you can't count on a variable being set until the transformation ends.
The way to do it is to put two transformations in a job and create a variable in the job. The first transformation runs your select and flows the result into a Set Variables step; configure it to set the variable you created in the job. Then run the second transformation, which contains your Excel Input step, and specify the job-level variable as the file name.
If the select returns more than one result, you can store the file names in the job's file results area instead. You do this with a Set files in result step. Then you can configure the job to run the second transformation once for each result file.
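If you do end up updating the status in the database, as the question's fallback suggests, the statement behind a PDI Update step (or an Execute SQL script step) would be along these lines; a sketch using the columns from the question:
-- Mark the selected row so a later run doesn't pick it up again
UPDATE files
SET status = 'INPROCESS'
WHERE file_id = 1;  -- or WHERE status = 'PROCESS' to claim every pending row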

From an SSIS package, how should I execute a SQL script stored in a table as a varchar(max)?

I'm storing large (varchar(max)) SQL scripts in a table. I'd like to execute the scripts in an SSIS package.
Looking at other posts on this site, it's easy enough to get the varchar(max) into an Object variable. But then what? Is there a way for an Execute SQL Task (SQLSourceType of Variable) to use an Object variable rather than a String variable?
Is there an approach that will work?
Here's how I might approach it:
Add a Data Flow task to your control flow
Add a Source (ADO.NET) that connects to your database
Create a Package level Object variable (for the next step)
Add a Recordset destination that populates your data into the Object variable created in the previous step
Back on the control flow:
Create a package level String variable for the "current" query (see next step)
Add a Foreach Loop container with the Foreach ADO enumerator
Connect the previous Data Flow task to the Foreach Loop
Configure the Foreach Loop to use the Object variable as its source, and map the column containing the SQL text to the String variable
Add an Execute SQL task inside the For Each task
Configure it to execute a SQL Command from Variable, and pick the string variable containing the current query
Basically it will collect the queries from the table, then for each collected query, assign it to a variable, and then the Execute SQL command can pull the command text from that variable.
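The source query feeding the Recordset destination could be as simple as the following; the table and column names are invented for illustration:
-- One row per stored script; the varchar(max) column holds the full SQL text
SELECT SqlScript
FROM dbo.ScriptTable
ORDER BY ScriptId;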