Informatica PowerCenter: how do I make a decision in the flow based on a scalar query result?

I'm struggling with what seems like the simplest thing there is: assigning a value to a mapping variable that I later use in my flow to make a decision... With my MS SSIS background this is a 10-second task, but in Informatica PowerCenter it is taking me hours...
So I have a mapping variable $$V_FF and a workflow variable $$V_FF. At first the names were different, but while trying things out I changed that. That shouldn't matter, right?
In a mapping, I have a view as a source that returns -1, 0 or 1. The mapping variable aggregate function is set to MIN.
In the session that I have created for this mapping, I have a post-session assignment between the wf variable and the mapping variable.
In this mapping I use the SETVARIABLE function in an Expression transformation.
Every time I run the workflow, I see in the log that it uses a persistent value instead of assigning a new value on each run...
What am I missing here?
Thanks in advance!

Well, the variables here do work in a somewhat different way. It would be easier to come up with a good answer if you explained the whole scenario: what are you using the variable for?
Anyway, the variable values are persisted in the repository and reused, as you've noticed. For your scenario you could add an Assignment Task to the workflow before your Session. Set some low value (e.g. -1 if you expect your variable to have some positive value after the mapping run) and use a Pre-session Variable Assignment to pass the value to the mapping. This overrides the use of the persisted repository value. Of course, in this case you will need to use Max aggregation.

In the end, I managed to accomplish what I wanted back then. There might be a better way, but this solution is easy to maintain and easy to understand.
Create a variable in your workflow, let's say $$FailureFlag, with type Integer.
Create a view in your DB that returns one row with an integer value between 0 and x, where x is a positive integer (see the sketch after these steps).
Create a mapping with the view we just created as the source and a dummy table as the target.
In this mapping, also create a variable, let's say $$MYVAR, with type Integer and aggregation "Count". In an Expression transformation, assign the result of the view, column FF, to this variable by using SETVARIABLE($$MYVAR, FF).
From this mapping, create a session in your workflow. On the Components tab, in the "Postsession_success_variable_mapping" section, add a row and link workflow variable $$FailureFlag with mapping variable $$MYVAR.
Add a Decision task right after the session you just created and test the content of your workflow variable, for example $$FailureFlag = 1.
Then connect your Decision task to your destination and add a test clause, for example:
"$MyDecision.Condition = true AND $MyDecision.PrevTaskStatus = succeeded"
Voila, that's it.
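For the view in these steps, a minimal sketch of the shape it needs; the table and condition here are invented, and only the single-row, single-integer-column result matters:

-- Hypothetical view: returns exactly one row with one integer column FF.
-- staging_errors stands in for whatever condition you actually check.
CREATE VIEW dbo.v_failure_flag AS
SELECT CASE WHEN EXISTS (SELECT 1 FROM staging_errors) THEN 1 ELSE 0 END AS FF;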


Pass SQL query values to a Data Factory variable as an array for a ForEach loop

Similar to this question: how to pass variables to an Azure Data Factory REST URL's query string.
However, I have a pipeline that queries the Graph API, where I need to pass a user id as part of the URL to get that user's manager and build an Active Directory staff hierarchy. This is fine on an individual basis, or even as a predefined array variable where I insert ["xx","xxx"] into the pipeline variable. My challenge is that the array variable needs to come from a SQL query: instead of defining the list of users by hand, I need to pass the results of a SQL query into the ForEach loop.
I can use a Lookup feeding a Set Variable activity, but the URL ends up malformed, with extra characters added in for some reason.
It returns graph.microsoft.com/v1.0/users/%7B%7B%22id%22:%22xx9e7878-bwbbb-bwbwbwr-7897-414a8e60c78c%22%7D%7D/?$expand=xxxxxx, where the "%7B%7B%22id%22:%" and "%22%7D%7D/" parts are all unnecessary and appear to come from the JSON rather than just using the value.
The Lookup runs the query against SQL.
The Set Variable activity assigns the Lookup's value to a pipeline variable as an array.
Then the ForEach loop uses the variable value in the source:
@concat('users/{',item(),'}/?$expand=manager($levels=max;$select=id,displayName,userPrincipalName,createdDate)')
If anyone can suggest how to construct the array value dynamically, that would be great.
I have used:
SELECT '["'+STRING_AGG(CONVERT(NVARCHAR(MAX),t.[id]),'","')+'"]' AS id FROM
stage.extract_msgraphapi_users t LEFT JOIN stage.extract_msgraphapi_users s ON s.id = t.id
and this returns something that looks like an array, ["xx","xxx"], but Data Factory still interpreted it as a string and not an array. Any help would be appreciated.
10 minutes later:
@concat('users/{',item().id,'}/?$expand=manager($levels=max;$select=id,displayName,userPrincipalName,createdDate)')
Note the reference to item().id to use the id property of each array element. Works like a dream, for anyone else facing the same issue.
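For anyone reconstructing this: once the ForEach reads item().id, the STRING_AGG string-building is arguably unnecessary. A hedged simplification, reusing only the table name from the question, is to have the Lookup return plain rows and let the ForEach iterate over the Lookup's output array directly:

-- Hypothetical simplified lookup query: each row becomes an object
-- with an id property, which the loop then reads as item().id.
SELECT t.[id]
FROM stage.extract_msgraphapi_users t;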

Pass value from job to transformation in Pentaho

I have a transformation in Pentaho PDI whose SQL statement contains a question-mark placeholder.
The transformation is called from a job. What I need is to get the value from the user when the job is run and pass it to the transformation so the question mark is replaced.
My problem is that there are parameters, arguments, and variables, and I don't know which one to use. How do I make this work?
What karan means is that your SQL should look like delete from REFERENCE_DATA where rtepdate = ${you_name_it}, with the Variable substitution box checked. The you_name_it parameter must be declared in the transformation options (click anywhere in the Spoon panel, then Options/Parameters), with or without a default value.
When running the transformation, you are prompted with a panel where you can set the value of the parameters, including you_name_it.
Parameters pass from job to transformation transparently, so you can declare you_name_it as a parameter of the job. Then when the user runs the job, they will be prompted to give values to a list of parameters, including you_name_it.
Another way to achieve the same result is to use arguments. The question marks will be replaced by the fields specified in the Parameters list box, in the same order. Of course, the fields you use must be defined in a previous step; in your case, a Get Variables step, which reads the variables defined in the calling job and puts them in a row.
Note that there is a ready-made Delete step to delete records from a database. Specify the table name (which can be a parameter: just Ctrl+Space in the box), the table column, and the condition. The condition will come from a previous step, defined in a Get Variables step as in the argument method.
You can use variables or arguments. If you are using variables, use the ${variable1} syntax in your query; if you want to use arguments, use ? in your query and mention the names of those arguments in the "Field names to be used as arguments" section. Both will work. Let me know if you need further clarification.
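To make the two styles concrete, here is a minimal sketch against the REFERENCE_DATA table from the question; treat both statements as illustrative only:

-- Variable style: requires the 'Variable substitution' box to be checked,
-- and you_name_it to be declared as a transformation parameter.
DELETE FROM REFERENCE_DATA WHERE rtepdate = ${you_name_it};

-- Argument style: the ? is bound from the fields listed in
-- 'Field names to be used as arguments', in order.
DELETE FROM REFERENCE_DATA WHERE rtepdate = ?;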

Best Way to Handle SQL Parameters?

I essentially have a database layer that is totally isolated from any business logic. This means that whenever I get ready to commit some business data to the database, I have to pass all of the business properties in as the data method's parameters. For example:
Public Function Commit(foo As Object) As Boolean
This works fine, but when I get into commits and updates that take dozens of parameters, it can be a lot of typing. Not to mention that two of my methods, update and create, take the same parameters, since they essentially do the same thing.
What I'm wondering is: what would be an optimal solution for passing these parameters, so that I don't have to change the parameters in both methods every time something changes, and so I can reduce my typing? I've thought of a few possible solutions. One would be to move all the SQL parameters to the class level of the data class and then store them in some sort of array that I set in the business layer. Any help would be useful!
So essentially you want to pass in a List of Parameters?
Why not redo your Commit function and have it accept a List of Parameter objects?
If you're on SQL 2008 you can use MERGE to replace the insert/update juggling. This is sometimes called an upsert.
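For illustration, a minimal MERGE-based upsert sketch; the dbo.Customers table, its columns, and the sample values are all invented:

-- Invented sample values standing in for the business data being committed.
DECLARE @id INT, @name NVARCHAR(100)
SET @id = 1
SET @name = N'Alice'

MERGE dbo.Customers AS target
USING (SELECT @id AS id, @name AS name) AS source
    ON target.id = source.id
WHEN MATCHED THEN
    UPDATE SET target.name = source.name
WHEN NOT MATCHED THEN
    INSERT (id, name) VALUES (source.id, source.name);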
You could create a struct to hold the parameter values.
Thanks for the responses, but I think I've figured out a better way for what I'm doing. It's similar to using an upsert, but what I do is have one method called Commit that looks for the given primary key. If the record is found in the database, I execute an update command; if not, I do an insert command. Since the parameters are the same, you don't have to worry about changing them in two places.
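In T-SQL terms, that Commit logic might look like the following sketch, reusing the invented dbo.Customers table from the MERGE example above:

-- Exists-check upsert: update when the key is already present, insert otherwise.
IF EXISTS (SELECT 1 FROM dbo.Customers WHERE id = @id)
    UPDATE dbo.Customers SET name = @name WHERE id = @id;
ELSE
    INSERT INTO dbo.Customers (id, name) VALUES (@id, @name);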
For your problem, I'd suggest the Iterator design pattern. Pass in an interface implementation, say ICommitableValues, that exposes a key/value enumeration: the keys are the column names and the values are the column values to commit. A dedicated property can even return the table name to insert the values into, the stored procedure to use, and so on.
To save typing, you can use declarative syntax (attributes) to mark the commitable properties, and a class in the middleware can use reflection to extract the values of those properties and prepare an ICommitableEnumeration implementation from them.

NHibernate: Return A Constant In HQL

I need to return a constant from an HQL query in NHibernate:
SELECT new NDI.SomeQueryItem(user, account, " + someNumber + ")
FROM NDI.SomeObject object
I am trying for something like above. I've tried this:
SELECT new NDI.SomeQueryItem(user, account, :someNumber)
FROM NDI.SomeObject object
And then later:
.SetParameter("someNumber", 1).List<SomeQueryItem>();
But in the first case I get an 'Undefined alias or unknown mapping 1', which makes some sense, since it probably thinks the 1 is an alias.
For the second I get an 'Undefined alias or unknown mapping :someNumber', which again makes some sense if it never set the parameter.
I have to believe there's some way to do this.
Please feel free to continue to believe there is some way to do this - but with HQL there isn't!
Why would you want to anyway? If you want to update the value of this property to the value you specify, then do so after you've loaded the objects. Alternatively, if your result set doesn't quite match your objects, you could always use a SQL query (which you can still do via an NHibernate session). But the purpose of NHibernate is to map what's in your database onto objects, so specifying a manual override like this is quite rightly not allowed.
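For what it's worth, plain SQL has no such restriction, so the native-query route through the session is straightforward; a sketch with invented table and column names:

-- The literal 1 is the constant that HQL refused to project.
SELECT u.name   AS user_name,
       a.number AS account_number,
       1        AS some_number
FROM users u
JOIN accounts a ON a.user_id = u.id;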
It sounds like there is a (small?) disconnect between your domain objects and your database model. What about creating a small "DTO" object to bridge this gap?
Have your query return a list of SomeQueryItemDTO (or whatever you want to call it) which, due to the naming, you know is not a true part of your domain. Then have some function to process the list and build a list of true SomeQueryItem objects by incorporating the data that is extraneous to the database.
If you're already using the Repository Pattern, this should be easier since all the ugly details are hidden inside of your repository.

Can I maintain state between calls to a SQL Server UDF?

I have a SQL script that inserts data (via INSERT statements currently numbering in the thousands). One of the columns contains a unique identifier (though not an IDENTITY column, just a plain ol' int) that's actually unique across a few different tables.
I'd like to add a scalar function to my script that gets the next available ID (i.e. last used ID + 1), but I'm not sure this is possible: there doesn't seem to be a way to use a global or static variable from within a UDF, I can't use a temp table, and I can't update a permanent table from within a function.
Currently my script looks like this:
declare @v_baseID int
exec dbo.getNextID @v_baseID out --sproc to get the next available id
--Lots of these - where n is a hardcoded value
insert into tableOfStuff (someStuff, uniqueID) values ('stuff', @v_baseID + n)
exec dbo.UpdateNextID @v_baseID + lastUsedn --sproc to update the last used id
But I would like it to look like this:
--Lots of these
insert into tableOfStuff (someStuff, uniqueID) values ('stuff', dbo.getNextID())
Hardcoding the offset is a pain in the arse and is error-prone. Packaging it up into a simple scalar function is very appealing, but I'm starting to think it can't be done that way, since there doesn't seem to be a way to maintain the offset counter between calls. Is that right, or is there something I'm missing?
We're using SQL Server 2005 at the moment.
edits for clarification:
Two users hitting it won't happen. This is an upgrade script that will be run only once, and never concurrently.
The actual sproc isn't prefixed with sp_, fixed the example code.
In normal usage, we do use an id table and a sproc to get IDs as needed, I was just looking for a cleaner way to do it in this script, which essentially just dumps a bunch of data into the db.
"I'm starting to think it can't be done that way since there doesn't seem to be a way to maintain the offset counter between calls. Is that right, or is there something I'm missing?"
You aren't missing anything; SQL Server does not support global variables, and it doesn't support data modification within UDFs. And even if you wanted to do something as kludgy as using CONTEXT_INFO (see http://weblogs.sqlteam.com/mladenp/archive/2007/04/23/60185.aspx), you can't set that from within a UDF anyway.
Is there a way you can get around the "hardcoding" of the offset by making it a variable and doing the inserts inside a loop that increments it?
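A sketch of that loop, reusing the names from the question's script; the row count is invented, and the declare/set split keeps it valid on SQL Server 2005:

declare @v_baseID int, @n int, @lastUsed int
exec dbo.getNextID @v_baseID out   --sproc to get the next available id

set @n = 0
while @n < 1000                    --however many rows the script inserts
begin
    insert into tableOfStuff (someStuff, uniqueID)
    values ('stuff', @v_baseID + @n)
    set @n = @n + 1
end

set @lastUsed = @v_baseID + @n - 1 --ids consumed: @v_baseID .. @v_baseID + 999
exec dbo.UpdateNextID @lastUsed    --sproc to update the last used id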
If you have two users hitting it at the same time, they will get the same id. Why not use an id table with an IDENTITY column instead? Insert into that and use the generated value as the unique (and guaranteed) id; this will also perform much faster.
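A hedged sketch of that id-table approach; the table and column names are invented:

-- One row per allocated id; IDENTITY guarantees uniqueness even
-- under concurrent callers, with no counter to maintain by hand.
create table dbo.IdSource (id int identity(1,1) primary key, allocatedAt datetime not null)

insert into dbo.IdSource (allocatedAt) values (getdate())
select scope_identity() as newId   -- the freshly allocated unique id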
sp_getNextID
Never, ever prefix procs with sp_. This has performance implications, because the optimizer first checks the master DB to see if the proc exists there and only then the local DB; also, if MS decides to create an sp_getNextID in a service pack, yours will never get executed.
It would probably be more work than it's worth, but you can use static C#/VB variables in a SQL CLR UDF, so I think you'd be able to do what you want by simply incrementing such a variable every time the UDF is called. The static variable would be lost whenever the appdomain unloaded, of course. So if you needed continuity of your ID from one day to the next, you'd need a way, on first access of NextId, to poll all of the tables that use this ID to find the highest value.