Check Stored procedure has returned a value - azure-data-factory-2

I am a newbie to Data Factory. As part of my pipeline, I execute a stored procedure to fetch the next record to process using a Lookup activity, and then use the returned value in a Set Variable activity.
If the stored procedure returns nothing, the Set Variable fails with the following error:
Activity SetBatchId failed: The expression 'activity('usp_get_next_archive_batch').output.firstRow.id' cannot be evaluated because property 'firstRow' doesn't exist, available properties are 'effectiveIntegrationRuntime'.
Is there a way in Data Factory to check that the property exists before using it?
Thanks

Add a question mark after 'output', i.e. 'output?.firstRow'. The safe-navigation operator returns null instead of raising an error when the property is missing.
See also this post.
Azure Data Factory: For each item() value does not exist for a particular attribute

The expression should be activity('usp_get_next_archive_batch').output['firstRow']['id'] (bracket notation throughout; the original .['id'] mixes dot and bracket syntax and is invalid).
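To make the whole thing resilient, the safe-navigation form can be combined with coalesce so the variable gets a fallback value when no row comes back. A minimal sketch (the -1 sentinel is an assumption, not from the original post):
@coalesce(activity('usp_get_next_archive_batch').output?.firstRow?.id, -1)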


Read specific file names in ADF pipeline

I have a requirement where blob storage has multiple files named file_1.csv, file_2.csv, file_3.csv, file_4.csv, file_5.csv, file_6.csv, file_7.csv. From these I have to read only the file names from 5 to 7.
How can we achieve this in an ADF/Synapse pipeline?
I have reproduced this in my lab; please see the repro steps below.
ADF:
Using the Get Metadata activity, get a list of all files.
(Parameterize the source file name in the source dataset and pass '*' as the dataset parameter to get all files.)
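Note that the Get Metadata activity's field list must include the Child items argument; otherwise childItems does not appear in the output.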
Pass the Get Metadata output childItems to a ForEach activity:
@activity('Get Metadata1').output.childItems
Add an If Condition activity inside the ForEach and set the true-case expression so only the required files are copied to the sink. For names like file_5.csv the digit is the character at zero-based index 5:
@and(greater(int(substring(item().name,5,1)),4),lessOrEquals(int(substring(item().name,5,1)),7))
When the If Condition is true, add a Copy Data activity inside it to copy the current item (file) to the sink.
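A minimal sketch of the copy wiring (assuming the parameterized source dataset exposes a fileName parameter): in the True branch, set the Copy activity's source dataset parameter to @item().name so each iteration copies only the current file.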
I took a slightly different approach using a Filter activity and the endsWith function:
The filter expression is:
@or(or(endsWith(item().name, '_5.csv'),endsWith(item().name, '_6.csv')),endsWith(item().name, '_7.csv'))
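A ForEach can then consume the filtered list; assuming the Filter activity is named Filter1, its items expression would be @activity('Filter1').output.Value.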
Slightly different approaches, similar results; it depends on what you need.
You can always do what @NiharikaMoola-MT suggested. But since you already know the range of the files (5-7), I suggest:
Declare two parameters as the lower and upper limits of the range.
Create a ForEach loop whose items are built from those parameters with the range() function (note that range(startIndex, count) takes a start and a count, not an upper bound; a sketch follows this list).
Create a parameterized dataset for the source.
Use the file number from the ForEach loop to create a dynamic expression like:
@concat('file_', item(), '.csv')
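A minimal sketch of the ForEach items expression referenced above, assuming pipeline parameters named lowerLimit and upperLimit:
@range(pipeline().parameters.lowerLimit, add(sub(pipeline().parameters.upperLimit, pipeline().parameters.lowerLimit), 1))
(range() generates count consecutive integers starting at the first argument, hence the add/sub arithmetic for the count.)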

Write to a dynamic BigQuery table through Apache Beam

I am getting the BigQuery table name at runtime and I pass that name to the BigQueryIO.write operation at the end of my pipeline to write to that table.
The code that I've written for it is:
rows.apply("write to BigQuery", BigQueryIO
.writeTableRows()
.withSchema(schema)
.to("projectID:DatasetID."+tablename)
.withWriteDisposition(WriteDisposition.WRITE_TRUNCATE)
.withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED));
With this syntax I always get an error:
Exception in thread "main" java.lang.IllegalArgumentException: Table reference is not in [project_id]:[dataset_id].[table_id] format
How do I pass the table name in the correct format when I don't know beforehand which table the data should go in? Any suggestions?
Thank You
Very late to the party on this, however:
I suspect the issue is that you were passing in a string, not a table reference.
If you create a TableReference, I suspect you'd have no issues with the above code.
com.google.api.services.bigquery.model.TableReference table = new TableReference()
        .setProjectId(projectID)
        .setDatasetId(DatasetID)
        .setTableId(tablename);

rows.apply("write to BigQuery", BigQueryIO
        .writeTableRows()
        .withSchema(schema)
        .to(table)
        .withWriteDisposition(WriteDisposition.WRITE_TRUNCATE)
        .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED));
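As a side note: if the table has to vary per element rather than once per run, BigQueryIO's to() also accepts a function that maps each element to a TableDestination. A sketch assuming Beam 2.x; pickTable here is a hypothetical helper that derives the table id from the row:
rows.apply("write to BigQuery", BigQueryIO
        .writeTableRows()
        .withSchema(schema)
        // each element picks its own destination table
        .to((ValueInSingleWindow<TableRow> row) -> new TableDestination(
                projectID + ":" + DatasetID + "." + pickTable(row.getValue()), null))
        .withWriteDisposition(WriteDisposition.WRITE_APPEND) // append; adjust disposition as needed
        .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED));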

Mule: how to send Array parameter to DB Update

I have a PG table with a field of type char(10)[].
I need to update a record in the table with values from a Mule flow.
So, I did something like this:
flowVars.test = ['aaa', 'bbb', 'ccc'];
Then, I'm trying to submit an update statement like this:
update tab1 set fld1=#[flowVars.test]
It's failing with the error:
Cannot cast an instance of java.util.ArrayList to type Types.ARRAY
My understanding is that an SQL array should be used in this scenario, but I can't figure out how to get an instance of such an array in a flow and how to work with it in MEL.
Can someone please advise?
Thank you,
There are many sources that suggest using Connection#createArrayOf(), but I don't know how to use it with the Database connector.
However, for this purpose I did the following:
Convert the ArrayList to a String, formed as: {value1, value2, ...}
Change the Database query type from Parameterized to Dynamic.
Update the SQL query to become: update tab1 set fld1 = '#[flowVars.test]'. The additional single quotes are required for this query type.
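For example, with flowVars.test = ['aaa', 'bbb', 'ccc'], toString() gives "[aaa, bbb, ccc]" and the replace() calls below turn it into "{aaa, bbb, ccc}", so the rendered statement becomes:
update tab1 set fld1 = '{aaa, bbb, ccc}'
which PostgreSQL parses as an array literal for the character(10)[] column.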
Finally, the following configuration let me update a field of type character(10)[]:
<expression-transformer expression="#[flowVars.test = ['aaa', 'bbb', 'ccc'].toString().replace('[', '{').replace(']', '}')]" doc:name="Expression"/>
<db:update config-ref="Postgre_Database_Configuration" doc:name="Database">
<db:dynamic-query><![CDATA[update tab1 set fld1 = '#[flowVars.test]']]></db:dynamic-query>
</db:update>
OK, I've found an answer in the MuleSoft documentation.
Starting from version 3.6, the DB connector supports custom types and allows defining mappings between SQL arrays and structures and custom user classes.
It's documented here.

.NET ODBC Oracle params: getting param names returned by the DB provider - possible?

I'm converting some RDO code to ODBC provider code in .NET.
The problem is that parameter names were not specified in the original code, but param values were retrieved by parameter name after the command was executed.
Is there any way to have parameter names populated by the provider once the command is executed, so calling code can access params by name?
Let me show you an example of the declaration of a param and the accessing of it.
With rdqryClntBasic
    .Parameters.Add(.CreateParameter) : .Parameters(0).Direction = ParameterDirection.Input
    .Parameters(0).DbType = DbType.String
    .Parameters(0).Value = sClntProdCd
End With
.EffectiveDate = ToDate(rdqryClntBasic.Parameters("dtEffDt").Value)
You can now see how this "used to work" in RDO/VB. For some reason it would accept this and know what the param names were after execution; I imagine it did another round trip to the db to get this info.
Is there any way to mimic this behaviour in .NET with the ODBC provider (using Oracle)? Or am I stuck manually specifying the param names in the code? (I understand that's the better option, but I'm wondering what the alternative is to match the original code as closely as possible.)
No, parameters in ODBC are positional, not named.
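That said, you can name the parameters yourself when adding them: ODBC binds strictly by position, but the local names let calling code keep retrieving values by name. A minimal sketch assuming System.Data.Odbc; the call syntax and the names here are illustrative, not from the original code:
Dim cmd As New OdbcCommand("{call get_clnt_basic(?, ?)}", conn)
cmd.Parameters.Add("sClntProdCd", OdbcType.VarChar).Value = sClntProdCd
cmd.Parameters.Add("dtEffDt", OdbcType.DateTime).Direction = ParameterDirection.Output
cmd.ExecuteNonQuery()
EffectiveDate = ToDate(cmd.Parameters("dtEffDt").Value) ' lookup by the local name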

Using a variable from Foreach Loop Container in a SQL Task [SSIS]

OK, I have a simple process...
Read a table and get the rows that have a "StatusID" of 1. Simple.
Select ProductID from PreorderStatus where StatusID = 1
For each row returned from that query, perform an action. For simplicity's sake, let's just modify the original table to set the "StatusID" to 2.
Update PreorderStatus set StatusID = 2 where ProductID = @ProductID
In order to do this in SSIS, I have created a simple "Execute SQL Task" with the first statement. In the editor I have set the Result Set to return a Full result set and the Result Name of 0 is set to fill an object variable named ReadySet.
The output is then routed to a For Each Loop container. The Enumerator is set to Foreach ADO Enumerator and the object source variable set to the ReadySet variable from above. I have also mapped the variable v_ProductID to index 0.
Setting a breakpoint at the beginning of the Foreach loop shows the variable being set correctly. GREAT!! Now on to step two....
Now I have placed a new SQL task in the Foreach container, and now I have a head scratcher: how do I actually use the variable in the SQL statement? Simply using "v_ProductID" or "User::v_ProductID" doesn't seem to work. Mapping a parameter seemed like a good idea (got a @ProductID and everything!) but that didn't seem to work either.
I get the feeling that I am missing something pretty simple but can't tell what. Thanks for any help!!
I think there is a better approach. Here are the approximate steps:
Drag a Data Flow task onto the design surface.
Open it up and add an OLE DB Source and an OLE DB Command component to the design surface.
Modify the source to use the query you have described.
Connect the source to the OLE DB Command component.
Modify the command component to use the "Update PreorderStatus set StatusID = 2 where ProductID = ?" query, and on the parameter mapping page map the ? parameter to the ProductID column coming from the source (see the note after this list).
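(Note: assuming default naming, the OLE DB Command exposes each ? placeholder on its mapping page as a column named Param_0, Param_1, and so on.)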
HTH
When I want to use an Execute SQL Task and vary something based on a variable, I use a stored proc and make the variable the input parameter for the proc.
Then you set the parameter in the Execute SQL Task and set the SQL statement to something like:
exec myproc ?
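A minimal sketch of such a proc for the update in the question (the name dbo.MarkPreorderProcessed is illustrative):
CREATE PROCEDURE dbo.MarkPreorderProcessed
    @ProductID int
AS
    UPDATE PreorderStatus SET StatusID = 2 WHERE ProductID = @ProductID;
With an OLE DB connection, the Execute SQL Task uses ? as the parameter marker and the ordinal 0 as the parameter name in Parameter Mapping, which is likely why mapping @ProductID directly didn't work above.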