How to build a dynamic URL for the HTTPConnector in CloverETL

How do I build a dynamic URL in the HTTPConnector to map values from a SQL source in CloverDX? So far I have a DB reader component that selects one column (a list of TAXIDs) from a table, and that column should become the dynamic attribute/parameter of the URL.
I need to build the URL with a path parameter (TAXID) and a query parameter (today's date).
Something like:
GET baseurl/api/search/taxid/{taxid}?date={getdate(today)}

Use the 'Input Mapping' property of the HTTPConnector, where you can build the URL manually, e.g. $out.0.URL = 'baseurl/api/search/taxid/' + $in.0.taxid + '?date=' + date2str(today(), 'yyyy-MM-dd')
OR
use 'Add input fields as parameters': provide records with the fields 'taxid' and 'date' (with the date value pre-filled) and the query string will be built automatically, on the fly, from the values in those fields.
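For the first option, a minimal Input Mapping sketch in CTL2 could look like the one below. The base URL and the yyyy-MM-dd date format are assumptions, not taken from the question, so adjust them to your API:

//#CTL2
function integer transform() {
    // Build the request URL from the incoming TAXID (assumed to be a string field) and today's date
    $out.0.URL = "http://baseurl/api/search/taxid/" + $in.0.taxid
        + "?date=" + date2str(today(), "yyyy-MM-dd");
    return ALL;
}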

Read specific file names in an ADF pipeline

I have got a requirement saying that blob storage has multiple files with the names file_1.csv, file_2.csv, file_3.csv, file_4.csv, file_5.csv, file_6.csv, file_7.csv. From these I have to read only the files numbered 5 to 7.
How can we achieve this in an ADF/Synapse pipeline?
I have repro'd this in my lab; please see the repro steps below.
ADF:
Using the Get Metadata activity, get a list of all files.
(Parameterize the source file name in the source dataset to pass ‘*’ in the dataset parameters to get all files.)
(Get Metadata output screenshot not included.)
Pass the Get Metadata output child items to a ForEach activity:
@activity('Get Metadata1').output.childItems
Add an If Condition activity inside the ForEach and add the True-case expression so that only the required files are copied to the sink (substring(item().name,5,1) extracts the digit after the underscore in file_N.csv):
@and(greater(int(substring(item().name,5,1)),4),lessOrEquals(int(substring(item().name,5,1)),7))
When the If Condition is True, add a Copy Data activity to copy the current item (file) to the sink.
(Source, sink, and output screenshots not included.)
I took a slightly different approach, using a Filter activity and the endsWith function:
The filter expression is:
@or(or(endsWith(item().name, '_5.csv'),endsWith(item().name, '_6.csv')),endsWith(item().name, '_7.csv'))
Slightly different approaches, similar results; it depends on what you need.
You can always do what @NiharikaMoola-MT suggested. But since you already know the range of the files (5-7), I suggest the following (a sketch of the expressions is given after these steps):
Declare two parameters as the lower and upper bounds of the range.
Create a ForEach loop and use the parameters to build the range [lowerLimit, upperLimit] for its items.
Create a parameterized dataset for the source.
Use the file number from the ForEach loop to create a dynamic expression like
@concat('file_', item(), '.csv')
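A sketch of the expressions this approach could use, assuming two pipeline parameters of type Int named lowerLimit and upperLimit (the names are illustrative, not from the original answer):

ForEach activity Items:
@range(pipeline().parameters.lowerLimit, add(sub(pipeline().parameters.upperLimit, pipeline().parameters.lowerLimit), 1))

Source dataset file name inside the loop:
@concat('file_', item(), '.csv')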

How can I leverage a dynamic Data Flow in Azure Data Factory to map lookup tables based on a config file?

I am attempting to create a pipeline that accepts values from a JSON config file and uses them to build a source query, lookup logic, and a destination sink based on the values from the file.
An example of an object from the config file would look something like this:
{
  /* Destination Table fields */
  "destTableName": "DimTable1",
  "destTableNaturalKey": "ClientKey, ClientNaturalKey",
  "destTableSchema": "dbo",
  /* Source Table fields */
  "sourcePullFields": "ClientKey, ClientNaturalKey",
  "sourcePullFilters": "WHERE ISNULL(ClientNaturalKey,'') <> ''",
  "sourceTableName": "ClientDataStaged",
  "sourceTableSchema": "stg"
}
The pipeline would identify the number of items within the config (via a ForEach) that need to be checked for new data, in a basic pipeline like this:
(Pipeline screenshot not included.)
I would then pass these values into the data flow, from the ADF Pipeline:
(ADF parameters screenshot not included.)
And build the source pull and lookup values within the Data flow expressions with something like this:
concat('SELECT DISTINCT ', $sourcePullFields, ' FROM ', $sourceTableSchema, '.', $sourceTableName, ' ', $sourcePullFilters)
When I am within the data flow and pass the same config values in the debug settings, I can view projections and step through the data flow correctly. It is when I execute the data flow from the pipeline that I get errors.
As a second attempt, I simply passed through the source query within the config:
{
  "destQuery": "SELECT Hashbytes('MD5', (cast(ClientKey as varchar(5)) + ClientNaturalKey)) AS DestHashVal FROM dbo.DimTable1",
  "sourceQuery": "SELECT DISTINCT ClientKey, ClientNaturalKey, Hashbytes('MD5', ( Cast(SchoolKey AS VARCHAR(5)) + ClientNaturalKey )) AS SourceHashVal FROM stg.ClientNaturalKey WHERE Isnull(ClientNaturalKey, '') <> ''"
}
I had intended to use the md5 function within the data flow expressions, but at this point I simply want to:
Define a source query, whether it be via a SQL statement or built from variables
Define a lookup query, whether it be via a SQL statement or built from variables
Have the ability to compare a Hashed value(s) from source to the lookup (destination table)
If the lookup returns no match on the hash, load the values
(Data flow screenshot not included.)
Ideally I am not defining the SQL statement directly; it just feels less intelligent. Regardless, the point is to avoid migrating ~50 DFTs from SSIS one by one and instead use a few pipelines and a single data flow that can handle the dynamism. Since the process has been working within the confines of the data flow, I have been experimenting with passing in the parameters in different ways, removing quotes, unsure of what the string interpolation is doing, etc.

How to Map Input and Output Columns dynamically in SSIS?

I have to upload data into SQL Server from .dbf files through SSIS.
My output columns are fixed, but the input columns are not, because the files come from the client and the client may have arranged the data in their own style. There may be some unused columns too, or an input column name can differ from the output column name.
One idea I had in mind was to map each file's input columns to the output columns in a SQL Server table and use only those columns that are present in the rows for that file ID.
But I am not sure how to do that. Any idea?
Table example:

FileID | InputColumn   | OutputColumn | Active
1      | CustCd        | CustCode     | 1
1      | CName         | CustName     | 1
1      | Address       | CustAdd      | 1
2      | Cust_Code     | CustCode     | 1
2      | Customer Name | CustName     | 1
2      | Location      | CustAdd      | 1
If you create a similar table, you can use it in two approaches to map columns dynamically inside an SSIS package; otherwise you must build the whole package programmatically. In this answer I will try to give you some insights on how to do that.
(1) Building Source SQL command with aliases
Note: this approach will only work if all .dbf files have the same column count but the column names differ.
In this approach you generate the SQL command that will be used as the source, based on the FileID and the mapping table you created. The FileID and the .dbf file path must be stored inside variables. As an example:
Assuming that the table name is inputoutputMapping
Add an Execute SQL Task with the following command:
DECLARE @strQuery as VARCHAR(4000)
SET @strQuery = 'SELECT '
SELECT @strQuery = @strQuery + '[' + InputColumn + '] as [' + OutputColumn + '],'
FROM inputoutputMapping
WHERE FileID = ?
SET @strQuery = SUBSTRING(@strQuery,1,LEN(@strQuery) - 1) + ' FROM ' + CAST(? as Varchar(500))
SELECT @strQuery
In the Parameter Mapping tab, map the variable that contains the FileID to parameter 0 and the variable that contains the .dbf file name (used in place of a table name) to parameter 1.
Set the ResultSet type to Single row and store result set 0 inside a variable of type String, for example @[User::SourceQuery].
The resulting value will look like the following:
SELECT [CustCd] as [CustCode],[CNAME] as [CustName],[Address] as [CustAdd] FROM database1
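Based on the mapping table above, the same task would generate a query like the following for FileID = 2 (the FROM clause again comes from the second mapped parameter, i.e. that client's .dbf file):
SELECT [Cust_Code] as [CustCode],[Customer Name] as [CustName],[Location] as [CustAdd] FROM ...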
In the OLE DB Source, set the data access mode to 'SQL command from variable' and use the @[User::SourceQuery] variable as the source.
(2) Using a Script Component as Source
In this approach you have to use a Script Component as Source inside the Data Flow Task:
First of all, you need to pass the .dbf file path and SQL Server connection to the script component via variables if you don't want to hard code them.
Inside the script editor, you must add an output column for each column found in the destination table and map them to the destination.
Inside the Script, you must read the .dbf file into a datatable:
C# Read from .DBF files into a datatable
Load a DBF into a DataTable
After loading the data into a datatable, also fill another datatable with the data found in the MappingTable you created in SQL Server.
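For illustration only (the original answer does not spell this part out), the mapping rows could be loaded with a SqlDataAdapter; the connection string and FileID values below are assumed to come from SSIS variables exposed to the Script Component:

// Requires: using System.Data; using System.Data.SqlClient;
// Variables.SqlConnectionString and Variables.FileID are assumed ReadOnlyVariables of the Script Component.
DataTable MappingTable = new DataTable();
using (SqlConnection conn = new SqlConnection(Variables.SqlConnectionString))
using (SqlDataAdapter adapter = new SqlDataAdapter(
    "SELECT FileID, InputColumn, OutputColumn FROM dbo.inputoutputMapping WHERE FileID = @FileID AND Active = 1", conn))
{
    adapter.SelectCommand.Parameters.AddWithValue("@FileID", Variables.FileID);
    adapter.Fill(MappingTable);   // Fill opens and closes the connection itself
}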
After that, loop over the datatable columns and change each .ColumnName to the relevant output column, for example:
foreach (DataColumn col in myTable.Columns)
{
    // Map the input column name to its output name (FileID = 1 as an example; requires using System.Linq; and System.Data.DataSetExtensions)
    col.ColumnName = MappingTable.AsEnumerable()
        .Where(x => x.Field<int>("FileID") == 1 && x.Field<string>("InputColumn") == col.ColumnName)
        .Select(y => y.Field<string>("OutputColumn"))
        .First();
}
After that, loop over each row in the datatable and create a script output row.
In addition, note that while assigning output rows you must check whether the column exists; you can first add all column names to a list of strings, then use it for the check, for example:
var columnNames = myTable.Columns.Cast<DataColumn>()
                         .Select(x => x.ColumnName)
                         .ToList();

foreach (DataRow row in myTable.Rows)
{
    // One output row per DataTable row
    OutputBuffer0.AddRow();

    if (columnNames.Contains("CustCode"))
    {
        OutputBuffer0.CustCode = row["CustCode"].ToString();
    }
    else
    {
        OutputBuffer0.CustCode_IsNull = true;
    }

    // continue checking all other columns in the same way
}
If you need more details about using a Script Component as a source, then check one of the following links:
SSIS Script Component as Source
Creating a Source with the Script Component
Script Component as Source – SSIS
SSIS – USING A SCRIPT COMPONENT AS A SOURCE
(3) Building the package dynamically
I don't think there are other methods you can use to achieve this goal. Otherwise, you have the option of building the whole package dynamically, in which case you should go with:
BIML
Integration Services managed object model
EzApi library
(4) SchemaMapper: C# schema mapping class library
Recently I started a new project on GitHub, which is a class library developed using C#. You can use it to import tabular data from Excel, Word, PowerPoint, text, CSV, HTML, JSON, and XML into a SQL Server table with a different schema definition, using a schema-mapping approach. Check it out at:
SchemaMapper: C# Schema mapping class library
You can follow this Wiki page for a step-by-step guide:
Import data from multiple files into one SQL table step by step guide

Can Karate generate multiple query parameters with the same name?

I need to pass multiple query parameters with the same name in a URL, but I am having problems getting it to work with Karate. In my case, the URL should look like this:
http://mytestapi.com/v1/orders?sort=order.orderNumber&sort=order.customer.name,DESC
Notice 2 query parameters named "sort". I attempted to create these query string parameters with Karate, but only the last "sort" parameter gets created in the query string. Here are the ways I tried to do this:
Given path 'v1/orders'
And param sort = 'order.orderNumber'
And param sort = 'order.customer.name,DESC'
And header Authorization = authInfo.token
And method get
Then status 200
And:
Given path 'v1/orders'
And params sort = { sort: 'order.orderNumber', sort: 'order.customer.name,DESC' }
And header Authorization = authInfo.token
And method get
Then status 200
And:
Given path 'v1/order?sort=order.orderNumber&sort=order.customer.name,DESC'
And header Authorization = authInfo.token
And method get
Then status 200
The first two ways provide the same query string result: ?sort=order.customer.name%2CDESC
The last example does not work because the ? gets encoded, which was expected and explained in this post: Karate API Tests - Escaping '?' in the url in a feature file
It's clear that the second "sort" param is overriding the first and only one parameter is being added to the URL. I have gone through the Karate documentation, which is very good, but I have not found a way to add multiple parameters with the same name.
So, is there a way in Karate to set multiple URL query parameters with the same name?
Yes, you can generate multiple query parameters with the same name in Karate.
All values for the same key should be provided in an array.
Given path 'v1/orders'
And params {"sort":["order.orderNumber","order.customer.name,DESC"]}
And header Authorization = authInfo.token
And method get
Then status 200
And for setting a single parameter using param, it would be:
And param sort = ["order.orderNumber","order.customer.name,DESC"]
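With either form, the request should go out with both parameters, e.g. ?sort=order.orderNumber&sort=order.customer.name%2CDESC (the comma is URL-encoded, just as in your original attempts).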

Adding a single query result into JMeter report

I have a JMeter plan that starts with a single JDBC sampler query that captures the session ID from the Teradata database (SELECT SESSION;). The same plan also has a large number of JDBC samplers with complicated queries producing large output that I don't want to include in the report.
If I configure a Summary Report and tick Save Response Data (XML), then the output from all sampler queries will be saved.
How do I add only the first query result (it's a single integer) to the test summary report and ignore the results from all other queries? For example, is there a way to set responseData = false after the first query output is captured?
Maybe the sample_variables property can help?
Define something in the 'Variable Names' section of the JDBC Request, i.e. put a session reference name there, for example: session
Add the next line to the user.properties file (it lives in JMeter's "bin" folder):
sample_variables=session_1
or alternatively pass it via -J command-line argument like:
jmeter -Jsample_variables=session_1 -n -t /path/to/testplan.jmx -l /path/to/results.csv
You need to use session_1, not session. As per the JDBC Request sampler documentation:
If the Variable Names list is provided, then for each row returned by a Select statement, the variables are set up with the value of the corresponding column (if a variable name is provided), and the count of rows is also set up. For example, if the Select statement returns 2 rows of 3 columns, and the variable list is A,,C, then the following variables will be set up:
A_#=2 (number of rows)
A_1=column 1, row 1
A_2=column 1, row 2
C_#=2 (number of rows)
C_1=column 3, row 1
C_2=column 3, row 2
So given your query returns only 1 row containing 1 integer - it will live in session_1 JMeter Variable. See Debugging JDBC Sampler Results in JMeter article for comprehensive information on working with database query results in JMeter.
When the test completes you'll see an extra column in the .jtl results file holding your "session" value.
Although this doesn't exactly solve your question as posted, I will suggest a workaround using the "scope" of a listener (i.e. a listener will only record items on the same or lower level than the listener itself). Specifically: have two Summary Reports, one at the level of the test, the other (together with the sampler whose response you want to record) under a controller. For example:
Here I have samplers 1, 2, 3, 4. I only want to save response data from sampler 2. So:
"Summary Report - Doesn't save responses" is at the global level and is configured not to save any response data. It saves only what I want to save for all samplers.
"Summary Report - Saves '2' only" is configured to save response data in XML format. Because this instance of the Summary Report is under the same controller as sampler 2, while the other samplers (1, 3, 4) are at a higher level, it will only record the responses of sampler 2.
So it doesn't exactly allow you to save response data from one sampler into the same file as all other Summary Report data. But at least you can filter which responses you are saving.
Maybe you can try an assertion on ${__threadNum},
i.e. set the assertion condition to "${__threadNum}=1" and set your listener's "Log/display only" option to "successes".
This way it should log only the first response from the samplers.