Azure Data Factory Lookup Activity

I've got a Data Factory Lookup Activity.
The left (Primary) and right (Lookup Stream) sides need to match on 4 columns to look up a result.
I've got:
toString(byName('Col1')) == toString(byName('Col1'))
toString(byName('Col2')) == toString(byName('Col2'))
...
I've also tried referencing the datasets:
toString(byName('Col1','LeftSideDataset')) == toString(byName('Col1','RightSideDataset'))
When I preview the data in debug, I expect those columns to match and the additional columns from the right to be added, but they don't; every row seems to have matched the first row of the left input (without meeting all of the matching criteria).
Has anyone experienced this?
Thanks,
Dan

Hi Dan, what settings did you use?
I couldn't quite follow: does anything match from the right side, or nothing at all?
If only some rows match, you might try the Match multiple rows option or the Broadcast setting.
Since debug only previews a sample of the data, it may be that only the first lookup hit is shown.

The data flow works as expected.
The data wasn't showing correctly because a Linked Service had been altered, so the underlying data wasn't what was expected.

PowerApps filter returning incomplete data record...?

I have an Azure SQL database, and the records inside table Spiderfood_RITMData in that database include 13 different fields. Lots of stuff. I have confirmed in SSMS that the records have data in each field.
There are way more items in the database than PowerApps can see using LOOKUP (1600-9000 records or more). However, I know FOR A FACT that there is only ONE record that has any given value in the NUMBER column. It's not a primary key, but it is unique in the table.
In PowerApps, I am trying to pull that field so that I can eventually parse out the individual items.
So, the commands I'm trying are:
ClearCollect(MLE_test1, Filter('Spiderfood_RITMData', "RITM2170467" in Number));
ClearCollect(MLE_test2, Search('Spiderfood_RITMData',"RITM2170467", "Number"));
However, the Collection results for MLE_test1 and MLE_test2 both are empty EXCEPT for the value of NUMBER. Say what?!
I'm trying to use the examples posted on https://learn.microsoft.com/en-us/powerapps/maker/canvas-apps/functions/function-filter-lookup but I am honestly getting baffled by this.
How should I be formatting this call such that I can pull the whole record?
Big picture explanation: I need to do a lot of data LOOKUPS into my Spiderfood_RITMData table, but it has way more than 2000 rows, and PowerApps will not perform the Lookup correctly. So my presumably smart idea is to create a MUCH SMALLER "version" of Spiderfood_RITMData as a local collection, using a more delegable function (such as FILTER or IN). If I filter to all records containing the values of NUMBER, then I go from, say, a 10,000-record SQL table to a 10-record collection. And I can do LOOKUPS against that collection for the rest of the function (uh, I think -- I'm still trying to experiment accordingly). Please let me know if this is crazy or not.
LookUp is used to get just one record; try this instead:
ClearCollect(MLE_test1, Filter('Spiderfood_RITMData', "RITM2170467" = Number));
This gets a collection with all the items where Number equals "RITM2170467".
Collections are limited to only 2000 records each.
I had the same issue. Go to App settings; under Upcoming Features, make sure Explicit column selection is turned off. Hope this does it for you.

Compare Two Rows and Update Start and End Dates

I need some help, and I know I'm not the only one to deal with this issue, but I'm wondering if you might have some ideas on how to handle comparing two rows of data and filling out start and end dates.
To give you some context, we have a huge hierarchy (approx 8,000 rows and about 12 columns wide) that is updated each year. Sometimes the values change and sometimes they don’t. When the values don’t change, then I don’t need to adjust the dates. When the values do change and a new row is added, I need to change the data.
I have attached some fake data to try and illustrate my data. I am building this in MS Access, so I think this is more of a DBA-type question that will likely be handled via a recordset-style method.
In my example I have two tables – Old Table and New Table. In each table there is a routing code field that represents my join field and primary key for this table.
The Old table represents existing data - tblMain. The New Table represents the data to be appended - tblTemp.
To append the data, I have an append query set up in Access. I perform a left join between the Old and New tables, joining on every field and append the rows that are null in the Old table. That’s fine and that is not where my issue is.
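For reference, the append query is roughly along these lines (only a sketch; I've simplified the field names to RoutingCode and Level1-Level5 for illustration, and Access's SQL view doesn't accept the -- comment lines, so strip those before pasting):

-- Append only the rows from the new data (tblTemp) that have no exact match
-- in the existing data (tblMain), joining on every field.
INSERT INTO tblMain (RoutingCode, Level1, Level2, Level3, Level4, Level5)
SELECT t.RoutingCode, t.Level1, t.Level2, t.Level3, t.Level4, t.Level5
FROM tblTemp AS t
LEFT JOIN tblMain AS m
  ON (t.RoutingCode = m.RoutingCode
  AND t.Level1 = m.Level1
  AND t.Level2 = m.Level2
  AND t.Level3 = m.Level3
  AND t.Level4 = m.Level4
  AND t.Level5 = m.Level5)
WHERE m.RoutingCode IS NULL;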
What is causing me issue is how to fill out the start and end dates.
So as you can see from my tables, we are running a zoo. Let’s just say for the sake of the argument, our zoo started off pretty simple and has become more sophisticated. We now want our hierarchy to expand out and become a bit more detailed as we are now capturing the type of animal (Level 4) and the native location (Level 5).
As you can see when comparing one table to another, the routing codes are the same, so the append query has to join on each field. When you do this, you return the Result Table, which is essentially the Old and New tables stacked on top of each other. You might think about a union query, but this is going to give me duplicates and I don't want that.
If you notice, in the Result Table there is a Start and End Date. Let's just say I get the start and end dates via a message box that pops up upon import of the data and is held in a variable. I think there are dates in my real data, but I'm still trying to verify this.
So how do I compare (pseudo code for the logic needed)?
• For each routing code:
    • Compare Levels 1-5.
    • If the routing code is the same but Levels 1-5 are not the same:
        • Fill out the end date of the old record.
        • Fill out the start date of the new record.
This idea of comparing two records and filling out a date is quite prevalent in my organization, but I haven't found a way of creating logic that consistently works, so any help or suggestions would be appreciated.
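Here's roughly what I'm picturing in Access SQL for the date logic (again only a sketch, using the same simplified field names plus assumed StartDate/EndDate columns and a [pEndDate] parameter supplied at run time; the -- comment lines would need removing in Access):

-- Close out the old record: same routing code in the new data,
-- but one or more of Levels 1-5 has changed.
UPDATE tblMain INNER JOIN tblTemp
  ON tblMain.RoutingCode = tblTemp.RoutingCode
SET tblMain.EndDate = [pEndDate]
WHERE tblMain.EndDate IS NULL
  AND (tblMain.Level1 <> tblTemp.Level1
    OR tblMain.Level2 <> tblTemp.Level2
    OR tblMain.Level3 <> tblTemp.Level3
    OR tblMain.Level4 <> tblTemp.Level4
    OR tblMain.Level5 <> tblTemp.Level5);

The new rows would then get their start date as they are appended, e.g. by adding [pStartDate] AS StartDate to the append query's SELECT list.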
(Attached example data: Old Table, New Table, Result Table.)

SQL identical values not matching

I'm experiencing something strange when joining SQL tables.
Let's say I want to output an image for each result where
tableA.FullNameA = tableB.FullNameB
I am entering the data in table B with a php form. To my human eyes, the names I have are identical in both columns. However, it does not detect a match. If I then manually go into the db and copy and paste the name from tableA.FullNameA to tableB.FullNameB (and when it is overwritten, there is no discernible change to the text) then it miraculously matches and my join is successful.
Is this something to do with encoding of the data when it is added to the DB? I have never come across this before - does anyone recognise this issue?
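To narrow it down, a check along these lines (just a sketch, assuming MySQL) should show whether the stored bytes really are identical; 'Some Name' stands in for one of the affected values:

-- Run the same check against each table for the same name; if the character
-- count, byte count, or hex dump differ, the values aren't truly identical
-- (trailing spaces, non-breaking spaces, or different encodings are common culprits).
SELECT FullNameA, CHAR_LENGTH(FullNameA) AS chars, LENGTH(FullNameA) AS bytes, HEX(FullNameA) AS raw
FROM tableA
WHERE FullNameA LIKE 'Some Name%';

SELECT FullNameB, CHAR_LENGTH(FullNameB) AS chars, LENGTH(FullNameB) AS bytes, HEX(FullNameB) AS raw
FROM tableB
WHERE FullNameB LIKE 'Some Name%';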
Thanks in advance for your help.

Retrieve results from a batch of SQL queries in Pentaho or Postgres?

I'm still relatively new to SQL and Pentaho.
I've pulled a table with two different IDs and need to run a query for each specific instance.
For example,
SELECT *
FROM Table
WHERE RecordA = 'value in column A'
AND RecordB = 'value in column B'
I need the results back, either appended to new columns in the original table or part of their own text file output.
I was initially looking at using a formula for this inside of Pentaho, but couldn't quite figure it out. Since I have the query written, I threw it into Excel and got the concatenated results (so a string of 350 or so queries that I need to run). I'm just not sure how to accomplish this; I tried the Execute SQL Script step inside of Pentaho, but it doesn't seem to produce output.
Any direction would be useful. I've searched a little but have come up short so far, possibly because I am still pretty new to this platform.
You can accomplish this in a lot of ways, with a "Database Lookup" step for example, but I usually do it in a fairly simple way. Here is an example for your tests; I hope it helps.
The idea here is to have two Table Input steps; the first one will fetch the IDs we want to look at. For example, you may use a SQL query similar to the one in the screenshot; the result will be a one-column stream of rows.
Next we have a Table Input that reads the rows received and executes its query for each row. I'll add a screenshot with the options that I selected.
What it does is replace the '?' placeholder with the data that is received. If you need two columns, use two '?' placeholders, but remember that the first one will be replaced with the first incoming column and the second one with the second column.
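As a sketch of the two queries (assuming the columns from the question, RecordA and RecordB, and a stand-in name SourceTable for the table holding the ID pairs):

-- First Table Input: fetch the ID pairs to look up.
SELECT RecordA, RecordB
FROM SourceTable;

-- Second Table Input: set "Insert data from step" to the first Table Input and
-- tick "Execute for each row?". Each '?' is replaced, in order, by the incoming
-- fields (RecordA first, then RecordB).
SELECT *
FROM Table
WHERE RecordA = ?
  AND RecordB = ?;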
And you are good to go. Test it a couple of times and good luck.
And the config for the second table input.

Adding two extra columns to input data - Pentaho Kettle

I am working on a transformation step for Pentaho Kettle. It selects several input columns and, based on those, adds two new columns during the transformation. I am unable to understand (based on code from other plugins) how I can add the two new columns so that 1) steps downstream are aware of these columns and 2) I can push the transformed data into these columns.
Thanks in advance.
You might need to override meta.getStepFields() to add new ValueMetaInterface objects to the RowMetaInterface passed in. This is the standard way to add columns at runtime; however, the row's metadata (i.e. list of ValueMetaInterface objects) must be the same from row to row or else the next step in your transformation will complain.
Often when doing data-driven custom plugins, you consume as many rows as you need (using getRow()) in order to figure out what the outgoing row format/metadata will be, then you can construct a RowMetaInterface (usually using meta.getStepFields()) that will be passed into the putRow() call. If you intend to pass through the incoming fields, do something like:
// Pass-through: copy the incoming row metadata (the cast is needed because clone() returns Object).
RowMetaInterface outputRowMeta = (RowMetaInterface) getInputRowMeta().clone();
If you're creating new rows use this:
// Building rows from scratch: start with empty row metadata and add fields to it.
RowMetaInterface outputRowMeta = new RowMeta();
Either way when you call meta.getStepFields(outputRowMeta, ...) it should populate outputRowMeta with the appropriate fields, by adding/changing/removing ValueMetaInterface objects from outputRowMeta.
I've got a blog post using Groovy to add/replace fields in the incoming rows here:
http://funpdi.blogspot.com/2014/10/flatten-json-to-key-value-pairs-in-pdi.html
Not sure if that is similar to your use case or not. If you have more questions, feel free to find me on IRC at ##pentaho (my nick is usually mburgess_pdi)
If I have understood your question correctly, I think you are trying to create an output file with dynamic columns. You can do this by checking the "fast dumping" option in the Text File Output step. While doing so, do not define any column names in the "Fields" tab.
Check my image below:
Hope it helps :)