Error while ingesting data into dataverse using Synapse dataflow (Sustainability Manager) - azure-synapse

I'm trying to load data into dataverse using Azure Synapse pipelines (Dataflows).
Sink: Dataverse table - Mobile combustion (https://learn.microsoft.com/en-gb/common-data-model/schema/core/industrycommon/sustainability/mobilecombustion)
Below are the four combinations I tried for Organizational Unit, which is a lookup column. I tried the attribute name as msdyn_organizationalunitId and msdyn_organizationalunitid, and the values as the GUID and the name of the organization. None of these combinations worked for me.
Can someone help me with what parameter name and value I should pass for a lookup column when inserting a record into a Dataverse entity that has a lookup property? In this case, Mobile combustion is the entity and Organizational unit is the lookup property.
Below is the error message for the last combination:
msdyn_organizationalunitId: 'OrgUnit01'
Microsoft.OData.ODataException: A 'PrimitiveValue' node with non-null value was found when trying to read the value of the property 'msdyn_organizationalunitId'; however, a 'StartArray' node, a 'StartObject' node, or a 'PrimitiveValue' node with null value was expected.
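For reference, that ODataException means the service expected a relationship (an object, array, or null) where it received a plain string, which is how OData rejects a primitive value assigned to a lookup. In the Dataverse Web API a lookup is set with the @odata.bind annotation on the navigation property, pointing at the GUID of the related row. Below is a minimal, hypothetical sketch of that shape; the navigation property name msdyn_OrganizationalUnit, the entity set names and the msdyn_name column are assumptions about this schema, so verify them against your environment's $metadata before relying on them.

# Hypothetical sketch: creating a Mobile combustion row via the Dataverse Web API,
# binding the Organizational unit lookup with @odata.bind instead of a plain value.
import requests

env_url = "https://yourorg.crm.dynamics.com"               # assumed environment URL
org_unit_guid = "00000000-0000-0000-0000-000000000000"     # GUID of the OrgUnit01 row

payload = {
    "msdyn_name": "Sample mobile combustion record",       # assumed primary name column
    # A lookup is a relationship, not a primitive string value:
    "msdyn_OrganizationalUnit@odata.bind": f"/msdyn_organizationalunits({org_unit_guid})",
}

resp = requests.post(
    f"{env_url}/api/data/v9.2/msdyn_mobilecombustions",    # assumed entity set name
    json=payload,
    headers={"Authorization": "Bearer <token>"},
)
resp.raise_for_status()

This also suggests why passing the display name 'OrgUnit01' fails: the service expects the GUID of the related row expressed as a relationship, not as a plain column value.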

Related

Migration from Oracle to CDS using ADF

I am trying to migrate data from Oracle to an entity in Common Data Service (CDS) through an Azure Data Factory Copy activity. Since CDS uses a GUID as the primary key and my Oracle table doesn't have one, my pipeline always fails.
I tried to create an additional column in the source dataset with the value #guid(), but it throws an error saying the column must be of type GUID.
I also tried:
select REGEXP_REPLACE(SYS_GUID(), '(.{8})(.{4})(.{4})(.{4})(.{12})', '\1-\2-\3-\4-\5') MSSQL_GUID, c.* from table_name c;
but the GUID comes through as a string in the mapping.
How do we automatically generate a GUID in this scenario?
Could you please try updating your additional column (#guid()) data type from "type": "String" to "type": "Guid" by editing the JSON payload of your pipeline (look for the {} symbol at the top-right corner of the pipeline editor)?
Update:
After further analysis in collaboration with the product team, type conversion was identified as an unsupported feature for the Dynamics sink: the UX disables type conversion for the Dynamics sink and has not supported it since the type conversion feature was released.
The product team has opened a work item to add type conversion support for the Dynamics sink. The tentative ETA is mid-September, but the team is actively working on it. I will monitor the work item closely and update this post as soon as I have additional information.
As a workaround, please try splitting the pipeline into two copy activities: Oracle -> CSV and CSV -> Dynamics. In the first copy, add the additional GUID column so it is written to the CSV file. In the second copy, change the type of the GUID column in the CSV to Guid and perform the copy.
Please let us know how it goes.
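For what it's worth, a rough sketch of the staging step in that workaround, done outside the pipeline in Python: write the extracted rows to a CSV with a pre-generated GUID column so the second copy (CSV -> Dynamics) can map it as type Guid. The column names and sample rows are illustrative assumptions; str(uuid.uuid4()) already produces the 8-4-4-4-12 shape that the Oracle REGEXP_REPLACE above builds from SYS_GUID().

# Rough sketch of the staging step: write rows to a CSV with a GUID column that the
# second copy (CSV -> Dynamics) can then map as type Guid. Names are illustrative.
import csv
import uuid

rows = [
    {"name": "Contoso Ltd", "city": "Oslo"},
    {"name": "Fabrikam Inc", "city": "Bergen"},
]

with open("staging.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["mssql_guid", "name", "city"])
    writer.writeheader()
    for row in rows:
        # uuid4() already has the hyphenated 8-4-4-4-12 shape expected for a GUID.
        writer.writerow({"mssql_guid": str(uuid.uuid4()), **row})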

Unable to load distinct records via lookup in informatica

Something very strange is happening and I don't know why. I have created a mapping that transforms the data via an Expression and loads it into the target (file) based on a lookup on that same target.
Source table:
CompanyName
Acne Lmtd
Acne Ltd
N/A
None
Abc Ltd
Abc Ltd
X
Mapping:
Source -> Exp(trim...) -> Lookup(source.company_name = tgt.company_name), return port CompId -> Filter(ISNULL(CompId)) -> Target
Target columns: CompId (via Sequence Generator), CompName
The above mapping logic also inserts duplicate company names: the two Abc Ltd records in the source are repeated in the target as well. I don't know why. I tried debugging, and the filter condition evaluates to true (company_id is null) even when the record has already been inserted into the target.
I also thought it might be a lookup cache issue, so I enabled the dynamic cache as well, but I got the same result. It should have worked like this SQL query:
select company_id
from lkptarget
where company_name in (select company_name from source)
Therefore, for the second Abc Ltd the filter condition ISNULL(company_id) should have evaluated to false, but it evaluates to true. How do I get unique records via the lookup, without using distinct?
Note: the lookup used is already a dynamic lookup.
That was in fact a dynamic cache issue: NewLookupRow is assigned a value of 0 for duplicates, so I added the filter condition ISNULL(COMPANYID) AND NEWLOOKUPROW = 1, and that finally worked.
The Lookup transformation has no way to know what happens in later transformations in the mapping. It can't see results in the target itself, because the Lookup cache is loaded once at the beginning of the mapping using a separate connection to the database. Even if you disable caching (which would mean one query per Lookup input row), data written to the target is not committed immediately, so it is not visible to other connections.
That's the reason to use a dynamic Lookup cache, which works by adding new rows to the Lookup cache. However, in your case there is a catch: the company_id is created after the Lookup (which is the right place to do so), so it can't be added to the Lookup cache.
I think you could configure the Lookup so that:
You activate the options Dynamic Lookup Cache, Update Else Insert and Insert Else Update
You use company_name for the comparison between source data and Lookup data
You create a fake field company_id with value 0 before the Lookup and associate it with the corresponding Lookup field
You check the checkbox Disable in comparison for the company_id field
You can then use the predefined field NewLookupRow (it appears when you check the Dynamic Lookup Cache option), which should have a value of 1 for new rows, 2 for existing rows with updates, and 0 for identical rows.
The Lookup should now output NewLookupRow = 1 for the first Abc Ltd and NewLookupRow = 0 for the second. The Filter just after the Lookup should then have a condition like NewLookupRow = 1.
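As a rough Python analogy (not Informatica code or API), this is how a dynamic cache keyed on company_name ends up tagging the second Abc Ltd as a duplicate:

# Python analogy of a dynamic lookup cache keyed on company_name.
# NewLookupRow: 1 = inserted into the cache (new), 0 = already in the cache (duplicate).
def tag_rows(company_names):
    cache = set()          # stands in for the dynamic lookup cache
    tagged = []
    for name in company_names:
        if name in cache:
            tagged.append((name, 0))   # duplicate: removed by the NewLookupRow = 1 filter
        else:
            cache.add(name)
            tagged.append((name, 1))   # new row: passes the filter and reaches the target
    return tagged

source = ["Acne Lmtd", "Acne Ltd", "N/A", "None", "Abc Ltd", "Abc Ltd", "X"]
print(tag_rows(source))   # the second "Abc Ltd" comes back with 0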
For more details you can have a look at the Informatica documentation :
https://docs.informatica.com/data-integration/data-services/10-2/developer-transformation-guide/dynamic-lookup-cache.html

Avoid discretization error in ssas

I have a dimension with attribute AGE.
I have applied discretization on that attribute where the bucket count is 20.
Everything works fine when we have enough values in the AGE column in the underlying database.
But recently we updated the table, and now only one row has a value in the AGE column.
Now I am getting a processing error saying there are not enough values to create the buckets.
Can I bypass this error and still process the cube? I want the cube not to fail processing even if the underlying table does not have enough data to create the buckets.
Unfortunately, no. The only way is to manually set the DiscretizationMethod property back to None.
I also tried changing it directly in the XML, from Automatic to None, but that failed as expected: no changes were applied.

How to implement a key lookup for generated keys table in pentaho Kettle

I just started to use Pentaho Kettle for integration. Seems great so far, quite intuitive compared to Talend, which I was also investigating.
I am trying to migrate some customers without their keys. So I have their email addresses.
The customer may already exist in the database, so what I need to do is:
If the customer exists, add its id to the imported field and continue.
But if the customer doesn't exist I need to get the next Hibernate key from the table Hibernate_Sequences and set it as the id.
But I don't want to always allocate a key, so I want to conditionally execute a step to allocate the next key.
So what I want to do is, in the flow, execute the DB procedure that allocates the next key and returns it, only if there is no value in id from the "lookup id" step.
Is this possible?
Just posting my updated flow: the answer was to use a Filter Rows component, which splits the data on true/false. I really had trouble getting the id out of the database stored procedure because of a bug, so I had to use decimal and then convert back to integer (which I also couldn't figure out how to do, so I used a JavaScript component).
Yes it is. As per the official documentation (I kept only the relevant part), "Lookup values are added as new fields onto the stream". So you just need to put a "Filter rows" step in the Flow section and check the "id" field that is supposed to be added by the "Existing Id Lookup" step.
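For what it's worth, a rough Python sketch (not Kettle code) of the conditional key allocation described above: reuse the id found by the lookup when it exists, and only take the next sequence value when it does not. next_hibernate_key() is a hypothetical stand-in for the stored procedure that reads Hibernate_Sequences.

# Rough sketch of the conditional key allocation: the filter's false branch
# (no id found by the lookup) is the only place a new key is allocated.
def next_hibernate_key(state={"next": 1000}):
    # hypothetical stand-in for the stored procedure / Hibernate_Sequences read
    state["next"] += 1
    return state["next"]

def assign_ids(customers, existing_ids):
    out = []
    for cust in customers:
        cust_id = existing_ids.get(cust["email"])   # "lookup id" step
        if cust_id is None:                         # Filter Rows: false branch
            cust_id = next_hibernate_key()          # allocate a key only when needed
        out.append({**cust, "id": cust_id})
    return out

existing = {"ann@example.com": 42}
print(assign_ids([{"email": "ann@example.com"}, {"email": "bob@example.com"}], existing))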

How to map dynamic dynamoDB columns in EMR Hive

I have a table in Amazon DynamoDB with a record structure like
{"username" : "joe bloggs" , "products" : ["1","2"] , "expires1" : "01/01/2013" , "expires2" : "01/02/2013"}
where the products property is a list of products belonging to the user and the expiresN properties relate to the products in the list; the list of products is dynamic and there are many of them. I need to transfer this data to S3 in a format like
joe bloggs|1|01/01/2013
joe bloggs|2|01/02/2013
Using Hive external tables I can map the username and products columns in DynamoDB, but I am unable to map the dynamic columns. Is there a way I could extend or adapt org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler to interpret and structure the data retrieved from DynamoDB before Hive ingests it? Or is there an alternative solution to convert the DynamoDB data to first normal form?
One of my key requirements is that I maintain the throttling provided by the dynamodb.throughput.read.percent setting, so that I do not compromise operational use of the table.
You could build a specific UDTF (user-defined table-generating function) for that case.
I'm not sure how Hive handles an asterisk (which you would probably need here) as an argument for the function.
Something like what explode (source) does.
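To illustrate the first-normal-form output the question asks for, here is a rough Python sketch, outside Hive, that flattens one item by pairing products[i] with the matching expiresN attribute. It assumes the items have already been read from the table, so the dynamodb.throughput.read.percent requirement is not addressed by this sketch.

# Rough sketch of flattening one DynamoDB item into first normal form:
# one "username|product|expires" line per product, pairing products[i] with expires{i+1}.
def flatten(item):
    lines = []
    for i, product in enumerate(item["products"], start=1):
        expires = item.get(f"expires{i}", "")   # dynamic attribute name, e.g. expires1
        lines.append(f'{item["username"]}|{product}|{expires}')
    return lines

item = {"username": "joe bloggs", "products": ["1", "2"],
        "expires1": "01/01/2013", "expires2": "01/02/2013"}
print("\n".join(flatten(item)))
# joe bloggs|1|01/01/2013
# joe bloggs|2|01/02/2013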