I am querying some data from Salesforce in a Mule flow after subscribing to one of the Push Topics. After the data is queried, I can see the payload using #[message.payload.next()], but when I try to retrieve the 'StageName' field with any of these expressions: payload[0].StageName, message.payload.StageName, payload['StageName'], it does not work. I can see in the log that the printed value is a Map, yet retrieving the field fails.
payload[0].StageName works fine in a Mule 3.3.2 environment but is not working in my Mule 3.7.3. I would appreciate it if any of you could help.
The data returned by the Salesforce query is of type ConsumerIterator. Just use a Set Payload with the value #[org.apache.commons.collections.IteratorUtils.toList(payload)] after the Salesforce query connector to convert the payload into an ArrayList.
I am using Google Cloud Logging to sink Dialogflow CX request data to BigQuery. The BigQuery tables are auto-generated when you create the sink via Google Logging.
We keep getting a sink error: "field: value is not a record."
This is because pageInfo/formInfo/parameterInfo/value is of type STRING in BigQuery, BUT there are values that are records, not strings. One example is #sys.date-time.
How do we fix this?
We have not tried anything at this point, since the BigQuery dataset is auto-created via a Logging filter. We cannot modify the logs, and even if we could modify the table schema, what would we change it to, since most of the time "value" is a string but other times it is a record?
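To illustrate the conflict, here are two simplified log shapes (the field path follows our logs; the concrete values below are made up):

```python
# Hypothetical, simplified entries to illustrate the schema conflict on
# pageInfo.formInfo.parameterInfo.value. Values are made up.

string_valued_entry = {
    "pageInfo": {
        "formInfo": {
            "parameterInfo": [
                # Plain string -> fits the auto-generated STRING column.
                {"displayName": "city", "value": "Berlin"}
            ]
        }
    }
}

record_valued_entry = {
    "pageInfo": {
        "formInfo": {
            "parameterInfo": [
                {
                    "displayName": "appointment",
                    # A #sys.date-time parameter resolves to a structured
                    # value (a record), which conflicts with a STRING column.
                    "value": {"year": 2023, "month": 5, "day": 12, "hours": 9},
                }
            ]
        }
    }
}
```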
I need some advice on solving this requirement for auditing purposes. I am using an Airflow (Cloud Composer) Dataflow Java operator job which writes a JSON file with status and error message details into the Airflow data folder after job completion. I want to extract the status and error message from that JSON file via some operator and then pass them to the next pipeline job, a BigQueryInsertJobOperator, which calls a stored procedure with the status and error message as input parameters, so that they finally get written into a BigQuery dataset table.
Thanks
You need to use XCom and Jinja templating. When you return metadata from the operator, the data is stored in XCom, and you can retrieve it using Jinja templating or Python code in a PythonOperator (or Python code in your custom operator).
These are two very good articles from Marc Lamberti (who also has really nice courses on Airflow) describing how templating and Jinja can be leveraged in Airflow: https://marclamberti.com/blog/templates-macros-apache-airflow/ and this one describes XCom: https://marclamberti.com/blog/airflow-xcom/
By combining the two you can get what you want.
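As a rough sketch of how this could look for your case (the file path, DAG id, task ids, project, dataset, and stored procedure name below are placeholders, not taken from your setup): a PythonOperator reads the JSON status file and returns the status and error message so they land in XCom, and the BigQueryInsertJobOperator pulls them back via Jinja templating inside its query configuration.

```python
import json
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

# Hypothetical path to the JSON file the Dataflow job wrote into the data folder.
STATUS_FILE = "/home/airflow/gcs/data/dataflow_job_status.json"


def read_job_status():
    """Read the Dataflow result file; the returned dict is pushed to XCom."""
    with open(STATUS_FILE) as f:
        result = json.load(f)
    return {
        "status": result.get("status", "UNKNOWN"),
        "error_message": result.get("error_message", ""),
    }


with DAG(
    dag_id="dataflow_audit",
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    read_status = PythonOperator(
        task_id="read_status",
        python_callable=read_job_status,
    )

    # 'configuration' is a templated field, so the XCom values can be pulled
    # with Jinja and handed to the (hypothetical) stored procedure as inputs.
    write_audit = BigQueryInsertJobOperator(
        task_id="write_audit",
        configuration={
            "query": {
                "query": (
                    "CALL `my_project.my_dataset.sp_write_audit`("
                    "'{{ ti.xcom_pull(task_ids='read_status')['status'] }}', "
                    "'{{ ti.xcom_pull(task_ids='read_status')['error_message'] }}')"
                ),
                "useLegacySql": False,
            }
        },
    )

    read_status >> write_audit
```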
In Pentaho Data Integration I am using metadata injection within a stream of a transformation. How can I get the result of the metadata injection back into my stream in order to continue transforming the data outside of the metadata injection? Copy Rows to Result does not seem to work here like it does with a transformation within a transformation.
Found it myself. In the Options tab you can select the step within the template to read the data from, and below that you can set the fields.
(screenshot: metadata injection options)
I am building a NiFi flow to get JSON elements from a Kafka topic and write them into a Hive table.
However, there is very little to no documentation about the processors and how to use them.
What I plan to do is the following:
kafka consume --> ReplaceText --> PutHiveQL
Consuming the Kafka topic works great. I receive a JSON string.
I would like to extract the JSON data (with ReplaceText) and put it into the Hive table (PutHiveQL).
However, I have absolutely no idea how to do this. The documentation is not helping and there is no precise example of processor usage (or I could not find one).
Is my theoretical solution valid?
How do I extract the JSON data, build an HQL query, and send it to my local Hive database?
Basically, you want to transform your record from Kafka into an HQL request and then send the request to the PutHiveQL processor.
I am not sure that the Kafka record -> HQL transformation can be done with ReplaceText (it seems a little bit hard/tricky). In general I use a custom Groovy script processor to do this; see the sketch below.
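For example, here is a rough sketch of that idea in an ExecuteScript processor, written for the python (Jython) engine rather than Groovy (the table name and JSON field names are just placeholders): it reads the incoming JSON record and rewrites the flowfile content as an HQL INSERT that PutHiveQL can then execute.

```python
# ExecuteScript (python/Jython engine) sketch: JSON record -> HQL INSERT.
# Table name and JSON field names are hypothetical.
import json
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback


class JsonToHql(StreamCallback):
    def process(self, inputStream, outputStream):
        # Read the Kafka JSON record from the flowfile content.
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        record = json.loads(text)
        # Build the HQL statement that PutHiveQL will execute.
        hql = "INSERT INTO my_table (uuid, ts, body) VALUES ('%s', '%s', '%s')" % (
            record.get("uuid", ""),
            record.get("timestamp", ""),
            record.get("body", "").replace("'", "\\'"),
        )
        # Replace the flowfile content with the HQL statement.
        outputStream.write(bytearray(hql.encode("utf-8")))


flowFile = session.get()
if flowFile is not None:
    flowFile = session.write(flowFile, JsonToHql())
    session.transfer(flowFile, REL_SUCCESS)
```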
Edit
Global overview:
EvaluateJsonPath
This extracts the timestamp and uuid properties of my JSON flowfile and puts them as attributes of the flowfile.
ReplaceText
This sets the flowfile content to an empty string and replaces it with the Replacement Value property, in which I build the query.
You can directly inject the streaming data using the PutHiveStreaming processor.
Create an ORC table with a structure matching the flow and pass the flow to the PutHive3Streaming processor; it works.
I am using the Google-provided template for Pub/Sub to BigQuery with no customizations. I am trying to put multiple entries (rows) into a single JSON payload on the queue and then have the Dataflow template insert all of the entries (rows) into the BigQuery table. I have tried providing a newline-delimited JSON payload, as is required when loading data into BigQuery via the console. However, I am only able to get the first entry to insert into the table.
Does the default Dataflow template only take a single entry (row)?
Currently the Google-provided template only accepts a single JSON record as payload within the Cloud Pub/Sub message and will not detect any newline delimited JSON. Look for this to change in the near future as additional supported formats are added to the template.
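Given that constraint, one way to adapt on the publishing side is to send one Pub/Sub message per row instead of one newline-delimited payload. A minimal sketch with the Python Pub/Sub client (project, topic, and the sample rows are placeholders):

```python
import json
from google.cloud import pubsub_v1

# Placeholders: substitute your project and the topic the Dataflow
# template is subscribed to.
PROJECT_ID = "my-project"
TOPIC_ID = "my-topic"

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(PROJECT_ID, TOPIC_ID)

rows = [
    {"id": 1, "name": "first"},
    {"id": 2, "name": "second"},
]

# Publish each row as its own message, since the template expects exactly
# one JSON record per Cloud Pub/Sub message.
futures = [
    publisher.publish(topic_path, data=json.dumps(row).encode("utf-8"))
    for row in rows
]
for future in futures:
    future.result()  # block until each message is accepted
```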