Input:
{
"count": 3,
"employees":
[
{
"name":"appy",
"age":34
},
{
"name":"happy",
"age":38
},
{
"name":"cruise",
"age":36
}
]
}
Output:
[
{
"first":"appy",
"age":34
},
{
"first":"happy",
"age":38
},
{
"first":"cruise",
"age":36
}
]
This is my input. I am trying to replace the key "name" with "first" in each employee object. How can I do this? I am using the map function here.
Just map the elements:
%dw 2.0
output application/json
---
payload.employees map {
first: $.name,
age: $.age
}
Output:
[
{
"first": "appy",
"age": 34
},
{
"first": "happy",
"age": 38
},
{
"first": "cruise",
"age": 36
}
]
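If you want to check the expected result outside Mule, the same key rename can be sketched in a few lines of Python. This is only an illustration of the logic, not DataWeave; the function name `rename_name_to_first` is made up for this example.

```python
import json

def rename_name_to_first(payload):
    """Return the employees list with the "name" key renamed to "first"."""
    return [
        {"first": employee["name"], "age": employee["age"]}
        for employee in payload["employees"]
    ]

# Same input as in the question.
payload = {
    "count": 3,
    "employees": [
        {"name": "appy", "age": 34},
        {"name": "happy", "age": 38},
        {"name": "cruise", "age": 36},
    ],
}

result = rename_name_to_first(payload)
print(json.dumps(result, indent=2))
```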
In my mongodb collection documents are stored in the following format:
{ "_id" : ObjectId("62XXXXXX"), "res" : 12, ... }
{ "_id" : ObjectId("63XXXXXX"), "res" : 23, ... }
{ "_id" : ObjectId("64XXXXXX"), "res" : 78, ... }
...
I need to extract the ids of the documents whose "res" value is an outlier (i.e. value < Q1 - 1.5 * IQR or value > Q3 + 1.5 * IQR, where Q1 and Q3 are the first and third quartiles). I have done this with pandas by retrieving all documents from the collection, which may become slow if the number of documents in the collection grows too large.
Is there a way to do this with a MongoDB aggregation pipeline (or at least to calculate the percentiles)?
If I understand how you want to retrieve outliers, here's one way you might be able to do it.
db.collection.aggregate([
{ // partition res into quartiles
"$bucketAuto": {
"groupBy": "$res",
"buckets": 4
}
},
{ // get the max of each quartile
"$group": {
"_id": "$_id.max"
}
},
{ // sort the quartile maxs
"$sort": {
"_id": 1
}
},
{ // put sorted quartile maxs into array
"$group": {
"_id": null,
"maxs": {"$push": "$_id"}
}
},
{ // assign Q1 and Q3
"$project": {
"_id": 0,
"q1": {"$arrayElemAt": ["$maxs", 0]},
"q3": {"$arrayElemAt": ["$maxs", 2]}
}
},
{ // set IQR
"$set": {
"iqr": {
"$subtract": ["$q3", "$q1"]
}
}
},
{ // assign upper/lower outlier thresholds
"$project": {
"outlierThresholdLower": {
"$subtract": [
"$q1",
{"$multiply": ["$iqr", 1.5]}
]
},
"outlierThresholdUpper": {
"$add": [
"$q3",
{"$multiply": ["$iqr", 1.5]}
]
}
}
},
{ // get outlier _id's
"$lookup": {
"from": "collection",
"as": "outliers",
"let": {
"oTL": "$outlierThresholdLower",
"oTU": "$outlierThresholdUpper"
},
"pipeline": [
{
"$match": {
"$expr": {
"$or": [
{"$lt": ["$res", "$$oTL"]},
{"$gt": ["$res", "$$oTU"]}
]
}
}
},
{
"$project": {
"_id": 1
}
}
]
}
}
])
Try it on mongoplayground.net.
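To sanity-check the thresholds on a small sample, the same Q1/Q3/IQR rule can be sketched in plain Python. This assumes the documents are already in memory as dicts; `find_outlier_ids` and the sample data are hypothetical. Note that `statistics.quantiles` interpolates between values, so its quartiles may differ slightly from the bucket maxima that `$bucketAuto` yields.

```python
from statistics import quantiles

def find_outlier_ids(docs, field="res"):
    """Return the _id of every doc whose field lies outside the 1.5 * IQR fences."""
    values = [doc[field] for doc in docs]
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    return [doc["_id"] for doc in docs if doc[field] < lower or doc[field] > upper]

# Hypothetical sample: one value is far away from the rest.
docs = [
    {"_id": "a", "res": 10},
    {"_id": "b", "res": 12},
    {"_id": "c", "res": 11},
    {"_id": "d", "res": 13},
    {"_id": "e", "res": 12},
    {"_id": "f", "res": 11},
    {"_id": "g", "res": 100},
]

outliers = find_outlier_ids(docs)
print(outliers)
```

This still pulls every value into memory, so it is useful for verifying the pipeline on test data rather than as a replacement for it.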
One more option, based on @rickhg12hs's answer, is to use $setWindowFields:
db.collection.aggregate([
{$setWindowFields: {
sortBy: {res: 1},
output: {
totalCount: {$count: {}},
index: {$sum: 1, window: {documents: ["unbounded", "current"]}}
}
}
},
{$match: {
$expr: {$lte: [
{$abs: {$subtract: [
{$mod: [
{$multiply: [
{$add: ["$index", {$round: {$divide: ["$totalCount", 4]}}]}, 2]},
"$totalCount"
]}, 0]}
}, 1]}
}},
{$group: {_id: null, res: {$push: "$res"}}},
{$project: {_id: 0, q1: {$first: "$res"}, q3: {$last: "$res"},
iqr: {"$subtract": [{$last: "$res"}, {$first: "$res"}]}
}},
{$project: {
outlierThresholdLower: {$subtract: ["$q1", {$multiply: ["$iqr", 1.5]}]},
outlierThresholdUpper: {$add: ["$q3", {$multiply: ["$iqr", 1.5]}]}
}
},
{$lookup: {
from: "collection",
as: "outliers",
let: {oTL: "$outlierThresholdLower", oTU: "$outlierThresholdUpper"},
pipeline: [
{$match: {$expr: {$or: [{$lt: ["$res", "$$oTL"]}, {$gt: ["$res", "$$oTU"]}]}}},
{$project: {_id: 1}}
]
}
}
])
See how it works on the playground example.
How do I write a DataWeave transformation in Anypoint Studio for the given JSON array input and output?
Input:
{
"result": [{
"Labels": [{
"value": [{
"fieldName": "firstName",
"value": "John"
},
{
"fieldName": "lastName",
"value": "Doe"
},
{
"fieldName": "fullName",
"value": "John Doe"
}
]
}]
}]
}
Output:
{
"result": [{
"Labels": [{
"value": [{
"firstName": "John",
"lastName": "Doe",
"fullName": "John Doe"
}]
}]
}]
}
The reduce function might be the one to use here: https://docs.mulesoft.com/dataweave/2.4/dw-core-functions-reduce
Thank you in advance.
You can just use map to convert all the arrays to the required format. For the value part, map each entry to a single fieldName: value pair, which gives you an array of objects, then deconstruct that array into one object by wrapping it in parentheses inside the object braces:
%dw 2.0
output application/json
---
{
result: payload.result map ((item) -> {
Labels: item.Labels map ((label) -> {
value: [
{
(label.value map ((field) ->
(field.fieldName): field.value
)) // wrap the array, i.e. label.value map ..., in parentheses so that it yields individual key-value pairs
}
]
})
})
}
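For comparison, here is a rough Python sketch of the same reshaping: each list of fieldName/value entries is collapsed into one dict, which is what wrapping the mapped array in parentheses achieves in DataWeave. The function name `collapse_fields` is hypothetical and the sketch assumes every entry has both keys.

```python
import json

def collapse_fields(payload):
    """Collapse each "value" list of fieldName/value entries into a single object."""
    return {
        "result": [
            {
                "Labels": [
                    # One dict per label, keyed by fieldName, wrapped in a list
                    # to match the desired output shape.
                    {"value": [{f["fieldName"]: f["value"] for f in label["value"]}]}
                    for label in item["Labels"]
                ]
            }
            for item in payload["result"]
        ]
    }

payload = {
    "result": [{
        "Labels": [{
            "value": [
                {"fieldName": "firstName", "value": "John"},
                {"fieldName": "lastName", "value": "Doe"},
                {"fieldName": "fullName", "value": "John Doe"},
            ]
        }]
    }]
}

collapsed = collapse_fields(payload)
print(json.dumps(collapsed, indent=2))
```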
You can try the approach below if you know that the key names will not change:
%dw 2.0
output application/json
---
payload update {
case res at .result -> res map (res, resIndex) -> (res update {
case lbl at .Labels -> lbl map (lbl, lblIndex) -> (lbl update {
case val at .value -> [
(val reduce ((item, acc = {}) -> acc ++ {
(item.fieldName): (item.value)
}))
]
}
)
}
)
}
Here are two caveats and a solution. Neither your input nor your output file is valid JSON.
Input file: in your "result" object, "Labels" needs curly braces {} since they are objects. Key-value pairs should look like {key:value}, not key:value.
Output file: inside your "value" arrays, key-value pairs need curly braces, i.e. {key:value}.
So here's a valid JSON version of your input:
{
"result": [
{"Labels": [
{
"value": [
{"fieldName": "firstName","value": "John"},
{"fieldName": "lastName","value": "Doe"},
{"fieldName": "fullName","value": "John Doe"}
]
}
]},
{"Labels": [
{
"value": [
{"fieldName": "firstName","value": "John"}
]
}
]}
]}
Here's a solution
%dw 2.0
import keySet from dw::core::Objects
// this is "result"
var layer1key = keySet(payload)[0]
// this is "Labels" and grabs the first Labels, so assumes Labels doesn't change
var layer2 = payload[layer1key]
var layer2key = keySet(layer2[0])[0]
// this is "value"
var layer3 = layer2[layer2key]
var layer3key = keySet(layer3[0][0])[0]
// this is "fieldName" and "value"
var layer4 = layer3 map (x) -> x['value']
var data1 = ((layer1key) : layer4 map (x) -> {
(layer2key): x map (y) -> {
(layer3key): y map (z) -> {
(z['fieldName']):z['value']
}
}
})
output application/json
---
data1
And a valid JSON version of your output
{
"result": [
{
"Labels": [
{
"value": [
{
"firstName": "John"
},
{
"lastName": "Doe"
},
{
"fullName": "John Doe"
}
]
}
]
},
{
"Labels": [
{
"value": [
{
"firstName": "John"
}
]
}
]
}
]
}
I have an incoming CSV file that looks like this (notice that the first field is common - this is the order number)
36319602,100,12458,HARVEY NORMAN,
36319602,101,12459,HARVEY NORMAN,
36319602,102,12457,HARVEY NORMAN,
36319601,110,12458,HARVEY NORMAN,
36319601,111,12459,HARVEY NORMAN,
36319601,112,12457,HARVEY NORMAN,
36319603,110,12458,HARVEY NORMAN,
36319603,121,12459,HARVEY NORMAN,
36319603,132,12457,HARVEY NORMAN,
This is my current Dataweave code
list_of_orders: {
order: payload map ((payload01 , indexOfPayload01) -> {
order_dtl:
[{
seq_nbr: payload01[1],
route_nbr: payload01[2]
}],
order_hdr: {
ord_nbr: payload01[0],
company: payload01[3],
city: payload01[4],
}
})
}
An example of the desired output would be something like this ... (this is just mocked up). Notice how I would like a single header grouped by the first column which is the order number - but with multiple detail lines
"list_of_orders": {
"order": [
{
"order_dtl": [
{
seq_nbr: 100,
route_nbr: 12458
},
{
seq_nbr: 101,
route_nbr: 12459
},
{
seq_nbr: 102,
route_nbr: 12457
}
],
"order_hdr":
{
ord_nbr: 36319602,
company: HARVEY NORMAN
}
}
]
}
It works fine except that it is repeating the order_hdr key.
What they would like is a single header key with multiple details beneath.
The grouping is to be based on "ord_nbr: payload01[0]"
Any help appreciated
Thanks
I think you're using DataWeave 1. In DW 1.0, this groupBy produces the desired output (note that you can replace the positional field references [0], [1], etc. with field names if you have them set up as metadata):
%dw 1.0
%output application/json
---
list_of_orders: {
order: (payload groupBy ($[0])) map {
order_dtl: $ map {
seq_nbr: $[1],
route_nbr: $[2]
},
order_hdr:
{
ord_nbr: $[0][0],
company: $[0][3]
}
}}
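The grouping logic itself is independent of DataWeave. As a rough cross-check, the same idea in Python might look like the sketch below: collect rows per order number, then emit one header and a list of detail lines per group. The function name `group_orders` and the shortened sample are assumptions for illustration only.

```python
import csv
import io
import json

def group_orders(csv_text):
    """Group CSV rows by the first column and build one header per order."""
    groups = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        groups.setdefault(row[0], []).append(row)

    orders = []
    for ord_nbr, rows in groups.items():
        orders.append({
            "order_dtl": [{"seq_nbr": r[1], "route_nbr": r[2]} for r in rows],
            # The header is taken from the first row of each group.
            "order_hdr": {"ord_nbr": ord_nbr, "company": rows[0][3]},
        })
    return {"list_of_orders": {"order": orders}}

sample = """36319602,100,12458,HARVEY NORMAN,
36319602,101,12459,HARVEY NORMAN,
36319601,110,12458,HARVEY NORMAN,
"""

grouped = group_orders(sample)
print(json.dumps(grouped, indent=2))
```

As in the DataWeave version, the values stay strings because they come straight from the CSV.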
UPDATE
Here is the output for the new input sample with multiple orders:
{
"list_of_orders": {
"order": [
{
"order_dtl": [
{
"seq_nbr": "110",
"route_nbr": "12458"
},
{
"seq_nbr": "121",
"route_nbr": "12459"
},
{
"seq_nbr": "132",
"route_nbr": "12457"
}
],
"order_hdr": {
"ord_nbr": "36319603",
"company": "HARVEY NORMAN"
}
},
{
"order_dtl": [
{
"seq_nbr": "100",
"route_nbr": "12458"
},
{
"seq_nbr": "101",
"route_nbr": "12459"
},
{
"seq_nbr": "102",
"route_nbr": "12457"
}
],
"order_hdr": {
"ord_nbr": "36319602",
"company": "HARVEY NORMAN"
}
},
{
"order_dtl": [
{
"seq_nbr": "110",
"route_nbr": "12458"
},
{
"seq_nbr": "111",
"route_nbr": "12459"
},
{
"seq_nbr": "112",
"route_nbr": "12457"
}
],
"order_hdr": {
"ord_nbr": "36319601",
"company": "HARVEY NORMAN"
}
}
]
}
}
I am new to MuleSoft. I am trying to transform a JSON payload using the Transform Message component.
I want to transform my payload as shown below.
Input:
{
"ResponseStatus": {
"Status": "SUCCESS",
"StatusText": "SUCCESS"
},
"Processes": {
"Process": [
{
"ProcessId": "1234567",
"ProcessProperties": {
"Property": [
{
"Name": "XXXXXXXXXXX",
"Value": "11111111",
"Desc": "YYYYYYYY"
},
{
"Name": "AAAAAAAAA",
"Value": "2222222",
"Desc": "BBBBBBBB"
},
{
"Name": "QQQQQQQQQ",
"Value": "#######",
"Desc": "CCCCCCCC"
},
{
"Name": "NNNNNNN",
"Value": "IIIIIIII",
"Desc": "UYUYUYUY"
}
]
},
"EditMode": "CCCCCC",
"ProcessType": "ABCD",
"AppName": "VFVFVGBG",
"StatusHistory": {
"STS": [
{
"Sts": "COMPLETED"
}
]
}
}
]
}
}
Output:
[
{
"ProcessId": "1234567",
"AAAAAAAAA": "2222222",
"QQQQQQQQQ": "#######"
}
]
I have read the DWL reference at the MuleSoft link below. I also referred to this SO link.
Below is what I have tried so far:
%dw 1.0
%output application/json
---
{
"ProcessId": (payload.Processes.Process.ProcessId)[0],
AAAAAAAAA: {
(payload.Processes.Process.ProcessProperties.Property mapObject {
($.Name):$.Value when $.Name =="AAAAAAAAA" otherwise ""
})
},
QQQQQQQQQ: {
(payload.Processes.Process.ProcessProperties.Property mapObject {
($.Name):$.Value when $.Name =="QQQQQQQQQ" otherwise ""
})
}
}
I am still not able to get the desired output.
It gives me "Cannot coerce a :array to a :key"
Can anyone please help me?
The "Property" element in your input JSON is an array, so it cannot be coerced to a single key or value.
Please try the snippet below and let me know if it gives your desired output.
payload.Processes.Process map (
(val , index) ->
{"ProcessId": val.ProcessId
,
(val.ProcessProperties.Property map {
(($.Name) : $.Value) when $.Name =='AAAAAAAAA' }
),
(val.ProcessProperties.Property map {
(($.Name) : $.Value) when $.Name =='QQQQQQQQQ' }
)
}
)
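To make the intended result concrete, here is a small Python sketch of the same selection: for each process, keep the ProcessId and copy only the properties whose Name is in a wanted set. The function name `pick_properties` is hypothetical, and the input is trimmed to the fields that matter.

```python
import json

def pick_properties(payload, wanted_names):
    """Build one flat object per process with ProcessId plus the wanted properties."""
    rows = []
    for process in payload["Processes"]["Process"]:
        row = {"ProcessId": process["ProcessId"]}
        for prop in process["ProcessProperties"]["Property"]:
            if prop["Name"] in wanted_names:
                row[prop["Name"]] = prop["Value"]
        rows.append(row)
    return rows

payload = {
    "Processes": {
        "Process": [{
            "ProcessId": "1234567",
            "ProcessProperties": {
                "Property": [
                    {"Name": "XXXXXXXXXXX", "Value": "11111111", "Desc": "YYYYYYYY"},
                    {"Name": "AAAAAAAAA", "Value": "2222222", "Desc": "BBBBBBBB"},
                    {"Name": "QQQQQQQQQ", "Value": "#######", "Desc": "CCCCCCCC"},
                    {"Name": "NNNNNNN", "Value": "IIIIIIII", "Desc": "UYUYUYUY"},
                ]
            },
        }]
    }
}

picked = pick_properties(payload, {"AAAAAAAAA", "QQQQQQQQQ"})
print(json.dumps(picked, indent=2))
```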
I'm using Elasticsearch and need to implement faceted search for a hierarchical object, as follows:
category 1 (10)
subcategory 1 (4)
subcategory 2 (6)
category 2 (X)
...
So I need to get facets for two related objects. The documentation says that this kind of facet is possible for numeric values, but I need it for strings: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-facets-terms-stats-facet.html
Here is another interesting topic, but unfortunately it's old: http://elasticsearch-users.115913.n3.nabble.com/Pivot-facets-td2981519.html
Is this possible with Elasticsearch?
If so, how can I do it?
The previous solution works really well as long as no single document carries more than one multi-level tag. Once a document has several, a simple aggregation no longer works, because the flat structure of the Lucene fields mixes the results of the inner aggregation.
See the example below:
DELETE /test_category
POST /test_category
# Insert a doc with 2 hierarchical tags
POST /test_category/test/1
{
"categories": [
{
"cat_1": "1",
"cat_2": "1.1"
},
{
"cat_1": "2",
"cat_2": "2.2"
}
]
}
# Simple two-levels aggregations query
GET /test_category/test/_search?search_type=count
{
"aggs": {
"main_category": {
"terms": {
"field": "categories.cat_1"
},
"aggs": {
"sub_category": {
"terms": {
"field": "categories.cat_2"
}
}
}
}
}
}
This is the WRONG response I got on ES 1.4, where the fields of the inner aggregation are mixed at the document level:
{
...
"aggregations": {
"main_category": {
"buckets": [
{
"key": "1",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "1.1",
"doc_count": 1
},
{
"key": "2.2", <= WRONG
"doc_count": 1
}
]
}
},
{
"key": "2",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "1.1", <= WRONG
"doc_count": 1
},
{
"key": "2.2",
"doc_count": 1
}
]
}
}
]
}
}
}
One solution is to use nested objects. These are the steps:
1) Define a new type in the schema with nested objects
POST /test_category/test2/_mapping
{
"test2": {
"properties": {
"categories": {
"type": "nested",
"properties": {
"cat_1": {
"type": "string"
},
"cat_2": {
"type": "string"
}
}
}
}
}
}
# Insert a single document
POST /test_category/test2/1
{"categories":[{"cat_1":"1","cat_2":"1.1"},{"cat_1":"2","cat_2":"2.2"}]}
2) Run a nested aggregation query:
GET /test_category/test2/_search?search_type=count
{
"aggs": {
"categories": {
"nested": {
"path": "categories"
},
"aggs": {
"main_category": {
"terms": {
"field": "categories.cat_1"
},
"aggs": {
"sub_category": {
"terms": {
"field": "categories.cat_2"
}
}
}
}
}
}
}
}
That's the response, now correct, that I have got:
{
...
"aggregations": {
"categories": {
"doc_count": 2,
"main_category": {
"buckets": [
{
"key": "1",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "1.1",
"doc_count": 1
}
]
}
},
{
"key": "2",
"doc_count": 1,
"sub_category": {
"buckets": [
{
"key": "2.2",
"doc_count": 1
}
]
}
}
]
}
}
}
}
The same solution can be extended to a facet hierarchy with more than two levels.
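The difference between the two responses can be reproduced with a small Python sketch. It is only a model of the behaviour, not how Elasticsearch works internally: the "flat" version treats each document as two independent sets of values, while the "nested" version keeps each cat_1/cat_2 pair together. Both function names are hypothetical, and the nested version counts nested objects rather than parent documents.

```python
from collections import Counter, defaultdict

def flat_facets(docs):
    """Model flattened fields: every cat_1 of a doc pairs with every cat_2 of that doc."""
    counts = defaultdict(Counter)
    for doc in docs:
        cat_1_values = {c["cat_1"] for c in doc["categories"]}
        cat_2_values = {c["cat_2"] for c in doc["categories"]}
        for cat_1 in cat_1_values:
            for cat_2 in cat_2_values:
                counts[cat_1][cat_2] += 1
    return {key: dict(sub) for key, sub in counts.items()}

def nested_facets(docs):
    """Model nested objects: each cat_1/cat_2 pair is counted as a unit."""
    counts = defaultdict(Counter)
    for doc in docs:
        for category in doc["categories"]:
            counts[category["cat_1"]][category["cat_2"]] += 1
    return {key: dict(sub) for key, sub in counts.items()}

# The single document from the example above.
docs = [{
    "categories": [
        {"cat_1": "1", "cat_2": "1.1"},
        {"cat_1": "2", "cat_2": "2.2"},
    ]
}]

flat = flat_facets(docs)
nested = nested_facets(docs)
print(flat)
print(nested)
```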
Currently, Elasticsearch does not support hierarchical faceting out of the box. But the upcoming 1.0 release features a new aggregations module that can be used to get this kind of facet (which is more like a pivot facet than a hierarchical facet). Version 1.0 is currently in beta; you can download the second beta and try out aggregations yourself. Your example might look like this:
curl -XPOST 'localhost:9200/_search?pretty' -d '
{
"aggregations": {
"main category": {
"terms": {
"field": "cat_1",
"order": {"_term": "asc"}
},
"aggregations": {
"sub category": {
"terms": {
"field": "cat_2",
"order": {"_term": "asc"}
}
}
}
}
}
}'
The idea is to have a separate field for each level of faceting and to bucket your facets by the terms of the first level (cat_1). These aggregations then have sub-buckets based on the terms of the second level (cat_2). The result may look like this:
{
"aggregations" : {
"main category" : {
"buckets" : [ {
"key" : "category 1",
"doc_count" : 10,
"sub category" : {
"buckets" : [ {
"key" : "subcategory 1",
"doc_count" : 4
}, {
"key" : "subcategory 2",
"doc_count" : 6
} ]
}
}, {
"key" : "category 2",
"doc_count" : 7,
"sub category" : {
"buckets" : [ {
"key" : "subcategory 1",
"doc_count" : 3
}, {
"key" : "subcategory 2",
"doc_count" : 4
} ]
}
} ]
}
}
}