Building a pandas condition query using a loop

I have an object filters that gives me the conditions to apply to a DataFrame, as shown below:
"filters": [
{
"dimension" : "dimension1",
"operator" : "IN",
"value": ["value1", "value2", "value3"],
"conjunction": None
},
{
"dimension" : "dimension2",
"operator" : "NOT IN",
"value": ["value1", "value2", "value3"],
"conjunction": "OR"
},
{
"dimension" : "dimension3",
"operator" : ">=",
"value": ["value1", "value2", "value3"],
"conjunction": None
},
{
"dimension" : "dimension4",
"operator" : "==",
"value": ["value1", "value2", "value3"],
"conjunction": "AND"
},
{
"dimension" : "dimension5",
"operator" : "<=",
"value": ["value1", "value2", "value3"],
"conjunction": None
},
{
"dimension" : "dimension6",
"operator" : ">",
"value": ["value1", "value2", "value3"],
"conjunction": "OR"
},
]
Here is the logic I used to build the SQL query:
conditionString = ""
for eachFilter in filters:
    dimension = eachFilter["dimension"]
    operator = eachFilter["operator"]
    value = eachFilter["value"]
    conjunction = eachFilter["conjunction"]
    if len(value) == 1:
        value = value[0]
        if operator not in ("IN", "NOT IN"):
            conditionString += f' {dimension} {operator} {value} {conjunction}'
        else:
            # IN / NOT IN wrap their value in parentheses
            conditionString += f' {dimension} {operator} ({value}) {conjunction}'
    else:
        value = ", ".join(value)
        if operator not in ("IN", "NOT IN"):
            conditionString += f' {dimension} {operator} {value} {conjunction}'
        else:
            conditionString += f' {dimension} {operator} ({value}) {conjunction}'
But pandas can't consume SQL-style condition strings like this, so I wanted to know whether there's a good way to loop over these filter conditions and apply them to a DataFrame. Note that these are the only operators I will be handling.
When the conjunction is None, it should default to "AND".
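One way to get there in pandas without evaluating strings at all is to map each operator to a function that returns a boolean mask, then combine the masks with & and |. Here is a minimal sketch of that idea (not code from the post; apply_filters is a hypothetical helper), assuming the filters list shown above, a DataFrame df whose columns match the dimension names, and that the comparison operators carry a single value:

import operator

import pandas as pd

# Map each operator string to a function that turns a column and a value
# into a boolean Series. "IN"/"NOT IN" use Series.isin; the rest come
# from the operator module.
OPS = {
    "IN": lambda s, v: s.isin(v),
    "NOT IN": lambda s, v: ~s.isin(v),
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "==": operator.eq,
    "!=": operator.ne,
}

def apply_filters(df: pd.DataFrame, filters: list) -> pd.DataFrame:
    mask = None
    pending = "AND"  # conjunction carried over from the previous filter
    for f in filters:
        value = f["value"]
        # Comparison operators expect a scalar, so unwrap single-element lists.
        if f["operator"] not in ("IN", "NOT IN") and len(value) == 1:
            value = value[0]
        cond = OPS[f["operator"]](df[f["dimension"]], value)
        if mask is None:
            mask = cond
        elif pending == "OR":
            mask = mask | cond
        else:
            mask = mask & cond
        # A None conjunction defaults to "AND", as required above.
        pending = f["conjunction"] or "AND"
    return df if mask is None else df.loc[mask]

Because every condition is an ordinary boolean Series, there is no string evaluation anywhere, and the AND/OR combination is controlled explicitly by the loop.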

I used the eval function to build up nested pandas filtering conditions and then evaluated them all at the end, as shown below:
evalString = ""
filterCheck = 0
for eachFilter in filtersArray:
    valueString = ""
    values = eachFilter[self.queryBuilderMap["FILTERS_MAP"]["VALUE"]]
    dimension = eachFilter[self.queryBuilderMap["FILTERS_MAP"]["DIMENSION"]]
    operator = eachFilter[self.queryBuilderMap["FILTERS_MAP"]["OPERATOR"]]
    conjunction = self.defineConjunction(eachFilter[self.queryBuilderMap["FILTERS_MAP"]["CONJUNCTION"]])
    if filterCheck == len(filtersArray) - 1:
        conjunction = ""  # no trailing conjunction after the last filter
    if operator.lower() == "in":
        for eachValue in values:
            # !r quotes string values so the eval expression is valid Python
            valueString += f"(df['{dimension}'] == {eachValue!r}) {conjunction} "
    elif operator.lower() == "not in":
        for eachValue in values:
            valueString += f"(df['{dimension}'] != {eachValue!r}) {conjunction} "
    else:
        for eachValue in values:
            valueString += f"(df['{dimension}'] {operator} {eachValue!r}) {conjunction} "
    evalString += valueString
    filterCheck += 1
df = eval(f'df.loc[{evalString}]')
return df
Here FILTERS_MAP is the following key-value mapping:
"FILTERS_MAP": {
"DIMENSION": "dimension",
"OPERATOR": "operator",
"VALUE": "value",
"CONJUNCTION": "conjunction",
"WRAPPER": "wrapper"
}
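If building a query string is still preferable, note that DataFrame.query understands in / not in as well as and / or, so the same kind of loop can target query instead of raw eval. A rough sketch (build_query is a hypothetical helper, not part of the code above), again assuming string values and a single value for the comparison operators:

def build_query(filters: list) -> str:
    parts = []
    for i, f in enumerate(filters):
        value = f["value"]
        if f["operator"] in ("IN", "NOT IN"):
            # query() accepts `col in ['a', 'b']` directly; !r renders the
            # Python list (and quotes its strings) for us.
            parts.append(f"`{f['dimension']}` {f['operator'].lower()} {value!r}")
        else:
            scalar = value[0] if len(value) == 1 else value
            parts.append(f"`{f['dimension']}` {f['operator']} {scalar!r}")
        if i < len(filters) - 1:
            # A None conjunction defaults to "and", per the requirement.
            parts.append((f["conjunction"] or "and").lower())
    return " ".join(parts)

filtered = df.query(build_query(filters))

The backticks around the column name let query handle columns whose names contain spaces, and query only evaluates a restricted expression against the frame's columns, which makes it a safer target than the built-in eval.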

Related

Kotlin - switching object detail based on group by

For example, I have a class with the below JSON format:
[
    {
        "name": "a",
        "detail": ["1", "2", "3"]
    },
    {
        "name": "b",
        "detail": ["2", "3", "4"]
    }
]
How can I regroup it based on the detail?
[
    {
        "detail": "1",
        "name": ["a"]
    },
    {
        "detail": "2",
        "name": ["a", "b"]
    },
    {
        "detail": "3",
        "name": ["a", "b"]
    },
    {
        "detail": "4",
        "name": ["b"]
    }
]
Below is my class structure:
data class funName(
    @field:JsonProperty("name")
    val name: String = "",
    @field:JsonProperty("detail")
    val detail: Array<String> = arrayOf(""),
)
And my object is based on an array of funName:
val data: Array<funName> = ...
I really have no idea how to do it.
val convert = data.groupBy { x -> x.detail } ??
Is this doable in Kotlin/Java?
Since the original data is grouped by name, you can think of it as a list of pairs:
name  detail
a     1
a     2
a     3
b     2
b     3
b     4
Mapping it to this format first would make it very easy to group by the second thing (detail) in the pair.
Since each funName corresponds to multiple pairs like this, you should use flatMap on data.
val result = data
    .flatMap { funName -> funName.detail.map { funName.name to it } }
    .groupBy(keySelector = { (name, detail) -> detail }, valueTransform = { (name, detail) -> name })
// or more concisely, but less readably:
// .groupBy({ it.second }) { it.first }
This will get you a Map<String, List<String>>.
If you want a List<Result>, where Result is something like
data class Result(
    val detail: String = "",
    val names: List<String> = listOf(),
)
You can add an additional map:
.map { (k, v) -> Result(k, v) }

How to verify the response has at least one valid array data?

I got a response like:
{
    "by_group": [
        {
            "key": "2021-03-17T00:00:00.000+08:00",
            "by_state": [
                { "key": "STATE1", "value": 1 },
                { "key": "STATE2", "value": 2 }
            ]
        },
        {
            "key": "2021-03-20T00:00:00.000+08:00",
            "by_state": [
                { "key": "STATE3", "value": 3 },
                { "key": "STATE4", "value": 4 }
            ]
        },
        {
            "key": "2021-03-24T00:00:00.000+08:00",
            "by_state": []
        }
    ]
}
schema used here:
* def schema2 = { key : '#string', value : '##number? _ >= 0' }
* def schema1 = { key : '#string', by_state : '#[_ > 0] schema2' }
And match response ==
"""
{
by_group: '#[_ > 0] schema1'
}
"""
It fails when it reaches the 3rd item, which has an empty by_state array, but we want to allow that.
We just need to make sure there is at least one valid "by_state" array in the response.
Your requirement can be expressed as one line:
* match response.by_group contains { key: '#string', by_state: '#[_ > 0]' }
Also note karate.filter(), which lets you filter out the elements you want and then count the number of results returned, etc.: https://github.com/intuit/karate#json-transforms

How to create a map of Excel file key-value without using column name in Dataweave?

I am reading an Excel file (.xlsx) into a JSON array and creating a map from it, because I want to apply validations to each column individually. I am able to access values using the column name.
The Excel file is:
column A, column B
value of Column A, value of column B
I am accessing it like this :
payload map(item, index) ->
    "Column Name A": item."Column Name A",
    "Column Name B": item."Column Name B"
where Column Name A and B are the Excel column headers.
What I want to do is to create the same map but using the column index like
payload map(item, index) ->
    item[0].key: item[0],
    item[1].key: item[1]
That way I would not have to hard-code the Excel header names and could rely on the index of the Excel columns.
I have tried using pluck $$ to create a map of keys, but I cannot create a key-value map that way; I am not able to use item[0] as a key in a map.
How can I achieve the above without using the Excel column header names?
The expected output should be like this:
{
    "Column A": "value of Column A",
    "Column B": "value of Column B",
    "Errors": "Column A is not valid"
}
Assuming that you'd like to validate each payload item loaded from an Excel file, you could use the following DataWeave expression:
%dw 2.0
output application/json
fun validate(col, val) =
    if (isEmpty(val)) {"error": col ++ ": value is null or empty"}
    else {}
fun validateRow(row) =
    "Errors": flatten([] << ((row mapObject ((value, key, index) -> ((validate((key), value))))).error default []))
---
payload map (item, index) -> item ++ validateRow(item)
Using the following input payload:
[
{"col1": "val1.1", "col2": "val1.2", "col3": "val1.3"},
{"col1": "val2.1", "col2": "val2.2", "col3": null}
]
would result in:
[
    {
        "col1": "val1.1",
        "col2": "val1.2",
        "col3": "val1.3",
        "Errors": []
    },
    {
        "col1": "val2.1",
        "col2": "val2.2",
        "col3": null,
        "Errors": [
            "col3: value is null or empty"
        ]
    }
]
The expression will result in an output slightly different than the one you're expecting, but this version will allow you to have an array of error messages that can be easier to manipulate later on in your flow.
One thing to keep in mind is the possibility to have more than one error message per column. If that's the case, then the DataWeave expression would need some adjustments.
Try just using the index. It should work just fine.
%dw 2.0
output application/json
---
({ "someKey": "Val1", "lksajdfkl": "Val2" })[1]
results in:
"Val2"
And if you want to use a variable as a key, you have to wrap it in parentheses.
For example, to transform { "key": "SomeOtherKey", "val": 123 } into { "SomeOtherKey": 123 }, you could do (payload.key): payload.val
Try this:
%dw 2.0
output application/json
var rules = {
    "0": {
        key: "Column A",
        val: (val) -> !isEmpty(val)
    },
    "1": {
        key: "Column B",
        val: (val) -> val ~= "value of Column B"
    }
}
fun validate(v, k, i) =
    [
        ("Invalid column name: '$(k)' should be '$(rules[i].key)'") if (rules[i]? and rules[i].key? and k != rules[i].key),
        ("Invalid value for $(rules[i].key): '$(v default "null")'") if (rules[i]? and rules[i].val? and (!(rules[i].val(v))))
    ]
fun validate(obj) =
    obj pluck { v: $, k: $$ as String, i: $$$ as String } reduce ((kvp, acc = {}) ->
        do {
            var validation = validate(kvp.v, kvp.k, kvp.i)
            ---
            {
                (acc - "Errors"),
                (kvp.k): kvp.v,
                ("Errors": (acc.Errors default []) ++
                    (if (sizeOf(validation) > 0) validation else [])
                ) if (acc.Errors? or sizeOf(validation) > 0)
            }
        }
    )
---
payload map validate($)
Output:
[
    {
        "Column A": "value of Column A",
        "Column B": "value of Column B"
    },
    {
        "Column A": "",
        "Column B": "value of Column B",
        "Errors": [
            "Invalid value for Column A: ''"
        ]
    },
    {
        "Column A": "value of Column A",
        "Column B": "value of Column C",
        "Errors": [
            "Invalid value for Column B: 'value of Column C'"
        ]
    },
    {
        "Column A": null,
        "Column C": "value of Column D",
        "Errors": [
            "Invalid value for Column A: 'null'",
            "Invalid column name: 'Column C' should be 'Column B'",
            "Invalid value for Column B: 'value of Column D'"
        ]
    }
]

how to remove objects that have all keys with null values in dataweave?

I have the payload below and I want to remove any object where ALL the keys have empty values.
[
    {
        "Order": "123",
        "Product": "456"
    },
    {
        "Order": "",
        "Product": ""
    }
]
This is what the output should look like:
[
    {
        "Order": "123",
        "Product": "456"
    }
]
None of the posted solutions handle things like nested structures or arrays, so I thought I'd throw this recursive solution in the ring. This allows us to traverse the entire structure of the object until we hit the first non-null field.
%dw 2.0
output application/json
import everyEntry from dw::core::Objects
import every from dw::core::Arrays
var allFieldsNull = (obj: Any) ->
    obj match {
        case is Object -> obj everyEntry (allFieldsNull($))
        case is Array -> (sizeOf(obj) == 0) or (obj every allFieldsNull($))
        // case is Array -> false
        else -> isEmpty(obj)
    }
---
payload filter !allFieldsNull($)
---
payload filter !allFieldsNull($)
If you wanted to consider an empty array as enough to keep the object since that technically isn't null, you would just need to comment out the case is Array line and uncomment the one below it.
Input:
[
    {
        "Order": "123",
        "Product": "456"
    },
    {
        "Order": "",
        "Product": "",
        "Address": {
            "Field1": ""
        },
        "Test": [
            {
                "Order": "",
                "Product": "",
                "Address": {
                    "Field1": ""
                }
            }
        ]
    },
    {
        "Order": null,
        "Product": null,
        "Address": {
            "Field1": null
        },
        "Test": [
            {
                "Order": null,
                "Product": null,
                "Address": {
                    "Field1": "A value even in a deeply nested field means I show up"
                }
            }
        ]
    }
]
Output:
[
    {
        "Order": "123",
        "Product": "456"
    },
    {
        "Order": null,
        "Product": null,
        "Address": {
            "Field1": null
        },
        "Test": [
            {
                "Order": null,
                "Product": null,
                "Address": {
                    "Field1": "A value even in a deeply nested field means I show up"
                }
            }
        ]
    }
]
Would something like this work for you?
Input
[
    {
        "Order": "123",
        "Product": "456"
    },
    {
        "Order": null,
        "Product": null
    }
]
Script
%dw 2.0
output application/json
import * from dw::core::Objects
var valuesOfInputObjects = payload map { ($ takeWhile((value, key) -> value == null))}
---
payload -- valuesOfInputObjects
Output:
[
    {
        "Order": "123",
        "Product": "456"
    }
]
You can filter by a condition, using the someEntry() function to check that at least one value is not empty.
%dw 2.0
output application/json
import * from dw::core::Objects
---
payload filter ($ someEntry (value, key) -> !isEmpty(value))

Comparing json in Karate

I have two JSON array responses with the same data, but the attribute names are different. How can I compare JSON like this?
json 1:
comments: [
    {
        "onetag1": "1",
        "onetag2": "2"
    },
    {
        "onetag11": "3",
        "onetage12": "4"
    }
]
json 2:
newcommentslist: [
    {
        "newtag2": "2",
        "newtag1": "1"
    },
    {
        "newtag11": "3",
        "newtage12": "4"
    }
]
Use JsonPath:
* def first = [ { "onetag1": "1", "onetag2": "2" }, { "onetag11": "3", "onetage12": "4" } ]
* def values = $first[*].*
* match values == ['1', '2', '3', '4']
Or transform one of them: https://stackoverflow.com/a/53120851/143475