Pig: turn an array into multiple records - apache-pig

I have an array in my record:
{
"resource":"rest-api-v1",
"accessControlList":[
{
"methods":{
"methodTypes":[
"DELETE"
]
},
"Users":[
"user2"
]
},
{
"methods":{
"methodTypes":[
"CREATE"
]
},
"Users":[
"user1",
"user2"
]
}
]
}
The accessControlList array has 2 elements.
How can I turn this 1 record into 2?
I want the result to look like:
resource: rest-api-v1
accessControl:
{
"methods":{
"methodTypes":[
"DELETE"
]
},
"Users":[
"user2"
]
}
And
resource: rest-api-v1
accessControl:
{
"methods":{
"methodTypes":[
"CREATE"
]
},
"Users":[
"user1",
"user2"
]
}
In Hive I can do a LATERAL VIEW EXPLODE(), but in Pig I don't know how.

The Pig FLATTEN operator does what you're looking for. Applied to a bag, it produces one output record per element of the bag: https://pig.apache.org/docs/r0.16.0/basic.html#flatten
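As a rough illustration of the effect, here is a plain-Python sketch (not Pig itself) that pairs the resource with each element of accessControlList. The flatten_record helper and the accessControl output name are hypothetical and exist only for this example. In Pig the corresponding step would be a FOREACH ... GENERATE with FLATTEN on the bag field, assuming the JSON is loaded with a schema that makes accessControlList a bag.

```python
import json

record = {
    "resource": "rest-api-v1",
    "accessControlList": [
        {"methods": {"methodTypes": ["DELETE"]}, "Users": ["user2"]},
        {"methods": {"methodTypes": ["CREATE"]}, "Users": ["user1", "user2"]},
    ],
}


def flatten_record(rec, bag_field, out_field):
    """Yield one output record per element of rec[bag_field]."""
    for element in rec[bag_field]:
        # Copy every field except the bag, then attach a single element.
        row = {k: v for k, v in rec.items() if k != bag_field}
        row[out_field] = element
        yield row


rows = list(flatten_record(record, "accessControlList", "accessControl"))
for row in rows:
    print(json.dumps(row))
```

Each printed line carries the same resource value together with exactly one entry of the original array.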

getting lvalue and rvalue of a declaration

I'm parsing C++ with an ANTLR4 grammar, and I have a visitor function for visitDeclarationStatement. When the C++ code I'm trying to parse contains Person p; or a declaration of any other custom type, I get two similar nodes in the tree and I cannot differentiate between the lvalue and rvalue!
"declarationStatement": [
{
"blockDeclaration": [
{
"simpleDeclaration": [
{
"declSpecifierSeq": [
{
"declSpecifier": [
{
"typeSpecifier": [
{
"trailingTypeSpecifier": [
{
"simpleTypeSpecifier": [
{
"theTypeName": [
{
"className": [
{
"type": 128,
"text": "Person"
}
]
}
]
}
]
}
]
}
]
}
]
},
{
"declSpecifier": [
{
"typeSpecifier": [
{
"trailingTypeSpecifier": [
{
"simpleTypeSpecifier": [
{
"theTypeName": [
{
"className": [
{
"type": 128,
"text": "p"
}
]
}
]
}
]
}
]
}
]
}
]
}
]
},
{
"type": 124,
"text": ";"
}
]
}
I want to be able to get Variable Type and Variable Name separately. What is the right way of doing that? How can I change the g4 file to get those results in a way that I can differentiate between the type and the name?
Thanks
You don't need to change the grammar. In the case of Person p;, which is matched by:
declSpecifierSeq: declSpecifier+ attributeSpecifierSeq?;
the first child of declSpecifierSeq (which is declSpecifier+) will be a List of declSpecifier contexts, of which the first is the type and the second is the name.
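To make the "first is the type, second is the name" idea concrete, here is a small Python sketch that works on a JSON-like tree shaped like the dump in the question rather than on real ANTLR contexts. The leaf_text and spec helpers are hypothetical, and the subtrees are shortened for readability. In an actual visitor you would index into the list of declSpecifier children on the declSpecifierSeq context instead.

```python
def leaf_text(node):
    """Concatenate the "text" of every leaf token under a tree node."""
    if isinstance(node, list):
        return "".join(leaf_text(child) for child in node)
    if "text" in node:
        # Leaf token, e.g. {"type": 128, "text": "Person"}.
        return node["text"]
    # Inner node: a dict mapping rule names to lists of children.
    return "".join(leaf_text(children) for children in node.values())


def spec(name):
    """Hypothetical, shortened stand-in for one declSpecifier subtree."""
    return {
        "declSpecifier": [
            {"typeSpecifier": [{"className": [{"type": 128, "text": name}]}]}
        ]
    }


# Stand-in for the children of declSpecifierSeq when parsing `Person p;`.
decl_specifier_seq = [spec("Person"), spec("p")]

var_type = leaf_text(decl_specifier_seq[0])  # first declSpecifier: the type
var_name = leaf_text(decl_specifier_seq[1])  # second declSpecifier: the name
print(var_type, var_name)
```

This relies purely on position, so it only holds for simple declarations of the Person p; form discussed here.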

MongoError: Expression $in takes exactly 2 arguments. 1 were passed in

Below is my query.
While executing it I get a MongoError: Expression $in takes exactly 2 arguments. 1 were passed in.
I am using the $in comparison operator.
{
"$expr": {
"$not": {
"$eq":{
"$and": [
{
"PrName": {
"$in": [
"pname"
]
}
},
{
"AccountId": {
"$in": [
"34562",
"88765",
"87654",
"12345"
]
}
}
]
}
}
}
}
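When you use $expr, the operator expression syntax changes a little: $in takes the field expression and the array as two separate arguments. The query below also drops the $eq wrapper around $and.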
{ $in: [ <expression>, <array expression> ] }
{
"$expr": {
"$not": {
"$and": [
{
"$in": [
"$PrName",
[
"pname"
]
]
},
{
"$in": [
"$AccountId",
[
"34562",
"88765",
"87654",
"12345"
]
]
}
]
}
}
}
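If the query is assembled in application code, a pair of small builders can keep the two-argument form of $in consistent. The sketch below is in Python and only constructs the filter document. The in_expr and not_all names are hypothetical helpers, and passing the result to a driver call such as PyMongo's collection.find(query) is an assumption about how it would be used.

```python
def in_expr(field, values):
    """Build the aggregation form of $in: [ <expression>, <array expression> ]."""
    return {"$in": [f"${field}", list(values)]}


def not_all(*conditions):
    """Wrap the conjunction of the given expressions in $expr / $not / $and."""
    return {"$expr": {"$not": {"$and": list(conditions)}}}


query = not_all(
    in_expr("PrName", ["pname"]),
    in_expr("AccountId", ["34562", "88765", "87654", "12345"]),
)
print(query)
```

The field name is prefixed with $ so that it is read as a field path inside $expr, which is the part the original query was missing.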

Error while matching nested JSON array in Karate

Can anybody help me with the error below? I am not sure what I am missing; I guess it is something very simple.
assertion failed: path: $[0].drives[*], actual: [{"partitionData":[{"label":"Recovery"},{"label":""},{"label":"New Volume"},{"label":""}]}], expected: {partitionData=[{"label":"#present"}]}, reason: actual value does not contain expected
Below is my schema code:
* set schema
| path | 0 |
| drives | [{"partitionData": [{"label":"#present"}] }] |
Below is the output:
[
{
"drives": [
{
"partitionData": [
{
"label": "Recovery"
},
{
"label": ""
},
{
"label": "New Volume"
},
{
"label": ""
}
]
}
]
}
]
And match each output contains schema[0]
Since your question is confusing, here is a simple example. Don't use set if not needed.
* def schema = { "partitionData": [ { "label" : "#present" } ] }
* def response = { drives: [ { "partitionData": [ { "label" : "foo" } ] } ] }
* match each response.drives == schema

Karate: Match JSON Array responses where the order of array is different in each hit

I have a scenario where a portion of the response array is the response from a child API.
The child API response looks like the one below, but in no specific order. I need to check whether the child API response is present in the parent API response (irrespective of the order of the elements in the child API). I followed the "Karate - Match two dynamic responses" thread, but it's not working in my case.
* def response1 =
"""
{
"array1": [
{
"element": {
"id": "A1",
"array11": [
{
"uid": "u123",
"gid": [
"g1"
]
}
]
}
},
{
"element": {
"id": "A2",
"array11": [
{
"uid": "u124",
"gid": [
"g2"
]
}
]
}
}
]
}
"""
* def response2 =
"""
{
"array1": [
{
"element": {
"id": "A2",
"array11": [
{
"uid": "u124",
"gid": [
"g2"
]
}
]
}
},
{
"element": {
"id": "A1",
"array11": [
{
"uid": "u123",
"gid": [
"g1"
]
}
]
}
}
]
}
"""
This is a one liner :)
* match response2.array1 contains response1.array1
Guess what, you don't have to match pure JSON all the time, using child-sections is fine.
But also read this specific part of the docs: https://github.com/intuit/karate#contains-short-cuts
And this example: https://github.com/intuit/karate/blob/master/karate-demo/src/test/java/demo/graphql/graphql.feature
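For readers unfamiliar with what an order-independent contains check amounts to, here is a rough Python sketch of the idea, outside Karate. The contains_all helper is hypothetical, and it only approximates the behaviour. Karate's own matching rules are described in the docs linked above.

```python
def contains_all(actual, expected):
    """True if every item of expected equals some item of actual, in any order."""
    return all(item in actual for item in expected)


a1 = {"element": {"id": "A1", "array11": [{"uid": "u123", "gid": ["g1"]}]}}
a2 = {"element": {"id": "A2", "array11": [{"uid": "u124", "gid": ["g2"]}]}}

response1 = {"array1": [a1, a2]}
response2 = {"array1": [a2, a1]}  # same elements, different order

print(response1["array1"] == response2["array1"])
print(contains_all(response2["array1"], response1["array1"]))
```

A plain equality check on the two lists is order-sensitive, while the containment check ignores the order of the elements.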

How to query mongodb with “like” for number data type? [duplicate]

I want to regex search an integer value in MongoDB. Is this possible?
I'm building a CRUD type interface that allows * for wildcards on the various fields. I'm trying to keep the UI consistent for a few fields that are integers.
Consider:
> db.seDemo.insert({ "example" : 1234 });
> db.seDemo.find({ "example" : 1234 });
{ "_id" : ObjectId("4bfc2bfea2004adae015220a"), "example" : 1234 }
> db.seDemo.find({ "example" : /^123.*/ });
>
As you can see, I insert an object and I'm able to find it by the value. If I try a simple regex, I can't actually find the object.
Thanks!
If you want to do a pattern match on numbers, the way to do it in MongoDB is to use the $where expression and pass in a pattern match.
> db.test.find({ $where: "/^123.*/.test(this.example)" })
{ "_id" : ObjectId("4bfc3187fec861325f34b132"), "example" : 1234 }
I am not a big fan of the $where query operator because of the way it evaluates the query expression: it doesn't use indexes, and it is a security risk if the query uses user input data.
Starting from MongoDB 4.2 you can use $regexMatch, $regexFind, or $regexFindAll (available in MongoDB 4.1.9+) together with $expr to do this.
let regex = /123/;
$regexMatch and $regexFind
db.col.find({
"$expr": {
"$regexMatch": {
"input": {"$toString": "$name"},
"regex": /123/
}
}
})
$regexFindAll
db.col.find({
"$expr": {
"$gt": [
{
"$size": {
"$regexFindAll": {
"input": {"$toString": "$name"},
"regex": "123"
}
}
},
0
]
}
})
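The same filter can be built from application code. The Python sketch below constructs the $expr / $regexMatch document for an arbitrary field and, for illustration only, applies the equivalent check client-side to a few sample documents. The regex_match_filter and matches_locally helpers and the sample values are hypothetical. The server-side query still requires MongoDB 4.2+ as noted above.

```python
import re


def regex_match_filter(field, pattern):
    """Build a find() filter that regex-matches the string form of a field."""
    return {
        "$expr": {
            "$regexMatch": {
                "input": {"$toString": f"${field}"},
                "regex": pattern,
            }
        }
    }


def matches_locally(doc, field, pattern):
    """Client-side stand-in for the same check, for illustration only."""
    return re.search(pattern, str(doc[field])) is not None


query = regex_match_filter("example", "^123")
docs = [{"example": 1234}, {"example": 12334}, {"example": 4123}]
matched = [d["example"] for d in docs if matches_locally(d, "example", "^123")]
print(query)
print(matched)
```

The key step in both versions is converting the number to a string before the pattern is applied.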
From MongoDB 4.0 you can use the $toString operator, which is a wrapper around the $convert operator, to stringify integers.
db.seDemo.aggregate([
{ "$redact": {
"$cond": [
{ "$gt": [
{ "$indexOfCP": [
{ "$toString": "$example" },
"123"
] },
-1
] },
"$$KEEP",
"$$PRUNE"
]
}}
])
If you want to retrieve all the documents that contain a particular substring, starting from release 3.4 you can use the $redact operator, which allows $cond logic processing, together with $indexOfCP.
db.seDemo.aggregate([
{ "$redact": {
"$cond": [
{ "$gt": [
{ "$indexOfCP": [
{ "$toLower": "$example" },
"123"
] },
-1
] },
"$$KEEP",
"$$PRUNE"
]
}}
])
which produces:
{
"_id" : ObjectId("579c668c1c52188b56a235b7"),
"example" : 1234
}
{
"_id" : ObjectId("579c66971c52188b56a235b9"),
"example" : 12334
}
Prior to MongoDB 3.4, you need to $project your document and add another computed field that holds the string value of your number.
The $toLower operator and its sibling $toUpper convert a string to lowercase and uppercase respectively, but they have a little-known feature: they can also be used to convert an integer to a string.
The $match stage then returns all the documents that match your pattern using the $regex operator.
db.seDemo.aggregate(
[
{ "$project": {
"stringifyExample": { "$toLower": "$example" },
"example": 1
}},
{ "$match": { "stringifyExample": /^123.*/ } }
]
)
which yields:
{
"_id" : ObjectId("579c668c1c52188b56a235b7"),
"example" : 1234,
"stringifyExample" : "1234"
}
{
"_id" : ObjectId("579c66971c52188b56a235b9"),
"example" : 12334,
"stringifyExample" : "12334"
}