Create an array of strings from an existing array's keys - KQL

I have the following column returned from ARG:
{
"a": {
"key1": [
"text1",
"text2"
]
},
"b": {
"key2": [
"text1",
"text2"
]
}
}
I'm trying to create another column which would contain a list of all the keys.
So in the example above, the new column would contain:
["key1", "key2"]
I also see that not all KQL functionality is available in ARG, so I'm not sure whether what I'm trying to do is possible.

| mv-expand kind=array doc
| summarize make_list(bag_keys(doc[1]))
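To make the intended result concrete, here is a minimal Python sketch of the same transformation done outside ARG: for each top-level entry, take the keys of its nested object and gather them into one list. The function name `collect_nested_keys` is hypothetical, and the sample data is the column from the question with valid JSON punctuation assumed.

```python
from typing import Any


def collect_nested_keys(doc: dict[str, Any]) -> list[str]:
    """Return the keys of every nested object, in document order."""
    keys: list[str] = []
    for inner in doc.values():
        # Only nested objects contribute keys; scalars and arrays are skipped.
        if isinstance(inner, dict):
            keys.extend(inner.keys())
    return keys


doc = {
    "a": {"key1": ["text1", "text2"]},
    "b": {"key2": ["text1", "text2"]},
}
print(collect_nested_keys(doc))  # ['key1', 'key2']
```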

Want to parse string BLOB as JSON, modify few fields and parse it back to string BIGQUERY

So I have a column which contains JSONs as string BLOBs. For example,
{
"a": "a-1",
"b": "b-1",
"c":
{
"c-1": "c-1-1",
"c-2": "c-2-1"
},
"d":
[
{
"k": "v1"
},
{
"k": "v2"
},
{
"k": "v3"
}
]
}
Now my use case is to parse the key b, hash the value of key b, assign it back to the key b and store it back as a string in Bigquery.
I initially tried a lazy approach: I extract only key b with the json_extract_scalar function in BigQuery, and for the other keys (like c and d, which I don't want to modify) I use the json_extract function. After hashing the value of key b, I convert everything back to a string. Here is the query:
SELECT
TO_JSON_STRING(
STRUCT(
json_EXTRACT(COL_NAME, "$.a") AS a,
MD5(json_extract_scalar(COL_NAME,"$.b")) AS b,
json_EXTRACT(COL_NAME,"$.c") AS c,
json_EXTRACT(COL_NAME,"$.d") AS d ) )
FROM
dataset.TABLE
But the issue with this query is that the nested JSON objects are converted to strings and their double quotes are escaped by TO_JSON_STRING (I tried using CAST AS STRING on top of STRUCT, but it isn't supported). For example, the output row for this query looks like this:
{
"a": "a-1",
"b": "b-1",
"c":
"{
\"c-1\": \"c-1-1\",
\"c-2\": \"c-2-1\"
}",
"d":
"[
{
\"k\": \"v1\"
},
{
\"k\": \"v2\"
},
{
\"k\": \"v3\"
}
]"
}
I can achieve the required output if I use the JSON_EXTRACT and JSON_EXTRACT_SCALAR functions on every key (and on every nested key), but this approach isn't scalable, as I have close to 200 keys and many of them are nested 2-3 levels deep.
Can anyone suggest a better approach of achieving this? TIA
This should work:
declare _json string default """{
"a": "a-1",
"b": "b-1",
"c":
{
"c-1": "c-1-1",
"c-2": "c-2-1"
},
"d":
[
{
"k": "v1"
},
{
"k": "v2"
},
{
"k": "v3"
}
]
}""";
SELECT regexp_replace(_json, r'"b": "(\w+)-(\w+)"',concat('"b":',TO_JSON_STRING( MD5(json_extract_scalar(_json,"$.b")))))
Output:
{
"a": "a-1",
"b":"sdENsgFsL4PBOyX8sXDN6w==",
"c":
{
"c-1": "c-1-1",
"c-2": "c-2-1"
},
"d":
[
{
"k": "v1"
},
{
"k": "v2"
},
{
"k": "v3"
}
]
}
If you need a more specific regex, please give examples of the values that b can take.
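For comparison, the parse, modify, and re-serialize idea can be sketched in Python for cases where the string can be processed outside BigQuery. This is only an illustration under that assumption, not a BigQuery feature: the function name `hash_field` is hypothetical, and the base64 encoding is chosen to mirror how TO_JSON_STRING renders BYTES values. Untouched keys, including nested objects and arrays, stay real JSON values because the whole document is parsed once and dumped once. Note that the original whitespace is not preserved.

```python
import base64
import hashlib
import json


def hash_field(raw: str, field: str = "b") -> str:
    """Parse a JSON string, replace one top-level field with its MD5, re-serialize."""
    doc = json.loads(raw)
    digest = hashlib.md5(doc[field].encode("utf-8")).digest()
    # Encode the raw digest as base64 text so it can live in a JSON string.
    doc[field] = base64.b64encode(digest).decode("ascii")
    return json.dumps(doc)


raw = (
    '{"a": "a-1", "b": "b-1", '
    '"c": {"c-1": "c-1-1", "c-2": "c-2-1"}, '
    '"d": [{"k": "v1"}, {"k": "v2"}, {"k": "v3"}]}'
)
print(hash_field(raw))
```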

extract all occurrences of same field from request body splunk

The same field appears multiple times in one request body, and I need to find the value of each occurrence. For example, for the subTypeCodeId field in the body below, the result should be:
subTypeCodeId = 2
subTypeCodeId = 3
{
"Items": [
{
"emailId": "#stny.com",
"item": {
"subTypeCodeId": "2"
}
},
{
"emailId": "#comcast.com",
"item": {
"subTypeCodeId": "3"
}
}
]
}
Splunk query: index="gcp_prod_ecomm_cx_wallet" "1570081534220" "API_NAME:wallet.addItemsToWalletBulk" | rex "subTypeCodeId\x5C\":\x5C\"(?<subTypeCodeId>.*)\""
Use the max_match option of rex. It will make subTypeCodeId a multi-value field containing all values.
index="gcp_prod_ecomm_cx_wallet" "1570081534220" "API_NAME:wallet.addItemsToWalletBulk"
| rex max_match=0 "subTypeCodeId\x5C\":\x5C\"(?<subTypeCodeId>.*)\""
You might also want to look into the spath command, which can parse JSON data.
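The idea behind a JSON-aware extraction (parse the body, then collect every occurrence of the field) can be sketched in Python. This is not how spath is implemented; it only illustrates why parsing can be more robust than a regex. The function name `extract_all` is hypothetical, and the body is assumed to be valid, unescaped JSON like the sample in the question.

```python
import json
from typing import Any


def extract_all(body: str, field: str = "subTypeCodeId") -> list[Any]:
    """Collect the value of every occurrence of `field`, at any nesting depth."""
    found: list[Any] = []

    def walk(node: Any) -> None:
        if isinstance(node, dict):
            for key, value in node.items():
                if key == field:
                    found.append(value)
                else:
                    walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(json.loads(body))
    return found


body = """
{"Items": [
  {"emailId": "#stny.com", "item": {"subTypeCodeId": "2"}},
  {"emailId": "#comcast.com", "item": {"subTypeCodeId": "3"}}
]}
"""
print(extract_all(body))  # ['2', '3']
```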

Avro Schema for unstructured data with random names

I need to save nested data objects with unpredictable names in an Avro schema. For example:
{
"foo": "bar",
"baz": {
"randomName1": 0.23,
...
}
}
Because creating recursive maps is only possible with records, but records must have a field name, I think I need to transform the object into something else.
I thought about one of these three options:
(1) Array of nested key/value pairs
Example:
[
{
key: "foo",
value: "bar"
},
{
key: "baz",
value: [
{
key: "randomName1",
value: 0.23
},
...
]
},
]
(2) Flat map with dot-syntaxed key/value pairs
Example:
{
"foo": "bar",
"baz.randomName1": 0.23,
...
}
(3) Array of flattened key/value objects
Example:
[
{
"name": "foo",
"value": "bar"
},
{
"name": "baz.randomName1",
"value": 0.23
},
...
]
All three approaches translate well to Avro, but I am unsure of the implications of each approach, for example when trying to query those values via KSQL.
Any hint about potential gotchas further down the road is highly appreciated.
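To make options (2) and (3) concrete, here is a minimal Python sketch of the flattening step, assuming only nested objects (not arrays) need to be flattened. The helper names `flatten` and `to_pairs` are hypothetical. One thing the sketch makes visible: if an original key can itself contain a dot, the flattened names become ambiguous and a different separator or an escaping rule would be needed.

```python
from typing import Any


def flatten(obj: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Option (2): collapse nested objects into one map with dot-joined keys."""
    flat: dict[str, Any] = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def to_pairs(flat: dict[str, Any]) -> list[dict[str, Any]]:
    """Option (3): turn the flat map into an array of name/value objects."""
    return [{"name": name, "value": value} for name, value in flat.items()]


data = {"foo": "bar", "baz": {"randomName1": 0.23}}
flat = flatten(data)
print(flat)            # {'foo': 'bar', 'baz.randomName1': 0.23}
print(to_pairs(flat))
```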

Validating that a property value exists within the keys of an object

Wise crowd,
I already have a working JSON Schema (v0.7) to validate my data. This is an example of valid JSON:
{
"people": [
{ "id": 1, "name": "bob" },
...
]
}
Now I need to add a bunch of strings to it:
{
"people": [
{ "id": 1, "name": "bob", "appears_in": "long_string_id_1" },
{ "id": 2, "name": "ann", "appears_in": "long_string_id_1" }
...
],
"long_strings": {
"long_string_id_1": "blah blah blah.....",
...
}
}
What I need is:
a value for the key appears_in MUST be a key of the long_strings object
(optional) a key of the long_strings object MUST be used as the value of at least one appears_in key
Property dependencies are nice, but don't seem to address my needs.
Any idea?
And this question is not a duplicate, because I do not know the values in advance.
Sorry. You cannot do this in JSON schema. You cannot reference data in your schema.
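A common workaround is to keep the schema for structural validation and run the cross-reference check as a separate step in application code. The following Python sketch assumes the document shape from the question; the function name `check_references` is hypothetical, and it covers both the required rule and the optional one.

```python
from typing import Any


def check_references(doc: dict[str, Any]) -> list[str]:
    """Return a list of cross-reference errors; an empty list means the data is consistent."""
    defined = set(doc.get("long_strings", {}))
    used = {
        person["appears_in"]
        for person in doc.get("people", [])
        if "appears_in" in person
    }
    errors: list[str] = []
    # Required rule: every appears_in value must be a key of long_strings.
    for ref in sorted(used - defined):
        errors.append(f"appears_in references unknown key: {ref}")
    # Optional rule: every long_strings key must be referenced at least once.
    for key in sorted(defined - used):
        errors.append(f"long_strings key is never referenced: {key}")
    return errors


doc = {
    "people": [
        {"id": 1, "name": "bob", "appears_in": "long_string_id_1"},
        {"id": 2, "name": "ann", "appears_in": "long_string_id_1"},
    ],
    "long_strings": {"long_string_id_1": "blah blah blah....."},
}
print(check_references(doc))  # []
```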

Google sheets APIv4: Getting notes from cell

I can't find a way to retrieve notes from a cell. I looked here: https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/get
I can update the note by using the code here: https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/cells
... but I can't find a way to retrieve the note in the first place.
This is the result of a values.get call on a cell that has a note...
{u'range': u'Sheet1!D5', u'values': [[u'update CD data']], u'majorDimension': u'ROWS'}
as you can see, the notes to the cell are not there.
How about using sheets/data/rowData/values/note as the value of the fields parameter?
The endpoint is as follows.
GET https://sheets.googleapis.com/v4/spreadsheets/### Spreadsheet ID ###?fields=sheets%2Fdata%2FrowData%2Fvalues%2Fnote
From your profile, if you use Python, how about this?
response = service.spreadsheets().get(spreadsheetId=id, fields="sheets/data/rowData/values/note").execute()
Result:
When there are notes at "A1:B2", the result is as follows.
{
"sheets": [
{
"data": [
{
"rowData": [
{
"values": [
{
"note": "sample note A1"
},
{
"note": "sample note B1"
}
]
},
{
"values": [
{
"note": "sample note A2"
},
{
"note": "sample note B2"
}
]
}
]
}
]
}
]
}
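Since the notes come back nested by sheet, row, and column, a small helper can turn the response into a lookup by position. This is a sketch, not part of the API: the function name `notes_by_cell` is hypothetical, indices are zero-based, and it assumes that a grid without `startRow` or `startColumn` in the response begins at the first row and column (those keys are not returned with the fields mask used above).

```python
from typing import Any


def notes_by_cell(response: dict[str, Any]) -> dict[tuple[int, int, int], str]:
    """Map (sheet index, row index, column index) to the note on that cell."""
    notes: dict[tuple[int, int, int], str] = {}
    for sheet_index, sheet in enumerate(response.get("sheets", [])):
        for grid in sheet.get("data", []):
            # Assumption: missing offsets mean the grid starts at row 0, column 0.
            row0 = grid.get("startRow", 0)
            col0 = grid.get("startColumn", 0)
            for r, row in enumerate(grid.get("rowData", [])):
                for c, cell in enumerate(row.get("values", [])):
                    # Cells without a note are skipped.
                    if "note" in cell:
                        notes[(sheet_index, row0 + r, col0 + c)] = cell["note"]
    return notes


response = {
    "sheets": [{"data": [{"rowData": [
        {"values": [{"note": "sample note A1"}, {"note": "sample note B1"}]},
        {"values": [{"note": "sample note A2"}, {"note": "sample note B2"}]},
    ]}]}]
}
print(notes_by_cell(response)[(0, 1, 0)])  # sample note A2
```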
Reference:
CellData
If I misunderstood your question, please tell me and I will modify the answer.