I have two JSON events below where, under "appliedConditionalAccessPolicies", one event has policy1 with result=failure and policy2 with result=notApplied, and the other event has the values reversed.
When I try to get only the event where policy1 has result="failure", the following search returns both events:
index=test
| spath path="appliedConditionalAccessPolicies{}" | search "appliedConditionalAccessPolicies{}.displayName"="policy1" "appliedConditionalAccessPolicies{}.result"="failure"
It looks like it's searching across all the elements in the array.
How can I make sure both conditions are evaluated against the same element of the array, so that only the event containing an element satisfying both conditions is returned?
Events :
appDisplayName: App1
appId: aaaa-1111-111aeff-aad222221111
appliedConditionalAccessPolicies: [
{
displayName: policy1
enforcedGrantControls: [
Block
]
enforcedSessionControls: [
SignInFrequency
ContinuousAccessEvaluation
]
id: f111113-111-400c-a251-2123bbe4233e1
result: failure
}
{
displayName: policy2
enforcedGrantControls: [
Block
]
enforcedSessionControls: [
]
id: sdsds-8c92-45ef-sdsds-c0b2e006d39b
result: notApplied
}
]
appDisplayName: App1
appId: aaaa-1111-111aeff-aad222221111
appliedConditionalAccessPolicies: [
{
displayName: policy1
enforcedGrantControls: [
Block
]
enforcedSessionControls: [
SignInFrequency
ContinuousAccessEvaluation
]
id: f111113-111-400c-a251-2123bbe4233e1
result: notApplied
}
{
displayName: policy2
enforcedGrantControls: [
Block
]
enforcedSessionControls: [
]
id: sdsds-8c92-45ef-sdsds-c0b2e006d39b
result: failure
}
]
The problem is that appliedConditionalAccessPolicies{}.displayName and appliedConditionalAccessPolicies{}.result are multi-value fields, so you need a way to determine whether the search matches at the same index in both multi-value fields.
Here is a way using mvfind:
mvfind gives you the index of the matching value within a multi-value field, so you can compare the two indexes. From my testing, mvfind does not cope with field names like appliedConditionalAccessPolicies{}.displayName and appliedConditionalAccessPolicies{}.result, so you need to rename them before you can use them with mvfind. This works for me:
| rename "appliedConditionalAccessPolicies{}.displayName" as displayName
| rename "appliedConditionalAccessPolicies{}.result" as result
| where mvfind(displayName,"policy1")=mvfind(result,"failure")
Here is a full example that you can play with:
| makeresults
| eval data="
{\"appDisplayName\":\"App1\",\"appId\":\"aaaa-1111-111aeff-aad222221111\",\"appliedConditionalAccessPolicies\":[{\"displayName\":\"policy1\",\"enforcedGrantControls\":[\"Block1\"],\"enforcedSessionControls\":[\"SignInFrequency\",\"ContinuousAccessEvaluation\"],\"id\":\"f111113-111-400c-a251-2123bbe4233e1\",\"result\":\"failure\"},{\"displayName\":\"policy2\",\"enforcedGrantControls\":[\"Block2\"],\"enforcedSessionControls\":[],\"id\":\"sdsds-8c92-45ef-sdsds-c0b2e006d39b\",\"result\":\"notApplied\"}]}
###
{\"appDisplayName\":\"App2\",\"appId\":\"aaaa-1111-111aeff-aad222221112\",\"appliedConditionalAccessPolicies\":[{\"displayName\":\"policy1\",\"enforcedGrantControls\":[\"Block1\"],\"enforcedSessionControls\":[\"SignInFrequency\",\"ContinuousAccessEvaluation\"],\"id\":\"f111113-111-400c-a251-2123bbe4233e1\",\"result\":\"notApplied\"},{\"displayName\":\"policy2\",\"enforcedGrantControls\":[\"Block2\"],\"enforcedSessionControls\":[],\"id\":\"sdsds-8c92-45ef-sdsds-c0b2e006d39b\",\"result\":\"failure\"}]}
"
| makemv data delim="###"
| mvexpand data
| spath input=data
| fields - data
| rename "appliedConditionalAccessPolicies{}.displayName" as displayName
| rename "appliedConditionalAccessPolicies{}.result" as result
| where mvfind(displayName,"policy1")=mvfind(result,"failure")
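For readers more comfortable outside SPL, the index-matching idea behind mvfind can be sketched in Python. This is only an illustration of the logic (sample values taken from the question; exact equality is used where SPL's mvfind actually takes a regex):

```python
# Sketch of the mvfind idea: an event matches only when "policy1" and
# "failure" occur at the SAME index of the two parallel multi-value fields.
def first_index(values, target):
    """Return the index of the first matching element, like SPL's mvfind
    (which matches a regex; plain equality is enough for this sketch)."""
    for i, v in enumerate(values):
        if v == target:
            return i
    return None

def matches(display_names, results):
    i = first_index(display_names, "policy1")
    return i is not None and i == first_index(results, "failure")

# Event 1: policy1 failed -> should match
print(matches(["policy1", "policy2"], ["failure", "notApplied"]))  # True
# Event 2: policy2 failed -> should not match
print(matches(["policy1", "policy2"], ["notApplied", "failure"]))  # False
```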
Here is a way using mvzip: (thanks to #warren)
You can join the multi-value fields together and then just search for the string that contains both values. It looks like mvzip also does not cope with field names like appliedConditionalAccessPolicies{}.displayName and appliedConditionalAccessPolicies{}.result, so you need to rename them before you can use them with mvzip. This works for me:
| rename "appliedConditionalAccessPolicies{}.displayName" as displayName
| rename "appliedConditionalAccessPolicies{}.result" as result
| where mvzip(displayName,result)="policy1,failure"
Here is a full example that you can play with:
| makeresults
| eval data="
{\"appDisplayName\":\"App1\",\"appId\":\"aaaa-1111-111aeff-aad222221111\",\"appliedConditionalAccessPolicies\":[{\"displayName\":\"policy1\",\"enforcedGrantControls\":[\"Block1\"],\"enforcedSessionControls\":[\"SignInFrequency\",\"ContinuousAccessEvaluation\"],\"id\":\"f111113-111-400c-a251-2123bbe4233e1\",\"result\":\"failure\"},{\"displayName\":\"policy2\",\"enforcedGrantControls\":[\"Block2\"],\"enforcedSessionControls\":[],\"id\":\"sdsds-8c92-45ef-sdsds-c0b2e006d39b\",\"result\":\"notApplied\"}]}
###
{\"appDisplayName\":\"App2\",\"appId\":\"aaaa-1111-111aeff-aad222221112\",\"appliedConditionalAccessPolicies\":[{\"displayName\":\"policy1\",\"enforcedGrantControls\":[\"Block1\"],\"enforcedSessionControls\":[\"SignInFrequency\",\"ContinuousAccessEvaluation\"],\"id\":\"f111113-111-400c-a251-2123bbe4233e1\",\"result\":\"notApplied\"},{\"displayName\":\"policy2\",\"enforcedGrantControls\":[\"Block2\"],\"enforcedSessionControls\":[],\"id\":\"sdsds-8c92-45ef-sdsds-c0b2e006d39b\",\"result\":\"failure\"}]}
"
| makemv data delim="###"
| mvexpand data
| spath input=data
| fields - data
| rename "appliedConditionalAccessPolicies{}.displayName" as displayName
| rename "appliedConditionalAccessPolicies{}.result" as result
| where mvzip(displayName,result)="policy1,failure"
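The mvzip pairing is easy to see in a small Python sketch (sample values from the question; this only illustrates the logic, not SPL itself):

```python
# Sketch of the mvzip idea: pair the parallel multi-value fields element by
# element into "displayName,result" strings, then look for the combined
# string "policy1,failure".
def mvzip(a, b, delim=","):
    return [delim.join(pair) for pair in zip(a, b)]

zipped = mvzip(["policy1", "policy2"], ["failure", "notApplied"])
print(zipped)                       # ['policy1,failure', 'policy2,notApplied']
print("policy1,failure" in zipped)  # True

# The reversed event pairs up as 'policy1,notApplied' / 'policy2,failure',
# so it does not contain "policy1,failure" and is filtered out.
print("policy1,failure" in mvzip(["policy1", "policy2"],
                                 ["notApplied", "failure"]))  # False
```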
I'm trying to pull a text value out of a column containing JSON by casting with ::varchar, but I get an invalid argument error on Snowflake while running in Mode. This JSON has a slightly different structure than what I'm used to seeing.
Have tried these to pull out the text:
changes:comment:new_value::varchar
changes:new_value::varchar
changes:comment::varchar
JSON looks like this:
{
"comment":
{
"new_value": "Hello there. Welcome to our facility.",
"old_value": ""
}
}
Wish to pull out the data in this column so the output reads:
Hello there. Welcome to our facility.
You can't extract fields from a VARCHAR. If your string is JSON, you have to convert it to the VARIANT type first, e.g. through the PARSE_JSON function.
Example below:
create or replace table x(v varchar) as select * from values('{
"comment":
{
"new_value": "Hello there. Welcome to our facility.",
"old_value": ""
}
}');
select v, parse_json(v):comment.new_value::varchar from x;
--------------------------------------------------------------+------------------------------------------+
V | PARSE_JSON(V):COMMENT.NEW_VALUE::VARCHAR |
--------------------------------------------------------------+------------------------------------------+
{ | Hello there. Welcome to our facility. |
"comment": | |
{ | |
"new_value": "Hello there. Welcome to our facility.", | |
"old_value": "" | |
} | |
} | |
--------------------------------------------------------------+------------------------------------------+
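The same "parse first, then navigate" pattern can be illustrated outside Snowflake. In this Python sketch, json.loads plays the role of PARSE_JSON and dict access plays the role of the :comment.new_value path (sample string from the question):

```python
import json

# PARSE_JSON ~ json.loads: turn the raw string into a navigable structure,
# then walk the comment -> new_value path.
v = '{"comment": {"new_value": "Hello there. Welcome to our facility.", "old_value": ""}}'
parsed = json.loads(v)                 # like PARSE_JSON(v)
print(parsed["comment"]["new_value"])  # Hello there. Welcome to our facility.
```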
I have the following JSON payload stored in a single string column in a BQ table.
{
"customer" : "ABC Ltd",
"custom_fields" : [
{
"name" : "DOB",
"value" : "2000-01-01"
},
{
"name" : "Account_Open_Date",
"value" : "2019-01-01"
}
]
}
I am trying to figure out how I can extract the custom_fields name/value pairs as columns, something like the following:
| Customer.name | Customer.DOB | Customer.Account_Open_Date |
| ABC Ltd | 2000-01-01 | 2019-01-01 |
You can use JSON functions, such as
JSON_EXTRACT(json_string_expr, json_path_string_literal)
In your case it will be
SELECT
JSON_EXTRACT(json_text, '$.customer') AS Customer_Name,
JSON_EXTRACT(json_text, '$.custom_fields[0].value') AS Customer_DOB,
JSON_EXTRACT(json_text, '$.custom_fields[1].value') AS Customer_Account_Open_Date
(Column aliases cannot contain dots in BigQuery, hence the underscores.)
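One caveat with the fixed [0]/[1] offsets is that they assume the custom_fields array always arrives in the same order. A sketch in Python of the safer pattern, pivoting the name/value pairs into a lookup keyed by "name" (column names here are illustrative):

```python
import json

# Pivot the custom_fields name/value pairs into a dict keyed by "name",
# so the result does not depend on array order (payload from the question).
payload = '''{
  "customer": "ABC Ltd",
  "custom_fields": [
    {"name": "DOB", "value": "2000-01-01"},
    {"name": "Account_Open_Date", "value": "2019-01-01"}
  ]
}'''
doc = json.loads(payload)
fields = {f["name"]: f["value"] for f in doc["custom_fields"]}
row = {"Customer_Name": doc["customer"],
       "Customer_DOB": fields["DOB"],
       "Customer_Account_Open_Date": fields["Account_Open_Date"]}
print(row)
```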
I receive JSON from API in the following format:
[
{
"scId": "000DD2",
"sensorId": 2,
"metrics": [
{
"s": 5414,
"dateTime": "2018-02-02T13:03:30+01:00"
},
{
"s": 5526,
"dateTime": "2018-02-02T13:04:56+01:00"
},
{
"s": 5631,
"dateTime": "2018-02-02T13:06:22+01:00"
}
]
}, .... ]
Currently trying to display these metrics on the linear chart with dateTime for the X-axis and "s" for Y.
I use the following search query:
index="main" source="rest://test3" | spath input=metrics{}.s| mvexpand metrics{}.s
| mvexpand metrics{}.dateTime | rename metrics{}.s as s
| rename metrics{}.dateTime as dateTime| table s,dateTime
And I receive the data in a format that is not usable for a linear chart. The question is: how do I correctly parse the JSON so that the dateTime field from the JSON becomes _time in Splunk?
Query results
#Max Zhylochkin,
Can you please try the following search?
index="main" source="rest://test3"
| spath input=metrics{}.s
| mvexpand metrics{}.s
| mvexpand metrics{}.dateTime
| rename metrics{}.s as s
| rename metrics{}.dateTime as dateTime
| table s,dateTime
| eval _time = strptime(dateTime,"%Y-%m-%dT%H:%M:%S%z")
Thanks
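One detail worth checking: the sample timestamps ("2018-02-02T13:03:30+01:00") carry a UTC offset and no sub-second part, so a %z-based format is the one that matches, not a .%3N one. Python's strptime shares these directives, so the parse can be sanity-checked there (Python 3.7+ accepts the colon in the offset):

```python
from datetime import datetime

# The sample timestamps have a UTC offset and no fractional seconds,
# so %z is the matching directive.
t = datetime.strptime("2018-02-02T13:03:30+01:00", "%Y-%m-%dT%H:%M:%S%z")
print(t.isoformat())  # 2018-02-02T13:03:30+01:00
```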
I have a table with one nested repeated field called article_id and a string field that contains a json string.
Here is the schema of my table:
Here is an example row of the table:
[
{
"article_id": "2732930586",
"author_names": [
{
"AuN": "h kanahashi",
"AuId": "2591665239",
"AfN": null,
"AfId": null,
"S": "1"
},
{
"AuN": "t mukai",
"AuId": "2607493793",
"AfN": null,
"AfId": null,
"S": "2"
},
{
"AuN": "y yamada",
"AuId": "2606624579",
"AfN": null,
"AfId": null,
"S": "3"
},
{
"AuN": "k shimojima",
"AuId": "2606600298",
"AfN": null,
"AfId": null,
"S": "4"
},
{
"AuN": "m mabuchi",
"AuId": "2606138976",
"AfN": null,
"AfId": null,
"S": "5"
},
{
"AuN": "t aizawa",
"AuId": "2723380540",
"AfN": null,
"AfId": null,
"S": "6"
},
{
"AuN": "k higashi",
"AuId": "2725066679",
"AfN": null,
"AfId": null,
"S": "7"
}
],
"extra_informations": "{
\"DN\": \"Experimental study for improvement of crashworthiness in AZ91 magnesium foam controlling its microstructure.\",
\"S\":[{\"Ty\":1,\"U\":\"https://shibaura.pure.elsevier.com/en/publications/experimental-study-for-improvement-of-crashworthiness-in-az91-mag\"}],
\"VFN\":\"Materials Science and Engineering\",
\"FP\":283,
\"LP\":287,
\"RP\":[{\"Id\":2024275625,\"CoC\":5},{\"Id\":2035451257,\"CoC\":5}, {\"Id\":2141952446,\"CoC\":5},{\"Id\":2126566553,\"CoC\":6}, {\"Id\":2089573897,\"CoC\":5},{\"Id\":2069241702,\"CoC\":7}, {\"Id\":2000323790,\"CoC\":6},{\"Id\":1988924750,\"CoC\":16}],
\"ANF\":[
{\"FN\":\"H.\",\"LN\":\"Kanahashi\",\"S\":1},
{\"FN\":\"T.\",\"LN\":\"Mukai\",\"S\":2},
{\"FN\":\"Y.\",\"LN\":\"Yamada\",\"S\":3},
{\"FN\":\"K.\",\"LN\":\"Shimojima\",\"S\":4},
{\"FN\":\"M.\",\"LN\":\"Mabuchi\",\"S\":5},
{\"FN\":\"T.\",\"LN\":\"Aizawa\",\"S\":6},
{\"FN\":\"K.\",\"LN\":\"Higashi\",\"S\":7}
],
\"BV\":\"Materials Science and Engineering\",\"BT\":\"a\"}"
}
]
In the extra_information.ANF I have an nested array that contains some more author name information.
The nested repeated author_name field has a sub-field author_name.S which can be mapped into extra_informations.ANF.S for a join. Using this mapping I am trying to achieve the following table:
| article_id | author_names.AuN | S | extra_information.ANF.FN | extra_information.ANF.LN|
| 2732930586 | h kanahashi | 1 | H. | Kanahashi |
| 2732930586 | t mukai | 2 | T. | Mukai |
| 2732930586 | y yamada | 3 | Y. | Yamada |
| 2732930586 | k shimojima | 4 | K. | Shimojima |
| 2732930586 | m mabuchi | 5 | M. | Mabuchi |
| 2732930586 | t aizawa | 6 | T. | Aizawa |
| 2732930586 | k higashi | 7 | K. | Higashi |
The primary problem I faced is that when I convert the JSON string using JSON_EXTRACT(extra_informations,"$.ANF"), it does not give me an array; instead it gives me the string form of the nested repeated array, which I could not convert into an array.
Is it possible to produce such a table using standard SQL in BigQuery?
Option 1
This is based on the REGEXP_REPLACE function and a few more functions (REPLACE, SPLIT, etc.) to manipulate the result. Note: the extra manipulation is needed because wildcards and filters are not supported in BigQuery's JsonPath expressions.
#standard SQL
SELECT
article_id, author.AuN, author.S,
REPLACE(SPLIT(extra, '","')[OFFSET(0)], '"FN":"', '') FirstName,
REPLACE(SPLIT(extra, '","')[OFFSET(1)], 'LN":"', '') LastName
FROM `table` , UNNEST(author_names) author
LEFT JOIN UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(extra_informations, '$.ANF'), r'\[{|}\]', ''), '},{')) extra
ON author.S = CAST(REPLACE(SPLIT(extra, '","')[OFFSET(2)], 'S":', '') AS INT64)
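To make the string surgery easier to follow, here is the same strip-and-split sequence sketched in Python on a shortened ANF string (two authors only, structure from the question):

```python
import re

# Same surgery as Option 1 on the extracted "$.ANF" string:
# 1) strip the outer [{ and }],
# 2) split on "},{" to get one chunk per author,
# 3) split each chunk on '","' and peel off the key prefixes.
anf = ('[{"FN":"H.","LN":"Kanahashi","S":1},'
       '{"FN":"T.","LN":"Mukai","S":2}]')
rows = []
for chunk in re.sub(r'\[\{|\}\]', '', anf).split('},{'):
    parts = chunk.split('","')
    fn = parts[0].replace('"FN":"', '')  # '"FN":"H.'  -> 'H.'
    ln = parts[1].replace('LN":"', '')   # 'LN":"Kanahashi' -> 'Kanahashi'
    s = int(parts[2].replace('S":', '')) # 'S":1' -> 1
    rows.append((s, fn, ln))
print(rows)  # [(1, 'H.', 'Kanahashi'), (2, 'T.', 'Mukai')]
```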
Option 2
To overcome this BigQuery JsonPath limitation, you can use a custom function, as the example below shows:
Note: it uses jsonpath-0.8.0.js, which can be downloaded from https://code.google.com/archive/p/jsonpath/downloads and is assumed to be uploaded to Google Cloud Storage - gs://your_bucket/jsonpath-0.8.0.js
#standard SQL
CREATE TEMPORARY FUNCTION CUSTOM_JSON_EXTRACT(json STRING, json_path STRING)
RETURNS STRING
LANGUAGE js AS """
try { var parsed = JSON.parse(json);
return jsonPath(parsed, json_path);
} catch (e) { return null }
"""
OPTIONS (
library="gs://your_bucket/jsonpath-0.8.0.js"
);
SELECT
article_id, author.AuN, author.S,
CUSTOM_JSON_EXTRACT(extra_informations, CONCAT('$.ANF[?(@.S==', CAST(author.S AS STRING), ')].FN')) FirstName,
CUSTOM_JSON_EXTRACT(extra_informations, CONCAT('$.ANF[?(@.S==', CAST(author.S AS STRING), ')].LN')) LastName
FROM `table`, UNNEST(author_names) author
As you can see, now you can do all the magic in one simple JsonPath expression.
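For readers unfamiliar with JSONPath filters, the filter used above simply selects the ANF entry whose S field equals the author's position. A plain-Python sketch of the same lookup on shortened sample data:

```python
import json

# Plain-Python analog of the JSONPath filter on $.ANF: parse the
# extra_informations string and pick the ANF entry whose "S" matches.
extra = '{"ANF":[{"FN":"H.","LN":"Kanahashi","S":1},{"FN":"T.","LN":"Mukai","S":2}]}'

def anf_lookup(extra_informations, s):
    anf = json.loads(extra_informations)["ANF"]
    return next((e for e in anf if e["S"] == s), None)

print(anf_lookup(extra, 2))  # {'FN': 'T.', 'LN': 'Mukai', 'S': 2}
```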