How to extract the data present in {} in Splunk Search

Data in JSON array format {[]} gets extracted automatically; however, data in plain {} format, as shown below, doesn't behave the same way. How can fields and values be extracted from data in {}?
_raw data:
{"AlertEntityId": "abc#domai.com", "AlertId": "21-3-1-2-4--12", "AlertType": "System", "Comments": "New alert", "CreationTime": "2022-06-08T16:52:51", "Data": "{\"etype\":\"User\",\"eid\":\"abc#domai.com\",\"op\":\"UserSubmission\",\"tdc\":\"1\",\"suid\":\"abc#domai.com\",\"ut\":\"Regular\",\"ssic\":\"0\",\"tsd\":\"Jeff Nichols <jeff#Nichols.com>\",\"sip\":\"1.2.3.4\",\"srt\":\"1\",\"trc\":\"abc#domai.com\",\"ms\":\"Grok - AI/ML summary, case study, datasheet\",\"lon\":\"UserSubmission\"}"}
When I run the query | table Data, I get the result below. But how do I get the values of "eid" and "tsd"?
{"etype":"User","eid":"abc#domai.com","op":"UserSubmission","tdc":"1","suid":"abc#domai.com","ut":"Regular","ssic":"0","tsd":"Jeff Nichols <jeff#Nichols.com>","sip":"1.2.3.4","srt":"1","trc":"abc#domai.com","ms":"Grok - AI/ML summary, case study, datasheet","lon":"UserSubmission"}

| spath
By default this parses the _raw field. If the data is in the field "Data", use:
| spath input=Data
After which eid and tsd will be in fields of the same name.
https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Spath
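Putting it together, a minimal end-to-end search might look like this (the index and sourcetype names are placeholders for your own):
index=your_index sourcetype=your_sourcetype
| spath
| spath input=Data
| table eid tsd
The first spath extracts Data from _raw; the second parses the JSON string embedded in Data.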

Related

ADF DataFlow Activity how to create dynamic derived column

I have a fixed-width txt file as input source.
A test file sample is below:
column_1
12ABC3455
13XYZ5678
How do I build a dynamic column pattern to produce derived columns?
column name: empId -> substring(column_1,1,2)
In the Derived Column settings I can hardcode empId and substring(column_1,1,2) in the expression, but I need to make it dynamic, using JSON input to derive the columns via a column pattern.
Below is my sample JSON input parameter:
[
  {
    "colname": "empid",
    "startpos": 1,
    "length": 2
  },
  {
    "colname": "empname",
    "startpos": 3,
    "length": 3
  },
  {
    "colname": "empSal",
    "startpos": 6,
    "length": 4
  }
]
Help me build the column pattern with the JSON input.
I tested many times and couldn't achieve it. In my experience, I'm afraid it's impossible to do that in Data Factory activities or Data Flow with a JSON parameter.
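For comparison, a sketch of the static version that does work: hardcode each derived column in the Derived Column settings using the Data Flow expression language (positions taken from the JSON above; substring here is 1-based):
empid   : substring(column_1, 1, 2)
empname : substring(column_1, 3, 3)
empSal  : substring(column_1, 6, 4)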

Accessing values in JSON array

I am following the instructions in the documentation for how to access JSON values in CloudWatch Insights, where the recommendation is as follows:
JSON arrays are flattened into a list of field names and values. For example, to specify the value of instanceId for the first item in requestParameters.instancesSet, use requestParameters.instancesSet.items.0.instanceId.
ref: https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_AnalyzeLogData-discoverable-fields.html
I am trying the following and getting nothing in return. The intellisense autofills up to processList.0 but no further:
fields processList.0.vss
| sort @timestamp desc
| limit 1
The JSON I am working with is:
"processList": [
{
"vss": xxxxx,
"name": "aurora",
"tgid": xxxx,
"vmlimit": "unlimited",
"parentID": 1,
"memoryUsedPc": 16.01,
"cpuUsedPc": 0.01,
"id": xxxxx,
"rss": xxxxx
},
{
"vss": xxxx,
"name": "aurora",
"tgid": xxxxxx,
"vmlimit": "unlimited",
"parentID": 1,
"memoryUsedPc": 16.01,
"cpuUsedPc": 0.06,
"id": xxxxx,
"rss": xxxxx
}]
Have you tried the following?
fields @timestamp, processList.0.vss
| sort @timestamp desc
| limit 5
It may be a syntax error. If not, please post a couple of records' worth of the overall structure, with @timestamp included.
The reference link that you have posted also states the following.
CloudWatch Logs Insights can extract a maximum of 100 log event fields from a JSON log. For extra fields that are not extracted, you can use the parse command to parse these fields from the raw unparsed log event in the message field.
https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/CWL_AnalyzeLogData-discoverable-fields.html
For very large JSON messages, Insights intellisense may not parse all the fields into named fields. So the solution is to use parse on the complete JSON string in the field where you expect your data field to be present; in your example and mine, that is processList.
I was able to extract the value of specific cpuUsedPc under processList by using a query like the following.
fields @timestamp, cpuUtilization.total, processList
| parse processList /"name":"RDS processes","tgid":.*?,"parentID":.*?,"memoryUsedPc":.*?,"cpuUsedPc":(?<RDSProcessesCPUUsedPc>.*?),/
| sort @timestamp asc
| display @timestamp, cpuUtilization.total, RDSProcessesCPUUsedPc
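Applying the same idea to the original question, a sketch that pulls vss for the first process entry (firstVss is an arbitrary capture-group name, and the regex assumes each entry starts with "vss" as shown above):
fields @timestamp, processList
| parse processList /"vss": (?<firstVss>\d+)/
| sort @timestamp desc
| limit 1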

Exporting data from big query to GCS - include columns with null values

I'm trying to export a BigQuery table to Google Cloud Storage with the format specified as JSON.
I noticed that columns with null values are not included in the resulting JSON files.
So is there a way to get all the fields of a row converted into JSON?
My intention is to export data from the table, do some transformations, and reload the data back into a new table, so I basically need all fields to be included in the generated JSON files.
For example, I tried exporting the bigquery-public-data.samples.wikipedia table.
After exporting, the JSON rows include only columns with a non-null value:
{
  "title": "Strait of Messina Bridge",
  "id": "1462053",
  "language": "",
  "wp_namespace": "0",
  "revision_id": "115349459",
  "contributor_ip": "80.129.30.196",
  "timestamp": "1173977859",
  "comment": "/* Controversy and concerns */",
  "num_characters": "20009"
}
A few columns like contributor_id, contributor_username, and others with null values are not included in the generated JSON.
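One workaround to sketch (using the public table above): serialize each row yourself with TO_JSON_STRING, which writes null fields explicitly as null, and export the result of that query instead of the raw table:
SELECT TO_JSON_STRING(t) AS json_row
FROM `bigquery-public-data.samples.wikipedia` t
Each json_row then contains every column, including the null ones, so the transform-and-reload step sees a stable set of fields.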

Query data inside an attribute array in a json column in Postgres 9.6

I have a table, say types, which has a JSON column, say location, that looks like this:
{ "attribute":[
{
"type": "state",
"value": "CA"
},
{
"type": "distance",
"value": "200.00"
} ...
]
}
Each row in the table has this data, and every row has "type": "state" in it. I just want to extract the value for "type": "state" from every row and put it in a new column. I checked out several questions on SO, like:
Query for element of array in JSON column
Index for finding an element in a JSON array
Query for array elements inside JSON type
but could not get it working. I do not need to query on this. I need the value of this column. I apologize in advance if I missed something.
create table t(data json);
insert into t values('{"attribute":[{"type": "state","value": "CA"},{"type": "distance","value": "200.00"}]}'::json);

-- unnest the "attribute" array with json_array_elements, then keep
-- only the elements whose type is 'state'
select elem->>'value' as state
from t, json_array_elements(t.data->'attribute') elem
where elem->>'type' = 'state';
| state |
| :---- |
| CA |
dbfiddle here
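Since the goal is to put the value in a new column, a sketch of the corresponding update using the question's types/location names (assuming at most one state entry per row):
alter table types add column state text;

-- copy the state value out of the JSON array into the new column
update types
set state = (
  select elem->>'value'
  from json_array_elements(location->'attribute') elem
  where elem->>'type' = 'state'
  limit 1
);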
I mainly use Redshift, where there is a built-in function to do this, so on the off-chance you're there, check it out:
redshift docs
It looks like Postgres has a similar function set:
https://www.postgresql.org/docs/current/static/functions-json.html
I think you'll need to chain three functions together to make this work.
SELECT
  your_field::json->'attribute'->0->'value'
FROM
  your_table
What I'm trying is a json extract by key name, followed by a json array extract by index (always the 1st, if your example is consistent with the full data), followed finally by another extract by key name.
Edit: got it working for your example
SELECT
  '{ "attribute":[
      {
        "type": "state",
        "value": "CA"
      },
      {
        "type": "distance",
        "value": "200.00"
      }
    ]
  }'::json->'attribute'->0->'value'
Returns "CA"
2nd edit: nested querying
@McNets's is the right, better answer. But in this dive, I discovered you can nest queries in Postgres! How frickin' cool!
I stored the JSON as a text field in a dummy table and successfully ran this:
SELECT
  (SELECT value
   FROM json_to_recordset(my_column::json->'attribute') AS x(type text, value text)
   WHERE type = 'state')
FROM dummy_table

Issues with JSON_EXTRACT in Presto for keys containing ' ' character

I'm using Presto (0.163) to query data and am trying to extract fields from a JSON.
I have a JSON like the one given below in the column style_attributes:
"attributes": {
"Brand Fit Name": "Regular Fit",
"Fabric": "Cotton",
"Fit": "Regular",
"Neck or Collar": "Round Neck",
"Occasion": "Casual",
"Pattern": "Striped",
"Sleeve Length": "Short Sleeves",
"Tshirt Type": "T-shirt"
}
I'm unable to extract the 'Sleeve Length' field (whose value is 'Short Sleeves').
Below is the query I'm using:
Select JSON_EXTRACT(style_attributes,'$.attributes.Sleeve Length') as length from table;
The query fails with the following error: Invalid JSON path: '$.attributes.Sleeve Length'
For keys without a ' ' (space), the query runs fine.
I tried to find a resolution in the Presto documentation, but without success.
presto:default> select json_extract_scalar('{"attributes":{"Sleeve Length": "Short Sleeves"}}','$.attributes["Sleeve Length"]');
_col0
---------------
Short Sleeves
or
presto:default> select json_extract_scalar('{"attributes":{"Sleeve Length": "Short Sleeves"}}','$["attributes"]["Sleeve Length"]');
_col0
---------------
Short Sleeves
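Applied to the question's column, that becomes (style_attributes and table are the names from the question):
SELECT json_extract_scalar(style_attributes, '$.attributes["Sleeve Length"]') AS length
FROM table;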
JSON Function Changes
The json_extract and json_extract_scalar functions now support the square bracket syntax:
SELECT json_extract(json, '$.store[book]');
SELECT json_extract(json, '$.store["book name"]');
As part of this change, the set of characters allowed in a non-bracketed path segment has been restricted to alphanumeric, underscores and colons. Additionally, colons cannot be used in an un-quoted bracketed path segment. Use the new bracket syntax with quotes to match elements that contain special characters.
https://github.com/prestodb/presto/blob/c73359fe2173e01140b7d5f102b286e81c1ae4a8/presto-docs/src/main/sphinx/release/release-0.75.rst
SELECT
  tags -- the column with JSON string data
  ,json_extract(tags, '$.Brand') AS Brand
  ,json_extract(tags, '$.Portfolio') AS Portfolio
  ,cost
FROM TableName
Sample data for tags - {"Name": "pxyblob", "Owner": "", "Env": "prod", "Service": "", "Product": "", "Portfolio": "OPSXYZ", "Brand": "Limo", "AssetProtectionLevel": "", "ComponentInfo": ""}
Here is a correct answer.
Let's say:
JSON: {"Travel Date":"2017-9-22", "City": "Seattle"}
Column name: ITINERARY
And I want to extract 'Travel Date' from this JSON:
Query: SELECT JSON_EXTRACT(ITINERARY, "$.\"Travel Date\"") FROM Table
Note: just add \" at the start and end of the key name.
Hope this works for your need. :)