Trying to pull out text value out of column with json using varchar but get an invalid argument error on snowflake while running on mode. This json has a bit of different structure that what I'm used to seeing.
Have tried these to pull out the text:
changes:comment:new_value::varchar
changes:new_value::varchar
changes:comment::varchar
JSON looks like this:
{
"comment":
{
"new_value": "Hello there. Welcome to our facility.",
"old_value": ""
}
}
Wish to pull out the data in this column so the output reads:
Hello there. Welcome to our facility.
You can't extract fields from VARCHAR. If your string is JSON, you have to convert it to the VARIANT type, e.g. through PARSE_JSON function.
Example below:
create or replace table x(v varchar) as select * from values('{
"comment":
{
"new_value": "Hello there. Welcome to our facility.",
"old_value": ""
}
}');
select v, parse_json(v):comment.new_value::varchar from x;
--------------------------------------------------------------+------------------------------------------+
V | PARSE_JSON(V):COMMENT.NEW_VALUE::VARCHAR |
--------------------------------------------------------------+------------------------------------------+
{ | Hello there. Welcome to our facility. |
"comment": | |
{ | |
"new_value": "Hello there. Welcome to our facility.", | |
"old_value": "" | |
} | |
} | |
--------------------------------------------------------------+------------------------------------------+
Related
In Splunk, I'm trying to extract the key value pairs inside that "tags" element of the JSON structure so each one of the become a separate column so I can search through them.
for example :
| spath data | rename data.tags.EmailAddress AS Email
This does not help though and Email field comes as empty.I'm trying to do this for all the tags. Any thoughts/pointers?
{
"timestamp": "2021-10-26T18:23:05.180707Z",
"data": {
"tags": [
{
"key": "Email",
"value": "john.doe#example.com"
},
{
"key": "ProjectCode",
"value": "ABCD"
},
{
"key": "Owner",
"value": "John Doe"
}
]
},
"field1": "random1",
"field2": "random2"
}
I think does what you want:
| spath data.tags{}
| mvexpand data.tags{}
| spath input=data.tags{}
| table key value
| transpose header_field=key
| fields - column
How it works:
| spath data.tags{} takes the json and creates a multi value field that contains each item in the tags array
| mvexpand data.tags{} splits the multi value field into individual events - each one contains one of the items in the tags array
| spath input=data.tags{} takes the json in each event and makes a field for each KVP in that item (key and value in this case)
| table key value limits further commands to these two fields
| transpose header_field=key makes a field for each value of the key field (including one for the field named column)`
| fields - column removes the column field from the output
Here is a fully runnable example:
| makeresults
| eval _raw="
{
\"timestamp\": \"2021-10-26T18:23:05.180707Z\",
\"data\": {
\"tags\": [
{\"key\": \"Email\", \"value\": \"john.doe#example.com\"},
{\"key\": \"ProjectCode\", \"value\": \"ABCD\"},
{\"key\": \"Owner\", \"value\": \"John Doe\"}
]
},
\"field1\": \"random1\",
\"field2\": \"random2\"
}
"
| spath data.tags{}
| mvexpand data.tags{}
| spath input=data.tags{}
| table key value
| transpose header_field=key
It creates this output:
+----------------------+-------------+----------+
| Email | ProjectCode | Owner |
+----------------------+-------------+----------+
| john.doe#example.com | ABCD | John Doe |
+----------------------+-------------+----------+
I have the following JSON payload stored in a single string column in a BQ table.
{
"customer" : "ABC Ltd",
"custom_fields" : [
{
"name" : "DOB",
"value" : "2000-01-01"
},
{
"name" : "Account_Open_Date",
"value" : "2019-01-01"
}
]
}
I am trying to figure out how I can extract the custom_fields name value pairs as columns?
Something like follows.
| Customer.name | Customer.DOB | Customer.Account_Open_Date |
| ABC Ltd | 2000-01-01 | 2019-01-01 |
You can use json-functions , such as
JSON_EXTRACT(json_string_expr, json_path_string_literal)
In your case will be
SELECT
JSON_EXTRACT(json_text, '$.customer') as Customer.Name,
JSON_EXTRACT(json_text, '$.custom_fields[0].value') as Customer.DOB,
JSON_EXTRACT(json_text, '$.custom_fields[1].value') as Customer.Account_Open_Date
I receive JSON from API in the following format:
[
{
"scId": "000DD2",
"sensorId": 2,
"metrics": [
{
"s": 5414,
"dateTime": "2018-02-02T13:03:30+01:00"
},
{
"s": 5526,
"dateTime": "2018-02-02T13:04:56+01:00"
},
{
"s": 5631,
"dateTime": "2018-02-02T13:06:22+01:00"
}
}, .... ]
Currently trying to display these metrics on the linear chart with dateTime for the X-axis and "s" for Y.
I use the following search query:
index="main" source="rest://test3" | spath input=metrics{}.s| mvexpand metrics{}.s
| mvexpand metrics{}.dateTime | rename metrics{}.s as s
| rename metrics{}.dateTime as dateTime| table s,dateTime
And I receive the data in the following format which is not applicable for linear chart. The point is - how to correctly parse the JSON to apply date-time from dateTime field in JSON to _time in Splunk.
Query results
#Max Zhylochkin,
Can you please try following search?
index="main" source="rest://test3"
| spath input=metrics{}.s
| mvexpand metrics{}.s
| mvexpand metrics{}.dateTime
| rename metrics{}.s as s
| rename metrics{}.dateTime as dateTime
| table s,dateTime
| eval _time = strptime(dateTime,"%Y-%m-%dT%H:%M:%S.%3N")
Thanks
I have a table with one nested repeated field called article_id and a string field that contains a json string.
Here is the schema of my table:
Here is an example row of the table:
[
{
"article_id": "2732930586",
"author_names": [
{
"AuN": "h kanahashi",
"AuId": "2591665239",
"AfN": null,
"AfId": null,
"S": "1"
},
{
"AuN": "t mukai",
"AuId": "2607493793",
"AfN": null,
"AfId": null,
"S": "2"
},
{
"AuN": "y yamada",
"AuId": "2606624579",
"AfN": null,
"AfId": null,
"S": "3"
},
{
"AuN": "k shimojima",
"AuId": "2606600298",
"AfN": null,
"AfId": null,
"S": "4"
},
{
"AuN": "m mabuchi",
"AuId": "2606138976",
"AfN": null,
"AfId": null,
"S": "5"
},
{
"AuN": "t aizawa",
"AuId": "2723380540",
"AfN": null,
"AfId": null,
"S": "6"
},
{
"AuN": "k higashi",
"AuId": "2725066679",
"AfN": null,
"AfId": null,
"S": "7"
}
],
"extra_informations": "{
\"DN\": \"Experimental study for improvement of crashworthiness in AZ91 magnesium foam controlling its microstructure.\",
\"S\":[{\"Ty\":1,\"U\":\"https://shibaura.pure.elsevier.com/en/publications/experimental-study-for-improvement-of-crashworthiness-in-az91-mag\"}],
\"VFN\":\"Materials Science and Engineering\",
\"FP\":283,
\"LP\":287,
\"RP\":[{\"Id\":2024275625,\"CoC\":5},{\"Id\":2035451257,\"CoC\":5}, {\"Id\":2141952446,\"CoC\":5},{\"Id\":2126566553,\"CoC\":6}, {\"Id\":2089573897,\"CoC\":5},{\"Id\":2069241702,\"CoC\":7}, {\"Id\":2000323790,\"CoC\":6},{\"Id\":1988924750,\"CoC\":16}],
\"ANF\":[
{\"FN\":\"H.\",\"LN\":\"Kanahashi\",\"S\":1},
{\"FN\":\"T.\",\"LN\":\"Mukai\",\"S\":2},
{\"FN\":\"Y.\",\"LN\":\"Yamada\",\"S\":3},
{\"FN\":\"K.\",\"LN\":\"Shimojima\",\"S\":4},
{\"FN\":\"M.\",\"LN\":\"Mabuchi\",\"S\":5},
{\"FN\":\"T.\",\"LN\":\"Aizawa\",\"S\":6},
{\"FN\":\"K.\",\"LN\":\"Higashi\",\"S\":7}
],
\"BV\":\"Materials Science and Engineering\",\"BT\":\"a\"}"
}
]
In the extra_information.ANF I have an nested array that contains some more author name information.
The nested repeated author_name field has a sub-field author_name.S which can be mapped into extra_informations.ANF.S for a join. Using this mapping I am trying to achieve the following table:
| article_id | author_names.AuN | S | extra_information.ANF.FN | extra_information.ANF.LN|
| 2732930586 | h kanahashi | 1 | H. | Kanahashi |
| 2732930586 | t mukai | 2 | T. | Mukai |
| 2732930586 | y yamada | 3 | Y. | Yamada. |
| 2732930586 | k shimojima | 4 | K. | Shimojima |
| 2732930586 | m mabuchi | 5 | M. | Mabuchi |
| 2732930586 | t aizawa | 6 | T. | Aizawa |
| 2732930586 | k higashi | 7 | K. | Higashi |
The primary problem I faced is that when I convert a json_string using JSON_EXTRACT(extra_information,"$.ANF"), it does not give me an array, instead it gives me the string format of the nested repeated array, which I could not convert into an array.
Is it possible to produce such table using standards-sql in bigquery?
Option 1
This is based on REGEXP_REPLACE function and few more functions (REPLACE, SPLIT, etc.) to manipulate with result. Note - we need extra manipulation because wildcards and filters are not supported in JsonPath expressions in BigQuery?
#standard SQL
SELECT
article_id, author.AuN, author.S,
REPLACE(SPLIT(extra, '","')[OFFSET(0)], '"FN":"', '') FirstName,
REPLACE(SPLIT(extra, '","')[OFFSET(1)], 'LN":"', '') LastName
FROM `table` , UNNEST(author_names) author
LEFT JOIN UNNEST(SPLIT(REGEXP_REPLACE(JSON_EXTRACT(extra_informations, '$.ANF'), r'\[{|}\]', ''), '},{')) extra
ON author.S = CAST(REPLACE(SPLIT(extra, '","')[OFFSET(2)], 'S":', '') AS INT64)
Option 2
To overcome BigQuery "limitation" for JsonPath, you can use custom function as the example below shows:
Note : it uses jsonpath-0.8.0.js that can be downloaded from https://code.google.com/archive/p/jsonpath/downloads and assumed to be uploaded to Google Cloud Storage - gs://your_bucket/jsonpath-0.8.0.js
#standard SQL
CREATE TEMPORARY FUNCTION CUSTOM_JSON_EXTRACT(json STRING, json_path STRING)
RETURNS STRING
LANGUAGE js AS """
try { var parsed = JSON.parse(json);
return jsonPath(parsed, json_path);
} catch (e) { return null }
"""
OPTIONS (
library="gs://your_bucket/jsonpath-0.8.0.js"
);
SELECT
article_id, author.AuN, author.S,
CUSTOM_JSON_EXTRACT(extra_informations, CONCAT('$.ANF[?(#.S==', CAST(author.S AS STRING), ')].FN')) FirstName,
CUSTOM_JSON_EXTRACT(extra_informations, CONCAT('$.ANF[?(#.S==', CAST(author.S AS STRING), ')].LN')) LastName
FROM `table`, UNNEST(author_names) author
As you can see - now you can do all magic in one simple JsonPath
I have a Json data stored in SQL server:
{
"group":{
"operator":"AND",
"rules":[
{
"condition":"=",
"field":"F1",
"table":"ATT",
"data":"TEST",
"readOnly":false,
"hidden":false,
"$$hashKey":"005"
},
{
"condition":"=",
"field":"CLASS",
"table":"OBJ",
"data":"A1",
"readOnly":false,
"hidden":false,
"$$hashKey":"008"
},
{
"group":{
"operator":"AND",
"rules":[
{
"condition":"=",
"field":"F1",
"table":"ATT",
"data":"TEST2",
"readOnly":false,
"hidden":false,
"$$hashKey":"00D"
},
{
"condition":"=",
"field":"F1",
"table":"ATT",
"data":"TEST3",
"readOnly":false,
"hidden":false,
"$$hashKey":"00G"
}
]
},
"table":"",
"$$hashKey":"009"
}
]
}
}
How can I get the count of the element field having value =F1 using SQL?
DECLARE #json NVARCHAR(MAX) = '{"group":{
"operator":"AND",
"rules":
[{"condition":"=",
"field":"F1",
"table":"ATT",
"data":"TEST",
"readOnly":false,
"hidden":false,
"$$hashKey":"005"},
{"condition":"=",
"field":"BANKID",
"table":"ATT",
"data":"A1",
"readOnly":false,
"hidden":false,
"$$hashKey":"008"}]}}';
SELECT COUNT(*)
FROM OPENJSON(#json, '$.group.rules')
WHERE JSON_VALUE(value, '$.field') = 'F1'
You have a table, you say? CROSS APPLY is your friend:
SELECT T.[data], rules.F1_count
FROM T CROSS APPLY (
SELECT COUNT(*) AS F1_count
FROM OPENJSON(jsonData, '$.group.rules')
WHERE JSON_VALUE(value, '$.field') = 'F1'
) AS rules
Try this ->
Select Count(*)
From mytable
WHERE JSON_VALUE(Serialized, '$.field')="F1"
As an advice JSON data should not be stored it self. You should make tables and columns depending on the data you are storing. Maybe make a database called group, with a table called operator, which would have three columns, operator, rule and rule value. So it would look something like this:
Operator Rule RuleValue
-------- ----------- ---------
| #1 | condition | = |
| #1 | field | f1 |
| #1 | table | ATT |
| #2 | data | test |
You can change the table to your needs but remember JSON and XML is a way of storing data in files. So when storing data in database, you shouldn't store it in XMl or JSON