extract from json string oracle - sql

I have JSON string in a column like below.
oracle version - 12c
{
[
{
"Employee_id": 1
,"Salary":3206.93
}
]
}
How do I remove first flower brace and the square bracket..
Result should look like below..
{
"Employee_id": 1
,"Salary":3206.93
}
Tried using regex like SELECT regexp_substr('"abc{[{def}]}ghi"', '\[(.+)\]') match FROM dual;
But it didn't work..

select replace(replace(myColumn, '{[{', '{'), '}]}', '}')
from myTable
Note, this method will ONLY work if this column is always going to contain a single element of a JSON array, and if the JSON object doesn't contain any other JSON objects... otherwise it could break. Use at your own risk :)

Related

AWS Athena json_extract query from string field returns empty values

I have in Athena json string:
{"recurrent_jobs.new_page.career_level.trainee":0,"recurrent_jobs.new_page.career_level.assistant":1}
I need get result: trainee=0
I make a query:
select
json_extract(
'{"recurrent_jobs.new_page.career_level.trainee":0,"recurrent_jobs.new_page.career_level.assistant":1}',
'$.recurrent_jobs.new_page.career_level.trainee')
And I have a empty result. I think the problem is mit dots. What can I do?
'$.recurrent_jobs.new_page.career_level.trainee' represents path to property of deeply nested object, something like the following:
{
"recurrent_jobs":{
"new_page":{
"career_level":{
"trainee":0
}
}
}
}
You need to escape the property name with dots - '$["recurrent_jobs.new_page.career_level.trainee"]':
select json_extract(
'{
"recurrent_jobs.new_page.career_level.trainee":0,
"recurrent_jobs.new_page.career_level.assistant":1
}',
'$["recurrent_jobs.new_page.career_level.trainee"]');
Output:
_col0
0

Extracting JSON returns null (Presto Athena)

I'm working with SQL Presto in Athena and in a table I have a column named "data.input.additional_risk_data.basket" that has a json like this:
[
{
"data.input.additional_risk_data.basket.val.brand":null,
"data.input.additional_risk_data.basket.val.category":null,
"data.input.additional_risk_data.basket.val.item_reference":"26484651",
"data.input.additional_risk_data.basket.val.name":"Nike Force 1",
"data.input.additional_risk_data.basket.val.product_name":null,
"data.input.additional_risk_data.basket.val.published_date":null,
"data.input.additional_risk_data.basket.val.quantity":"1",
"data.input.additional_risk_data.basket.val.size":null,
"data.input.additional_risk_data.basket.val.subCategory":null,
"data.input.additional_risk_data.basket.val.unit_price":769.0,
"data.input.additional_risk_data.basket.val.upc":null,
"data.input.additional_risk_data.basket.val.url":null
}
]
I need to extract some of the data there, for example data.input.additional_risk_data.basket.val.item_reference. I'm not used to working with jsons but I tried a few things:
json_extract("data.input.additional_risk_data.basket", '$.data.input.additional_risk_data.basket.val.item_reference')
json_extract_scalar("data.input.additional_risk_data.basket", '$.data.input.additional_risk_data.basket.val.item_reference)
They all returned null. I'm wondering what is the correct way to get the values from that json
Thank you!
There are multiple "problems" with your data and json path selector. Keys are not conventional (and I have not found a way to tell athena to escape them) and your json is actually an array of json objects. What you can do - cast data to an array and process it. For example:
-- sample data
WITH dataset (json_val) AS (
VALUES (json '[
{
"data.input.additional_risk_data.basket.val.brand":null,
"data.input.additional_risk_data.basket.val.category":null,
"data.input.additional_risk_data.basket.val.item_reference":"26484651",
"data.input.additional_risk_data.basket.val.name":"Nike Force 1",
"data.input.additional_risk_data.basket.val.product_name":null,
"data.input.additional_risk_data.basket.val.published_date":null,
"data.input.additional_risk_data.basket.val.quantity":"1",
"data.input.additional_risk_data.basket.val.size":null,
"data.input.additional_risk_data.basket.val.subCategory":null,
"data.input.additional_risk_data.basket.val.unit_price":769.0,
"data.input.additional_risk_data.basket.val.upc":null,
"data.input.additional_risk_data.basket.val.url":null
}
]')
)
--query
select arr[1]['data.input.additional_risk_data.basket.val.item_reference'] item_reference -- or use unnest if there are actually more than 1 element in array expected
from(
select cast(json_val as array(map(varchar, json))) arr
from dataset
)
Output:
item_reference
"26484651"

How to remove all \ from nested json in SQL Redshift?

I've got some problems with extracting values from nested json values in column.
I've got a column of data with values that looks almost like nested json, but some of jsons got \ between values and I need to clean them.
JSON looks like this:
{"mopub_json":
"{\"currency\":\"USD\",
\"country\":\"US\",
\"publisher_revenue\":0.01824}
"}
I need to get currency and publisher revenue as different columns and try this:
SET json_serialization_enable TO true;
SET json_serialization_parse_nested_strings TO true;
SELECT
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'publisher_revenue') as revenue_mopub,
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'currency') as currency_mopub
FROM(
SELECT replace(column_name, "\t", '')
FROM table_name)
I receive the next error:
[Amazon](500310) Invalid operation: column "\t" does not exist in events
When I'm trying this:
SET json_serialization_parse_nested_strings TO true;
SELECT
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'publisher_revenue') as revenue_mopub,
JSON_EXTRACT_PATH_TEXT(column_name, 'mopub_json', 'currency') as currency_mopub
FROM(
SELECT replace(column_name, chr(92), '')
FROM table_name)
I receive
Invalid operation: JSON parsing error
When I'm trying to extract values without replacing , I'm receiving empty columns.
Thank you for your help!
So your json isn't valid. JSON doesn't allow multiline text strings but I expect that the issue. Based on your query I think you don't want a single key and string but the whole structure. The reason the that quotes are backslashed is because they are inside a string. The json should look like:
{
"mopub_json": {
"currency": "USD",
"country": "US",
"publisher_revenue": 0.01824
}
}
Then the SQL you have should work.

SQL Server 2017 Selecting JSON embedded within a JSON field

In SQL Server 2017, I'd like to "SELECT" a JSON object embedded within another as a string so we can store/process them later.
eg JSON:
[
{"key1":"value1",
"level2_Obj":{"key2":"value12"}
},
{"key1":"value2",
"level2_Obj":{"key22":"value22"}
},
]
From above JSON, I'd like to SELECT whole of the level2Obj JSON object, see below for what I'd like to see the "selection" result.
value1 |{"key2" :"value12"}
value2 |{"key22":"value22"}
I tried below with no luck:
SELECT * FROM
OPENJSON(#json,'$."data1"')
WITH(
[key1] nvarchar(50),
[embedded_json] nvarchar(max) '$."level2Obj"'
) AS DAP
Can some one please help how I select the contents of the 2nd level JSON object as a string?
The idea is to Write 1st level JSON properties into individual cells and rest of JSON levels into a single column of type nvarchar(max) (i.e whole of sub-level JSON object into a single column as a string for further processing in later stages).
Good day,
Firstly, Your JSON text is not properly formatted. There is extra comma after the last object in the array. I will remove this extra comma for the sake of the answer, but if this is the format you have then first step will be to clear the text and make sure that is is well formatted.
Please check if this solve your needs:
declare #json nvarchar(MAX) = '
[
{
"key1":"value1",
"level2_Obj":{"key2":"value12"}
}
,
{
"key1":"value2",
"level2_Obj":{"key22":"value22"}
}
]
'
SELECT JSON_VALUE (t1.[value], '$."key1"'), JSON_QUERY (t1.[value], '$."level2_Obj"')
FROM OPENJSON(#json,'$') t1

Get JSON_VALUE with Oracle SQL when multiple nodes share the same name

I have an issue where I have some JSON stored in my oracle database, and I need to extract values from it.
The problem is, there are some fields that are duplicated.
When I try this, it works as there is only one firstname key in the options array:
SELECT
JSON_VALUE('{"increment_id":"2500000043","item_id":"845768","options":[{"firstname":"Kevin"},{"lastname":"Test"}]}', '$.options.firstname') AS value
FROM DUAL;
Which returns 'Kevin'.
However, when there are two values for the firstname field:
SELECT JSON_VALUE('{"increment_id":"2500000043","item_id":"845768","options":[{"firstname":"Kevin"},{"firstname":"Okay"},{"lastname":"Test"}]}', '$.options.firstname') AS value
FROM DUAL;
It only returns NULL.
Is there any way to select the first occurence of 'firstname' in this context?
JSON_VALUE returns one SQL VALUE from the JSON data (or SQL NULL if the key does not exists).
If you have a collection of values (a JSON array) an you want one specific item of the array you use array subscripts (square brackets) like in JavaScript, for example [2] to select the third item. [0] selects the first item.
To get the first array item in your example you have to change the path expression from '$.options.firstname' to '$.options[0].firstname'
You can follow this query:-
SELECT JSON_VALUE('{
"increment_id": "2500000043",
"item_id": "845768",
"options": [
{
"firstname": "Kevin"
},
{
"firstname": "Okay"
},
{
"lastname": "Test"
}
]
}', '$.options[0].firstname') AS value
FROM DUAL;