Transforming JSON data to relational data - SQL

I want to display data from SQL Server where the data is stored in JSON format, but when I run the SELECT, the data does not appear:
id | item_pieces_list
0  | [{"id":2,"satuan":"BOX","isi":1,"aktif":true},{"id":4,"satuan":"BOX10","isi":1,"aktif":true}]
1  | [{"id":0,"satuan":"AMPUL","isi":1,"aktif":"true"},{"id":4,"satuan":"BOX10","isi":5,"aktif":true}]
I've written a query like this, but nothing appears. Can anyone help?
Query:
SELECT id, JSON_Value(item_pieces_list, '$.satuan') AS Name
FROM [cisea.bamedika.co.id-hisys].dbo.medicine_alkes AS medicalkes

Your path is wrong. Your JSON is an array, but you are trying to read it as a flat object:
SELECT id, JSON_Value(item_pieces_list,'$[0].satuan') AS Name
FROM [cisea.bamedika.co.id-hisys].dbo.medicine_alkes
Your original path '$.satuan' would only work if the data were a plain object without the [] (array brackets). Since the column holds an array, the path is changed to '$[0].satuan', which retrieves only the first element of the array.
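If you need every element of the array rather than just the first one, here is a minimal sketch using OPENJSON (available from SQL Server 2016 onwards); the table and column names are taken from the question, and the column types in the WITH clause are assumptions based on the sample data:
-- one output row per element of the item_pieces_list JSON array
SELECT m.id, p.satuan, p.isi, p.aktif
FROM [cisea.bamedika.co.id-hisys].dbo.medicine_alkes AS m
CROSS APPLY OPENJSON(m.item_pieces_list)
    WITH (
        satuan VARCHAR(50) '$.satuan',
        isi    INT         '$.isi',
        aktif  BIT         '$.aktif'
    ) AS p;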

Related

Extract values inside an array column in Amazon Athena

I have a table in AWS Athena where the column 'metadata_stopinfo' has the structure shown in the error message below (an array of rows).
I am trying to extract values that are inside that array, however when I try
SELECT
"json_extract_scalar"(metadata_stopinfo, '$.city')
FROM "table"
I get the following error:
SYNTAX_ERROR: line 2:5: Unexpected parameters (array(row("address" row("addressline" varchar,"city" varchar,"countrycode" varchar,"countrycodeoriginal" varchar,"state" varchar,"zipcode" varchar),"carrierreference" varchar,"contacts" array(row("contacttype" varchar,"email" varchar,"fax" varchar,"mobilephone" varchar,"name" varchar,"officephone" varchar,"userid" varchar)),"containerinfo" array(row("containerid" varchar,"containeridtype" varchar,"equipmentcode" varchar,"equipmenttype" varchar)),"conveyancelinenumber" varchar,"conveyancetype" varchar,"conveyancetypeoriginal" varchar,"dateinfo" row("arrivalestimateddate" varchar,"arrivalestimateddateend" varchar,"arrivalestimatedendoffset" varchar,"arrivalestimatedoffset" varchar,"arrivalrequesteddate" varchar,"deliveryestimateddate" varchar,"deliveryestimateddateend" varchar,"deliveryestimatedendoffset" varchar,"deliveryestimatedoffset" varchar,"deliveryrequesteddate" varchar,"deliveryrequesteddateend" varchar,"deliveryrequestedendoffset" varchar,"deliveryrequestedoffset" varchar,"departureestimateddate" varchar,"departureestimateddateend" varchar,"departureestimatedendoffset" varchar,"departureestimatedoffset" varchar,"departurerequesteddate" varchar,"pickuprequesteddate" varchar,"pickuprequesteddateend" varchar,"pickuprequestedendoffset" varchar,"pickuprequestedoffset" varchar,"pickupestimateddate" varchar,"pickupestimateddateend" varchar,"pickupestimatedendoffset" varchar,"pickupestimatedoffset" varchar),"deliverynotenumber" varchar,"instructions" array(row("customerspecificsubtype" varchar,"header" boolean,"instructionsubtype" varchar,"instructiontype" varchar,"text" varchar)),"locationid" varchar,"partnercarrieraddress" row("addressline" varchar,"city" varchar,"countrycode" varchar,"countrycodeoriginal" varchar,"state" varchar,"zipcode" varchar),"partnercarriercontacts" array(row("contacttype" varchar,"email" varchar,"fax" varchar,"name" varchar,"officephone" varchar)),"partnercarrierid" varchar,"partnercarriername" varchar,"partnerid" varchar,"partnername" varchar,"partnertimezone" varchar,"partnertype" varchar,"productquantity" row("number" double,"originalunitofmeasure" varchar,"quantitytype" varchar,"unitofmeasure" varchar),"sequencenumber" bigint,"shipmentidentifier" varchar,"stoptype" varchar,"transportinfo" row("description" varchar,"transportcode" varchar,"transportoriginalcode" varchar),"vesselinfo" row("lloydsnumber" varchar,"shipsradiocallnumber" varchar,"vesselname" varchar,"vesselnumber" varchar,"voyagetripnumber" varchar))), varchar(6)) for function json_extract_scalar. Expected: json_extract_scalar(varchar(x), JsonPath) , json_extract_scalar(json, JsonPath)
My question is: how can I extract the values inside the column?
json_extract_scalar, unsurprisingly, works with json (note that even if your data were in JSON format, json_extract_scalar(metadata_stopinfo, '$.city') still would not have worked, because your data is an array), while your column contains arrays of rows, so you need to work with it accordingly. For example, you can use an index to access elements of the array (in Presto, array indexes start from 1):
SELECT
metadata_stopinfo[1] r
FROM "table"
And then access the fields. The fields may be of any SQL type and are accessed with the field reference operator (a dot):
SELECT
metadata_stopinfo[1].address.city city
FROM "table"
Also you can flatten the array with unnest:
SELECT r.address.city AS city
FROM "table",
unnest(metadata_stopinfo) as t(r)
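Putting the two ideas together, a minimal sketch that returns one row per stop with a few nested fields pulled out (the field names are taken from the type shown in the error message; adjust them to your actual schema):
SELECT
    r.stoptype,                          -- scalar field of the row
    r.address.city AS city,              -- field nested inside the address row
    r.dateinfo.arrivalestimateddate      -- field nested inside the dateinfo row
FROM "table"
CROSS JOIN UNNEST(metadata_stopinfo) AS t(r)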

Presto extract string from array of JSON elements

I am on Presto 0.273 and I have complex JSON data from which I am trying to extract only specific values.
First, I ran SELECT JSON_EXTRACT(library_data, '.$books'), which gets me all the books from a certain library. The problem is that this returns an array of JSON objects that look like this:
[{
"book_name":"abc",
"book_size":"453",
"requestor":"27657899462"
"comments":"this is a comment"
}, {
"book_name":"def",
"book_size":"354",
"requestor":"67657496274"
"comments":"this is a comment"
}, ...
]
I would like the code to return just a list of the JSON objects, not an array. My intention is to later loop through the JSON objects to find the ones from a specific requestor. Currently, when I loop through the given arrays using Python, I get a range of errors about this data being a Series, hence the attempt to extract it properly here instead.
I tried this SELECT JSON_EXTRACT(JSON_EXTRACT(data, '$.domains'), '$[0]') but this doesn't work because the index position of the object needed is not known.
I also tried SELECT array_join(array[books], ', ') but got an "Error casting array element to VARCHAR" error.
Can anyone please point me in the right direction?
Cast to array(json) (note that the JSON path should start with $, i.e. '$.books'):
SELECT CAST(JSON_EXTRACT(library_data, '$.books') as array(json))
Or you can use it with unnest to flatten it into rows:
SELECT *,
js_obj -- will contain a single json object
FROM table
CROSS JOIN UNNEST(CAST(JSON_EXTRACT(library_data, '$.books') as array(json))) as t(js_obj)
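As a follow-up to the question's goal of finding books from a specific requestor, the unnested rows can be filtered directly in SQL rather than looping in Python; this is a hedged sketch where the '$.books' path and the requestor value are assumptions taken from the sample data in the question:
SELECT js_obj
FROM table
CROSS JOIN UNNEST(CAST(JSON_EXTRACT(library_data, '$.books') AS array(json))) AS t(js_obj)
-- keep only the books requested by a particular requestor
WHERE JSON_EXTRACT_SCALAR(js_obj, '$.requestor') = '27657899462'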

Extracting JSON returns null (Presto Athena)

I'm working with Presto SQL in Athena, and in a table I have a column named "data.input.additional_risk_data.basket" that has JSON like this:
[
{
"data.input.additional_risk_data.basket.val.brand":null,
"data.input.additional_risk_data.basket.val.category":null,
"data.input.additional_risk_data.basket.val.item_reference":"26484651",
"data.input.additional_risk_data.basket.val.name":"Nike Force 1",
"data.input.additional_risk_data.basket.val.product_name":null,
"data.input.additional_risk_data.basket.val.published_date":null,
"data.input.additional_risk_data.basket.val.quantity":"1",
"data.input.additional_risk_data.basket.val.size":null,
"data.input.additional_risk_data.basket.val.subCategory":null,
"data.input.additional_risk_data.basket.val.unit_price":769.0,
"data.input.additional_risk_data.basket.val.upc":null,
"data.input.additional_risk_data.basket.val.url":null
}
]
I need to extract some of the data there, for example data.input.additional_risk_data.basket.val.item_reference. I'm not used to working with JSON, but I tried a few things:
json_extract("data.input.additional_risk_data.basket", '$.data.input.additional_risk_data.basket.val.item_reference')
json_extract_scalar("data.input.additional_risk_data.basket", '$.data.input.additional_risk_data.basket.val.item_reference)
They all returned null. I'm wondering what the correct way is to get the values from that JSON.
Thank you!
There are multiple "problems" with your data and JSON path selector. The keys are not conventional (and I have not found a way to tell Athena to escape them), and your JSON is actually an array of JSON objects. What you can do is cast the data to an array and process it. For example:
-- sample data
WITH dataset (json_val) AS (
VALUES (json '[
{
"data.input.additional_risk_data.basket.val.brand":null,
"data.input.additional_risk_data.basket.val.category":null,
"data.input.additional_risk_data.basket.val.item_reference":"26484651",
"data.input.additional_risk_data.basket.val.name":"Nike Force 1",
"data.input.additional_risk_data.basket.val.product_name":null,
"data.input.additional_risk_data.basket.val.published_date":null,
"data.input.additional_risk_data.basket.val.quantity":"1",
"data.input.additional_risk_data.basket.val.size":null,
"data.input.additional_risk_data.basket.val.subCategory":null,
"data.input.additional_risk_data.basket.val.unit_price":769.0,
"data.input.additional_risk_data.basket.val.upc":null,
"data.input.additional_risk_data.basket.val.url":null
}
]')
)
--query
select arr[1]['data.input.additional_risk_data.basket.val.item_reference'] item_reference -- or use unnest if more than one element is expected in the array (see the variant below)
from(
select cast(json_val as array(map(varchar, json))) arr
from dataset
)
Output:
item_reference
"26484651"

Export data from SQL to ADLS using ADF as JSON

I am trying to load data to ADLS Gen2 from an Azure SQL DB in JSON format.
Below is the query I am using to produce the JSON:
select k2.[mandt],k2.[kunnr],
'knb1' = (select [bukrs] as 'bukrs' , [pernr]
from [ods_cdc_sdr].[knb1] k1
where k2.mandt=k1.mandt AND K1.kunnr=K2.kunnr
FOR JSON PATH),
'knvp' =(select knvp.vkorg, vtweg from [ods_cdc_sdr].[knvp] knvp where k2.mandt=knvp.mandt AND knvp.kunnr=K2.kunnr FOR JSON PATH)
from [ods_cdc_sdr].[kna1] k2
group by k2.[mandt],k2.[kunnr]
FOR JSON PATH
For one or two records the data looks fine, but when I try to load 1000 or more records, the JSON seems to be split across rows and is not in a proper format (below is an example):
**{"JSON_F52E2B61-18A1-11d1-B105-00805F49916B"**:"[{\"mandt\":\"172\",\"kunnr\":\"\"},{\"mandt\":\"172\",\"kunnr\":\"0000000001\"},{\"mandt\":\"172\",\"kunnr\":\"0000000004\",\"knvp\":[{\"vkorg\":\"FR12\",\"vtweg\":\"01\"},{\"vkorg\":\"FR12\",\"vtweg\":\"01\"},{\"vkorg\":\"FR12\",\"vtweg\":\"01\"},{\"vkorg\":\"FR12\",\"vtweg\":\"01\"},{\"vkorg\":\"FR12\",\"vtweg\":\"01\"},{\"vkorg\":\"FR65\",\"vtweg\":\"01\"},{\"vkorg\":\"FR65\",\"vtweg\":\"01\"},{\"vkorg\":\"FR65\",\"vtweg\":\"01\"},{\"vkorg\":\"FR65\",\"vtweg\":\"01\"}]},{\"mandt\":\"172\",\"kunnr\":\"0000000006\"},{\"mandt\":\"172\",\"kunnr\":\"0000000008\",\"knvp\":[{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"}]},{\"mandt\":\"172\",\"kunnr\":\"0000000012\",\"knvp\":[{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"}]},{\"mandt\":\"172\",\"kunnr\":\"0000000015\"},{\"mandt\":\"172\",\"kunnr\":\"0000000021\"},{\"mandt\":\"172\",\"kunnr\":\"0000000022\"},{\"mandt\":\"172\",\"kunnr\":\"0000000023\"},{\"mandt\":\"172\",\"kunnr\":\"0000000026\",\"knvp\":[{\"vkorg\":\"IN14\",\"vtweg\":\"01\"},{\"vkorg\":\"IN14\",\"vtweg\":\"01\"},{\"vkorg\":\"IN14\",\"vtweg\":\"01\"},{\"vkorg\":\"IN14\",\"vtweg\":\"01\"}]},{\"mandt\":\"172\",\"kunnr\":\"0000000045\",\"knvp\":[{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"}]},{\"mandt\":\"172\",\"kunnr\":\"0000000046\",\"knvp\":[{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"},{\"vkorg\":\"FR13\",\"vtweg\":\"04\"}]},{\"mandt\":\"172\",\"kunnr\":\"0000000048\"},{\"mandt\":\"172\",\"kunnr\":\"0000000050\"},{\"mandt\":\"172\",\"kunnr\":\"0000000054\"},{\"mandt\":\"172\",\"kunnr\":\"0000000057\"},{\"mandt\":\"172\",\"kunnr\":\"0000000058\"},{\"mandt\":\"172\",\"kunnr\":\"0000000060\"},{\"mandt\":\"172\",\"kunnr\":\"0000000065\"},{\"mandt\":\"172\",\"kunnr\":\"0000000085\"},{\"mandt\":\"172\",\"kunnr\":\"0000000086\"},{\"mandt\":\"172\",\"kunnr\":\"0000000089\"},{\"mandt\":\"172\",\"kunnr\":\"0000000090\"},{\"mandt\":\"172\",\"kunnr\":\"0000000092\"},{\"mandt\":\"172\",\"kunnr\":\"0000000106\"},{\"mandt\":\"172\",\"kunnr\":\"0000000124\"},{\"mandt\":\"172\",\"kunnr\":\"0000000129\",\"knvp\":[{\"vkorg\":\"FR40\",\"vtweg\":\"01\"},{\"vkorg\":\"FR40\",\"vtweg\":\"01\"},{\"vkorg\":\"FR40\""}
**{"JSON_F52E2B61-18A1-11d1-B105-00805F49916B"**:",\**"vtweg\":\"01\"},{\"vkorg\":\"FR40\",\"vtweg\":\"01\"}]},{\"mandt\":\"172\",\"kunnr\":\"0000000149\"},{\"mandt\":\"172\",\"kunnr\":\"0000000164\"},{\"mandt\":\"172\",\"kunnr\":\"0000000167\"},{\"mandt\":\"172\",\"kunnr\":\"0000000174\"},{\"mandt\":\"172\",\"kunnr\":\"0000000178\"},{\"mandt\":\"172\",\"kunnr\":\"0000000181\"},{\"mandt\":\"172\",\"kunnr\":\"0000000185\",\"knvp\":[{\"vkorg\":\"FR65\",\"vtweg\":\"01\"},{\"vkorg\":\"FR65\",\"vtweg\":\"01\"},{\"vkorg\":\"FR65\",\"vtweg\":\"01\"},{\"vkorg\":\"FR65\",\"vtweg\":\"01\"}]},{\"mandt\":\"172\",\"kunnr\":\"0000000189\"},{\"mandt\":\"172\",\"kunnr\":\"0000000214\"},{\"mandt\":\"172\",\"kunnr\":\"0000000223\"},{\"mandt\":\"172\",\"kunnr\":\"0000000228\"},{\"mandt\":\"172\",\"kunnr\":\"0000000239\"},{\"mandt\":\"172\",\"kunnr\":\"0000000240\"},{\"mandt\":\"172\",\"kunnr\":\"0000000249\"},{\"mandt\":\"172\",\"kunnr\":\"0000000251\"},{\"mandt\":\"172\",\"kunnr\":\"0000000257\"},{\"mandt\":\"172\",\"kunnr\":\"0000000260\"},{\"mandt\":\"172\",\"kunnr\":\"0000000261\"},{\"mandt\":\"172\",\"kunnr\":\"0000000262\"},{\"mandt\":\"172\",\"kunnr\":\"0000000286\"},{\"mandt\":\"172\",\"kunnr\":\"0000000301\"},{\"mandt\":\"172\",\"kunnr\":\"0000000320\"},{\"mandt\":\"172\",\"kunnr\":\"0000000347\"},{\"mandt\":\"172\",\"kunnr\":\"0000000350\"},{\"mandt\":\"172\",\"kunnr\":\"0000000353\"},{\"mandt\":\"172\",\"kunnr\":\"0000000364\"},{\"mandt\":\"172\",\"kunnr\":\"0000000370\"},{\"mandt\":\"172\",\"kunnr\":\"0000000372\"},{\"mandt\":\"172\",\"kunnr\":\"0000000373\"},{\"mandt\":\"172\",\"kunnr\":\"0000000375\"},{\"mandt\":\"172\",\"kunnr\":\"0000000377\"},{\"mandt\":\"172\",\"kunnr\":\"0000000380\"},{\"mandt\":\"172\",\"kunnr\":\"0000000381\"},{\"mandt\":\"172\",\"kunnr\":\"0000000383\"},{\"mandt\":\"172\",\"kunnr\":\"0000000384\"},{\"mandt\":\"172\",\"kunnr\":\"0000000386\"},{\"mandt\":\"172\",\"kunnr\":\"0000000387\"},{\"mandt\":\"172\",\"kunnr\":\"0000000391\"},{\"mandt\":\"172\",\"kunnr\":\"0000000393\"},{\"mandt\":\"172\",\"kunnr\":\"0000000396\"},{\"mandt\":\"172\",\"kunnr\":\"0000000397\"},{\"mandt\":\"172\",\"kunnr\":\"0000000408\"},{\"mandt\":\"172\",\"kunnr\":\"0000000416\"},{\"mandt\":\"172\",\"kunnr\":\"0000000421\"},{\"mandt\":\"172\",\"kunnr\":\"0000000424\"},{\"mandt\":\"172\",\"kunnr\":\"0000000425\"},{\"mandt\":\"172\",\"kunnr\":\"0000000428\"},{\"mandt\":\"172\",\"kunnr\":\"0000000443\"},{\"mandt\":\"172\",\"kunnr\":\"0000000447\"},{\"mandt\":\"172\",\"kunnr\":\"0000000453\"},{\"mandt"}
**{"JSON_F52E2B61-18A1-11d1-B105-00805F49916B"**:"\":\"172\",\"kunnr\":\"0000000475\"},{\"mandt\":\"172\",\"kunnr\":\"0000000478\"},{\"mandt\":\"172\\",\"kunnr\":\"2100000001\",\"knvp\":[{\"vkorg\":\"Y200\",\"vtweg\":\"Z1\"},{\"vkorg\":\"Y200\",\"vtweg\":\"Z1\"},{\"vkorg\":\"Y200\",\"vtweg\":\"Z1\"},{\"vkorg\":\"Y200\",\"vtweg\":\"Z1\"}]},{\"mandt\":\"172\",\"kunnr\":\"2100000002\",\"knvp\":[{\"vkorg\":\"Y200\",\"vtweg\":\"Z1\"},{\"vkorg\":\"Y200\",\"vtweg\":\"Z1\"},{\"vkorg\":\"Y200\",\"vtweg\":\"Z1\"},{\"vkorg\":\"Y200\",\"vtweg\":\"Z1\"}]}]"}
Please help me: how can I get the entire message in a proper format?
If you just want to save the query result to ADLS in JSON format, you'd better remove FOR JSON PATH; Azure Data Factory can generate the nested JSON itself.
I created a simple test with two tables. Table Entities and table EntitiesEmails are related through the Id and EntitiyId fields. Usually we could use the following SQL to return a nested JSON array, but ADF will automatically add escape characters '\' to escape the double quotes:
SELECT
ent.Id AS 'Id',
ent.Name AS 'Name',
ent.Age AS 'Age',
EMails = (
SELECT
Emails.EntitiyId AS 'Id',
Emails.Email AS 'Email'
FROM EntitiesEmails Emails WHERE Emails.EntitiyId = ent.Id
FOR JSON PATH
)
FROM Entities ent
FOR JSON PATH
I worked out how to use a Data Flow to generate the nested JSON array, as follows:
Set source1 to the SQL table Entities.
Set source2 to the SQL table EntitiesEmails.
At the Aggregate1 activity, set Group by to EntitiyId.
Under Aggregates, enter the expression collect(Email); it collects all email addresses for an entity into an array.
Then join the two streams at the Join1 activity.
Then filter out the extra columns at the Select1 activity.
Then sink the result to the JSON file in ADLS.
(The data preview and debug output were shown as screenshots, which are not included here.)

U-SQL: extract data from a JSON array

I have browsed the web and forums for how to extract the data from the JSON file, but my script does not work.
I have a problem extracting the list of rate objects. Can someone please help? I cannot find the fault.
{"table":"C","no":"195/C/NBP/2016","tradingDate":"2016-10-06","effectiveDate":"2016-10-07","rates":
[
{"currency":"dolar amerykański","code":"USD","bid":3.8011,"ask":3.8779},
{"currency":"dolar australijski","code":"AUD","bid":2.8768,"ask":2.935},
{"currency":"dolar kanadyjski","code":"CAD","bid":2.8759,"ask":2.9339},
{"currency":"euro","code":"EUR","bid":4.2493,"ask":4.3351},
{"currency":"forint (Węgry)","code":"HUF","bid":0.013927,"ask":0.014209},
{"currency":"frank szwajcarski","code":"CHF","bid":3.8822,"ask":3.9606},
{"currency":"funt szterling","code":"GBP","bid":4.8053,"ask":4.9023},
{"currency":"jen (Japonia)","code":"JPY","bid":0.036558,"ask":0.037296},
{"currency":"korona czeska","code":"CZK","bid":0.1573,"ask":0.1605},
{"currency":"korona duńska","code":"DKK","bid":0.571,"ask":0.5826},
{"currency":"korona norweska","code":"NOK","bid":0.473,"ask":0.4826},
{"currency":"korona szwedzka","code":"SEK","bid":0.4408,"ask":0.4498},
{"currency":"SDR (MFW)","code":"XDR","bid":5.3142,"ask":5.4216}
],
"EventProcessedUtcTime":"2016-10-09T10:48:41.6338718Z","PartitionId":1,"EventEnqueuedUtcTime":"2016-10-09T10:48:42.6170000Z"}
This is my U-SQL script:
#trial =
EXTRACT jsonString string
FROM #"adl://kamilsepin.azuredatalakestore.net/ExchangeRates/2016/10/09/10_0_c60d8b8895b047c896ce67d19df3cdb2.json"
USING Extractors.Text(delimiter:'\b', quoting:false);
#json =
SELECT Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(jsonString) AS rec
FROM #trial;
#columnized =
SELECT
rec["table"]AS table,
rec["no"]AS no,
rec["tradingDate"]AS tradingDate,
rec["effectiveDate"]AS effectiveDate,
rec["rates"]AS rates
FROM #json;
#rateslist =
SELECT
table, no, tradingDate, effectiveDate,
Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(rates) AS recl
FROM #columnized;
#selectrates =
SELECT
recl["currency"]AS currency,
recl["code"]AS code,
recl["bid"]AS bid,
recl["ask"]AS ask
FROM #rateslist;
OUTPUT #selectrates
TO "adl://kamilsepin.azuredatalakestore.net/datastreamanalitics/ExchangeRates.tsv"
USING Outputters.Tsv();
You need to look at the structure of your JSON and identify what constitutes the first path inside your JSON that you want to map to correlated rows. In your case, you are really only interested in the rates array, where you want one row per array item.
Thus, you use the JSONExtractor with a JSONPath that gives you one row per array element (e.g., rates[*]) and then project each of its fields.
Here is the code (with slightly changed paths):
REFERENCE ASSEMBLY JSONBlog.[Newtonsoft.Json];
REFERENCE ASSEMBLY JSONBlog.[Microsoft.Analytics.Samples.Formats];
#selectrates =
EXTRACT currency string, code string, bid decimal, ask decimal
FROM #"/Temp/rates.json"
USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor("rates[*]");
OUTPUT #selectrates
TO "/Temp/ExchangeRates.tsv"
USING Outputters.Tsv();
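For reference, with the sample document from the question this should produce one TSV row per element of rates; for example, the first row should contain the values dolar amerykański, USD, 3.8011 and 3.8779, taken directly from the question's JSON.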