How to extract this json into a table? - sql

I have a SQL column filled with JSON documents, one per row:
[{
"ID":"TOT",
"type":"ABS",
"value":"32.0"
},
{
"ID":"T1",
"type":"ABS",
"value":"9.0"
},
{
"ID":"T2",
"type":"ABS",
"value":"8.0"
},
{
"ID":"T3",
"type":"ABS",
"value":"15.0"
}]
How can I transform it into tabular form? I tried Redshift's json_extract_path_text and JSON_EXTRACT_ARRAY_ELEMENT_TEXT functions, and I also tried json_each and json_each_text (on Postgres), but didn't get what I expected. Any suggestions?
The desired result should look like this:
T1 T2 T3 TOT
9.0 8.0 15.0 32.0

I assume you printed 4 rows. In PostgreSQL,
SELECT this_column->'ID'
FROM that_table;
will return a column of JSON strings. Use ->> if you want a text column. More info here: https://www.postgresql.org/docs/current/static/functions-json.html
If you are using an old PostgreSQL (before 9.3), this gets harder : )
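For reference, the reshaping the question asks for (one column per ID, one row of values) can be sketched outside the database. This is a minimal Python sketch, assuming the column holds the whole array shown in the question; the names `doc` and `pivot_row` are made up for illustration and are not part of any database API.

```python
import json

# The JSON document from the question, as it would sit in one column value.
doc = '''[{"ID":"TOT","type":"ABS","value":"32.0"},
{"ID":"T1","type":"ABS","value":"9.0"},
{"ID":"T2","type":"ABS","value":"8.0"},
{"ID":"T3","type":"ABS","value":"15.0"}]'''

def pivot_row(json_text):
    """Turn a JSON array of {ID, value} objects into one {ID: value} row."""
    return {item["ID"]: item["value"] for item in json.loads(json_text)}

row = pivot_row(doc)
columns = sorted(row)  # column order used for printing

print(" ".join(columns))
print(" ".join(row[c] for c in columns))
```

Each array element becomes one column, which is the same pivot a SQL solution would have to perform after unnesting the array.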

Your best option is to use COPY from JSON Format. This will load the JSON directly into a normal table format. You then query it as normal data.
However, I suspect that you will need to slightly modify the format of the file by removing the outer [...] square brackets and also the commas between records, eg:
{
"ID": "TOT",
"type": "ABS",
"value": "32.0"
}
{
"ID": "T1",
"type": "ABS",
"value": "9.0"
}
If, however, your data is already loaded and you cannot re-load the data, you could either extract the data into a new table, or add additional columns to the existing table and use an UPDATE command to extract each field into a new column.
Or, very worst case, you can use one of the JSON Functions to access the information in a JSON field, but this is very inefficient for large requests (eg in a WHERE clause).
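The file reformatting described above (dropping the outer square brackets and the commas between records) can be done with a small script before loading. This is a minimal Python sketch, assuming the source file contains a single JSON array; `array_to_json_lines` is a hypothetical helper name.

```python
import json

def array_to_json_lines(json_text):
    """Rewrite a JSON array as one standalone JSON object per line."""
    records = json.loads(json_text)
    return "\n".join(json.dumps(record) for record in records)

# Two records from the question, still wrapped in an array.
doc = '''[{"ID": "TOT", "type": "ABS", "value": "32.0"},
{"ID": "T1", "type": "ABS", "value": "9.0"}]'''

print(array_to_json_lines(doc))
```

Parsing and re-serializing is safer than stripping brackets and commas with string replacement, because values may themselves contain those characters.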

Related

Need Pentaho JSON without array

I want to output JSON data as a plain object rather than an array. I made the changes described in the Pentaho documentation, but the output is always an array, even for a single set of values. I am using PDI 9.1 and tested with the ktr from the link below:
https://wiki.pentaho.com/download/attachments/25043814/json_output.ktr?version=1&modificationDate=1389259055000&api=v2
The statement below is from https://wiki.pentaho.com/display/EAI/JSON+output:
Another special case is when 'Nr. rows in a block' = 1.
If used with empty json block name output will looks like:
{
"name" : "item",
"value" : 25
}
My output looks like this:
{ "": [ {"name":"item","value":25} ] }
I resolved this myself: I added another JSON input step with the path below, which returns the array element as a string object.
$.wellDesign[0]
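To illustrate what that path selects, here is a minimal Python sketch. It assumes a hypothetical payload in which the record from the question is wrapped in a one-element array under the block name "wellDesign"; `unwrap_first` is an illustrative helper, not a Pentaho function.

```python
import json

# Hypothetical payload: the question's record wrapped in a one-element array
# under the block name "wellDesign" (the name used in the path above).
wrapped = '{"wellDesign": [{"name": "item", "value": 25}]}'

def unwrap_first(json_text, block_name):
    """Return the first element of the array stored under block_name."""
    return json.loads(json_text)[block_name][0]

record = unwrap_first(wrapped, "wellDesign")
print(json.dumps(record))
```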

Snowflake Searching string in semi structured data

I have a table with many columns and rows. One column that I am trying to query in Snowflake holds semi-structured data. For example, when I query
select response
from table
limit 5
This is what is returned
[body={\n "id": "xxxxx",\n "object": "charge",\n "amount": 500,\n "amount_refunded": 0,\n "application": null,\n "application_fee": null,\n "application_fee_amount": null,\n "balance_transaction": null,\n "billing_details": {\n "address": {\n "city": null,\n "zip": "xxxxx",]
I want to select only the zip in this data. When I run code:
select response:zip
from table
limit 5
I get an error.
SQL compilation error: error line 1 at position 21 Invalid argument types for function 'GET': (VARCHAR(16777216), VARCHAR(11))
Is there a reason why this is happening? I am new to Snowflake and am trying to parse out this data, but I'm stuck. Thanks!
Snowflake has very good documentation on the subject.
For your specific case, have you attempted to use dot notation? It's the appropriate method for accessing JSON. So
select response:body.billing_details.address.zip
from table
Remember that you have your 'body' element. You need to access that one first with a colon because it's a level 1 element. In your sample, zip is nested inside body under billing_details and address, so each of those deeper levels is accessed with dot notation. Level 1 elements are accessed with a colon; deeper elements are accessed with dot notation.
I think you have multiple issues with this.
First, I think your response column is not a VARIANT column. Please run the query below and confirm:
SHOW COLUMNS IN TABLE table;
Even if the column is VARIANT, the data as stored is not valid JSON. You will need to strip out the JSON part and then store that in the VARIANT column.
Please do the first part and share the result, and I will then suggest next steps. I wanted to post this as a comment, but comments do not allow this much text.
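The "strip the JSON part" step can be sketched as follows. This is a minimal Python sketch under stated assumptions: the sample in the question is truncated, so `raw` is a hypothetical completed value, and the wrapper is assumed to be exactly `[body=` at the start and `]` at the end. `extract_body` is an illustrative helper, not a Snowflake function.

```python
import json

# Hypothetical, completed version of the truncated sample from the question.
raw = ('[body={"id": "xxxxx", "object": "charge", "amount": 500, '
       '"billing_details": {"address": {"city": null, "zip": "xxxxx"}}}]')

def extract_body(text, prefix="[body=", suffix="]"):
    """Strip the assumed non-JSON wrapper and parse the JSON payload inside."""
    if not (text.startswith(prefix) and text.endswith(suffix)):
        raise ValueError("unexpected wrapper around the JSON payload")
    return json.loads(text[len(prefix):-len(suffix)])

body = extract_body(raw)
zip_code = body["billing_details"]["address"]["zip"]
print(zip_code)
```

Once the payload parses as JSON, the same nesting (billing_details, then address, then zip) is what a path expression on a VARIANT column would have to follow.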

Is it possible to prevent ORDS from escaping my GeoJSON?

I have a problem with Oracle ORDS escaping the quotes in my GeoJSON as \":
{
"id": 1,
"city": "New York",
"state_abrv": "NY",
"location": "{\"type\":\"Point\",\"coordinates\":[-73.943849, 40.6698]}"
}
In the Oracle DB it is stored correctly:
{"type":"Point","coordinates":[-73.943849, 40.6698]}
I need help figuring out why the \" are added and how to prevent this from happening.
Add this column alias to your RESTful service handler query for the JSON column:
SELECT id,
jsons "{}jsons" --this one
FROM table_with_json
Then when ORDS sees the data for the column, it won't format it as JSON, because it already IS JSON.
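The escaping seen in the question is the usual effect of serializing a value that is already JSON text as if it were a plain string. This is a minimal Python sketch of that effect only; it does not reproduce how ORDS works internally, and the variable names are made up for illustration.

```python
import json

# The GeoJSON value as stored in the database column (already JSON text).
geojson_text = '{"type":"Point","coordinates":[-73.943849, 40.6698]}'

# Treated as a plain string: the serializer escapes the inner quotes.
as_string = json.dumps({"id": 1, "location": geojson_text})

# Treated as JSON: the value is parsed first and nested as an object.
as_object = json.dumps({"id": 1, "location": json.loads(geojson_text)})

print(as_string)
print(as_object)
```

The first output contains the escaped quotes from the question; the second nests the GeoJSON as a real object, which is the behaviour the alias asks for.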
You can use whatever column name you want in the alias; in your case it should probably be
"{}location"

Left join not working properly in Azure Stream Analytics

I'm trying to create a simple left join between two inputs (event hubs). The source of the inputs is a function app that processes a RabbitMQ queue and sends the data to an event hub.
In my eventhub1 I have this data:
[{
"user": "user_aa_1"
}, {
"user": "user_aa_2"
}, {
"user": "user_aa_3"
}, {
"user": "user_cc_1"
}]
In my eventhub2 I have this data:
[{
"user": "user_bb_1"
}, {
"user": "user_bb_2"
}, {
"user": "user_bb_3"
}, {
"user": "user_cc_1"
}]
I use this SQL to create my left join:
select hub1.[user] h1,hub2.[user] h2
into thirdTestDataset
from hub1
left join hub2
on hub2.[user] = hub1.[user]
and datediff(hour,hub1,hub2) between 0 and 5
The test result looks OK.
The problem is when I run it as a job: I get this result in the Power BI dataset.
Any idea why my left join isn't working like it would in a regular SQL query?
I tested your SQL query and it works well for me too. So when you can't get the expected output after running the ASA job, I suggest you follow the troubleshooting solutions in this document.
Based on your output, it seems that hub2 becomes the left table. You could use the diagnostic log in ASA to find the actual output of the job execution.
I tested the end-to-end using blob storage for input 1 and 2 and your sample and a PowerBI dataset as output and observed the expected result.
I think there are a few things that can go wrong with your query:
First, your join has a 5-hour window: basically, that means it looks at EH1 and EH2 for matches during that large window, so live results will differ from the sample input, for which you have only 1 row. Can you validate that you had no match during this 5-hour window?
Additionally, by default Power BI streaming datasets are "hybrid datasets", so they accumulate results, and there is no good way to know when a result was emitted because there is no timestamp in your output schema. So you may also be seeing previous data here. I'd suggest a few things:
In Power BI, change the option of your dataset: disable "Historic data analysis" to remove caching of data
Add a timestamp column to identify when the data was generated (the first line of your query will become: select System.timestamp() as time, hub1.[user] h1,hub2.[user] h2 )
Let me know if it works for you.
Thanks,
JS (Azure Stream Analytics)
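To make the effect of the time window concrete, here is a minimal Python sketch of a left join that only matches events within a bounded gap. It approximates the DATEDIFF condition as an elapsed-time check, which is an assumption rather than the exact Stream Analytics semantics; the timestamps and the `windowed_left_join` helper are hypothetical.

```python
from datetime import datetime, timedelta

def windowed_left_join(left, right, max_gap=timedelta(hours=5)):
    """Left join on user, matching only right events 0..max_gap after the left one."""
    rows = []
    for user, t1 in left:
        matches = [u2 for u2, t2 in right
                   if u2 == user and timedelta(0) <= t2 - t1 <= max_gap]
        if matches:
            rows.extend((user, m) for m in matches)
        else:
            rows.append((user, None))  # unmatched left rows are kept with a null
    return rows

t0 = datetime(2020, 1, 1, 12, 0)  # hypothetical event time
hub1 = [("user_aa_1", t0), ("user_cc_1", t0)]
hub2_near = [("user_cc_1", t0 + timedelta(hours=1))]
hub2_late = [("user_cc_1", t0 + timedelta(hours=9))]

print(windowed_left_join(hub1, hub2_near))
print(windowed_left_join(hub1, hub2_late))
```

With the same user on both sides, the join matches only when the second event falls inside the window, which is why live results can differ from a test run on static sample data.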

Get display value of structured datatables cell

I'm following https://datatables.net/reference/option/columns.data ("data": { "_": "phone", "filter": "phone_filter", "display": "phone_display"}) to supply structured values for certain columns of the dataTables table, other columns are just simple:
{"filter": "1964486", "display": "Elite 2022 Tryout ('17-'18)", "_": 1964486}
It works fine: it displays the display value and searches by the filter value. But in certain places I need to programmatically obtain the full data structure (see above) from the cell. However, when I try to access it through the API (let's say we are talking about the cell in the first row's 6th column):
myTable.cell(0, 5).data()
This returns only 1964486 instead of the full structure. How can I access the display value?
render() can do that:
myTable.cell(0, 5).render('display')
https://datatables.net/reference/api/cell().render()
It can also return filter and sort values.