SQL: Unnesting a variable length JSON into multiple columns - sql

I have a JSON array in a Redshift SQL column whose number of elements varies from row to row. I need to unnest it and select the values so that they appear as columns on the same row.
i.e: From: Name|JSON
to
Name|First Play Price|First Play Status| Second Play Price|Second Play Status... etc
The syntax is roughly
[
{
"price": "price1",
"status": "status1"
},
{
"price": "price2",
"status": "status2"
},
{
"price": "price3",
"status": "status3",
}
]
I'm familiar with JSON extraction, but I'm stuck on this array containing a varying number of unkeyed objects ([{},{},{}]).
Any help or pointers to resources would be greatly appreciated! Thank you.
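A SQL result set has a fixed number of columns, so a variable-length array can only be spread across columns up to an upper bound chosen in advance. The sketch below illustrates that idea in Python rather than Redshift SQL: it looks up each array position up to a hypothetical `max_plays` and leaves the missing ones empty. The function name, the `play_N_price` / `play_N_status` column names, and the sample row are assumptions made for illustration. Redshift's `JSON_EXTRACT_ARRAY_ELEMENT_TEXT` and `JSON_EXTRACT_PATH_TEXT` functions perform a comparable per-index, per-key lookup.

```python
import json


def plays_to_columns(name, plays_json, max_plays):
    """Flatten a JSON array of {price, status} objects into one wide row.

    max_plays is an assumed upper bound on the array length; positions
    beyond the end of the array are filled with None.
    """
    plays = json.loads(plays_json)
    row = {"name": name}
    for i in range(max_plays):
        play = plays[i] if i < len(plays) else {}
        row[f"play_{i + 1}_price"] = play.get("price")
        row[f"play_{i + 1}_status"] = play.get("status")
    return row


# Invented sample row with two plays, widened to three column pairs.
sample = (
    '[{"price": "price1", "status": "status1"},'
    ' {"price": "price2", "status": "status2"}]'
)
row = plays_to_columns("example", sample, max_plays=3)
print(row)
```

The trade-off is that the bound has to be at least as large as the longest array in the data; anything past it is silently dropped.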

Related

Add computed field to Query in Grafana using JSON API as data source

What am I trying to achieve:
I would like to have a time series chart showing the total number of members in my club at any time. This member count should be calculated by using the field "Eintrittsdatum" (joining-date) and "Austrittsdatum" (leaving-date). I’m thinking of it as a running sum - every filled field with a joining-date means +1 on the member count, every leaving-date entry is a -1.
Data structure
I’m calling the API of webling.ch with a secret key. This is my data structure with sample data per member:
[
{
"type": "member",
"meta": {
"created": "2020-03-02 11:33:00",
"createuser": {
"label": "Joana Doe",
"type": "user"
},
"lastmodified": "2022-12-06 16:32:56",
"lastmodifieduser": {
"label": "Joana Doe",
"type": "user"
}
},
"readonly": true,
"properties": {
"Mitglieder ID": 99,
"Anrede": "Dear",
"Vorname": "Jon",
"Name": "Doe",
"Strasse": "Doeington Street",
"Adresszusatz": null,
"PLZ": "9999",
"Ort": "Doetown",
"E-Mail": "jon.doe#doenet.net",
"Telefon Privat": null,
"Telefon Geschäft": null,
"Mobile": "099 877 54 54",
"Geschlecht": "m",
"Geburtstag": "1966-03-10",
"Mitgliedschaftstyp": "Aktivmitgliedschaft",
"Eintrittsdatum": "2020-03-01",
"Austrittsdatum": null,
"Passfoto": null,
"Wordpress Benutzername": null,
"Wohnhaft im Glarnerland": false,
"Lat": "43.1563379",
"Long": "6.0474622"
},
"parents": [
240
],
"children": {
},
"links": {
"debitor": [
2124,
3056,
3897
],
"attendee": [
2576
]
},
"id": 1815
}
]
Grafana data source
I am using the “JSON API” data source by Marcus Olsson (GitHub: grafana/grafana-json-datasource, a data source plugin for loading JSON APIs into Grafana).
Grafana v9.3.1 (89b365f8b1) on Linux
My current approach
Queries:
Query C - uses a filter on the source-API to only show entries with "Eintrittsdatum" IS NOT EMPTY
Field 1 (alias "datum") has a JSONata-Query of:
properties.Eintrittsdatum
Field 2 (alias "names") should return the full name and has a query of:
$map($.properties, function($v) {(
($v.Vorname&" "&$v.Name);
)})
Field 3 (alias "value") should return "1" for every entry and has a query of:
$map($.properties, function($v) {(
(1);
)})
Query D - uses a filter on the source-API to only show entries with "Austrittsdatum" IS NOT EMPTY
Field 1 (alias "datum") has a JSONata-Query of:
properties.Austrittsdatum
Field 2 (alias "names") should return the full name and has a query of:
$map($.properties, function($v) {(
($v.Vorname&" "&$v.Name);
)})
Field 3 (alias "value") should return "1" for every entry and has a query of:
$map($.properties, function($v) {(
(1);
)})
Here's a screenshot to clarify things
(https://zigerschlitzmakers.ch/wp-content/uploads/2023/01/ScreenshotGrafana-1.png)
Transformations:
My applied transformations
(https://zigerschlitzmakers.ch/wp-content/uploads/2023/01/ScreenshotGrafana-2.png)
What's working
I can correctly gather the number of members added/subtracted per day.
What's not working
I can't get the graph to display the way I want: I'd like a running sum of these numbers instead of the following two graphs.
Time series graph with merged queries
(https://zigerschlitzmakers.ch/wp-content/uploads/2023/01/ScreenshotGrafana-3.png)
Time series graph with unmerged queries
(https://zigerschlitzmakers.ch/wp-content/uploads/2023/01/ScreenshotGrafana-4.png)
I can't get the names to display within the tooltip of the data points (really not THAT necessary).
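The running sum described at the top can be sketched outside Grafana to make the intended result concrete. This is a minimal Python illustration, not a Grafana transformation: each joining date counts as +1, each leaving date as -1, and the per-day deltas are accumulated in date order. The function name and the three sample records are invented; only the "Eintrittsdatum" and "Austrittsdatum" property names come from the data structure above.

```python
from itertools import accumulate


def member_count_series(members):
    """Return (date, running_total) pairs sorted by date."""
    deltas = {}
    for member in members:
        props = member["properties"]
        joined = props.get("Eintrittsdatum")
        left = props.get("Austrittsdatum")
        if joined:
            deltas[joined] = deltas.get(joined, 0) + 1  # joining: +1
        if left:
            deltas[left] = deltas.get(left, 0) - 1  # leaving: -1
    dates = sorted(deltas)  # ISO dates sort chronologically as strings
    totals = accumulate(deltas[d] for d in dates)
    return list(zip(dates, totals))


# Invented sample records for illustration only.
members = [
    {"properties": {"Eintrittsdatum": "2020-03-01", "Austrittsdatum": None}},
    {"properties": {"Eintrittsdatum": "2020-03-01", "Austrittsdatum": "2021-06-30"}},
    {"properties": {"Eintrittsdatum": "2020-09-15", "Austrittsdatum": None}},
]
series = member_count_series(members)
print(series)
```

Note that leaving dates contribute -1 here, whereas Field 3 of Query D above returns 1; a running sum over both queries only decreases if the leaving events carry a negative value.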

How to use Trino/Presto to query Redis

I have a simple string and hash stored in redis
get test
"1"
hget htest first
"first hash"
I'm able to see the "table" test, but there are no columns
trino> show columns from redis.default.test;
Column | Type | Extra | Comment
--------+------+-------+---------
(0 rows)
and, as expected, I can't get a result from a select:
trino> select * from redis.default.test;
Query 20210918_174414_00006_dmp3x failed: line 1:8: SELECT * not allowed from relation that has no columns
I see in the documentation that I might need to create a table definition file, but I wasn't able to create one that works.
I tried a few variations; this is one example:
{
"tableName": "test",
"schemaName": "default",
"value": {
"dataFormat": "json",
"fields": [
{
"name": "number",
"mapping": 0,
"type": "INT"
}
]
}
}
Any idea what I am doing wrong?
I focused on the string since it's simpler, but I also need to query the hash.

How do I extract the data (rows) in PostgreSQL queries?

Good morning, guys,
I'm stuck on extracting data from PostgreSQL queries.
I made a query like this:
pool.query(' SELECT * FROM images_info where id < 50', genericQueryHandler(res));
The data sent back is formatted like this:
{
"command": "SELECT",
"rowCount": 49,
"oid": null,
"rows": [...],
"fields": [],
"_parsers": [],
"_types": {},
"RowCtor": null,
"rowAsArray": false
}
I only need the data in "rows". How do I extract the "rows"? I tried LIMIT and GROUP BY, but neither works. Could you help me? I really appreciate your time and help.

Convert CSV containing nested JSON rows to SQL table

I have a CSV file with several million rows and want to load it as a PostgreSQL table. As an example, one value in the 'json_doc' column contains:
{"id": <>,
"base":
{"ateco":
[
{
"code": "<>",
"rootCode": "<>",
"description": "<>"
}
],
"founded": "<>",
"legalName": "<>",
"legalForms":
[
{
"name": "<>",
"level": <>
},
{
"name": "<>",
"level": <>
}
]
},
"name": "<>",
"people":
{
"items":
[
{
"name": "<>",
"givenName": "<>",
"familyName": "<>"
}
]
},
"country": "<>",
"locations": {}
}
As you can see, it has many nested dictionaries, and there are several million of these rows.
I'd like to get this file into an SQL table with even the sub-dictionary values in their own columns. How can I do this? It would seem I have to use some sort of name spacing technique for the nested data as there are some duplicate keys i.e. 'name'.
The data will be analysed using Pandas, but I'd like to get this straight into Postgres if possible. Any assistance greatly appreciated.
The result will look like:
id | base_ateco_code | etc | base_ateco_legalForms_name | etc |
Unless there are any ideas about this - it's a pretty open project from my employer - I just need to be able to use this information as part of a JOIN with another table.
Many thanks.
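The name-spacing idea mentioned above can be sketched as a small recursive flattener that joins parent and child keys with an underscore, so that the two different 'name' keys end up in different columns. This is a minimal Python sketch under stated assumptions: the `flatten` helper is hypothetical, the sample values are invented (the original document uses `<>` placeholders), and list elements get their index in the column name, which differs slightly from the `base_ateco_code` form shown in the desired result.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into {"parent_child": leaf} pairs."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}{key}_"))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}{index}_"))
    else:
        flat[prefix[:-1]] = value  # drop the trailing underscore
    return flat


# Invented values standing in for the "<>" placeholders.
doc = json.loads(
    '{"id": 1,'
    ' "base": {"ateco": [{"code": "c1"}],'
    '          "legalForms": [{"name": "n1", "level": 1},'
    '                         {"name": "n2", "level": 2}]},'
    ' "name": "top",'
    ' "locations": {}}'
)
columns = flatten(doc)
print(columns)
```

Because arrays such as `legalForms` vary in length, the set of column names can differ from row to row; a table built this way needs the union of all names, with NULL where a row has no value. Empty objects such as `locations` produce no column at all.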

Select JSON object that appears more than one time

I am trying to write a query to return all trains that have more than one etapesSupervision.
My table has a column called DETAIL, in this column I can find the JSON of my train.
"nomTrain": "EVOL99",
"compositionCourtLong": "LONG",
"sillons": [{
"numeroTrain": "EVOL99"
}],
"sillonsV4": [{
"refSillon": "sillons/4289505/2"
}],
"branchesStif": [{
"data": "49",
"data": "BP",
"data": "ORIGINE"
} ],
"etapesSupervision": [{
"data": "PR/0087-758896-00",
"data": "PR/0087-758607-BV",
"superviseur": "1287",
"uoSuperviseur": "B"
},
{
"data": "PR/0087-758607-BV",
"data": "PR/0087-001479-BV",
"superviseur": "1287",
"uoSuperviseur": "B"
}],
This is the query I wrote:
select * from course where CODE_LIGNE_COMMERCIALE='B'
--and ref = 'train/2018-11-12'
and instr(count(train.detail,'"etapesSupervision":'))> 1 ;
Using this, I return trains with only one etapesSupervision.
The thing is the column DETAIL is JSON, so I feel like I can't do a lot with it.
I also tried with LIKE, but that doesn't work either.
Thank you for your comments.
This is the query that worked:
select data,data,data
from train
where
length(DETAIL) - length(replace(DETAIL,'uoSuperviseur',null)) > 20 ;
And this way I have only trains that have more than one supervisor.
Thanks again
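The arithmetic behind the working query: removing every occurrence of 'uoSuperviseur' shortens the text by 13 characters per occurrence, so a length difference above 20 means at least two occurrences (2 x 13 = 26). The sketch below reproduces that length-difference trick in Python to show the numbers; the `count_occurrences` helper and the shortened `detail` string are made up for illustration.

```python
def count_occurrences(text, needle):
    """Count non-overlapping occurrences via the length difference."""
    removed = len(text) - len(text.replace(needle, ""))
    return removed // len(needle)


# Shortened, invented stand-in for the DETAIL column content.
detail = (
    '"etapesSupervision": ['
    '{"superviseur": "1287", "uoSuperviseur": "B"}, '
    '{"superviseur": "1287", "uoSuperviseur": "B"}]'
)
removed_chars = len(detail) - len(detail.replace("uoSuperviseur", ""))
n = count_occurrences(detail, "uoSuperviseur")
print(removed_chars, n)
```

Dividing the difference by the length of the search string gives the exact count, which avoids a magic threshold like 20. The trick counts every occurrence of the text, so it assumes 'uoSuperviseur' appears nowhere in DETAIL except inside etapesSupervision.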