Get average of JSONB array in Postgres (SQL)

I have a Postgres table 'games' containing different scores for each game. I want to query all the games and get the average of all the scores for each specific game. I have tried a lot of different queries, but I keep running into trouble because of the JSONB datatype. The game data is saved in JSONB format, and the games table looks like this:
gameID   gameInfo
---------------------------------------------------------------
1        {
           "scores": [
             { "scoreType": "skill",    "score": 1 },
             { "scoreType": "speed",    "score": 3 },
             { "scoreType": "strength", "score": 2 }
           ]
         }
2        {
           "scores": [
             { "scoreType": "skill",    "score": 4 },
             { "scoreType": "speed",    "score": 4 },
             { "scoreType": "strength", "score": 4 }
           ]
         }
3        {
           "scores": [
             { "scoreType": "skill",    "score": 1 },
             { "scoreType": "speed",    "score": 3 },
             { "scoreType": "strength", "score": 5 }
           ]
         }
Expected output:
GameId | AverageScore
-------+-------------
1      | 2
2      | 4
3      | 3
What query can I use to get the expected output?

Extract the JSONB value representing the array, use a JSONB function to expand it into a set of JSONB elements, then extract the score from each element:
select gameid, avg(score::int) as s
from (
  -- expand the "scores" array; -> 'score' yields each score as jsonb
  select gameid, jsonb_array_elements(gameInfo #> '{scores}') -> 'score' as score
  from games
) t
group by gameid
order by gameid
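Note that the direct jsonb-to-int cast (score::int) needs a reasonably recent PostgreSQL (the cast from a jsonb scalar to int was added in version 11). A sketch of the same query for older versions, extracting the value as text with ->> before casting:
select gameid, avg(score::int) as avg_score
from (
  -- ->> returns text, which casts to int on any version
  select gameid, jsonb_array_elements(gameInfo #> '{scores}') ->> 'score' as score
  from games
) t
group by gameid
order by gameid;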

You can also use a lateral join:
select gameID, avg((s ->> 'score')::int) as avg_score
-- gameInfo -> 'scores' keeps the value as jsonb (no text round-trip needed)
from games, lateral jsonb_array_elements(gameInfo -> 'scores') s
group by gameID;
SQL editor online
Result:
+========+====================+
| gameid |     avg_score      |
+========+====================+
|      3 | 3.0000000000000000 |
+--------+--------------------+
|      2 | 4.0000000000000000 |
+--------+--------------------+
|      1 | 2.0000000000000000 |
+--------+--------------------+

Related

MongoDB Aggregating counts with different conditions at once

I'm totally new to MongoDB.
I wonder if it is possible to aggregate counts under different conditions all at once.
For example, given a collection like the one below:
_id | no | val
----+----+----
  1 |  1 | a
  2 |  2 | a
  3 |  3 | b
  4 |  4 | c
And I want a result like the one below.
Value a : 2
Value b : 1
Value c : 1
How can I get this result all at once?
Thank you:)
db.collection.aggregate([
  { "$match": {} },
  {
    // count documents per distinct "val"
    "$group": {
      "_id": "$val",
      "count": { "$sum": 1 }
    }
  },
  {
    // build a single-key document like { "Value a": 2 }
    "$project": {
      "field": {
        "$arrayToObject": [
          [ { k: { "$concat": [ "Value ", "$_id" ] }, v: "$count" } ]
        ]
      }
    }
  },
  // promote the built object to the document root
  { "$replaceWith": "$field" }
])
mongoplayground
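On the sample collection, this pipeline should produce roughly one single-key document per value, matching the desired output:
[
  { "Value a": 2 },
  { "Value b": 1 },
  { "Value c": 1 }
]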

What is the best practice for outputting JSON-type arrays from table values without repetition?

I have the following SQL data that I am trying to output as a structured JSON string as follows:
Table data
TableId  ContainerId  MaterialId  SizeId
848      1            1           1
849      1            1           2
850      1            2           1
851      1            2           2
852      1            3           1
853      1            4           1
854      2            2           1
855      2            2           2
856      2            2           3
JSON output
{
  "container": [
    {
      "id": 1,
      "material": [
        { "id": 1, "size": [ { "id": 1 }, { "id": 2 } ] },
        { "id": 2, "size": [ { "id": 1 }, { "id": 2 } ] },
        { "id": 3, "size": [ { "id": 1 } ] },
        { "id": 4, "size": [ { "id": 1 } ] }
      ]
    },
    {
      "id": 2,
      "material": [
        { "id": 2, "size": [ { "id": 1 }, { "id": 2 }, { "id": 3 } ] }
      ]
    }
  ]
}
I have tried several ways of outputting it, but I am struggling to avoid duplicated ContainerId and MaterialId records. Is anyone able to demonstrate best working practice for extracting JSON from a table such as this, please?
Well, it isn't pretty but this appears to work:
WITH
  container AS (SELECT DISTINCT containerid AS id FROM jsonArray1 AS container),
  material AS (SELECT DISTINCT materialid AS id, containerid AS cid FROM jsonArray1 AS material),
  size AS (SELECT sizeid AS id, materialid AS tid, containerid AS cid FROM jsonArray1 AS size)
SELECT container.id id, material.id id, size.id id
FROM container
JOIN material ON material.cid = container.id
JOIN size ON size.tid = material.id AND size.cid = material.cid
FOR JSON AUTO, ROOT
sqlfiddle example
AUTO will structure JSON for you, but only by following the structure of the data tables used in the query. Since the data starts out "flat" in a single table, AUTO won't create any structure. So the trick I applied here was to use WITH CTEs to restructure this flat data into three virtual tables whose relationships have the necessary structure.
Everything here is super-sensitive in a way that normal relational SQL would not be. For instance, just changing the order of the JOINs will restructure the JSON hierarchy even though that would have no effect on a normal SQL query.
I also had to play around with the table and column aliases (a lot) to get it to put the right names on everything in the JSON.
You can use the following query:
SELECT CONCAT('{"container":[', string_agg(json, ','), ']}') AS json
FROM
  (SELECT CONCAT('{"id":', CAST(ContainerId AS nvarchar(100)),
                 ',"material":[{', string_agg(json, '},{'), '}]}') AS json,
          -- '},{' keeps each material as its own JSON object
          dense_rank() OVER (PARTITION BY ContainerId ORDER BY ContainerId) AS rnk
   FROM
     (SELECT ContainerId, MaterialId,
             CONCAT('"id":', CAST(MaterialId AS nvarchar(100)),
                    ',"size":[', string_agg('{"id":' + CAST(SizeId AS nvarchar(100)) + '}', ','), ']') AS json
      FROM tb
      GROUP BY ContainerId, MaterialId) T
   GROUP BY ContainerId) T
GROUP BY rnk
demo in db<>fiddle

BigQuery: best use of UNNEST arrays

I really need some help. I have a big JSON file that I ingested into BigQuery, and I want to write a query that uses UNNEST twice. My data looks like this:
{
  "categories": [
    {
      "id": 1,
      "name": "C0",
      "properties": [
        {
          "name": "Property_1",
          "value": { "type": "String", "value": "11111" }
        },
        {
          "name": "Property_2",
          "value": { "type": "String", "value": "22222" }
        }
      ]
    }
  ]
}
And I want a query that gives me something like this result:
-----------------------------------------------------
| Category_ID | Name_ID | Property_1 | Property_2   |
-----------------------------------------------------
| 1           | C0      | 11111      | 22222        |
-----------------------------------------------------
I already made something like this, but it's not working:
SELECT
  c.id AS Category_ID,
  c.name AS Name_ID,
  p.value.value AS p.name
FROM `DataBase-xxxxxx` CROSS JOIN
  UNNEST(categories) AS c,
  UNNEST(c.properties) AS p;
Thank you so much 🙏
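For what it's worth, a minimal sketch of a working version, assuming the table name from the question and that the property names are fixed: SQL can't use a dynamic alias like p.name, so the usual workaround is to pivot with conditional aggregation:
SELECT
  c.id AS Category_ID,
  c.name AS Name_ID,
  -- pick out each property by name; MAX collapses the group to one row
  MAX(IF(p.name = 'Property_1', p.value.value, NULL)) AS Property_1,
  MAX(IF(p.name = 'Property_2', p.value.value, NULL)) AS Property_2
FROM `DataBase-xxxxxx` AS t
CROSS JOIN UNNEST(t.categories) AS c
CROSS JOIN UNNEST(c.properties) AS p
GROUP BY c.id, c.name;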

How to sum a JSONB inner field in Postgres?

I want to sum the pc field from table tbl_test, grouped by metal_id.
I tried this: Postgres GROUP BY on jsonb inner field,
but it did not work for me, as I have a different JSON format.
tbl_test
id  json
--  ----------------------------------------
1   [
      { "pc": "4", "metal_id": "1" },
      { "pc": "4", "metal_id": "5" }
    ]
2   [
      { "pc": "1", "metal_id": "1" },
      { "pc": "1", "metal_id": "2" }
    ]
The output I want is (grouped by metal_id, with the sum of pc), e.g. for metal_id 1:
[
  {
    "metal_id": "1",
    "pc": "5"
  }
]
Thanks in advance!
You can use jsonb_array_elements() to expand the jsonb array, and then aggregate by metal id:
select obj ->> 'metal_id' metal_id, sum((obj ->> 'pc')::int) cnt
from mytable t
cross join lateral jsonb_array_elements(t.js) j(obj)
group by obj ->> 'metal_id'
order by metal_id
Demo on DB Fiddle:
metal_id | cnt
:------- | --:
1        |   5
2        |   1
5        |   4
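If you want the result back as JSON, as in the expected output above, you can wrap the aggregation in jsonb_agg/jsonb_build_object; a minimal sketch against the same assumed table and column names (mytable, js):
select jsonb_agg(jsonb_build_object('metal_id', metal_id, 'pc', cnt::text)) as result
from (
  -- same per-metal sum as above
  select obj ->> 'metal_id' as metal_id, sum((obj ->> 'pc')::int) as cnt
  from mytable t
  cross join lateral jsonb_array_elements(t.js) j(obj)
  group by obj ->> 'metal_id'
) s;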

Postgres query nested JSONB

I have a JSONB column containing a list of objects.
Here's the table schema:
Column Name | Datatype
------------+----------
timestamp   | timestamp
data        | JSONB
Sample Data
1.
timestamp : 2020-02-02 19:01:21.571429+00
data : [
  { "tracker_id": "5",  "position": 1 },
  { "tracker_id": "11", "position": 2 },
  { "tracker_id": "4",  "position": 1 }
]
2.
timestamp : 2020-02-02 19:01:23.571429+00
data : [
  { "tracker_id": "7", "position": 3 },
  { "tracker_id": "4", "position": 2 }
]
3.
timestamp : 2020-02-02 19:02:23.571429+00
data : [
  { "tracker_id": "5", "position": 2 },
  { "tracker_id": "4", "position": 1 }
]
I need to find the count of the transitions of tracker_id from position: 1 to position: 2
Here, the output will be 2, since tracker_ids 4 and 5 changed their position from 1 to 2.
Note
The transitions must be considered in ascending order of timestamp.
The position change need not occur in consecutive records.
I'm using the timescaledb extension.
So far I've tried querying the objects in the list of an individual record, but I'm not sure how to merge the list objects across records and query them.
What would be the query for this? Should I write a stored procedure instead?
I don't use the timescaledb extension, so I would choose a pure SQL solution based on unnesting the JSON:
with t (timestamp, data) as (values
  (timestamp '2020-02-02 19:01:21.571429+00', '[
    { "tracker_id": "5",  "position": 1 },
    { "tracker_id": "11", "position": 2 },
    { "tracker_id": "4",  "position": 1 }
  ]'::jsonb),
  (timestamp '2020-02-02 19:01:23.571429+00', '[
    { "tracker_id": "7", "position": 3 },
    { "tracker_id": "4", "position": 2 }
  ]'::jsonb),
  (timestamp '2020-02-02 19:02:23.571429+00', '[
    { "tracker_id": "5", "position": 2 },
    { "tracker_id": "4", "position": 1 }
  ]'::jsonb)
), unnested as (
  -- expand each jsonb array into (tracker_id, position) rows
  select t.timestamp, r.tracker_id, r.position
  from t
  cross join lateral jsonb_to_recordset(t.data) as r(tracker_id text, position int)
)
-- count pairs where a position-1 reading precedes a position-2 reading
select count(*)
from unnested u1
join unnested u2
  on u1.tracker_id = u2.tracker_id
 and u1.position = 1
 and u2.position = 2
 and u1.timestamp < u2.timestamp;
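Against the real table, only the unnesting and the self-join are needed; a sketch assuming the table is named tbl (a hypothetical name) with the columns shown in the schema above:
with unnested as (
  select t.timestamp, r.tracker_id, r.position
  from tbl t  -- hypothetical table name
  cross join lateral jsonb_to_recordset(t.data) as r(tracker_id text, position int)
)
select count(*)
from unnested u1
join unnested u2
  on u1.tracker_id = u2.tracker_id
 and u1.position = 1
 and u2.position = 2
 and u1.timestamp < u2.timestamp;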
There are various functions that will help compose several database rows into a single JSON structure: row_to_json(), array_to_json(), and array_agg().
You will then use the usual SELECT with an ORDER BY clause to get the timestamps/JSON data you want, and use the above functions to create a single JSON structure.
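For example, a minimal sketch, again assuming a table named tbl with the columns from the schema above, that folds the ordered rows into one JSON document:
-- Hypothetical: collapse all rows into a single JSON array, ordered by timestamp.
select array_to_json(array_agg(row_to_json(t) order by t.timestamp)) as doc
from (
  select timestamp, data
  from tbl
) t;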